[maker-devel] Dashes in transcript predictions

Carson Holt carsonhh at gmail.com
Mon Mar 24 12:15:39 MDT 2014


One more note on this.  The sequence is actually fully correct if you just
remove the '-' characters.  So if you don't want to rerun MAKER with the
patch, you can use the attached script to repair the transcript file by
removing the '-' characters.  Your GFF3 and protein files should already
be correct as is.

Usage --> perl fix_dash transcript_file.fasta > new_file.fasta

You may need to place the script in the .../maker/bin/ directory so it can
detect BioPerl if you don't have BioPerl installed system-wide.
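
For readers without the attachment, the repair described above can be sketched in a few lines of Python.  This is not the attached fix_dash script (which is Perl and uses BioPerl); it is a minimal stand-in, and the names strip_dashes and strip_dashes.py are hypothetical.  It assumes each FASTA header sits on a single line starting with '>', and it leaves headers untouched because MAKER transcript IDs themselves contain '-'.

```python
import sys


def strip_dashes(lines):
    """Yield FASTA lines with '-' removed from sequence lines only."""
    for line in lines:
        if line.startswith(">"):
            # Header lines are passed through unchanged: MAKER IDs such as
            # "...-processed-gene-0.2-mRNA-1" legitimately contain dashes.
            yield line
        else:
            # Sequence line: drop the stray '-' characters.
            yield line.replace("-", "")


# Hypothetical usage: python strip_dashes.py transcript_file.fasta > new_file.fasta
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(strip_dashes(handle))
```

Sequence lines that lose dashes end up shorter than their neighbours; that is still valid FASTA, since line length within a record does not have to be uniform.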

Thanks,
Carson

From:  Carson Holt <carsonhh at gmail.com>
Date:  Monday, March 24, 2014 at 10:49 AM
To:  Sujai <sujaikumar at gmail.com>, "maker-devel at yandell-lab.org"
<maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] Dashes in transcript predictions

I've actually never seen that before, but looking through your output, it
appears to be caused specifically by setting correct_est_fusion=1 and by
how that option interacts with some features of your dataset.

I've attached a patch in the form of a file you can use to replace
.../maker/lib/maker/join.pm.  I'm also going to add it to the MAKER
download.

Thanks,
Carson


From:  Sujai <sujaikumar at gmail.com>
Date:  Monday, March 24, 2014 at 8:15 AM
To:  "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject:  [maker-devel] Dashes in transcript predictions

Dear Maker Team

On a recent run with MAKER 2.31, I noticed that a couple of the transcripts
had dashes/hyphens in them.

Example:
>snap_masked-nGt.0.3.035610-processed-gene-0.2-mRNA-1 transcript offset:261 AED:0.25 eAED:0.25 QI:261|0.4|0.83|0.83|0.8|0.83|6|0|240
TTTGATTATTAATTATTTTTGTCTTTATTAA-------AAAATAATTTTGGTACAAACAATCGAATTAATAT-TAA
TTAAAGTTTTTATCAGCCTTATAAAATCTACGACACCGGCTTTTACCAATGTTTAGCG
AGTGATTCTCTCAACAGAAGTATCTCCAAATCAATATTCGTTGAATGTAAATGAACCCAAACACCTTATTCTCATT
CCTCCGGAAGAAGCTCCTGAATCAACTTTTGATCTCTACAGTAATGTATCTATGAATT
GCGAAGGAAGAAGTTATTTTCCGAATCAACCAATCATTGTTAATTGGATGTTTAAACATAAAGACTCATATACGAC
CATAACAAGAGATCACAAAATGGCTACAAGAATAATCACTGCATCAAACAGATCAAAG
GAAACTAATCTTGATTTGGTCAATATATTTTCTTACCTTACCATAAATGATATCCGCGAAGAAGATGGTGGAGTTT
ACAAATGTGTGATGACTCAAGGAAGTGTTGACGAAGAACAAGAATTTCTAGTAACTAT
AAACAATCAAAGTGAAAAGGAAATTGATGTATCCATTTTTTACCAAGATGATGACTTTGTAAGTGTTCGAGCAGCC
TTAGAAACAGTCAAGATTTTAGAGAATTACCAGTTTCGATGTTGGTTGTACGACCGGG
ATAAGACGTATGGTCAAGACGCCGGGAAGCCGACGAAATCGACAGAAAACCGTATAGGTCGTTATTATCAGTCAAA
ATATTCTGATTGTTCTCAATTTCGCATAGAAAGTTTCTATCAGCTGCCAATTTCTGTT
AACCGATGGCTGAAAAAAGAACTCAGTTTACAGTCTTTCTTTCAGCCATTTAGCTTTAATTGGGACCCTCAAAAAA
CCCCTAAAAACAAGAAAATGGTAGTATGGGTTGTTTCTTCCCTACCCTCAGCGGCGAT
TCGTAATGCAAAGAGAAGAATCAATGAACAATCTTCTCATGTATAA

The protein prediction for this transcript is ok:

>snap_masked-nGt.0.3.035610-processed-gene-0.2-mRNA-1 protein AED:0.25 eAED:0.25 QI:261|0.4|0.83|0.83|0.8|0.83|6|0|240
MNCEGRSYFPNQPIIVNWMFKHKDSYTTITRDHKMATRIITASNRSKETNLDLVNIFSYLTINDIREEDGGVYKCV
MTQGSVDEEQEFLVTINNQSEKEIDVSIFYQDDDFVSVRAALETVKILENYQFRCWLY
DRDKTYGQDAGKPTKSTENRIGRYYQSKYSDCSQFRIESFYQLPISVNRWLKKELSLQSFFQPFSFNWDPQKTPKN
KKMVVWVVSSLPSAAIRNAKRRINEQSSHV

Is this a known bug? I tried searching for "dash|hyphen" in the email list
but couldn't find anything else.
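
To check how many transcripts in a run are affected, a short scan of the transcript FASTA can list the records whose sequence lines contain '-'.  The following is a minimal sketch in Python and is not part of MAKER; the function name records_with_dashes is hypothetical, and it assumes each header is a single line starting with '>'.

```python
def records_with_dashes(lines):
    """Return the IDs of FASTA records whose sequence contains '-'."""
    flagged = []
    current_id = None
    for line in lines:
        if line.startswith(">"):
            # The record ID is the first whitespace-delimited token after '>'.
            # Dashes inside the ID itself are expected and are not counted.
            parts = line[1:].split()
            current_id = parts[0] if parts else ""
        elif "-" in line and current_id is not None:
            # Record each affected ID once, even if several lines have dashes.
            if not flagged or flagged[-1] != current_id:
                flagged.append(current_id)
    return flagged
```

Called on an open file handle for the transcript file, it returns a list of IDs that can be printed or counted.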

Best wishes,

- Sujai

P.S. I pulled out just this one contig and ran MAKER on it. All the
.maker.output files are attached.


_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

