[maker-devel] Dashes in transcript predictions
Carson Holt
carsonhh at gmail.com
Mon Mar 24 12:15:39 MDT 2014
One more note on this. The sequence is actually fully correct if you just
remove the '-' characters. So if you don't want to rerun MAKER with the
patch, then you can use the attached script to just repair the transcript
file by removing the '-' characters. Your GFF3 and protein files should
already be correct as is.
Usage --> perl fix_dash transcript_file.fasta > new_file.fasta
You may need to place the script in the .../maker/bin/ directory so it can
detect BioPerl if you don't have BioPerl installed system wide.
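For anyone without the attachment, the repair described above can be sketched
in a few lines of Python. This is not the attached fix_dash script (which uses
BioPerl); it is a minimal stand-in that assumes a plain FASTA file in which
header lines start with '>' and every other line is sequence. The name
strip_dashes and the file names in the comment are made up for illustration.

```python
import sys


def strip_dashes(lines):
    """Yield FASTA lines with '-' removed from sequence lines only."""
    for line in lines:
        if line.startswith(">"):
            # Header lines are passed through untouched: MAKER IDs such as
            # "snap_masked-...-mRNA-1" legitimately contain dashes.
            yield line
        else:
            # Sequence line: drop every '-' character, keep everything else.
            yield line.replace("-", "")


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage:
    #   python strip_dashes.py transcript_file.fasta > new_file.fasta
    with open(sys.argv[1]) as handle:
        sys.stdout.writelines(strip_dashes(handle))
```

Only sequence lines are touched, so the transcript IDs in the headers stay
identical to the ones in the GFF3 and protein files.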
Thanks,
Carson
From: Carson Holt <carsonhh at gmail.com>
Date: Monday, March 24, 2014 at 10:49 AM
To: Sujai <sujaikumar at gmail.com>, "maker-devel at yandell-lab.org"
<maker-devel at yandell-lab.org>
Subject: Re: [maker-devel] Dashes in transcript predictions
I've actually never seen that before, but looking through your output it
appears to be caused specifically by setting correct_est_fusion=1 and by how
that option interacts with some features of your dataset.
I've attached a patch in the form of a file you can use to replace
.../maker/lib/maker/join.pm. I'm also going to add it to the MAKER
download.
Thanks,
Carson
From: Sujai <sujaikumar at gmail.com>
Date: Monday, March 24, 2014 at 8:15 AM
To: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject: [maker-devel] Dashes in transcript predictions
Dear Maker Team
On a recent run with MAKER 2.31, I noticed that a couple of the transcripts
had dashes/hyphens in them.
Example:
>snap_masked-nGt.0.3.035610-processed-gene-0.2-mRNA-1 transcript offset:261 AED:0.25 eAED:0.25 QI:261|0.4|0.83|0.83|0.8|0.83|6|0|240
TTTGATTATTAATTATTTTTGTCTTTATTAA-------AAAATAATTTTGGTACAAACAATCGAATTAATAT-TAA
TTAAAGTTTTTATCAGCCTTATAAAATCTACGACACCGGCTTTTACCAATGTTTAGCG
AGTGATTCTCTCAACAGAAGTATCTCCAAATCAATATTCGTTGAATGTAAATGAACCCAAACACCTTATTCTCATT
CCTCCGGAAGAAGCTCCTGAATCAACTTTTGATCTCTACAGTAATGTATCTATGAATT
GCGAAGGAAGAAGTTATTTTCCGAATCAACCAATCATTGTTAATTGGATGTTTAAACATAAAGACTCATATACGAC
CATAACAAGAGATCACAAAATGGCTACAAGAATAATCACTGCATCAAACAGATCAAAG
GAAACTAATCTTGATTTGGTCAATATATTTTCTTACCTTACCATAAATGATATCCGCGAAGAAGATGGTGGAGTTT
ACAAATGTGTGATGACTCAAGGAAGTGTTGACGAAGAACAAGAATTTCTAGTAACTAT
AAACAATCAAAGTGAAAAGGAAATTGATGTATCCATTTTTTACCAAGATGATGACTTTGTAAGTGTTCGAGCAGCC
TTAGAAACAGTCAAGATTTTAGAGAATTACCAGTTTCGATGTTGGTTGTACGACCGGG
ATAAGACGTATGGTCAAGACGCCGGGAAGCCGACGAAATCGACAGAAAACCGTATAGGTCGTTATTATCAGTCAAA
ATATTCTGATTGTTCTCAATTTCGCATAGAAAGTTTCTATCAGCTGCCAATTTCTGTT
AACCGATGGCTGAAAAAAGAACTCAGTTTACAGTCTTTCTTTCAGCCATTTAGCTTTAATTGGGACCCTCAAAAAA
CCCCTAAAAACAAGAAAATGGTAGTATGGGTTGTTTCTTCCCTACCCTCAGCGGCGAT
TCGTAATGCAAAGAGAAGAATCAATGAACAATCTTCTCATGTATAA
The protein prediction for this transcript is ok:
>snap_masked-nGt.0.3.035610-processed-gene-0.2-mRNA-1 protein AED:0.25 eAED:0.25 QI:261|0.4|0.83|0.83|0.8|0.83|6|0|240
MNCEGRSYFPNQPIIVNWMFKHKDSYTTITRDHKMATRIITASNRSKETNLDLVNIFSYLTINDIREEDGGVYKCV
MTQGSVDEEQEFLVTINNQSEKEIDVSIFYQDDDFVSVRAALETVKILENYQFRCWLY
DRDKTYGQDAGKPTKSTENRIGRYYQSKYSDCSQFRIESFYQLPISVNRWLKKELSLQSFFQPFSFNWDPQKTPKN
KKMVVWVVSSLPSAAIRNAKRRINEQSSHV
Is this a known bug? I tried searching for "dash|hyphen" in the email list
but couldn't find anything else.
Best wishes,
- Sujai
PS: I pulled out just this one contig and ran MAKER on it. All the
.maker.output files are attached.