[maker-devel] est2genome wrong strand
Carson Holt
carsonhh at gmail.com
Fri Mar 20 08:54:28 MDT 2015
Hi Brian,
Multi-exon ESTs are stranded by their splice sites, which can only work on one strand. Single-exon ESTs, on the other hand, are stranded by their open reading frame. The standard chemistry used in most sequencing does not produce strand-specific sequence. There are technologies that do, but unless you used one of those, re-stranding single-exon ESTs based on ORF is the best way to figure out where single-exon alignments should go (not a completely reliable method, but it works most of the time). I believe Trinity tries to determine strand the same way (but it is unreliable there too).

For example, since your alignment is not an exact match to the genomic sequence (single bp deletion in the alignment), the best open reading frame in the transcript is not the same as the best open reading frame in the genomic sequence (so one of them likely contains an error). MAKER, since it is annotating the genome, logically re-strands the alignment to the genome, while Trinity, being unaware of the genome, strands it to the transcript.

Because single-exon alignments are very unreliable, they are ignored in MAKER by default. They will not be used as hints for gene predictors, and they can only be used to support a gene if there is also protein evidence or a single-exon ab initio prediction at the same location to support it (even then, this will only happen if you set single_exon=1 in the control files).
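
As a rough illustration of the ORF-based stranding described above, here is a minimal Python sketch. It is not MAKER's or Trinity's actual code: the function names (longest_orf, guess_strand) are hypothetical, and "best ORF" is simplified to the longest stop-free run of codons in any of the three frames, with no start codon requirement.

```python
# Illustrative sketch only; NOT the code MAKER or Trinity actually runs.
# Assumption: "best ORF" = longest stop-free run of codons in any frame.

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf(seq):
    """Length in nt of the longest stop-free codon run in the 3 forward frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        run = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                run = 0          # a stop codon ends the current open frame
            else:
                run += 3
                best = max(best, run)
    return best


def guess_strand(seq):
    """Guess '+', '-' or '.' (undecided) by comparing ORF length on both strands."""
    forward = longest_orf(seq)
    reverse = longest_orf(reverse_complement(seq))
    if forward == reverse:
        return "."
    return "+" if forward > reverse else "-"


if __name__ == "__main__":
    # Toy sequence: open in every forward frame, stops in every reverse frame.
    toy = "TCAC" * 15
    print(guess_strand(toy))                      # +
    print(guess_strand(reverse_complement(toy)))  # -
```

Because the call depends only on which strand carries the longer open frame, a single-base indel that shifts the frame can change the answer. That is consistent with the same transcript being stranded one way on its own and the other way once it is judged against the genomic sequence.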
—Carson
On Mar 20, 2015, at 7:17 AM, Mack, Brian <Brian.Mack at ARS.USDA.GOV> wrote:
> Hi, I've noticed what seems to be a flipping of the strand of some of my transcripts from est2genome. I assembled my directional RNA-seq reads using Trinity. I've copied an example below. The blastn within MAKER shows the transcript aligning on the positive strand, as does blastn against my genome in SequenceServer. But est2genome shows the strand to be negative. I've noticed this for quite a few transcripts while examining them in WebApollo. Any ideas what might be causing this?
>
> Thanks,
> Brian
>
> Query= comp17103_c1_seq3 len=612 path=[8488307:0-177 8492039:178-496
> >contig_69
> Length=108040
>
> Score = 1043 bits (1156), Expect = 0.0
> Identities = 589/592 (99%), Gaps = 3/592 (1%)
> Strand=Plus/Plus
>
> Query  24      TCTTTATTCTTTTATTTCCACTTGAGCAATTATTTCCGGGTCAACCTATTCGGTCGTTCT  83
>                ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct  105546  TCTTTATTCTTTTATTTCCACTTGAGCAATTATTTCCGGGTCAACCTATTCGGTCGTTCT  105605
>
> Query  84      CTCCGTTGAGCC-TTCCCTCCCCAAGTAATATTGGAAAGTCGTTCTCTCGCTCATAATTT  142
>                |||||||||||| |||||||||||||||||||||||||||||||||||||||||||||||
> Sbjct  105606  CTCCGTTGAGCCCTTCCCTCCCCAAGTAATATTGGAAAGTCGTTCTCTCGCTCATAATTT  105665
>
>
>
> 69 blastn expressed_sequence_match 105546 106137 559 + . ID=69:hit:182380:3.2.0.0;Name=comp17103_c1_seq3
> 69 blastn match_part 105546 106137 559 + . ID=69:hsp:377369:3.2.0.0;Parent=69:hit:182380:3.2.0.0;Target=comp17103_c1_seq3 24 612 +;Gap=M70 D1 M85 D1 M75 D1 M359
> 69 est2genome expressed_sequence_match 105546 106137 2909 - . ID=69:hit:182644:3.2.0.0;Name=comp17103_c1_seq3
> 69 est2genome match_part 105546 106137 2909 - . ID=69:hsp:377775:3.2.0.0;Parent=69:hit:182644:3.2.0.0;Target=comp17103_c1_seq3 24 612 -;Gap=M70 D1 M85 D1 M75 D1 M359