[maker-devel] tRNAscan and map_gff_ids
Carson Holt
carsonhh at gmail.com
Tue May 20 13:43:48 MDT 2014
Thanks. tRNAscan support is new enough that there are still issues like
this that we need to find and fix. MAKER tries to use the codon name
supplied by tRNAscan, and it looks like the codon here is 'Undet_???'. I
don't know why tRNAscan reports that. We currently don't filter tRNAscan
results at all (i.e. we keep everything). A hit without a determinable
codon might be something we simply want to filter out. At the very least
I should change the codon to NNN instead of ???, to match the standard
nucleotide ambiguity code used in FASTA format.
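
A minimal sketch of that substitution, assuming a hypothetical helper
named normalize_codon (this is not existing MAKER code, just an
illustration of the idea):

```perl
use strict;
use warnings;

# Hypothetical helper, not part of MAKER: replace each '?' that tRNAscan
# reports for an undetermined codon with the ambiguity code 'N'.
sub normalize_codon {
    my ($codon) = @_;
    $codon =~ tr/?/N/;    # in tr///, '?' is a literal character
    return $codon;
}

print normalize_codon('Undet_???'), "\n";    # prints Undet_NNN
```

With no '?' left in the name, the resulting ID contains no regex
quantifier characters.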
--Carson
On 5/20/14, 1:17 PM, "Fields, Christopher J" <cjfields at illinois.edu> wrote:
>I found a problem with some tRNAscan output using MAKER 2.31.5. I had a
>full MAKER data set (run initially using MAKER 2.31.5) that I mapped IDs
>for. This was then run as follows, with the requisite error:
>
>-system-specific-4.1$ map_gff_ids id.map Zalbi.all.gff3
>Nested quantifiers in regex; marked by <-- HERE in
>m/trnascan-KB913038.1-noncoding-Undet_??? <-- HERE -gene-79.0/ at
>/home/groups/hpcbio/apps/maker/maker-2.31.5/bin/map_gff_ids line 111,
><$IN> line 3067590.
>
>The problematic lines:
>
>----------------------------------------------
>-system-specific-4.1$ grep "???" Zalbi.all.gff3
>KB913038.1 maker gene 23847890 23847958 . - . ID=trnascan-KB913038.1-noncoding-Undet_???-gene-79.0;Name=trnascan-KB913038.1-noncoding-Undet_???-gene-79.0
>KB913038.1 maker tRNA 23847890 23847958 . - . ID=trnascan-KB913038.1-noncoding-Undet_???-gene-79.0-tRNA-1;Parent=trnascan-KB913038.1-noncoding-Undet_???-gene-79.0;Name=trnascan-KB913038.1-noncoding-Undet_???-gene-79.0-tRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|70|0
>KB913038.1 maker exon 23847890 23847958 . - . ID=trnascan-KB913038.1-noncoding-Undet_???-gene-79.0-tRNA-1:exon:2193;Parent=trnascan-KB913038.1-noncoding-Undet_???-gene-79.0-tRNA-1
>KB913039.1 maker gene 21710152 21710224 . - . ID=trnascan-KB913039.1-noncoding-Undet_???-gene-72.0;Name=trnascan-KB913039.1-noncoding-Undet_???-gene-72.0
>KB913039.1 maker tRNA 21710152 21710224 . - . ID=trnascan-KB913039.1-noncoding-Undet_???-gene-72.0-tRNA-1;Parent=trnascan-KB913039.1-noncoding-Undet_???-gene-72.0;Name=trnascan-KB913039.1-noncoding-Undet_???-gene-72.0-tRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|74|0
>KB913039.1 maker exon 21710152 21710224 . - . ID=trnascan-KB913039.1-noncoding-Undet_???-gene-72.0-tRNA-1:exon:4036;Parent=trnascan-KB913039.1-noncoding-Undet_???-gene-72.0-tRNA-1
>----------------------------------------------
>
>I managed to get it going by using the following modifications (regex
>quotemeta) in map_gff_ids (lines 107-112):
>
>    for my $id (@map_ids) {
>        # Only if the value (or the portion preceding
>        # the first colon) is equal to the map key.
>        next unless ($value eq $id || $value =~ /^\Q$id\E:/);
>        $value =~ s/\Q$id\E/$map{$id}/
>            unless ($tag eq 'Name'
>                && $id !~ /\-gene\-\d+\.\d+|^CG\:|^....\:|^[^\:]+\:temp\d+\:/);
>    }
>
>I’m guessing there may be a similar problem with map_fasta_ids?
>
>chris
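
The failure reported in the quoted message can be reproduced outside
map_gff_ids. The following is a minimal Perl sketch, not the script's
actual code: the replacement ID NEW_ID and the variable names are
assumptions made for illustration. It shows the ID failing to compile
when interpolated into a pattern unescaped, and the same ID matching
literally once wrapped in \Q...\E.

```perl
use strict;
use warnings;

# An ID taken from the GFF3 lines in the quoted message.
my $id    = 'trnascan-KB913038.1-noncoding-Undet_???-gene-79.0';
my $value = "$id-tRNA-1";

# Unescaped interpolation: '???' is parsed as stacked quantifiers, so the
# pattern fails to compile and the match dies at run time.
my $ok = eval { my $hit = $value =~ /$id/; 1 };
print "unescaped: ", ($ok ? "compiled" : "failed"), "\n";

# \Q...\E (quotemeta) makes every metacharacter in $id literal.
# NEW_ID is a made-up replacement value.
(my $mapped = $value) =~ s/\Q$id\E/NEW_ID/;
print "escaped: $mapped\n";
```

Escaping also makes the '.' characters in the ID match only a literal
dot rather than any character.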