[maker-devel] ncRNA predictions
Simon Blanchoud
simon.blanchoud at otago.ac.nz
Tue Apr 5 18:15:14 MDT 2016
Hi all,
I have been annotating my de novo assembly of the Botrylloides leachi
genome ab initio with MAKER 2.31.8 for some time now (the third round is
running as I write). For this last round, I also wanted to get
predictions for non-coding RNAs, as mentioned in maker_opts.ctl. Now
that this seems to work properly, I thought I should share a few issues
I ran into.
First of all, both tRNAscan-SE and snoscan have very limited
documentation (which I know is none of your business), which makes
things a bit trickier.
Second, snoscan requires an rRNA file to work (not very obvious from
maker_opts.ctl), and it turns out that snoscan has a hard-coded limit of
100 sequences for that rRNA file (and the error message does not help
either). Overall, this was not exactly practical, as I'm working on a
de novo genome assembly and thus do not have these rRNA sequences.
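
As a rough sketch (not part of MAKER or snoscan), a candidate rRNA FASTA
file could be checked against that 100-sequence limit before starting a
run. The function names and the `limit` parameter are made up for
illustration, and whether snoscan accepts exactly 100 sequences or only
fewer is an assumption here.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def check_rrna_file(lines, limit=100):
    """Return (record_count, fits), where fits is True if the count is within limit.

    limit=100 mirrors the hard-coded snoscan limit described above; treating
    exactly 100 as acceptable is an assumption.
    """
    n = count_fasta_records(lines)
    return n, n <= limit
```

With a real file this could be called as
`check_rrna_file(open("rRNA.fa"))`, where `rRNA.fa` is a placeholder
name.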
What I did (and it seems to work okay) was to pull out the closest
sequences I could find in the Rfam database. By combining the
information from their website on the RF families, the taxonomy.txt
file and the corresponding fasta files (all from their FTP site), I
extracted (for a eukaryotic organism, that is) one complete sequence for
each subunit, i.e. RF00001, RF00002, RF01960 and RF02543. It turns out
that pooling more than one sequence per subunit makes snoscan extremely
slow to run. You might know a better way of getting such an rRNA file,
but this looks like a pretty sound approach to me, and it might deserve
a comment in maker_opts.ctl.
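
The selection step described above can be sketched roughly as follows.
This is only an illustration under stated assumptions: it takes the
FASTA text of each Rfam family as a string, uses a hypothetical `keep`
predicate on the header line in place of the actual taxonomy.txt lookup,
and picks the longest accepted record as a stand-in for "one complete
sequence". None of the function names come from Rfam or MAKER.

```python
def read_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pick_one_per_family(family_fastas, keep):
    """Return {family: (header, seq)} with one record per family.

    family_fastas maps a family ID (e.g. "RF00001") to its FASTA text.
    keep is a hypothetical predicate on the header, standing in for a
    taxonomy-based filter. Choosing the longest record is an assumption.
    """
    chosen = {}
    for family, text in family_fastas.items():
        candidates = [(h, s) for h, s in read_fasta(text) if keep(h)]
        if candidates:
            chosen[family] = max(candidates, key=lambda rec: len(rec[1]))
    return chosen


def to_fasta(chosen, width=60):
    """Format the chosen records as FASTA text, one record per family."""
    out = []
    for family in sorted(chosen):
        header, seq = chosen[family]
        out.append(">%s|%s" % (family, header))
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

Keeping a single record per family yields a four-sequence file for the
four families listed, well under the 100-sequence limit.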
Third, once snoscan was running, I ran into the same issue as
https://groups.google.com/d/topic/maker-devel/E6BKjXx2ra0/discussion
i.e. the parsing of the snoscan output crashed. After quite some
debugging, I found out that there is an issue in the creation of the
hash table containing the hits. As I am not sure how you originally
wanted to organize them, I made a wild guess and rewrote this section
of the Widget. It might not group the hits the way you intended, but at
least it now runs properly (and the output looks correct to me). I've
attached the Widget.
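
For readers unfamiliar with the pattern, grouping parsed hits in a hash
table can be sketched as below. This is not the attached Widget (which
is Perl) and does not reflect the real snoscan output format: the hit
fields `seq_id`, `start` and `end` are hypothetical, and grouping by
sequence ID is just one plausible choice.

```python
from collections import defaultdict


def group_hits(hits):
    """Group hit records by sequence ID and sort each group by position.

    Each hit is a dict with hypothetical keys 'seq_id', 'start' and 'end'.
    defaultdict creates the per-sequence list on first use, so no key has
    to be initialised by hand before appending to it.
    """
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["seq_id"]].append(hit)
    for seq_hits in grouped.values():
        seq_hits.sort(key=lambda h: (h["start"], h["end"]))
    return dict(grouped)
```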
Otherwise, thanks heaps for all the hard work, it's an amazing tool and
it works great!
Cheers,
Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: snoscan.pm
Type: text/x-perl-script
Size: 8128 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20160406/580b0bbb/attachment-0002.bin>