[maker-devel] ncRNA predictions

Simon Blanchoud simon.blanchoud at otago.ac.nz
Tue Apr 5 18:15:14 MDT 2016


Hi all,

I have been annotating my de novo assembly of the Botrylloides 
leachi genome ab initio with MAKER 2.31.8 for some time now (the third 
round is running as I write). For this latest round, I also wanted 
predictions for non-coding RNAs, as mentioned in maker_opts.ctl. Now 
that this seems to work properly, I thought I should share a few issues 
I ran into.

First of all, both tRNAscan-SE and snoscan have very limited 
documentation (which I know is not your responsibility), which makes 
things a bit trickier.
Second, snoscan requires an rRNA file to work (not very obvious from 
maker_opts.ctl), and it turns out that snoscan has a hard-coded limit 
of 100 sequences for that rRNA file (and the error message does not 
make this clear). This was not exactly practical, as I am annotating a 
de novo assembly and thus do not have these rRNA sequences. 
What I did (and it seems to work okay) was to pull out the closest 
sequences I could find in the Rfam database. By combining the 
information from their website on the RF families, the taxonomy.txt 
file and the corresponding FASTA files (all from their FTP site), I 
extracted (for a eukaryotic organism, that is) one complete sequence for 
each subunit, i.e. RF00001, RF00002, RF01960 and RF02543. It turns out 
that pooling more than one sequence per subunit makes it extremely slow 
to run. You might know a better way of getting such an rRNA file, but 
this looks like a pretty sound approach to me and might deserve a 
comment in maker_opts.ctl.
Third, once snoscan was running, I ran into the same issue as 
https://groups.google.com/d/topic/maker-devel/E6BKjXx2ra0/discussion 
i.e. the parsing of the snoscan output crashed. After quite some 
debugging, I found that there is an issue in the creation of the 
hash table containing the hits. As I am not sure how you originally 
intended to organize them, I made a guess and rewrote this section 
of the Widget. It might not group the hits as you intended, but at least 
it now runs properly (and the output looks correct to me). I've 
attached the Widget.
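
The rRNA selection described in the second point can be sketched in 
Python roughly as follows. This is only an illustration under stated 
assumptions, not the procedure actually used: the function names are 
hypothetical, the input is one FASTA text per Rfam family, and a simple 
keyword match on the FASTA header stands in for the lookup against 
taxonomy.txt. The limit of 100 sequences is the one reported above; it 
has not been checked against the snoscan source.

```python
from typing import Dict, Iterable, Iterator, Optional, Tuple

# Limit reported in this message; treat it as an assumption.
SNOSCAN_MAX_SEQS = 100


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def pick_one(records: Iterable[Tuple[str, str]],
             keyword: str) -> Optional[Tuple[str, str]]:
    """Return the first record whose header contains keyword (case-insensitive)."""
    keyword = keyword.lower()
    for header, seq in records:
        if keyword in header.lower():
            return header, seq
    return None


def build_rrna_fasta(families: Dict[str, str], keyword: str) -> str:
    """Keep one matching sequence per family and return the pooled FASTA text.

    families maps a family ID (e.g. "RF00001") to that family's FASTA text.
    """
    entries = []
    for family_id, text in families.items():
        hit = pick_one(parse_fasta(text), keyword)
        if hit is None:
            raise ValueError(f"no sequence matching {keyword!r} in {family_id}")
        header, seq = hit
        # Prefix the family ID so each entry stays traceable to its source.
        entries.append(f">{family_id}|{header}\n{seq}")
    if len(entries) > SNOSCAN_MAX_SEQS:
        raise ValueError("too many sequences for the snoscan rRNA file")
    return "\n".join(entries) + "\n"
```

Keeping a single sequence per family follows the observation above that 
pooling more sequences slows the run considerably. The returned text 
would then be written to a file and given to MAKER as the rRNA file.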

Otherwise, thanks heaps for all the hard work. It's an amazing tool and 
it works great!

Cheers,
Simon
-------------- next part --------------
A non-text attachment was scrubbed...
Name: snoscan.pm
Type: text/x-perl-script
Size: 8128 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20160406/580b0bbb/attachment-0002.bin>

