[maker-devel] Using multiple protein profiles as queries for prediction in intergenic regions?

Carson Holt carsonhh at gmail.com
Thu Sep 25 12:43:35 MDT 2014


When you say "gene-like structures", are you saying that you are looking for
pseudogenes and non-coding genes?

You can use the trnascan and snoscan options in the maker_opts.ctl file to
find some non-coding RNAs. You may just want to leave off all ab initio gene
predictors like SNAP and Augustus, as those will be looking for canonical
coding genes.
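
In practice that amounts to a handful of lines in maker_opts.ctl. Here is a
minimal sketch in Python, assuming the MAKER 2.x option names trna, snaphmm
and augustus_species (check them against your own control file). The helper
set_ctl_options is hypothetical and not part of MAKER. Snoscan also needs an
rRNA FASTA file, which I believe goes in snoscan_rrna, so it is left
untouched here.

```python
import re

# Assumed MAKER 2.x option names; verify against your own maker_opts.ctl.
OVERRIDES = {
    "trna": "1",             # run tRNAscan-SE
    "snaphmm": "",           # no SNAP HMM given, so SNAP is not run
    "augustus_species": "",  # no Augustus species given, so Augustus is not run
}

def set_ctl_options(ctl_text, overrides):
    """Rewrite 'key=value' lines named in overrides; keep trailing #comments."""
    out = []
    for line in ctl_text.splitlines():
        m = re.match(r"^(\w+)=([^#]*)(#.*)?$", line)
        if m and m.group(1) in overrides:
            comment = m.group(3) or ""
            line = f"{m.group(1)}={overrides[m.group(1)]} {comment}".rstrip()
        out.append(line)
    return "\n".join(out) + "\n"
```

Writing the returned text back to maker_opts.ctl before running maker should
switch on tRNAscan-SE and leave SNAP and Augustus off.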

If you first hard mask any coding genes, and then provide ESTs or assembled
mRNA-seq and proteins, you may be able to use the exonerate alignments
produced to identify potential gene-like structures. It might require a
little post-processing of the resulting GFF3 by you.
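
As a rough sketch of what that post-processing could look like (this is not
something shipped with MAKER): the Python below keeps only the exonerate-derived
hits from the GFF3 and merges overlapping ones on the same strand into
candidate loci. It assumes the exonerate alignments are labelled est2genome
and protein2genome in column 2 and that their sub-features have type
match_part. The function name candidate_loci is hypothetical.

```python
from collections import defaultdict

# Assumption: exonerate-derived features carry these values in GFF3 column 2.
EXONERATE_SOURCES = {"est2genome", "protein2genome"}

def candidate_loci(gff3_lines, sources=EXONERATE_SOURCES):
    """Merge overlapping top-level exonerate hits into (seqid, start, end, strand)."""
    hits = defaultdict(list)
    for line in gff3_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more features
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[1] not in sources:
            continue
        if cols[2] == "match_part":
            continue  # keep only the parent hit spanning the whole alignment
        hits[(cols[0], cols[6])].append((int(cols[3]), int(cols[4])))

    loci = []
    for (seqid, strand), spans in sorted(hits.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:  # overlaps the current locus, so extend it
                cur_end = max(cur_end, end)
            else:
                loci.append((seqid, cur_start, cur_end, strand))
                cur_start, cur_end = start, end
        loci.append((seqid, cur_start, cur_end, strand))
    return loci
```

You would call it as candidate_loci(open("your_output.gff")), where the file
name is a placeholder. Merging per strand also gives one consolidated
interval where hits from different protein queries overlap.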

Thanks,
Carson


From:  Anand K S Rao <aksrao at ucdavis.edu>
Date:  Thursday, September 25, 2014 at 10:18 AM
To:  <maker-devel at yandell-lab.org>
Subject:  [maker-devel] Using multiple protein profiles as queries for
prediction in intergenic regions?

Greetings!

I am exploring the use of MAKER-P. But I need your advice in determining if
MAKER-P is the best choice for me.

In the recent past, I've tried using the AUGUSTUS --profile option which
allows for user defined protein profiles to be used as query.

I am interested in predicting gene-like structures in intergenic regions
(I've masked away the genic regions predicted by the genome annotation
pipeline) in some orphan legume plant species. There is not much extrinsic
data such as ESTs or NGS data for these species, let alone data that might
map to the so-called intergenic regions; what little data exists has
already been used to predict 'genes'.

When I tried using --profile option of AUGUSTUS, I was not satisfied with
the frequency and magnitude of fusion genes. Additionally, there was no easy
way for me to consolidate gene-like structures that varied, but overlapped
when using different protein profiles as queries (one profile per Pfam HMM
within a 4 member clan).

Additionally, training all the orphan legume species is not an exciting
undertaking... because of time and computing resource requirements.

All this led me to consider MAKER-P as an option. Based on what I've
described above, do you think I should proceed with trying to use MAKER-P
for my purposes?

Thank you, in advance.

Sincerely,
Anand



-- 
Anand K.S. Rao  PhD candidate, Plant Biology <http://www-plb.ucdavis.edu/>
with a Designated Emphasis in Biotechnology <http://deb.ucdavis.edu/> ,
UC- Davis <http://ucdavis.edu/> ,  CA - 95616 USA   |   aksrao at ucdavis.edu
|   (530) 574-5134   |   LinkedIn <http://www.linkedin.com/in/anandksrao>
_________________________________________________________________________
  CTTATTGTTGAACTTOAATGGTGCTAATGATCCTCGTOTCTCCTGAACGT - translate THAT!
