[maker-devel] evidence for MAKER vs evidence to train gene finders

Carson Holt carsonhh at gmail.com
Tue Sep 20 14:15:21 MDT 2016


You would need to create a custom database without the sequences you wish to exclude.
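
As a rough illustration of what building such a custom database could look like, here is a minimal Python sketch that copies a protein FASTA file while dropping the records you list. Everything in it is an assumption for illustration, not part of MAKER or BLAST: the function names, the file names, and the rule that a record is excluded when its first header token (or any "|"-separated field of that token, such as a UniProt accession) appears in your exclusion list.

```python
def header_ids(header_line):
    """Return candidate identifiers from a FASTA header line.

    Assumption: the ID is the first whitespace-delimited token after '>',
    and pipe-separated fields inside it (e.g. sp|P12345|NAME) also count.
    """
    fields = header_line[1:].split()
    if not fields:
        return set()
    token = fields[0]
    return {token, *token.split("|")}


def filter_fasta(lines, exclude_ids):
    """Yield FASTA lines, skipping every record whose ID is in exclude_ids."""
    keep = True
    for line in lines:
        if line.startswith(">"):
            # Decide once per record; sequence lines follow the header's fate.
            keep = header_ids(line).isdisjoint(exclude_ids)
        if keep:
            yield line


def write_filtered(fasta_path, ids_path, out_path):
    """Read IDs (one per line) from ids_path and write the filtered FASTA."""
    with open(ids_path) as handle:
        exclude_ids = {line.strip() for line in handle if line.strip()}
    with open(fasta_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_fasta(src, exclude_ids))
```

Calling something like write_filtered("uniprot_sprot.fasta", "exclude_ids.txt", "filtered.fasta") (hypothetical file names) would produce a reduced FASTA file that you could then give to MAKER as protein evidence, for example through the protein= line in maker_opts.ctl. Check the header format of your own database first, since the ID-matching rule above may need adjusting.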

—Carson

> On Sep 20, 2016, at 1:28 PM, Steven Sullivan <sullis02 at nyu.edu> wrote:
> 
> Thanks! So, I think for training the gene predictors, I'll try to identify any sequences in my gold-standard set that have structural information, i.e. genes for which the genomic sequence was cloned, and use those. But I doubt there are enough of those to train e.g. Augustus, so I'll probably have to use the bootstrap method as well. Is there a way to combine both?
> 
> For the BLAST-based annotation, if I use entire Uniprot/Swissprot or Genbank FASTA sets as protein homology evidence, my gold standards are already included in those. I gather from these replies that that's not a problem.
> 
> However, there *are* public database sequences (predicted genes from an older annotation of this species) that I *do* want to exclude from evidence. (Because we want to run MAKER as if this genome was 'new', never before annotated.) Can I use something like the -negative_gilist option in blastp to omit previous genome project predictions from consideration? (An option that only works with Genbank sequences, I think.) Or do I have to create a custom version of the large public database?