[maker-devel] evidence for MAKER vs evidence to train gene finders
Carson Holt
carsonhh at gmail.com
Tue Sep 20 14:15:21 MDT 2016
You would need to create a custom database without the sequences you wish to exclude.
—Carson
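
A minimal sketch of building such a custom database, assuming the public set is a plain FASTA file and the sequences to exclude are listed by ID. The function names (`filter_fasta`, `write_filtered_fasta`) and the file layout are hypothetical and not part of MAKER or BLAST. The record ID is taken to be the first whitespace-delimited token after ">", so for UniProt-style headers the exclusion list would need the full token (e.g. `sp|P12345|NAME_HUMAN`).

```python
def filter_fasta(lines, excluded_ids):
    """Yield FASTA lines, skipping every record whose ID is in excluded_ids.

    The ID is assumed to be the first whitespace-delimited token after '>'.
    """
    keep = True
    for line in lines:
        if line.startswith(">"):
            header = line[1:].split()
            record_id = header[0] if header else ""
            keep = record_id not in excluded_ids
        if keep:
            # Header and sequence lines of retained records pass through unchanged.
            yield line


def write_filtered_fasta(in_path, out_path, exclude_path):
    """Write a copy of in_path without the records listed in exclude_path.

    exclude_path is assumed to hold one sequence ID per line.
    """
    with open(exclude_path) as fh:
        excluded = {row.strip() for row in fh if row.strip()}
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_fasta(src, excluded))
```

The filtered FASTA file could then presumably be supplied as protein evidence in place of the full public set, for example through the `protein=` entry in `maker_opts.ctl`.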
> On Sep 20, 2016, at 1:28 PM, Steven Sullivan <sullis02 at nyu.edu> wrote:
>
> Thanks! So, I think for training the gene predictors, I'll try to identify any sequences in my gold-standard set that have structural information (i.e. genes for which the genomic sequence was cloned) and use those. But I doubt there are enough of those to train e.g. Augustus, so I'll probably have to use the bootstrap method as well. Is there a way to combine both?
>
> For the BLAST-based annotation, if I use the entire Uniprot/Swissprot or Genbank FASTA sets as protein homology evidence, my gold standards are already included in those. I gather from these replies that that's not a problem.
>
> However, there *are* public database sequences (predicted genes from an older annotation of this species) that I *do* want to exclude from evidence, because we want to run MAKER as if this genome were 'new', never before annotated. Can I use something like the -negative_gilist option in blastp to omit previous genome project predictions from consideration? (That option only works with Genbank sequences, I think.) Or do I have to create a custom version of the large public database?