<div dir="ltr">Thanks! For training the gene predictors, I'll try to identify any sequences in my gold-standard set that have structural information (i.e., genes for which the genomic sequence was cloned) and use those. But I doubt there are enough of those to train, e.g., Augustus, so I'll probably have to use the bootstrap method as well. Is there a way to combine both?<div><br></div><div>For the BLAST-based annotation, if I use the entire UniProt/Swiss-Prot or GenBank FASTA sets as protein homology evidence, my gold standards are already included in those. I gather from these replies that this is not a problem.</div><div><br></div><div>However, there *are* public database sequences (predicted genes from an older annotation of this species) that I *do* want to exclude from the evidence, because we want to run MAKER as if this genome were 'new', never before annotated. Can I use something like the -negative_gilist option in blastp (an option that only works with GenBank sequences, I think) to omit previous genome project predictions from consideration? Or do I have to create a custom version of the large public database?</div><div><br></div><div class="gmail_extra"><div class="gmail_signature"><br></div>

</div></div>