[maker-devel] Question regarding MAKER
Carson Holt
carson.holt at genetics.utah.edu
Thu Jul 13 11:00:18 MDT 2017
est2genome and protein2genome take BLAST hits, polish them with exonerate around splice sites, and then turn the alignment directly into a gene model. So if the alignment is partial, because the EST or mRNA-seq data do not span the entire transcript or the protein homology does not span the entire CDS, then the resulting model will be partial. But hundreds of models, even partial ones, are sufficient to train SNAP. After that I usually do just one round of bootstrap training (more than that and you get into the overtraining paradox).
So you can use just est2genome, just protein2genome, or both. You just need something to train SNAP with.
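As a rough illustration of the rounds described above, here is a minimal Python sketch that toggles the relevant keys in the text of a maker_opts.ctl file. The keys est2genome, protein2genome and snaphmm are MAKER control-file options; the helper name set_ctl_options, the HMM file names and the choice to switch est2genome off once SNAP is trained are assumptions made for the example, not something stated in this thread.

```python
import re

# Three lines in the style of maker_opts.ctl (shortened sample, not a full file).
SAMPLE_CTL = (
    "est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no\n"
    "protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no\n"
    "snaphmm= #SNAP HMM file\n"
)

# Round 1: build (possibly partial) models from EST alignments only.
ROUND1 = {"est2genome": 1, "protein2genome": 0, "snaphmm": ""}
# Round 2: predict with the SNAP HMM trained on round 1 (file name is hypothetical).
ROUND2 = {"est2genome": 0, "protein2genome": 0, "snaphmm": "round1.hmm"}
# Single bootstrap round: retrain SNAP on round 2 output, rerun, then stop.
ROUND3 = {"est2genome": 0, "protein2genome": 0, "snaphmm": "round2.hmm"}


def set_ctl_options(ctl_text, options):
    """Return ctl_text with each listed key=value line updated, keeping comments."""
    out, seen = [], set()
    for line in ctl_text.splitlines():
        m = re.match(r"^(\w+)=([^#]*)(#.*)?$", line)
        if m and m.group(1) in options:
            key, comment = m.group(1), m.group(3) or ""
            new_line = f"{key}={options[key]}"
            if comment:
                new_line += " " + comment
            out.append(new_line)
            seen.add(key)
        else:
            out.append(line)  # untouched option or comment line
    missing = set(options) - seen
    if missing:
        raise KeyError(f"options not found in control file: {sorted(missing)}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(set_ctl_options(SAMPLE_CTL, ROUND1))
```

In practice you would read your own maker_opts.ctl, apply one of the dictionaries, write the result back, and run MAKER; swapping protein2genome for est2genome (or enabling both) in ROUND1 works the same way.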
—Carson
On Jul 11, 2017, at 3:37 PM, Ghosh, Arnab <arnab.ghosh at ttu.edu> wrote:
Hi Carson,
My name is Arnab and I am from Texas Tech University.
I am using MAKER for gene annotation in a new genome assembly for a non-model organism. I have figured out most of this amazing piece of software but have two questions.
1. Is it okay to use only est2genome=1 and leave protein2genome=0 in the first round of running MAKER? Will it hurt my prediction and eventual gene annotation if I don't use the protein2genome option ALONGSIDE est2genome in the first round? I have a protein FASTA file for the same organism, but using the transcript FASTA file (same organism) AND the protein FASTA file for the whole genome (~2.2 GB in size) is just taking too long to finish.
2. I will of course run SNAP in the second round, which leads me to my second question: what do you consider an acceptable number of iterations for bootstrapping SNAP with MAKER?
Thanks and regards
Arnab