[maker-devel] Missing genes in lift-over with est2genome
Carson Holt
carsonhh at gmail.com
Thu Apr 23 11:43:30 MDT 2020
The est2genome algorithm has percent cutoffs that you can set in the maker_bopts.ctl file. Additionally, MAKER will report the alignment but will not produce a gene model if it can’t translate through the est2genome alignment (i.e. stop codons in the assembly). I believe the cutoff is 50%. If you add est_forward=1 to the maker_opts.ctl file, names will be copied from the alignment source, and the score column of the GFF3 will be the percent match to the original transcript.
—Carson
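
[Note: a rough sketch of how a percent cutoff of the kind mentioned above would apply to the example in the quoted message below (transcript length 1077, aligned length 855, 100% identity). The function name, the parameter names min_coverage and min_identity, and the threshold values are all hypothetical and chosen only for illustration; they are not taken from MAKER, so check the actual option names and values in your own maker_bopts.ctl.]

```python
def passes_cutoffs(query_len, aligned_len, identity, min_coverage, min_identity):
    """Return (coverage, passed) for one alignment.

    coverage is the fraction of the query transcript covered by the alignment.
    min_coverage and min_identity are hypothetical thresholds (fractions 0-1),
    standing in for whatever percent cutoffs are configured.
    """
    coverage = aligned_len / query_len
    passed = coverage >= min_coverage and identity >= min_identity
    return coverage, passed


# Numbers from the quoted message: 855 of 1077 bases aligned at 100% identity.
# The 0.8 / 0.85 thresholds are illustrative assumptions, not confirmed defaults.
coverage, passed = passes_cutoffs(1077, 855, 1.0, min_coverage=0.8, min_identity=0.85)
print(f"coverage = {coverage:.3f}, passes = {passed}")
```

With these illustrative thresholds the alignment covers about 79% of the transcript and would be rejected on coverage alone, despite the perfect identity; with a coverage threshold of 0.5 it would pass. Whether this is what happened in the reported case depends on the values actually set in the control files.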
> On Apr 21, 2020, at 7:08 AM, Lior Glick <liorglic at mail.tau.ac.il> wrote:
>
> Hello,
> I am using MAKER to annotate a plant genome assembly. A high-quality reference genome and annotation exists for another variety of the same species, so my first step is lifting over reference genes to my genome. I do this by setting est2genome = 1 and providing MAKER with the reference cDNA (transcriptome). No other evidence is provided and no prediction is performed. Repeat masking is done using the reference repeats library.
> When checking the results, I found that many reference genes are missing from the lift-over result. However, if I blast the sequences of these genes myself, I get good matches. I even see these matches when I look at the blast results buried in the MAKER data_store.
> For example, a transcript of length 1077 got a match of length 855 with 100% identity and no gaps. The bitscore was 1709 and the E-value 0. This looks like a pretty good match, but it is not found in the final MAKER results (gff/fasta).
> Why is this happening? Are there some cutoffs that are not satisfied? If so, what are they and how can they be configured?
>
> Thanks,
> Lior