[maker-devel] Fragmented annotation
Ole Kristian Tørresen
o.k.torresen at ibv.uio.no
Thu Jul 7 15:55:01 MDT 2016
Hi Daniel,
thank you for your prompt answer.
I've created a repeat library using this approach: https://github.com/uio-cels/Repeats, which I hope is quite thorough, and used that in the annotation.
I guess I could do
model_org=all
in addition.
I guess I could and should look a bit more at the annotation in a genome browser, but it is hard to diagnose something like this without knowing where to start.
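One possible starting point is to see how the gene models are distributed over AED, since the high quality set in this thread was chosen with an AED cutoff of 0.5. The following is only a minimal sketch, assuming the MAKER GFF3 output records the score as an `_AED=` attribute on each mRNA feature; the function names and the file name in the usage note are hypothetical.

```python
import re

# Match "_AED=<number>" as its own attribute, so "_eAED=" is not picked up.
AED_RE = re.compile(r"(?:^|;)_AED=([0-9.]+)")

def mrna_aed_scores(gff_lines):
    """Yield the AED score of every mRNA feature in an iterable of GFF3 lines."""
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if line.startswith("#") or not line.strip():
            continue  # skip comments, directives and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        match = AED_RE.search(cols[8])
        if match:
            yield float(match.group(1))

def count_below(scores, threshold=0.5):
    """Count scores strictly below the threshold."""
    return sum(1 for s in scores if s < threshold)
```

Usage could look like `scores = list(mrna_aed_scores(open("annotation.gff")))` followed by `count_below(scores, 0.5)`; repeating the count for several thresholds gives a rough cumulative AED curve.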
Thank you.
Ole
________________________________________
From: Daniel Ence <dence at genetics.utah.edu>
Sent: 07 July 2016 23:44
To: Ole Kristian Tørresen
Cc: maker-devel at yandell-lab.org
Subject: Re: [maker-devel] Fragmented annotation
Hi Ole, when I hear that a genome has too many genes annotated, one of the first things I think of is masking of repetitive elements in the genome. Unmasked repeats can contribute a large number of spurious gene annotations that originate from transposable elements. What did you use for repeat masking for your genome? Did you run MAKER on a pre-masked version of the assembly?
~Daniel
Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
> On Jul 7, 2016, at 3:29 PM, Ole Kristian Tørresen <o.k.torresen at ibv.uio.no> wrote:
>
> Hi all,
> I have annotated a fish genome (about 700 Mbp total, 90 kbp N50 contig, 270 kbp N50 scaffold), where I get 96576 gene models, 67917 with default filtering (quality_filter.pl -d) and 67917 with standard filtering (quality_filter.pl -s). I chose to report all genes with AED less than 0.5 (27437) as the high quality set.
>
> However, I wonder a bit. One thing is that 70k genes cannot be correct for this species (it is not polyploid), and the correct number of genes should be a bit more than 20k I think. I suspect that many of my genes are fragmented. How can I fix this? I have tried searching the forum, but cannot find any good answers. Are there any parameters I can adjust?
>
> I have used SwissProt/UniProt and a Trinity assembly of reads from several stages of embryo development as evidence. In the first pass I used SNAP trained with CEGMA, AUGUSTUS trained with BUSCO actinopterygii genes, and GeneMark-ES. In the second pass I used SNAP trained on the first pass annotation, AUGUSTUS trained on the transcriptome and first pass annotation, and GeneMark.
>
> Thank you.
>
> Ole