[maker-devel] Fragmented annotation
Daniel Ence
dence at genetics.utah.edu
Thu Jul 7 15:48:21 MDT 2016
To address your suspicion that your genes are fragmented: can you check how many of the protein or transcript sequences begin with a canonical start codon and end with a stop codon? That might tell you whether you have “gene-parts” rather than full genes.
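A rough sketch of that check in Python, in case it helps. It assumes the FASTA holds coding sequences only (no UTRs), so each record should start with ATG and end with TAA, TAG or TGA. If your transcripts include UTRs, you would need to extract the CDS first. The function names are made up for illustration, and the protein check assumes the stop is written as a trailing "*", which may not be true for your file.

```python
from collections import Counter

# Canonical stop codons in the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def classify_cds(seq):
    """Label a coding sequence by whether it has a start and a stop codon."""
    seq = seq.upper()
    has_start = seq.startswith("ATG")
    has_stop = seq[-3:] in STOP_CODONS
    if has_start and has_stop:
        return "complete"
    if has_start:
        return "missing_stop"
    if has_stop:
        return "missing_start"
    return "missing_both"


def classify_protein(seq):
    """Same idea for proteins; assumes the stop is written as a trailing '*'."""
    seq = seq.upper()
    has_start = seq.startswith("M")
    has_stop = seq.endswith("*")
    if has_start and has_stop:
        return "complete"
    if has_start:
        return "missing_stop"
    if has_stop:
        return "missing_start"
    return "missing_both"


def summarize(records, classify=classify_cds):
    """Count how many records fall into each category."""
    return Counter(classify(seq) for _, seq in records)
```

To use it, open your FASTA file and pass the file handle through read_fasta() into summarize(). A large share of "missing_start" or "missing_stop" records would be consistent with partial gene models.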
~Daniel
Daniel Ence
Graduate Student
Eccles Institute of Human Genetics
University of Utah
15 North 2030 East, Room 2100
Salt Lake City, UT 84112-5330
> On Jul 7, 2016, at 3:44 PM, Daniel Ence <dence at genetics.utah.edu> wrote:
>
> Hi Ole, when I hear that a genome has too many genes annotated, one of the first things I think of is masking of repetitive elements in the genome. Unmasked repeats can contribute a large number of spurious gene annotations that originate from transposable elements. What did you use for repeat masking for your genome? Did you run MAKER on a pre-masked version of the assembly?
>
> ~Daniel
>
>> On Jul 7, 2016, at 3:29 PM, Ole Kristian Tørresen <o.k.torresen at ibv.uio.no> wrote:
>>
>> Hi all,
>> I have annotated a fish genome (about 700 Mbp total, 90 kbp N50 contig, 270 kbp N50 scaffold), where I get 96576 gene models, 67917 with default filtering (quality_filter.pl -d) and 67917 with standard filtering (quality_filter.pl -s). I chose to report all genes with AED less than 0.5 (27437) as the high quality set.
>>
>> However, I have some doubts. For one, 70k genes cannot be correct for this species (it is not polyploid); I think the correct number of genes should be a bit more than 20k. I suspect that many of my genes are fragmented. How can I fix this? I have tried searching the forum, but cannot find any good answers. Are there any parameters I can adjust?
>>
>> I have used SwissProt/UniProt and a Trinity assembly of reads from several stages of embryo development as evidence. In the first pass I used SNAP trained with CEGMA, AUGUSTUS trained with BUSCO Actinopterygii genes, and GeneMark-ES. In the second pass I used SNAP trained on the first-pass annotation, AUGUSTUS trained on the transcriptome and the first-pass annotation, and GeneMark.
>>
>> Thank you.
>>
>> Ole
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org