[maker-devel] Fragmented annotation
Ole Kristian Tørresen
o.k.torresen at ibv.uio.no
Thu Jul 7 15:29:54 MDT 2016
Hi all,
I have annotated a fish genome (about 700 Mbp in total, contig N50 of 90 kbp, scaffold N50 of 270 kbp) and get 96576 gene models: 67917 with default filtering (quality_filter.pl -d) and 67917 with standard filtering (quality_filter.pl -s). I chose to report all genes with an AED below 0.5 (27437) as the high-quality set.
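For reference, the AED cutoff can be sketched roughly as below. This is only a minimal sketch, assuming the MAKER GFF3 stores the score as an `_AED` attribute in column 9 of each mRNA feature. The function name `count_transcripts_below_aed` and the sample lines are made up for illustration and are not part of MAKER or quality_filter.pl.

```python
import re

# Assumption: the AED score appears as "_AED=<number>" in the attribute column.
# The leading ";" (or start of string) keeps "_eAED" from being matched by mistake.
AED_PATTERN = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def count_transcripts_below_aed(gff_lines, cutoff=0.5):
    """Return (mRNA features carrying an AED, mRNA features with AED < cutoff)."""
    total = 0
    passing = 0
    for line in gff_lines:
        # Skip comments, directives and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Only well-formed mRNA rows are considered; sequence lines are ignored.
        if len(cols) < 9 or cols[2] != "mRNA":
            continue
        match = AED_PATTERN.search(cols[8])
        if match is None:
            continue
        total += 1
        if float(match.group(1)) < cutoff:
            passing += 1
    return total, passing


# Hypothetical sample records, not taken from the real annotation.
example = [
    "##gff-version 3",
    "scf1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1;Name=g1",
    "scf1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-RA;Parent=g1;_AED=0.12;_eAED=0.12",
    "scf1\tmaker\tmRNA\t2000\t2600\t.\t-\t.\tID=g2-RA;Parent=g2;_AED=0.75;_eAED=0.80",
]
print(count_transcripts_below_aed(example))  # prints (2, 1)
```

Note that this counts transcripts (mRNA features), not genes, so the numbers differ from gene counts wherever a gene has more than one transcript.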
However, I have some doubts. About 70k genes cannot be correct for this species (it is not polyploid); I think the correct number should be a bit more than 20k. I suspect that many of my gene models are fragmented. How can I fix this? I have tried searching the forum, but cannot find any good answers. Are there any parameters I can adjust?
As evidence, I used SwissProt/UniProt and a Trinity assembly of reads from several stages of embryo development. In the first pass, I used SNAP with CEGMA, AUGUSTUS trained with BUSCO Actinopterygii genes, and GeneMark-ES. In the second pass, I used SNAP trained on the first-pass annotation, AUGUSTUS trained on the transcriptome together with the first-pass annotation, and GeneMark.
Thank you.
Ole