[maker-devel] A way to compare 2 annotation runs?

Barry Moore bmoore at genetics.utah.edu
Tue Apr 19 15:36:35 MDT 2016


The Sequence Ontology provides some tools for this:

SOBAcl has some pre-configured reports/graphs with some flexibility to modify their content/layout.
https://github.com/The-Sequence-Ontology/SOBA

This simple example provides a table for two GFF3 files of the count of feature types:


SOBAcl --columns file --rows type --data type --data_type count   \
  data/dmel-all-r5.30_0001000.gff data/dmel-all-r5.30_0010000.gff

More complex examples are available in the test file SOBA/t/sobacl_test.sh


The GAL library is a Perl library that works well with MAKER output and other valid GFF3 documents. It has some scripts that provide metrics along the lines of what you’re looking for, but it is primarily a programming library that makes it easy to roll your own:
https://github.com/The-Sequence-Ontology/GAL

If you’re OK with a little bit of Perl code, the splice complexity metrics described here (http://www.ncbi.nlm.nih.gov/pubmed/19236712) are easy to produce by modifying the synopsis code in the README a bit:

use GAL::Annotation;

my $annot    = GAL::Annotation->new(qw(file.gff file.fasta));
my $features = $annot->features;

my $genes = $features->search({type => 'gene'});
while (my $gene = $genes->next) {
    print $gene->feature_id        . "\t";
    print $gene->splice_complexity . "\n";
}
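If you’d rather not pull in Perl at all, the same kind of feature-type count table that SOBAcl produces can be sketched in a few lines of generic Python. This is not part of SOBA or GAL, just a minimal stand-alone sketch that relies only on the GFF3 format itself (tab-separated columns, column 3 = feature type, lines starting with "#" are comments/directives):

```python
import sys
from collections import Counter

def type_counts(gff3_path):
    """Count feature types (column 3) in a GFF3 file, skipping comments."""
    counts = Counter()
    with open(gff3_path) as fh:
        for line in fh:
            if line.startswith('#') or not line.strip():
                continue  # skip directives, comments, blank lines
            fields = line.rstrip('\n').split('\t')
            if len(fields) >= 3:
                counts[fields[2]] += 1
    return counts

if __name__ == '__main__':
    # Usage: python gff3_type_counts.py run1.gff run2.gff
    run_a, run_b = (type_counts(p) for p in sys.argv[1:3])
    for ftype in sorted(set(run_a) | set(run_b)):
        print(f"{ftype}\t{run_a[ftype]}\t{run_b[ftype]}")
```

A large shift in the gene or mRNA counts between two MAKER iterations is a quick first signal of where the annotation changed, before digging in with SOBAcl or GAL.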


Hope that helps,

Barry



On Apr 19, 2016, at 9:08 AM, Carson Holt <carsonhh at gmail.com> wrote:

I’m going to ask Michael Campbell to answer this. He wrote a protocols paper that will help.

—Carson



On Apr 19, 2016, at 6:08 AM, Florian <fdolze at students.uni-mainz.de> wrote:


Hello All,

We ran MAKER on a newly assembled genome for 3 iterations, since 2 seems to be the recommended standard, and while on holiday I just ran it a third time. Now I want to compare the results of the iterations to see where the annotation (hopefully) improved or changed, but I can't really come up with a clever way to do this.

I reckon this must be an often-solved problem, though I couldn't find a solution except an older entry in this mailing list, and that wasn't helpful.


So how are people assessing quality of a maker run? How do you say one run was 'better' than another?


best regards & thanks for your input,
Florian

_______________________________________________
maker-devel mailing list
maker-devel at yandell-lab.org
http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org




