[maker-devel] A way to compare 2 annotation runs?
Cook, Malcolm
MEC at stowers.org
Tue Apr 19 15:44:04 MDT 2016
Just a quick thought
The smallest summary of what you're after might be the Jaccard distance between your annotations, as computed by bedtools: http://bedtools.readthedocs.org/en/latest/content/tools/jaccard.html
??
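To make the suggestion concrete, here is a minimal Python sketch of the statistic that `bedtools jaccard` reports: bases covered by the intersection of two interval sets divided by bases covered by their union. The interval lists are hypothetical stand-ins for gene spans exported from two MAKER runs as BED; in practice you would sort both BED files and run the bedtools command instead.

```python
def covered_bases(intervals):
    """Total bases covered by a set of half-open (start, end) intervals,
    merging overlaps so shared bases are counted once."""
    total, prev_end = 0, None
    for start, end in sorted(intervals):
        if prev_end is None or start > prev_end:
            total += end - start
            prev_end = end
        elif end > prev_end:
            total += end - prev_end
            prev_end = end
    return total

def jaccard(a, b):
    """Jaccard index of two interval sets on the same sequence:
    covered(intersection) / covered(union)."""
    inter = []
    for s1, e1 in a:
        for s2, e2 in b:
            s, e = max(s1, s2), min(e1, e2)
            if s < e:
                inter.append((s, e))
    i = covered_bases(inter)
    u = covered_bases(a) + covered_bases(b) - i
    return i / u if u else 0.0

# Hypothetical gene spans from two annotation runs.
run1 = [(100, 200), (300, 400)]
run2 = [(150, 250), (300, 350)]
print(jaccard(run1, run2))  # identical annotations would give 1.0
```

A value near 1.0 means the two runs annotate nearly the same bases; a low value flags large shifts worth inspecting.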
From: maker-devel [mailto:maker-devel-bounces at yandell-lab.org] On Behalf Of Barry Moore
Sent: Tuesday, April 19, 2016 4:37 PM
To: Florian <fdolze at students.uni-mainz.de>; maker-devel <maker-devel at yandell-lab.org>
Cc: Campbell, Michael <mcampbel at cshl.edu>
Subject: Re: [maker-devel] A way to compare 2 annotation runs?
The Sequence Ontology provides some tools for this:
SOBAcl has some pre-configured reports/graphs with some flexibility to modify their content/layout.
https://github.com/The-Sequence-Ontology/SOBA
This simple example provides a table for two GFF3 files of the count of feature types:
SOBAcl --columns file --rows type --data type --data_type count \
data/dmel-all-r5.30_0001000.gff data/dmel-all-r5.30_0010000.gff
More complex examples are available in the test file SOBA/t/sobacl_test.sh
The GAL library is a Perl library that works well with MAKER output and other valid GFF3 documents. It has some scripts that would provide metrics along the lines of what you're looking for, but it is primarily a programming library to make it easy to roll your own:
https://github.com/The-Sequence-Ontology/GAL
If you're OK with a little bit of Perl code, the splice complexity metrics described here (http://www.ncbi.nlm.nih.gov/pubmed/19236712) are easy to produce by modifying the synopsis code in the README a bit:
use GAL::Annotation;

my $annot    = GAL::Annotation->new('file.gff', 'file.fasta');
my $features = $annot->features;
my $genes    = $features->search({type => 'gene'});
while (my $gene = $genes->next) {
    print $gene->feature_id . "\t";
    print $gene->splice_complexity . "\n";
}
Hope that helps,
Barry
On Apr 19, 2016, at 9:08 AM, Carson Holt <carsonhh at gmail.com> wrote:
I’m going to ask Michael Campbell to answer this. He wrote a protocols paper that will help.
—Carson
On Apr 19, 2016, at 6:08 AM, Florian <fdolze at students.uni-mainz.de> wrote:
Hello All,
We ran MAKER on a newly assembled genome for 3 iterations, since 2 seems to be the recommended standard, and while on holiday I just ran it a third time. Now I want to compare the results of the iterations to see where the annotation (hopefully) improved/changed, but I can't really come up with a clever way to do this.
I reckon this has to be an often-solved problem, though I couldn't find a solution except an older entry in this mailing list, and that wasn't helpful.
So how are people assessing quality of a maker run? How do you say one run was 'better' than another?
best regards & thanks for your input,
Florian
_______________________________________________
maker-devel mailing list
maker-devel at yandell-lab.org
http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org