[maker-devel] AED scores from MAKER pipeline - deterministic or not?

Carson Holt carsonhh at gmail.com
Mon Aug 31 09:08:34 MDT 2015


I would have to see the actual GFF3 files (the full files, including the FASTA section at the end). Please send me both GFF3 files and the coordinates of the gene in question. My first guess is that you had the single_exon= filter set to different values on each run. The gene in question is an unspliced, single-exon gene (based on the QI), your primary piece of evidence appears to be a single-exon EST, and the only value that changes in the QI is the exon overlap. Single-exon evidence is ignored by default for the AED calculation unless you have single_exon set to 1.
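
To make the QI comparison concrete, here is a minimal Python sketch that splits the _QI attribute on "|" and reports which positions differ between two runs. The helper names parse_attributes and qi_differences are hypothetical and are not part of MAKER; the two attribute strings are the ones quoted in the original report below.

```python
def parse_attributes(column9):
    """Turn a GFF3 attribute string (column 9) into a dict of key -> value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key] = value
    return attrs


def qi_differences(attrs_run1, attrs_run2):
    """Return (position, value_run1, value_run2) for each QI field that differs.

    Positions are 1-based, so position 4 is the 4th column of the QI.
    """
    qi1 = attrs_run1["_QI"].split("|")
    qi2 = attrs_run2["_QI"].split("|")
    return [(i, a, b) for i, (a, b) in enumerate(zip(qi1, qi2), start=1) if a != b]


# Attribute strings copied from the two runs in the report.
run1 = parse_attributes("_AED=0.00;_eAED=-0.00;_QI=0|-1|0|1|-1|0|1|0|344")
run2 = parse_attributes("_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|0|344")

print(qi_differences(run1, run2))  # [(4, '1', '0')]
```

Only the 4th QI column differs between the two runs, which is the exon-overlap value mentioned above.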

Thanks,
Carson

> On Aug 31, 2015, at 8:47 AM, Cheng, Chia-Yi <ccheng at jcvi.org> wrote:
> 
> Hello MAKER team,
> 
> We at JCVI have been using MAKER (2.31.8) to calculate the AED of Arabidopsis gene models. We provided the annotation set as ‘model_gff’, with the evidence files in ‘protein_gff’ and ‘est_gff’. All other settings were left at their defaults. One issue I’ve noticed is that the AED scores do not seem to be deterministic. When I compared the AED scores from two runs using identical control files, ~1,000 (out of 35,385) gene models had different AED scores. The difference between the two sets of AED scores ranged from 0.01 to 1.00.
> 
> I looked into several gene models with larger differences, e.g. AED = 0.00 in run 1 and AED = 1.00 in run 2, and noticed a disagreement in the QI:
> 
> Run 1: _AED=0.00;_eAED=-0.00;_QI=0|-1|0|1|-1|0|1|0|344
> Run 2: _AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|0|344
> 
> The discrepancy in the 4th column seems to suggest that the evidence file was not used properly in run 2. I’m not sure what may have caused this, as both runs used the same input. A snapshot of the evidence files is pasted at the end of this email in case it is needed.
> 
> Please let me know if more info is needed. Any help is appreciated. Thank you.
> 
> Chia-Yi
> 
> 
> RNA-seq evidence file:
> Chr1	assembler-aerial2_pasa	cDNA_match	3624	5927	.	+	.	ID=aerial2_align_161343;Target=asmbl_1 1082 1234 +%2Casmbl_1 692 1081 +%2Casmbl_1 572 691 +%2Casmbl_1 1 290 +%2Casmbl_1 291 571 +%2Casmbl_1 1235 1723 +
> Chr1	assembler-aerial2_pasa	match_part	3624	3913	.	+	.	ID=aerial2_align_161343-1;Parent=aerial2_align_161343
> Chr1	assembler-aerial2_pasa	match_part	3996	4276	.	+	.	ID=aerial2_align_161343-2;Parent=aerial2_align_161343
> 
> EST evidence file:
> Chr1	est2genome	expressed_sequence_match	5470	5899	2150	-	.	ID=Chr1:hit:213:3.2.0.0;Name=gi|19829901|gb|AV795918|RAFL08-19-M04
> Chr1	est2genome	match_part	5470	5899	2150	-	.	ID=Chr1:hsp:500:3.2.0.0;Parent=Chr1:hit:213:3.2.0.0;Target=gi|19829901|gb|AV795918|RAFL08-19-M04 2 431 +;Gap=M430
> 
> Protein evidence file:
> Chr1	protein2genome	protein_match	3760	5284	727	+	.	ID=Chr1:hit:202:3.10.0.0;Name=UniRef90_M4EWW1
> Chr1	protein2genome	match_part	3760	3913	727	+	.	ID=Chr1:hsp:488:3.10.0.0;Parent=Chr1:hit:202:3.10.0.0;Target=UniRef90_M4EWW1 1 50;Gap=M31 D1 M19 F1
> Chr1	protein2genome	match_part	3996	4276	727	+	.	ID=Chr1:hsp:489:3.10.0.0;Parent=Chr1:hit:202:3.10.0.0;Target=UniRef90_M4EWW1 51 144;Gap=R1 M23 D1 M28 D1 M36 I2 M5
> 
