[maker-devel] AED calculations using the MAKER pipeline
Krishnakumar, Vivek
vKrishna at jcvi.org
Wed Mar 20 07:05:55 MDT 2013
Hi,
We have been using the MAKER pipeline here at JCVI to calculate AED scores by feeding in our annotation set as `model_gff` and the protein and EST evidence as `protein_gff` and `est_gff` respectively. Here is the issue we are having:
When running the above pipeline with protein2genome and est2genome evidence generated earlier by MAKER, there are no problems calculating the AED score. Normally this pipeline takes a little over 12 hours to complete.
But if we use our own evidence (AAT and Genewise aligned proteins for `protein_gff` and PASA assembled ESTs for `est_gff`), the same pipeline runs very slowly, and the intermediate *.gff.ann file has many chunks (separated by '###') that are completely empty. Our evidence is formatted in the same way as est2genome or protein2genome (a GFF file with "expressed_sequence_match::match_part" or "protein_match::match_part" features, respectively).
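For reference, here is a minimal Python sketch of how the empty chunks can be counted. It assumes only what is described above: that chunks in the *.gff.ann file are terminated by a line containing just '###'. Treating '#'-prefixed lines as comments rather than features is an additional assumption, and the function name `count_chunks` is purely illustrative (it is not part of MAKER).

```python
def count_chunks(lines):
    """Return (total_chunks, empty_chunks) for '###'-delimited GFF-like lines.

    Assumption: a chunk is "empty" when it contains no feature lines,
    i.e. only blank lines or lines starting with '#'.
    """
    total = 0
    empty = 0
    has_feature = False   # saw a non-comment line in the current chunk
    seen_any = False      # saw any non-blank line in the current chunk
    for raw in lines:
        line = raw.strip()
        if line == "###":
            # Chunk terminator: close the current chunk and reset state.
            total += 1
            if not has_feature:
                empty += 1
            has_feature = False
            seen_any = False
        elif line:
            seen_any = True
            if not line.startswith("#"):
                has_feature = True
    if seen_any:
        # Trailing content with no closing '###' still counts as a chunk.
        total += 1
        if not has_feature:
            empty += 1
    return total, empty
```

It can be applied to an open file handle, for example `with open(path) as fh: total, empty = count_chunks(fh)`, where `path` points at one of the *.gff.ann files.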
The input to my pipeline is 8 chromosomes and ~2200 scaffolds, and I use the default value of the `max_dna_len` parameter, which is used to split the large assemblies into chunks.
Investigating the master_datastore.log shows me that the scaffolds run through without any issues and the chromosomes are still being processed.
For any of the chromosomes, the 'run.log' file (one level above `theVoid`) shows how many "final.section" jobs were started and how many finished. For every chromosome, it reports that everything that was started has finished. The 'log.child.*' files within `theVoid` are all empty. Also within `theVoid`, the "raw.section" and "evidence_*.gff" files are not empty. One thing is surprising, though: of all the "final.section" files, only the one pertaining to the last chunk is very large (proportional to the size of the evidence); the rest are all exactly the same size (331 bytes).
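The file-size observation can be reproduced with a short Python sketch like the one below. It tallies how many "final.section" files share each size within one `theVoid` directory. The glob pattern `*final.section*` is an assumption about how the files are named, and `section_size_counts` is an illustrative helper, not a MAKER utility.

```python
from collections import Counter
from pathlib import Path


def section_size_counts(directory, pattern="*final.section*"):
    """Map file size in bytes -> number of matching files in `directory`.

    Assumption: the files of interest match `pattern`; adjust it to the
    actual file names found under theVoid.
    """
    sizes = Counter()
    for path in Path(directory).glob(pattern):
        if path.is_file():
            sizes[path.stat().st_size] += 1
    return sizes
```

Calling `section_size_counts(void_dir)` on one chromosome's `theVoid` directory (here `void_dir` is a placeholder path) should, in the situation described, show one large size with a count of 1 and the size 331 with a count covering all remaining chunks.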
I'm running MAKER in MPI mode spawning 48 processes on a high memory machine with 64 available cores and 1TB of RAM.
I hope I've been able to explain my situation clearly in this email.
Any help is appreciated.
Thank you.
Vivek