[maker-devel] Maker not producing expected output

Kevin Kocot kmkocot at gmail.com
Fri Oct 16 22:10:14 MDT 2015


Hello,

I've run MAKER on a draft invertebrate genome and it seemed to finish
successfully. However, many of the expected output files were not
produced. For example, in XX_datastore/00/0C/scaffold-334630/, all I
see is:

theVoid.scaffold-334630
run.log
scaffold-334630.gff

In particular, I'm looking for the transcript and protein FASTA files.
I suspect that a configuration setting is wrong or that one of the
dependencies is not installed correctly, but I can't figure out what the
problem is. Any thoughts on how I can resolve this and generate these
files? Ideally, I would like to generate them without having to run the
whole pipeline again. My configuration settings and the contents of the
run.log file from the example above are pasted below.

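To get a sense of how many contigs are affected, a small script along these lines could list the datastore directories that contain a .gff file but no FASTA output. This is only a sketch under assumptions: the suffixes `.maker.transcripts.fasta` and `.maker.proteins.fasta` are assumed names for the missing files, and `find_dirs_missing_fasta` is a hypothetical helper, not part of MAKER.

```python
import os

# Assumed suffixes of the per-contig FASTA outputs (an assumption; adjust as needed).
EXPECTED_SUFFIXES = (".maker.transcripts.fasta", ".maker.proteins.fasta")


def find_dirs_missing_fasta(datastore_root, suffixes=EXPECTED_SUFFIXES):
    """Return {contig_dir: [missing suffixes]} for directories holding a .gff file."""
    missing = {}
    for dirpath, dirnames, filenames in os.walk(datastore_root):
        # Do not descend into theVoid.* working directories.
        dirnames[:] = [d for d in dirnames if not d.startswith("theVoid.")]
        if not any(name.endswith(".gff") for name in filenames):
            continue  # not a contig output directory
        absent = [s for s in suffixes
                  if not any(name.endswith(s) for name in filenames)]
        if absent:
            missing[dirpath] = absent
    return missing
```

Calling `find_dirs_missing_fasta("XX_datastore")` would return a dictionary keyed by contig directory. An empty dictionary would mean every contig directory has both files.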
Thank you,
Kevin

-----
run.log from the example folder above looks like this:
-----
SHARED_ID    d574e9ca9b0019a9fe147ccb9db3588b
CTL_OPTIONS    maker_gff
CTL_OPTIONS    other_gff
CTL_OPTIONS    est    test-transcriptome.fa
CTL_OPTIONS    est_reads
CTL_OPTIONS    altest    KK273.fa
CTL_OPTIONS    est_gff
CTL_OPTIONS    altest_gff
CTL_OPTIONS    protein    test-AA.fa
CTL_OPTIONS    protein_gff
CTL_OPTIONS    model_org    all
CTL_OPTIONS    repeat_protein    te_proteins.fasta
CTL_OPTIONS    rmlib
CTL_OPTIONS    rm_gff
CTL_OPTIONS    organism_type    eukaryotic
CTL_OPTIONS    predictor    est2genome,genemark,protein2genome
CTL_OPTIONS    est2genome    1
CTL_OPTIONS    altest2genome    0
CTL_OPTIONS    snaphmm
CTL_OPTIONS    gmhmm    output/gmhmm.mod
CTL_OPTIONS    augustus_species
CTL_OPTIONS    fgenesh_par_file
CTL_OPTIONS    model_gff
CTL_OPTIONS    pred_gff
CTL_OPTIONS    max_dna_len    100000
CTL_OPTIONS    split_hit    10000
CTL_OPTIONS    pred_flank    200
CTL_OPTIONS    pred_stats    0
CTL_OPTIONS    min_protein    0
CTL_OPTIONS    AED_threshold    1
CTL_OPTIONS    single_exon    0
CTL_OPTIONS    single_length    250
CTL_OPTIONS    keep_preds    0
CTL_OPTIONS    map_forward    0
CTL_OPTIONS    est_forward    0
CTL_OPTIONS    correct_est_fusion    0
CTL_OPTIONS    alt_splice    0
CTL_OPTIONS    always_complete    0
CTL_OPTIONS    alt_peptide    C
CTL_OPTIONS    evaluate    0
CTL_OPTIONS    blast_type    ncbi+
CTL_OPTIONS    softmask    1
CTL_OPTIONS    pcov_blastn    0.8
CTL_OPTIONS    pid_blastn    0.85
CTL_OPTIONS    eval_blastn    1e-10
CTL_OPTIONS    bit_blastn    40
CTL_OPTIONS    depth_blastn    0
CTL_OPTIONS    pcov_rm_blastx    0.5
CTL_OPTIONS    pid_rm_blastx    0.4
CTL_OPTIONS    eval_rm_blastx    1e-06
CTL_OPTIONS    bit_rm_blastx    30
CTL_OPTIONS    pcov_blastx    0.5
CTL_OPTIONS    pid_blastx    0.4
CTL_OPTIONS    depth_blastx    0
CTL_OPTIONS    eval_blastx    1e-06
CTL_OPTIONS    bit_blastx    30
CTL_OPTIONS    pcov_tblastx    0.8
CTL_OPTIONS    pid_tblastx    0.85
CTL_OPTIONS    eval_tblastx    1e-10
CTL_OPTIONS    bit_tblastx    40
CTL_OPTIONS    depth_tblastx    0
CTL_OPTIONS    ep_score_limit    20
CTL_OPTIONS    en_score_limit    20
CTL_OPTIONS    enable_fathom    0
CTL_OPTIONS    unmask    0
CTL_OPTIONS    model_pass    0
CTL_OPTIONS    est_pass    0
CTL_OPTIONS    altest_pass    0
CTL_OPTIONS    protein_pass    0
CTL_OPTIONS    rm_pass    0
CTL_OPTIONS    other_pass    0
CTL_OPTIONS    pred_pass    0
CTL_OPTIONS    run    genemark
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
STARTED    test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/scaffold-334630.abinit_nomask.0.gmhmm%2Emod.genemark
FINISHED    test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/scaffold-334630.abinit_nomask.0.gmhmm%2Emod.genemark
STARTED    test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/scaffold-334630.0.pred.raw.section
FINISHED    test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/scaffold-334630.0.pred.raw.section
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
STARTED    test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/scaffold-334630.0.final.section
FINISHED    test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/scaffold-334630.0.final.section
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0
LOGCHILD    /media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.maker.output/test_scaffolds_annotated_as_metazoan_by_MG-RAST_datastore/00/0C/scaffold-334630//theVoid.scaffold-334630/run.log.child.0

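The log layout above (a keyword followed by its value, with STARTED and FINISHED entries naming the same step) lends itself to a quick consistency check. The sketch below assumes only the keywords visible in this excerpt and uses a hypothetical helper name, `parse_run_log`. It collects the CTL_OPTIONS values and reports any step that was started but never finished. It also tolerates a value that has wrapped onto the following line, as can happen when a log is pasted into an email.

```python
def parse_run_log(text):
    """Return (options, unfinished) parsed from run.log text.

    options    -- dict of CTL_OPTIONS name -> value ("" when no value is given)
    unfinished -- STARTED entries that have no matching FINISHED entry
    """
    options = {}
    events = {"STARTED": [], "FINISHED": [], "LOGCHILD": []}
    pending = None  # keyword whose value wrapped onto the next line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if pending is not None:
            # This line is the wrapped value of the previous keyword.
            events[pending].append(line)
            pending = None
            continue
        parts = line.split(None, 1)
        key = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if key == "CTL_OPTIONS":
            name_value = rest.split(None, 1)
            if name_value:
                options[name_value[0]] = name_value[1] if len(name_value) > 1 else ""
        elif key in events:
            if rest:
                events[key].append(rest)
            else:
                pending = key
        # Any other keyword (e.g. SHARED_ID) is ignored in this sketch.
    unfinished = [s for s in events["STARTED"] if s not in events["FINISHED"]]
    return options, unfinished
```

For the excerpt above this should report no unfinished steps, since each STARTED entry has a matching FINISHED entry.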

-----
maker_opts
-----
#-----Genome (these are always required)
genome=/media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test_scaffolds_annotated_as_metazoan_by_MG-RAST.fas #genome sequence (fasta file or fasta embedded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est=/media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test-transcriptome.fa #set of ESTs or assembled mRNA-seq in fasta format
altest=/media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/KK273.fa #EST/cDNA sequence file in fasta format from an alternate organism
est_gff= #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closely related species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=/media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/test-AA.fa #protein sequence file in fasta format (i.e. from multiple organisms)
protein_gff= #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org=all #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein=/usr/local/bin/maker/data/te_proteins.fasta #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm=/media/kmkocot/Sclerite/genome_projects/test/Ray_2_assembly/MAKER/output/gmhmm.mod #GeneMark HMM file
augustus_species= #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff= #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=1 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=1 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files

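Since the control file is a list of `key=value #comment` lines, a short parser can summarize which gene-prediction sources are actually switched on. This is a sketch, not MAKER's own parser: `parse_maker_opts` and `enabled_predictors` are hypothetical helpers, and the choice of which keys count as prediction sources is an assumption based on the Gene Prediction section above.

```python
import re

# A setting line starts with an identifier immediately followed by "=".
# Anything after "#" is a comment. Wrapped comment fragments do not match.
OPTION_RE = re.compile(r"^([A-Za-z_]\w*)=([^#]*)")


def parse_maker_opts(text):
    """Return a dict of option name -> value ("" when left blank)."""
    opts = {}
    for line in text.splitlines():
        match = OPTION_RE.match(line)
        if match:
            opts[match.group(1)] = match.group(2).strip()
    return opts


def enabled_predictors(opts):
    """List prediction sources that have a file/model set or are switched to 1."""
    names = []
    # Options that are enabled by supplying a file or model name (assumed list).
    for key in ("snaphmm", "gmhmm", "augustus_species",
                "fgenesh_par_file", "pred_gff", "model_gff"):
        if opts.get(key):
            names.append(key)
    # Options that are enabled with a 1/0 switch (assumed list).
    for key in ("est2genome", "protein2genome"):
        if opts.get(key) == "1":
            names.append(key)
    return names
```

Passing the text of the control file to `parse_maker_opts` and the result to `enabled_predictors` gives a quick list to compare against the `predictor` line recorded in run.log.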
-- 
Kevin M. Kocot, Ph.D.
NSF International Postdoctoral Research Fellow
Degnan Lab
The University of Queensland
School of Biological Sciences
325 Goddard Building 8
St. Lucia, QLD 4072
Australia
Ph: +61 0402 488 430




