[maker-devel] splitting genome for annotation

michel.moser at ips.unibe.ch michel.moser at ips.unibe.ch
Thu Jun 27 07:33:15 MDT 2013


Dear Maker-developers 

If I understood correctly, one can split the genome into chunks and annotate each chunk separately in order to increase speed and reduce the resources needed.
(I would really like to use that, as I am working with a 1.2 Gbp draft genome and can't use MPI on the computing cluster.)
I am a bit worried about how this might affect the annotation, as the gene predictor would get trained quite differently for each chunk, right?
Or is there communication between the chunks when using the -base option of MAKER?
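
A minimal sketch of the kind of splitting meant here, assuming the genome is a multi-FASTA file of contigs or scaffolds and that whole sequences are distributed over a fixed number of chunk files (no sequence is cut). The names read_fasta, split_fasta, and the chunk_N.fasta file naming are placeholders for illustration and are not part of MAKER:

```python
from pathlib import Path


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, parts = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(parts)
                header, parts = line[1:], []
            elif line and header is not None:
                parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_fasta(path, n_chunks, out_dir, width=60):
    """Write whole sequences into n_chunks FASTA files of similar total length.

    Sequences are placed largest first, each into the chunk that is
    currently smallest (a simple greedy balancing heuristic).
    """
    records = sorted(read_fasta(path), key=lambda rec: len(rec[1]), reverse=True)
    bins = [[] for _ in range(n_chunks)]
    sizes = [0] * n_chunks
    for header, seq in records:
        smallest = sizes.index(min(sizes))
        bins[smallest].append((header, seq))
        sizes[smallest] += len(seq)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for i, chunk in enumerate(bins):
        if not chunk:
            continue  # fewer sequences than chunks
        chunk_path = out_dir / f"chunk_{i}.fasta"
        with open(chunk_path, "w") as out:
            for header, seq in chunk:
                out.write(f">{header}\n")
                # Wrap sequence lines at a fixed width.
                for pos in range(0, len(seq), width):
                    out.write(seq[pos:pos + width] + "\n")
        written.append(chunk_path)
    return written
```

Each resulting file could then be given to its own MAKER run as the genome input.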

Could you maybe name some pros and cons of splitting your genome for the annotation with maker?

Thank you very much, 
Michel




________________________________________
From: Moser, Michel (IPS)
Sent: Thursday, 27 June 2013 15:24
To: Carson Holt
Subject: RE: [maker-devel] start position for some genes results

________________________________________
From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of Carson Holt [carsonhh at gmail.com]
Sent: Wednesday, 26 June 2013 04:02
To: Jingjing Jin; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] start position for some genes results

The failure you are seeing occurs in the initialization stage, before reaching any of the changes introduced in 2.28.  Try running the test data that comes with MAKER. Does it fail as well?

--Carson



From: Jingjing Jin <jjin01 at mail.rockefeller.edu>
Date: Tuesday, 25 June, 2013 9:53 PM
To: Carson Holt <carsonhh at gmail.com>, "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject: RE: [maker-devel] start position for some genes results

Yes, this is the real name.

There is also no ":" in the name.

I used the same file with MAKER 2.27 and had no problem.

I am not sure what is wrong with the new version.

Jingjing


________________________________
From: Carson Holt [carsonhh at gmail.com]
Sent: Tuesday, June 25, 2013 9:47 PM
To: Jingjing Jin; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] start position for some genes results

Could you check your input genome file for the sequence "processed_tobacco_genome_sequences_c1"? Make sure that it has exactly that name and that there are no ':' characters in the name, because they can confuse the BioPerl FASTA indexer.
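
One way to run this check is sketched below, assuming the sequence ID is the first whitespace-delimited token after ">" on each header line. The function name check_fasta_ids is a placeholder for illustration and is not part of MAKER or BioPerl:

```python
def check_fasta_ids(path, wanted_id):
    """Return (found, ids_with_colon) for the FASTA file at path.

    found is True if wanted_id matches a sequence ID exactly;
    ids_with_colon lists every ID that contains a ':' character.
    """
    found = False
    ids_with_colon = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue
            fields = line[1:].split()
            seq_id = fields[0] if fields else ""
            if seq_id == wanted_id:
                found = True
            if ":" in seq_id:
                ids_with_colon.append(seq_id)
    return found, ids_with_colon
```

Calling check_fasta_ids on the genome file with "processed_tobacco_genome_sequences_c1" as the wanted ID would show whether the exact name is present and whether any ID contains a colon.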

--Carson


From: Jingjing Jin <jjin01 at mail.rockefeller.edu>
Date: Tuesday, 25 June, 2013 9:30 PM
To: Carson Holt <carsonhh at gmail.com>, "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject: RE: [maker-devel] start position for some genes results

Dear Carson,


I am so sorry. The problem is still here.

STATUS: Parsing control files...
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/home/jingjing/project/tobacco/Nicotiana_tabacum/maker.2.28/1/tobacco_seq_1.maker.output/tobacco_seq_1_datastore

To access files for individual sequences use the datastore index:
/home/jingjing/project/tobacco/Nicotiana_tabacum/maker.2.28/1/tobacco_seq_1.maker.output/tobacco_seq_1_master_datastore_index.log

STATUS: Now running MAKER...
WARNING: Cannot find >processed_tobacco_genome_sequences_c1, trying to re-index the fasta.
stop here: processed_tobacco_genome_sequences_c1
ERROR: Fasta index error
 at /home/jingjing/software/maker.2.28/maker/bin/../lib/Process/MpiChunk.pm line 239.
        Process::MpiChunk::_prepare('Process::MpiChunk=HASH(0x4e16178)', 'HASH(0x4e10810)', 0) called at /home/jingjing/software/maker.2.28/maker/bin/../lib/Process/MpiTiers.pm line 73
        Process::MpiTiers::__ANON__() called at /home/jingjing/software/maker.2.28/maker/bin/../lib/Error.pm line 415
        eval {...} called at /home/jingjing/software/maker.2.28/maker/bin/../lib/Error.pm line 407
        Error::subs::try('CODE(0x4e19100)', 'HASH(0x4e1bd58)') called at /home/jingjing/software/maker.2.28/maker/bin/../lib/Process/MpiTiers.pm line 79
        Process::MpiTiers::_prepare('Process::MpiTiers=HASH(0x4e16e68)') called at /home/jingjing/software/maker.2.28/maker/bin/../lib/Process/MpiTiers.pm line 56
        Process::MpiTiers::new('Process::MpiTiers', 'HASH(0x4e16ad8)', 0, 'Process::MpiChunk') called at /home/jingjing/software/maker.2.28/maker/bin/./maker line 650
--> rank=NA, hostname=ChuaServer1
ERROR: Failed in tier preparation
WARNING: You must always set a rank before running MpiTiers
FATAL: argument `seq_id` does not exist in MpiTier object

________________________________
From: Carson Holt [carsonhh at gmail.com]
Sent: Tuesday, June 25, 2013 8:55 PM
To: Jingjing Jin; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] start position for some genes results

Delete the mpi_blastdb directory before starting, to make sure all indexes get rebuilt.  Also make sure you are not setting TMP= to a network-mounted location.
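
A small sketch of the cleanup step, assuming the directory is named exactly mpi_blastdb and sits somewhere below the directory the run was started in. The function name remove_mpi_blastdb and the dry_run default are illustrative choices and are not part of MAKER:

```python
import shutil
from pathlib import Path


def remove_mpi_blastdb(run_dir, dry_run=True):
    """Find directories named 'mpi_blastdb' under run_dir.

    With dry_run=True (the default) nothing is deleted; the matching
    paths are only returned so they can be inspected first.
    """
    targets = [p for p in Path(run_dir).rglob("mpi_blastdb") if p.is_dir()]
    if not dry_run:
        for path in targets:
            if path.exists():  # may already be gone if nested in another match
                shutil.rmtree(path)
    return targets
```

Running it once with the default dry_run=True lists what would be removed; a second call with dry_run=False performs the deletion.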

--Carson


From: Jingjing Jin <jjin01 at mail.rockefeller.edu>
Date: Tuesday, 25 June, 2013 8:53 PM
To: Carson Holt <carsonhh at gmail.com>, "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject: RE: [maker-devel] start position for some genes results

Dear Carson,

When I use the new version of maker, I have another problem like this:

jingjing at ChuaServer1:~/project/$ /home/jingjing/software/maker.2.28/maker/bin/./maker
STATUS: Parsing control files...
STATUS: Processing and indexing input FASTA files...
STATUS: Setting up database for any GFF3 input...
A data structure will be created for you at:
/home/jingjing/project/tobacco/Nicotiana_tabacum/maker.2.28/1/tobacco_seq_1.maker.output/tobacco_seq_1_datastore

To access files for individual sequences use the datastore index:
/home/jingjing/project/tobacco/Nicotiana_tabacum/maker.2.28/1/tobacco_seq_1.maker.output/tobacco_seq_1_master_datastore_index.log

STATUS: Now running MAKER...
WARNING: Cannot find >processed_tobacco_genome_sequences_c1, trying to re-index the fasta.
stop here: processed_tobacco_genome_sequences_c1
ERROR: Fasta index error


Do you know how to fix this problem with the new version?

Thanks!

Jingjing



________________________________
From: Carson Holt [carsonhh at gmail.com]
Sent: Tuesday, June 25, 2013 6:55 PM
To: Jingjing Jin; maker-devel at yandell-lab.org
Subject: Re: [maker-devel] start position for some genes results

What MAKER version are you using?  This should be fixed in the current 2.28.  It only happened under a very specific set of circumstances, but I remember fixing it. So let me know if you are using 2.28.

--Carson



From: Jingjing Jin <jjin01 at mail.rockefeller.edu>
Date: Tuesday, 25 June, 2013 5:13 PM
To: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
Subject: [maker-devel] start position for some genes results

Dear all,

I found something strange about the locations in my final result.

For example, the start position of this final gene model:

c124062 maker   gene    -1      507     .       -       .       ID=maker-c124062-snap-gene-0.2;Name=maker-c124062-snap-gene-0.2


Its start position is -1.

Does anyone know why the start position is -1?

Is there something wrong?
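
For reference, GFF3 coordinates are 1-based, so a start below 1 (or a start greater than the end) is out of range. A small sketch for listing such features, assuming tab-separated columns as in the GFF3 format. The function name find_bad_starts is a placeholder, and the check only locates affected lines without explaining how they arose:

```python
def find_bad_starts(gff_lines):
    """Return (seqid, type, start, end) for features with invalid coordinates.

    A feature is reported if start < 1 or start > end.
    """
    bad = []
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a complete feature line
        start, end = int(cols[3]), int(cols[4])
        if start < 1 or start > end:
            bad.append((cols[0], cols[2], start, end))
    return bad
```

Passing an open GFF3 file handle to find_bad_starts would return every feature like the one shown above.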

Thanks!

Jingjing

