From jennifer.anderson at ebc.uu.se  Tue Jul  3 06:57:56 2018
From: jennifer.anderson at ebc.uu.se (Jennifer Anderson)
Date: Tue, 3 Jul 2018 13:57:56 +0200
Subject: [maker-devel] Genemark XXX.mod files
Message-ID: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>


Hello,

I am working on annotations for fungal genomes, using GeneMark-ES with --fungi for gene prediction.  In earlier attempts, I did not use the training flag, and I did get the output gmhmm file.  Now I have tried with the training flag and do not get this file.  In the /run/ directory I do get mod files ES_A.mod, ES_B.mod, and ES_C.mod, as well as ini.mod.  Does one of these files work as the ES.mod file as in
"gmhmm=../train_genemark/es.mod #GeneMark HMM file" from http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained? I can't find documentation of the GeneMark-ES output online.

Thank you.

Jenni


E-mailing Uppsala University means that we will process your personal data. For more information on how this is performed, please read here: http://www.uu.se/om-uu/dataskydd-personuppgifter/

From liorglic at mail.tau.ac.il  Wed Jul  4 07:32:05 2018
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Wed, 4 Jul 2018 14:32:05 +0200
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
Message-ID: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>

 Dear MAKER users,

I am new to MAKER and would like your advice.
I am planning to annotate multiple genomes of tomato variants and wild
relatives. To this end, I have been working on generating a diverse
transcripts data set to be used as input for MAKER (along with protein
sequences and the 'official' tomato annotation). My transcripts set was
generated by collecting multiple available RNA-Seq results from SRA,
covering diverse variants, conditions and tissues, and assembling them into
transcripts using Trinity. My goal is to have a data set as diverse and
broad as possible.
Now I have ~30 fasta files of transcripts, originating from different
studies. Of course, many of the transcripts are redundant and/or partial. I
am exploring ways to merge the multiple data sets into a non-redundant one,
while also stitching partial transcripts into longer ones based on overlaps.
However, this turns out to be not-so-trivial and I am wondering if this is
really necessary in order to get a good annotation? Maybe I can just
concatenate all my transcriptome assembly results, and MAKER will handle
redundant and partial transcripts?
Can someone clarify how this works, and try to assess if an annotation
based on a merged data set should be superior to one that didn't undergo
such a process? If someone has actual experience with such data, that
would be really helpful, but any advice would be highly appreciated.

Thanks a lot and best regards,
Lior

From jason.stajich at gmail.com  Thu Jul  5 13:13:57 2018
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 5 Jul 2018 11:13:57 -0700
Subject: [maker-devel] Genemark XXX.mod files
In-Reply-To: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
References: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
Message-ID: <CALf8Lpy-KretSGjwmMTyxOARiJ251mzbxb8HuV4XU8asOfW0dg@mail.gmail.com>

The run/ES_C.mod file should be the right one, if it is there.
Is it possible that it is crashing during one of the training/retraining steps?
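
A minimal sketch of that check, assuming only what is said in this thread (the run/ directory and the ES_C.mod file name). The helper names are hypothetical and are not part of GeneMark or MAKER; a missing or empty file is treated here as a hint that training did not finish.

```python
from pathlib import Path
from typing import Optional


def find_trained_model(genemark_dir: str) -> Optional[Path]:
    """Return run/ES_C.mod under genemark_dir if it exists and is non-empty.

    Hypothetical helper: the run/ES_C.mod location comes from this thread,
    not from GeneMark documentation.
    """
    candidate = Path(genemark_dir) / "run" / "ES_C.mod"
    if candidate.is_file() and candidate.stat().st_size > 0:
        return candidate
    return None


def gmhmm_line(model: Path) -> str:
    """Format the gmhmm entry as it appears in maker_opts.ctl."""
    return f"gmhmm={model} #GeneMark HMM file"


if __name__ == "__main__":
    model = find_trained_model(".")
    if model is None:
        print("run/ES_C.mod is missing or empty; training may have crashed")
    else:
        print(gmhmm_line(model))
```

Run from the GeneMark-ES working directory, this prints either a ready-to-paste gmhmm line or a note that the model file was not found.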

Jason Stajich
jason.stajich at gmail.com


On Tue, Jul 3, 2018 at 11:05 AM Jennifer Anderson <
jennifer.anderson at ebc.uu.se> wrote:

>
> Hello,
>
> I am working on annotations for fungal genomes, using GeneMark-ES with
> --fungi for gene prediction.  In earlier attempts, I did not use the
> training flag, and I did get the output gmhmm file.  Now I have tried with
> the training flag and do not get this file.  In the /run/ directory I do
> get mod files ES_A.mod, ES_B.mod, and ES_C.mod, as well as ini.mod.  Does
> one of these files work as the ES.mod file as in
> "gmhmm=../train_genemark/es.mod #GeneMark HMM file" from
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained? I
> can't find documentation of the GeneMark-ES output online.
>
> Thank you.
>
> Jenni
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>

From carsonhh at gmail.com  Thu Jul  5 13:47:38 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:47:38 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
References: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
Message-ID: <788E84AB-DB85-43AD-8FE1-C1D8A7DBD4B5@gmail.com>

MAKER will collapse redundant evidence after alignment, so the extra data will primarily just increase run time. The main issue with so many datasets would be false positive alignments (assembled background transcription). You can look at individual contigs in Apollo, IGV, or another browser to see where spurious alignments occur and whether they are associated overall with a particular dataset (it's OK to throw out a noisy dataset, especially if you have additional data).
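
One way to make a noisy dataset traceable in a browser is to tag each transcript ID with its source file before concatenating. The following is a minimal sketch under stated assumptions: the function names, the use of the file stem as the tag, and the underscore separator are all illustrative choices, not anything MAKER requires.

```python
from pathlib import Path
from typing import Iterable, Iterator


def tag_fasta_headers(lines: Iterable[str], tag: str) -> Iterator[str]:
    """Yield FASTA lines with `tag_` prepended to every sequence ID."""
    for line in lines:
        if line.startswith(">"):
            yield f">{tag}_{line[1:]}"
        else:
            yield line


def concatenate_with_tags(fasta_paths: Iterable[str], out_path: str) -> None:
    """Write all input FASTA files to one file, tagging IDs by file stem."""
    with open(out_path, "w") as out:
        for path in fasta_paths:
            tag = Path(path).stem  # e.g. "study01" for study01.fasta
            with open(path) as handle:
                for line in tag_fasta_headers(handle, tag):
                    # Guard against a missing final newline in an input file.
                    out.write(line if line.endswith("\n") else line + "\n")
```

With IDs tagged this way, a cluster of spurious alignments that share one prefix points at the dataset to consider dropping.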

--Carson


> On Jul 4, 2018, at 6:32 AM, Lior Glick <liorglic at mail.tau.ac.il> wrote:
> 
> Dear MAKER users,
> 
> I am new to MAKER and would like your advice.
> I am planning to annotate multiple genomes of tomato variants and wild relatives. To this end, I have been working on generating a diverse transcripts data set to be used as input for MAKER (along with protein sequences and the 'official' tomato annotation). My transcripts set was generated by collecting multiple available RNA-Seq results from SRA, covering diverse variants, conditions and tissues, and assembling them into transcripts using Trinity. My goal is to have a data set as diverse and broad as possible.
> Now I have ~30 fasta files of transcripts, originating from different studies. Of course, many of the transcripts are redundant and/or partial. I am exploring ways to merge the multiple data sets into a non-redundant one, while also stitching partial transcripts into longer ones based on overlaps.
> However, this turns out to be not-so-trivial and I am wondering if this is really necessary in order to get a good annotation? Maybe I can just concatenate all my transcriptome assembly results, and MAKER will handle redundant and partial transcripts?
> Can someone clarify how this works, and try to assess if an annotation based on a merged data set should be superior to one that didn't undergo such a process? If someone has actual experience with such data, that  would be really helpful, but any advice would be highly appreciated.
> 
> Thanks a lot and best regards,
> Lior
> 


From carsonhh at gmail.com  Thu Jul  5 13:50:36 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:50:36 -0600
Subject: [maker-devel] [CAUTION: Suspicious Link] map_forward=1 not
 mapping reference ID's to output correctly
In-Reply-To: <D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
References: <CA+DOteeTFd06_k5ONYLvn7FpUuv-JDNqp1PCFa9QF0TxDa9iEg@mail.gmail.com>
	<D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
Message-ID: <4EE96E7F-5F5B-4988-BC9C-FC441848B768@gmail.com>

A quick overview of MAKER behavior: MAKER will keep everything in model_gff as long as you don't provide another predictor to run or a pred_gff file to use. But if you give it a predictor to run, it takes that as an indicator that you want to update models, so a model from model_gff may get replaced by another prediction that overlaps it but scores better.

So depending on the behavior you want, make sure you are using model_gff, and either do or don't provide a gene predictor to run.
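
The rule above can be sketched as a small check over maker_opts.ctl. This is an illustration, not MAKER code: the function names are hypothetical, and the list of predictor-related keys is an assumed subset of the control file options that should be compared against your own file.

```python
from typing import Dict, Iterable, List

# Options that, when set, ask MAKER to run or import gene predictions.
# Assumed subset of maker_opts.ctl keys; check your own control file.
PREDICTOR_KEYS = ("snaphmm", "gmhmm", "augustus_species",
                  "fgenesh_par_file", "pred_gff")


def parse_ctl(lines: Iterable[str]) -> Dict[str, str]:
    """Parse `key=value #comment` lines into a dict of option values."""
    opts: Dict[str, str] = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and headers
        if "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()
    return opts


def predictors_that_may_replace_models(opts: Dict[str, str]) -> List[str]:
    """List the set predictor options that could let model_gff be updated."""
    if not opts.get("model_gff"):
        return []
    return [key for key in PREDICTOR_KEYS if opts.get(key)]


if __name__ == "__main__":
    with open("maker_opts.ctl") as handle:
        print(predictors_that_may_replace_models(parse_ctl(handle)))
```

An empty list with model_gff set corresponds to the "keep everything" case; any key in the list means overlapping, better-scoring predictions may replace the supplied models.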

--Carson


> On Jun 22, 2018, at 2:04 PM, Poelchau, Monica <monica.poelchau at ars.usda.gov> wrote:
> 
> Hi Kapeel,
>  
> If you just want your community annotations to replace models in an existing gene set, we have a tool for this:
>  
> https://github.com/NAL-i5K/GFF3toolkit
>
> You'd need to run gff3_QC on your annotation files first to make sure your annotations are okay, then use gff3_merge to merge your community annotations with your existing gene set (in gff3 format). If you end up trying this out - we're actively developing the GFF3toolkit, so feel free to post an issue if you notice any problems.
>  
> Hth,
>  
> Monica 
>  
> From: maker-devel <maker-devel-bounces at yandell-lab.org> on behalf of Kapeel Chougule <kapeelc at gmail.com>
> Date: Friday, June 22, 2018 at 13:53
> To: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject: [CAUTION: Suspicious Link][maker-devel] map_forward=1 not mapping reference ID's to output correctly
>  
> Hi,
>  
> I am trying to update community annotation <https://de.cyverse.org/dl/d/39D60E88-078D-4CF5-9F3A-D712B714CDD8/community.annotation.gff3> in the light of new evidence data but my MAKER runs are not keeping all the genes from the community annotation.
> 
> 
> Community annotation feature count:
>       2
>       1 bicolor
>  239969 CDS
>  266301 exon
>   51066 five_prime_UTR
>   34129 gene
>   47121 mRNA
>   53708 three_prime_UTR
> MAKER gene count:
> awk '$3=="gene"{print}' maker_output.all.gff | grep "Sobic*" | wc -l
> 21105
>  
> In the attached maker_opts.ctl file, I set keep_preds=1 and map_forward=1, which should keep all the community gene models even if they don't have evidence support. This was explained here:
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data
> So I am not sure why not all of the community gene models are mapped in the MAKER output.
> 
> Thanks
> 
> Kapeel
> -- 
>  
> Kapeel Chougule
> Computational Scientist Developer II
> One Bungtown Road Cold Spring Harbor, NY 11724
> http://www.warelab.org/
> 
> 
> 

From carsonhh at gmail.com  Thu Jul  5 14:17:14 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 13:17:14 -0600
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
In-Reply-To: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
References: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
Message-ID: <C61CC367-F138-47F2-AA61-876811458353@gmail.com>

Sorry for the slow reply. Make sure you find out which flavor of MPI you are using (MPICH, MVAPICH2, Intel MPI, or OpenMPI). MAKER does not work with MVAPICH2. It can work with Intel MPI and OpenMPI with some command-line modification. It always works with MPICH, but MPICH may not be able to scale to more than ~100 CPUs.

The option "-mca btl ^openib", for example, is only for OpenMPI. If you are using OpenMPI, also set LD_PRELOAD in accordance with the INSTALL documentation. Also make sure you do not have multiple MPI flavors installed, such that you compiled MAKER with one and are running it with a different one. That will cause failure shortly after starting MAKER.

Try looking further back in your STDERR for the actual cause. The "Thread 1 terminated abnormally:" message is the tail end of the failure snowball, so the actual cause is often much further back.
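
A minimal sketch of "looking further back" in a saved STDERR file such as 2.err. Everything here is an assumption made for illustration: the function names, the default of 300 lines of context, and the keyword pattern used to flag suspicious lines are not taken from MAKER.

```python
import re
from typing import List, Optional, Tuple

MARKER = "terminated abnormally"
# Assumed patterns for lines worth reading first; adjust to your own log.
SUSPECT = re.compile(r"\b(ERROR|FATAL|died|cannot|failed)\b", re.IGNORECASE)


def lines_before_marker(log_lines: List[str], context: int = 300) -> List[str]:
    """Return up to `context` lines preceding the first marker line."""
    for i, line in enumerate(log_lines):
        if MARKER in line:
            return log_lines[max(0, i - context):i]
    return []


def first_suspect_line(log_lines: List[str]) -> Optional[Tuple[int, str]]:
    """Return (line number, text) of the first suspect line before the marker."""
    for number, line in enumerate(log_lines, start=1):
        if MARKER in line:
            break
        if SUSPECT.search(line):
            return number, line.rstrip("\n")
    return None


if __name__ == "__main__":
    with open("2.err") as handle:
        log = handle.readlines()
    print(first_suspect_line(log))
    print("".join(lines_before_marker(log)))
```

The first flagged line is only a starting point for reading; the full block of lines before the marker is what is worth posting to the list.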

--Carson


> On Jun 26, 2018, at 9:36 AM, André Machado <andremmachado25 at gmail.com> wrote:
> 
> Hi,
> 
> First of all, thanks for your efforts on the MAKER pipeline. It's a tremendous help for people who work with genomes.
> For the last 4 days I have been racking my brain over an error, but I am still without a solution.
> I found this old thread: https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ
> It seems to be quite similar, but it doesn't point to a specific solution.
> I ran MAKER with the test data and everything ran OK. MAKER finished the entire process without errors.
> Recently, I have been trying to apply my own data on an MPI cluster, but this error frequently occurs:
> Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0
> --> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> deleted:0 hits
> preparing ab-inits
> deleted:0 hits
> deleted:0 hits
> FATAL: Thread terminated, causing all processes to fail
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> 
> Basically, I am trying to run MAKER with dna.fa, rna.fa, prot.fa and my_custom_lib_of_repeats.fa, to produce raw gene models which will be used to train SNAP.
> 
> I have already tried several command lines and all gave me the same error. The only difference between tests was the location of the error: sometimes it happened on compute-0-1.local, other times on compute-0-4.local or on another node.
> mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err
> 
> mpiexec --hostfile Host maker 1>1.log 2>2.err
> mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err
> nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err
> 
> The log file as well as the option files are provided below.
> 
> Many thanks in advance,
> 
> André
> 
> <2.log><maker_exe.ctl><maker_opts.ctl><maker_bopts.ctl>


From andremmachado25 at gmail.com  Wed Jul  4 06:16:08 2018
From: andremmachado25 at gmail.com (André Machado)
Date: Wed, 4 Jul 2018 12:16:08 +0100
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
Message-ID: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>

Hi ,


First of all, thanks for your efforts on the MAKER pipeline. It's a tremendous
help for people who work with genomes.

For the last 4 days I have been racking my brain over an error, but I am still
without a solution.

I found this old thread:
https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ

It seems to be quite similar, but it doesn't point to a specific solution.

I ran MAKER with the test data and everything ran OK. MAKER finished the
entire process without errors.

Recently, I have been trying to apply my own data on an MPI cluster, but this
error frequently occurs:

Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0

--> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.

--> rank=8, hostname=compute-0-1.local

deleted:0 hits

deleted:0 hits

preparing ab-inits

deleted:0 hits

deleted:0 hits

FATAL: Thread terminated, causing all processes to fail

--> rank=8, hostname=compute-0-1.local

deleted:0 hits


Basically, I am trying to run MAKER with dna.fa, rna.fa, prot.fa and
my_custom_lib_of_repeats.fa, to produce raw gene models which will be used
to train SNAP.


I have already tried several command lines and all gave me the same error. The
only difference between tests was the location of the error: sometimes it
happened on compute-0-1.local, other times on compute-0-4.local or on another
node.

mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err

mpiexec --hostfile Host maker 1>1.log 2>2.err

mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err

nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err


The log file as well as the option files are provided below.


Many thanks in advance,


André
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2.log
Type: text/x-log
Size: 38654 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment.log>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1223 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4547 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1412 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0002.obj>

From liorglck at gmail.com  Wed Jul  4 07:28:14 2018
From: liorglck at gmail.com (Lior Glick)
Date: Wed, 4 Jul 2018 14:28:14 +0200
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
Message-ID: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>

Dear MAKER users,

I am new to MAKER and would like your advice.
I am planning to annotate multiple genomes of tomato variants and wild
relatives. To this end, I have been working on generating a diverse
transcripts data set to be used as input for MAKER (along with protein
sequences and the 'official' tomato annotation). My transcripts set was
generated by collecting multiple available RNA-Seq results from SRA,
covering diverse variants, conditions and tissues, and assembling them into
transcripts using Trinity. My goal is to have a data set as diverse and
broad as possible.
Now I have ~30 fasta files of transcripts, originating from different
studies. Of course, many of the transcripts are redundant and/or partial. I
am exploring ways to merge the multiple data sets into a non-redundant one,
while also stitching partial transcripts into longer ones based on overlaps.
However, this turns out to be not-so-trivial and I am wondering if this is
really necessary in order to get a good annotation? Maybe I can just
concatenate all my transcriptome assembly results, and MAKER will handle
redundant and partial transcripts?
Can someone clarify how this works, and try to assess if an annotation
based on a merged data set should be superior to one that didn't undergo
such a process? If someone has actual experience with such data, that
would be really helpful, but any advice would be highly appreciated.

Thanks a lot and best regards,
Lior

From carsonhh at gmail.com  Thu Jul 12 15:05:00 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:05:00 -0600
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
In-Reply-To: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
References: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
Message-ID: <5F1E5499-239E-405E-81EC-CECC755D7838@gmail.com>

Because you truncated/removed the lines before the actual error (I need to see the several hundred lines that came before "Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0"), I can't give you more info.

But you are getting a lot of OpenMPI complaints at the start. You may need to reinstall OpenMPI or use MPICH instead (either will require you to reinstall MAKER, as it will need to rebuild the MPI C/Perl bindings for the new installation). Also, when using OpenMPI, make sure to export LD_PRELOAD in the way outlined in the .../maker/INSTALL instructions.

--Carson




From carsonhh at gmail.com  Thu Jul 12 15:38:33 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:38:33 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
Message-ID: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>

MAKER will automatically collapse redundant evidence. The only thing you may need to worry about with too many datasets is background transcription. With more datasets you will have more spurious assemblies from background transcription (if you sequence deep enough, everything is transcribed at some level). You should also look at the results in a browser like Apollo; you may find that some datasets are noisier than others, and it would be beneficial to drop them, especially if they are redundant. So always do a visual review of results.

--Carson




From shijunpeng at cau.edu.cn  Sat Jul 14 03:04:38 2018
From: shijunpeng at cau.edu.cn (史俊鹏)
Date: Sat, 14 Jul 2018 16:04:38 +0800 (GMT+08:00)
Subject: [maker-devel] Ask for help about the collapse of Maker (version
 2.31.9) when annotated with Fgenesh
In-Reply-To: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
	<C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
Message-ID: <183e519e.83bf.16497d1fd4b.Coremail.shijunpeng@cau.edu.cn>

Dear Carson,

First of all, I must apologize that I couldn't post my questions in the Google group, since I can't get access to Google from mainland China.

I am using MAKER (version 2.31.9) to annotate several foxtail millet genomes. I combined Augustus and Fgenesh (v.3.1.1) for the de novo annotation of these genomes.

The majority of contigs were annotated well by the MAKER pipeline. However, several contigs failed when annotated with Fgenesh, with the following error information:

#--------- command -------------#
Widget::fgenesh:
/NAS7/home/shijunpeng/software/maker/bin/../lib/Widget/fgenesh/fgenesh_wrap /NAS7/home/shijunpeng/software/fgenesh/fgenesh /NAS7/home/shijunpeng/software/fgenesh/Monocots /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta -exon_table:/tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.xdef.fgenesh > /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-
#-------------------------------#
ERROR: FgenesH failed
--> rank=NA, hostname=bioinfor3.local
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:scaffold_1

ERROR: Chunk failed at level:6, tier_type:0
FAILED CONTIG:scaffold_1
###############################################################################################################################################

A system core file was generated after this crash. I checked the temporary fasta file 108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta and it looks normal (~300 bp).

I also checked my original sequence file and found no problems (it contains only A, T, C, G and N). I also tried changing the pred_flank option from 200 (the original value) to 0, and the error persists.

I ran the MAKER pipeline on a single node with 16 processors and 256 GB of RAM, so the problem is probably not due to MPI.

Below are my detailed MAKER behavior options:
#-----MAKER Behavior Options
max_dna_len=300000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=10000 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=0 #flank for extending evidence clusters sent to gene predictors
pred_stats=1 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=1 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=1 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=1 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=5 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files 

Could you please help me to solve this error? I am looking forward to hearing from you.

Sincerely, 
Junpeng
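
The alphabet check described in this message (only A, T, C, G and N in the genome sequence) could be scripted roughly as follows. This is a minimal sketch, not part of MAKER or Fgenesh: the function name unexpected_symbols is hypothetical, and treating lowercase (soft-masked) bases as acceptable is an assumption.

```python
from collections import Counter

# Symbols considered acceptable; lowercase input is upper-cased first (assumption).
ALLOWED = set("ACGTN")


def unexpected_symbols(fasta_lines):
    """Count every sequence character that is not A, C, G, T or N."""
    bad = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            continue  # skip FASTA header lines
        for ch in line.strip().upper():
            if ch not in ALLOWED:
                bad[ch] += 1
    return bad
```

An empty result means the sequence lines contain nothing outside the allowed set; it says nothing about why a downstream predictor might still fail.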

--
Junpeng Shi, PhD
State Key Lab For Agrobiotech, China Agricultural University
National Maize Improvement Center of China 
Center For Life Science, NO.2, 
The West Street of Yuanmingyuan Park, Beijing, P.R.China 
Tel: +86-13581863941

From liorglic at mail.tau.ac.il  Tue Jul 24 02:45:06 2018
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Tue, 24 Jul 2018 09:45:06 +0200
Subject: [maker-devel] Annotation of a new variant within a species
Message-ID: <CAOzMDPxSUnk5zJXQhsu_SwbzHiJJ0sP0H5KOhD6L0OFFdD8sKg@mail.gmail.com>

Hello,

I am trying to annotate multiple  variants of tomato. While a good
annotation of the reference genome is available, I have denovo-assembled
other variants of the same species and wish to annotate them.
Most MAKER documentation refers to annotation of a new species, while using
transcripts and proteins from either the exact same sample (individual) or
from "an alternate organism", so I'm not sure what to do in this case,
where I am annotating various samples from the same species. I have two
questions:

1. Regarding transcripts data, how should I use transcripts from other
variants of the same species? Namely, should I use the est or the altest
parameter? What is the actual difference in behavior?

2. Is there a way to incorporate gene models (in gff format) from the
reference annotation? I expect high similarity in my assembled variants,
but not identity in terms of content and coordinates, so neither pred_gff
nor model_gff sound like what I need, as far as I understand.
I could also use the reference annotation and sequence to extract cDNA and
provide them as EST data. Is this the way to go? It feels like some
information on introns might be lost this way.

Would highly appreciate your answers to these questions or any other advice.

Thank you very much!
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180724/181eef74/attachment.html>

From roscito at mpi-cbg.de  Tue Jul 31 07:59:58 2018
From: roscito at mpi-cbg.de (Ju Roscito)
Date: Tue, 31 Jul 2018 14:59:58 +0200
Subject: [maker-devel] Few alternative isoforms when alt_splice=0
Message-ID: <2C92DF72-0733-490F-A2EE-6F3724EF7099@mpi-cbg.de>

Dear all,

I have a question about the behaviour of the alt_splice option; there doesn't seem to be much about it on the forum.

I have run a single round of MAKER (2.31.9) on a vertebrate genome, with Trinity mRNA data and mapped proteins from closely related species. I set alt_splice to 0, but still got two to four mRNAs for ~20 of the 19,000 predicted genes. Has anyone else seen this? Any idea why that would happen?

Thanks a lot in advance.


From timo.metz at googlemail.com  Fri Jul 20 07:20:05 2018
From: timo.metz at googlemail.com (Timo Metz)
Date: Fri, 20 Jul 2018 12:20:05 -0000
Subject: [maker-devel] MAKER chooser algorithm
Message-ID: <CAKGvZVN6En4AmnMV1neZ_OmAGS341CJaZ7Fbgny1KB1CUd1_Jg@mail.gmail.com>

Hey,

I am working on the improvement of an already existing annotation. I could
find that sometimes MAKER would split or merge genes where it intuitively
does not look correct when looking at the evidence. Please find two
examples attached. The first track is the old annotation, the second track
the new annotation, then there is RNA-seq data, proteins, repeats, snap
prediction, augustus prediction. It is visible, that in both cases the
evidence supports two genes, and one gene predictor in each case tends to
create one gene where the other one creates two genes. I do not understand
why in this case the gene is merged, if evidence and also one ab initio
prediction support rather two genes. Are there any suggestions on how to
solve this?

best
Timo
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picture1.png
Type: image/png
Size: 26778 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picutre2.png
Type: image/png
Size: 24145 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment-0001.png>

From cganote at iu.edu  Tue Jul 24 11:31:02 2018
From: cganote at iu.edu (Ganote, Carrie L)
Date: Tue, 24 Jul 2018 16:31:02 -0000
Subject: [maker-devel] Maker ignores evidence and just returns gffs with
 genome contigs
Message-ID: <D77CCC75.46875%cganote@iu.edu>

Running maker, I don't see anything in the gff except the names of the contigs and their lengths:

##gff-version 3
SczI0sq_2092%3%3D3122    .       contig  1       119548  .       .       .       ID=SczI0sq_2092%3%3D3122;Name=SczI0sq_2092%3%3D3122
###
SczI0sq_842%3%3D1778     .       contig  1       4693    .       .       .       ID=SczI0sq_842%3B%3D1778;Name=SczI0sq_842%3%3D1778
###
...

In my opts file, I have:

#-----Genome (these are always required)
genome=/projects/Reference/genome.chr.fa #genome sequence (fasta file or fasta embeded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est= #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/projects/Reference/Maker/EST_assembled.all.gff #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closly relate species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=  #protein sequence file in fasta format (i.e. from mutiple oransisms)
protein_gff=/projects/Reference/Maker/exonerate_withCC.gff3  #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org= #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species= #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff=/projects/Reference/Maker/augustus_output.reformated.gff #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files

It ran for ~3 hours and all contigs in the log file said FINISHED. No failures. Did I set something wrong?

-Carrie
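
One way to confirm what an output GFF3 actually contains is to tally the feature types in its third column. The sketch below is only an illustration; the function name feature_counts is hypothetical, and it assumes the file is tab-delimited as GFF3 requires (the excerpt above shows spaces only because of how the mail was rendered).

```python
from collections import Counter


def feature_counts(gff_lines):
    """Tally GFF3 feature types (column 3), ignoring comments and embedded FASTA."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # everything after this directive is sequence, not features
        if not line.strip() or line.startswith("#"):
            continue  # blank lines, directives and '###' separators
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts
```

If the only key returned is contig, no evidence alignments or gene models were written for those sequences, which matches the symptom described in the message.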
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20180724/aa12e191/attachment.html>

From jennifer.anderson at ebc.uu.se  Tue Jul  3 05:57:56 2018
From: jennifer.anderson at ebc.uu.se (Jennifer Anderson)
Date: Tue, 3 Jul 2018 13:57:56 +0200
Subject: [maker-devel] Genemark XXX.mod files
Message-ID: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>


Hello,

I am working on annotations for fungal genomes, using GenemarkES with --fungi for gene prediction.  In earlier attempts, I did not use the  training flag, and I did get the output gmhmm file.  Now I have tried with the training flag and do not get this file.  In the /run/ directory I do get mod files  ES_A.mod, ES_B.mod, and ES_C.mod, as well as ini.mod.  Does one of these files work as the ES.mod file as in
"gmhmm=../train_genemark/es.mod #GeneMark HMM file" from http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained? I don't find documentation of the genemarkES output online.

Thank you.

Jenni


E-mailing Uppsala University means that we will process your personal data. For more information on how this is performed, please read here: http://www.uu.se/om-uu/dataskydd-personuppgifter/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180703/b44d9709/attachment-0001.html>

From liorglic at mail.tau.ac.il  Wed Jul  4 06:32:05 2018
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Wed, 4 Jul 2018 14:32:05 +0200
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
Message-ID: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>

 Dear MAKER users,

I am new to MAKER and would like your advice.
I am planning to annotate multiple genomes of tomato variants and wild
relatives. To this end, I have been working on generating a diverse
transcripts data set to be used as input for MAKER (along with protein
sequences and the 'official' tomato annotation). My transcripts set was
generated by collecting multiple available RNA-Seq results from SRA,
covering diverse variants, conditions and tissues, and assembling them into
transcripts using Trinity. My goal is to have a data set as diverse and
broad as possible.
Now I have ~30 fasta files of transcripts, originating from different
studies. Of course, many of the transcripts are redundant and/or partial. I
am exploring ways to merge the multiple data sets into a non-redundant one,
while also stitching partial transcripts into longer ones based on overlaps.
However, this turns out to be not-so-trivial and I am wondering if this is
really necessary in order to get a good annotation? Maybe I can just
concatenate all my transcriptome assembly results, and MAKER will handle
redundant and partial transcripts?
Can someone clarify how this works, and try to assess if an annotation
based on a merged data set should be superior to one that didn't undergo
such a process? If someone has actual experience with such data, that
would be really helpful, but any advice would be highly appreciated.

Thanks a lot and best regards,
Lior
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/90ee431a/attachment-0001.html>

From jason.stajich at gmail.com  Thu Jul  5 12:13:57 2018
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 5 Jul 2018 11:13:57 -0700
Subject: [maker-devel] Genemark XXX.mod files
In-Reply-To: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
References: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
Message-ID: <CALf8Lpy-KretSGjwmMTyxOARiJ251mzbxb8HuV4XU8asOfW0dg@mail.gmail.com>

The run/ES_C.mod file should be the right one, if it is there.
Is it possible that it is crashing during one of the training / retraining steps?

Jason Stajich
jason.stajich at gmail.com
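
As a rough illustration of the suggestion above, the following sketch checks whether run/ES_C.mod exists under a GeneMark-ES working directory and, if so, builds the corresponding gmhmm= line for maker_opts.ctl. The function name gmhmm_line is hypothetical, and the directory layout is assumed to be the one described in this thread.

```python
from pathlib import Path
from typing import Optional


def gmhmm_line(genemark_dir: str) -> Optional[str]:
    """Return a 'gmhmm=' control-file line for run/ES_C.mod, or None if it is absent."""
    model = Path(genemark_dir) / "run" / "ES_C.mod"
    if not model.is_file():
        # A missing file may mean training stopped early; check the GeneMark logs.
        return None
    return f"gmhmm={model.resolve()} #GeneMark HMM file"
```

Returning None rather than guessing at ES_A.mod or ES_B.mod is deliberate: the thread only identifies ES_C.mod as the file to use.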


On Tue, Jul 3, 2018 at 11:05 AM Jennifer Anderson <
jennifer.anderson at ebc.uu.se> wrote:

>
> Hello,
>
> I am working on annotations for fungal genomes, using GenemarkES with
> --fungi for gene prediction.  In earlier attempts, I did not use the
>  training flag, and I did get the output gmhmm file.  Now I have tried with
> the training flag and do not get this file.  In the /run/ directory I do
> get mod files  ES_A.mod, ES_B.mod, and ES_C.mod, as well as ini.mod.  Does
> one of these files work as the ES.mod file as in
> "gmhmm=../train_genemark/es.mod #GeneMark HMM file" from
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained? I
> don't find documentation of the genemarkES output online.
>
> Thank you.
>
> Jenni
>
> E-mailing Uppsala University means that we will process your personal
> data. For more information on how this is performed, please read here:
> http://www.uu.se/om-uu/dataskydd-personuppgifter/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180705/6ab6707b/attachment-0001.html>

From carsonhh at gmail.com  Thu Jul  5 12:47:38 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:47:38 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
References: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
Message-ID: <788E84AB-DB85-43AD-8FE1-C1D8A7DBD4B5@gmail.com>

MAKER will collapse redundant evidence after alignment, so redundancy will primarily just increase run time. The main issue with so many datasets would be false positive alignments (assembled background transcription). You can look at individual contigs in Apollo, IGV, or another browser to see where spurious alignments occur and whether they are mostly associated with a particular dataset (it's OK to throw out a noisy dataset, especially if you have additional data).

--Carson
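
If the assemblies are simply concatenated, it helps to keep track of which dataset each transcript came from, so that a noisy dataset can be identified later in a browser. Below is a minimal sketch of one way to do that; the function name concat_tagged and the "__" separator are assumptions for illustration, not MAKER conventions.

```python
from pathlib import Path
from typing import Iterable


def concat_tagged(fasta_paths: Iterable[str], out_path: str) -> int:
    """Concatenate FASTA files, prefixing each record ID with its source file stem.

    Returns the number of records written.
    """
    n_records = 0
    with open(out_path, "w") as out:
        for path in fasta_paths:
            tag = Path(path).stem  # e.g. 'studyA' for 'studyA.fa'
            last = "\n"
            with open(path) as fh:
                for line in fh:
                    if line.startswith(">"):
                        out.write(f">{tag}__{line[1:]}")
                        n_records += 1
                    else:
                        out.write(line)
                    last = line
            if not last.endswith("\n"):
                out.write("\n")  # keep the next header on its own line
    return n_records
```

The combined file can then be given to MAKER as a single transcript evidence file, and the prefix on each alignment name shows its origin.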


> On Jul 4, 2018, at 6:32 AM, Lior Glick <liorglic at mail.tau.ac.il> wrote:
> 
> Dear MAKER users,
> 
> I am new to MAKER and would like your advice.
> I am planning to annotate multiple genomes of tomato variants and wild relatives. To this end, I have been working on generating a diverse transcripts data set to be used as input for MAKER (along with protein sequences and the 'official' tomato annotation). My transcripts set was generated by collecting multiple available RNA-Seq results from SRA, covering diverse variants, conditions and tissues, and assembling them into transcripts using Trinity. My goal is to have a data set as diverse and broad as possible.
> Now I have ~30 fasta files of transcripts, originating from different studies. Of course, many of the transcripts are redundant and/or partial. I am exploring ways to merge the multiple data sets into a non-redundant one, while also stitching partial transcripts into longer ones based on overlaps.
> However, this turns out to be not-so-trivial and I am wondering if this is really necessary in order to get a good annotation? Maybe I can just concatenate all my transcriptome assembly results, and MAKER will handle redundant and partial transcripts?
> Can someone clarify how this works, and try to assess if an annotation based on a merged data set should be superior to one that didn't undergo such a process? If someone has actual experience with such data, that  would be really helpful, but any advice would be highly appreciated.
> 
> Thanks a lot and best regards,
> Lior

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180705/b1d1cdc6/attachment-0001.html>

From carsonhh at gmail.com  Thu Jul  5 12:50:36 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:50:36 -0600
Subject: [maker-devel] [CAUTION: Suspicious Link] map_forward=1 not
 mapping reference ID's to output correctly
In-Reply-To: <D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
References: <CA+DOteeTFd06_k5ONYLvn7FpUuv-JDNqp1PCFa9QF0TxDa9iEg@mail.gmail.com>
	<D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
Message-ID: <4EE96E7F-5F5B-4988-BC9C-FC441848B768@gmail.com>

A quick overview of MAKER behavior. MAKER will keep everything in model_gff as long as you don't provide another predictor to run or a pred_gff file to use. But if you give it a predictor to run, it takes that as an indicator that you want to update models. So a model_gff model may get replaced by another prediction that overlaps it but scores better.

So depending on the behavior you want, make sure you are using model_gff and do or don't provide a gene predictor to run.

--Carson
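
The rule described above can be expressed as a small check on a maker_opts.ctl file. This is only a sketch of that rule as stated in this message, not MAKER's internal logic; the function names are hypothetical, and the set of options counted as "a predictor was supplied" is an assumption based on the Gene Prediction section of the control file.

```python
from typing import Dict

# Options treated here as supplying a predictor or external predictions (assumed set).
PREDICTOR_KEYS = ("snaphmm", "gmhmm", "augustus_species", "fgenesh_par_file", "pred_gff")


def parse_opts(text: str) -> Dict[str, str]:
    """Parse 'key=value #comment' lines from a MAKER control file into a dict."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()
    return opts


def models_may_be_replaced(opts: Dict[str, str]) -> bool:
    """True if model_gff is set and at least one predictor option is also set."""
    if not opts.get("model_gff"):
        return False
    return any(opts.get(key) for key in PREDICTOR_KEYS)
```

When this returns False for a file that sets model_gff, the models are expected to pass through unchanged according to the description above.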


> On Jun 22, 2018, at 2:04 PM, Poelchau, Monica <monica.poelchau at ars.usda.gov> wrote:
> 
> Hi Kapeel,
>  
> If you just want your community annotations to replace models in an existing gene set, we have a tool for this:
>  
> https://github.com/NAL-i5K/GFF3toolkit
>  
> You'd need to run gff3_QC on your annotation files first to make sure your annotations are okay, then use gff3_merge to merge your community annotations with your existing gene set (in gff3 format). If you end up trying this out - we're actively developing the GFF3toolkit, so feel free to post an issue if you notice any problems.
>  
> Hth,
>  
> Monica 
>  
> From: maker-devel <maker-devel-bounces at yandell-lab.org <mailto:maker-devel-bounces at yandell-lab.org>> on behalf of Kapeel Chougule <kapeelc at gmail.com <mailto:kapeelc at gmail.com>>
> Date: Friday, June 22, 2018 at 13:53
> To: "maker-devel at yandell-lab.org <mailto:maker-devel at yandell-lab.org>" <maker-devel at yandell-lab.org <mailto:maker-devel at yandell-lab.org>>
> Subject: [CAUTION: Suspicious Link][maker-devel] map_forward=1 not mapping reference ID's to output correctly
>  
> PROCEED WITH CAUTION: This message triggered warnings of potentially malicious web content. Evaluate this email by considering whether you are expecting the message, along with inspection for suspicious links.
> 
> Questions: Spam.Abuse at wdc.usda.gov <mailto:Spam.Abuse at wdc.usda.gov>
> 
> Hi,
>  
> I am trying to update community annotation <https://de.cyverse.org/dl/d/39D60E88-078D-4CF5-9F3A-D712B714CDD8/community.annotation.gff3> in the light of new evidence data but my MAKER runs are not keeping all the genes from the community annotation.
> 
> 
> Community annotation feature count: 2 1 bicolor 239969 CDS 266301 exon 51066 five_prime_UTR 34129 gene 47121 mRNA 53708 three_prime_UTR
> MAKER gene count->  
> awk '$3=="gene"{print}' maker_output.all.gff | grep "Sobic*" | wc -l 21105
>  
> In the maker_opts.ctl file attached, I set keep_preds=1 and map_forward=1, which should keep all the community gene models even if they don't have evidence support. This was explained here:
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data
> So I am not sure why not all the community gene models are mapped in the MAKER output.
> 
> Thanks
> 
> Kapeel
> -- 
>  
> Kapeel Chougule
> Computational Scientist Developer II
> One Bungtown Road Cold Spring Harbor, NY 11724
> http://www.warelab.org/ <http://www.warelab.org/>
> 
> 
> 
> This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately. _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com <mailto:maker-devel at box290.bluehost.com>
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org <http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180705/0a9af988/attachment-0001.html>

From carsonhh at gmail.com  Thu Jul  5 13:17:14 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 13:17:14 -0600
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
In-Reply-To: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
References: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
Message-ID: <C61CC367-F138-47F2-AA61-876811458353@gmail.com>

Sorry for the slow reply. Make sure you find out what flavor of MPI you are using (MPICH, MVAPICH2, Intel MPI, or OpenMPI). MAKER does not work with MVAPICH2. It can work with Intel MPI and OpenMPI with some command line modification. And it always works with MPICH, but MPICH may not be able to scale to more than ~100 CPUs.

The option "-mca btl ^openib", for example, applies only to OpenMPI. Also, if using OpenMPI, set LD_PRELOAD in accordance with the INSTALL documentation. Also make sure you do not have multiple MPI flavors installed, with MAKER compiled against one and then run with a different one. That will cause failure shortly after starting MAKER.

Try looking further back in your STDERR for the actual cause. The "Thread 1 terminated abnormally:" message is the tail end of the failure snowball, so the actual cause is often much further back.

--Carson
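
To look further back in STDERR, one option is to report the earliest line that looks like a failure instead of the last one. The sketch below assumes a captured log such as the 2.err file from the commands quoted below; the marker words in the pattern are guesses based on messages seen in this thread, not a complete list of what MAKER can emit.

```python
import re
from typing import Iterable, Optional, Tuple

# Marker words are assumptions; extend the pattern for your own logs.
FAILURE = re.compile(r"\b(ERROR|FATAL|died|terminated abnormally)\b", re.IGNORECASE)


def first_failure(lines: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, text) of the earliest failure-like line, or None."""
    for number, line in enumerate(lines, start=1):
        if FAILURE.search(line):
            return number, line.rstrip("\n")
    return None
```

For example, calling first_failure(open("2.err")) returns the first suspicious line, which is usually a better starting point than the final "Thread 1 terminated abnormally" message.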


> On Jun 26, 2018, at 9:36 AM, André Machado <andremmachado25 at gmail.com> wrote:
> 
> Hi ,
> 
> First of all, thanks for your efforts on the MAKER pipeline. It's a tremendous help for people who work with genomes.
> For the last 4 days I have been struggling with an error, still without a solution.
> I found this old thread: https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ
> It seems quite similar, but it doesn't point to a specific solution.
> I have run MAKER with the test data and everything ran OK. MAKER finished the entire process without errors.
> Recently, I have been trying to run my own data on an MPI cluster, but this error occurs frequently.
> Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0
> --> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> deleted:0 hits
> preparing ab-inits
> deleted:0 hits
> deleted:0 hits
> FATAL: Thread terminated, causing all processes to fail
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> 
> Basically, I'm trying to run MAKER with dna.fa, rna.fa, prot.fa and my_custom_lib_of_repeats.fa, to produce raw gene models which will be used to train SNAP.
> 
> I have already tried several command lines and all gave me the same error. The only difference between tests was the location of the error: sometimes it happened on compute-0-1.local, other times on compute-0-4.local or on another node.
> mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err
> 
> mpiexec --hostfile Host maker 1>1.log 2>2.err
> mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err
> nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err
> 
> The log file as well the option files are provided below.
> 
> Many thanks in advance,
> 
> André
> 
> <2.log><maker_exe.ctl><maker_opts.ctl><maker_bopts.ctl>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180705/a5f9a756/attachment-0001.html>

From andremmachado25 at gmail.com  Wed Jul  4 05:16:08 2018
From: andremmachado25 at gmail.com (André Machado)
Date: Wed, 4 Jul 2018 12:16:08 +0100
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
Message-ID: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>

Hi ,


First of all, thanks for your efforts on the MAKER pipeline. It's a tremendous
help for people who work with genomes.

For the last 4 days I have been struggling with an error, still
without a solution.

I found this old thread:
https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ

It seems quite similar, but it doesn't point to a specific solution.

I have run MAKER with the test data and everything ran OK. MAKER finished the
entire process without errors.

Recently, I have been trying to run my own data on an MPI cluster, but this
error occurs frequently.

Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0

--> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.

--> rank=8, hostname=compute-0-1.local

deleted:0 hits

deleted:0 hits

preparing ab-inits

deleted:0 hits

deleted:0 hits

FATAL: Thread terminated, causing all processes to fail

--> rank=8, hostname=compute-0-1.local

deleted:0 hits


Basically, I am trying to run MAKER with dna.fa, rna.fa, prot.fa and
my_custom_lib_of_repeats.fa to produce raw gene models that will be used
to train SNAP.


I have already tried several command lines, and all gave me the same error. The
only difference between tests was where the error occurred: sometimes on
compute-0-1.local, other times on compute-0-4.local or on another node.

mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err

mpiexec --hostfile Host maker 1>1.log 2>2.err

mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err

nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err


The log file and the option files are attached below.


Many thanks in advance,


André
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2.log
Type: text/x-log
Size: 38654 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0001.log>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1223 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4547 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0004.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1412 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0005.obj>

From liorglck at gmail.com  Wed Jul  4 06:28:14 2018
From: liorglck at gmail.com (Lior Glick)
Date: Wed, 4 Jul 2018 14:28:14 +0200
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
Message-ID: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>

Dear MAKER users,

I am new to MAKER and would like your advice.
I am planning to annotate multiple genomes of tomato variants and wild
relatives. To this end, I have been working on generating a diverse
transcripts data set to be used as input for MAKER (along with protein
sequences and the 'official' tomato annotation). My transcripts set was
generated by collecting multiple available RNA-Seq results from SRA,
covering diverse variants, conditions and tissues, and assembling them into
transcripts using Trinity. My goal is to have a data set as diverse and
broad as possible.
Now I have ~30 fasta files of transcripts, originating from different
studies. Of course, many of the transcripts are redundant and/or partial. I
am exploring ways to merge the multiple data sets into a non-redundant one,
while also stitching partial transcripts into longer ones based on overlaps.
However, this turns out to be not-so-trivial and I am wondering if this is
really necessary in order to get a good annotation? Maybe I can just
concatenate all my transcriptome assembly results, and MAKER will handle
redundant and partial transcripts?
Can someone clarify how this works, and try to assess if an annotation
based on a merged data set should be superior to one that didn't undergo
such a process? If someone has actual experience with such data, that
would be really helpful, but any advice would be highly appreciated.

Thanks a lot and best regards,
Lior

From carsonhh at gmail.com  Thu Jul 12 14:05:00 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:05:00 -0600
Subject: [maker-devel] Maker Error: Thread 1 terminated abnormally
In-Reply-To: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
References: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
Message-ID: <5F1E5499-239E-405E-81EC-CECC755D7838@gmail.com>

Because you truncated/removed the lines before the actual error (I need to see the several hundred lines that came before "Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0"), I can't give you more info.

But you are getting a lot of OpenMPI complaints at the start. You may need to reinstall OpenMPI or use MPICH instead (both will require you to reinstall MAKER, as it will need to rebuild the MPI C/Perl bindings for the new installation). Also, when using OpenMPI, make sure to export LD_PRELOAD in the way outlined in the .../maker/INSTALL instructions.
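
As an illustration of the log context requested above, here is a minimal Python sketch (not part of MAKER) that keeps the lines leading up to the first occurrence of the error message. The function name, the default marker string and the 300-line window are assumptions chosen for this example.

```python
from collections import deque


def context_before(lines, marker="Thread 1 terminated abnormally", n=300):
    """Return up to n lines preceding the first line that contains
    marker, followed by that line itself. Returns [] if marker is absent."""
    window = deque(maxlen=n)  # rolling buffer of the most recent n lines
    for line in lines:
        if marker in line:
            return list(window) + [line]
        window.append(line)
    return []
```

It could be called on an open file handle of the captured stderr (for example the 2.err file from the commands quoted below) and the result pasted into a reply.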

--Carson


> On Jul 4, 2018, at 5:16 AM, André Machado <andremmachado25 at gmail.com> wrote:
> 
> Hi ,
> 
> First of all thanks for your efforts in Maker pipeline. Its a tremendous help for the people that works with genomes.
> In the last 4 days i have broke my head.. with an error .. but still without a solution.
> I found this old thread: https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ
> Seems to be a quite similar... but don't point to a specific solution.
> I have run maker with the data test and all runned ok. Maker finalize the entire process without errors.
> Recently, i'm trying to aplly my own data on MPI cluster. But this error, frequently occurred.
> Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0
> --> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> deleted:0 hits
> preparing ab-inits
> deleted:0 hits
> deleted:0 hits
> FATAL: Thread terminated, causing all processes to fail
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> 
> Basically im tring to run a maker with dna.fa, rna.fa, prot.fa and my_custom_lib_of_repeats.fa, to produce raw genes models which will be used to train SNAP.
> 
> I already used several command lines and all gave me the same error.. The only change between different tests was the local of the error, sometimes happened in compute-0-1.local other time in compute-0-4.local or in another one.
> mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err
> 
> mpiexec --hostfile Host maker 1>1.log 2>2.err
> mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err
> nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err
> 
> The log file as well the option files are provided below.
> 
> Many thanks in advance,
> 
> André
> 
> <2.log><maker_exe.ctl><maker_opts.ctl><maker_bopts.ctl>


From carsonhh at gmail.com  Thu Jul 12 14:38:33 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:38:33 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
Message-ID: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>

MAKER will automatically collapse redundant evidence. The only thing you may need to worry about with too many datasets is background transcription: the more datasets you add, the more spurious assemblies from background transcription you will have (if you sequence deeply enough, everything is transcribed at some level). You should also look at the results in a browser like Apollo. You may find that some datasets are noisier than others, and it would be beneficial to drop them, especially if they are redundant. So always do a visual review of the results.
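
To make a noisy dataset easy to spot and drop later, one option when concatenating the transcript files is to tag every sequence ID with the file it came from. The following is a minimal Python sketch under that assumption; the function name, the underscore separator and the use of the file stem as the tag are illustrative choices, not anything MAKER requires.

```python
from pathlib import Path


def tag_and_concatenate(fasta_paths, out_path):
    """Concatenate FASTA files into out_path, prefixing each record ID
    with the stem of its source file. Returns the number of records."""
    n_records = 0
    with open(out_path, "w") as out:
        for path in map(Path, fasta_paths):
            tag = path.stem  # e.g. "study01" for study01.fasta
            with open(path) as fh:
                for line in fh:
                    # Guard against files that lack a final newline.
                    if not line.endswith("\n"):
                        line += "\n"
                    if line.startswith(">"):
                        out.write(f">{tag}_{line[1:]}")
                        n_records += 1
                    else:
                        out.write(line)
    return n_records
```

With IDs tagged this way, the source study of any spurious alignment seen in the browser can be read directly from the evidence name.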

--Carson


> On Jul 4, 2018, at 6:28 AM, Lior Glick <liorglck at gmail.com> wrote:
> 
> Dear MAKER users,
> 
> I am new to MAKER and would like your advice.
> I am planning to annotate multiple genomes of tomato variants and wild relatives. To this end, I have been working on generating a diverse transcripts data set to be used as input for MAKER (along with protein sequences and the 'official' tomato annotation). My transcripts set was generated by collecting multiple available RNA-Seq results from SRA, covering diverse variants, conditions and tissues, and assembling them into transcripts using Trinity. My goal is to have a data set as diverse and broad as possible.
> Now I have ~30 fasta files of transcripts, originating from different studies. Of course, many of the transcripts are redundant and/or partial. I am exploring ways to merge the multiple data sets into a non-redundant one, while also stitching partial transcripts into longer ones based on overlaps.
> However, this turns out to be not-so-trivial and I am wondering if this is really necessary in order to get a good annotation? Maybe I can just concatenate all my transcriptome assembly results, and MAKER will handle redundant and partial transcripts?
> Can someone clarify how this works, and try to assess if an annotation based on a merged data set should be superior to one that didn't undergo such a process? If someone has actual experience with such data, that  would be really helpful, but any advice would be highly appreciated.
> 
> Thanks a lot and best regards,
> Lior


From shijunpeng at cau.edu.cn  Sat Jul 14 02:04:38 2018
From: shijunpeng at cau.edu.cn (=?UTF-8?B?5Y+y5L+K6bmP?=)
Date: Sat, 14 Jul 2018 16:04:38 +0800 (GMT+08:00)
Subject: [maker-devel] Ask for help about the collapse of Maker (version
 2.31.9) when annotated with Fgenesh
In-Reply-To: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
	<C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
Message-ID: <183e519e.83bf.16497d1fd4b.Coremail.shijunpeng@cau.edu.cn>

Dear Carson,

First of all, I must apologize that I couldn't post my question in the Google group, since I can't access Google from mainland China.

I am using MAKER (version 2.31.9) to annotate several foxtail millet genomes. I combined Augustus and Fgenesh (v.3.1.1) for the de novo annotation of these genomes.

The majority of contigs were annotated well by the MAKER pipeline. However, several contigs failed during Fgenesh annotation with the following error:

#--------- command -------------#
Widget::fgenesh:
/NAS7/home/shijunpeng/software/maker/bin/../lib/Widget/fgenesh/fgenesh_wrap /NAS7/home/shijunpeng/software/fgenesh/fgenesh /NAS7/home/shijunpeng/software/fgenesh/Monocots /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta -exon_table:/tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.xdef.fgenesh > /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-
#-------------------------------#
ERROR: FgenesH failed
--> rank=NA, hostname=bioinfor3.local
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:scaffold_1

ERROR: Chunk failed at level:6, tier_type:0
FAILED CONTIG:scaffold_1
###############################################################################################################################################

A system core file was generated after this crash. I checked the temporary FASTA file 108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta and it looks normal, about 300 bp long.

I also checked my original sequence file and confirmed it has no problems (only A, T, C, G and N). I also tried changing the pred_flank option from 200 (the original value) to 0, and the error persists.
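
For reference, a check of this kind can be scripted. The sketch below is a minimal Python illustration that counts any sequence characters outside A, C, G, T and N; the function name and the case-insensitive comparison are assumptions made for the example.

```python
from collections import Counter


def unexpected_bases(fasta_lines, allowed="ACGTN"):
    """Count sequence characters not in allowed (case-insensitive).
    Header lines starting with '>' are skipped."""
    allowed_set = set(allowed.upper())
    bad = Counter()
    for line in fasta_lines:
        if line.startswith(">"):
            continue
        for ch in line.strip().upper():
            if ch not in allowed_set:
                bad[ch] += 1
    return bad
```

An empty result means every residue is one of the allowed letters; the same function can be applied to the temporary per-region FASTA file that is passed to Fgenesh.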

I ran the MAKER pipeline on a single node with 16 processors and 256 GB of RAM, so it is probably not an MPI problem.

Below are my detailed MAKER behavior options:
#-----MAKER Behavior Options
max_dna_len=300000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=10000 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=0 #flank for extending evidence clusters sent to gene predictors
pred_stats=1 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=1 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=1 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=1 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=5 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files 

Could you please help me solve this error? I look forward to hearing from you.

Sincerely, 
Junpeng

--
Junpeng Shi, PhD
State Key Lab For Agrobiotech, China Agricultural University
National Maize Improvement Center of China 
Center For Life Science, NO.2, 
The West Street of Yuanmingyuan Park, Beijing, P.R.China 
Tel: +86-13581863941

From liorglic at mail.tau.ac.il  Tue Jul 24 01:45:06 2018
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Tue, 24 Jul 2018 09:45:06 +0200
Subject: [maker-devel] Annotation of a new variant within a species
Message-ID: <CAOzMDPxSUnk5zJXQhsu_SwbzHiJJ0sP0H5KOhD6L0OFFdD8sKg@mail.gmail.com>

Hello,

I am trying to annotate multiple variants of tomato. While a good
annotation of the reference genome is available, I have de novo assembled
other variants of the same species and wish to annotate them.
Most MAKER documentation refers to annotation of a new species, while using
transcripts and proteins from either the exact same sample (individual) or
from "an alternate organism", so I'm not sure what to do in this case,
where I am annotating various samples from the same species. I have two
questions:

1. Regarding transcripts data, how should I use transcripts from other
variants of the same species? Namely, should I use the est or the altest
parameter? What is the actual difference in behavior?

2. Is there a way to incorporate gene models (in gff format) from the
reference annotation? I expect high similarity in my assembled variants,
but not identity in terms of content and coordinates, so neither pred_gff
nor model_gff sound like what I need, as far as I understand.
I could also use the reference annotation and sequence to extract cDNA and
provide them as EST data. Is this the way to go? It feels like some
information on introns might be lost this way.
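
The cDNA extraction mentioned in question 2 can be sketched as follows. This is a minimal Python illustration, assuming a GFF3 file whose exon features carry a Parent attribute and a genome already loaded into a dict of sequence ID to sequence; the function name is hypothetical. Exons are joined in coordinate order and reverse-complemented on the minus strand, so the output contains no intron sequence, which is exactly the loss the question raises.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def spliced_transcripts(gff_lines, genome):
    """Yield (transcript_id, cDNA) by joining exon features per Parent.

    genome: dict mapping sequence ID -> sequence string.
    GFF3 coordinates are 1-based and inclusive.
    """
    exons = defaultdict(list)  # transcript ID -> [(start, end, seqid, strand)]
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((int(cols[3]), int(cols[4]), cols[0], cols[6]))
    for tid, parts in exons.items():
        parts.sort()  # ascending genomic order
        seqid, strand = parts[0][2], parts[0][3]
        seq = "".join(genome[seqid][s - 1:e] for s, e, _, _ in parts)
        if strand == "-":
            seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
        yield tid, seq
```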

Would highly appreciate your answers to these questions or any other advice.

Thank you very much!

From roscito at mpi-cbg.de  Tue Jul 31 06:59:58 2018
From: roscito at mpi-cbg.de (Ju Roscito)
Date: Tue, 31 Jul 2018 14:59:58 +0200
Subject: [maker-devel] Few alternative isoforms when alt_splice=0
Message-ID: <2C92DF72-0733-490F-A2EE-6F3724EF7099@mpi-cbg.de>

Dear all,

I have a question about the behaviour of the alt_splice option; there does not seem to be much about it on the forum.

I ran a single round of MAKER (2.31.9) on a vertebrate genome, with Trinity mRNA data and mapped proteins from closely related species. I set alt_splice to 0, but still got two to four mRNAs for ~20 of the 19,000 predicted genes. Has anyone seen the same? Any idea why that would happen?
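
For anyone wanting to list the affected genes, here is a minimal Python sketch that counts mRNA features per parent gene in a GFF3 file and reports the genes with more than one. It assumes each mRNA line has a single Parent attribute pointing at its gene; the function name is hypothetical.

```python
from collections import Counter


def genes_with_multiple_mrnas(gff_lines):
    """Return {gene_id: mRNA_count} for genes with more than one mRNA."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        parent = attrs.get("Parent")
        if parent:
            counts[parent] += 1
    return {gene: n for gene, n in counts.items() if n > 1}
```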

Thanks a lot in advance.


From timo.metz at googlemail.com  Fri Jul 20 06:20:05 2018
From: timo.metz at googlemail.com (Timo Metz)
Date: Fri, 20 Jul 2018 12:20:05 -0000
Subject: [maker-devel] MAKER chooser algorithm
Message-ID: <CAKGvZVN6En4AmnMV1neZ_OmAGS341CJaZ7Fbgny1KB1CUd1_Jg@mail.gmail.com>

Hey,

I am working on improving an existing annotation. I have found that
MAKER sometimes splits or merges genes in ways that do not look correct
given the evidence. Please find two examples attached. The first track is
the old annotation and the second track is the new annotation, followed by
RNA-seq data, proteins, repeats, the SNAP prediction and the Augustus
prediction. In both cases the evidence visibly supports two genes, and one
gene predictor creates one gene where the other creates two. I do not
understand why the gene is merged in these cases when the evidence and one
ab initio prediction both support two genes. Are there any suggestions on
how to solve this?

best
Timo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picture1.png
Type: image/png
Size: 26778 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picutre2.png
Type: image/png
Size: 24145 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment-0003.png>

From cganote at iu.edu  Tue Jul 24 10:31:02 2018
From: cganote at iu.edu (Ganote, Carrie L)
Date: Tue, 24 Jul 2018 16:31:02 -0000
Subject: [maker-devel] Maker ignores evidence and just returns gffs with
 genome contigs
Message-ID: <D77CCC75.46875%cganote@iu.edu>

Running maker, I don't see anything in the gff except the names of the contigs and their lengths:

##gff-version 3
SczI0sq_2092%3%3D3122    .       contig  1       119548  .       .       .       ID=SczI0sq_2092%3%3D3122;Name=SczI0sq_2092%3%3D3122
###
SczI0sq_842%3%3D1778     .       contig  1       4693    .       .       .       ID=SczI0sq_842%3B%3D1778;Name=SczI0sq_842%3%3D1778
###
...

In my opts file, I have:

#-----Genome (these are always required)
genome=/projects/Reference/genome.chr.fa #genome sequence (fasta file or fasta embeded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est= #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/projects/Reference/Maker/EST_assembled.all.gff #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closly relate species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=  #protein sequence file in fasta format (i.e. from mutiple oransisms)
protein_gff=/projects/Reference/Maker/exonerate_withCC.gff3  #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org= #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species= #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff=/projects/Reference/Maker/augustus_output.reformated.gff #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files

It ran for ~3 hours and all contigs in the log file said FINISHED. No failures. Did I set something wrong?
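
As a quick way to see what a GFF3 file actually contains, both MAKER's output and the files given to est_gff, protein_gff and pred_gff, here is a minimal Python sketch that tallies (source, type) pairs from columns 2 and 3. It is only a diagnostic aid with a hypothetical function name, not an explanation of the behaviour described above.

```python
from collections import Counter


def feature_summary(gff_lines):
    """Count (source, type) pairs in GFF3 lines, stopping at ##FASTA."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more features
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3:
            counts[(cols[1], cols[2])] += 1
    return counts
```

Running it on each input file shows which feature types are present in the evidence and can be compared against the tally for the output file.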

-Carrie

From jason.stajich at gmail.com  Thu Jul  5 12:13:57 2018
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 5 Jul 2018 11:13:57 -0700
Subject: [maker-devel] Genemark XXX.mod files
In-Reply-To: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
References: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
Message-ID: <CALf8Lpy-KretSGjwmMTyxOARiJ251mzbxb8HuV4XU8asOfW0dg@mail.gmail.com>

The run/ES_C.mod file should be the right one, if it is there.
Is it possible that it is crashing during one of the training/retraining steps?

Jason Stajich
jason.stajich at gmail.com


On Tue, Jul 3, 2018 at 11:05 AM Jennifer Anderson <
jennifer.anderson at ebc.uu.se> wrote:

>
> Hello,
>
> I am working on annotations for fungal genomes, using GenemarkES with
> --fungi for gene prediction. In earlier attempts, I did not use the
> training flag, and I did get the output gmhmm file. Now I have tried with
> the training flag and do not get this file. In the /run/ directory I do
> get mod files ES_A.mod, ES_B.mod, and ES_C.mod, as well as ini.mod. Does
> one of these files work as the ES.mod file as in
> "gmhmm=../train_genemark/es.mod #GeneMark HMM file" from
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained? I
> don't find documentation of the genemarkES output online.
>
> Thank you.
>
> Jenni
>

From carsonhh at gmail.com  Thu Jul  5 12:47:38 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:47:38 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
References: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
Message-ID: <788E84AB-DB85-43AD-8FE1-C1D8A7DBD4B5@gmail.com>

MAKER will collapse redundant evidence after alignment, so redundancy will primarily just increase run time. The main issue with so many datasets is false-positive alignments (assembled background transcription). You can look at individual contigs in Apollo, IGV, or another browser to see where spurious alignments occur and whether they are associated with a particular dataset (it's OK to throw out a noisy dataset, especially if you have additional data).
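
To make it easier to see in a browser which dataset a spurious alignment came from, one option is to tag every transcript ID with its dataset name before concatenating the files. The sketch below is only an illustration: the function names, the "_" separator and the file names in the usage note are hypothetical, and MAKER does not require this step.

```python
from typing import Dict, TextIO


def tag_fasta(dataset: str, source: TextIO, sink: TextIO, sep: str = "_") -> int:
    """Copy FASTA records from source to sink, prefixing each ID with dataset.

    Returns the number of records written.
    """
    count = 0
    for raw in source:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            # ">TRINITY_DN1 len=300" becomes ">studyA_TRINITY_DN1 len=300"
            sink.write(f">{dataset}{sep}{line[1:]}\n")
            count += 1
        else:
            # Sequence lines are copied unchanged; a missing final newline is added.
            sink.write(line + "\n")
    return count


def merge_tagged(datasets: Dict[str, str], out_path: str) -> int:
    """Concatenate FASTA files (dataset name -> path) into one tagged file."""
    total = 0
    with open(out_path, "w") as sink:
        for name, path in datasets.items():
            with open(path) as source:
                total += tag_fasta(name, source, sink)
    return total
```

For example, merge_tagged({"studyA": "studyA.fasta", "studyB": "studyB.fasta"}, "all_transcripts.fasta") would write one combined file whose IDs still show their origin.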

--Carson



From carsonhh at gmail.com  Thu Jul  5 12:50:36 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:50:36 -0600
Subject: [maker-devel] [CAUTION: Suspicious Link] map_forward=1 not
 mapping reference ID's to output correctly
In-Reply-To: <D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
References: <CA+DOteeTFd06_k5ONYLvn7FpUuv-JDNqp1PCFa9QF0TxDa9iEg@mail.gmail.com>
	<D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
Message-ID: <4EE96E7F-5F5B-4988-BC9C-FC441848B768@gmail.com>

A quick overview of MAKER behavior: MAKER will keep everything in model_gff as long as you don't provide another predictor to run or a pred_gff file to use. But if you give it a predictor to run, it takes that as an indicator that you want to update models, so a model from model_gff may get replaced by another prediction that overlaps it but scores better.

So depending on the behavior you want, make sure you are using model_gff, and either do or don't provide a gene predictor to run.
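
The behavior described above can be checked against a control file with a few lines of Python. This is a sketch under stated assumptions: the helper names are hypothetical, and treating exactly the options listed in PREDICTOR_KEYS as "a predictor to run" is my reading of the description, not something MAKER documents in this form.

```python
from typing import Dict, List

# Option names as they appear in maker_opts.ctl. Treating exactly these as
# "a predictor to run" is an assumption made for this sketch.
PREDICTOR_KEYS = ("snaphmm", "gmhmm", "augustus_species", "fgenesh_par_file", "pred_gff")


def parse_ctl(text: str) -> Dict[str, str]:
    """Parse 'key=value #comment' lines into a dict of stripped values."""
    opts: Dict[str, str] = {}
    for raw in text.splitlines():
        # Drop the trailing comment, then split on the first '='.
        line = raw.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()
    return opts


def models_may_be_replaced(opts: Dict[str, str]) -> List[str]:
    """Return the predictor options that are set alongside model_gff."""
    if not opts.get("model_gff"):
        return []
    return [key for key in PREDICTOR_KEYS if opts.get(key)]
```

An empty list means model_gff is the only source of gene models, so nothing should be replaced; a non-empty list names the options that could cause models to be updated.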

--Carson


> On Jun 22, 2018, at 2:04 PM, Poelchau, Monica <monica.poelchau at ars.usda.gov> wrote:
> 
> Hi Kapeel,
>  
> If you just want your community annotations to replace models in an existing gene set, we have a tool for this:
>  
> https://github.com/NAL-i5K/GFF3toolkit
>
> You'd need to run gff3_QC on your annotation files first to make sure your annotations are okay, then use gff3_merge to merge your community annotations with your existing gene set (in gff3 format). If you end up trying this out - we're actively developing the GFF3toolkit, so feel free to post an issue if you notice any problems.
>  
> Hth,
>  
> Monica 
>  
> From: maker-devel <maker-devel-bounces at yandell-lab.org> on behalf of Kapeel Chougule <kapeelc at gmail.com>
> Date: Friday, June 22, 2018 at 13:53
> To: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject: [CAUTION: Suspicious Link][maker-devel] map_forward=1 not mapping reference ID's to output correctly
>  
> Hi,
>  
> I am trying to update a community annotation <https://de.cyverse.org/dl/d/39D60E88-078D-4CF5-9F3A-D712B714CDD8/community.annotation.gff3> in light of new evidence data, but my MAKER runs are not keeping all the genes from the community annotation.
>
> Community annotation feature count:
>       2
>       1 bicolor
>  239969 CDS
>  266301 exon
>   51066 five_prime_UTR
>   34129 gene
>   47121 mRNA
>   53708 three_prime_UTR
>
> MAKER gene count:
> awk '$3=="gene"{print}' maker_output.all.gff | grep "Sobic*" | wc -l
> 21105
>
> In the attached maker_opts.ctl file, I set keep_preds=1 and map_forward=1, which should keep all the community gene models even if they don't have evidence support, as explained here:
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data
> So I am not sure why not all of the community gene models are mapped in the MAKER output.
> 
> Thanks
> 
> Kapeel
> -- 
>  
> Kapeel Chougule
> Computational Scientist Developer II
> One Bungtown Road Cold Spring Harbor, NY 11724
> http://www.warelab.org/
> 
> 
> 

From carsonhh at gmail.com  Thu Jul  5 13:17:14 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 13:17:14 -0600
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
In-Reply-To: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
References: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
Message-ID: <C61CC367-F138-47F2-AA61-876811458353@gmail.com>

Sorry for the slow reply. Make sure you find out what flavor of MPI you are using (MPICH, MVAPICH2, Intel MPI, or OpenMPI). MAKER does not work with MVAPICH2. It can work with Intel MPI and OpenMPI with some command line modification. And it always works with MPICH, but MPICH may not be able to scale to more than ~100 CPUs.

For example, the option "-mca btl ^openib" applies only to OpenMPI. Also, if using OpenMPI, set LD_PRELOAD in accordance with the INSTALL documentation. Also make sure you do not have multiple MPI flavors installed, with MAKER compiled against one and then run with a different one. That will cause failure shortly after starting MAKER.

Try looking further back in your STDERR for the actual cause. The "Thread 1 terminated abnormally:" message is the tail end of the failure snowball, so the actual cause is often much further back.
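
One way to look further back is to scan the captured STDERR for suspicious lines that appear before the first "terminated abnormally" message. The sketch below is a rough aid, not a MAKER tool: the function name is hypothetical and the keyword list is a guess at what a root cause might look like.

```python
import re
from typing import Iterable, List, Tuple

MARKER = "terminated abnormally"
# Keywords that often accompany a root cause; this list is a guess, not MAKER's.
SUSPECT = re.compile(r"error|fatal|died|cannot|no such file", re.IGNORECASE)


def suspects_before_marker(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, text) of suspicious lines seen before the marker."""
    hits: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        if MARKER in line:
            # Everything after the first marker is fallout, so stop here.
            break
        if SUSPECT.search(line):
            hits.append((number, line.rstrip("\n")))
    return hits
```

Passing an open file handle for the STDERR capture (for instance the 2.err file written by the mpiexec commands in this thread) yields the candidate lines with their line numbers; the earliest hits are usually the most informative.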

--Carson



From andremmachado25 at gmail.com  Wed Jul  4 05:16:08 2018
From: andremmachado25 at gmail.com (André Machado)
Date: Wed, 4 Jul 2018 12:16:08 +0100
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
Message-ID: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>

Hi,

First of all, thanks for your efforts on the MAKER pipeline. It's a tremendous help for people who work with genomes.

For the last 4 days I have been racking my brain over an error, but I still have no solution.

I found this old thread: https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ

It seems quite similar, but it doesn't point to a specific solution.

I have run MAKER with the test data and everything ran OK; MAKER completed the entire process without errors.

Recently, I have been trying to run my own data on an MPI cluster, but this error occurs frequently:

Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0
--> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.
--> rank=8, hostname=compute-0-1.local
deleted:0 hits
deleted:0 hits
preparing ab-inits
deleted:0 hits
deleted:0 hits
FATAL: Thread terminated, causing all processes to fail
--> rank=8, hostname=compute-0-1.local
deleted:0 hits

Basically, I am trying to run MAKER with dna.fa, rna.fa, prot.fa and my_custom_lib_of_repeats.fa to produce raw gene models, which will be used to train SNAP.

I have already tried several command lines, and all gave me the same error. The only difference between tests was the location of the error: sometimes it happened on compute-0-1.local, other times on compute-0-4.local or on another node.

mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err
mpiexec --hostfile Host maker 1>1.log 2>2.err
mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err
nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err

The log file as well as the option files are attached below.

Many thanks in advance,

André
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2.log
Type: text/x-log
Size: 38655 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0002.log>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1224 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0006.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4548 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0007.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1413 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0008.obj>

From liorglck at gmail.com  Wed Jul  4 06:28:14 2018
From: liorglck at gmail.com (Lior Glick)
Date: Wed, 4 Jul 2018 14:28:14 +0200
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
Message-ID: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>

Dear MAKER users,

I am new to MAKER and would like your advice.
I am planning to annotate multiple genomes of tomato variants and wild
relatives. To this end, I have been working on generating a diverse
transcripts data set to be used as input for MAKER (along with protein
sequences and the 'official' tomato annotation). My transcripts set was
generated by collecting multiple available RNA-Seq results from SRA,
covering diverse variants, conditions and tissues, and assembling them into
transcripts using Trinity. My goal is to have a data set as diverse and
broad as possible.
Now I have ~30 fasta files of transcripts, originating from different
studies. Of course, many of the transcripts are redundant and/or partial. I
am exploring ways to merge the multiple data sets into a non-redundant one,
while also stitching partial transcripts into longer ones based on overlaps.
However, this turns out to be not-so-trivial and I am wondering if this is
really necessary in order to get a good annotation? Maybe I can just
concatenate all my transcriptome assembly results, and MAKER will handle
redundant and partial transcripts?
Can someone clarify how this works, and try to assess if an annotation
based on a merged data set should be superior to one that didn't undergo
such a process? If someone has actual experience with such data, that
would be really helpful, but any advice would be highly appreciated.

Thanks a lot and best regards,
Lior

From carsonhh at gmail.com  Thu Jul 12 14:05:00 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:05:00 -0600
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
In-Reply-To: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
References: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
Message-ID: <5F1E5499-239E-405E-81EC-CECC755D7838@gmail.com>

Because you truncated/removed the lines before the actual error (I need to see the several hundred lines that came before "Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0"), I can't give you more info.

But you are getting a lot of OpenMPI complaints at the start. You may need to reinstall OpenMPI or use MPICH instead (both will require you to reinstall MAKER, as it will need to rebuild the MPI C/Perl binding for the new installation). Also, when using OpenMPI, make sure to export LD_PRELOAD in the way outlined in the .../maker/INSTALL instructions.

--Carson



From carsonhh at gmail.com  Thu Jul 12 14:38:33 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:38:33 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
Message-ID: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>

MAKER will automatically collapse redundant evidence. The only thing you may need to worry about with too many datasets is background transcription. With more datasets you will have more spurious assemblies from background transcription (if you sequence deeply enough, everything is transcribed at some level). You should also look at the results in a browser like Apollo; you may find that some datasets are noisier than others and that it would be beneficial to drop them, especially if they are redundant. So always do a visual review of the results.

--Carson



From shijunpeng at cau.edu.cn  Sat Jul 14 02:04:38 2018
From: shijunpeng at cau.edu.cn (史俊鹏)
Date: Sat, 14 Jul 2018 16:04:38 +0800 (GMT+08:00)
Subject: [maker-devel] Ask for help about the collapse of Maker (version
 2.31.9) when annotated with Fgenesh
In-Reply-To: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
	<C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
Message-ID: <183e519e.83bf.16497d1fd4b.Coremail.shijunpeng@cau.edu.cn>

Dear Carson,

First of all, I must apologize that I couldn't post my question in the Google group, since I can't access Google from mainland China.

I am using MAKER (version 2.31.9) to annotate several foxtail millet genomes. I combined Augustus and Fgenesh (v.3.1.1) for the de novo annotation of these genomes.

The majority of contigs were annotated well by the MAKER pipeline. However, several contigs failed during annotation with Fgenesh, with the following error information:

#--------- command -------------#
Widget::fgenesh:
/NAS7/home/shijunpeng/software/maker/bin/../lib/Widget/fgenesh/fgenesh_wrap /NAS7/home/shijunpeng/software/fgenesh/fgenesh /NAS7/home/shijunpeng/software/fgenesh/Monocots /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta -exon_table:/tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.xdef.fgenesh > /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-
#-------------------------------#
ERROR: FgenesH failed
--> rank=NA, hostname=bioinfor3.local
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:scaffold_1

ERROR: Chunk failed at level:6, tier_type:0
FAILED CONTIG:scaffold_1
###############################################################################################################################################

A system core file was generated after this crash. I checked the temporary FASTA file 108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta, and it looks normal (~300 bp).

I also checked my original sequence file and confirmed that it contains only A, T, C, G and N. I also tried changing the pred_flank option from 200 (the original value) to 0, and the error still occurs.

I ran the MAKER pipeline on a single node with 16 processors and 256 GB of RAM, so the failure is probably not due to MPI problems.

Below are my detailed MAKER behavior options:
#-----MAKER Behavior Options
max_dna_len=300000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=10000 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=0 #flank for extending evidence clusters sent to gene predictors
pred_stats=1 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=1 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=1 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=1 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=5 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files 

Could you please help me to solve this error? I am looking forward to hearing from you.

Sincerely, 
Junpeng

--
Junpeng Shi, PhD
State Key Lab For Agrobiotech, China Agricultural University
National Maize Improvement Center of China 
Center For Life Science, NO.2, 
The West Street of Yuanmingyuan Park, Beijing, P.R.China 
Tel: +86-13581863941

From liorglic at mail.tau.ac.il  Tue Jul 24 01:45:06 2018
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Tue, 24 Jul 2018 09:45:06 +0200
Subject: [maker-devel] Annotation of a new variant within a species
Message-ID: <CAOzMDPxSUnk5zJXQhsu_SwbzHiJJ0sP0H5KOhD6L0OFFdD8sKg@mail.gmail.com>

Hello,

I am trying to annotate multiple variants of tomato. While a good
annotation of the reference genome is available, I have de novo assembled
other variants of the same species and wish to annotate them.
Most MAKER documentation refers to annotation of a new species, while using
transcripts and proteins from either the exact same sample (individual) or
from "an alternate organism", so I'm not sure what to do in this case,
where I am annotating various samples from the same species. I have two
questions:

1. Regarding transcripts data, how should I use transcripts from other
variants of the same species? Namely, should I use the est or the altest
parameter? What is the actual difference in behavior?

2. Is there a way to incorporate gene models (in gff format) from the
reference annotation? I expect high similarity in my assembled variants,
but not identity in terms of content and coordinates, so neither pred_gff
nor model_gff sound like what I need, as far as I understand.
I could also use the reference annotation and sequence to extract cDNA and
provide them as EST data. Is this the way to go? It feels like some
information on introns might be lost this way.

Would highly appreciate your answers to these questions or any other advice.

Thank you very much!

From roscito at mpi-cbg.de  Tue Jul 31 06:59:58 2018
From: roscito at mpi-cbg.de (Ju Roscito)
Date: Tue, 31 Jul 2018 14:59:58 +0200
Subject: [maker-devel] Few alternative isoforms when alt_splice=0
Message-ID: <2C92DF72-0733-490F-A2EE-6F3724EF7099@mpi-cbg.de>

Dear all,

I have a question about the behaviour of the alt_splice option; there seems to be little about it on the forum.

I have run a single round of MAKER (2.31.9) on a vertebrate genome, with Trinity mRNA data and mapped proteins from closely related species. I set alt_splice to 0, but still got two to four mRNAs for ~20 of the 19,000 predicted genes. Has anyone else seen the same? Any idea why that would happen?

Thanks a lot in advance.


From timo.metz at googlemail.com  Fri Jul 20 06:20:05 2018
From: timo.metz at googlemail.com (Timo Metz)
Date: Fri, 20 Jul 2018 12:20:05 -0000
Subject: [maker-devel] MAKER chooser algorithm
Message-ID: <CAKGvZVN6En4AmnMV1neZ_OmAGS341CJaZ7Fbgny1KB1CUd1_Jg@mail.gmail.com>

Hey,

I am working on improving an existing annotation. I have found that
MAKER sometimes splits or merges genes in ways that do not look correct
given the evidence. Please find two examples attached. The first track is
the old annotation, the second track the new annotation, followed by
RNA-seq data, proteins, repeats, the SNAP prediction and the Augustus
prediction. In both cases the evidence supports two genes, and in each
case one gene predictor creates one gene where the other creates two. I do
not understand why the gene is merged here when the evidence and one ab
initio prediction both support two genes. Are there any suggestions on how
to solve this?

best
Timo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picture1.png
Type: image/png
Size: 26778 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picutre2.png
Type: image/png
Size: 24145 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment-0005.png>

From cganote at iu.edu  Tue Jul 24 10:31:02 2018
From: cganote at iu.edu (Ganote, Carrie L)
Date: Tue, 24 Jul 2018 16:31:02 -0000
Subject: [maker-devel] Maker ignores evidence and just returns gffs with
 genome contigs
Message-ID: <D77CCC75.46875%cganote@iu.edu>

Running maker, I don't see anything in the gff except the names of the contigs and their lengths:

##gff-version 3
SczI0sq_2092%3%3D3122    .       contig  1       119548  .       .       .       ID=SczI0sq_2092%3%3D3122;Name=SczI0sq_2092%3%3D3122
###
SczI0sq_842%3%3D1778     .       contig  1       4693    .       .       .       ID=SczI0sq_842%3B%3D1778;Name=SczI0sq_842%3%3D1778
###
...

In my opts file, I have:

#-----Genome (these are always required)
genome=/projects/Reference/genome.chr.fa #genome sequence (fasta file or fasta embedded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est= #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/projects/Reference/Maker/EST_assembled.all.gff #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closely related species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=  #protein sequence file in fasta format (i.e. from multiple organisms)
protein_gff=/projects/Reference/Maker/exonerate_withCC.gff3  #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org= #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species= #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff=/projects/Reference/Maker/augustus_output.reformated.gff #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files

It ran for ~3 hours and all contigs in the log file said FINISHED. No failures. Did I set something wrong?

-Carrie

From jennifer.anderson at ebc.uu.se  Tue Jul  3 05:57:56 2018
From: jennifer.anderson at ebc.uu.se (Jennifer Anderson)
Date: Tue, 3 Jul 2018 13:57:56 +0200
Subject: [maker-devel] Genemark XXX.mod files
Message-ID: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>


Hello,

I am working on annotations for fungal genomes, using GenemarkES with --fungi for gene prediction.  In earlier attempts I did not use the training flag, and I did get the output gmhmm file.  Now I have tried with the training flag and do not get this file.  In the /run/ directory I do get the mod files ES_A.mod, ES_B.mod, and ES_C.mod, as well as ini.mod.  Does one of these files work as the ES.mod file, as in
"gmhmm=../train_genemark/es.mod #GeneMark HMM file" from http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained ? I can't find documentation of the GenemarkES output online.

Thank you.

Jenni



From liorglic at mail.tau.ac.il  Wed Jul  4 06:32:05 2018
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Wed, 4 Jul 2018 14:32:05 +0200
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
Message-ID: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>

 Dear MAKER users,

I am new to MAKER and would like your advice.
I am planning to annotate multiple genomes of tomato variants and wild
relatives. To this end, I have been working on generating a diverse
transcripts data set to be used as input for MAKER (along with protein
sequences and the 'official' tomato annotation). My transcripts set was
generated by collecting multiple available RNA-Seq results from SRA,
covering diverse variants, conditions and tissues, and assembling them into
transcripts using Trinity. My goal is to have a data set as diverse and
broad as possible.
Now I have ~30 fasta files of transcripts, originating from different
studies. Of course, many of the transcripts are redundant and/or partial. I
am exploring ways to merge the multiple data sets into a non-redundant one,
while also stitching partial transcripts into longer ones based on overlaps.
However, this turns out to be not-so-trivial and I am wondering if this is
really necessary in order to get a good annotation? Maybe I can just
concatenate all my transcriptome assembly results, and MAKER will handle
redundant and partial transcripts?
Can someone clarify how this works, and try to assess if an annotation
based on a merged data set should be superior to one that didn't undergo
such a process? If someone has actual experience with such data, that
would be really helpful, but any advice would be highly appreciated.

Thanks a lot and best regards,
Lior

From jason.stajich at gmail.com  Thu Jul  5 12:13:57 2018
From: jason.stajich at gmail.com (Jason Stajich)
Date: Thu, 5 Jul 2018 11:13:57 -0700
Subject: [maker-devel] Genemark XXX.mod files
In-Reply-To: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
References: <902950FF-775C-46DC-987A-5666A56A6650@ebc.uu.se>
Message-ID: <CALf8Lpy-KretSGjwmMTyxOARiJ251mzbxb8HuV4XU8asOfW0dg@mail.gmail.com>

The run/ES_C.mod file should be the right one if it is there.
Is it possibly crashing during one of the training / retraining rounds?
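
A minimal sketch of wiring that file into MAKER, assuming the run/ES_C.mod layout described in the question. The helper name gmhmm_line is hypothetical, and the argument is whatever directory GeneMark-ES was run in; it only prints a line to paste into maker_opts.ctl.

```shell
#!/bin/sh
# gmhmm_line (hypothetical helper): print a gmhmm= line for maker_opts.ctl
# if the final-iteration model exists under <workdir>/run.
gmhmm_line() {
    mod="$1/run/ES_C.mod"
    if [ -s "$mod" ]; then
        # the file exists and is non-empty, so point MAKER at it
        printf 'gmhmm=%s\n' "$mod"
    else
        # a missing or empty file suggests training stopped early
        echo "no non-empty ES_C.mod under $1/run; check the training log" >&2
        return 1
    fi
}
# Example: gmhmm_line /path/to/train_genemark
```

If the function reports a missing file, that would be consistent with a crash during one of the training rounds.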

Jason Stajich
jason.stajich at gmail.com


On Tue, Jul 3, 2018 at 11:05 AM Jennifer Anderson <
jennifer.anderson at ebc.uu.se> wrote:

>
> Hello,
>
> I am working on annotations for fungal genomes, using GenemarkES with
> --fungi for gene prediction.  In earlier attempts, I did not use the
> training flag, and I did get the output gmhmm file.  Now I have tried with
> the training flag and do not get this file.  In the /run/ directory I do
> get mod files  ES_A.mod, ES_B.mod, and ES_C.mod, as well as ini.mod.  Does
> one of these files work as the ES.mod file as in
> "gmhmm=../train_genemark/es.mod #GeneMark HMM file" from
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/The_MAKER_control_files_explained ? I
> don't find documentation of the genemarkES output online.
>
> Thank you.
>
> Jenni

From carsonhh at gmail.com  Thu Jul  5 12:47:38 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:47:38 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
References: <CAOzMDPyi3465OxB1n2oQJFjdG_rG7EvsODvLoexDBLQYVd7jhQ@mail.gmail.com>
Message-ID: <788E84AB-DB85-43AD-8FE1-C1D8A7DBD4B5@gmail.com>

MAKER will collapse redundant evidence after alignment, so redundancy will primarily just increase run time. The main issue with so many datasets would be false positive alignments (assembled background transcription). You can look at individual contigs in Apollo, IGV, or another browser to see where spurious alignments occur and whether they are mostly associated with a particular dataset (it's ok to throw out a noisy dataset, especially if you have additional data).
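
A rough sketch of the simple route: concatenate the assemblies as they are, but prefix every FASTA header with its source file name so that a spurious alignment seen in a browser can be traced back to its dataset. The function name merge_transcripts and the file names are hypothetical, and the prefixing is only an illustration, not something MAKER requires.

```shell
#!/bin/sh
# merge_transcripts (hypothetical helper): concatenate FASTA files to stdout,
# tagging each sequence ID with the base name of the file it came from.
merge_transcripts() {
    for f in "$@"; do
        tag=$(basename "$f")
        tag=${tag%.*}    # drop the file extension
        awk -v tag="$tag" '
            /^>/ { print ">" tag "_" substr($0, 2); next }   # header line
            { print }                                        # sequence line
        ' "$f"
    done
}
# Example: merge_transcripts study1.fasta study2.fasta > all_transcripts.fasta
```

The merged file would then be given to MAKER as transcript evidence (for example via est= in maker_opts.ctl). The prefix also keeps IDs unique when separate assemblies reuse the same transcript names.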

--Carson


> On Jul 4, 2018, at 6:32 AM, Lior Glick <liorglic at mail.tau.ac.il> wrote:
> 
> Dear MAKER users,
> 
> I am new to MAKER and would like your advice.
> I am planning to annotate multiple genomes of tomato variants and wild relatives. To this end, I have been working on generating a diverse transcripts data set to be used as input for MAKER (along with protein sequences and the 'official' tomato annotation). My transcripts set was generated by collecting multiple available RNA-Seq results from SRA, covering diverse variants, conditions and tissues, and assembling them into transcripts using Trinity. My goal is to have a data set as diverse and broad as possible.
> Now I have ~30 fasta files of transcripts, originating from different studies. Of course, many of the transcripts are redundant and/or partial. I am exploring ways to merge the multiple data sets into a non-redundant one, while also stitching partial transcripts into longer ones based on overlaps.
> However, this turns out to be not-so-trivial and I am wondering if this is really necessary in order to get a good annotation? Maybe I can just concatenate all my transcriptome assembly results, and MAKER will handle redundant and partial transcripts?
> Can someone clarify how this works, and try to assess if an annotation based on a merged data set should be superior to one that didn't undergo such a process? If someone has actual experience with such data, that  would be really helpful, but any advice would be highly appreciated.
> 
> Thanks a lot and best regards,
> Lior
> 

From carsonhh at gmail.com  Thu Jul  5 12:50:36 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 12:50:36 -0600
Subject: [maker-devel] [CAUTION: Suspicious Link] map_forward=1 not
 mapping reference ID's to output correctly
In-Reply-To: <D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
References: <CA+DOteeTFd06_k5ONYLvn7FpUuv-JDNqp1PCFa9QF0TxDa9iEg@mail.gmail.com>
	<D5A4E18F-CFDC-489E-BA1B-FB88FA66C338@ars.usda.gov>
Message-ID: <4EE96E7F-5F5B-4988-BC9C-FC441848B768@gmail.com>

A quick overview of MAKER behavior. MAKER will keep everything in model_gff as long as you don't provide another predictor to run or a pred_gff file to use. But if you give it a predictor to run, it takes that as an indicator that you want to update models, so a model from model_gff may get replaced by an overlapping prediction that scores better.

So depending on the behavior you want, make sure you are using model_gff, and either do or don't provide a gene predictor to run.
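
One way to check which of the two behaviors a control file will trigger is to list any predictor options that have a value. This is only a sketch: the function name predictors_set is hypothetical, and the list of option names is taken from the maker_opts.ctl posted elsewhere on this list, so treat it as an assumption about what counts as "a predictor to run".

```shell
#!/bin/sh
# predictors_set (hypothetical helper): print the names of predictor options
# that have a non-empty value in the given maker_opts.ctl.
predictors_set() {
    awk '
        /^(snaphmm|gmhmm|augustus_species|fgenesh_par_file|pred_gff)=/ {
            key = $0; sub(/=.*/, "", key)        # option name before "="
            val = $0; sub(/^[^=]*=/, "", val)    # everything after "="
            sub(/#.*/, "", val)                  # drop the trailing comment
            gsub(/^[ \t]+|[ \t]+$/, "", val)     # trim whitespace
            if (val != "") print key
        }
    ' "$1"
}
# Example: predictors_set maker_opts.ctl
```

No output would mean the models in model_gff are passed through untouched. Any name printed means MAKER may replace overlapping models with better-scoring predictions.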

--Carson


> On Jun 22, 2018, at 2:04 PM, Poelchau, Monica <monica.poelchau at ars.usda.gov> wrote:
> 
> Hi Kapeel,
>  
> If you just want your community annotations to replace models in an existing gene set, we have a tool for this:
>  
> https://github.com/NAL-i5K/GFF3toolkit
>  
> You'd need to run gff3_QC on your annotation files first to make sure your annotations are okay, then use gff3_merge to merge your community annotations with your existing gene set (in gff3 format). If you end up trying this out - we're actively developing the GFF3toolkit, so feel free to post an issue if you notice any problems.
>  
> Hth,
>  
> Monica 
>  
> From: maker-devel <maker-devel-bounces at yandell-lab.org> on behalf of Kapeel Chougule <kapeelc at gmail.com>
> Date: Friday, June 22, 2018 at 13:53
> To: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject: [CAUTION: Suspicious Link][maker-devel] map_forward=1 not mapping reference ID's to output correctly
>  
> Hi,
>  
> I am trying to update community annotation <https://de.cyverse.org/dl/d/39D60E88-078D-4CF5-9F3A-D712B714CDD8/community.annotation.gff3> in the light of new evidence data but my MAKER runs are not keeping all the genes from the community annotation.
> 
> 
> Community annotation feature count:
>       2 1 bicolor
>  239969 CDS
>  266301 exon
>   51066 five_prime_UTR
>   34129 gene
>   47121 mRNA
>   53708 three_prime_UTR
> MAKER gene count:
> awk '$3=="gene"{print}' maker_output.all.gff | grep "Sobic*" | wc -l
> 21105
>  
> In the attached maker_opts.ctl file I set keep_preds=1 and map_forward=1, which should keep all the community gene models even if they don't have evidence support. This was explained here:
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Updating_annotations_in_light_of_new_data
> So I am not sure why not all of the community gene models are mapped in the MAKER output.
> 
> Thanks
> 
> Kapeel
> -- 
>  
> Kapeel Chougule
> Computational Scientist Developer II
> One Bungtown Road Cold Spring Harbor, NY 11724
> http://www.warelab.org/
> 
> 
> 

From carsonhh at gmail.com  Thu Jul  5 13:17:14 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 5 Jul 2018 13:17:14 -0600
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
In-Reply-To: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
References: <CAAXcPKC3mnkqP9OU7L9bBLtts4KujCoBrUNieuUfgo+wd-E4Yw@mail.gmail.com>
Message-ID: <C61CC367-F138-47F2-AA61-876811458353@gmail.com>

Sorry for the slow reply. Make sure you find out what flavor of MPI you are using (MPICH, MVAPICH2, Intel MPI, or OpenMPI). MAKER does not work with MVAPICH2. It can work with Intel MPI and OpenMPI with some command line modification. And it always works with MPICH, but MPICH may not be able to scale to more than ~100 CPUs.

The option "-mca btl ^openib", for example, applies only to OpenMPI. Also, if using OpenMPI, set LD_PRELOAD in accordance with the INSTALL documentation. Also make sure you do not have multiple MPI flavors installed, with MAKER compiled against one and run with another. That will cause failure shortly after starting MAKER.
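
A rough heuristic for telling the flavors apart from the launcher's version banner. The matched strings are assumptions based on banners these launchers typically print and may differ between versions, and the function name mpi_flavor is hypothetical. MVAPICH2 is MPICH-derived and may show the same launcher banner as MPICH, so an "MPICH family" result still needs confirming.

```shell
#!/bin/sh
# mpi_flavor (hypothetical helper): guess the MPI flavor from the text that
# the launcher prints for its version banner. Patterns are assumptions.
mpi_flavor() {
    case "$1" in
        *"Open MPI"*|*OpenRTE*) echo "OpenMPI" ;;
        *"Intel(R) MPI"*)       echo "Intel MPI" ;;
        *MVAPICH*)              echo "MVAPICH2" ;;
        *HYDRA*|*MPICH*)        echo "MPICH family" ;;
        *)                      echo "unknown" ;;
    esac
}
# Example: mpi_flavor "$(mpiexec --version 2>&1)"
```

Running the same check with the launcher MAKER was built against and the launcher on the compute nodes is a quick way to spot a mismatch.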

Try looking further back in your STDERR for the actual cause. The "Thread 1 terminated abnormally:" message is the tail end of the failure snowball, so the actual cause is often much further back.
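
A small sketch for pulling out the lines that precede the first occurrence of that message, assuming STDERR was redirected to a file (2.err in the commands quoted below). The function name lines_before_failure and the default of 200 lines are arbitrary choices for illustration.

```shell
#!/bin/sh
# lines_before_failure (hypothetical helper): print up to N lines leading up
# to, and including, the first "terminated abnormally" line in a log file.
lines_before_failure() {
    err_file=$1
    context=${2:-200}
    # line number of the first occurrence of the terminal message
    n=$(grep -n 'terminated abnormally' "$err_file" | head -n 1 | cut -d: -f1)
    if [ -z "$n" ]; then
        echo "pattern not found in $err_file" >&2
        return 1
    fi
    start=$((n - context))
    [ "$start" -ge 1 ] || start=1
    sed -n "${start},${n}p" "$err_file"
}
# Example: lines_before_failure 2.err 300 | less
```

With MPI runs the output of several ranks is interleaved, so the real error may sit further back than the chosen window.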

--Carson


> On Jun 26, 2018, at 9:36 AM, André Machado <andremmachado25 at gmail.com> wrote:
> 
> Hi ,
> 
> First of all, thanks for your efforts on the MAKER pipeline. It is a tremendous help for people who work with genomes.
> For the last 4 days I have been struggling with an error, still without a solution.
> I found this old thread: https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ
> It seems quite similar, but it doesn't point to a specific solution.
> I have run MAKER with the test data and everything ran fine. MAKER finished the entire process without errors.
> Recently I have been trying to apply it to my own data on an MPI cluster, but this error occurs frequently:
> Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0
> --> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> deleted:0 hits
> preparing ab-inits
> deleted:0 hits
> deleted:0 hits
> FATAL: Thread terminated, causing all processes to fail
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> 
> Basically, I am trying to run MAKER with dna.fa, rna.fa, prot.fa and my_custom_lib_of_repeats.fa to produce raw gene models, which will be used to train SNAP.
> 
> I have already tried several command lines and all gave me the same error. The only difference between tests was the location of the error: sometimes it happened on compute-0-1.local, other times on compute-0-4.local or on another node.
> mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err
> 
> mpiexec --hostfile Host maker 1>1.log 2>2.err
> mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err
> nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err
> 
> The log file as well the option files are provided below.
> 
> Many thanks in advance,
> 
> André
> 
> <2.log><maker_exe.ctl><maker_opts.ctl><maker_bopts.ctl>

From andremmachado25 at gmail.com  Wed Jul  4 05:16:08 2018
From: andremmachado25 at gmail.com (André Machado)
Date: Wed, 4 Jul 2018 12:16:08 +0100
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
Message-ID: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>

Hi,

First of all, thanks for your efforts on the MAKER pipeline. It is a tremendous
help for people who work with genomes.

For the last 4 days I have been struggling with an error, still without a
solution.

I found this old thread:
https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ

It seems quite similar, but it doesn't point to a specific solution.

I have run MAKER with the test data and everything ran fine. MAKER finished the
entire process without errors.

Recently I have been trying to apply it to my own data on an MPI cluster, but
this error occurs frequently:

Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0
--> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.
--> rank=8, hostname=compute-0-1.local
deleted:0 hits
deleted:0 hits
preparing ab-inits
deleted:0 hits
deleted:0 hits
FATAL: Thread terminated, causing all processes to fail
--> rank=8, hostname=compute-0-1.local
deleted:0 hits

Basically, I am trying to run MAKER with dna.fa, rna.fa, prot.fa and
my_custom_lib_of_repeats.fa to produce raw gene models, which will be used
to train SNAP.

I have already tried several command lines and all gave me the same error. The
only difference between tests was the location of the error: sometimes it
happened on compute-0-1.local, other times on compute-0-4.local or on another
node.

mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err
mpiexec --hostfile Host maker 1>1.log 2>2.err
mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err
nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err

The log file as well as the option files are provided below.

Many thanks in advance,

André
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 2.log
Type: text/x-log
Size: 38655 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0003.log>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_exe.ctl
Type: application/octet-stream
Size: 1224 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0009.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts.ctl
Type: application/octet-stream
Size: 4548 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0010.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_bopts.ctl
Type: application/octet-stream
Size: 1413 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180704/20722397/attachment-0011.obj>

From carsonhh at gmail.com  Thu Jul 12 14:05:00 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:05:00 -0600
Subject: [maker-devel] Maker Error : Thread 1 terminated abnormally..
In-Reply-To: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
References: <CAAXcPKBUtfN3aSxqjo9qHgiS1WNXLRz6Z+Qm2USZkJ_HkvH-Dw@mail.gmail.com>
Message-ID: <5F1E5499-239E-405E-81EC-CECC755D7838@gmail.com>

Because you truncated/removed the lines before the actual error (I need to see the several hundred lines that came before "Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0"), I can't give you more info.

But you are getting a lot of OpenMPI complaints at the start. You may need to reinstall OpenMPI or use MPICH instead (either option requires reinstalling MAKER, since it needs to rebuild the MPI C/Perl bindings against the new installation). Also, when using OpenMPI, make sure to export LD_PRELOAD in the way outlined in the .../maker/INSTALL instructions.
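To collect the context requested above without attaching the whole log, the lines leading up to the error can be extracted automatically. The following is a minimal sketch, not part of MAKER: the function name and the default of 500 context lines are assumptions made for illustration.

```python
from collections import deque
from typing import Iterable, List


def lines_before_error(lines: Iterable[str], marker: str, context: int = 500) -> List[str]:
    """Return up to `context` lines preceding the first line that contains
    `marker`, followed by the matching line itself.

    Returns an empty list if the marker never appears.
    """
    # A bounded deque keeps only the most recent `context` lines in memory,
    # so this also works on very large log files.
    window = deque(maxlen=context)
    for line in lines:
        if marker in line:
            return list(window) + [line]
        window.append(line)
    return []
```

For the run above, this would be called on an open handle to the stderr file (2.err in the quoted command lines) with "terminated abnormally" as the marker, and the returned lines written to a smaller file to send to the list.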

--Carson


> On Jul 4, 2018, at 5:16 AM, André Machado <andremmachado25 at gmail.com> wrote:
> 
> Hi ,
> 
> First of all, thanks for your efforts on the MAKER pipeline. It's a tremendous help for people who work with genomes.
> For the last 4 days I have been racking my brain over an error, still without a solution.
> I found this old thread: https://groups.google.com/forum/#!msg/maker-devel/X2-76BH9gvg/rU4kLJ3B6tsJ
> It seems quite similar, but doesn't point to a specific solution.
> I ran MAKER with the test data and everything ran OK. MAKER finished the entire process without errors.
> Recently, I have been trying to apply it to my own data on an MPI cluster, but this error occurs frequently:
> Thread 1 terminated abnormally: ../dna.maker.output/mpi_blastdb/dna%2Efa.mpi.1/dna%2Efa.mpi.1.0
> --> rank=8, hostname=compute-0-1.local, at ../Analysis/Geno/maker/bin/maker line 1451 thread 1.
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> deleted:0 hits
> preparing ab-inits
> deleted:0 hits
> deleted:0 hits
> FATAL: Thread terminated, causing all processes to fail
> --> rank=8, hostname=compute-0-1.local
> deleted:0 hits
> 
> Basically, I'm trying to run MAKER with dna.fa, rna.fa, prot.fa and my_custom_lib_of_repeats.fa, to produce raw gene models which will be used to train SNAP.
> 
> I have already tried several command lines and all gave me the same error. The only difference between tests was the location of the error: sometimes it happened on compute-0-1.local, other times on compute-0-4.local or on another node.
> mpiexec -n 63 --hostfile Host maker 1>1.log 2>2.err
> 
> mpiexec --hostfile Host maker 1>1.log 2>2.err
> mpiexec -mca btl ^openib -n 63 --hostfile Host maker 1>1.log 2>2.err
> nohup mpiexec -mca btl ^openib -n 63 --hostfile Host maker -a 1>1.log 2>2.err
> 
> The log file as well as the option files are provided below.
> 
> Many thanks in advance,
> 
> André
> 
> <2.log><maker_exe.ctl><maker_opts.ctl><maker_bopts.ctl>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From carsonhh at gmail.com  Thu Jul 12 14:38:33 2018
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 12 Jul 2018 14:38:33 -0600
Subject: [maker-devel] How sensitive is MAKER to redundant/partial
 transcripts?
In-Reply-To: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
Message-ID: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>

MAKER will automatically collapse redundant evidence. The only thing you may need to worry about with too many datasets is background transcription. With more datasets you will have more spurious assemblies from background transcription (if you sequence deep enough, everything is transcribed at some level). You should also look at the results in a browser like Apollo; you may find that some datasets are noisier than others, and it would be beneficial to drop them, especially if they are redundant. So always do a visual review of the results.
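Since redundant evidence is collapsed automatically, simply pooling the assemblies is an option. One practical detail when pooling is that different Trinity runs may reuse the same transcript names, and keeping the study of origin in each name makes it easier to drop a noisy dataset later. The sketch below is a hypothetical helper, not a MAKER utility; the function name and the `label_id` naming scheme are assumptions.

```python
from typing import Dict, Iterator


def merge_fasta(datasets: Dict[str, str]) -> Iterator[str]:
    """Yield FASTA lines from several datasets, prefixing every sequence ID
    with the label of the dataset it came from.

    `datasets` maps a label (e.g. a study name) to the FASTA text of that
    dataset. Raises ValueError if two records end up with the same ID.
    """
    seen = set()
    for label, text in datasets.items():
        for line in text.splitlines():
            if not line.strip():
                continue  # skip blank lines
            if line.startswith(">"):
                # Split the header into the ID and an optional description.
                parts = line[1:].split(None, 1)
                name = parts[0] if parts else ""
                rest = parts[1] if len(parts) > 1 else ""
                new_id = f"{label}_{name}"
                if new_id in seen:
                    raise ValueError(f"duplicate sequence id: {new_id}")
                seen.add(new_id)
                yield f">{new_id} {rest}".rstrip()
            else:
                yield line
```

The yielded lines can be joined with newlines and written to a single file to pass as transcript evidence. Because each ID carries its dataset label, the evidence from one study can be told apart from the others during the visual review.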

--Carson


> On Jul 4, 2018, at 6:28 AM, Lior Glick <liorglck at gmail.com> wrote:
> 
> Dear MAKER users,
> 
> I am new to MAKER and would like your advice.
> I am planning to annotate multiple genomes of tomato variants and wild relatives. To this end, I have been working on generating a diverse transcripts data set to be used as input for MAKER (along with protein sequences and the 'official' tomato annotation). My transcripts set was generated by collecting multiple available RNA-Seq results from SRA, covering diverse variants, conditions and tissues, and assembling them into transcripts using Trinity. My goal is to have a data set as diverse and broad as possible.
> Now I have ~30 fasta files of transcripts, originating from different studies. Of course, many of the transcripts are redundant and/or partial. I am exploring ways to merge the multiple data sets into a non-redundant one, while also stitching partial transcripts into longer ones based on overlaps.
> However, this turns out to be not-so-trivial and I am wondering if this is really necessary in order to get a good annotation? Maybe I can just concatenate all my transcriptome assembly results, and MAKER will handle redundant and partial transcripts?
> Can someone clarify how this works, and try to assess if an annotation based on a merged data set should be superior to one that didn't undergo such a process? If someone has actual experience with such data, that would be really helpful, but any advice would be highly appreciated.
> 
> Thanks a lot and best regards,
> Lior


From shijunpeng at cau.edu.cn  Sat Jul 14 02:04:38 2018
From: shijunpeng at cau.edu.cn (史俊鹏)
Date: Sat, 14 Jul 2018 16:04:38 +0800 (GMT+08:00)
Subject: [maker-devel] Ask for help about the collapse of Maker (version
 2.31.9) when annotated with Fgenesh
In-Reply-To: <C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
References: <CAFOVipOCZphMxMLitWEVdjJg1WrF2LxVRkJBWtQOEOSEFUzDOA@mail.gmail.com>
	<C3CE3772-8538-42A9-9178-BEBF719EFFC8@gmail.com>
Message-ID: <183e519e.83bf.16497d1fd4b.Coremail.shijunpeng@cau.edu.cn>

Dear Carson,

First of all, I must apologize that I couldn't post my question in the Google group, since I can't access Google from mainland China.

I am using Maker (version 2.31.9) to annotate several foxtail millet genomes. I combined Augustus and Fgenesh (v.3.1.1) for the de novo annotation of these genomes.

The majority of contigs were annotated well by the MAKER pipeline. However, several contigs failed when annotated with Fgenesh, with the following error:

#--------- command -------------#
Widget::fgenesh:
/NAS7/home/shijunpeng/software/maker/bin/../lib/Widget/fgenesh/fgenesh_wrap /NAS7/home/shijunpeng/software/fgenesh/fgenesh /NAS7/home/shijunpeng/software/fgenesh/Monocots /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta -exon_table:/tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-4597401.Monocots.auto_annotator.xdef.fgenesh > /tmp/43438.1.all.q/maker_8zLUxB/0/108_0.4597215-
#-------------------------------#
ERROR: FgenesH failed
--> rank=NA, hostname=bioinfor3.local
ERROR: Failed while annotating transcripts
ERROR: Chunk failed at level:1, tier_type:4
FAILED CONTIG:scaffold_1

ERROR: Chunk failed at level:6, tier_type:0
FAILED CONTIG:scaffold_1
###############################################################################################################################################

A system core file was generated after this crash. I checked the temporary FASTA file 108_0.4597215-4597401.Monocots.auto_annotator.fgenesh.fasta and it looks normal, about ~300 bp long.

I also checked my original sequence file and confirmed that it has no problems (it contains only A, T, C, G and N). I also tried changing the pred_flank option from 200 (original) to 0, and the error persists.
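For reference, the sequence check described above (only A, T, C, G and N present) can be reproduced with a short script. This is a minimal sketch under the assumption that lower-case (soft-masked) bases should count as valid; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def unexpected_bases(lines: Iterable[str], allowed: str = "ACGTN") -> Dict[str, Counter]:
    """Scan FASTA lines and report, per sequence ID, every character that is
    not in `allowed` (case-insensitive) together with how often it occurs.

    Sequences containing only allowed characters do not appear in the result.
    """
    allowed_set = set(allowed.upper())
    report: Dict[str, Counter] = {}
    current = ""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence ID is the first whitespace-delimited header token.
            fields = line[1:].split()
            current = fields[0] if fields else ""
            continue
        bad = [c for c in line.upper() if c not in allowed_set]
        if bad:
            report.setdefault(current, Counter()).update(bad)
    return report
```

An empty result for both the genome file and the temporary ~300 bp FASTA file would confirm that neither contains IUPAC ambiguity codes, gaps or other stray characters.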

I ran the MAKER pipeline on a single node with 16 processors and 256 GB of RAM, so it is probably not an MPI problem.

Below are my detailed MAKER behavior options:
#-----MAKER Behavior Options
max_dna_len=300000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=10000 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=0 #flank for extending evidence clusters sent to gene predictors
pred_stats=1 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=1 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=1 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=1 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=5 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files 

Could you please help me to solve this error? I am looking forward to hearing from you.

Sincerely, 
Junpeng

--
Junpeng Shi, PhD
State Key Lab For Agrobiotech, China Agricultural University
National Maize Improvement Center of China 
Center For Life Science, NO.2, 
The West Street of Yuanmingyuan Park, Beijing, P.R.China 
Tel: +86-13581863941

From liorglic at mail.tau.ac.il  Tue Jul 24 01:45:06 2018
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Tue, 24 Jul 2018 09:45:06 +0200
Subject: [maker-devel] Annotation of a new variant within a species
Message-ID: <CAOzMDPxSUnk5zJXQhsu_SwbzHiJJ0sP0H5KOhD6L0OFFdD8sKg@mail.gmail.com>

Hello,

I am trying to annotate multiple variants of tomato. While a good
annotation of the reference genome is available, I have de novo assembled
other variants of the same species and wish to annotate them.
Most MAKER documentation refers to annotation of a new species, while using
transcripts and proteins from either the exact same sample (individual) or
from "an alternate organism", so I'm not sure what to do in this case,
where I am annotating various samples from the same species. I have two
questions:

1. Regarding transcripts data, how should I use transcripts from other
variants of the same species? Namely, should I use the est or the altest
parameter? What is the actual difference in behavior?

2. Is there a way to incorporate gene models (in gff format) from the
reference annotation? I expect high similarity in my assembled variants,
but not identity in terms of content and coordinates, so neither pred_gff
nor model_gff sound like what I need, as far as I understand.
I could also use the reference annotation and sequence to extract cDNA and
provide them as EST data. Is this the way to go? It feels like some
information on introns might be lost this way.

Would highly appreciate your answers to these questions or any other advice.

Thank you very much!

From roscito at mpi-cbg.de  Tue Jul 31 06:59:58 2018
From: roscito at mpi-cbg.de (Ju Roscito)
Date: Tue, 31 Jul 2018 14:59:58 +0200
Subject: [maker-devel] Few alternative isoforms when alt_splice=0
Message-ID: <2C92DF72-0733-490F-A2EE-6F3724EF7099@mpi-cbg.de>

Dear all,

I have a question about the behaviour of the alt_splice option; there doesn't seem to be much about it on the forum.

I have run a single round of MAKER (2.31.9) on a vertebrate genome, with Trinity mRNA data and mapped proteins from closely related species. I set alt_splice to 0, but still got two to four mRNAs for ~20 of the 19,000 predicted genes. Has anyone else seen this? Any idea why that would happen?
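To list exactly which genes carry more than one mRNA, the output GFF3 can be scanned for mRNA features grouped by their Parent attribute. The sketch below is not part of MAKER; the function names are hypothetical, and it assumes the final gene models carry `maker` in the source column (column 2). Pass `source=None` to count mRNA features from every source.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


def parse_attributes(column9: str) -> Dict[str, str]:
    """Parse a GFF3 attribute column (key=value pairs separated by ';')."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def multi_isoform_genes(gff_lines: Iterable[str],
                        source: Optional[str] = "maker") -> Dict[str, int]:
    """Return {gene ID: mRNA count} for genes with more than one mRNA."""
    counts: Counter = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the feature table
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        if source is not None and cols[1] != source:
            continue
        parent = parse_attributes(cols[8]).get("Parent")
        if parent:
            # GFF3 allows several comma-separated parents.
            for gene_id in parent.split(","):
                counts[gene_id] += 1
    return {gene: n for gene, n in counts.items() if n > 1}
```

Looking at the ~20 reported genes in a genome browser, next to the evidence tracks, may show whether they share some feature that the remaining genes lack.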

Thanks a lot in advance.


From timo.metz at googlemail.com  Fri Jul 20 06:20:05 2018
From: timo.metz at googlemail.com (Timo Metz)
Date: Fri, 20 Jul 2018 12:20:05 -0000
Subject: [maker-devel] MAKER chooser algorithm
Message-ID: <CAKGvZVN6En4AmnMV1neZ_OmAGS341CJaZ7Fbgny1KB1CUd1_Jg@mail.gmail.com>

Hey,

I am working on improving an existing annotation. I have found that MAKER
sometimes splits or merges genes in ways that do not look correct given
the evidence. Please find two examples attached. The tracks, in order, are:
the old annotation, the new annotation, RNA-seq data, proteins, repeats,
the SNAP prediction, and the Augustus prediction. In both cases the
evidence supports two genes, and one gene predictor produces a single gene
where the other produces two. I do not understand why the genes are merged
here when the evidence and one of the ab initio predictions both support
two genes. Are there any suggestions on how to solve this?

best
Timo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picture1.png
Type: image/png
Size: 26778 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Picutre2.png
Type: image/png
Size: 24145 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20180720/81363d18/attachment-0007.png>

From cganote at iu.edu  Tue Jul 24 10:31:02 2018
From: cganote at iu.edu (Ganote, Carrie L)
Date: Tue, 24 Jul 2018 16:31:02 -0000
Subject: [maker-devel] Maker ignores evidence and just returns gffs with
 genome contigs
Message-ID: <D77CCC75.46875%cganote@iu.edu>

Running maker, I don't see anything in the gff except the names of the contigs and their lengths:

##gff-version 3
SczI0sq_2092%3%3D3122    .       contig  1       119548  .       .       .       ID=SczI0sq_2092%3%3D3122;Name=SczI0sq_2092%3%3D3122
###
SczI0sq_842%3%3D1778     .       contig  1       4693    .       .       .       ID=SczI0sq_842%3B%3D1778;Name=SczI0sq_842%3%3D1778
###
...

In my opts file, I have:

#-----Genome (these are always required)
genome=/projects/Reference/genome.chr.fa #genome sequence (fasta file or fasta embeded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no

#-----EST Evidence (for best results provide a file for at least one)
est= #set of ESTs or assembled mRNA-seq in fasta format
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/projects/Reference/Maker/EST_assembled.all.gff #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff= #aligned ESTs from a closly relate species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=  #protein sequence file in fasta format (i.e. from mutiple oransisms)
protein_gff=/projects/Reference/Maker/exonerate_withCC.gff3  #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org= #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)

#-----Gene Prediction
snaphmm= #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species= #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff=/projects/Reference/Maker/augustus_output.reformated.gff #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no

#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file

#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)

#-----MAKER Behavior Options
max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1 #skip genome contigs below this length (under 10kb are often useless)

pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)

split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes

tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files

It ran for ~3 hours and all contigs in the log file said FINISHED. No failures. Did I set something wrong?
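One check that may be worth running on a configuration like this, where all evidence comes from external GFF3 files, is whether the sequence names in column 1 of those files match the FASTA headers of the genome exactly. Whether a mismatch is what happened here is not established; the sketch below only reports names that appear in a GFF3 file but not in the genome. The function names are hypothetical and nothing in it is MAKER's own code.

```python
from typing import Iterable, Set


def fasta_ids(lines: Iterable[str]) -> Set[str]:
    """Collect sequence IDs (first token of each '>' header) from FASTA lines."""
    ids = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids


def gff_seqids(lines: Iterable[str]) -> Set[str]:
    """Collect the values of column 1 (seqid) from GFF3 feature lines."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the feature table
        if line.startswith("#") or not line.strip():
            continue
        ids.add(line.split("\t", 1)[0])
    return ids


def unmatched_seqids(genome_lines: Iterable[str], gff_lines: Iterable[str]) -> Set[str]:
    """Return seqids used in the GFF3 that have no matching FASTA header."""
    return gff_seqids(gff_lines) - fasta_ids(genome_lines)
```

Running this once for each of est_gff, protein_gff and pred_gff against the genome file would show whether the evidence features refer to the same sequence names as the genome. Names containing characters that get URL-escaped in GFF3 (such as the `%3D` visible in the output above) are the ones to compare most carefully.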

-Carrie