From dcg at cau.edu.cn  Mon May  1 08:32:30 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Mon, 1 May 2017 21:32:30 +0800
Subject: [maker-devel] Why my maker get no results?
Message-ID: <2017050121323023791817@cau.edu.cn>

Dear sir:
    I have been working on genome annotation recently. My process is as below:

    1. I split my contigs into 300 parts and process them simultaneously to speed things up.
    2. I used my split genome, proteins, ESTs and RNA-seq to make the first alignment (est2genome=1, AED_threshold=0.2).
    3. I merged the maker.*_.master_datastore_index.log files to get all the result paths.
    4. I ran the gff_merge script to merge all the results from the different directories.

    However, no results are returned. (My genome is about 3 GB, but the resulting GFF is essentially empty.)
    index_all.log.all.gff    1KB
    index_all.log.all.maker.proteins.fasta    2837KB
    index_all.log.all.maker.transcripts.fasta    9866KB


    Where could the problem be?
    Thanks!
Yours sincerely.

Chao Chao


dcg at cau.edu.cn

From carsonhh at gmail.com  Mon May  1 15:04:36 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 1 May 2017 14:04:36 -0600
Subject: [maker-devel] Why my maker get no results?
In-Reply-To: <2017050121323023791817@cau.edu.cn>
References: <2017050121323023791817@cau.edu.cn>
Message-ID: <28772B4F-D674-49E2-BFBD-CE2651CE0454@gmail.com>

You can't merge datastore indexes that way. You will need to process them separately (i.e. with location and content unmodified from what MAKER gave you), and then merge the fasta and gff3 files afterwards.
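As a rough sketch of that two-stage merge: the helper below only builds the command lines, it does not run anything. The function name `plan_merge`, the directory layout, and the output names are hypothetical, and the `-d` / `-o` options are taken from my reading of the gff3_merge usage text, so please check `gff3_merge` without arguments on your install before relying on them.

```python
from pathlib import Path

def plan_merge(index_logs, merged_gff="all_runs.gff"):
    """Return (working_dir, argv) pairs: one merge per MAKER run, then one final merge.

    Assumption: each index log is named <base>_master_datastore_index.log and
    still sits in the output directory where MAKER wrote it.
    """
    steps = []
    per_run_gffs = []
    for log in map(Path, index_logs):
        run_dir = log.parent
        base = log.name.replace("_master_datastore_index.log", "")
        out_name = f"{base}.all.gff"
        # Run inside the run's own directory so the paths recorded in the
        # untouched index log still resolve.
        steps.append((run_dir, ["gff3_merge", "-d", log.name, "-o", out_name]))
        per_run_gffs.append(str(run_dir / out_name))
    # Final stage: merge the per-run GFF3 files, not the index logs.
    steps.append((Path("."), ["gff3_merge", "-o", merged_gff, *per_run_gffs]))
    return steps

if __name__ == "__main__":
    logs = sorted(Path(".").glob("*/*.maker.output/*_master_datastore_index.log"))
    for cwd, argv in plan_merge(logs):
        print(f"(cd {cwd} && {' '.join(argv)})")
```

The same pattern should apply to fasta_merge, with the per-run fasta files concatenated afterwards.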

--Carson



From jim03ljy at 126.com  Wed May  3 07:15:22 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Wed, 3 May 2017 20:15:22 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
Message-ID: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>

Hi, I'm new to MAKER.
I ran into an error at the RepeatMasker step.

The error is:

NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!

I tried NCBI BLAST+ versions 2.5.0 and 2.6.0 as the BLAST path; both give the same error.

When I run "maker -R", which skips the RepeatMasker step, MAKER works.

I checked a similar error reported earlier by another user, who solved the problem by updating RepBase.

So I deleted and reinstalled RepeatMasker, updated RepBase, and also installed RMBlast.

The error is the same.

I'm stuck on this problem now.

I would highly appreciate any help. Thanks!


Jinyuan Lu
Shanghai Jiao Tong University
No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China

From dcg at cau.edu.cn  Wed May  3 10:29:18 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Wed, 3 May 2017 23:29:18 +0800
Subject: [maker-devel] How to explain the maker results?
Message-ID: <2017050323291810262239@cau.edu.cn>

Dear sir:
    I've been using MAKER for my genome annotation. However, there are still some things I don't understand:

    1. After assembly, I have many contigs. First, I set est2genome=1 and protein2genome=1, with my proteins, ESTs and RNA-seq. Which of the approaches below is correct?
    1.1 Each contig has its own GFF. I use each contig's own maker_gff file to build a pyu.hmm (for use with SNAP), and then use it for that single contig.
    1.2 I merge all the maker_gff files to produce one pyu.hmm (for SNAP), and then use this pyu.hmm for all the contigs.

    2. The aim of my project is to find new proteins, so I need to guarantee the rigor of my annotation.
        My plan is that each predicted protein should align to UniProt (reviewed proteins, about 30K in total) with 100% identity and coverage.
        However, if I choose method 1.2 above:
        After the first step (est2genome=1 and protein2genome=1), about 1600 proteins align to UniProt at 100%. After 2 rounds of training (est2genome=0 and protein2genome=0), fewer proteins align at 100%.
        Is my test method reasonable? Why don't the final results contain more well-aligned proteins?
        After training and fasta_merge, the results are index_all.log.all.maker.proteins.fasta, index_all.log.all.maker.snap_masked.proteins.fasta and index_all.log.all.maker.non_overlapping_ab_initio.proteins.fasta. Which of these is the final result?

    
     I'm looking forward to hearing from you. Thanks!
Yours sincerely!

     
Chao Chao


dcg at cau.edu.cn

From d.ence at ufl.edu  Wed May  3 10:49:08 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 15:49:08 +0000
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <B69E8792-8153-4840-AC3C-D2E6CCFF3A45@mail.ufl.edu>

Hi, the error is regarding a specific file (is.lib) which isn't being found. Can you verify that the file is there after you updated RepBase? Use the command:
"ls -l /home/softwares/RepeatMasker/Libraries/20170127/general/is.lib"
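If you want to check several library files in one go, something like the following does the same job as the ls call. It is only a sketch; the helper name `describe` is made up for this example, and the path is the one from the error message.

```python
from pathlib import Path

def describe(path):
    """Report whether a file exists and, if so, how large it is."""
    p = Path(path)
    if not p.is_file():
        return f"{p}: MISSING"
    return f"{p}: {p.stat().st_size} bytes"

if __name__ == "__main__":
    # Path copied from the RepeatMasker error message above.
    print(describe("/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib"))
```

A zero-byte file would also be worth reporting, since it usually points to an interrupted install.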

Thanks,
Daniel Ence



From carsonhh at gmail.com  Wed May  3 10:53:40 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:53:40 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>

RepBase and RepeatMasker changed structure with the new 4.0.7 release two months ago. The new version of RepBase is only compatible with the new version of RepeatMasker. You have to update both (complete reinstall), or use the previous version of RepeatMasker with the previous version of RepBase.

--Carson



From carsonhh at gmail.com  Wed May  3 10:55:41 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:55:41 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
Message-ID: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>

You may want to use the previous version of both as the new version may still have hidden bugs.

--Carson


From carsonhh at gmail.com  Wed May  3 11:04:20 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:04:20 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
Message-ID: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>

You may have to contact RepBase via e-mail to find out how to get the libraries compatible with RepeatMasker 4.0.6 as it looks like they have removed the previous release from the website.

The last release for 4.0.6 was --> repeatmaskerlibraries-20160829.tar.gz
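To make the pairing explicit, here is a small lookup sketch. It only knows the two combinations that come up in this thread (4.0.6 with the 20160829 libraries, 4.0.7 with the 20170127 libraries seen in the error path); the table and the function name `libraries_match` are illustrative and not part of RepeatMasker.

```python
# Pairings mentioned in this thread only; not an official compatibility table.
KNOWN_PAIRS = {
    "4.0.6": "20160829",
    "4.0.7": "20170127",
}

def libraries_match(repeatmasker_version, library_release):
    """Return True/False for a known version, or None if the version is not listed."""
    expected = KNOWN_PAIRS.get(repeatmasker_version)
    if expected is None:
        return None
    return expected == library_release

if __name__ == "__main__":
    # Mixing the older program with the newer libraries is reported as a mismatch.
    print(libraries_match("4.0.6", "20170127"))
```

Earlier library releases presumably also work with 4.0.6, so treat a False here as a prompt to double-check rather than a verdict.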

--Carson



From carsonhh at gmail.com  Wed May  3 11:10:48 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:10:48 -0600
Subject: [maker-devel] How to explain the maker results?
In-Reply-To: <2017050323291810262239@cau.edu.cn>
References: <2017050323291810262239@cau.edu.cn>
Message-ID: <049F8AC8-7E16-4F05-B8B2-01CA7AB88751@gmail.com>

Use the merged gff3 to train snap, otherwise you won't have enough models.

Info on training can be found on the wiki --> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors
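For orientation, the SNAP training steps in that tutorial look roughly like the sequence below, written here as a Python list so it can be inspected or passed to subprocess. This is from memory of the tutorial, so treat the exact arguments as assumptions and check them against the wiki page; the function name and the merged file name are made up for this example.

```python
def snap_training_commands(merged_gff, hmm_name="pyu"):
    """Return the command lines for one round of SNAP training on a merged GFF3.

    Intermediate file names (genome.ann, uni.ann, export.ann, ...) are the
    ones the tools write by default, as far as I recall from the tutorial.
    """
    return [
        ["maker2zff", merged_gff],                                     # writes genome.ann / genome.dna
        ["fathom", "-categorize", "1000", "genome.ann", "genome.dna"], # writes uni.ann / uni.dna
        ["fathom", "-export", "1000", "-plus", "uni.ann", "uni.dna"],  # writes export.ann / export.dna
        ["forge", "export.ann", "export.dna"],                         # parameter estimation
        ["hmm-assembler.pl", hmm_name, "."],                           # redirect stdout to <hmm_name>.hmm
    ]

if __name__ == "__main__":
    for argv in snap_training_commands("all_runs.gff"):
        print(" ".join(argv))
```

The point is that `maker2zff` is run once on the merged file, so the HMM is trained on gene models from all contigs together.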

Also you can find additional detailed info by searching the mailing list archives --> http://groups.google.com/group/maker-devel

I'm not sure what you are asking with the last question. Alignment is not a function of training, and will not be affected by the hmm, but 100% coverage and identity is too strict a threshold even for data derived from the same species.

--Carson



From nathan.ricks at gmail.com  Wed May  3 11:19:57 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Wed, 3 May 2017 10:19:57 -0600
Subject: [maker-devel] Post Processing of Annotations
Message-ID: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>

Hi,
I've been running your MAKER pipeline, and I've reached the Post Processing of
Annotations portion. In your online training you use the output.blastp and
the output.iprscan files to help assign function.
My question is: what format do these files need to be in?
Iprscan can produce files in a variety of formats (tsv, xml, gff3, html and
SVG), while blastp can produce tabular, pairwise, xml and a number of others.

Thanks

Nathan Ricks

From carsonhh at gmail.com  Wed May  3 11:30:03 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:30:03 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>

Use blastp with the tab-delimited format option and the UniProt/Swiss-Prot database. What additional filters you choose to set (e.g. e-value limit) may vary, although I would recommend 1e-6 or lower.
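If the e-value limit was not set when blastp was run, the tabular output can still be filtered afterwards. A minimal sketch, assuming the standard 12-column tabular layout of BLAST+ (qseqid through bitscore); the function name `filter_hits` and the sample rows in the usage are invented for illustration.

```python
import csv

# Column order of the default BLAST+ tabular output (assumed here).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def filter_hits(lines, max_evalue=1e-6):
    """Keep only hits whose e-value is at or below max_evalue."""
    hits = []
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment headers
        hit = dict(zip(COLUMNS, row))
        if float(hit["evalue"]) <= max_evalue:
            hits.append(hit)
    return hits

if __name__ == "__main__":
    sample = [
        "gene1-RA\tsp|P00000|EXMP_HUMAN\t95.0\t200\t10\t0\t1\t200\t1\t200\t1e-50\t400",
        "gene2-RA\tsp|P00001|EXMQ_HUMAN\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20",
    ]
    for hit in filter_hits(sample):
        print(hit["qseqid"], hit["sseqid"], hit["evalue"])
```

Filtering after the fact keeps the raw blastp output intact, so a different cutoff can be tried without rerunning the search.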

--Carson



From d.ence at ufl.edu  Wed May  3 11:34:35 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 16:34:35 +0000
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <F9D83F09-F94E-473D-911B-6626CDC8E639@mail.ufl.edu>

Hi, the iprscan output should be in tsv format, which is tab-separated, and the usage statement for maker_functional_gff says that the blastp output should be in "wu-blast -mformat 2", which I think is tab-delimited too.

~Daniel

 


From carsonhh at gmail.com  Wed May  3 14:20:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 13:20:31 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
	<DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>
	<CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
Message-ID: <0EE0D7F6-5F28-46E7-9AB8-CED93DC811F6@gmail.com>

The maker_functional_gff and maker_functional_fasta scripts pull specific fields out of the UniProt fasta header, so they are tied to the format used by UniProt/Swiss-Prot. At one time I had modified them to also work with NR, but that was several years ago, so I don't know if it would still work.
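To show why the header format matters, here is a sketch of pulling fields out of a UniProt-style fasta header. It is not the code used by the MAKER scripts, and which fields they actually read is not stated here; the regular expressions follow the usual UniProt layout (db|accession|entry name, description, then OS=, OX=, GN=, PE=, SV= tags), and the example header is invented.

```python
import re

# db|accession|entry-name, then free-text description up to the first XX= tag.
HEADER = re.compile(
    r"^>?(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+"
    r"(?P<description>.*?)(?=\s+[A-Z]{2}=|$)"
)
# Two-letter tags such as OS=, OX=, GN=, PE=, SV=.
FIELD = re.compile(r"\b([A-Z]{2})=(.*?)(?=\s+[A-Z]{2}=|$)")

def parse_uniprot_header(line):
    """Return a dict of header fields, or None if the line is not UniProt-style."""
    line = line.strip()
    match = HEADER.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields.update(dict(FIELD.findall(line)))
    return fields

if __name__ == "__main__":
    example = ">sp|P00000|EXMP_HUMAN Example protein OS=Homo sapiens OX=9606 GN=EXMP PE=1 SV=1"
    print(parse_uniprot_header(example))
```

A header downloaded from NCBI would not match this pattern and the function returns None, which is the practical reason a custom database needs either UniProt-style headers or modified scripts.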

--Carson


> On May 3, 2017, at 1:10 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> Is it possible to make my own database from sequences that I have downloaded from NCBI instead of using UniProt/Swiss-Prot?

From mjfi2sb3 at gmail.com  Thu May  4 01:37:52 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Thu, 04 May 2017 06:37:52 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
Message-ID: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>

Hi,

I am attempting to annotate a plant genome. I have a couple of questions:

*1) RNA-seq assembly*
a) I assembled my RNA-seq data using Trinity and StringTie. The two produce
drastically different numbers. When I compare the two assemblies for each
sample using TransRate, StringTie gets a higher score for most of the
assemblies. I see in all of the threads that you recommend Trinity, but
doesn't Trinity produce way too many transcripts (even after chucking out
the "bad" ones using TransRate)?
b) During hint creation in MAKER, does it take into account that different
transcripts have different read coverage (expression levels)? I guess my
question is: should I filter out transcripts that have low read coverage?

*2) Repeat Masking *
I am following the advanced repeat library construction tutorial (
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced).
The initial steps find 15 sequences for the LTR and 159 for MITE. But, when
I get to the perl DIR_CRL/CRL_Step4.pl step, both output files
(Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.

a) are these numbers normal because I was expecting a lot more than 16 for
the LTR?
b) I don't get any errors when I run CRL_Step4.pl yet no output. What's
going on?!

Many thanks,
/SB

From jim03ljy at 126.com  Thu May  4 01:36:12 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Thu, 4 May 2017 14:36:12 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
	<BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
Message-ID: <12de56ed.5ded.15bd22c56c7.Coremail.jim03ljy@126.com>

Thanks a lot!
Problem solved. 
I paired RepeatMasker 4.0.7 with RepBase 20170127 and it worked!


Thanks!
   ----Jinyuan Lu



From dcg at cau.edu.cn  Fri May  5 08:43:43 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Fri, 5 May 2017 21:43:43 +0800
Subject: [maker-devel] How to evaluate maker proteins' quality?
Message-ID: <2017050521434331108720@cau.edu.cn>

Dear sir:
    Now that my MAKER run has finished, I want to check the quality of my results.
    The purpose of my annotation is to find new proteins.
    There are about 30K reviewed proteins for my species. If I want to see how many predicted proteins support the reviewed proteins, how should I do it? (Would blastp be OK? How should I set the threshold?)
    I used UniProt, ESTs and RNA-seq for my annotation. From my perspective, if a protein is reviewed and was used to train snap/augustus, we should get the same one back after several training rounds. So I planned to align the maker_proteins to the UniProt proteins (the ones I used for annotation). If a predicted protein matches UniProt with 100% identity and coverage, it can be considered to support the reviewed protein. Is this correct?

    If not, maybe I can evaluate my proteins only by AED value and protein domains?

    I'm looking forward to your help. Thanks a lot!

Yours sincerely!

Chao Chao


dcg at cau.edu.cn

From carsonhh at gmail.com  Sun May  7 19:31:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 18:31:31 -0600
Subject: [maker-devel] How to evaluate maker proteins' quality?
In-Reply-To: <2017050521434331108720@cau.edu.cn>
References: <2017050521434331108720@cau.edu.cn>
Message-ID: <51620CC3-43D9-47D5-B8B3-871F291D6518@gmail.com>

Because of small differences between assemblies, individual variants, partial annotated proteins used as references, and potential assembly errors, expecting 100% identity is too strict. About 90% or higher would be more reasonable for a same-species comparison. AED correlates well with protein confidence. A perfect score of zero will not happen often, though, because alignment algorithms leave alignment errors around splice sites and short exons. The evidence used is also never perfect, so with AED lower values are better than higher values, but AED cannot be used as an overly specific measurement (it is only correlative, not exact).
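
As a rough illustration of the thresholds discussed above, the sketch below filters blastp tabular hits at 90% identity and 90% query coverage. It assumes blastp was run with a custom tabular layout (-outfmt "6 qseqid sseqid pident length qlen"); the function name, the coverage cutoff, and the sample rows are hypothetical, and alignment length only approximates coverage because it includes gaps.

```python
import csv
import io


def supported_queries(blast_tsv, min_identity=90.0, min_coverage=0.90):
    """Return the set of query IDs with at least one hit passing both cutoffs.

    Expects tab-separated columns: qseqid, sseqid, pident, length, qlen
    (an assumed custom layout, not the default 12-column outfmt 6).
    """
    supported = set()
    for row in csv.reader(blast_tsv, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        qseqid, _sseqid, pident, length, qlen = row[:5]
        # Alignment length includes gaps, so cap coverage at 1.0.
        coverage = min(int(length) / int(qlen), 1.0)
        if float(pident) >= min_identity and coverage >= min_coverage:
            supported.add(qseqid)
    return supported


# Hypothetical hits: one near-identical, one partial, one divergent.
example = io.StringIO(
    "maker-gene-1\tP12345\t98.5\t300\t310\n"
    "maker-gene-2\tP23456\t99.0\t120\t400\n"
    "maker-gene-3\tP34567\t72.0\t250\t255\n"
)
hits = supported_queries(example)
print(sorted(hits))
```

Requiring coverage as well as identity keeps short, partial matches from counting as support; both cutoffs are starting points to tune, not values prescribed by MAKER.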

--Carson



From carsonhh at gmail.com  Sun May  7 20:17:37 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 19:17:37 -0600
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
Message-ID: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>

Michael, can you answer the second question? (Michael wrote the protocol, so I CC'd him.)

With respect to the first question: expression level is not necessarily relevant to the annotation process (so no, MAKER does not look at read coverage). Instead, we use the transcript assemblies to identify introns via splice-aware alignment (yes, it is the introns and not the exons we care about). Trinity has a nice option called jaccard_clip, which avoids false merging of neighboring transcripts (this mostly occurs in fungi, where UTRs can overlap). Merged transcripts cause extra introns to be assigned as hints, as well as potential overextension of UTRs during the final polishing steps. The jaccard_clip option is the main reason we recommend Trinity. If StringTie has a similar option, then it can be used as well.
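
To make the point about merged transcripts concrete, here is a minimal sketch (not MAKER's own code) that derives introns from the exon blocks of a spliced alignment. The coordinates and the function name are hypothetical; the example assumes two neighboring genes whose UTRs overlap, so a falsely merged transcript carries the introns of both.

```python
def introns_from_exons(exons):
    """Return intron intervals (1-based, inclusive) between sorted exons."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Hypothetical alignments: gene A's 3' UTR overlaps gene B's 5' UTR.
gene_a = [(100, 200), (300, 450)]
gene_b = [(430, 600), (700, 800)]
# A falsely merged transcript fuses the overlapping UTR exons into one.
merged = [(100, 200), (300, 600), (700, 800)]

a_introns = introns_from_exons(gene_a)
merged_introns = introns_from_exons(merged)
# Introns the merged transcript contributes beyond gene A's own.
extra = sorted(set(merged_introns) - set(a_introns))
print(a_introns, merged_introns, extra)
```

In this toy case the merged alignment hands gene B's intron to anything that treats the transcript as evidence for gene A, which is the kind of extra hint the reply warns about.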

Thanks,
Carson


> On May 4, 2017, at 12:37 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi,
> 
> I am attempting to annotate a plant genome. I have a couple of questions:
> 
> 1) RNA-seq assembly
> a) I assembled my RNA-seq data using Trinity and StringTie. The two produce drastically different numbers. When I compare the two assemblies for each sample using TransRate, StringTie produces a higher score for most of the assemblies. I see in all of the threads that you recommend Trinity, but doesn't Trinity produce way too many transcripts (even after chucking out the "bad" ones using TransRate)?
> b) During hint creation, does MAKER take into account that different transcripts have different read coverage (expression levels)? I guess my question is: should I filter out transcripts that have low read coverage?
> 
> 2) Repeat Masking 
> I am following the advanced repeat library construction tutorial (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced). The initial steps find 15 sequences for the LTR and 159 for MITE. But when I get to the perl DIR_CRL/CRL_Step4.pl step, both output files (Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.
> 
> a) are these numbers normal because I was expecting a lot more than 16 for the LTR? 
> b) I don't get any errors when I run CRL_Step4.pl yet no output. What's going on?!
> 
> Many thanks,
> /SB

From mcampbel at cshl.edu  Sun May  7 20:24:27 2017
From: mcampbel at cshl.edu (Campbell, Michael)
Date: Mon, 8 May 2017 01:24:27 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
Message-ID: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>

Hi SB,

I've added Ning Jiang to this email. She has put great effort into updating this protocol recently and will be able to address your questions better than I can.

Ning, would you mind helping out with this?

Thanks,
Mike



From jiangn at msu.edu  Mon May  8 10:50:45 2017
From: jiangn at msu.edu (Jiang, Ning)
Date: Mon, 8 May 2017 15:50:45 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>,
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
Message-ID: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>

Hi Salim,


I am sorry to learn about the issues. How many intact LTR elements you get depends on the quality of your genome assembly; however, 16 seems too low to me.


The inner and LTR sequence files should NOT be empty. Sometimes the issue is that the initial sequence names are long and complicated. If that's the case for your sequences, you might want to simplify the sequence names (using only letters and numbers) and try again.
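
A minimal sketch of that renaming step, assuming a plain FASTA input: it keeps the first whitespace-delimited token of each header, strips everything except letters and digits, and records a map back to the original headers. The function name, the "x2" suffix used to keep names unique, and the example headers are all hypothetical; which characters the CRL scripts actually tolerate is not stated here.

```python
import re


def simplify_fasta_names(lines):
    """Rewrite FASTA headers so each name holds only letters and digits.

    Returns (new_lines, mapping) where mapping is new name -> original header.
    """
    new_lines, mapping, seen = [], {}, set()
    for line in lines:
        if line.startswith(">"):
            original = line[1:].strip()
            # Keep the first whitespace-delimited token, drop other characters.
            token = original.split()[0] if original else ""
            base = re.sub(r"[^A-Za-z0-9]", "", token) or "seq"
            name, n = base, 1
            while name in seen:  # keep names unique after stripping
                n += 1
                name = f"{base}x{n}"
            seen.add(name)
            mapping[name] = original
            new_lines.append(">" + name)
        else:
            new_lines.append(line.rstrip("\n"))
    return new_lines, mapping


example = [
    ">scaffold_12|size=48213 len:48213",
    "ACGTACGT",
    ">scaffold-12|size=900",
    "GGCC",
]
renamed, name_map = simplify_fasta_names(example)
print("\n".join(renamed))
```

Keeping the mapping makes it possible to restore the original names in the final repeat library after the pipeline has run.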


We are working on an automatic pipeline for LTR collection; if everything goes smoothly, it should be available in two to three months.


Best wishes,


Ning


From mjfi2sb3 at gmail.com  Mon May  8 11:41:51 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Mon, 08 May 2017 16:41:51 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
	<DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
Message-ID: <CAJb_6LQFuL0WtMh1oM44Smdc0Rp0dGHeAo-gcRWXXv3cFgu8zg@mail.gmail.com>

Thank you all for your responses.

Regards,
/SB


From qwzhang0601 at gmail.com  Wed May 10 10:48:01 2017
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Wed, 10 May 2017 11:48:01 -0400
Subject: [maker-devel] want coding sequences
Message-ID: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>

Hello:

Thank you for developing and maintaining the tool "Maker2". We have used
Maker2 to annotate the genome of a new rodent species. We are now doing
downstream analysis, which requires coding sequences from different
species as input.

I found that the Maker2 output only includes protein sequences and
transcripts. Is there an easy way to get the coding sequences for
our annotated genome?

Many thanks

Best
Quanwei

From munholl at uwindsor.ca  Wed May 10 15:24:49 2017
From: munholl at uwindsor.ca (Seth Munholland)
Date: Wed, 10 May 2017 16:24:49 -0400
Subject: [maker-devel] MAKER only running 1 task
Message-ID: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>

Hello,

I'm running a MAKER annotation on an Ubuntu cluster, and top shows
the following:

Tasks: 831 total,   3 running, 826 sleeping,   0 stopped,   1 zombie

and my maker run (on a screen) shows:

...
total clusters:4 now processing 0
 ...processing 0 of 3
 ...processing 1 of 3
 ...processing 2 of 3
total clusters:4 now processing 0
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files

Of the 826 sleeping tasks, the vast majority are maker, and only one of the
running tasks is maker.  Is this normal behaviour, or has my maker run stopped
processing?

Seth Munholland, B.Sc.
Department of Biological Sciences
Rm. 304 Biology Building
University of Windsor
401 Sunset Ave. N9B 3P4
T: (519) 253-3000 Ext: 4755

From carsonhh at gmail.com  Thu May 11 10:58:59 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 09:58:59 -0600
Subject: [maker-devel] want coding sequences
In-Reply-To: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
References: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
Message-ID: <82DF4E49-8E78-45F6-8A78-01A45F908987@gmail.com>

Use the fasta_tool utility with --trim_maker_utr to get just the CDS part of each transcript.

--Carson




From carsonhh at gmail.com  Thu May 11 11:03:04 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 10:03:04 -0600
Subject: [maker-devel] MAKER only running 1 task
In-Reply-To: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
References: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
Message-ID: <2119AF6E-3571-4D28-9D81-B24A513839E5@gmail.com>

It may be frozen, or, if it is on the last contig, it may be running a non-parallelizable step (the very last cluster-merging step for each contig is not parallelizable). On a large contig that last step can take a little while, and if there are no other contigs, there is no work to give the other processes to keep them busy in the meantime, so they all have to wait and then exit together once the last step is done. But as I said, this will only happen if you are on the last contig and it is large. Otherwise it is probably frozen somehow (look for any errors further up the log).
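
One way to check whether the run is down to its last contig is to look at the master datastore index log. The sketch below assumes each log line holds three tab-separated fields (sequence name, output path, status) and that a completed contig has a FINISHED entry; the function name and the sample lines are hypothetical, so adjust it if your log looks different.

```python
def unfinished_contigs(index_lines):
    """Return contigs whose most recent logged status is not FINISHED.

    Assumes each line is: sequence name, output path, status (tab-separated).
    """
    last_status = {}
    for line in index_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue
        name, _path, status = fields[0], fields[1], fields[2]
        last_status[name] = status  # later lines overwrite earlier ones
    return {n: s for n, s in last_status.items() if s != "FINISHED"}


# Hypothetical log contents; real paths and names will differ.
example_log = [
    "contig1\tdatastore/contig1/\tSTARTED",
    "contig1\tdatastore/contig1/\tFINISHED",
    "contig2\tdatastore/contig2/\tSTARTED",
]
pending = unfinished_contigs(example_log)
print(pending)
```

If only one large contig is still pending, a single busy process is consistent with the non-parallelizable final step; if many contigs are pending and nothing is progressing, a frozen run is more likely.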

--Carson



From mnaymik at tgen.org  Thu May 11 13:29:46 2017
From: mnaymik at tgen.org (Marcus Naymik)
Date: Thu, 11 May 2017 11:29:46 -0700
Subject: [maker-devel] Maker gene vs snap match in final GFF's
Message-ID: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>

In the final GFF annotations what is the difference between a 'gene' from
maker and a 'match' from snap?


From carsonhh at gmail.com  Thu May 11 13:33:55 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 12:33:55 -0600
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <8A77F384-9BAD-4BF4-BD1E-EDAE4E010612@gmail.com>

MAKER results can be the result of additional hints sent to SNAP, together with post-processing to add UTRs and additional exons that have support from transcript evidence. MAKER results will also have support from either protein or EST/mRNA evidence. A SNAP match is simply the raw ab initio call made by SNAP (no hints, no post-processing, and it may or may not have evidence supporting the structure). These matches are there just for reference, so you know what SNAP will produce outside of MAKER given the underlying HMM.
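
To see how the two kinds of records are distributed in a merged GFF3 file, a small sketch like the one below can tally features by source (column 2) and type (column 3). The source label snap_masked and the example records are assumptions for illustration; check the labels that appear in your own file.

```python
from collections import Counter


def count_features(gff_lines):
    """Count (source, type) pairs from GFF3 columns 2 and 3."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue
        counts[(fields[1], fields[2])] += 1
    return counts


# Hypothetical records; IDs and coordinates are made up for illustration.
example = [
    "##gff-version 3",
    "chr1\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1",
    "chr1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=gene1-mRNA-1;Parent=gene1",
    "chr1\tsnap_masked\tmatch\t120\t880\t.\t+\t.\tID=snap-match-1",
    "chr1\tsnap_masked\tmatch_part\t120\t400\t.\t+\t.\tID=part1;Parent=snap-match-1",
]
counts = count_features(example)
for (source, ftype), n in sorted(counts.items()):
    print(source, ftype, n)
```

Rows whose source is maker are the final annotations; rows from the ab initio predictor are the reference calls described above.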

--Carson



From d.ence at ufl.edu  Thu May 11 13:35:00 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Thu, 11 May 2017 18:35:00 +0000
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <2560F44E-3D8D-4E81-B7B9-621921408B61@mail.ufl.edu>

The two might have identical coordinates in some cases, but they are different kinds of features. The 'match' is a product of an ab initio gene prediction algorithm, while the 'gene' is supported by evidence and has passed through the MAKER polishing and filtering steps.



From nathan.ricks at gmail.com  Fri May 12 16:05:45 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Fri, 12 May 2017 15:05:45 -0600
Subject: [maker-devel] Using mpich2 with Maker
Message-ID: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>

I've been using maker for some time now. However, I would like to speed up
the process by using the mpich2 option. When I run the command ./Build
install, the following error is produced. Any help would be appreciated.

Nathan Ricks
[image: Inline image 1]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 170139 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20170512/ffbe42da/attachment.png>

From yuejiaxing at gmail.com  Mon May 15 06:28:47 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 13:28:47 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
Message-ID: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>

Hello,

I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and ran
the annotation for a yeast (S. cerevisiae) genome. I think the annotation
went well for tRNAs and protein-coding genes, but I am not sure
about snoRNAs. Maker annotated multiple overlapping snoRNA genes, as the
example below shows. Is this expected? If not, what might have caused it,
and is there a way to work around it?
Thanks in advance!

chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1

Best,
Jia-Xing


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Mon May 15 11:28:58 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 15 May 2017 12:28:58 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
Message-ID: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>

Hi Jia-Xing,

That has been my experience in the past as well. For the non-coding RNAs, tRNAscan-SE is very accurate, while snoscan seems to be quite sensitive but not very specific. Did you give it a "snoscan_meth" file? Giving it a snoscan_meth file will help with accuracy. The biggest gains in accuracy are from small RNA-seq data. In the paper where we used snoscan on maize, we didn't keep any snoRNA predictions that lacked support from small RNA-seq data; in practical terms, we got rid of anything with an AED of 1.
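
A minimal sketch of that kind of filter in Python, assuming the GFF3 layout from the original post (a tab-separated snoRNA feature that carries `_AED` and a `Parent` pointing at its gene, with exons parented to the snoRNA). The function name and the `max_aed` argument are made up for illustration; this is not a MAKER utility.

```python
def drop_unsupported_snornas(gff_lines, max_aed=1.0):
    """Return GFF3 lines with snoRNA models whose _AED >= max_aed removed.

    Assumes one snoRNA transcript per gene, as in the records above: the
    parent gene and the child exons are dropped together with the snoRNA.
    """
    def parse_attrs(col9):
        return dict(kv.split("=", 1) for kv in col9.strip().split(";") if "=" in kv)

    records = []
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            records.append((line, None, None))  # comments/directives pass through
        else:
            records.append((line, cols[2], parse_attrs(cols[8])))

    # Pass 1: collect unsupported snoRNA IDs and the genes they belong to.
    bad_ids = set()
    for _, ftype, attrs in records:
        if ftype == "snoRNA" and float(attrs.get("_AED", 0)) >= max_aed:
            bad_ids.add(attrs.get("ID", ""))
            bad_ids.update(attrs.get("Parent", "").split(","))
    bad_ids.discard("")

    # Pass 2: keep everything that is neither flagged nor a child of a flagged feature.
    kept = []
    for line, _, attrs in records:
        if attrs is not None:
            parents = set(attrs.get("Parent", "").split(","))
            if attrs.get("ID") in bad_ids or parents & bad_ids:
                continue
        kept.append(line)
    return kept
```

Called on the lines of a MAKER GFF3 file, this returns the lines to write back out; models with evidence support (AED below 1) are left untouched.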

I hope this helps,
Mike

From yuejiaxing at gmail.com  Mon May 15 12:14:26 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 19:14:26 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
Message-ID: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>

Hi Michael,

Many thanks for the information! I will specify the "snoscan_meth" file and
give it another try then. I mainly want to use maker to annotate
protein-coding genes and tRNAs, but it would be nice to have snoRNAs
reasonably annotated as well.
Thanks again and have a great day!

Best,
Jia-Xing



From carsonhh at gmail.com  Tue May 16 09:51:00 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 May 2017 08:51:00 -0600
Subject: [maker-devel] Using mpich2 with Maker
In-Reply-To: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
References: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
Message-ID: <6C739B3B-D5D6-4263-A73C-4EA1762B1EE1@gmail.com>

You probably need to reinstall Parse::RecDescent, Inline, Inline::C, or all of the above via CPAN (Perl's module installer). The copies already installed on your system may have issues. If you do not have permission to install modules system-wide, you can install them just for your user using local::lib and the bootstrapping instructions here --> http://search.cpan.org/~haarg/local-lib-2.000019/lib/local/lib.pm#The_bootstrapping_technique

Then reinstall MAKER.

--Carson
 

From salim.bougouffa at kaust.edu.sa  Sun May 21 02:45:50 2017
From: salim.bougouffa at kaust.edu.sa (Salim Bougouffa)
Date: Sun, 21 May 2017 10:45:50 +0300
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAGekO4ucZ6=H1X=s-B-+8KmaNpZph5s73sFHu5+o1L6m0sVg2A@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where there is significant RNA-seq evidence (figure
artemis01)
2/ vice versa, where one or two exons are added without RNA-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB


_______________________________________________________
Salim Bougouffa (PhD), Postdoctoral Fellow
4700 KAUST, CBRC, Blg3. Office4326-WS05,
Thuwal, Jeddah, KSA, 23955-6900
(966) 012 808 2963 || salim.bougouff at kaust.edu.sa

-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment-0001.png>

From mjfi2sb3 at gmail.com  Sun May 21 02:48:48 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:48:48 +0000
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where there is significant RNA-seq evidence (figure
artemis01)
2/ vice versa, where one or two exons are added without RNA-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment-0001.png>

From mjfi2sb3 at gmail.com  Sun May 21 02:55:31 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:55:31 +0000
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <CAJb_6LQtJCiDLOtM0hQpq1Eb4BHV_LyqzRKGKEMNMz_YDJiuqg@mail.gmail.com>

Hi,

I should have mentioned a third scenario where an exon is not called fully
by maker despite augustus getting it right (figure artemis03)

[image: artemis03.png]


-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis03.png
Type: image/png
Size: 145860 bytes
Desc: not available
URL: <http://box290.bluehost.com/pipermail/maker-devel_yandell-lab.org/attachments/20170521/1c0c8274/attachment.png>

From admin at genome.arizona.edu  Tue May 23 14:52:17 2017
From: admin at genome.arizona.edu (System Admin)
Date: Tue, 23 May 2017 12:52:17 -0700
Subject: [maker-devel] Hyperthreading
Message-ID: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>

We are using maker in a cluster with mpich.  Currently hyperthreading is 
on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file 
for mpich specifies the total emulated cores for each node.
With hyperthreading on, we have up to 256 total emulated cores available.

Which is the optimal scenario?
1. Use '-n 256'
2. Use '-n 128' with hyperthreading still on
3. Use '-n 128' with hyperthreading turned off

Thanks


From carsonhh at gmail.com  Tue May 23 15:19:29 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:19:29 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>

MAKER is more of a pipeline. It will launch external tools on as many CPUs as you give it with the mpiexec command. I've found that many of the tools used get a boost from hyperthreading even though optimizations for it are not explicitly built into their code. The short answer is that you would have to try it both ways. I doubt there will be much more than a 10-15% difference in runtime.

You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
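
One way to divide a dataset into separate jobs is to partition the contigs into batches of roughly equal total length and run an independent MAKER job on each batch. The Python sketch below only illustrates that idea; the function name, the (name, length) input format, and the greedy longest-first balancing are assumptions made for the example and are not part of MAKER.

```python
import heapq

def partition_contigs(contig_lengths, n_jobs):
    """Assign contigs to n_jobs batches, longest first, always filling the lightest batch.

    contig_lengths: iterable of (contig_name, length_in_bp) pairs.
    Returns a list of n_jobs lists of contig names.
    """
    heap = [(0, i) for i in range(n_jobs)]  # (total bp assigned so far, batch index)
    heapq.heapify(heap)
    batches = [[] for _ in range(n_jobs)]
    for name, length in sorted(contig_lengths, key=lambda c: -c[1]):
        total, idx = heapq.heappop(heap)      # lightest batch so far
        batches[idx].append(name)
        heapq.heappush(heap, (total + length, idx))
    return batches
```

Each batch would then be written to its own FASTA file and annotated in its own working directory, so that the MAKER outputs of the separate jobs stay unmodified until they are merged afterwards.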

--Carson




From admin at genome.arizona.edu  Tue May 23 15:31:48 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:31:48 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
Message-ID: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:19 PM:
> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.

Yes, with '-n 192' we found the load on the cluster will initially go up 
to 360-380 but then continually decreases until maker is finished. 
Memory usage was very low during the processing (under 20%).


From carsonhh at gmail.com  Tue May 23 15:38:42 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:38:42 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>

Also make sure cpus in the control file is set to 1 when using MPI. Otherwise MAKER will tell each program it calls to try to use more CPUs per call.
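
A quick way to double-check this before submitting a job is to read the option back out of the control file. The sketch below is a hypothetical Python helper, not something shipped with MAKER; it assumes the usual `key=value #comment` layout of the control file (typically maker_opts.ctl).

```python
def read_ctl_option(ctl_text, key):
    """Return the value of `key` from MAKER-style control file text, or None if absent."""
    for line in ctl_text.splitlines():
        setting = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" not in setting:
            continue
        name, value = setting.split("=", 1)
        if name.strip() == key:
            return value.strip()
    return None
```

For example, `read_ctl_option(open("maker_opts.ctl").read(), "cpus")` should come back as `"1"` for an MPI run.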

--Carson



From mmokrejs at gmail.com  Tue May 23 15:45:21 2017
From: mmokrejs at gmail.com (=?UTF-8?Q?Martin_MOKREJ=c5=a0?=)
Date: Tue, 23 May 2017 22:45:21 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <30ae31f7-264e-44d0-30bc-a010de3e54a7@gmail.com>

admin at genome.arizona.edu wrote:
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).

Hi,
 the high load could be caused by disk IO or other reasons. The only proof is to run top, htop or similar and check that the processes are in the *running* state ("R" is displayed in the status column). There could be "S" (sleep) when a task is waiting for data input or output, and "D" (disk) could also be shown when waiting for disk IO (unlike network IO).
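
A rough way to get the same picture without watching top is to tally the state letters that ps reports. This is only a sketch in Python, assuming a Linux node with procps, where `ps -eo stat=` prints one state field per process with no header; the function names are made up for illustration.

```python
import subprocess
from collections import Counter

def count_states(stat_fields):
    """Count processes by the first letter of their state field (R, S, D, ...)."""
    return Counter(field.strip()[0] for field in stat_fields if field.strip())

def current_states():
    """Tally the states of all processes on this node using ps."""
    out = subprocess.run(["ps", "-eo", "stat="],
                         capture_output=True, text=True, check=True)
    return count_states(out.stdout.splitlines())
```

If the count for "D" is large compared with "R" while the load average is high, the load is probably coming from processes waiting on disk rather than from computation.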
Martin


From mmokrejs at gmail.com  Tue May 23 15:51:18 2017
From: mmokrejs at gmail.com (=?UTF-8?Q?Martin_MOKREJ=c5=a0?=)
Date: Tue, 23 May 2017 22:51:18 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <54fbcfd6-ba28-e570-cfab-a6d83620f747@gmail.com>

System Admin wrote:
> We are using maker in a cluster with mpich.  Currently hyperthreading is on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file for mpich specifies the total emulated cores for each node.
> With hyperthreading on, we have up to 256 total emulated cores available.
> 
> Which is the optimal scenario?
> 1. Use '-n 256'
> 2. Use '-n 128' with hyperthreading still on
> 3. Use '-n 128' with hyperthreading turned off

Go for 3, but make sure to disable *hyperthreading* in the kernel of the machines as well. I also disable the multicore scheduler (which should help if there are more long-running processes than physical cores available and if some of them would benefit from sharing a cache). We do not have such jobs; hmmer and blast mostly access data from memory, so the CPU cache is not very relevant for them.
  Hyperthreading only helps if jobs are lousy, waiting for some input/output etc., and in that case *it helps* if another process can be executed on the CPU core (hopefully not having the same bottleneck). It is generally a help in bad situations. You are after a good setup, so disable hyperthreading in the kernel, load only as many jobs as there are physical CPU cores, and monitor performance. If jobs are starving, resolve the issue.

Martin


From admin at genome.arizona.edu  Tue May 23 15:57:56 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:57:56 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
	<47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
Message-ID: <bf57acaf-5f70-1bd1-4c1d-ab2b5ced581e@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:38 PM:
> Also make sure cpus in the control file are set to 1 when using MPI. 
> Otherwise it will tell each program it calls to try and use more
> CPUs per call.

Yes we are using cpus=1 in the control file
Thanks


From carsonhh at gmail.com  Tue May 23 16:03:17 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:03:17 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <054B76C7-FAC0-4B9B-A6D1-29A8202E35B3@gmail.com>

One last thing to check if using CentOS or RedHat. I've seen it happen on a handful of clusters where transparent hugepages create odd load issues and very high sys CPU usage under top (not just with maker but with BWA, GATK, and other programs that can have larger memory footprints). If using CentOS or RedHat, you may want to disable defrag for hugepages.

You do this on CentOS 6 to disable it (the process is similar on CentOS 7 and RedHat but you may have to google it) -->

echo never > /sys/kernel/mm/transparent_hugepage/defrag 
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
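
To see what a node is currently set to before or after changing it, the same files can simply be read back. The sketch below is a hedged Python example: the directory is the one used in the commands above, the helper names are invented, and on some kernels the directory may be named differently or not exist at all, in which case the function just returns an empty dict.

```python
import re
from pathlib import Path

THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")  # path taken from the commands above

def active_choice(text):
    """Return the bracketed (active) value, e.g. 'never' from 'always madvise [never]'.

    Files that hold a single plain value (such as 0 or 1) are returned stripped.
    """
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else text.strip()

def thp_defrag_settings(base=THP_DIR):
    """Read the current defrag settings, skipping files that are not present."""
    settings = {}
    for rel in ("defrag", "khugepaged/defrag"):
        path = Path(base) / rel
        if path.exists():
            settings[rel] = active_choice(path.read_text())
    return settings
```

Note that values written with echo are not persistent across reboots, so a check like this is also useful after a node has been restarted.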

--Carson




From carsonhh at gmail.com  Tue May 23 16:34:15 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:34:15 -0600
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <EDDC2B86-8D94-4700-93D0-09CEAF358C3C@gmail.com>

EVM works extremely well when evidence closely matches the predictions and there are no assembly anomalies affecting the ORF. Otherwise, EVM performs very poorly. Also, I would not set unmask=1. It adds noise to the calls.

Note that in all cases given, the gene models are from Augustus (MAKER doesn't make predictions). MAKER just provides hints that Augustus can use for the second call set. Hints boost the score a model gets whenever a feature matches the hint. What you see as Augustus match/match_part features are just references to what Augustus calls without hints.

So if I tell Augustus there is probably an exon/intron at location X, then any model that includes that exon/intron gets a score bump, which causes Augustus to keep models that match the hints and report those over models that don't. However, if there is an issue with the evidence (e.g. a merged mRNA-seq assembly) or an issue with the assembly (a base change that generates an early stop codon or causes a frameshift), then Augustus may choose to truncate or skip an exon in order to capture the bonus from downstream hints. In that case it is unlikely that there is a workable model that captures the exact intron/exon structure, because that structure breaks the ORF at some point. So Augustus instead produces the best model it can to capture as many hint bonuses as it can.

That being said, look for any odd hint sources like very poor protein or transcript evidence alignments. Eliminating bad hints will improve performance (for example, if using mRNA-seq assemblies, Trinity has a jaccard_clip option that helps avoid false merging of transcript evidence). Or if an organism you used for protein evidence consistently produces bad protein alignments, then you may want to drop it completely from the evidence.

Finally, training Augustus on the genome being annotated will help improve performance (note that just because a species is closely related in evolutionary space does not mean that its HMMs will perform well; it's a common fallacy about ab initio prediction discussed in the SNAP paper). Also try adding another gene predictor like SNAP to see if it hurts or helps.
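
The hint-bonus trade-off described above can be illustrated with a toy calculation. This is not Augustus's actual scoring; the additive bonus, the exon labels, and all numbers are made up purely to show why a model that skips one hinted exon can win when the full hinted structure breaks the ORF.

```python
def model_score(base_score, exons, hinted_exons, bonus=1.0):
    """Toy score: intrinsic score plus a fixed bonus per exon that matches a hint."""
    return base_score + bonus * len(set(exons) & set(hinted_exons))


def best_model(candidates, hinted_exons):
    """Pick the highest-scoring candidate among those with an intact ORF."""
    valid = {name: c for name, c in candidates.items() if c["valid_orf"]}
    return max(
        valid,
        key=lambda name: model_score(valid[name]["base"], valid[name]["exons"], hinted_exons),
    )


hints = {"e1", "e2", "e3", "e4"}

# Hypothetical candidates: e2 carries a frameshift, so the model that follows
# every hint has no workable ORF and cannot be reported at all.
candidates = {
    "all_hints_broken_orf": {"exons": ["e1", "e2", "e3", "e4"], "base": 5.0, "valid_orf": False},
    "skip_e2": {"exons": ["e1", "e3", "e4"], "base": 5.0, "valid_orf": True},
    "ignores_hints": {"exons": ["x1"], "base": 5.5, "valid_orf": True},
}

print(best_model(candidates, hints))  # skip_e2
```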

--Carson


> On May 21, 2017, at 1:48 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi Maker folks,
> 
> I have several issues with a plant genome annotation that I am currently doing but perhaps the most recurrent issues are:
> 
> 1/ CDSs that are missed where significant rna-seq evidence is there (figure artemis01)
> 2/ vice versa where one or two exons are added without rna-seq evidence/intron hints (figure artemis02)
> 
> info about the runs:
> 1/ using augustus with a pre-existing model for a related plant that has high homology to the one I am annotating
> 2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
> 3/ evm=1 (seems to perform better than evm=0)
> 4/ repeatmasking (denovo + repbase)
> 
> Best,
> /SB
> 
> <artemis01.png><artemis02.png>

From nathan.ricks at gmail.com  Tue May 23 16:23:49 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Tue, 23 May 2017 15:23:49 -0600
Subject: [maker-devel] maker_functional_gff
Message-ID: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>

I've been working with maker and trying to use the maker_functional_gff to
create an annotated .gff file. However, whenever I run the command, the
following pops up, and just continues for a long time.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58367.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58367.
^CUse of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58368.


Nathan Ricks

From d.ence at ufl.edu  Tue May 23 16:39:32 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Tue, 23 May 2017 21:39:32 +0000
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <9388406A-302C-4B19-9F35-D56C06CC9582@mail.ufl.edu>

Hi Nathan, can you send the command line that you're using that gives the error?

Thanks,
Daniel Ence


From carsonhh at gmail.com  Tue May 23 16:44:05 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:44:05 -0600
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <25B003D7-6A19-4486-B21F-71070F00A580@gmail.com>

The BLAST report you gave it is in the wrong format, is partial/truncated, or you provided the files in the wrong order. Basically, the script received an empty line from the file at some point.

The BLAST report must be in tabular format, which is "wu-blast -mformat 2" or "ncbi-blast -outfmt 6".

Also, the script only supports BLAST results against UniProt/Swiss-Prot.
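
A quick way to find the offending lines before rerunning is to scan the report for empty lines or rows with an unexpected number of columns. The sketch below assumes the default NCBI "-outfmt 6" layout of 12 tab-separated columns; WU-BLAST tabular output has a different column count, so expected_columns would need adjusting. The function name and the example rows are hypothetical.

```python
def find_bad_blast_lines(lines, expected_columns=12):
    """Return (line_number, reason) for lines that are not well-formed tabular rows."""
    problems = []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        if not row.strip():
            problems.append((number, "empty line"))
            continue
        fields = row.split("\t")
        if len(fields) != expected_columns:
            problems.append((number, f"{len(fields)} columns"))
    return problems


# Made-up report: one good row, one empty line, one truncated row
report = [
    "gene1\tsp|P12345|ABC_HUMAN\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t390\n",
    "\n",
    "gene2\tsp|Q67890|XYZ_HUMAN\t75.0\n",
]
print(find_bad_blast_lines(report))
```

With a real file, the same function can be called as find_bad_blast_lines(open(path)), where path is the report passed to maker_functional_gff.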

--Carson



From yuejiaxing at gmail.com  Fri May 26 04:28:48 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 11:28:48 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
Message-ID: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>

Hi Michael,

This is a follow-up on the snoscan issue. The snoscan_meth option seems
to have been removed from the current maker_opts.ctl template file
(v2.31.9). This option used to be there according to this post
(https://www.biostars.org/p/217240/). I manually specified this option in my
maker_opts.ctl file, but MAKER does not seem to recognize it:


STATUS: Parsing control files...
WARNING: Invalid option 'snoscan_meth' in control file maker_opts.ctl
...

Do you know if there is a way to work around this problem? Thanks!

Best,
Jia-Xing


On Mon, May 15, 2017 at 7:14 PM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:

> Hi Michael,
>
> Many thanks for the information! I will specify the "snoscan_meth" file
> and give it another try then. I mainly want to use maker to annotate
> protein-coding genes and tRNAs. But it would be nice to have snoRNA
> reasonably annotated as well.
> Thanks again and have a great day!
>
> Best,
> Jia-Xing
>
>
> On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <
> michael.s.campbell1 at gmail.com> wrote:
>
>> Hi Jia-Xing,
>>
>> That has been my experience in the past as well. For the non-coding RNAs,
>> tRNA-scan is very accurate while snoscan seems to be quite sensitive but
>> not very specific. Did you give it a "snoscan_meth" file? Giving it
>> a snoscan_meth file will help with accuracy. The biggest gains in accuracy
>> are from small RNA-seq data. In the paper where we used snoscan on maize we
>> didn't keep any snoRNA predictions that didn't have support from small
>> RNA-seq data; in practical terms we got rid of anything with an AED of 1.
>>
>> I hope this helps,
>> Mike
>>
>> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>>
>> Hello,
>>
>> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and
>> ran the annotation for a yeast (S. cerevisiae) genome. I think the
>> annotation went well with regard to tRNAs and protein-coding genes but I am
>> not sure about snoRNAs. I found that multiple overlapping snoRNA genes were
>> annotated by maker, as the example below shows. I was wondering if this is
>> expected. If not, what might have caused this problem, and is there a way to
>> work around it? Thanks in advance!
>>
>> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
>> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
>> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
>> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
>> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
>> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
>> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
>> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
>> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
>> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
>> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
>> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
>> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
>> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
>> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>>
>> Best,
>> Jia-Xing
>>
>>
>> --
>> Jia-Xing Yue
>>
>> Population Genomics and Complex Traits Group
>> Tour Pasteur 8eme etage
>> Faculté de Médecine
>> Institute for Research on Cancer and Aging, Nice (IRCAN)
>> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
>> 28 Avenue de Valombrose
>> 06107 NICE Cedex 2
>> France
>>
>> Personal website: http://www.iamphioxus.org/
>>
>>
>>
>
>
> --
> Jia-Xing Yue
>
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
>
> Personal website: http://www.iamphioxus.org/
>
>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Fri May 26 08:54:44 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Fri, 26 May 2017 09:54:44 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
Message-ID: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>

Hi Jia-Xing,

v2.31.9 may not have had that option. I know that it is in v3.00.0, so your best option may be to update.
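
Until then, the AED cutoff mentioned earlier in this thread (dropping anything with an AED of 1) can be applied to the existing output. A minimal sketch, assuming a well-formed, tab-delimited GFF3 with one snoRNA per gene as in the example you posted; the function names and the shortened IDs are hypothetical.

```python
def parse_attributes(column9):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    return dict(item.split("=", 1) for item in column9.strip().split(";") if "=" in item)


def drop_unsupported_snornas(gff_lines, max_aed=1.0):
    """Remove snoRNA features with _AED >= max_aed, plus their parent genes and exons."""
    records = []
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            records.append((line, None, None))  # keep headers and blank lines as-is
            continue
        cols = line.rstrip("\n").split("\t")
        records.append((line, cols[2], parse_attributes(cols[8])))

    dropped = set()
    for _, feature_type, attrs in records:
        if feature_type == "snoRNA" and float(attrs.get("_AED", 0)) >= max_aed:
            dropped.add(attrs["ID"])      # the snoRNA itself (exons point here)
            dropped.add(attrs["Parent"])  # its gene; assumes one snoRNA per gene

    return [
        line
        for line, _, attrs in records
        if attrs is None
        or (attrs.get("ID") not in dropped and attrs.get("Parent") not in dropped)
    ]


# Shortened, made-up records: g49 has no evidence support (AED 1.00), g60 does.
example = [
    "##gff-version 3\n",
    "chrIX\tmaker\tgene\t4328\t4416\t.\t+\t.\tID=g49;Name=g49\n",
    "chrIX\tmaker\tsnoRNA\t4328\t4416\t.\t+\t.\tID=g49-snoRNA-1;Parent=g49;_AED=1.00\n",
    "chrIX\tmaker\texon\t4328\t4416\t.\t+\t.\tID=g49-snoRNA-1:exon:1;Parent=g49-snoRNA-1\n",
    "chrIX\tmaker\tgene\t9000\t9100\t.\t+\t.\tID=g60;Name=g60\n",
    "chrIX\tmaker\tsnoRNA\t9000\t9100\t.\t+\t.\tID=g60-snoRNA-1;Parent=g60;_AED=0.20\n",
]
kept = drop_unsupported_snornas(example)
print(len(kept))  # 3
```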

Thanks,
Mike

From yuejiaxing at gmail.com  Fri May 26 09:20:03 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 16:20:03 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
	<6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
Message-ID: <CAETgyPT_K8DDaujyUVYaRhaoKZaCT4MSrAi4DPczZHSkaTpDsQ@mail.gmail.com>

I see. Thanks Michael!

Best,
Jia-Xing


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From dcg at cau.edu.cn  Mon May  1 07:32:30 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Mon, 1 May 2017 21:32:30 +0800
Subject: [maker-devel] Why my maker get no results?
Message-ID: <2017050121323023791817@cau.edu.cn>

Dear sir:
    I have been working on genome annotation these days. My process is as below:
    
    1. I split my contigs into 300 parts and process them simultaneously to speed things up. 
    2. I used my split genome, protein, ESTs and RNA-seq to make the first alignment (est2genome=1, AED_threshold=0.2). 
    3. Merge the maker.*_.master_datastore_index.log files to get all the paths of the results.
    4. Run the gff_merge script to merge all the results in the different dirs.

    However, no results were returned. (My genome is about 3 GB, but the resulting GFF is essentially empty.)
    index_all.log.all.gff    1KB
    index_all.log.all.maker.proteins.fasta    2837KB
    index_all.log.all.maker.transcripts.fasta    9866KB


    Where might the problem be?
    Thanks!
Yours sincerely.

Chao Chao


dcg at cau.edu.cn

From carsonhh at gmail.com  Mon May  1 14:04:36 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 1 May 2017 14:04:36 -0600
Subject: [maker-devel] Why my maker get no results?
In-Reply-To: <2017050121323023791817@cau.edu.cn>
References: <2017050121323023791817@cau.edu.cn>
Message-ID: <28772B4F-D674-49E2-BFBD-CE2651CE0454@gmail.com>

You can't merge datastore indexes that way. You will need to run the merge on each index separately (i.e. with location and content unmodified from what MAKER gave you), and then merge the resulting fasta and gff3 files afterwards.
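
For illustration only, the final "merge the gff3 files afterwards" step could be sketched in Python as below. This is a minimal sketch under stated assumptions, not MAKER's own merge script (which remains the supported tool): the function name merge_gff3 and all file paths are hypothetical, and any embedded ##FASTA section is simply dropped.

```python
from typing import Iterable


def merge_gff3(paths: Iterable[str], out_path: str) -> int:
    """Concatenate per-run GFF3 files, writing the version pragma once.

    Returns the number of feature lines written. Pragmas, comments and
    any embedded ##FASTA section are dropped (a simplification).
    """
    n_features = 0
    with open(out_path, "w") as out:
        out.write("##gff-version 3\n")
        for path in paths:
            with open(path) as handle:
                for line in handle:
                    if line.startswith("##FASTA"):
                        break  # stop before the embedded sequence block
                    if line.startswith("#") or not line.strip():
                        continue  # skip pragmas, comments, blank lines
                    out.write(line)
                    n_features += 1
    return n_features
```

A 1KB merged gff next to multi-megabyte fasta files, as reported above, would show up here as a return value near zero, which is a quick sanity check before going further.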

--Carson



From jim03ljy at 126.com  Wed May  3 06:15:22 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Wed, 3 May 2017 20:15:22 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
Message-ID: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>

Hi, I'm new to MAKER.
I ran into an error at the RepeatMasker step.

The error is:

NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!

I tried NCBI BLAST+ versions 2.5.0 and 2.6.0 as the path to blast; both give the same error.

When I run "maker -R", which skips the RepeatMasker step, MAKER works.

I found a similar error reported earlier by another user, who solved the problem by updating RepBase.

So I deleted and re-installed RepeatMasker, updated RepBase, and also installed RMBlast.

The error is the same.

I'm stuck on this problem now.

Would highly appreciate any help - thanks!


Jinyuan Lu
Shanghai Jiao Tong University
No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China

From dcg at cau.edu.cn  Wed May  3 09:29:18 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Wed, 3 May 2017 23:29:18 +0800
Subject: [maker-devel] How to explain the maker results?
Message-ID: <2017050323291810262239@cau.edu.cn>

Dear sir:
    I've been using MAKER for my genome annotation. However, there are still some things I don't understand:

    1. After assembly, I have many contigs. First, I set est2genome=1 and protein2genome=1, with my proteins, ESTs and RNA-seq. Which of the approaches below is correct?
    1.1 Each contig has its own gff. I use each contig's own maker_gff file to build a pyu.hmm (for SNAP training), and then train on that single contig.
    1.2 I merge all the maker_gff files to produce one pyu.hmm (for SNAP), and then use this pyu.hmm to train on all the contigs.
    
    2. The aim of my project is to find new proteins, so I need to guarantee the rigor of my annotation.
        My plan is that each predicted protein should align to UniProt (reviewed proteins, about 30K in total) with 100% identity and coverage.
        However, if I choose method 1.2 above:
        After the first step (est2genome=1 and protein2genome=1), about 1600 proteins align to UniProt at 100%. After 2 rounds of training (est2genome=0 and protein2genome=0), fewer proteins align at 100%.
        Is my test method reasonable? Why don't the final results contain more well-aligned proteins?
        After training and fasta_merge, the results are index_all.log.all.maker.proteins.fasta, index_all.log.all.maker.snap_masked.proteins.fasta and index_all.log.all.maker.non_overlapping_ab_initio.proteins.fasta. Which is the final result?

    
     I'm looking forward to hearing from you. Thanks!
Yours sincerely!

     
Chao Chao


dcg at cau.edu.cn

From d.ence at ufl.edu  Wed May  3 09:49:08 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 15:49:08 +0000
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <B69E8792-8153-4840-AC3C-D2E6CCFF3A45@mail.ufl.edu>

Hi, the error is about a specific file (is.lib) which isn't being found. Can you verify that the file is there after you updated RepBase? Use the command:
"ls -l /home/softwares/RepeatMasker/Libraries/20170127/general/is.lib"
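
The same check can be scripted if several library files need verifying. The snippet below is a small sketch, not part of RepeatMasker or MAKER; check_library is a hypothetical helper name, and the only path used is the one from the error message.

```python
from pathlib import Path


def check_library(path: str) -> str:
    """Return a one-line status for a RepeatMasker library file."""
    p = Path(path)
    if not p.exists():
        return f"MISSING  {path}"
    size = p.stat().st_size
    if size == 0:
        return f"EMPTY    {path}"
    return f"OK       {path} ({size} bytes)"


if __name__ == "__main__":
    # Path taken from the error message in this thread.
    print(check_library(
        "/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib"
    ))
```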

Thanks,
Daniel Ence



From carsonhh at gmail.com  Wed May  3 09:53:40 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:53:40 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>

RepBase and RepeatMasker changed structure with the new 4.0.7 release two months ago. The new version of RepBase is only compatible with the new version of RepeatMasker. You have to update both (complete reinstall), or use the previous version of RepeatMasker with the previous version of RepBase.

--Carson



From carsonhh at gmail.com  Wed May  3 09:55:41 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:55:41 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
Message-ID: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>

You may want to use the previous version of both as the new version may still have hidden bugs.

--Carson


From carsonhh at gmail.com  Wed May  3 10:04:20 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:04:20 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
Message-ID: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>

You may have to contact RepBase via e-mail to find out how to get the libraries compatible with RepeatMasker 4.0.6 as it looks like they have removed the previous release from the website.

The last release for 4.0.6 was -> repeatmaskerlibraries-20160829.tar.gz

--Carson



From carsonhh at gmail.com  Wed May  3 10:10:48 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:10:48 -0600
Subject: [maker-devel] How to explain the maker results?
In-Reply-To: <2017050323291810262239@cau.edu.cn>
References: <2017050323291810262239@cau.edu.cn>
Message-ID: <049F8AC8-7E16-4F05-B8B2-01CA7AB88751@gmail.com>

Use the merged gff3 to train snap, otherwise you won't have enough models.

Info on training can be found on the wiki -> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors
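
As a rough outline of what that tutorial section walks through, the helper below only builds the command lines for one round of SNAP training from a merged gff3; it does not run anything. Treat it as a sketch: snap_training_commands is a hypothetical name, the command sequence is reproduced from memory of the tutorial, and the wiki page should be taken as authoritative if the two differ.

```python
import shlex
from typing import List


def snap_training_commands(merged_gff: str, name: str) -> List[str]:
    """Return shell command lines for one round of SNAP training.

    merged_gff : GFF3 merged across all contigs
    name       : label for the resulting HMM, e.g. "pyu"
    """
    gff = shlex.quote(merged_gff)
    label = shlex.quote(name)
    return [
        f"maker2zff {gff}",                               # writes genome.ann and genome.dna
        "fathom -categorize 1000 genome.ann genome.dna",  # sort loci into categories
        "fathom -export 1000 -plus uni.ann uni.dna",      # export the unique loci
        "forge export.ann export.dna",                    # estimate model parameters
        f"hmm-assembler.pl {label} . > {label}.hmm",      # assemble the final HMM
    ]


if __name__ == "__main__":
    for command in snap_training_commands("index_all.log.all.gff", "pyu"):
        print(command)
```

Training once on the merged file, rather than once per contig, is what gives SNAP enough gene models to estimate its parameters.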

Also you can find additional detailed info by searching the mailing list archives -> http://groups.google.com/group/maker-devel

I'm not sure what you are asking with the last question. Alignment is not a function of training and will not be affected by the hmm, but 100% coverage and identity is too strict a threshold even for data derived from the same species.
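
To see how much the choice of threshold matters, one could count the predicted proteins that pass a relaxed cutoff instead of an exact match. The sketch below assumes blastp was run with a custom tabular layout of six columns (qseqid sseqid pident length qlen slen); the function name and the default thresholds are illustrative assumptions, not recommendations from MAKER.

```python
from typing import Iterable, Set


def well_supported(lines: Iterable[str],
                   min_identity: float = 90.0,
                   min_coverage: float = 0.9) -> Set[str]:
    """Return query IDs with at least one hit passing both thresholds.

    Expects tab-separated columns: qseqid sseqid pident length qlen slen.
    Coverage is approximated as alignment length / sequence length and
    must hold for both the query and the subject.
    """
    passed = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        qseqid, pident = fields[0], float(fields[2])
        aln, qlen, slen = int(fields[3]), int(fields[4]), int(fields[5])
        coverage = min(aln / qlen, aln / slen)
        if pident >= min_identity and coverage >= min_coverage:
            passed.add(qseqid)
    return passed
```

Calling it once with min_identity=100.0 and min_coverage=1.0 and once with looser values shows directly how many proteins the strict rule discards.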

--Carson



From nathan.ricks at gmail.com  Wed May  3 10:19:57 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Wed, 3 May 2017 10:19:57 -0600
Subject: [maker-devel] Post Processing of Annotations
Message-ID: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>

Hi,
I've been running your MAKER pipeline, and I've reached the Post Processing
of Annotations portion. In your online training you use the output.blastp
and the output.iprscan files to help assign function.
My question is: what format do these files need to be in?
Iprscan can produce files in a variety of formats: tsv, xml, gff3, html and
SVG, while blastp can produce tabular, pairwise, xml and a number of others.

Thanks

Nathan Ricks

From carsonhh at gmail.com  Wed May  3 10:30:03 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:30:03 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>

Use blastp with the tab-delimited format option and the UniProt/Swiss-Prot database. What additional filters you choose to set (e.g. an e-value limit) may vary, although I would recommend 1e-6 or lower.
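
If the e-value limit was not applied when blastp was run, the tabular output can be filtered afterwards. This is a minimal sketch that assumes the default 12-column tabular layout (in NCBI BLAST+ that is normally what -outfmt 6 produces), where the e-value is the 11th column; filter_by_evalue is a hypothetical helper name.

```python
from typing import Iterable, Iterator


def filter_by_evalue(lines: Iterable[str],
                     max_evalue: float = 1e-6) -> Iterator[str]:
    """Yield BLAST tabular lines whose e-value is <= max_evalue.

    Assumes the default 12-column layout, so the e-value is at index 10.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[10]) <= max_evalue:
            yield line
```

Because it is a generator, it can be applied to an open file handle without loading a large blastp result into memory.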

--Carson



From d.ence at ufl.edu  Wed May  3 10:34:35 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 16:34:35 +0000
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <F9D83F09-F94E-473D-911B-6626CDC8E639@mail.ufl.edu>

Hi, the iprscan output should be in tsv format, which is tab-separated, and the usage statement for maker_functional_gff says that the blastp output should be in "wu-blast -mformat 2", which I think is tab-delimited too.
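
A quick way to confirm an iprscan file really is in the expected tsv layout is to parse it and look at what comes out. The sketch below assumes the InterProScan 5 column order as I understand it (protein accession in column 1, InterPro accession in column 12 when present); interpro_by_protein is a hypothetical helper, and it is not how maker_functional_gff reads the file.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def interpro_by_protein(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map protein ID -> set of InterPro accessions from iprscan TSV lines.

    Column 1 is taken as the protein accession and column 12 as the
    InterPro accession; "-" or a missing column means no InterPro entry.
    """
    result: Dict[str, Set[str]] = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # no InterPro columns on this line
        protein, ipr = fields[0], fields[11]
        if ipr and ipr != "-":
            result[protein].add(ipr)
    return dict(result)
```

An empty result for a non-empty file usually means the file is in one of the other output formats (xml, gff3, ...) rather than tsv.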

~Daniel

 


From carsonhh at gmail.com  Wed May  3 13:20:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 13:20:31 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
	<DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>
	<CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
Message-ID: <0EE0D7F6-5F28-46E7-9AB8-CED93DC811F6@gmail.com>

The maker_functional_gff and maker_functional_fasta scripts pull specific fields out of the UniProt fasta header, so they are tied to the format used by UniProt/Swiss-Prot. At one time I had modified them to also work with NR, but that was several years ago, so I don't know if it would still work.
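
To illustrate why the header format matters, here is a sketch of pulling fields out of a UniProt-style fasta header. It is not the code used by the maker_functional scripts; parse_uniprot_header is a hypothetical helper, and the example header in the comment is only meant to show the general sp|accession|entry layout followed by key=value fields.

```python
import re
from typing import Dict, Optional


def parse_uniprot_header(header: str) -> Optional[Dict[str, str]]:
    """Parse a UniProt-style fasta header into a dict of fields.

    Example input:
    >sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens GN=HBA1
    Returns None when the header does not follow the sp|...| or tr|...| layout.
    """
    m = re.match(r">?(sp|tr)\|([^|]+)\|(\S+)\s*(.*)$", header.strip())
    if m is None:
        return None
    db, accession, entry, rest = m.groups()
    # Split the free text wherever a known KEY= field begins.
    parts = re.split(r"\s+(?=(?:OS|OX|GN|PE|SV)=)", rest)
    info = {"db": db, "accession": accession, "entry": entry,
            "description": parts[0]}
    for part in parts[1:]:
        key, _, value = part.partition("=")
        info[key] = value
    return info
```

A header from a custom NCBI-derived database would return None here, which mirrors the practical problem: the expected fields are simply not where a UniProt-oriented parser looks for them.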

--Carson


> On May 3, 2017, at 1:10 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> Is it possible to make my own database from sequences that I have downloaded from NCBI instead of using UniProt/Swiss-Prot?
> 

From mjfi2sb3 at gmail.com  Thu May  4 00:37:52 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Thu, 04 May 2017 06:37:52 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
Message-ID: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>

Hi,

I am attempting to annotate a plant genome. I have a couple of questions:

*1) RNA-seq assembly*
a) I assembled my RNA-seq data using Trinity and StringTie. The two produce
drastically different numbers of transcripts. When I compare the two
assemblies for each sample using TransRate, StringTie gets a higher score
for most of the assemblies. I see in all of the threads that you recommend
Trinity, but doesn't Trinity produce way too many transcripts (even after
chucking out the "bad" ones using TransRate)?
b) During hint creation, does MAKER take into account that different
transcripts have different read coverage (expression levels)? I guess my
question is: should I filter out transcripts that have low read coverage?

*2) Repeat Masking *
I am following the advanced repeat library construction tutorial (
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced).
The initial steps find 15 sequences for the LTR and 159 for MITE. But when
I get to the perl DIR_CRL/CRL_Step4.pl step, both output files
(Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.

a) Are these numbers normal? I was expecting a lot more than 16 for
the LTR.
b) I don't get any errors when I run CRL_Step4.pl, yet there is no output.
What's going on?

Many thanks,
/SB

From jim03ljy at 126.com  Thu May  4 00:36:12 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Thu, 4 May 2017 14:36:12 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
	<BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
Message-ID: <12de56ed.5ded.15bd22c56c7.Coremail.jim03ljy@126.com>

Thanks a lot!
Problem solved. 
I matched RepeatMasker 4.0.7 with RepBase 20170127 and it worked!


Thanks!
   ----Jinyuan Lu



From dcg at cau.edu.cn  Fri May  5 07:43:43 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Fri, 5 May 2017 21:43:43 +0800
Subject: [maker-devel] How to evaluate maker proteins' quality?
Message-ID: <2017050521434331108720@cau.edu.cn>

Dear sir:
    Now that my MAKER run has finished, I need to check the quality of my results.
    The purpose of my annotation is to find new proteins.
    There are about 30K reviewed proteins for my species. If I want to see how many predicted proteins support the reviewed proteins, how should I do it? (Would blastp be OK? How should I set the threshold?)
    I used UniProt, ESTs and RNA-seq for my annotation. From my perspective, if a protein is reviewed and used to train snap/augustus, we should recover the same protein after several training rounds. So I planned to align the maker proteins to the UniProt proteins (the ones I used for annotation). If a predicted protein matches UniProt with 100% identity and coverage, it can be considered to support the reviewed protein. Is that correct?
    
    If not, maybe I can evaluate my proteins only by AED value and protein domains?

    I'm looking forward to your help. Thanks a lot!

Yours sincerely!

Chao Chao


dcg at cau.edu.cn

From carsonhh at gmail.com  Sun May  7 18:31:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 18:31:31 -0600
Subject: [maker-devel] How to evaluate maker proteins' quality?
In-Reply-To: <2017050521434331108720@cau.edu.cn>
References: <2017050521434331108720@cau.edu.cn>
Message-ID: <51620CC3-43D9-47D5-B8B3-871F291D6518@gmail.com>

Because of small differences between assemblies, individual variants, annotated reference proteins being partial, and potential assembly errors, expecting 100% identity is too strict. About 90+% would be more reasonable for a same-species comparison. AED correlates well with protein confidence. A perfect score of zero will not happen often, though, because the way alignment algorithms work leaves alignment errors around splice sites and short exons. The evidence used is also never perfect, so with AED lower values are better than higher values, but AED cannot be used as an overly specific measurement (it is only correlative, not exact).
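
Since AED is best read as a distribution rather than per-gene truth, a cumulative summary is often more useful than individual values. The sketch below assumes the mRNA features in the MAKER gff3 carry an _AED=... attribute in column 9; both function names are hypothetical, and any cutoff passed in is the caller's choice.

```python
import re
from typing import Iterable, List

_AED = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def mrna_aed_values(gff_lines: Iterable[str]) -> List[float]:
    """Collect the _AED attribute from mRNA features in a MAKER GFF3."""
    values = []
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # sequence section follows; no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) == 9 and cols[2] == "mRNA":
            match = _AED.search(cols[8])
            if match:
                values.append(float(match.group(1)))
    return values


def fraction_at_or_below(values: List[float], threshold: float) -> float:
    """Fraction of values <= threshold (0.0 for an empty list)."""
    if not values:
        return 0.0
    return sum(v <= threshold for v in values) / len(values)
```

Comparing the fraction at a few thresholds (for example 0.2 and 0.5) between training rounds gives a coarse view of whether the annotation as a whole is moving toward or away from the evidence.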

--Carson



From carsonhh at gmail.com  Sun May  7 19:17:37 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 19:17:37 -0600
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
Message-ID: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>

Michael, can you answer the second question? (Michael wrote the protocol, so I CC'd him.)

With respect to the first question: expression level is not necessarily relevant to the annotation process (so no, MAKER does not look at read coverage). Instead we use the transcript assemblies to identify introns via splice-aware alignment (yes, it is the introns and not the exons we care about). Trinity has a nice option called jaccard_clip which avoids false merging of neighboring transcripts (this mostly occurs in fungi, where UTRs can overlap). Merged transcripts cause extra introns to be assigned as hints, as well as potential overextension of UTRs during the final polishing steps. The jaccard_clip option is the main reason we recommend Trinity. If StringTie has a similar option, then it can be used as well.
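
To illustrate why falsely merged transcripts matter, here is a minimal Python sketch (not MAKER code) that derives intron intervals from the exon blocks of one aligned transcript. The coordinates are made up, and the function name is hypothetical.

```python
def introns_from_exons(exons):
    """Return intron intervals between consecutive exon blocks.

    exons: (start, end) pairs, 1-based and inclusive, for one aligned
    transcript. Adjacent or overlapping blocks produce no intron.
    """
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Two neighboring genes (made-up coordinates).
gene_a = [(100, 200), (300, 400)]
gene_b = [(600, 700), (800, 900)]

print(introns_from_exons(gene_a))
print(introns_from_exons(gene_b))
# A falsely merged transcript spanning both genes adds a spurious
# "intron" across the intergenic gap (401-599).
print(introns_from_exons(gene_a + gene_b))
```

The merged transcript yields one more intron than the two genes do separately, and that extra interval is exactly the kind of false hint described above.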

Thanks,
Carson


> On May 4, 2017, at 12:37 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi,
> 
> I am attempting to annotate a plant genome. I have a couple of questions:
> 
> 1) RNA-seq assembly
> a) I assembled my RNA-seq data using Trinity and StringTie. The two produce drastically different numbers. When I compare the two assemblies for each sample using TransRate, StringTie produces a higher score. for most of the assemblies. I see in all of the threads that you recommend Trinity but doesn't trinity produce way too many transcripts (even after chucking out the "bad" ones using transrate).
> b) During hint creation in MAKER, does it take into account that different transcripts have different read coverage (expression levels). I guess my question is should I filter transcripts that have a small read coverage.
> 
> 2) Repeat Masking 
> I am following the advanced repeat library construction tutorial (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced <http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced>). The initial steps find 15 sequences for the LTR and 159 for MITE. But, when I get to the perl DIR_CRL/CRL_Step4.pl step, both output files (Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.
> 
> a) are these numbers normal because I was expecting a lot more than 16 for the LTR? 
> b) I don't get any errors when I run CRL_Step4.pl yet no output. What's going on?!
> 
> Many thanks,
> /SB

From mcampbel at cshl.edu  Sun May  7 19:24:27 2017
From: mcampbel at cshl.edu (Campbell, Michael)
Date: Mon, 8 May 2017 01:24:27 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
Message-ID: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>

Hi SB,

I've added Ning Jiang to this email. She has put great effort into updating this protocol recently and will be able to address your questions better than I can.

Ning, would you mind helping out with this?

Thanks,
Mike


From jiangn at msu.edu  Mon May  8 09:50:45 2017
From: jiangn at msu.edu (Jiang, Ning)
Date: Mon, 8 May 2017 15:50:45 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>,
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
Message-ID: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>

Hi Salim,


I am sorry to learn about the issues. The number of intact LTR elements you get depends on the quality of your genome assembly; however, 16 seems too low to me.


The inner and LTR sequence files should NOT be empty. Sometimes the issue is that the initial sequence names are long and complicated. If that is the case for your sequences, you might want to simplify the sequence names (using only letters and numbers) and try again.
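
One possible way to simplify the names is sketched below in Python. This is not part of the repeat library protocol. It assumes that keeping only the first whitespace-delimited token of each FASTA header is acceptable, and the "dup" suffix for colliding names is an arbitrary choice made for this example.

```python
import re


def simplify_fasta_names(lines):
    """Yield FASTA lines with each header reduced to letters and digits.

    Only the first whitespace-delimited token of a header is kept. If two
    simplified names collide, later ones get a "dupN" suffix.
    """
    seen = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            tokens = line[1:].split()
            token = tokens[0] if tokens else "seq"
            name = re.sub(r"[^A-Za-z0-9]", "", token) or "seq"
            count = seen.get(name, 0)
            seen[name] = count + 1
            if count:
                name = f"{name}dup{count}"
            yield ">" + name
        else:
            yield line


# Made-up records, for illustration only.
records = [">scaffold_1|len=100 some description", "ACGT", ">scaffold_2", "GGCC"]
for out_line in simplify_fasta_names(records):
    print(out_line)
```

Keep a mapping from old to new names if the results need to be traced back to the original assembly.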


We are working on an automatic pipeline for LTR collection. If everything goes smoothly, it should be available in two to three months.


Best wishes,


Ning


From mjfi2sb3 at gmail.com  Mon May  8 10:41:51 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Mon, 08 May 2017 16:41:51 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
	<DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
Message-ID: <CAJb_6LQFuL0WtMh1oM44Smdc0Rp0dGHeAo-gcRWXXv3cFgu8zg@mail.gmail.com>

Thank you all for your responses.

Regards,
/SB


From qwzhang0601 at gmail.com  Wed May 10 09:48:01 2017
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Wed, 10 May 2017 11:48:01 -0400
Subject: [maker-devel] want coding sequences
Message-ID: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>

Hello:

Thanks for development and maintenance of the tool "Maker2".  We have used
Maker2 to do genome annotation of a new rodent species. Now we are doing
downstream analysis, which requires inputs of coding sequences from
different species.

I found the outputs I got from Maker2 only include protein sequences and
transcripts. Is there an easy way that I can get the coding sequences for
our annotated genome?

Many thanks

Best
Quanwei

From munholl at uwindsor.ca  Wed May 10 14:24:49 2017
From: munholl at uwindsor.ca (Seth Munholland)
Date: Wed, 10 May 2017 16:24:49 -0400
Subject: [maker-devel] MAKER only running 1 task
Message-ID: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>

Hello,

I'm running a MAKER annotation on an Ubuntu cluster, and top shows
the following:

Tasks: 831 total,   3 running, 826 sleeping,   0 stopped,   1 zombie

and my maker run (on a screen) shows:

...
total clusters:4 now processing 0
 ...processing 0 of 3
 ...processing 1 of 3
 ...processing 2 of 3
total clusters:4 now processing 0
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files

Of the 826 sleeping tasks, the vast majority are maker, and only one of the
running tasks is maker.  Is this normal behaviour, or has my maker run
stopped processing?

Seth Munholland, B.Sc.
Department of Biological Sciences
Rm. 304 Biology Building
University of Windsor
401 Sunset Ave. N9B 3P4
T: (519) 253-3000 Ext: 4755

From carsonhh at gmail.com  Thu May 11 09:58:59 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 09:58:59 -0600
Subject: [maker-devel] want coding sequences
In-Reply-To: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
References: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
Message-ID: <82DF4E49-8E78-45F6-8A78-01A45F908987@gmail.com>

Use the fasta_tool utility with --trim_maker_utr to get just the CDS part of each transcript.
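
For readers who would rather work from the GFF3 directly, the following Python sketch shows a different route from fasta_tool. It concatenates the CDS features of each transcript from a GFF3 file and a genome held in memory. It assumes standard GFF3 CDS rows with a Parent attribute and a hypothetical `genome` dict mapping sequence names to sequence strings.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGTacgtNn", "TGCAtgcaNn")


def cds_from_gff3(gff_lines, genome):
    """Return {transcript_id: CDS sequence} built from GFF3 CDS rows.

    genome maps sequence name -> sequence string. Coordinates are 1-based
    and inclusive, as in GFF3.
    """
    segments = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue  # also skips any embedded FASTA lines
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                segments[parent].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    result = {}
    for parent, parts in segments.items():
        parts.sort(key=lambda p: p[1])
        seq = "".join(genome[name][start - 1:end] for name, start, end, _ in parts)
        if parts[0][3] == "-":
            # Minus strand: reverse-complement the concatenated segments.
            seq = seq.translate(_COMPLEMENT)[::-1]
        result[parent] = seq
    return result


# Made-up genome and annotation, for illustration only.
genome = {"c1": "AAATGCCCTTTGGGTAA"}
gff = [
    "c1\tmaker\tCDS\t3\t5\t.\t+\t0\tID=t1:cds;Parent=t1",
    "c1\tmaker\tCDS\t9\t11\t.\t+\t0\tID=t1:cds;Parent=t1",
]
print(cds_from_gff3(gff, genome))
```

The sketch does no phase handling or validation, so check the output against the corresponding protein sequences before using it downstream.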

--Carson



From carsonhh at gmail.com  Thu May 11 10:03:04 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 10:03:04 -0600
Subject: [maker-devel] MAKER only running 1 task
In-Reply-To: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
References: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
Message-ID: <2119AF6E-3571-4D28-9D81-B24A513839E5@gmail.com>

It may be frozen, or if it is on the last contig it may be running a non-parallelizable step (the very last cluster-merging step for each contig cannot be parallelized). On a large contig that last step can take a while, and if there are no other contigs, there is no work to hand to the other processes in the meantime, so they all wait and exit together once the last step is done. As I said, this only happens if you are on the last contig and it is large. Otherwise the run is probably frozen somehow (look for errors further up the log).

--Carson



From mnaymik at tgen.org  Thu May 11 12:29:46 2017
From: mnaymik at tgen.org (Marcus Naymik)
Date: Thu, 11 May 2017 11:29:46 -0700
Subject: [maker-devel] Maker gene vs snap match in final GFF's
Message-ID: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>

In the final GFF annotations what is the difference between a 'gene' from
maker and a 'match' from snap?


From carsonhh at gmail.com  Thu May 11 12:33:55 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 12:33:55 -0600
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <8A77F384-9BAD-4BF4-BD1E-EDAE4E010612@gmail.com>

MAKER results can come from additional hints sent to SNAP, together with post-processing that adds UTR and additional exons supported by transcript evidence. MAKER results will also have support from either protein or EST/mRNA evidence. A SNAP match is simply the raw ab initio call made by SNAP (no hints, no post-processing, and it may or may not have evidence supporting the structure). These matches are there for reference only, so you know what SNAP produces outside of MAKER given the underlying HMM.
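
To see how many features of each kind a GFF3 file holds, the Python sketch below tallies rows by source (column 2) and type (column 3). The source label "snap_masked" in the made-up rows is an assumption about how SNAP calls are labeled. Check the actual labels in your own file.

```python
from collections import Counter


def count_source_type(gff_lines):
    """Count GFF3 rows by (source, type), stopping at any ##FASTA section."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3:
            counts[(cols[1], cols[2])] += 1
    return counts


# Made-up rows, for illustration only.
rows = [
    "##gff-version 3",
    "c1\tmaker\tgene\t100\t900\t.\t+\t.\tID=g1",
    "c1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=g1-mRNA-1;Parent=g1",
    "c1\tsnap_masked\tmatch\t120\t880\t.\t+\t.\tID=m1",
    "c1\tsnap_masked\tmatch\t2000\t2500\t.\t-\t.\tID=m2",
]
for (source, feature_type), n in sorted(count_source_type(rows).items()):
    print(source, feature_type, n)
```

Comparing the count of final gene models with the count of raw ab initio matches gives a quick sense of how many raw calls did not become annotations.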

--Carson



From d.ence at ufl.edu  Thu May 11 12:35:00 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Thu, 11 May 2017 18:35:00 +0000
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <2560F44E-3D8D-4E81-B7B9-621921408B61@mail.ufl.edu>

The two might have identical coordinates in some cases, but they are different kinds of features. The 'match' is the product of an ab initio gene prediction algorithm, while the 'gene' is supported by evidence and has passed through the MAKER polishing and filtering steps.



From nathan.ricks at gmail.com  Fri May 12 15:05:45 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Fri, 12 May 2017 15:05:45 -0600
Subject: [maker-devel] Using mpich2 with Maker
Message-ID: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>

I've been using MAKER for some time now. However, I would like to speed up
the process by using the mpich2 option. When I run the command ./Build
install, the following error is produced. Any help would be appreciated.

Nathan Ricks
[image: Inline image 1]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 170139 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170512/ffbe42da/attachment-0001.png>

From yuejiaxing at gmail.com  Mon May 15 05:28:47 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 13:28:47 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
Message-ID: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>

Hello,

I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and ran
the annotation for a yeast (S. cerevisiae) genome. I think the annotation
went well with regard to tRNAs and protein-coding genes, but I am not sure
about snoRNAs. I found that multiple overlapping snoRNA genes were annotated
by maker, as the example below shows. I was wondering if this is expected. If
not, what might have caused this problem, and is there a way to work around
it? Thanks in advance!

chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1

Best,
Jia-Xing


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Mon May 15 10:28:58 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 15 May 2017 12:28:58 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
Message-ID: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>

Hi Jia-Xing,

That has been my experience in the past as well. For the non-coding RNAs, tRNAscan-SE is very accurate, while snoscan seems to be quite sensitive but not very specific. Did you give it a "snoscan_meth" file? Giving it a snoscan_meth file will help with accuracy. The biggest gains in accuracy come from small RNA-seq data. In the paper where we used snoscan on maize, we didn't keep any snoRNA predictions that lacked support from small RNA-seq data; in practical terms, we got rid of anything with an AED of 1.
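
The AED-based cleanup described above could be sketched roughly as follows. This is only an illustration, not the script used for the maize work: the function names and the cutoff parameter are made up here, it assumes a real tab-delimited GFF3 file (the listing in this thread shows spaces only because of the mail formatting), and it assumes each snoRNA gene carries a single transcript, as in the example output.

```python
import sys


def parse_attrs(column9):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    return dict(kv.split("=", 1) for kv in column9.split(";") if "=" in kv)


def drop_unsupported_snornas(gff_lines, aed_cutoff=1.0):
    """Return GFF3 lines without snoRNAs whose _AED is >= aed_cutoff.

    The snoRNA feature, its parent gene and its exons are all dropped.
    Features of any other type (mRNA, tRNA, ...) are left untouched.
    """
    lines = [ln.rstrip("\n") for ln in gff_lines]

    # Pass 1: collect IDs of rejected snoRNAs and of their parent genes.
    rejected = set()
    for ln in lines:
        cols = ln.split("\t")
        if ln.startswith("#") or len(cols) != 9 or cols[2] != "snoRNA":
            continue
        attrs = parse_attrs(cols[8])
        if "_AED" in attrs and float(attrs["_AED"]) >= aed_cutoff:
            if "ID" in attrs:
                rejected.add(attrs["ID"])
            if "Parent" in attrs:
                rejected.add(attrs["Parent"])

    # Pass 2: keep everything that is neither rejected nor a child of it.
    kept = []
    for ln in lines:
        cols = ln.split("\t")
        if not ln.startswith("#") and len(cols) == 9:
            attrs = parse_attrs(cols[8])
            if attrs.get("ID") in rejected or attrs.get("Parent") in rejected:
                continue
        kept.append(ln)
    return kept


if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1]) as handle:
        print("\n".join(drop_unsupported_snornas(handle)))
```

With a cutoff of 1.0, all five overlapping snoscan models in the example (each with _AED=1.00) would be removed, while tRNA and protein-coding annotations pass through unchanged.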

I hope this helps,
Mike
> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
> 
> Hello,
> 
> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and run the annotation for a yeast (S. cerevisiae) genome. I think the annotation went well with regard to tRNAs and protein-coding genes but I am not sure about snoRNAs. I found multiple overlapped snoRNA genes were annotated by maker as the example below shows. I was wondering if this is expected. If not, what might have caused this problem and is there a way to work around. Thanks in advance!
> 
> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
> 
> Best,
> Jia-Xing
> 
> 
> -- 
> Jia-Xing Yue 
> 
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
> 
> Personal website: http://www.iamphioxus.org/
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From yuejiaxing at gmail.com  Mon May 15 11:14:26 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 19:14:26 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
Message-ID: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>

Hi Michael,

Many thanks for the information! I will specify the "snoscan_meth" file and
give it another try then. I mainly want to use maker to annotate
protein-coding genes and tRNAs, but it would be nice to have snoRNAs
reasonably annotated as well.
Thanks again and have a great day!

Best,
Jia-Xing


On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <
michael.s.campbell1 at gmail.com> wrote:

> Hi Jia-Xing,
>
> That has been my experience in the past as well. For the non-coding RNAs,
> tRNAscan-SE is very accurate, while snoscan seems to be quite sensitive but
> not very specific. Did you give it a "snoscan_meth" file? Giving it
> a snoscan_meth file will help with accuracy. The biggest gains in accuracy
> come from small RNA-seq data. In the paper where we used snoscan on maize, we
> didn't keep any snoRNA predictions that lacked support from small
> RNA-seq data; in practical terms, we got rid of anything with an AED of 1.
>
> I hope this helps,
> Mike
>
> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>
> Hello,
>
> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and run
> the annotation for a yeast (S. cerevisiae) genome. I think the annotation
> went well with regard to tRNAs and protein-coding genes but I am not sure
> about snoRNAs. I found multiple overlapped snoRNA genes were annotated by
> maker as the example below shows. I was wondering if this is expected. If
> not, what might have caused this problem and is there a way to work around.
> Thanks in advance!
>
> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>
> Best,
> Jia-Xing
>
>
> --
> Jia-Xing Yue
>
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
>
> Personal website: http://www.iamphioxus.org/
>
>
>
>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From carsonhh at gmail.com  Tue May 16 08:51:00 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 May 2017 08:51:00 -0600
Subject: [maker-devel] Using mpich2 with Maker
In-Reply-To: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
References: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
Message-ID: <6C739B3B-D5D6-4263-A73C-4EA1762B1EE1@gmail.com>

You probably need to reinstall Parse::RecDescent, Inline, Inline::C, or all of the above via CPAN (perl's module installer). The ones already installed on your system may have issues. If you do not have the ability to install modules, you can install them just for your user using local::lib and the bootstrapping instructions here -> http://search.cpan.org/~haarg/local-lib-2.000019/lib/local/lib.pm#The_bootstrapping_technique

Then reinstall MAKER.

--Carson
 
> On May 12, 2017, at 3:05 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> I've been using maker for some time now. However, I would like to speed up the process by using the mpich2 option. When I use the command ./Build install, the following error is produced. Any help would be appreciated.
> 
> Nathan Ricks
> <image.png>
> 
> 
> 


From salim.bougouffa at kaust.edu.sa  Sun May 21 01:45:50 2017
From: salim.bougouffa at kaust.edu.sa (Salim Bougouffa)
Date: Sun, 21 May 2017 10:45:50 +0300
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAGekO4ucZ6=H1X=s-B-+8KmaNpZph5s73sFHu5+o1L6m0sVg2A@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where significant RNA-seq evidence is there (figure
artemis01)
2/ vice versa, where one or two exons are added without RNA-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB


_______________________________________________________
Salim Bougouffa (PhD), Postdoctoral Fellow
4700 KAUST, CBRC, Blg3. Office4326-WS05,
Thuwal, Jeddah, KSA, 23955-6900
(966) 012 808 2963 || salim.bougouff at kaust.edu.sa

-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment-0003.png>

From mjfi2sb3 at gmail.com  Sun May 21 01:48:48 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:48:48 +0000
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where significant RNA-seq evidence is there (figure
artemis01)
2/ vice versa, where one or two exons are added without RNA-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment-0003.png>

From mjfi2sb3 at gmail.com  Sun May 21 01:55:31 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:55:31 +0000
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <CAJb_6LQtJCiDLOtM0hQpq1Eb4BHV_LyqzRKGKEMNMz_YDJiuqg@mail.gmail.com>

Hi,

I should have mentioned a third scenario where an exon is not called fully
by maker despite augustus getting it right (figure artemis03)

[image: artemis03.png]


On Sun, 21 May 2017 at 10:48 Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:

> Hi Maker folks,
>
> I have several issues with a plant genome annotation that I am currently
> doing but perhaps the most recurrent issues are:
>
> 1/ CDSs that are missed where significant RNA-seq evidence is there
> (figure artemis01)
> 2/ vice versa, where one or two exons are added without RNA-seq
> evidence/intron hints (figure artemis02)
>
> info about the runs:
> 1/ using augustus with a pre-existing model for a related plant that has
> high homology to the one I am annotating
> 2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
> 3/ evm=1 (seems to perform better than evm=0)
> 4/ repeatmasking (denovo + repbase)
>
> Best,
> /SB
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis03.png
Type: image/png
Size: 145860 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/1c0c8274/attachment-0001.png>

From admin at genome.arizona.edu  Tue May 23 13:52:17 2017
From: admin at genome.arizona.edu (System Admin)
Date: Tue, 23 May 2017 12:52:17 -0700
Subject: [maker-devel] Hyperthreading
Message-ID: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>

We are using maker in a cluster with mpich.  Currently hyperthreading is 
on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file 
for mpich specifies the total emulated cores for each node.
With hyperthreading on, we have up to 256 total emulated cores available.

Which is the optimal scenario?
1. Use '-n 256'
2. Use '-n 128' with hyperthreading still on
3. Use '-n 128' with hyperthreading turned off

Thanks


From carsonhh at gmail.com  Tue May 23 14:19:29 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:19:29 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>

MAKER is more of a pipeline. It will launch external tools on as many CPUs as you give it with the mpiexec command. I've found that many of the tools used get a boost from hyperthreading even though optimizations are not explicitly built into their code. The short answer is that you would have to try it both ways. I doubt there will be much more than a 10-15% difference in runtime.

You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
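
One way to plan such a split is sketched below. It is not part of MAKER; the function names and the greedy longest-first strategy are assumptions made for illustration, and the 200-process cap is simply the rough plateau mentioned above. Each resulting batch would be written to its own FASTA file and run as an independent MAKER job.

```python
import math


def jobs_needed(total_processes, per_job_cap=200):
    """Number of separate jobs so that no job exceeds per_job_cap processes."""
    return max(1, math.ceil(total_processes / per_job_cap))


def partition_by_length(seq_lengths, n_jobs):
    """Assign sequences to n_jobs batches with roughly equal total length.

    seq_lengths maps sequence name -> length in bp. Sequences are placed
    longest first, each into the batch that is currently smallest.
    """
    batches = [[] for _ in range(n_jobs)]
    totals = [0] * n_jobs
    ordered = sorted(seq_lengths.items(), key=lambda kv: (-kv[1], kv[0]))
    for name, length in ordered:
        smallest = totals.index(min(totals))
        batches[smallest].append(name)
        totals[smallest] += length
    return batches, totals


if __name__ == "__main__":
    n = jobs_needed(256)  # 256 emulated cores -> 2 jobs of 128 processes each
    demo = {"ctgA": 10, "ctgB": 8, "ctgC": 5, "ctgD": 4, "ctgE": 1}
    print(n, partition_by_length(demo, n))
```

Balancing by total sequence length is only a proxy for runtime; contigs dense in repeats or evidence alignments can take disproportionately long.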

--Carson


> On May 23, 2017, at 1:52 PM, System Admin <admin at genome.arizona.edu> wrote:
> 
> We are using maker in a cluster with mpich.  Currently hyperthreading is on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file for mpich specifies the total emulated cores for each node.
> With hyperthreading on, we have up to 256 total emulated cores available.
> 
> Which is the optimal scenario?
> 1. Use '-n 256'
> 2. Use '-n 128' with hyperthreading still on
> 3. Use '-n 128' with hyperthreading turned off
> 
> Thanks
> 


From admin at genome.arizona.edu  Tue May 23 14:31:48 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:31:48 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
Message-ID: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:19 PM:
> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.

Yes, with '-n 192' we found the load on the cluster will initially go up 
to 360-380 but then continually decreases until maker is finished. 
Memory usage was very low during the processing (under 20%).


From carsonhh at gmail.com  Tue May 23 14:38:42 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:38:42 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>

Also make sure cpus in the control file are set to 1 when using MPI. Otherwise it will tell each program it calls to try and use more CPUs per call.
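
A quick pre-flight check for this setting might look like the sketch below. It assumes the control file (typically maker_opts.ctl) uses plain key=value lines with '#' comments; the helper names are invented for this example and are not part of MAKER.

```python
def read_ctl(text):
    """Parse key=value lines from a control file, ignoring '#' comments."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()
    return opts


def mpi_cpu_problem(opts):
    """Return a warning string if cpus is above 1, otherwise None."""
    cpus = int(opts.get("cpus") or 1)
    if cpus <= 1:
        return None
    return ("cpus=%d: every MPI process may ask the tools it calls "
            "for %d CPUs" % (cpus, cpus))


if __name__ == "__main__":
    sample = "cpus=4 #example value for illustration\n"
    print(mpi_cpu_problem(read_ctl(sample)))
```

With 'mpiexec -n 128' and cpus=4, for instance, the tools could end up being asked for about 512 CPUs in total, which oversubscribes the nodes.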

--Carson

> On May 23, 2017, at 2:31 PM, admin at genome.arizona.edu wrote:
> 
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).
> 


From mmokrejs at gmail.com  Tue May 23 14:45:21 2017
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Tue, 23 May 2017 22:45:21 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <30ae31f7-264e-44d0-30bc-a010de3e54a7@gmail.com>

admin at genome.arizona.edu wrote:
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).

Hi,
 the high load could be caused by disk IO or other reasons. The only proof is to run top, htop or similar and check that the processes are in the *running* state ("R" is displayed in the status column). There could be "S" (sleep) when a task is waiting for data input or output, and "D" (disk) could also be shown when it is waiting for disk IO (unlike network IO).
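
For a quick tally across all processes on one node, something like the sketch below could be used instead of eyeballing top. It is Linux-only and reads /proc/<pid>/stat directly; the function names are made up for this example. Note that the Linux load average counts tasks in both the R and D states, so a load well above the core count can come from IO waits rather than CPU work.

```python
import os
from collections import Counter


def state_from_stat(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or ')', so the state is the first field after the LAST ')'.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]


def count_states(proc_root="/proc"):
    """Count processes per state (R, S, D, ...) under proc_root."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                counts[state_from_stat(handle.read())] += 1
        except (OSError, IndexError):
            continue  # process exited or stat was unreadable
    return counts


if __name__ == "__main__" and os.path.isdir("/proc"):
    for state, n in sorted(count_states().items()):
        print(state, n)
```

Many processes in "D" alongside few in "R" would point at storage rather than CPU as the bottleneck.
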
Martin


From mmokrejs at gmail.com  Tue May 23 14:51:18 2017
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Tue, 23 May 2017 22:51:18 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <54fbcfd6-ba28-e570-cfab-a6d83620f747@gmail.com>

System Admin wrote:
> We are using maker in a cluster with mpich.  Currently hyperthreading is on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file for mpich specifies the total emulated cores for each node.
> With hyperthreading on, we have up to 256 total emulated cores available.
> 
> Which is the optimal scenario?
> 1. Use '-n 256'
> 2. Use '-n 128' with hyperthreading still on
> 3. Use '-n 128' with hyperthreading turned off

Go for 3, but make sure to disable *hyperthreading* in the kernel of the machines as well. I also disable the multicore scheduler (which, again, should only help if there are more long-running processes than physical cores available and if some of them would benefit from sharing a cache). We do not have such jobs; hmmer and blast mostly access data from memory, so the CPU cache is not very relevant for them.
  Hyperthreading only helps if jobs are lousy, waiting for some input/output etc.; in that case *it helps* if another process can be executed on the CPU core (hopefully one that does not have the same bottleneck). It generally helps in bad situations. You are after a good setup, so disable hyperthreading in the kernel, run only as many jobs as there are physical CPU cores, and monitor performance. If jobs are starving, resolve the issue.

Martin


From admin at genome.arizona.edu  Tue May 23 14:57:56 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:57:56 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
	<47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
Message-ID: <bf57acaf-5f70-1bd1-4c1d-ab2b5ced581e@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:38 PM:
> Also make sure cpus in the control file are set to 1 when using MPI. 
> Otherwise it will tell each program it calls to try and use more
> CPUs per call.

Yes we are using cpus=1 in the control file
Thanks


From carsonhh at gmail.com  Tue May 23 15:03:17 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:03:17 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <054B76C7-FAC0-4B9B-A6D1-29A8202E35B3@gmail.com>

One last thing to check if using CentOS or RedHat. I've seen it happen on a handful of clusters where transparent hugepages can create odd load issues and very high sys CPU usage under top (not just with maker but with BWA, GATK, and other programs that can have larger memory footprints). If using CentOS or RedHat, you may want to disable defrag for hugepages.

You do this on CentOS 6 to disable it (the process is similar on CentOS 7 and RedHat but you may have to google it) ->

echo never > /sys/kernel/mm/transparent_hugepage/defrag 
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
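
To see what a node is currently set to before or after changing it, a small read-only check along these lines may help. The helper names are invented here, and the second candidate path is an assumption: some RHEL/CentOS 6 kernels are reported to expose the setting under redhat_transparent_hugepage instead, so both locations are tried.

```python
import os
import re

# Candidate locations; the second one is an assumption for older RHEL/CentOS.
CANDIDATES = [
    "/sys/kernel/mm/transparent_hugepage/defrag",
    "/sys/kernel/mm/redhat_transparent_hugepage/defrag",
]


def active_setting(text):
    """Return the bracketed (active) choice, e.g. 'never' from
    'always madvise [never]'. Returns None if nothing is bracketed."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def current_thp_defrag(paths=CANDIDATES):
    """Return (path, active value) for the first readable candidate file."""
    for path in paths:
        if os.path.isfile(path):
            with open(path) as handle:
                return path, active_setting(handle.read())
    return None, None


if __name__ == "__main__":
    print(current_thp_defrag())
```

Values written with echo do not survive a reboot, so the check is worth repeating after nodes are restarted.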

--Carson


> On May 23, 2017, at 2:31 PM, admin at genome.arizona.edu wrote:
> 
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From carsonhh at gmail.com  Tue May 23 15:34:15 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:34:15 -0600
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <EDDC2B86-8D94-4700-93D0-09CEAF358C3C@gmail.com>

EVM works extremely well when evidence closely matches the predictions and there are no assembly anomalies affecting the ORF. Otherwise, EVM performs very poorly. Also, I would not set unmask=1; it adds noise to the calls.

Note that in all the cases given, the gene models are from Augustus (MAKER doesn't make predictions). MAKER just provides hints that Augustus can use for the second call set. Hints boost the score a model gets whenever a feature matches the hint. The Augustus match/match_part features you see are just a reference for what Augustus calls without hints.

So if I tell Augustus there is probably an exon/intron at location X, then any model that includes that exon/intron will have its score bumped up, causing Augustus to keep models that match the hints and report those over models that don't match. However, if there is an issue with the evidence (e.g. a falsely merged mRNA-seq assembly), or an issue with the assembly (a base change that generates an early stop codon or causes a frameshift), then Augustus may choose to truncate or skip an exon in order to capture the bonus from downstream hints. In that situation it is unlikely that there is a workable model that captures the exact intron/exon structure, because that structure breaks the ORF at some point. So Augustus instead produces the best model it can to capture as many hint bonuses as it can.
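
As a toy illustration only (this is not Augustus's actual scoring, and every name and number below is invented), the trade-off can be sketched as a base score plus a fixed bonus for each hint a model matches, with models that break the ORF ruled out:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    base_score: float   # hypothetical score without any hints
    exons: frozenset    # labels of the exons this model includes
    keeps_orf: bool     # False if a stop codon or frameshift breaks the ORF


def hinted_score(model, hints, bonus=1.0):
    """Base score plus a fixed bonus for every hinted exon the model includes."""
    return model.base_score + bonus * len(model.exons & hints)


def best_model(models, hints, bonus=1.0):
    """Pick the highest-scoring model among those with an intact ORF."""
    viable = [m for m in models if m.keeps_orf]
    return max(viable, key=lambda m: hinted_score(m, hints, bonus))


hints = frozenset({"e1", "e2", "e3", "e4"})
models = [
    # Matches every hint, but exon e2 carries a frameshift in the assembly.
    Model("exact", 5.0, frozenset({"e1", "e2", "e3", "e4"}), keeps_orf=False),
    # Skips the problematic exon and still collects the downstream bonuses.
    Model("skip_e2", 4.0, frozenset({"e1", "e3", "e4"}), keeps_orf=True),
    # Truncates early and ignores the downstream hints.
    Model("truncated", 4.5, frozenset({"e1"}), keeps_orf=True),
]
print(best_model(models, hints).name)
```

In this made-up example the model that matches the evidence exactly is not viable, so the one that skips an exon but keeps the most hint bonuses is reported, which is the kind of behaviour described above.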

That being said, look for any odd hint sources like very poor protein or transcript evidence alignments. Eliminating bad hints will improve performance (for example, if using mRNA-seq assemblies, Trinity has a jaccard_clip option which helps avoid false merging of transcript evidence). Or if an organism you used for protein evidence consistently produces bad protein alignments, then you may want to drop it completely from the evidence.

Finally, training Augustus on the genome being annotated will help improve performance (note that just because a species is closely related in evolutionary space does not mean that its HMMs will perform well; it's a common fallacy about ab initio prediction discussed in the SNAP paper). Also try adding another gene predictor like SNAP to see if it hurts or helps.

--Carson


> On May 21, 2017, at 1:48 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi Maker folks,
> 
> I have several issues with a plant genome annotation that I am currently doing but perhaps the most recurrent issues are:
> 
> 1/ CDSs that are missed where significant rna-seq evidence is there (figure artemis01)
> 2/ vice versa where one or two exons are added without rna-seq evidence/intron hints (figure artemis02)
> 
> info about the runs:
> 1/ using augustus with a pre-existing model for a related plant that has high homology to the one I am annotating
> 2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
> 3/ evm=1 (seems to perform better than evm=0)
> 4/ repeatmasking (denovo + repbase)
> 
> Best,
> /SB
> 
> <artemis01.png><artemis02.png>


From nathan.ricks at gmail.com  Tue May 23 15:23:49 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Tue, 23 May 2017 15:23:49 -0600
Subject: [maker-devel] maker_functional_gff
Message-ID: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>

I've been working with maker and trying to use the maker_functional_gff to
create an annotated .gff file. However, whenever I run the command, the
following pops up, and just continues for a long time.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58367.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58367.
^CUse of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58368.


Nathan Ricks

From d.ence at ufl.edu  Tue May 23 15:39:32 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Tue, 23 May 2017 21:39:32 +0000
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <9388406A-302C-4B19-9F35-D56C06CC9582@mail.ufl.edu>

Hi Nathan, can you send the command line that you're using that gives the error?

Thanks,
Daniel Ence



From carsonhh at gmail.com  Tue May 23 15:44:05 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:44:05 -0600
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <25B003D7-6A19-4486-B21F-71070F00A580@gmail.com>

The blast report you gave it is in the wrong format, it is partial/truncated, or you provided the files in the wrong order. Basically it received an empty line from the file at some point.

The blast report must be in tabular format, which is "wu-blast -mformat 2" or "ncbi-blast -outfmt 6".

Also the script only supports blast results against UniProt/Swiss-Prot.
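
A quick way to look for the kind of empty or truncated line described above is to scan the report before handing it to the script. A minimal Python sketch, assuming the default 12-column layout of "-outfmt 6" (pass a different field count for other tabular layouts); the function name and the sample rows are made up:

```python
def check_blast_tabular(lines, expected_fields=12):
    """Return (line_number, reason) pairs for lines that do not look tabular.

    expected_fields=12 matches the default columns of NCBI "-outfmt 6";
    this is an assumption, adjust it if custom columns were requested.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            problems.append((number, "empty line"))
            continue
        fields = text.split("\t")
        if len(fields) != expected_fields:
            problems.append(
                (number, f"{len(fields)} fields, expected {expected_fields}")
            )
    return problems


# Invented sample: one well-formed row, one empty line, one truncated row.
sample = [
    "q1\tsp|P12345|ABC_HUMAN\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t380\n",
    "\n",
    "q2\ttruncated\n",
]
for number, reason in check_blast_tabular(sample):
    print(f"line {number}: {reason}")
```

To check a real report, pass an open file handle instead of the sample list.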

--Carson




From yuejiaxing at gmail.com  Fri May 26 03:28:48 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 11:28:48 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
Message-ID: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>

Hi Michael,

This is a follow-up for the snoscan issue. I found that the snoscan_meth
option seems to have been removed from the current maker_opts.ctl template
file (v2.31.9). This option used to be there according to this post (
https://www.biostars.org/p/217240/). I manually specified this option in my
maker_opts.ctl file but I don't think maker has correctly recognized this
option:


STATUS: Parsing control files...
WARNING: Invalid option 'snoscan_meth' in control file maker_opts.ctl
...

Do you know if there is a way to work around this problem? Thanks!

Best,
Jia-Xing


On Mon, May 15, 2017 at 7:14 PM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:

> Hi Michael,
>
> Many thanks for the information! I will specify the "snoscan_meth" file
> and give it another try then. I mainly want to use maker to annotate
> protein-coding genes and tRNAs. But it would be nice to have snoRNA
> reasonably annotated as well.
> Thanks again and have a great day!
>
> Best,
> Jia-Xing
>
>
> On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <
> michael.s.campbell1 at gmail.com> wrote:
>
>> Hi Jia-Xing,
>>
>> That has been my experience in the past as well. For the non-coding RNAs
>> tRNA-scan is very accurate while snoscan seems to be quite sensitive but
>> not very specific. Did you give it a "snoscan_meth" file? Giving it
>> a snoscan_meth file will help with accuracy. The biggest gains in accuracy
>> are from small RNA-seq data. In the paper where we used snoscan on maize we
>> didn't keep any snoRNA predictions that didn't have support from small
>> RNA-seq data; in practical terms we got rid of anything with an AED of 1.
>>
>> I hope this helps,
>> Mike
>>
>> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>>
>> Hello,
>>
>> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and
>> ran the annotation for a yeast (S. cerevisiae) genome. I think the
>> annotation went well with regard to tRNAs and protein-coding genes but I am
>> not sure about snoRNAs. I found that multiple overlapping snoRNA genes were
>> annotated by maker, as the example below shows. I was wondering if this is
>> expected. If not, what might have caused this problem and is there a way to
>> work around it? Thanks in advance!
>>
>> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
>> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
>> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
>> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
>> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
>> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
>> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
>> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
>> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
>> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
>> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
>> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
>> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
>> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
>> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>>
>> Best,
>> Jia-Xing
>>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Fri May 26 07:54:44 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Fri, 26 May 2017 09:54:44 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
Message-ID: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>

Hi Jia-Xing,

v2.31.9 may not have had that option. I know that it is in the v3.00.0 version, so your best option may be to update.
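
If updating is not possible right away, the filtering step mentioned earlier in this thread (dropping snoRNA predictions with an AED of 1) can be applied after the fact. A rough Python sketch, assuming tab-separated GFF3 with the _AED attribute and the ID/Parent layout shown in the example lines in this thread; the function name is made up, and a gene is dropped as soon as any of its snoRNAs is dropped:

```python
def drop_unsupported_snornas(gff_lines, max_aed=1.0):
    """Remove snoRNA features with _AED >= max_aed, plus their gene and exon lines."""

    def parse(line):
        # Returns (feature_type, attribute_dict); non-feature lines give (None, {}).
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            return None, {}
        pairs = (item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        return cols[2], dict(pairs)

    lines = list(gff_lines)
    dropped = set()
    # First pass: collect unsupported snoRNAs and their parent genes.
    for line in lines:
        ftype, attrs = parse(line)
        if ftype == "snoRNA" and float(attrs.get("_AED", 0)) >= max_aed:
            dropped.add(attrs.get("ID", ""))
            dropped.update(attrs.get("Parent", "").split(","))
    dropped.discard("")
    # Second pass: keep every line that is not tied to a dropped ID.
    kept = []
    for line in lines:
        _, attrs = parse(line)
        parents = set(attrs.get("Parent", "").split(","))
        if attrs.get("ID") in dropped or parents & dropped:
            continue
        kept.append(line)
    return kept
```

Header and comment lines pass through untouched, since they do not have nine columns.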

Thanks,
Mike


From yuejiaxing at gmail.com  Fri May 26 08:20:03 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 16:20:03 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
	<6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
Message-ID: <CAETgyPT_K8DDaujyUVYaRhaoKZaCT4MSrAi4DPczZHSkaTpDsQ@mail.gmail.com>

I see. Thanks Michael!

Best,
Jia-Xing



-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From dcg at cau.edu.cn  Mon May  1 07:32:30 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Mon, 1 May 2017 21:32:30 +0800
Subject: [maker-devel] Why my maker get no results?
Message-ID: <2017050121323023791817@cau.edu.cn>

Dear sir:
    I have been working on genome annotation these days. My process is as below:
    
    1. I split my contigs into 300 parts and process them simultaneously to speed things up.
    2. I used my split genome, protein, ESTs and RNA-seq to make the first alignment (est2genome=1, AED_threshold=0.2).
    3. Merge the maker.*_.master_datastore_index.log files to get all the paths of the results.
    4. Run the gff_merge script to merge all the results from the different dirs.

    However, no results are returned. (My genome is about 3GB, but the resulting gff is essentially empty.)
    index_all.log.all.gff    1KB
    index_all.log.all.maker.proteins.fasta    2837KB
    index_all.log.all.maker.transcripts.fasta    9866KB


    Where could the problem be?
    Thanks!
Yours sincerely.

Chao Chao


dcg at cau.edu.cn

From carsonhh at gmail.com  Mon May  1 14:04:36 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 1 May 2017 14:04:36 -0600
Subject: [maker-devel] Why my maker get no results?
In-Reply-To: <2017050121323023791817@cau.edu.cn>
References: <2017050121323023791817@cau.edu.cn>
Message-ID: <28772B4F-D674-49E2-BFBD-CE2651CE0454@gmail.com>

You can't merge datastore indexes that way. You will need to process each index separately (i.e. with its location and content unmodified from what MAKER gave you), and then merge the resulting fasta and gff3 files afterwards.

--Carson
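
The approach described above (merge within each run first, then combine the per-run files) can be sketched as follows. This is only an illustrative sketch: it assumes MAKER's gff3_merge script is on the PATH and that it takes -d for a datastore index and -o for the output file. The index file names and the helper build_merge_commands are hypothetical.

```python
def build_merge_commands(index_logs, final_gff="genome.all.gff"):
    """Return argv lists: one gff3_merge call per index, then one final merge.

    Each datastore index is used in place and unmodified; only the per-run
    GFF3 files are combined afterwards.
    """
    per_run_gffs = []
    commands = []
    for log in index_logs:
        # Hypothetical naming: write each run's merged GFF3 next to its index.
        out_gff = f"{log}.all.gff"
        commands.append(["gff3_merge", "-d", log, "-o", out_gff])
        per_run_gffs.append(out_gff)
    # Final step: combine the per-run GFF3 files into one file.
    commands.append(["gff3_merge", "-o", final_gff, *per_run_gffs])
    return commands


if __name__ == "__main__":
    # Hypothetical index paths for two of the split runs.
    logs = [
        "part001.maker.output/part001_master_datastore_index.log",
        "part002.maker.output/part002_master_datastore_index.log",
    ]
    for argv in build_merge_commands(logs):
        print(" ".join(argv))
```

The function only builds the command lines; each one can be passed to subprocess.run(argv, check=True) once the paths have been confirmed. The per-run protein and transcript FASTA files can be handled the same way, one run at a time.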



From jim03ljy at 126.com  Wed May  3 06:15:22 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Wed, 3 May 2017 20:15:22 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
Message-ID: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>

Hi, I'm new to MAKER.
I ran into an error at the RepeatMasker step.


The error is:

NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!


I tried both NCBI BLAST+ 2.5.0 and 2.6.0 as the path to BLAST, and both give the same error.

When I run "maker -R", which skips the RepeatMasker step, MAKER works.

I looked at a similar error reported earlier by another user, who solved the problem by updating RepBase.

So I deleted and re-installed RepeatMasker, updated RepBase, and also installed RMBlast.

The error is the same.


I'm stuck on this problem now.

Would highly appreciate any help - thanks!


Jinyuan Lu
Shanghai Jiao Tong University
No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China

From dcg at cau.edu.cn  Wed May  3 09:29:18 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Wed, 3 May 2017 23:29:18 +0800
Subject: [maker-devel] How to explain the maker results?
Message-ID: <2017050323291810262239@cau.edu.cn>

Dear sir:
    I've been using MAKER for my genome annotation. However, there are still some things I don't understand:

    1. After assembly, I have many contigs. First, I set est2genome=1 and protein2genome=1, with my proteins, ESTs and RNA-seq. Which of the approaches below is correct?
    1.1 Each contig has its own GFF. I use each contig's own maker_gff file to build a pyu.hmm (for SNAP training), and then use it on that single contig.
    1.2 I merge all the maker_gff files to produce one pyu.hmm (for SNAP), and then use this pyu.hmm for all the contigs.
    
    2. The aim of my project is to find new proteins, so I need to guarantee the rigor of my annotation.
        My plan was that a predicted protein should align to UniProt (reviewed proteins, about 30K in total) with 100% identity and coverage.
        However, if I choose method 1.2 above:
        After the first step (est2genome=1 and protein2genome=1), about 1600 proteins align to UniProt at 100%. After 2 rounds of training (est2genome=0 and protein2genome=0), fewer proteins align at 100%.
        Is my test method reasonable? Why don't the final results contain more well-aligned proteins?
        After training and fasta_merge, the outputs are index_all.log.all.maker.proteins.fasta, index_all.log.all.maker.snap_masked.proteins.fasta and index_all.log.all.maker.non_overlapping_ab_initio.proteins.fasta. Which one is the final result?

    
     I'm looking forward to hearing from you. Thanks!
Yours sincerely!

     
Chao Chao


dcg at cau.edu.cn

From d.ence at ufl.edu  Wed May  3 09:49:08 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 15:49:08 +0000
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <B69E8792-8153-4840-AC3C-D2E6CCFF3A45@mail.ufl.edu>

Hi, the error is regarding a specific file (is.lib) which isn't being found. Can you verify that the file is there after you updated RepBase? Use the command:
"ls -l /home/softwares/RepeatMasker/Libraries/20170127/general/is.lib"

Thanks,
Daniel Ence



From carsonhh at gmail.com  Wed May  3 09:53:40 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:53:40 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>

RepBase and RepeatMasker changed structure with the new 4.0.7 release two months ago. The new version of RepBase is only compatible with the new version of RepeatMasker. You have to update both (complete reinstall), or use the previous version of RepeatMasker with the previous version of RepBase.

--Carson



From carsonhh at gmail.com  Wed May  3 09:55:41 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:55:41 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
Message-ID: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>

You may want to use the previous version of both as the new version may still have hidden bugs.

--Carson


From carsonhh at gmail.com  Wed May  3 10:04:20 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:04:20 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
Message-ID: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>

You may have to contact RepBase via e-mail to find out how to get the libraries compatible with RepeatMasker 4.0.6, as it looks like they have removed the previous release from the website.

The last release for 4.0.6 was --> repeatmaskerlibraries-20160829.tar.gz

--Carson



From carsonhh at gmail.com  Wed May  3 10:10:48 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:10:48 -0600
Subject: [maker-devel] How to explain the maker results?
In-Reply-To: <2017050323291810262239@cau.edu.cn>
References: <2017050323291810262239@cau.edu.cn>
Message-ID: <049F8AC8-7E16-4F05-B8B2-01CA7AB88751@gmail.com>

Use the merged gff3 to train snap, otherwise you won't have enough models.

Info on training can be found on the wiki --> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors

Also you can find additional detailed info by searching the mailing list archives --> http://groups.google.com/group/maker-devel

I'm not sure what you are asking with the last question. Alignment is not a function of training, and will not be affected by the hmm, but 100% coverage and identity is too strict a threshold even for data derived from the same species.

--Carson
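
A merged GFF3 can be turned into a SNAP HMM roughly as sketched below. The command sequence follows the training section of the tutorial linked above, as far as it can be reconstructed here. The input file name, the HMM name pyu, and the two helper functions are assumptions for illustration, and the SNAP tools (maker2zff, fathom, forge, hmm-assembler.pl) must be installed for run_steps to work.

```python
import subprocess


def snap_training_steps(merged_gff, hmm_name="pyu"):
    """Return (argv, stdout_path) pairs for one round of SNAP training."""
    return [
        # Convert MAKER models to ZFF; expected to write genome.ann and genome.dna.
        (["maker2zff", merged_gff], None),
        (["fathom", "-categorize", "1000", "genome.ann", "genome.dna"], None),
        (["fathom", "-export", "1000", "-plus", "uni.ann", "uni.dna"], None),
        (["forge", "export.ann", "export.dna"], None),
        # hmm-assembler.pl prints the HMM to stdout, so it is redirected to a file.
        (["hmm-assembler.pl", hmm_name, "."], f"{hmm_name}.hmm"),
    ]


def run_steps(steps):
    """Run each step in the current directory, redirecting stdout when requested."""
    for argv, stdout_path in steps:
        if stdout_path is None:
            subprocess.run(argv, check=True)
        else:
            with open(stdout_path, "w") as out:
                subprocess.run(argv, check=True, stdout=out)


if __name__ == "__main__":
    # Print the commands instead of running them; the GFF3 name is an assumption.
    for argv, stdout_path in snap_training_steps("index_all.log.all.gff"):
        suffix = f" > {stdout_path}" if stdout_path else ""
        print(" ".join(argv) + suffix)
```

Training on the merged file (option 1.2 in the question) gives the trainer all confident models at once, which is the point of the advice above.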



From nathan.ricks at gmail.com  Wed May  3 10:19:57 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Wed, 3 May 2017 10:19:57 -0600
Subject: [maker-devel] Post Processing of Annotations
Message-ID: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>

Hi,
I've been running your MAKER pipeline, and I've reached the Post Processing
of Annotations portion. In your online training you use the output.blastp
and output.iprscan files to help assign function.
My question is: what format do these files need to be in?
Iprscan can produce files in a variety of formats (tsv, xml, gff3, html and
SVG), while blastp can produce tabular, pairwise, xml and a number of
others.

Thanks

Nathan Ricks

From carsonhh at gmail.com  Wed May  3 10:30:03 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:30:03 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>

Use blastp with the tab-delimited format option and the UniProt/Swiss-Prot database. What additional filters you choose to set (e.g. e-value limit) may vary, although I would recommend 1e-6 or lower.

--Carson
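
The recommendation above can be sketched as a BLAST+ command line plus a small filter on the tabular output. This is a sketch under assumptions: it uses the standard BLAST+ options -query, -db, -evalue, -outfmt 6 and -out, relies on the default 12-column tabular layout (e-value in column 11), and the file names and helper functions are hypothetical.

```python
def blastp_command(query_fasta, db, out_path, evalue=1e-6):
    """Build a blastp call that writes tab-delimited (outfmt 6) hits."""
    return [
        "blastp",
        "-query", query_fasta,
        "-db", db,
        "-evalue", str(evalue),
        "-outfmt", "6",
        "-out", out_path,
    ]


def filter_hits(lines, max_evalue=1e-6):
    """Yield the fields of tabular hits whose e-value is at or below the cutoff."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        # Default outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
        # qstart qend sstart send evalue bitscore
        if float(fields[10]) <= max_evalue:
            yield fields


if __name__ == "__main__":
    # Hypothetical file names for the MAKER proteins and a Swiss-Prot database.
    print(" ".join(blastp_command("maker.proteins.fasta", "uniprot_sprot.fasta",
                                  "output.blastp")))
```

Setting the cutoff on the command line keeps the output small; the filter is only needed when a stricter threshold is applied afterwards.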



From d.ence at ufl.edu  Wed May  3 10:34:35 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 16:34:35 +0000
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <F9D83F09-F94E-473D-911B-6626CDC8E639@mail.ufl.edu>

Hi, the iprscan output should be in tsv format, which is tab-separated, and the usage statement for maker_functional_gff says that the blastp output should be in "wu-blast -mformat 2", which I think is tab-delimited too.

~Daniel

 


From carsonhh at gmail.com  Wed May  3 13:20:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 13:20:31 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
	<DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>
	<CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
Message-ID: <0EE0D7F6-5F28-46E7-9AB8-CED93DC811F6@gmail.com>

The maker_functional_gff and maker_functional_fasta scripts pull specific fields out of the UniProt fasta header, so they are tied to the format used by UniProt/Swiss-Prot. At one time I had modified them to also work with NR, but that was several years ago, so I don't know if it would still work.

--Carson
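
To illustrate why the header format matters, here is a sketch of pulling fields out of a UniProt-style FASTA header. It is not the parsing code used by maker_functional_gff or maker_functional_fasta; the regular expressions and the function name are assumptions, based on the documented UniProt layout ">db|accession|entry_name description OS=... GN=... PE=... SV=...".

```python
import re

# >db|accession|entry_name description KEY=value KEY=value ...
HEADER_RE = re.compile(r"^>?(sp|tr)\|([^|\s]+)\|(\S+)\s*(.*)$")
# Two-letter keys such as OS, OX, GN, PE, SV, each running up to the next key.
FIELD_RE = re.compile(r"\b([A-Z]{2})=(.*?)(?=\s[A-Z]{2}=|$)")


def parse_uniprot_header(header):
    """Return a dict of header fields, or None if the header is not UniProt-style."""
    match = HEADER_RE.match(header.strip())
    if match is None:
        return None
    db, accession, entry_name, rest = match.groups()
    # The free-text description is everything before the first KEY= field.
    description = re.split(r"\s[A-Z]{2}=", " " + rest, maxsplit=1)[0].strip()
    record = {
        "db": db,
        "accession": accession,
        "entry_name": entry_name,
        "description": description,
    }
    record.update(dict(FIELD_RE.findall(rest)))
    return record


if __name__ == "__main__":
    example = (">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha "
               "OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2")
    print(parse_uniprot_header(example))
    # An NCBI-style header does not fit the pattern and returns None.
    print(parse_uniprot_header(">gi|123|ref|NP_000001.1| some protein"))
```

A custom database built from NCBI sequences would need its headers rewritten into this shape, or the scripts adapted, before the same fields could be extracted.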


> On May 3, 2017, at 1:10 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> Is it possible to make my own database from sequences that I have downloaded from NCBI instead of using UniProt/Swiss-Prot?

From mjfi2sb3 at gmail.com  Thu May  4 00:37:52 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Thu, 04 May 2017 06:37:52 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
Message-ID: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>

Hi,

I am attempting to annotate a plant genome. I have a couple of questions:

*1) RNA-seq assembly*
a) I assembled my RNA-seq data using Trinity and StringTie. The two produce
drastically different numbers. When I compare the two assemblies for each
sample using TransRate, StringTie gets a higher score for most of the
assemblies. I see in all of the threads that you recommend Trinity, but
doesn't Trinity produce way too many transcripts (even after chucking out
the "bad" ones using TransRate)?
b) During hint creation in MAKER, does it take into account that different
transcripts have different read coverage (expression levels)? I guess my
question is: should I filter out transcripts that have low read coverage?

*2) Repeat Masking*
I am following the advanced repeat library construction tutorial (
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced).
The initial steps find 15 sequences for the LTR and 159 for MITE. But when
I get to the perl DIR_CRL/CRL_Step4.pl step, both output files
(Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.

a) Are these numbers normal? I was expecting a lot more than 16 for
the LTR.
b) I don't get any errors when I run CRL_Step4.pl, yet there is no output.
What's going on?!

Many thanks,
/SB

From jim03ljy at 126.com  Thu May  4 00:36:12 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Thu, 4 May 2017 14:36:12 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
	<BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
Message-ID: <12de56ed.5ded.15bd22c56c7.Coremail.jim03ljy@126.com>

Thanks a lot!
Problem solved. 
I paired RepeatMasker 4.0.7 with RepBase 20170127 and it worked!


Thanks!
   ----Jinyuan Lu
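
The pairings reported in this thread can be summarised in a small check. This is a rule-of-thumb sketch based only on what is stated above (the last library release for RepeatMasker 4.0.6 was 20160829, and 20170127 works with 4.0.7). The function name is hypothetical and other releases have not been verified.

```python
LAST_OLD_LAYOUT_RELEASE = "20160829"   # last library release for RepeatMasker 4.0.6
FIRST_NEW_LAYOUT_VERSION = (4, 0, 7)   # RepeatMasker release that changed structure


def libraries_match(repeatmasker_version, library_release):
    """Return True when both sides use the same (old or new) library layout."""
    version = tuple(int(part) for part in repeatmasker_version.split("."))
    rm_is_new = version >= FIRST_NEW_LAYOUT_VERSION
    # Release names are YYYYMMDD strings, so string comparison orders them by date.
    library_is_new = library_release > LAST_OLD_LAYOUT_RELEASE
    return rm_is_new == library_is_new


if __name__ == "__main__":
    for rm, lib in [("4.0.7", "20170127"), ("4.0.6", "20170127"), ("4.0.6", "20160829")]:
        print(rm, lib, libraries_match(rm, lib))
```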



From dcg at cau.edu.cn  Fri May  5 07:43:43 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Fri, 5 May 2017 21:43:43 +0800
Subject: [maker-devel] How to evaluate maker proteins' quality?
Message-ID: <2017050521434331108720@cau.edu.cn>

Dear sir:
    Now that my MAKER run has finished, I would like to check the quality of my results.
    The purpose of my annotation is to find new proteins.
    There are about 30K reviewed proteins for my species. If I want to see how many predicted proteins agree with the reviewed proteins, how should I do it? (Is blastp OK? How should I set the threshold?)
    I used UniProt, ESTs and RNA-seq for my annotation. From my perspective, if a protein is reviewed and was used to train snap/augustus, we should get the same protein back after several training rounds. So I planned to align the maker_proteins to the UniProt proteins (which I used for annotation). If a predicted protein matches UniProt with 100% identity and coverage, it can be considered to support the reviewed protein. Is that correct?
    
    If not, should I evaluate my proteins only by AED value and protein domains?

    I'm looking forward to your help. Thanks a lot!

Yours sincerely!

Chao Chao


dcg at cau.edu.cn
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170505/0f7ef3a2/attachment-0002.html>

From carsonhh at gmail.com  Sun May  7 18:31:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 18:31:31 -0600
Subject: [maker-devel] How to evaluate maker proteins' quality?
In-Reply-To: <2017050521434331108720@cau.edu.cn>
References: <2017050521434331108720@cau.edu.cn>
Message-ID: <51620CC3-43D9-47D5-B8B3-871F291D6518@gmail.com>

Because of small differences in the assemblies, individual variants, annotated proteins used as reference being partial, as well as potential assembly error, a 100% identity expectation is too high. About 90+% would be more reasonable for a same species comparison. AED gives a good correlation with protein confidence. A perfect zero score will not happen often though since the way alignment algorithms work will leave alignment errors around splice sites and short exons. Also the evidence used is never perfect, so with AED lower values are better than higher values but can not be used as an overly specific measurement (it is only correlative and not exact).

--Carson
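
The roughly 90% same-species cutoff suggested above could be applied to blastp results along the following lines. This is a minimal sketch and not part of MAKER. It assumes blastp was run with -outfmt "6 qseqid sseqid pident length qlen slen", with the MAKER proteins as queries and the reviewed proteins as subjects. The 0.9 coverage cutoff and the function names are illustrative assumptions, not recommendations from the thread.

```python
def passes(pident, aln_len, qlen, slen, min_ident=90.0, min_cov=0.9):
    """True if identity and coverage of both sequences meet the cutoffs.

    Coverage is approximated as alignment length / sequence length;
    alignment length includes gaps, so this is only an estimate.
    """
    cov_query = aln_len / qlen
    cov_subject = aln_len / slen
    return pident >= min_ident and min(cov_query, cov_subject) >= min_cov


def supported_subjects(lines, min_ident=90.0, min_cov=0.9):
    """Collect reference (subject) IDs that have at least one passing hit.

    Each line is assumed to hold six tab-separated columns:
    qseqid sseqid pident length qlen slen
    """
    supported = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        qseqid, sseqid, pident, length, qlen, slen = line.rstrip("\n").split("\t")[:6]
        if passes(float(pident), int(length), int(qlen), int(slen),
                  min_ident, min_cov):
            supported.add(sseqid)
    return supported
```

Dividing the size of the returned set by the number of reviewed proteins gives the fraction of the reference set that has a supporting prediction at the chosen cutoffs.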



From carsonhh at gmail.com  Sun May  7 19:17:37 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 19:17:37 -0600
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
Message-ID: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>

Michael, can you answer the second question? (Michael wrote the protocol, so I CC'd him.)

With respect to the first question: expression level is not necessarily relevant to the annotation process (so no, MAKER does not look at read coverage). Instead, we use the transcript assemblies to identify introns via splice-aware alignment (yes, it is the introns and not the exons we care about). Trinity has a nice option called jaccard_clip, which avoids false merging of neighboring transcripts (this mostly occurs in fungi, where UTRs can overlap). Merged transcripts cause extra introns to be assigned as hints, as well as potential overextension of UTRs during the final polishing steps. The jaccard_clip option is the main reason we recommend Trinity. If StringTie has a similar option, then it can be used as well.

Thanks,
Carson
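
The point about introns and falsely merged transcripts can be illustrated with a small sketch. This is not MAKER code. The function name and the exon coordinates are made up for illustration, and coordinates are assumed to be 1-based and inclusive, as in GFF3.

```python
def introns_from_exons(exons):
    """Return intron intervals (1-based, inclusive) between consecutive exons."""
    ordered = sorted(exons)
    introns = []
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # Only a real gap between two exons counts as an intron.
        if next_start - prev_end > 1:
            introns.append((prev_end + 1, next_start - 1))
    return introns


# Two neighboring genes, each supported by its own transcript alignment.
gene_a = [(100, 200), (301, 400)]
gene_b = [(900, 1000), (1101, 1200)]

# A falsely merged transcript spans both genes in one alignment.
merged = gene_a + gene_b

real = set(introns_from_exons(gene_a)) | set(introns_from_exons(gene_b))
extra = set(introns_from_exons(merged)) - real
# 'extra' holds the spurious intron created by the merge: the
# intergenic gap between the two genes is reported as an intron.
```

In this toy case the merged transcript contributes one intron that neither gene has, which is the kind of extra hint that an option such as jaccard_clip is meant to avoid.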


> On May 4, 2017, at 12:37 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi,
> 
> I am attempting to annotate a plant genome. I have a couple of questions:
> 
> 1) RNA-seq assembly
> a) I assembled my RNA-seq data using Trinity and StringTie. The two produce drastically different numbers. When I compare the two assemblies for each sample using TransRate, StringTie produces a higher score for most of the assemblies. I see in all of the threads that you recommend Trinity, but doesn't Trinity produce way too many transcripts (even after chucking out the "bad" ones using TransRate)?
> b) During hint creation, does MAKER take into account that different transcripts have different read coverage (expression levels)? I guess my question is: should I filter out transcripts that have low read coverage?
> 
> 2) Repeat Masking 
> I am following the advanced repeat library construction tutorial (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced). The initial steps find 15 sequences for the LTR and 159 for MITE. But, when I get to the perl DIR_CRL/CRL_Step4.pl step, both output files (Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.
> 
> a) are these numbers normal because I was expecting a lot more than 16 for the LTR? 
> b) I don't get any errors when I run CRL_Step4.pl yet no output. What's going on?!
> 
> Many thanks,
> /SB
> -- 
> ____________________________
> Sent from Inbox Mobile
> 

From mcampbel at cshl.edu  Sun May  7 19:24:27 2017
From: mcampbel at cshl.edu (Campbell, Michael)
Date: Mon, 8 May 2017 01:24:27 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
Message-ID: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>

Hi SB,

I've added Ning Jiang to this email. She has put great effort into updating this protocol recently and will be able to address your questions better than I can.

Ning, would you mind helping out with this?

Thanks,
Mike



From jiangn at msu.edu  Mon May  8 09:50:45 2017
From: jiangn at msu.edu (Jiang, Ning)
Date: Mon, 8 May 2017 15:50:45 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>,
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
Message-ID: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>

Hi Salim,


I am sorry to hear about the issues. The number of intact LTR elements you recover depends on the quality of your genome assembly; however, 16 seems too low to me.


The inner and LTR sequence files should NOT be empty. Sometimes the problem is that the original sequence names are long and complicated. If that is the case for your sequences, you might want to simplify the sequence names (letters and numbers only) and try again.


We are working on an automated pipeline for LTR collection. If everything goes smoothly, it should be available in two to three months.


Best wishes,


Ning
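
The renaming suggested above (sequence names made of letters and numbers only) could be done with a short script such as the one below. It is a minimal sketch under stated assumptions: the "seq" prefix and the function name are hypothetical, and the protocol scripts themselves may have further naming requirements that are not covered here.

```python
def simplify_fasta_names(lines, prefix="seq"):
    """Rename FASTA records to short alphanumeric IDs.

    Returns (new_lines, mapping), where mapping maps each new ID
    back to the original header text (without the leading '>').
    """
    new_lines, mapping = [], {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            new_id = f"{prefix}{len(mapping) + 1}"
            mapping[new_id] = line[1:]
            new_lines.append(">" + new_id)
        else:
            # Sequence lines are passed through unchanged.
            new_lines.append(line)
    return new_lines, mapping
```

Keeping the mapping makes it possible to translate the simplified names back to the original scaffold names once the repeat library has been built.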



From mjfi2sb3 at gmail.com  Mon May  8 10:41:51 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Mon, 08 May 2017 16:41:51 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
	<DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
Message-ID: <CAJb_6LQFuL0WtMh1oM44Smdc0Rp0dGHeAo-gcRWXXv3cFgu8zg@mail.gmail.com>

Thank you all for your responses.

Regards,
/SB


From qwzhang0601 at gmail.com  Wed May 10 09:48:01 2017
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Wed, 10 May 2017 11:48:01 -0400
Subject: [maker-devel] want coding sequences
Message-ID: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>

Hello:

Thanks for development and maintenance of the tool "Maker2".  We have used
Maker2 to do genome annotation of a new rodent species. Now we are doing
downstream analysis, which requires inputs of coding sequences from
different species.

I found the outputs I got from Maker2 only include protein sequences and
transcripts. Is there an easy way that I can get the coding sequences for
our annotated genome?

Many thanks

Best
Quanwei

From munholl at uwindsor.ca  Wed May 10 14:24:49 2017
From: munholl at uwindsor.ca (Seth Munholland)
Date: Wed, 10 May 2017 16:24:49 -0400
Subject: [maker-devel] MAKER only running 1 task
Message-ID: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>

Hello,

I'm running a MAKER annotation on an ubuntu cluster and my top screen shows
the following:

Tasks: 831 total,   3 running, 826 sleeping,   0 stopped,   1 zombie

and my maker run (on a screen) shows:

...
total clusters:4 now processing 0
 ...processing 0 of 3
 ...processing 1 of 3
 ...processing 2 of 3
total clusters:4 now processing 0
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files

Of the 826 sleeping tasks, the vast majority are maker processes, and only one
of the running tasks is maker.  Is this normal behaviour, or has my maker run
stopped processing?

Seth Munholland, B.Sc.
Department of Biological Sciences
Rm. 304 Biology Building
University of Windsor
401 Sunset Ave. N9B 3P4
T: (519) 253-3000 Ext: 4755

From carsonhh at gmail.com  Thu May 11 09:58:59 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 09:58:59 -0600
Subject: [maker-devel] want coding sequences
In-Reply-To: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
References: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
Message-ID: <82DF4E49-8E78-45F6-8A78-01A45F908987@gmail.com>

Use the fasta_tool utility with --trim_maker_utr to get just the CDS part of each transcript.

--Carson
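
For readers who want to see what "just the CDS part" means in terms of coordinates, the sketch below splices CDS segments out of a contig sequence. It is a separate illustration and does not reproduce how fasta_tool works internally. The function names are hypothetical, and the intervals are assumed to be the 1-based, inclusive CDS coordinates of one transcript as found in a GFF3 file.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def splice_cds(contig_seq, cds_intervals, strand):
    """Join the CDS segments of one transcript into a coding sequence.

    cds_intervals: list of (start, end), 1-based and inclusive.
    strand: '+' or '-'; minus-strand features are reverse-complemented
    after joining the segments in genomic order.
    """
    parts = [contig_seq[start - 1:end] for start, end in sorted(cds_intervals)]
    cds = "".join(parts)
    return reverse_complement(cds) if strand == "-" else cds
```

For example, on the contig CCATGAAAGTCCCTAAGG with plus-strand CDS segments 3..8 and 14..16, the function returns ATGAAATAA, with the intron and flanking sequence removed.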




From carsonhh at gmail.com  Thu May 11 10:03:04 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 10:03:04 -0600
Subject: [maker-devel] MAKER only running 1 task
In-Reply-To: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
References: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
Message-ID: <2119AF6E-3571-4D28-9D81-B24A513839E5@gmail.com>

It may be frozen, or, if it is on the last contig, it may be running a non-parallelizable step (the very last cluster-merging step for each contig is not parallelizable). On a large contig that last step can take a little while, and if there are no other contigs left, there is no work to give to the other processes to keep them busy in the meantime. They all have to wait so that they can exit together once the last step is done. But as I said, this will only happen if you are on the last contig and it is large. Otherwise the run is probably frozen somehow (look for any errors further up the log).

--Carson
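
One way to check whether a run is on its last contig is to look at the master datastore index log. The sketch below is a hedged illustration: it assumes each line of that log has three tab-separated columns (contig name, output directory, status) and that the status values include STARTED and FINISHED. The function name is hypothetical.

```python
def still_running(index_lines):
    """Return contigs whose most recent recorded status is STARTED."""
    last_status = {}
    for line in index_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        contig, status = fields[0], fields[-1]
        # Later lines override earlier ones for the same contig.
        last_status[contig] = status
    return sorted(c for c, s in last_status.items() if s == "STARTED")
```

If the result is a single large contig and every other contig in the input is already finished, the long wait described above is the more likely explanation; if many contigs are still listed, the log is worth checking for errors.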




From mnaymik at tgen.org  Thu May 11 12:29:46 2017
From: mnaymik at tgen.org (Marcus Naymik)
Date: Thu, 11 May 2017 11:29:46 -0700
Subject: [maker-devel] Maker gene vs snap match in final GFF's
Message-ID: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>

In the final GFF annotations what is the difference between a 'gene' from
maker and a 'match' from snap?

-- 
 

*This electronic message is intended to be for the use only of the named 
recipient, and may contain information that is confidential or privileged, 
including patient health information. If you are not the intended 
recipient, you are hereby notified that any disclosure, copying, 
distribution or use of the contents of this message is strictly prohibited. 
If you have received this message in error or are not the named recipient, 
please notify us immediately by contacting the sender at the electronic 
mail address noted above, and delete and destroy all copies of this 
message. Thank you.*

From carsonhh at gmail.com  Thu May 11 12:33:55 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 12:33:55 -0600
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <8A77F384-9BAD-4BF4-BD1E-EDAE4E010612@gmail.com>

MAKER gene models can be the result of additional hints sent to SNAP, together with post-processing that adds UTRs and additional exons that have support from transcript evidence. MAKER gene models will also have support from either protein or EST/mRNA evidence. A SNAP match is simply the raw ab initio call made by SNAP (no hints, no post-processing, and it may or may not have evidence supporting the structure). These matches are there just for reference purposes, so you know what SNAP will produce outside of MAKER given the underlying HMM.

--Carson
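
To see how the two kinds of features are labelled in a given output file, the source (column 2) and type (column 3) of every GFF3 feature can be tallied. This is a minimal sketch with a hypothetical function name. The exact source label used for SNAP results (for example "snap_masked") is an assumption and should be checked against the file itself.

```python
from collections import Counter


def count_source_type(gff_lines):
    """Count GFF3 features by (source, type), i.e. columns 2 and 3."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # skip anything that is not a 9-column feature line
        counts[(cols[1], cols[2])] += 1
    return counts
```

Final annotations appear under the maker source with types such as gene and mRNA, while raw predictor output appears under the predictor's own source with match and match_part types.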




From d.ence at ufl.edu  Thu May 11 12:35:00 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Thu, 11 May 2017 18:35:00 +0000
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <2560F44E-3D8D-4E81-B7B9-621921408B61@mail.ufl.edu>

The two might have identical coordinates in some cases, but they are different kinds of features. The 'match' is the product of an ab initio gene prediction algorithm, while the 'gene' is supported by evidence and has passed through the MAKER polishing and filtering steps.




From nathan.ricks at gmail.com  Fri May 12 15:05:45 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Fri, 12 May 2017 15:05:45 -0600
Subject: [maker-devel] Using mpich2 with Maker
Message-ID: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>

I've been using MAKER for some time now. However, I would like to speed up
the process by using the MPICH2 option. When I run the command ./Build
install, the following error is produced. Any help would be appreciated.

Nathan Ricks
[image: Inline image 1]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 170139 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170512/ffbe42da/attachment-0002.png>

From yuejiaxing at gmail.com  Mon May 15 05:28:47 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 13:28:47 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
Message-ID: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>

Hello,

I configured snoscan (v.0.9.1) for my MAKER installation (v2.31.9) and ran
the annotation for a yeast (S. cerevisiae) genome. I think the annotation
went well with regard to tRNAs and protein-coding genes, but I am not sure
about snoRNAs. I found that multiple overlapping snoRNA genes were annotated
by MAKER, as the example below shows. I was wondering if this is expected. If
not, what might have caused this problem, and is there a way to work around it?
Thanks in advance!

chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1

Best,
Jia-Xing


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Mon May 15 10:28:58 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 15 May 2017 12:28:58 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
Message-ID: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>

Hi Jia-Xing,

That has been my experience in the past as well. For the non-coding RNAs, tRNAscan-SE is very accurate, while snoscan seems to be quite sensitive but not very specific. Did you give it a "snoscan_meth" file? Giving it a snoscan_meth file will help with accuracy. The biggest gains in accuracy come from small RNA-seq data. In the paper where we used snoscan on maize, we didn't keep any snoRNA predictions that lacked support from small RNA-seq data; in practical terms, we got rid of anything with an AED of 1.
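
The "get rid of anything with an AED of 1" step could be sketched roughly as below. This is only an illustration, not the script used for the maize paper: the function names are made up, it assumes tab-delimited GFF3 with the `_AED` attribute on the snoRNA line (as in the example output in this thread), and it assumes one snoRNA transcript per gene.

```python
def parse_attrs(col9):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    return dict(kv.split("=", 1) for kv in col9.split(";") if "=" in kv)


def drop_unsupported_snornas(gff_lines, aed_cutoff=1.0):
    """Return GFF3 lines with snoRNA genes at or above the AED cutoff removed.

    Hypothetical helper: a snoRNA with _AED >= aed_cutoff is dropped together
    with its parent gene and its child exons. Other features pass through.
    """
    lines = [line.rstrip("\n") for line in gff_lines]

    # First pass: collect IDs of unsupported snoRNAs and their parent genes.
    dropped = set()
    for line in lines:
        cols = line.split("\t")
        if len(cols) == 9 and cols[2] == "snoRNA":
            attrs = parse_attrs(cols[8])
            if float(attrs.get("_AED", "0")) >= aed_cutoff:
                dropped.add(attrs["ID"])
                if "Parent" in attrs:
                    dropped.add(attrs["Parent"])

    # Second pass: keep everything that is not a dropped feature or its child.
    kept = []
    for line in lines:
        cols = line.split("\t")
        if len(cols) == 9:
            attrs = parse_attrs(cols[8])
            if attrs.get("ID") in dropped or attrs.get("Parent") in dropped:
                continue
        kept.append(line)
    return kept
```

Comment and header lines are passed through untouched, so the result can be written straight back out as GFF3.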

I hope this helps,
Mike
> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
> 
> Hello,
> 
> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and run the annotation for a yeast (S. cerevisiae) genome. I think the annotation went well with regard to tRNAs and protein-coding genes but I am not sure about snoRNAs. I found multiple overlapped snoRNA genes were annotated by maker as the example below shows. I was wondering if this is expected. If not, what might have caused this problem and is there a way to work around. Thanks in advance!
> 
> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
> 
> Best,
> Jia-Xing
> 
> 
> -- 
> Jia-Xing Yue 
> 
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
> 
> Personal website: http://www.iamphioxus.org/ <http://www.iamphioxus.org/>
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From yuejiaxing at gmail.com  Mon May 15 11:14:26 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 19:14:26 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
Message-ID: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>

Hi Michael,

Many thanks for the information! I will specify the "snoscan_meth" file and
give it another try then. I mainly want to use maker to annotate
protein-coding genes and tRNAs, but it would be nice to have snoRNAs
reasonably annotated as well.
Thanks again and have a great day!

Best,
Jia-Xing


On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <
michael.s.campbell1 at gmail.com> wrote:

> Hi Jia-Xing,
>
> That has been my experience in the past as well. For the non-coding RNAs
> tRNA-scan is very accurate while snoscan seems to be quite sensitive but
> not very specific. Did you give it a "snoscan_meth" file? Giving it
> a snoscan_meth file will help with accuracy. The biggest gains in accuracy
> are from small RNA-seq data. In the paper where we used snoscan on maize we
> didn't keep any snoRNA predictions that didn't have support from small
> RNA-seq data, in practical terms we got rid of anything with an AED of 1.
>
> I hope this helps,
> Mike
>
> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>
> Hello,
>
> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and run
> the annotation for a yeast (S. cerevisiae) genome. I think the annotation
> went well with regard to tRNAs and protein-coding genes but I am not sure
> about snoRNAs. I found multiple overlapped snoRNA genes were annotated by
> maker as the example below shows. I was wondering if this is expected. If
> not, what might have caused this problem and is there a way to work around.
> Thanks in advance!
>
> chrIX   maker   gene    4328    4416    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-
> noncoding-gene-0.49
> chrIX   maker   snoRNA  4328    4416    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=
> snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-
> noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
> chrIX   maker   exon    4328    4416    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;
> Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
> chrIX   maker   gene    4375    4563    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-
> noncoding-gene-0.50
> chrIX   maker   snoRNA  4375    4563    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=
> snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-
> noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|
> 0|0|-1|0|1|190|0
> chrIX   maker   exon    4375    4563    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;
> Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
> chrIX   maker   gene    4375    4461    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-
> noncoding-gene-0.51
> chrIX   maker   snoRNA  4375    4461    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=
> snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-
> noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
> chrIX   maker   exon    4375    4461    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;
> Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
> chrIX   maker   gene    4375    4491    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-
> noncoding-gene-0.52
> chrIX   maker   snoRNA  4375    4491    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=
> snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-
> noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|
> 0|0|-1|0|1|118|0
> chrIX   maker   exon    4375    4491    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;
> Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
> chrIX   maker   gene    4375    4500    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-
> noncoding-gene-0.53
> chrIX   maker   snoRNA  4375    4500    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=
> snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-
> noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|
> 0|0|-1|0|1|127|0
> chrIX   maker   exon    4375    4500    .       +       .
> ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;
> Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>
> Best,
> Jia-Xing
>
>
> --
> Jia-Xing Yue
>
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
>
> Personal website: http://www.iamphioxus.org/
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From carsonhh at gmail.com  Tue May 16 08:51:00 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 May 2017 08:51:00 -0600
Subject: [maker-devel] Using mpich2 with Maker
In-Reply-To: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
References: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
Message-ID: <6C739B3B-D5D6-4263-A73C-4EA1762B1EE1@gmail.com>

You probably need to reinstall Parse::RecDescent, Inline, Inline::C, or all of the above via CPAN (Perl's module installer). The ones already installed on your system may have issues. If you do not have permission to install modules system-wide, you can install them just for your user using local::lib and the bootstrapping instructions here -> http://search.cpan.org/~haarg/local-lib-2.000019/lib/local/lib.pm#The_bootstrapping_technique

Then reinstall MAKER.

--Carson
 
> On May 12, 2017, at 3:05 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> I've been using maker for some time now. However, I would like to speed up the process by using the mpich2 option. When use the command ./Build install, the following error is produced. Any help would be appreciated.
> 
> Nathan Ricks
> <image.png>
> 
> 
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From salim.bougouffa at kaust.edu.sa  Sun May 21 01:45:50 2017
From: salim.bougouffa at kaust.edu.sa (Salim Bougouffa)
Date: Sun, 21 May 2017 10:45:50 +0300
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAGekO4ucZ6=H1X=s-B-+8KmaNpZph5s73sFHu5+o1L6m0sVg2A@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where significant rna-seq evidence is there (figure
artemis01)
2/ vice versa where one or two exons are added without rna-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB


_______________________________________________________
Salim Bougouffa (PhD), Postdoctoral Fellow
4700 KAUST, CBRC, Blg3. Office4326-WS05,
Thuwal, Jeddah, KSA, 23955-6900
(966) 012 808 2963 || salim.bougouff at kaust.edu.sa

-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment-0005.png>

From mjfi2sb3 at gmail.com  Sun May 21 01:48:48 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:48:48 +0000
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where significant rna-seq evidence is there (figure
artemis01)
2/ vice versa where one or two exons are added without rna-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment-0004.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment-0005.png>

From mjfi2sb3 at gmail.com  Sun May 21 01:55:31 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:55:31 +0000
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <CAJb_6LQtJCiDLOtM0hQpq1Eb4BHV_LyqzRKGKEMNMz_YDJiuqg@mail.gmail.com>

Hi,

I should have mentioned a third scenario where an exon is not called fully
by maker despite augustus getting it right (figure artemis03)

[image: artemis03.png]


On Sun, 21 May 2017 at 10:48 Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:

> Hi Maker folks,
>
> I have several issues with a plant genome annotation that I am currently
> doing but perhaps the most recurrent issues are:
>
> 1/ CDSs that are missed where significant rna-seq evidence is there
> (figure artemis01)
> 2/ vice versa where one or two exons are added without rna-seq
> evidence/intron hints (figure artemis02)
>
> info about the runs:
> 1/ using augustus with a pre-existing model for a related plant that has
> high homology to the one I am annotating
> 2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do)
> 3/ evm=1 (seems to perform better than evm=0)
> 4/ repeatmasking (denovo + repbase)
>
> Best,
> /SB
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis03.png
Type: image/png
Size: 145860 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/1c0c8274/attachment-0002.png>

From admin at genome.arizona.edu  Tue May 23 13:52:17 2017
From: admin at genome.arizona.edu (System Admin)
Date: Tue, 23 May 2017 12:52:17 -0700
Subject: [maker-devel] Hyperthreading
Message-ID: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>

We are using maker in a cluster with mpich.  Currently hyperthreading is 
on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file 
for mpich specifies the total emulated cores for each node.
With hyperthreading on, we have up to 256 total emulated cores available.

Which is the optimal scenario?
1. Use '-n 256'
2. Use '-n 128' with hyperthreading still on
3. Use '-n 128' with hyperthreading turned off

Thanks


From carsonhh at gmail.com  Tue May 23 14:19:29 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:19:29 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>

MAKER is more of a pipeline. It will launch external tools on as many CPUs as you give it with the mpiexec command. I've found that many of the tools used get a boost from hyperthreading even though optimizations for it are not explicitly built into their code. The short answer is that you would have to try it both ways. I doubt there will be much more than a 10-15% difference in runtime.

You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256, even though CPU capacity isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
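
One possible way to divide a dataset for separate jobs is sketched below. It is not part of MAKER: the function names are invented for illustration, and it simply assigns contigs to a fixed number of bins so that each bin holds a similar number of bases (largest contig first, always into the currently smallest bin). Each bin would then be written to its own FASTA file and run as its own job.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file or any line iterable."""
    name, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def partition_by_length(records, n_jobs):
    """Greedily split (header, sequence) records into n_jobs bins of similar total length."""
    bins = [[] for _ in range(n_jobs)]
    totals = [0] * n_jobs
    # Place the longest sequences first so the bins end up roughly balanced.
    for name, seq in sorted(records, key=lambda rec: len(rec[1]), reverse=True):
        smallest = totals.index(min(totals))
        bins[smallest].append((name, seq))
        totals[smallest] += len(seq)
    return bins
```

Balancing by bases rather than by contig count is a design choice here; a bin with one very long contig tends to take longer than a bin with many short ones.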

--Carson


> On May 23, 2017, at 1:52 PM, System Admin <admin at genome.arizona.edu> wrote:
> 
> We are using maker in a cluster with mpich.  Currently hyperthreading is on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file for mpich specifies the total emulated cores for each node.
> With hyperthreading on, we have up to 256 total emulated cores available.
> 
> Which is the optimal scenario?
> 1. Use '-n 256'
> 2. Use '-n 128' with hyperthreading still on
> 3. Use '-n 128' with hyperthreading turned off
> 
> Thanks
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From admin at genome.arizona.edu  Tue May 23 14:31:48 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:31:48 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
Message-ID: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:19 PM:
> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256 even though CPU isn't really doubling). Also MAKER per job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.

Yes, with '-n 192' we found the load on the cluster will initially go up 
to 360-380 but then continually decreases until maker is finished. 
Memory usage was very low during the processing (under 20%).


From carsonhh at gmail.com  Tue May 23 14:38:42 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:38:42 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>

Also make sure cpus in the control file are set to 1 when using MPI. Otherwise it will tell each program it calls to try and use more CPUs per call.
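
A quick pre-flight check for this could look like the sketch below. It assumes the control file uses `key=value` lines with `#` starting a comment, and the file name `maker_opts.ctl` in the usage note is likewise an assumption about your setup; the function name is made up.

```python
def read_ctl_value(ctl_text, key):
    """Return the value assigned to `key` in control-file text, or None if absent.

    Assumes one `key=value` setting per line, with `#` starting a comment.
    """
    for line in ctl_text.splitlines():
        setting = line.split("#", 1)[0].strip()  # drop any trailing comment
        if "=" in setting:
            name, value = setting.split("=", 1)
            if name.strip() == key:
                return value.strip()
    return None
```

Before submitting an MPI job one could read the file with `open("maker_opts.ctl").read()` and refuse to launch unless `read_ctl_value(text, "cpus") == "1"`.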

--Carson

> On May 23, 2017, at 2:31 PM, admin at genome.arizona.edu wrote:
> 
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256 even though CPU isn't really doubling). Also MAKER per job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From mmokrejs at gmail.com  Tue May 23 14:45:21 2017
From: mmokrejs at gmail.com (=?UTF-8?Q?Martin_MOKREJ=c5=a0?=)
Date: Tue, 23 May 2017 22:45:21 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <30ae31f7-264e-44d0-30bc-a010de3e54a7@gmail.com>

admin at genome.arizona.edu wrote:
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256 even though CPU isn't really doubling). Also MAKER per job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).

Hi,
 the high load could be caused by disk IO or other reasons. The only proof is to run top, htop or similar and check that the processes are in the *running* state ("R" is displayed in the status column). There could be "S" (sleep) when a task is waiting for data input or output, and "D" (disk) could also be shown when waiting for disk IO (unlike network IO).
Martin
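
For a quick tally instead of watching top, something like the following sketch could be used. It is only an illustration: the function names are made up, and `snapshot()` assumes a Linux `ps` that accepts `-eo stat=` (state column only, no header). On Linux the load average counts tasks in both "R" and "D" state, which is why a load well above the process count can point at IO rather than CPU.

```python
import subprocess
from collections import Counter


def count_states(stat_fields):
    """Count processes by the first letter of their ps STAT field (R, S, D, ...)."""
    return Counter(field.strip()[0] for field in stat_fields if field.strip())


def snapshot():
    """Tally the states of all processes on this node (assumes Linux procps `ps`)."""
    out = subprocess.run(
        ["ps", "-eo", "stat="], capture_output=True, text=True, check=True
    ).stdout
    return count_states(out.splitlines())
```

Running `snapshot()` a few times during a MAKER run and comparing the "R" and "D" counts gives a rough picture of whether the jobs are computing or waiting on disk.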


From mmokrejs at gmail.com  Tue May 23 14:51:18 2017
From: mmokrejs at gmail.com (=?UTF-8?Q?Martin_MOKREJ=c5=a0?=)
Date: Tue, 23 May 2017 22:51:18 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <54fbcfd6-ba28-e570-cfab-a6d83620f747@gmail.com>

System Admin wrote:
> We are using maker in a cluster with mpich.  Currently hyperthreading is on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file for mpich specifies the total emulated cores for each node.
> With hyperthreading on, we have up to 256 total emulated cores available.
> 
> Which is the optimal scenario?
> 1. Use '-n 256'
> 2. Use '-n 128' with hyperthreading still on
> 3. Use '-n 128' with hyperthreading turned off

Go for 3, but make sure to disable *hyperthreading* in the kernel of the machines as well. I also disable the multicore scheduler (which, again, should only help if there are more long-running processes than physical cores available and if some of them would benefit from sharing a cache). We do not have such jobs; hmmer and blast mostly access data from memory, so the CPU cache is not very relevant for them.
  Hyperthreading only helps if jobs are inefficient, e.g. stalled waiting for some input/output; in that case *it helps* if another process can be executed on the CPU core (hopefully not hitting the same bottleneck). It generally helps in bad situations. You are after a good setup, so disable hyperthreading in the kernel, load only as many jobs as there are physical CPU cores, and monitor performance. If jobs are starving, resolve the issue.
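
To find out how many physical cores a node really has, one rough approach is to count distinct (physical id, core id) pairs in /proc/cpuinfo, as sketched below. This assumes Linux on x86, where those two fields are present; the function name is made up for illustration.

```python
def count_cores(cpuinfo_text):
    """Return (logical, physical) core counts from /proc/cpuinfo-style text.

    Logical CPUs are counted from `processor` entries; physical cores are the
    distinct (physical id, core id) pairs. Assumes the x86 Linux layout, with
    one blank-line-separated block per logical CPU.
    """
    logical = 0
    physical = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            logical += 1
            physical.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(physical)
```

With hyperthreading on, the first number is typically twice the second; the second is the one to use for `-n` under the advice above. Usage would be `count_cores(open("/proc/cpuinfo").read())`.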

Martin


From admin at genome.arizona.edu  Tue May 23 14:57:56 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:57:56 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
	<47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
Message-ID: <bf57acaf-5f70-1bd1-4c1d-ab2b5ced581e@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:38 PM:
> Also make sure cpus in the control file are set to 1 when using MPI. 
> Otherwise it will tell each program it calls to try and use more
> CPUs per call.

Yes we are using cpus=1 in the control file
Thanks


From carsonhh at gmail.com  Tue May 23 15:03:17 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:03:17 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <054B76C7-FAC0-4B9B-A6D1-29A8202E35B3@gmail.com>

One last thing to check if using CentOS or RedHat. I've seen it happen on a handful of clusters where transparent hugepages create odd load issues and very high sys CPU usage under top (not just with MAKER but with BWA, GATK, and other programs that can have larger memory footprints). If using CentOS or RedHat, you may want to disable defrag for hugepages.

You do this on CentOS 6 to disable it (the process is similar on CentOS 7 and RedHat but you may have to google it) ->

echo never > /sys/kernel/mm/transparent_hugepage/defrag 
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
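
To check what a node is currently set to before or after changing it, a small sketch like this could be run on each machine. The directory is the one used in the commands above and may differ between kernels; the function names are made up. It relies on the kernel marking the active choice in square brackets (for example `always madvise [never]`), while the khugepaged file holds a plain number.

```python
import re
from pathlib import Path

# Directory taken from the commands above; adjust if your kernel uses another path.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def selected_value(sysfs_text):
    """Return the bracketed (active) option, or the stripped text if none is bracketed."""
    match = re.search(r"\[([^\]]+)\]", sysfs_text)
    return match.group(1) if match else sysfs_text.strip()


def report(thp_dir=THP_DIR):
    """Return the active defrag settings, or None for files that do not exist."""
    result = {}
    for rel in ("defrag", "khugepaged/defrag"):
        path = Path(thp_dir) / rel
        result[rel] = selected_value(path.read_text()) if path.exists() else None
    return result
```

After the two echo commands, `report()` would be expected to show `never` and `0` on nodes where those files exist.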

--Carson


> On May 23, 2017, at 2:31 PM, admin at genome.arizona.edu wrote:
> 
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256 even though CPU isn't really doubling). Also MAKER per job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From carsonhh at gmail.com  Tue May 23 15:34:15 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:34:15 -0600
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <EDDC2B86-8D94-4700-93D0-09CEAF358C3C@gmail.com>

EVM works extremely well when evidence closely matches the predictions and there are no assembly anomalies affecting the ORF. Otherwise, EVM performs very poorly. Also, I would not set unmask=1; it adds noise to the calls.

Note that in all the cases given, the gene models are from Augustus (MAKER doesn't make predictions). MAKER just provides hints that Augustus can use for the second call set. Hints boost the score a model gets whenever a feature matches the hint. What you see as Augustus match/match_part features are just a reference for what Augustus calls without hints.

So if I tell Augustus there is probably an exon/intron at location X, then any model that includes that exon/intron gets its score bumped up, causing Augustus to keep models that match the hints and report those over models that don't match. However, if there is an issue with the evidence (e.g. a falsely merged mRNA-seq assembly) or an issue with the genome assembly (a base change that generates an early stop codon or causes a frameshift), then Augustus may choose to truncate or skip an exon in order to capture the bonus from downstream hints. In that case it is unlikely that there is a workable model that captures the exact intron/exon structure, because that structure breaks the ORF at some point. So Augustus instead produces the best model it can, capturing as many hint bonuses as it can.
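
The trade-off described above can be illustrated with a toy example. This is not Augustus's actual scoring: the candidate models, base scores, and the per-hint bonus are all invented for illustration, and the only point is that a model with a broken ORF is unavailable, so a model that skips the damaged exon can win on downstream hint bonuses.

```python
def pick_model(candidates, hinted_exons, bonus=1.0):
    """Return the highest-scoring candidate whose ORF is intact.

    Each candidate is a dict with 'name', 'exons' (a set), 'base_score'
    and 'orf_intact'. The score is base_score plus a bonus per hinted exon.
    """
    def score(model):
        return model["base_score"] + bonus * len(model["exons"] & hinted_exons)

    viable = [m for m in candidates if m["orf_intact"]]
    return max(viable, key=score)


hints = {"e1", "e2", "e3", "e4"}
candidates = [
    # Matches every hint, but a frameshift in e2 breaks the ORF.
    {"name": "full", "exons": {"e1", "e2", "e3", "e4"},
     "base_score": 5.0, "orf_intact": False},
    # Skips e2 so the reading frame survives and downstream hints still count.
    {"name": "skip_e2", "exons": {"e1", "e3", "e4"},
     "base_score": 4.0, "orf_intact": True},
    # Stops early and ignores most hints.
    {"name": "short", "exons": {"e1"},
     "base_score": 4.5, "orf_intact": True},
]
print(pick_model(candidates, hints)["name"])
```

In this toy setup the exon-skipping model is chosen even though the evidence supports all four exons, which is the kind of result described in the question.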

That being said, look for any odd hint sources like very poor protein or transcript evidence alignments. Eliminating bad hints will improve performance (for example, if you are using mRNA-seq assemblies, Trinity has a jaccard_clip option which helps avoid false merging of transcript evidence). Or if an organism you used for protein evidence consistently produces bad protein alignments, then you may want to drop it completely from the evidence.

Finally, training Augustus on the genome being annotated will help improve performance (note that just because a species is closely related in evolutionary space does not mean that its HMMs will perform well; it's a common fallacy about ab initio prediction discussed in the SNAP paper). Also try adding another gene predictor like SNAP to see if it hurts or helps.

--Carson


> On May 21, 2017, at 1:48 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi Maker folks,
> 
> I have several issues with a plant genome annotation that I am currently doing but perhaps the most recurrent issues are:
> 
> 1/ CDSs that are missed where significant rna-seq evidence is there (figure artemis01)
> 2/ vice versa where one or two exons are added without rna-seq evidence/intron hints (figure artemis02)
> 
> info about the runs:
> 1/ using augustus with a pre-existing model for a related plant that has high homology to the one I am annotating
> 2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
> 3/ evm=1 (seems to perform better than evm=0)
> 4/ repeatmasking (denovo + repbase)
> 
> Best,
> /SB
> 
> <artemis01.png><artemis02.png>


From nathan.ricks at gmail.com  Tue May 23 15:23:49 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Tue, 23 May 2017 15:23:49 -0600
Subject: [maker-devel] maker_functional_gff
Message-ID: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>

I've been working with maker and trying to use the maker_functional_gff to
create an annotated .gff file. However, whenever I run the command, the
following pops up, and just continues for a long time.
Use of uninitialized value $qid in hash element at ./maker_functional_gff
line 165, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff
line 170, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff
line 165, <$IN> line 58367.
Use of uninitialized value $qid in hash element at ./maker_functional_gff
line 170, <$IN> line 58367.
^CUse of uninitialized value $qid in hash element at ./maker_functional_gff
line 165, <$IN> line 58368.


Nathan Ricks

From d.ence at ufl.edu  Tue May 23 15:39:32 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Tue, 23 May 2017 21:39:32 +0000
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <9388406A-302C-4B19-9F35-D56C06CC9582@mail.ufl.edu>

Hi Nathan, can you send the command line that you're using that is giving the error?

Thanks,
Daniel Ence

> On May 23, 2017, at 5:23 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> I've been working with maker and trying to use the maker_functional_gff to create an annotated .gff file. However, whenever I run the command, the following pops up, and just continues for a long time.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58367.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58367.
> ^CUse of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58368.
> 
> 
> Nathan Ricks


From carsonhh at gmail.com  Tue May 23 15:44:05 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:44:05 -0600
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <25B003D7-6A19-4486-B21F-71070F00A580@gmail.com>

Either the blast report you gave it is in the wrong format, it is partial/truncated, or you provided the files in the wrong order. Basically, it received an empty line from the file at some point.

The blast report must be in tabular format, which is "wu-blast -mformat 2" or "ncbi-blast -outfmt 6".

Also the script only supports blast results against UniProt/Swiss-prot.
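
A quick way to rule out an empty or truncated line is to scan the report before handing it to the script. This is a minimal Python sketch, not part of MAKER. The function name is made up, and the default of 12 columns assumes the default NCBI "-outfmt 6" layout; WU-BLAST tabular output has a different column count, so pass the number that matches your report.

```python
def check_blast_tabular(lines, expected_cols=12):
    """Return (line_number, problem) pairs for lines that are not tabular hits."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if not text.strip():
            problems.append((number, "empty line"))
            continue
        columns = text.split("\t")
        if len(columns) != expected_cols:
            problems.append(
                (number, f"{len(columns)} columns, expected {expected_cols}")
            )
    return problems


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as handle:
        for number, problem in check_blast_tabular(handle):
            print(f"line {number}: {problem}")
```

If this prints anything, the reported line numbers can be compared with the "<$IN> line N" numbers in the warnings.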

--Carson


> On May 23, 2017, at 3:23 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> I've been working with maker and trying to use the maker_functional_gff to create an annotated .gff file. However, whenever I run the command, the following pops up, and just continues for a long time.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58367.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58367.
> ^CUse of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58368.
> 
> 
> Nathan Ricks


From yuejiaxing at gmail.com  Fri May 26 03:28:48 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 11:28:48 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
Message-ID: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>

Hi Michael,

This is a follow-up for the snoscan issue. I found that the snoscan_meth option
seems to have been removed from the current maker_opts.ctl template file
(v2.31.9). This option used to be there according to this post (
https://www.biostars.org/p/217240/). I manually specified this option in my
maker_opts.ctl file but I don't think maker has correctly recognized this
option:


STATUS: Parsing control files...
WARNING: Invalid option 'snoscan_meth' in control file maker_opts.ctl
...

Do you know if there is a way to work around this problem? Thanks!

Best,
Jia-Xing


On Mon, May 15, 2017 at 7:14 PM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:

> Hi Michael,
>
> Many thanks for the information! I will specify the "snoscan_meth" file
> and give it another try then. I mainly want to use maker to annotate
> protein-coding genes and tRNAs. But it would be nice to have snoRNA
> reasonably annotated as well.
> Thanks again and have a great day!
>
> Best,
> Jia-Xing
>
>
> On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <
> michael.s.campbell1 at gmail.com> wrote:
>
>> Hi Jia-Xing,
>>
>> That has been my experience in the past as well. For the non-coding RNAs
>> tRNA-scan is very accurate while snoscan seems to be quite sensitive but
>> not very specific. Did you give it a "snoscan_meth" file? Giving it
>> a snoscan_meth file will help with accuracy. The biggest gains in accuracy
>> are from small RNA-seq data. In the paper where we used snoscan on maize we
>> didn't keep any snoRNA predictions that didn't have support from small
>> RNA-seq data; in practical terms we got rid of anything with an AED of 1.
>>
>> I hope this helps,
>> Mike
>>
>> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>>
>> Hello,
>>
>> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and
>> ran the annotation for a yeast (S. cerevisiae) genome. I think the
>> annotation went well with regard to tRNAs and protein-coding genes but I am
>> not sure about snoRNAs. I found that multiple overlapping snoRNA genes were
>> annotated by maker, as the example below shows. I was wondering if this is
>> expected. If not, what might have caused this problem, and is there a way to
>> work around it? Thanks in advance!
>>
>> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
>> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
>> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
>> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
>> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
>> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
>> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
>> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
>> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
>> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
>> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
>> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
>> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
>> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
>> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>>
>> Best,
>> Jia-Xing
>>
>>
>> --
>> Jia-Xing Yue
>>
>> Population Genomics and Complex Traits Group
>> Tour Pasteur 8eme etage
>> Faculté de Médecine
>> Institute for Research on Cancer and Aging, Nice (IRCAN)
>> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
>> 28 Avenue de Valombrose
>> 06107 NICE Cedex 2
>> France
>>
>> Personal website: http://www.iamphioxus.org/
>>
>>
>>
>>
>
>
> --
> Jia-Xing Yue
>
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
>
> Personal website: http://www.iamphioxus.org/
>
>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Fri May 26 07:54:44 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Fri, 26 May 2017 09:54:44 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
Message-ID: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>

Hi Jia-Xing,

v2.31.9 may not have had that option. I know that it is in the v3.00.0 version, so your best option may be to update.
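
Independently of that option, the filtering suggested earlier in this thread (dropping snoRNA calls with an AED of 1) can be applied to the existing GFF3 output. This is a minimal Python sketch, not a MAKER tool. The function names are made up, and it assumes the layout seen in the records posted in this thread: snoRNA features carry an _AED attribute, and every feature has a single Parent.

```python
def parse_attributes(column9):
    """Turn 'ID=x;Parent=y;_AED=1.00' into a dict."""
    pairs = (item.split("=", 1) for item in column9.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def drop_unsupported_snornas(gff_lines, aed_cutoff=1.0):
    """Remove snoRNAs with _AED >= aed_cutoff, plus their genes and exons.

    Assumes each feature has a single Parent, as in the records in this thread.
    A snoRNA without an _AED attribute is treated as supported and kept.
    """
    records = []
    for line in gff_lines:
        columns = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(columns) != 9:
            records.append((line, None, {}))  # comments etc. pass through
        else:
            records.append((line, columns[2], parse_attributes(columns[8])))

    dropped_ids, kept_genes, suspect_genes = set(), set(), set()
    for _, feature_type, attrs in records:
        if feature_type != "snoRNA":
            continue
        if float(attrs.get("_AED", 0.0)) >= aed_cutoff:
            dropped_ids.add(attrs.get("ID"))
            suspect_genes.add(attrs.get("Parent"))
        else:
            kept_genes.add(attrs.get("Parent"))
    # A gene is dropped only if none of its snoRNAs survived.
    dropped_ids |= suspect_genes - kept_genes
    dropped_ids.discard(None)

    return [
        line
        for line, _, attrs in records
        if attrs.get("ID") not in dropped_ids
        and attrs.get("Parent") not in dropped_ids
    ]
```

Applied to the five overlapping chrIX calls posted earlier, all of which have _AED=1.00, this would remove every one of them, which matches the point that they have no evidence support.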

Thanks,
Mike
> On May 26, 2017, at 5:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
> 
> Hi Michael,
> 
> This is a follow-up for the snoscan issue. I found that the snoscan_meth option seems to have been removed from the current maker_opts.ctl template file (v2.31.9). This option used to be there according to this post (https://www.biostars.org/p/217240/). I manually specified this option in my maker_opts.ctl file but I don't think maker has correctly recognized this option:
> 
> 
> STATUS: Parsing control files...
> WARNING: Invalid option 'snoscan_meth' in control file maker_opts.ctl
> ...
> 
> Do you know if there is a way to work around this problem? Thanks!
> 
> Best,
> Jia-Xing
> 
>  
> 
> On Mon, May 15, 2017 at 7:14 PM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
> Hi Michael,
> 
> Many thanks for the information! I will specify the "snoscan_meth" file and give it another try then. I mainly want to use maker to annotate protein-coding genes and tRNAs. But it would be nice to have snoRNA reasonably annotated as well.
> Thanks again and have a great day!
> 
> Best,
> Jia-Xing
> 
> 
> On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <michael.s.campbell1 at gmail.com> wrote:
> Hi Jia-Xing,
> 
> That has been my experience in the past as well. For the non-coding RNAs tRNA-scan is very accurate while snoscan seems to be quite sensitive but not very specific. Did you give it a "snoscan_meth" file? Giving it a snoscan_meth file will help with accuracy. The biggest gains in accuracy are from small RNA-seq data. In the paper where we used snoscan on maize we didn't keep any snoRNA predictions that didn't have support from small RNA-seq data; in practical terms we got rid of anything with an AED of 1.
> 
> I hope this helps,
> Mike
>> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>> 
>> Hello,
>> 
>> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and ran the annotation for a yeast (S. cerevisiae) genome. I think the annotation went well with regard to tRNAs and protein-coding genes but I am not sure about snoRNAs. I found that multiple overlapping snoRNA genes were annotated by maker, as the example below shows. I was wondering if this is expected. If not, what might have caused this problem, and is there a way to work around it? Thanks in advance!
>> 
>> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
>> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
>> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
>> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
>> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
>> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
>> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
>> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
>> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
>> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
>> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
>> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
>> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
>> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
>> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>> 
>> Best,
>> Jia-Xing
>> 
>> 
>> -- 
>> Jia-Xing Yue 
>> 
>> Population Genomics and Complex Traits Group
>> Tour Pasteur 8eme etage
>> Faculté de Médecine
>> Institute for Research on Cancer and Aging, Nice (IRCAN)
>> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
>> 28 Avenue de Valombrose
>> 06107 NICE Cedex 2
>> France
>> 
>> Personal website: http://www.iamphioxus.org/
>> 
> 
> 
> 
> 
> -- 
> Jia-Xing Yue 
> 
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
> 
> Personal website: http://www.iamphioxus.org/
> 


From yuejiaxing at gmail.com  Fri May 26 08:20:03 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 16:20:03 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
	<6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
Message-ID: <CAETgyPT_K8DDaujyUVYaRhaoKZaCT4MSrAi4DPczZHSkaTpDsQ@mail.gmail.com>

I see. Thanks Michael!

Best,
Jia-Xing

On Fri, May 26, 2017 at 3:54 PM, Michael Campbell <
michael.s.campbell1 at gmail.com> wrote:

> Hi Jia-Xing,
>
> v2.31.9 may not have had that option. I know that it is in the v3.00.0
> version, so your best option may be to update.
>
> Thanks,
> Mike
>
> On May 26, 2017, at 5:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>
> Hi Michael,
>
> This is a follow-up for the snoscan issue. I found that the snoscan_meth option
> seems to have been removed from the current maker_opts.ctl template file
> (v2.31.9). This option used to be there according to this post (
> https://www.biostars.org/p/217240/). I manually specified this option in
> my maker_opts.ctl file but I don't think maker has correctly recognized
> this option:
>
>
> STATUS: Parsing control files...
> WARNING: Invalid option 'snoscan_meth' in control file maker_opts.ctl
> ...
>
> Do you know if there is a way to work around this problem? Thanks!
>
> Best,
> Jia-Xing
>
>
>
> On Mon, May 15, 2017 at 7:14 PM, Jia-Xing Yue <yuejiaxing at gmail.com>
> wrote:
>
>> Hi Michael,
>>
>> Many thanks for the information! I will specify the "snoscan_meth" file
>> and give it another try then. I mainly want to use maker to annotate
>> protein-coding genes and tRNAs. But it would be nice to have snoRNA
>> reasonably annotated as well.
>> Thanks again and have a great day!
>>
>> Best,
>> Jia-Xing
>>
>>
>> On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <
>> michael.s.campbell1 at gmail.com> wrote:
>>
>>> Hi Jia-Xing,
>>>
>>> That has been my experience in the past as well. For the non-coding RNAs
>>> tRNA-scan is very accurate while snoscan seems to be quite sensitive but
>>> not very specific. Did you give it a "snoscan_meth" file? Giving it
>>> a snoscan_meth file will help with accuracy. The biggest gains in accuracy
>>> are from small RNA-seq data. In the paper where we used snoscan on maize we
>>> didn't keep any snoRNA predictions that didn't have support from small
>>> RNA-seq data; in practical terms we got rid of anything with an AED of 1.
>>>
>>> I hope this helps,
>>> Mike
>>>
>>> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and
>>> ran the annotation for a yeast (S. cerevisiae) genome. I think the
>>> annotation went well with regard to tRNAs and protein-coding genes but I am
>>> not sure about snoRNAs. I found that multiple overlapping snoRNA genes were
>>> annotated by maker, as the example below shows. I was wondering if this is
>>> expected. If not, what might have caused this problem, and is there a way to
>>> work around it? Thanks in advance!
>>>
>>> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
>>> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
>>> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
>>> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
>>> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
>>> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
>>> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
>>> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
>>> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
>>> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
>>> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
>>> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
>>> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
>>> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
>>> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>>>
>>> Best,
>>> Jia-Xing
>>>
>>>
>>> --
>>> Jia-Xing Yue
>>>
>>> Population Genomics and Complex Traits Group
>>> Tour Pasteur 8eme etage
>>> Faculté de Médecine
>>> Institute for Research on Cancer and Aging, Nice (IRCAN)
>>> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
>>> 28 Avenue de Valombrose
>>> 06107 NICE Cedex 2
>>> France
>>>
>>> Personal website: http://www.iamphioxus.org/
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Jia-Xing Yue
>>
>> Population Genomics and Complex Traits Group
>> Tour Pasteur 8eme etage
>> Faculté de Médecine
>> Institute for Research on Cancer and Aging, Nice (IRCAN)
>> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
>> 28 Avenue de Valombrose
>> 06107 NICE Cedex 2
>> France
>>
>> Personal website: http://www.iamphioxus.org/
>>
>>
>
>
> --
> Jia-Xing Yue
>
> Population Genomics and Complex Traits Group
> Tour Pasteur 8eme etage
> Faculté de Médecine
> Institute for Research on Cancer and Aging, Nice (IRCAN)
> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
> 28 Avenue de Valombrose
> 06107 NICE Cedex 2
> France
>
> Personal website: http://www.iamphioxus.org/
>
>
>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From dcg at cau.edu.cn  Mon May  1 07:32:30 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Mon, 1 May 2017 21:32:30 +0800
Subject: [maker-devel] Why my maker get no results?
Message-ID: <2017050121323023791817@cau.edu.cn>

Dear sir:
    I have been working on genome annotation these days. My process is as below:
    
    1. I split my contigs into 300 parts and process them simultaneously to speed things up.
    2. I used my split genome, protein, ESTs and RNA-seq to make the first alignment (est2genome=1, AED_threshold=0.2).
    3. Merge the maker.*_.master_datastore_index.log files to get all the paths of the results.
    4. Run the gff_merge script to merge all the results in the different dirs.

    However, no results were returned. (My genome is about 3GB, but the resulting gff is essentially empty.)
    index_all.log.all.gff    1KB
    index_all.log.all.maker.proteins.fasta    2837KB
    index_all.log.all.maker.transcripts.fasta    9866KB


    Where can the problems take place?
    Thanks!
Yours sincerely.

Chao Chao


dcg at cau.edu.cn

From carsonhh at gmail.com  Mon May  1 14:04:36 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 1 May 2017 14:04:36 -0600
Subject: [maker-devel] Why my maker get no results?
In-Reply-To: <2017050121323023791817@cau.edu.cn>
References: <2017050121323023791817@cau.edu.cn>
Message-ID: <28772B4F-D674-49E2-BFBD-CE2651CE0454@gmail.com>

You can't merge datastore indexes that way. You will need to run them separately (i.e. with the location and content unmodified from what MAKER gave you), and then merge the fasta and gff3 files afterwards.

--Carson
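
As an illustration of the last step in the reply above (merging the per-chunk GFF3 files once each chunk has been processed on its own), here is a minimal Python sketch. It is not part of MAKER and is no substitute for MAKER's own merge scripts; the function name merge_gff3 and the sample records are hypothetical. It assumes each input is an ordinary GFF3 text, keeps one "##gff-version 3" header, and drops any embedded "##FASTA" section.

```python
from typing import Iterable, Iterator


def merge_gff3(texts: Iterable[str]) -> Iterator[str]:
    """Yield the lines of one combined GFF3 document built from several GFF3 texts.

    A single '##gff-version 3' header is emitted, blank lines are skipped, and
    anything from a '##FASTA' directive onward is dropped (sequence data would
    need separate handling).
    """
    yield "##gff-version 3"
    for text in texts:
        for line in text.splitlines():
            if line.startswith("##FASTA"):
                break  # the rest of this input is sequence, not features
            if not line.strip() or line.startswith("##gff-version"):
                continue
            yield line


if __name__ == "__main__":
    # Two tiny, made-up per-chunk results.
    part_a = "##gff-version 3\ncontig1\tmaker\tgene\t1\t900\t.\t+\t.\tID=g1\n"
    part_b = (
        "##gff-version 3\n"
        "contig2\tmaker\tgene\t5\t400\t.\t-\t.\tID=g2\n"
        "##FASTA\n>contig2\nACGT\n"
    )
    print("\n".join(merge_gff3([part_a, part_b])))
```

A merged file that is only about 1KB for a 3GB genome would contain little more than the header line, which is a quick sign that the per-chunk results were never picked up.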


> On May 1, 2017, at 7:32 AM, dcg at cau.edu.cn wrote:
> 
> Dear sir:
>     I have been working on genome annotation these days. My process is as below:
>     
>     1. I split my contigs into 300 parts and process them simultaneously to speed things up. 
>     2. I used my split genome, proteins, ESTs and RNA-seq to make the first alignment (est2genome=1, AED_threshold=0.2). 
>     3. I merged the maker.*_.master_datastore_index.log files to get all the result paths.
>     4. I ran the gff_merge script to merge all the results from the different dirs.
> 
>     However, no results were returned. (My genome is about 3GB, but the resulting gff is nearly empty.)
>     index_all.log.all.gff    1KB
>     index_all.log.all.maker.proteins.fasta    2837KB
>     index_all.log.all.maker.transcripts.fasta    9866KB
> 
>     Where could the problem be?
>     Thanks!
> Yours sincerely.
> 
> Chao Chao
> dcg at cau.edu.cn

From jim03ljy at 126.com  Wed May  3 06:15:22 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Wed, 3 May 2017 20:15:22 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
Message-ID: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>

Hi, I'm new to MAKER.
I ran into an error at the RepeatMasker step.

The error is:

NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!

I tried both NCBI BLAST+ 2.5.0 and 2.6.0 as the path to blast; both give the same error.

When I run "maker -R", which skips the RepeatMasker step, MAKER works.

I checked similar errors reported earlier by another user, who solved the problem by updating RepBase.

So I deleted and re-installed RepeatMasker, updated RepBase, and also installed RMBlast.

The error is the same.

I'm stuck on this problem now.

Would highly appreciate any help - thanks!


Jinyuan Lu
Shanghai Jiao Tong University
No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China

From dcg at cau.edu.cn  Wed May  3 09:29:18 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Wed, 3 May 2017 23:29:18 +0800
Subject: [maker-devel] How to explain the maker results?
Message-ID: <2017050323291810262239@cau.edu.cn>

Dear sir:
    I've been using MAKER to do my genome annotation. However, there are still some things I don't understand:

    1. After assembly, I have many contigs. First, I set est2genome=1 and protein2genome=1, with my proteins, ESTs and RNA-seq. Which of the approaches below is correct?
    1.1 Each contig has its own gff. I use each contig's own maker_gff file to build a pyu.hmm (for SNAP), and then train on that single contig.
    1.2 I merge all the maker_gff files to produce one pyu.hmm (for SNAP), and then use this pyu.hmm for all the contigs.
    
    2. The aim of my project is to find new proteins, so I need to guarantee the rigor of my annotation.
        My plan was that a predicted protein should align to UniProt (reviewed proteins, about 30K in total) with 100% identity and coverage.
        However, if I choose method 1.2 above:
        After the first step (est2genome=1 and protein2genome=1), about 1600 proteins align to UniProt at 100%. After 2 rounds of training (est2genome=0 and protein2genome=0), fewer proteins align at 100%.
        Is my test method reasonable? Why don't the final results contain more well-aligned proteins?
        After training and fasta_merge, the results are index_all.log.all.maker.proteins.fasta, index_all.log.all.maker.snap_masked.proteins.fasta and index_all.log.all.maker.non_overlapping_ab_initio.proteins.fasta. Which is the final result?

    
     I'm looking forward to hearing from you. Thanks!
Yours sincerely!

     
Chao Chao


dcg at cau.edu.cn

From d.ence at ufl.edu  Wed May  3 09:49:08 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 15:49:08 +0000
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <B69E8792-8153-4840-AC3C-D2E6CCFF3A45@mail.ufl.edu>

Hi, the error is regarding a specific file (is.lib) which isn't being found. Can you verify that the file is there after you updated RepBase? Use the command:
"ls -l /home/softwares/RepeatMasker/Libraries/20170127/general/is.lib"

Thanks,
Daniel Ence
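
The same existence check can be scripted, which may help when several library files need to be verified at once. This is only a small sketch using the Python standard library; the helper name describe_library is hypothetical, and the path is the one quoted in the error message.

```python
from pathlib import Path


def describe_library(path: str) -> str:
    """Report whether a RepeatMasker library file exists and how large it is."""
    p = Path(path)
    if not p.is_file():
        return f"missing: {p}"
    return f"found: {p} ({p.stat().st_size} bytes)"


if __name__ == "__main__":
    # Path taken from the error message in this thread.
    print(describe_library(
        "/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib"
    ))
```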


On May 3, 2017, at 8:15 AM, ???Jim <jim03ljy at 126.com<mailto:jim03ljy at 126.com>> wrote:

Hi, I'm a newbie of maker.
I met some errors in Repeatmasker step.

The error is here:
NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!

I tried ncbi+blast 2.5.0 version and 2.6.0 version as the path to blast, both have the same error.
And when I use the command as "maker -R", which skips the repeatmasker step, the maker could work.
I checked the former similar errors reported by another user and he solved the problem by  updating the RepBase.
So,
  I deleted and re-installed the RepeatMasker, updated the RepBase,  also installed RMblast.
The error is the same.

I'm stuck in the problem now.
Would highly appreciate any help - thanks!

Jinyuan Lu
Shanghai Jiao Tong University
No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com<mailto:maker-devel at box290.bluehost.com>
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From carsonhh at gmail.com  Wed May  3 09:53:40 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:53:40 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
Message-ID: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>

RepBase and RepeatMasker changed structure with the new 4.0.7 release two months ago. The new version of RepBase is only compatible with the new version of RepeatMasker. You have to update both (complete reinstall), or use the previous version of RepeatMasker with the previous version of RepBase.

--Carson


> On May 3, 2017, at 6:15 AM, ???Jim <jim03ljy at 126.com> wrote:
> 
> Hi, I'm a newbie of maker.
> I met some errors in Repeatmasker step.
> 
> The error is here:
> NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!
> 
> I tried ncbi+blast 2.5.0 version and 2.6.0 version as the path to blast, both have the same error.
> And when I use the command as "maker -R", which skips the repeatmasker step, the maker could work.
> I checked the former similar errors reported by another user and he solved the problem by  updating the RepBase.
> So,
>   I deleted and re-installed the RepeatMasker, updated the RepBase,  also installed RMblast.
> The error is the same.
> 
> I'm stuck in the problem now.
> Would highly appreciate any help - thanks!
> 
> Jinyuan Lu
> Shanghai Jiao Tong University
> No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From carsonhh at gmail.com  Wed May  3 09:55:41 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 09:55:41 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
Message-ID: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>

You may want to use the previous version of both as the new version may still have hidden bugs.

--Carson

> On May 3, 2017, at 9:53 AM, Carson Holt <carsonhh at gmail.com> wrote:
> 
> RepBase and RepeatMasker have changed structure with the new 4.0.7 released two months ago. The new version and RepBase is only compatible with the new version of RepeatMasker. You have to update both (complete reinstall). Or you have to use the previous version of RepeatMasker with the previous version of RepBase.
> 
> ?Carson
> 
> 
> 
>> On May 3, 2017, at 6:15 AM, ???Jim <jim03ljy at 126.com <mailto:jim03ljy at 126.com>> wrote:
>> 
>> Hi, I'm a newbie of maker.
>> I met some errors in Repeatmasker step.
>> 
>> The error is here:
>> NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!
>> 
>> I tried ncbi+blast 2.5.0 version and 2.6.0 version as the path to blast, both have the same error.
>> And when I use the command as "maker -R", which skips the repeatmasker step, the maker could work.
>> I checked the former similar errors reported by another user and he solved the problem by  updating the RepBase.
>> So,
>>   I deleted and re-installed the RepeatMasker, updated the RepBase,  also installed RMblast.
>> The error is the same.
>> 
>> I'm stuck in the problem now.
>> Would highly appreciate any help - thanks!
>> 
>> Jinyuan Lu
>> Shanghai Jiao Tong University
>> No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com <mailto:maker-devel at box290.bluehost.com>
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
> 


From carsonhh at gmail.com  Wed May  3 10:04:20 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:04:20 -0600
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
Message-ID: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>

You may have to contact RepBase via e-mail to find out how to get the libraries compatible with RepeatMasker 4.0.6, as it looks like they have removed the previous release from the website.

The last release for 4.0.6 was -> repeatmaskerlibraries-20160829.tar.gz

--Carson


> On May 3, 2017, at 9:55 AM, Carson Holt <carsonhh at gmail.com> wrote:
> 
> You may want to use the previous version of both as the new version may still have hidden bugs.
> 
> ?Carson
> 
>> On May 3, 2017, at 9:53 AM, Carson Holt <carsonhh at gmail.com <mailto:carsonhh at gmail.com>> wrote:
>> 
>> RepBase and RepeatMasker have changed structure with the new 4.0.7 released two months ago. The new version and RepBase is only compatible with the new version of RepeatMasker. You have to update both (complete reinstall). Or you have to use the previous version of RepeatMasker with the previous version of RepBase.
>> 
>> ?Carson
>> 
>> 
>> 
>>> On May 3, 2017, at 6:15 AM, ???Jim <jim03ljy at 126.com <mailto:jim03ljy at 126.com>> wrote:
>>> 
>>> Hi, I'm a newbie of maker.
>>> I met some errors in Repeatmasker step.
>>> 
>>> The error is here:
>>> NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!
>>> 
>>> I tried ncbi+blast 2.5.0 version and 2.6.0 version as the path to blast, both have the same error.
>>> And when I use the command as "maker -R", which skips the repeatmasker step, the maker could work.
>>> I checked the former similar errors reported by another user and he solved the problem by  updating the RepBase.
>>> So,
>>>   I deleted and re-installed the RepeatMasker, updated the RepBase,  also installed RMblast.
>>> The error is the same.
>>> 
>>> I'm stuck in the problem now.
>>> Would highly appreciate any help - thanks!
>>> 
>>> Jinyuan Lu
>>> Shanghai Jiao Tong University
>>> No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com <mailto:maker-devel at box290.bluehost.com>
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org <http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org>
>> 
> 


From carsonhh at gmail.com  Wed May  3 10:10:48 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:10:48 -0600
Subject: [maker-devel] How to explain the maker results?
In-Reply-To: <2017050323291810262239@cau.edu.cn>
References: <2017050323291810262239@cau.edu.cn>
Message-ID: <049F8AC8-7E16-4F05-B8B2-01CA7AB88751@gmail.com>

Use the merged gff3 to train snap, otherwise you won't have enough models.

Info on training can be found on the wiki -> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_GMOD_Online_Training_2014#Training_ab_initio_Gene_Predictors

Also you can find additional detailed info by searching the mailing list archives -> http://groups.google.com/group/maker-devel

I'm not sure what you are asking with the last question. Alignment is not a function of training, and will not be affected by the hmm, but 100% coverage and identity is too strict a threshold even for data derived from the same species.

--Carson


> On May 3, 2017, at 9:29 AM, dcg at cau.edu.cn wrote:
> 
> Dear sir:
>     I?ve been using maker to do my genome annotation. However, I still have something I can't understand:
> 
>     1. After assembly, I have many contigs. Firstly, I set est2genome=1 and protein2genome=1 , with my proteins, ESTs and RNA-seq.. Which way below is correct?
>     1.1 Each contig has its own gff. I just use its own maker_gff file to get a pyu.hmm(be used in snap practice), and then, train the single contig.
>     1.2 I merge all the maker_gff to produce a pyu.hmm(for snap) , and then, use this pyu.hmm to train all the contigs.
>     
>     2. The aim of my project is to find new protein, so I need to guarantee the rigor of my annotation.
>         I  made a plan that the predicted protein should be successfully aligned to the Uniprot(reviewed protein, total number is about 30K) with 100% identity and coverage.
>         However, if I choose method 1.2 as above:
>         After the first step (est2genome=1 and protein2genome=1), about 1600 proteins can be 100% aligned to the Uniprot. After 2 rounds training(est2genome=0 and protein2genome=0), less proteins can be 100% aligned.
>         Is my test method reasonable? Why the final results can't get more well aligned proteins?
>         After training and fasta_merge, the results can be index_all.log.all.maker.proteins.fasta, index_all.log.all.maker.snap_masked.proteins.fasta, index_all.log.all.maker.non_overlapping_ab_initio.proteins.fasta,  which is the final results?
> 
>     
>      I'm looking forward to hearing from you. Thanks!
> Yours sincerely!
> 
>      
> Chao Chao
> dcg at cau.edu.cn

From nathan.ricks at gmail.com  Wed May  3 10:19:57 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Wed, 3 May 2017 10:19:57 -0600
Subject: [maker-devel] Post Processing of Annotations
Message-ID: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>

Hi,
I've been running your MAKER pipeline, and I've reached the Post Processing of
Annotations portion. In your online training you use the output.blastp and
the output.iprscan files to help assign function.
My question is: what format do these files need to be in?
Iprscan can produce files in a variety of formats: tsv, xml, gff3, html and
SVG,
while blastp can produce tabular, pairwise, xml and a number of others.

Thanks

Nathan Ricks

From carsonhh at gmail.com  Wed May  3 10:30:03 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 10:30:03 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>

Use blastp with the tab-delimited format option and the UniProt/Swiss-Prot database. What additional filters you choose to set (e.g. an e-value limit) may vary, although I would recommend 1e-6 or lower.

--Carson
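
To make the e-value cutoff in the reply above concrete, here is a minimal Python sketch of a post-hoc filter. It assumes NCBI BLAST+ tabular output (-outfmt 6) with the default 12 columns, where the e-value is the 11th column; the function name filter_hits and the sample rows are hypothetical, and the same limit could instead be applied when running blastp.

```python
import csv
import io
from typing import Iterator, List, TextIO

# 0-based index of 'evalue' in the default 12-column tabular layout:
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
EVALUE_COLUMN = 10


def filter_hits(handle: TextIO, max_evalue: float = 1e-6) -> Iterator[List[str]]:
    """Yield tabular BLAST rows whose e-value is at or below max_evalue."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[EVALUE_COLUMN]) <= max_evalue:
            yield row


if __name__ == "__main__":
    # Two made-up hits: one strong, one weak.
    sample = (
        "q1\tsp|P1|A_X\t95.0\t100\t5\t0\t1\t100\t1\t100\t2e-30\t200\n"
        "q2\tsp|P2|B_X\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t25\n"
    )
    for hit in filter_hits(io.StringIO(sample)):
        print("\t".join(hit))
```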

> On May 3, 2017, at 10:19 AM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> Hi,
> I've been running your Maker pipeline, and I've reached Post Processing of Annotations portion. In your Online training you use the output.blastp and the outuput.iprscan files to help assign function.
> My question is what format do these files need to be in.
> Iprscan can produce files in a variety of formats: tsv, xml, gff3, html and SVG
> while blastp can produce the tabular, pairise, xml and a number of others.
> 
> Thanks
> 
> Nathan Ricks
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From d.ence at ufl.edu  Wed May  3 10:34:35 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Wed, 3 May 2017 16:34:35 +0000
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
Message-ID: <F9D83F09-F94E-473D-911B-6626CDC8E639@mail.ufl.edu>

Hi, the iprscan output should be in tsv format, which is tab-separated, and the usage statement for maker_functional_gff says that the blastp output should be in "wu-blast -mformat 2", which I think is tabbed too.

~Daniel

 
> On May 3, 2017, at 12:19 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> Hi,
> I've been running your Maker pipeline, and I've reached Post Processing of Annotations portion. In your Online training you use the output.blastp and the outuput.iprscan files to help assign function.
> My question is what format do these files need to be in.
> Iprscan can produce files in a variety of formats: tsv, xml, gff3, html and SVG
> while blastp can produce the tabular, pairise, xml and a number of others.
> 
> Thanks
> 
> Nathan Ricks
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From carsonhh at gmail.com  Wed May  3 13:20:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 3 May 2017 13:20:31 -0600
Subject: [maker-devel] Post Processing of Annotations
In-Reply-To: <CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
References: <CAJfcix0=bU1QmOX+qhfGO_YGbSARpxPHpuOsfGv9jAp8ZpyjNg@mail.gmail.com>
	<DC72D273-A00B-4018-8C36-E124CAC8D606@gmail.com>
	<CAJfcix1wnd24e8CZs5KT2j7axJOO-yhSMswJBozCSFqbo7ds3A@mail.gmail.com>
Message-ID: <0EE0D7F6-5F28-46E7-9AB8-CED93DC811F6@gmail.com>

The maker_functional_gff and maker_functional_fasta scripts pull specific fields out of the UniProt fasta header, so they are tied to the format used by UniProt/Swiss-Prot. At one time I had modified them to also work with NR, but that was several years ago, so I don't know if it would still work.

--Carson
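
To show why a header format matters here, the following Python sketch parses a UniProt-style FASTA header into its fields. It is an illustration only and does not reproduce what the MAKER scripts actually do; the function name and regular expressions are hypothetical, and the layout assumed is the published UniProt one (">db|accession|entry_name description OS=... GN=... PE=... SV=...").

```python
import re
from typing import Dict, Optional

# Leading part of the header: database, accession, entry name, free-text description.
HEADER_RE = re.compile(
    r"^>?(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry>\S+)\s+"
    r"(?P<description>.*?)(?=\s+[A-Z]{2}=|$)"
)
# Two-letter KEY=value fields such as OS=, OX=, GN=, PE=, SV=.
FIELD_RE = re.compile(r"\b([A-Z]{2})=(.*?)(?=\s+[A-Z]{2}=|$)")


def parse_uniprot_header(header: str) -> Optional[Dict[str, str]]:
    """Return the header fields as a dict, or None if the header is not UniProt-style."""
    header = header.strip()
    match = HEADER_RE.match(header)
    if match is None:
        return None
    record = match.groupdict()
    for key, value in FIELD_RE.findall(header):
        record[key] = value
    return record


if __name__ == "__main__":
    example = ">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha OS=Homo sapiens OX=9606 GN=HBA1 PE=1 SV=2"
    print(parse_uniprot_header(example))
    # A header in another style is rejected by this parser.
    print(parse_uniprot_header(">NP_000001.1 some protein [Homo sapiens]"))
```

A header downloaded from NCBI does not carry these fields in this layout, so a parser written along these lines would return nothing useful for it.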


> On May 3, 2017, at 1:10 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> Is it possible to make my own database from sequences that I have downloaded form NCBI instead of using the UniProt/Swiss-Prot?
> 
> On Wed, May 3, 2017 at 10:30 AM, Carson Holt <carsonhh at gmail.com <mailto:carsonhh at gmail.com>> wrote:
> Use blastp with the the tab delimited format option and the UniProt/Swiss-Prot database. What additional filters you choose to set (i.e. e-value limit) may vary, although I would recommend 1e-6 or lower.
> 
> ?Carson
> 
> > On May 3, 2017, at 10:19 AM, Nathan Ricks <nathan.ricks at gmail.com <mailto:nathan.ricks at gmail.com>> wrote:
> >
> > Hi,
> > I've been running your Maker pipeline, and I've reached Post Processing of Annotations portion. In your Online training you use the output.blastp and the outuput.iprscan files to help assign function.
> > My question is what format do these files need to be in.
> > Iprscan can produce files in a variety of formats: tsv, xml, gff3, html and SVG
> > while blastp can produce the tabular, pairise, xml and a number of others.
> >
> > Thanks
> >
> > Nathan Ricks
> > _______________________________________________
> > maker-devel mailing list
> > maker-devel at box290.bluehost.com <mailto:maker-devel at box290.bluehost.com>
> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org <http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org>
> 
> 


From mjfi2sb3 at gmail.com  Thu May  4 00:37:52 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Thu, 04 May 2017 06:37:52 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
Message-ID: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>

Hi,

I am attempting to annotate a plant genome. I have a couple of questions:

*1) RNA-seq assembly*
a) I assembled my RNA-seq data using Trinity and StringTie. The two produce
drastically different numbers. When I compare the two assemblies for each
sample using TransRate, StringTie produces a higher score for most of the
assemblies. I see in all of the threads that you recommend Trinity, but
doesn't Trinity produce way too many transcripts (even after chucking out
the "bad" ones using TransRate)?
b) During hint creation in MAKER, does it take into account that different
transcripts have different read coverage (expression levels)? I guess my
question is: should I filter out transcripts that have low read coverage?

*2) Repeat Masking *
I am following the advanced repeat library construction tutorial (
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced).
The initial steps find 15 sequences for the LTR and 159 for MITE. But, when
I get to the perl DIR_CRL/CRL_Step4.pl step, both output files
(Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.

a) are these numbers normal because I was expecting a lot more than 16 for
the LTR?
b) I don't get any errors when I run CRL_Step4.pl yet no output. What's
going on?!

Many thanks,
/SB
-- 

____________________________
Sent from Inbox Mobile

From jim03ljy at 126.com  Thu May  4 00:36:12 2017
From: jim03ljy at 126.com (=?GBK?B?wqy98NStSmlt?=)
Date: Thu, 4 May 2017 14:36:12 +0800 (CST)
Subject: [maker-devel] RepeatMasker: NCBIBlastSearchEngine::search: Error
In-Reply-To: <BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
References: <50d2a447.a7ce.15bce3c7dcb.Coremail.jim03ljy@126.com>
	<92771056-564B-4953-B738-5A1B97FC71AF@gmail.com>
	<2E668C83-8884-430B-A764-CD0B44D03D19@gmail.com>
	<BC60F425-2156-457F-B68C-576ABF778840@gmail.com>
Message-ID: <12de56ed.5ded.15bd22c56c7.Coremail.jim03ljy@126.com>

Thanks a lot!
Problem solved. 
I matched the RepeatMasker 4.0.7 with RepBase20170127 and it worked!


Thanks!
   ----Jinyuan Lu


At 2017-05-04 00:04:20, "Carson Holt" <carsonhh at gmail.com> wrote:
You may have to contact RepBase via e-mail to find out how to get the libraries compatible with RepeatMasker 4.0.6 as it looks like they have removed the previous release from the website.


The last release for 4.0.6 was -> repeatmaskerlibraries-20160829.tar.gz


?Carson


On May 3, 2017, at 9:55 AM, Carson Holt <carsonhh at gmail.com> wrote:


You may want to use the previous version of both as the new version may still have hidden bugs.


?Carson


On May 3, 2017, at 9:53 AM, Carson Holt <carsonhh at gmail.com> wrote:


RepBase and RepeatMasker have changed structure with the new 4.0.7 released two months ago. The new version and RepBase is only compatible with the new version of RepeatMasker. You have to update both (complete reinstall). Or you have to use the previous version of RepeatMasker with the previous version of RepBase.


?Carson


On May 3, 2017, at 6:15 AM, ???Jim <jim03ljy at 126.com> wrote:


Hi, I'm a newbie of maker.
I met some errors in Repeatmasker step.


The error is here:
NCBIBlastSearchEngine::search: Error...compressed subject database (/home/softwares/RepeatMasker/Libraries/20170127/general/is.lib) does not exist!


I tried ncbi+blast 2.5.0 version and 2.6.0 version as the path to blast, both have the same error.
And when I use the command as "maker -R", which skips the repeatmasker step, the maker could work.
I checked the former similar errors reported by another user and he solved the problem by  updating the RepBase.
So,
  I deleted and re-installed the RepeatMasker, updated the RepBase,  also installed RMblast.
The error is the same.


I'm stuck in the problem now.
Would highly appreciate any help - thanks!


Jinyuan Lu
Shanghai Jiao Tong University
No. 800 Dong Chuan Road,Minhang District,  Shanghai, P.R. China
_______________________________________________
maker-devel mailing list
maker-devel at box290.bluehost.com
http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org



From dcg at cau.edu.cn  Fri May  5 07:43:43 2017
From: dcg at cau.edu.cn (dcg at cau.edu.cn)
Date: Fri, 5 May 2017 21:43:43 +0800
Subject: [maker-devel] How to evaluate maker proteins' quality?
Message-ID: <2017050521434331108720@cau.edu.cn>

Dear sir:
    Now that my MAKER run has finished, I should check the quality of my results.
    The purpose of my annotation is to find some new proteins.
    There are about 30K reviewed proteins for my species. If I want to see how many predicted proteins support the reviewed proteins, how should I do it? (Is blastp OK? How should I set the threshold?)
    I used UniProt, ESTs and RNA-seq to do my annotation. From my perspective, if a protein is reviewed and was used to train snap/augustus, we should get the same one back after several training rounds. So I planned to align the maker_proteins to the UniProt proteins (which I used for annotation). If a predicted protein matches UniProt with 100% identity and coverage, it can be considered to support the reviewed protein. Is this correct?
    
    If not, maybe I can evaluate my proteins only by AED value and protein domains?

    I'm looking forward to your help. Thanks a lot!

Yours sincerely!

Chao Chao


dcg at cau.edu.cn

From carsonhh at gmail.com  Sun May  7 18:31:31 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 18:31:31 -0600
Subject: [maker-devel] How to evaluate maker proteins' quality?
In-Reply-To: <2017050521434331108720@cau.edu.cn>
References: <2017050521434331108720@cau.edu.cn>
Message-ID: <51620CC3-43D9-47D5-B8B3-871F291D6518@gmail.com>

Because of small differences between assemblies, individual variants, annotated reference proteins that are only partial, and potential assembly errors, expecting 100% identity is too strict. About 90+% would be more reasonable for a same-species comparison. AED correlates well with protein confidence. A perfect score of zero will not happen often, though, since the way alignment algorithms work leaves alignment errors around splice sites and short exons. The evidence used is also never perfect, so with AED lower values are better than higher values, but it cannot be used as an overly specific measurement (it is only correlative, not exact).

--Carson


> On May 5, 2017, at 7:43 AM, dcg at cau.edu.cn wrote:
> 
> Dear sir:
>     After I finished my maker running, I should check the quality of my results.
>     My annotation purpose is to find some new proteins.
>     There is about 30K reviewed proteins of my species. If I want to see how many predicted proteins can support the reviewed proteins, how to do it?(Can blastp be OK? How to set the threshold? )
>     I used Uniprot, ESTs and RNA-seq to do my annotation. From my perspective, if the protein is reviewed and used to train snap/augustus, we should get the same one after several training rounds. So I planned to align maker_proteins to Uniprot proteins(which I utilized to annotate). If the predicted proteins match Uniprot by 100% identity and coverage, they can be thought to support the reviewed proteins. Is it correct?    
>     
>     If not, maybe I can evaluate my proteins only by AED value and proteome domain?
> 
>     I'm looking forward to your help. Thanks a lot!
> 
> Yours sincerely!
> 
> Chao Chao
> dcg at cau.edu.cn
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From carsonhh at gmail.com  Sun May  7 19:17:37 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 May 2017 19:17:37 -0600
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
Message-ID: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>

Michael, can you answer the second question? (Michael wrote the protocol, so I CC'd him.)

Regarding the first question: expression level is not necessarily relevant to the annotation process, so no, MAKER does not look at read coverage. Instead, we use the transcript assemblies to identify introns via splice-aware alignment (yes, it is the introns and not the exons we care about). Trinity has a nice option called jaccard_clip, which avoids false merging of neighboring transcripts (this mostly occurs in fungi, where UTRs can overlap). Merged transcripts cause extra introns to be assigned as hints, and can cause overextension of UTRs during the final polishing steps. The jaccard_clip option is the main reason we recommend Trinity. If StringTie has a similar option, then it can be used as well.

Thanks,
Carson


> On May 4, 2017, at 12:37 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi,
> 
> I am attempting to annotate a plant genome. I have a couple of questions:
> 
> 1) RNA-seq assembly
> a) I assembled my RNA-seq data using Trinity and StringTie. The two produce drastically different numbers of transcripts. When I compare the two assemblies for each sample using TransRate, StringTie produces a higher score for most of the assemblies. I see in all of the threads that you recommend Trinity, but doesn't Trinity produce way too many transcripts (even after chucking out the "bad" ones using TransRate)?
> b) During hint creation, does MAKER take into account that different transcripts have different read coverage (expression levels)? I guess my question is: should I filter out transcripts that have low read coverage?
> 
> 2) Repeat Masking 
> I am following the advanced repeat library construction tutorial (http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Advanced). The initial steps find 15 sequences for the LTR and 159 for MITE. But when I get to the perl DIR_CRL/CRL_Step4.pl step, both output files (Inner_Seq_For_BLAST.fasta, lLTRs_Seq_For_BLAST.fasta) are empty.
> 
> a) Are these numbers normal? I was expecting a lot more than 16 for the LTR.
> b) I don't get any errors when I run CRL_Step4.pl, yet there is no output. What's going on?!
> 
> Many thanks,
> /SB
> -- 
> ____________________________
> Sent from Inbox Mobile

From mcampbel at cshl.edu  Sun May  7 19:24:27 2017
From: mcampbel at cshl.edu (Campbell, Michael)
Date: Mon, 8 May 2017 01:24:27 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
Message-ID: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>

Hi SB,

I've added Ning Jiang to this email. She has put great effort into updating this protocol recently and will be able to address your questions better than I can.

Ning, would you mind helping out with this?

Thanks,
Mike


From jiangn at msu.edu  Mon May  8 09:50:45 2017
From: jiangn at msu.edu (Jiang, Ning)
Date: Mon, 8 May 2017 15:50:45 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>,
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
Message-ID: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>

Hi Salim,


I am sorry to hear about the issues. How many intact LTR elements you get depends on the quality of your genome assembly; however, 16 seems too low to me.


The inner and LTR sequence files should NOT be empty. Sometimes the problem is that the original sequence names are long and complicated. If that is the case for your sequences, you might want to simplify the sequence names (using only letters and numbers) and try again.
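
One way to simplify the names is to replace every FASTA header with a short sequential ID and keep a lookup table so the original names can be restored later. The sketch below is only an illustration: the function name and the `seq` prefix are made up for this example and are not part of the CRL scripts.

```python
def simplify_fasta_names(lines, prefix="seq"):
    """Rename FASTA headers to prefix1, prefix2, ... (letters and digits only).

    Returns (new_lines, mapping), where mapping maps each new ID to the
    original header text so the renaming can be reversed afterwards.
    """
    mapping = {}
    new_lines = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            original = line[1:].strip()
            new_id = f"{prefix}{len(mapping) + 1}"
            mapping[new_id] = original
            new_lines.append(">" + new_id)
        else:
            new_lines.append(line)  # sequence lines are left untouched
    return new_lines, mapping


if __name__ == "__main__":
    example = [">scaffold_1|len=500 some description", "ACGTACGT"]
    renamed, table = simplify_fasta_names(example)
    print("\n".join(renamed))
    print(table)
```

Sequential IDs avoid the name collisions that can occur when special characters are simply deleted from two similar headers.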


We are working on an automatic pipeline for LTR collection. If everything goes smoothly, it should be available in two to three months.


Best wishes,


Ning


From mjfi2sb3 at gmail.com  Mon May  8 10:41:51 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Mon, 08 May 2017 16:41:51 +0000
Subject: [maker-devel] advanced repeat masking library constructions &
 rna-seq assembly choices
In-Reply-To: <DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
References: <CAJb_6LTFg7MCZa5_+uUhKDDKMeNqa51Kvq11aQCfECA8nxr1Tg@mail.gmail.com>
	<18086AF2-01C3-4671-B974-C5FF36460618@gmail.com>
	<076B034E-8107-49CE-90C7-277AA4AB4ED3@cshl.edu>
	<DM5PR1201MB0060CD7E9DFA122F7D1A127DCEEE0@DM5PR1201MB0060.namprd12.prod.outlook.com>
Message-ID: <CAJb_6LQFuL0WtMh1oM44Smdc0Rp0dGHeAo-gcRWXXv3cFgu8zg@mail.gmail.com>

Thank you all for your responses.

Regards,
/SB


From qwzhang0601 at gmail.com  Wed May 10 09:48:01 2017
From: qwzhang0601 at gmail.com (Quanwei Zhang)
Date: Wed, 10 May 2017 11:48:01 -0400
Subject: [maker-devel] want coding sequences
Message-ID: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>

Hello:

Thanks for developing and maintaining MAKER2. We have used
MAKER2 to annotate the genome of a new rodent species. We are now doing
downstream analysis, which requires coding sequences from
different species as input.

I found that the MAKER2 outputs only include protein sequences and
transcripts. Is there an easy way to get the coding sequences for
our annotated genome?

Many thanks

Best
Quanwei

From munholl at uwindsor.ca  Wed May 10 14:24:49 2017
From: munholl at uwindsor.ca (Seth Munholland)
Date: Wed, 10 May 2017 16:24:49 -0400
Subject: [maker-devel] MAKER only running 1 task
Message-ID: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>

Hello,

I'm running a MAKER annotation on an ubuntu cluster and my top screen shows
the following:

Tasks: 831 total,   3 running, 826 sleeping,   0 stopped,   1 zombie

and my maker run (on a screen) shows:

...
total clusters:4 now processing 0
 ...processing 0 of 3
 ...processing 1 of 3
 ...processing 2 of 3
total clusters:4 now processing 0
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files
merging blast reports...
flattening protein clusters
prepare section files

Of the 826, the vast majority of them are maker and only one of the running
tasks is maker.  Is this normal behaviour or has my maker run stopped
processing?

Seth Munholland, B.Sc.
Department of Biological Sciences
Rm. 304 Biology Building
University of Windsor
401 Sunset Ave. N9B 3P4
T: (519) 253-3000 Ext: 4755

From carsonhh at gmail.com  Thu May 11 09:58:59 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 09:58:59 -0600
Subject: [maker-devel] want coding sequences
In-Reply-To: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
References: <CAOW6FSJFhCP8-V7L9N6RLnnyvtoa3SvQZaZY9Phi0-stXe+vGQ@mail.gmail.com>
Message-ID: <82DF4E49-8E78-45F6-8A78-01A45F908987@gmail.com>

Use the fasta_tool utility with --trim_maker_utr to get just the CDS part of each transcript.
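
For readers who want to see what "just the CDS part" means in terms of the GFF3 output, here is a minimal generic sketch that joins the CDS features of each transcript from a GFF3 file and a genome held in memory. It is not how fasta_tool works internally and is no substitute for it; the function name and the in-memory `genome` dictionary are assumptions made for this example.

```python
from collections import defaultdict

# Translation table for complementing DNA (upper and lower case)
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def cds_sequences(gff_lines, genome):
    """Return {transcript_id: CDS sequence} built from GFF3 CDS features.

    gff_lines : iterable of GFF3 lines (tab-separated, 9 columns)
    genome    : dict mapping sequence ID -> sequence string
    """
    segments = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        seqid, start, end, strand = cols[0], int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # A CDS feature may list several parent transcripts, comma-separated
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                segments[parent].append((seqid, start, end, strand))

    result = {}
    for parent, segs in segments.items():
        segs.sort(key=lambda s: s[1])  # order segments by genomic start
        # GFF3 coordinates are 1-based and inclusive
        seq = "".join(genome[s[0]][s[1] - 1:s[2]] for s in segs)
        if segs[0][3] == "-":
            seq = seq.translate(COMPLEMENT)[::-1]  # reverse complement
        result[parent] = seq
    return result
```

The sketch assumes all CDS segments of a transcript lie on one sequence and one strand, which holds for ordinary gene models.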

--Carson


> On May 10, 2017, at 9:48 AM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
> 
> Hello:
> 
> Thanks for development and maintenance of the tool "Maker2".  We have used Maker2 to do genome annotation of a new rodent species. Now we are doing downstream analysis, which requires inputs of coding sequences from different species. 
> 
> I found the outputs I got from Maker2 only include protein sequences and transcripts. Is there an easy way that I can get the coding sequences for our annotated genome?
> 
> Many thanks
> 
> Best
> Quanwei


From carsonhh at gmail.com  Thu May 11 10:03:04 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 10:03:04 -0600
Subject: [maker-devel] MAKER only running 1 task
In-Reply-To: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
References: <CAL=sJwroONDyNp_Y2mbgs-E3DYh0qxZAcn3LkEtvdRW37jLLVA@mail.gmail.com>
Message-ID: <2119AF6E-3571-4D28-9D81-B24A513839E5@gmail.com>

It may be frozen, or, if it is on the last contig, it may be running a non-parallelizable step (the very last cluster-merging step for each contig cannot be parallelized). On a large contig that last step can take a while, and if there are no other contigs, there is no work to give the other processes in the meantime. Everyone has to wait so they can all exit together once the last step is done. But as I said, this only happens if you are on the last contig and it is large. Otherwise it is probably frozen somehow (look for errors further up the log).

--Carson


> On May 10, 2017, at 2:24 PM, Seth Munholland <munholl at uwindsor.ca> wrote:
> 
> Hello,
> 
> I'm running a MAKER annotation on an ubuntu cluster and my top screen shows the following:
> 
> Tasks: 831 total,   3 running, 826 sleeping,   0 stopped,   1 zombie
> 
> and my maker run (on a screen) shows:
> 
> ...
> total clusters:4 now processing 0
>  ...processing 0 of 3
>  ...processing 1 of 3
>  ...processing 2 of 3
> total clusters:4 now processing 0
> flattening protein clusters
> prepare section files
> merging blast reports...
> flattening protein clusters
> prepare section files
> merging blast reports...
> flattening protein clusters
> prepare section files
> 
> Of the 826, the vast majority of them are maker and only one of the running tasks is maker.  Is this normal behaviour or has my maker run stopped processing?
> 
> Seth Munholland, B.Sc.
> Department of Biological Sciences
> Rm. 304 Biology Building
> University of Windsor
> 401 Sunset Ave. N9B 3P4
> T: (519) 253-3000 Ext: 4755

From mnaymik at tgen.org  Thu May 11 12:29:46 2017
From: mnaymik at tgen.org (Marcus Naymik)
Date: Thu, 11 May 2017 11:29:46 -0700
Subject: [maker-devel] Maker gene vs snap match in final GFF's
Message-ID: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>

In the final GFF annotations what is the difference between a 'gene' from
maker and a 'match' from snap?

-- 
 

*This electronic message is intended to be for the use only of the named 
recipient, and may contain information that is confidential or privileged, 
including patient health information. If you are not the intended 
recipient, you are hereby notified that any disclosure, copying, 
distribution or use of the contents of this message is strictly prohibited. 
If you have received this message in error or are not the named recipient, 
please notify us immediately by contacting the sender at the electronic 
mail address noted above, and delete and destroy all copies of this 
message. Thank you.*

From carsonhh at gmail.com  Thu May 11 12:33:55 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Thu, 11 May 2017 12:33:55 -0600
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <8A77F384-9BAD-4BF4-BD1E-EDAE4E010612@gmail.com>

MAKER gene models can be the result of additional hints sent to SNAP, together with post-processing that adds UTRs and additional exons supported by transcript evidence. MAKER results will also have support from either protein or EST/mRNA evidence. A SNAP match is simply the raw ab initio call made by SNAP (no hints, no post-processing, and it may or may not have evidence supporting the structure). The matches are there just for reference, so you know what SNAP produces outside of MAKER given the underlying HMM.
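
To see how the two kinds of features are represented in a given file, one option is to tally the (source, type) pairs in columns 2 and 3 of the GFF3. The sketch below does only that; the function name is made up for this example, and the exact source labels in your file may differ from the ones shown in the usage note.

```python
from collections import Counter


def count_features(gff_lines):
    """Count GFF3 features by (source, type), i.e. columns 2 and 3."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 3:
            continue  # e.g. lines of an embedded FASTA section
        counts[(cols[1], cols[2])] += 1
    return counts
```

Called on an open file handle (for example `count_features(open("annotations.gff"))`, with a hypothetical file name), it returns a Counter in which final annotations and raw predictor matches appear under different source labels.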

--Carson


> On May 11, 2017, at 12:29 PM, Marcus Naymik <mnaymik at tgen.org> wrote:
> 
> In the final GFF annotations what is the difference between a 'gene' from maker and a 'match' from snap?
> 

From d.ence at ufl.edu  Thu May 11 12:35:00 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Thu, 11 May 2017 18:35:00 +0000
Subject: [maker-devel] Maker gene vs snap match in final GFF's
In-Reply-To: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
References: <CAMY2gz5vm7hZ4n2OoUSPsfwjmdCY0V6Y1iDdf3bi7=mpGXHDiQ@mail.gmail.com>
Message-ID: <2560F44E-3D8D-4E81-B7B9-621921408B61@mail.ufl.edu>

The two might have identical coordinates in some cases, but they are different kinds of features. The "match" is the product of an ab initio gene prediction algorithm, while the "gene" is supported by evidence and has passed through the MAKER polishing and filtering steps.


On May 11, 2017, at 2:29 PM, Marcus Naymik <mnaymik at tgen.org<mailto:mnaymik at tgen.org>> wrote:

In the final GFF annotations what is the difference between a 'gene' from maker and a 'match' from snap?



From nathan.ricks at gmail.com  Fri May 12 15:05:45 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Fri, 12 May 2017 15:05:45 -0600
Subject: [maker-devel] Using mpich2 with Maker
Message-ID: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>

I've been using MAKER for some time now. However, I would like to speed up
the process by using the MPICH2 option. When I run the command ./Build
install, the following error is produced. Any help would be appreciated.

Nathan Ricks
[image: Inline image 1]
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 170139 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170512/ffbe42da/attachment-0003.png>

From yuejiaxing at gmail.com  Mon May 15 05:28:47 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 13:28:47 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
Message-ID: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>

Hello,

I configured snoscan (v0.9.1) for my MAKER installation (v2.31.9) and ran
the annotation on a yeast (S. cerevisiae) genome. I think the annotation
went well for tRNAs and protein-coding genes, but I am not sure
about snoRNAs. MAKER annotated multiple overlapping snoRNA genes,
as the example below shows. Is this expected? If
not, what might have caused it, and is there a way to work around it?
Thanks in advance!

chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1

Best,
Jia-Xing


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Mon May 15 10:28:58 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Mon, 15 May 2017 12:28:58 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
Message-ID: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>

Hi Jia-Xing,

That has been my experience in the past as well. For the non-coding RNAs, tRNAscan-SE is very accurate, while snoscan seems to be quite sensitive but not very specific. Did you give it a "snoscan_meth" file? Giving it a snoscan_meth file will help with accuracy. The biggest gains in accuracy come from small RNA-seq data. In the paper where we used snoscan on maize, we didn't keep any snoRNA predictions that lacked support from small RNA-seq data; in practical terms, we got rid of anything with an AED of 1.
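
A minimal sketch of that last filtering step, assuming a MAKER GFF3 in which snoRNA features carry an `_AED` attribute as in the example from the original message. The function names are hypothetical, and this is not the script used for the maize paper:

```python
def parse_attrs(col9):
    """Split a GFF3 attribute column into a {key: value} dict."""
    return dict(kv.split("=", 1) for kv in col9.strip().split(";") if "=" in kv)


def drop_unsupported_snornas(lines, max_aed=1.0):
    """Return GFF3 lines without snoRNA genes whose _AED is >= max_aed.

    The gene, its snoRNA, and the snoRNA's exons are all removed.
    """
    lines = [line.rstrip("\n") for line in lines]
    bad = set()
    # First pass: collect IDs of unsupported snoRNAs and their parent genes.
    for line in lines:
        cols = line.split("\t")
        if len(cols) == 9 and cols[2] == "snoRNA":
            attrs = parse_attrs(cols[8])
            if float(attrs.get("_AED", "0")) >= max_aed:
                bad.add(attrs.get("ID"))
                bad.update(attrs.get("Parent", "").split(","))
    bad.discard("")
    bad.discard(None)
    # Second pass: keep every line that is not tied to a flagged ID.
    kept = []
    for line in lines:
        cols = line.split("\t")
        if len(cols) == 9:
            attrs = parse_attrs(cols[8])
            related = {attrs.get("ID", "")} | set(attrs.get("Parent", "").split(","))
            if related & bad:
                continue
        kept.append(line)
    return kept
```

Note that this only removes predictions with no evidence support; supported predictions that overlap each other would still need separate handling.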

I hope this helps,
Mike

From yuejiaxing at gmail.com  Mon May 15 11:14:26 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Mon, 15 May 2017 19:14:26 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
Message-ID: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>

Hi Michael,

Many thanks for the information! I will specify the "snoscan_meth" file and
give it another try then. I mainly want to use maker to annotate
protein-coding genes and tRNAs, but it would be nice to have snoRNAs
reasonably annotated as well.
Thanks again and have a great day!

Best,
Jia-Xing



-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From carsonhh at gmail.com  Tue May 16 08:51:00 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 16 May 2017 08:51:00 -0600
Subject: [maker-devel] Using mpich2 with Maker
In-Reply-To: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
References: <CAJfcix0T1mVc4Rxz+7fbEbf_he78Aqj9UgA4uYKtKAZsdE8sCA@mail.gmail.com>
Message-ID: <6C739B3B-D5D6-4263-A73C-4EA1762B1EE1@gmail.com>

You probably need to reinstall Parse::RecDescent, Inline, Inline::C, or all of the above via CPAN (perl's module installer). The ones already installed on your system may have issues. If you do not have the ability to install modules system-wide, you can install them just for your user using local::lib and the bootstrapping instructions here -> http://search.cpan.org/~haarg/local-lib-2.000019/lib/local/lib.pm#The_bootstrapping_technique

Then reinstall MAKER.

--Carson
 
> On May 12, 2017, at 3:05 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> I've been using maker for some time now. However, I would like to speed up the process by using the mpich2 option. When I use the command ./Build install, the following error is produced. Any help would be appreciated.
> 
> Nathan Ricks
> <image.png>
> 
> 
> 
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org


From salim.bougouffa at kaust.edu.sa  Sun May 21 01:45:50 2017
From: salim.bougouffa at kaust.edu.sa (Salim Bougouffa)
Date: Sun, 21 May 2017 10:45:50 +0300
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAGekO4ucZ6=H1X=s-B-+8KmaNpZph5s73sFHu5+o1L6m0sVg2A@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where significant rna-seq evidence is there (figure
artemis01)
2/ vice versa where one or two exons are added without rna-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB


_______________________________________________________
Salim Bougouffa (PhD), Postdoctoral Fellow
4700 KAUST, CBRC, Blg3. Office4326-WS05,
Thuwal, Jeddah, KSA, 23955-6900
(966) 012 808 2963 || salim.bougouff at kaust.edu.sa

-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/ee869f4a/attachment-0007.png>

From mjfi2sb3 at gmail.com  Sun May 21 01:48:48 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:48:48 +0000
Subject: [maker-devel] augustus exon calling ~
Message-ID: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>

Hi Maker folks,

I have several issues with a plant genome annotation that I am currently
doing but perhaps the most recurrent issues are:

1/ CDSs that are missed where significant rna-seq evidence is there (figure
artemis01)
2/ vice versa where one or two exons are added without rna-seq
evidence/intron hints (figure artemis02)

info about the runs:
1/ using augustus with a pre-existing model for a related plant that has
high homology to the one I am annotating
2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
3/ evm=1 (seems to perform better than evm=0)
4/ repeatmasking (denovo + repbase)

Best,
/SB
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis01.png
Type: image/png
Size: 166352 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment-0006.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis02.png
Type: image/png
Size: 97140 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/721eee4f/attachment-0007.png>

From mjfi2sb3 at gmail.com  Sun May 21 01:55:31 2017
From: mjfi2sb3 at gmail.com (Salim Bougouffa)
Date: Sun, 21 May 2017 07:55:31 +0000
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <CAJb_6LQtJCiDLOtM0hQpq1Eb4BHV_LyqzRKGKEMNMz_YDJiuqg@mail.gmail.com>

Hi,

I should have mentioned a third scenario where an exon is not called fully
by maker despite augustus getting it right (figure artemis03)

[image: artemis03.png]


-------------- next part --------------
A non-text attachment was scrubbed...
Name: artemis03.png
Type: image/png
Size: 145860 bytes
Desc: not available
URL: <http://yandell-lab.org/pipermail/maker-devel_yandell-lab.org/attachments/20170521/1c0c8274/attachment-0003.png>

From admin at genome.arizona.edu  Tue May 23 13:52:17 2017
From: admin at genome.arizona.edu (System Admin)
Date: Tue, 23 May 2017 12:52:17 -0700
Subject: [maker-devel] Hyperthreading
Message-ID: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>

We are using maker in a cluster with mpich.  Currently hyperthreading is 
on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file 
for mpich specifies the total emulated cores for each node.
With hyperthreading on, we have up to 256 total emulated cores available.

Which is the optimal scenario?
1. Use '-n 256'
2. Use '-n 128' with hyperthreading still on
3. Use '-n 128' with hyperthreading turned off

Thanks


From carsonhh at gmail.com  Tue May 23 14:19:29 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:19:29 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>

MAKER is more of a pipeline. It will launch external tools on as many CPUs as you give it with the mpiexec command. I've found that many of the tools used get a boost with hyperthreading even though optimizations are not explicitly built into their code. The short answer is you would have to try it both ways. I doubt there will be much more than a 10-15% difference in runtime.

You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256 even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
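
As a rough sketch of dividing a dataset into separate jobs, the snippet below balances contigs across batches by total length. The function name and the greedy longest-first strategy are assumptions for illustration, not something MAKER provides; each batch would then be written to its own FASTA file and run as an independent job.

```python
import heapq


def partition_contigs(lengths, n_jobs):
    """Greedy longest-first split of {contig_id: length} into n_jobs batches.

    Each contig goes to whichever batch currently holds the fewest bases,
    so the batches end up with roughly equal total sequence.
    """
    heap = [(0, i) for i in range(n_jobs)]  # (bases assigned, batch index)
    heapq.heapify(heap)
    batches = [[] for _ in range(n_jobs)]
    ordered = sorted(lengths.items(), key=lambda kv: (-kv[1], kv[0]))
    for contig, length in ordered:
        total, idx = heapq.heappop(heap)
        batches[idx].append(contig)
        heapq.heappush(heap, (total + length, idx))
    return batches


if __name__ == "__main__":
    example = {"ctg1": 100, "ctg2": 60, "ctg3": 50, "ctg4": 10}
    print(partition_contigs(example, 2))
```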

--Carson




From admin at genome.arizona.edu  Tue May 23 14:31:48 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:31:48 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
Message-ID: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:19 PM:
> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256 even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.

Yes, with '-n 192' we found the load on the cluster will initially go up 
to 360-380 but then continually decreases until maker is finished. 
Memory usage was very low during the processing (under 20%).


From carsonhh at gmail.com  Tue May 23 14:38:42 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 14:38:42 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>

Also make sure cpus in the control file is set to 1 when using MPI. Otherwise it will tell each program it calls to try to use more CPUs per call.
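
A small sketch for checking this before launching, assuming the control file uses `key=value #comment` lines. The function names and the sample text are hypothetical:

```python
def read_ctl_options(text):
    """Parse 'key=value #comment' lines into a {key: value} dict."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if "=" in line:
            key, value = line.split("=", 1)
            opts[key.strip()] = value.strip()
    return opts


def cpus_is_one(text):
    """True only if the control file explicitly sets cpus=1."""
    return read_ctl_options(text).get("cpus") == "1"


if __name__ == "__main__":
    sample = "cpus=4 #illustrative value, not a recommended setting\n"
    if not cpus_is_one(sample):
        print("cpus is not 1; set it to 1 before running under MPI")
```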

--Carson



From mmokrejs at gmail.com  Tue May 23 14:45:21 2017
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Tue, 23 May 2017 22:45:21 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <30ae31f7-264e-44d0-30bc-a010de3e54a7@gmail.com>

admin at genome.arizona.edu wrote:
> Carson Holt wrote on 05/23/2017 01:19 PM:
>> You can pull back to 128 if you find that you are running low on RAM or have a high IO burden (both of which will double if you go from 128 to 256 even though CPU isn't really doubling). Also, MAKER per-job performance plateaus at around 200 processes due to communication overhead. Above that threshold it is often useful to divide datasets into multiple separate jobs that can run simultaneously.
> 
> Yes, with '-n 192' we found the load on the cluster will initially go up to 360-380 but then continually decreases until maker is finished. Memory usage was very low during the processing (under 20%).

Hi,
The high load could be caused by disk IO or other reasons. The only way to be sure is to run top, htop or similar and check that the processes are in the *running* state ("R" is displayed in the status column). You may see "S" (sleep) when a task is waiting for data input or output, and "D" (disk) when it is waiting for disk IO (as opposed to network IO).
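
A quick sketch of the same check done non-interactively, assuming a Linux node where `ps -eo stat=` prints one state string per process with no header. The function names are made up for illustration:

```python
import subprocess
from collections import Counter


def count_states(ps_output):
    """Count processes by the leading state letter (R, S, D, ...)."""
    counts = Counter()
    for line in ps_output.splitlines():
        fields = line.split()
        if fields:
            counts[fields[0][0]] += 1  # first character is the state
    return counts


def snapshot():
    """Take one snapshot of process states on the current node."""
    result = subprocess.run(
        ["ps", "-eo", "stat="], capture_output=True, text=True, check=True
    )
    return count_states(result.stdout)


if __name__ == "__main__":
    print(snapshot())
```

Many processes in "D" and few in "R" while the load is high would point towards IO rather than CPU as the bottleneck.
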
Martin


From mmokrejs at gmail.com  Tue May 23 14:51:18 2017
From: mmokrejs at gmail.com (Martin MOKREJŠ)
Date: Tue, 23 May 2017 22:51:18 +0200
Subject: [maker-devel] Hyperthreading
In-Reply-To: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
Message-ID: <54fbcfd6-ba28-e570-cfab-a6d83620f747@gmail.com>

System Admin wrote:
> We are using maker in a cluster with mpich.  Currently hyperthreading is on and we use 'mpiexec -n <cores>' to start maker.  Our machinelist file for mpich specifies the total emulated cores for each node.
> With hyperthreading on, we have up to 256 total emulated cores available.
> 
> Which is the optimal scenario?
> 1. Use '-n 256'
> 2. Use '-n 128' with hyperthreading still on
> 3. Use '-n 128' with hyperthreading turned off

Go for 3, but make sure to disable *hyperthreading* in the kernel of the machines as well. I also disable the multicore scheduler (which, again, should help if there are more long-running processes than physical cores available and some of them could usefully share a cache). We do not have such jobs; hmmer and blast mostly access data from memory, so the CPU cache is not very relevant for them.
  Hyperthreading only helps if jobs are lousy, waiting for some input/output etc., and in that case *it helps* if another process can be executed on the CPU core (hopefully not having the same bottleneck). So it generally helps only in bad situations. You are after a good setup, so disable hyperthreading in the kernel, run only as many jobs as there are physical CPU cores, and monitor performance. If jobs are starving, resolve the issue.

Martin


From admin at genome.arizona.edu  Tue May 23 14:57:56 2017
From: admin at genome.arizona.edu (admin at genome.arizona.edu)
Date: Tue, 23 May 2017 13:57:56 -0700
Subject: [maker-devel] Hyperthreading
In-Reply-To: <47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
	<47000CA1-0F27-40B2-8BB7-3289C2010853@gmail.com>
Message-ID: <bf57acaf-5f70-1bd1-4c1d-ab2b5ced581e@genome.arizona.edu>

Carson Holt wrote on 05/23/2017 01:38 PM:
> Also make sure cpus in the control file are set to 1 when using MPI. 
> Otherwise it will tell each program it calls to try and use more
> CPUs per call.

Yes we are using cpus=1 in the control file
Thanks


From carsonhh at gmail.com  Tue May 23 15:03:17 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:03:17 -0600
Subject: [maker-devel] Hyperthreading
In-Reply-To: <fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
References: <66d4d95b-fc19-233c-af19-787c2af73e25@genome.arizona.edu>
	<D9561BB5-DC17-4D4B-9657-A5522B6F9231@gmail.com>
	<fd619da1-cb2b-9248-502e-501b4a459cbb@genome.arizona.edu>
Message-ID: <054B76C7-FAC0-4B9B-A6D1-29A8202E35B3@gmail.com>

One last thing to check if using CentOS or RedHat. I've seen it happen on a handful of clusters where transparent hugepages can create odd load issues and very high sys CPU usage under top (not just with maker but with BWA, GATK, and other programs that can have larger memory footprints). If using CentOS or RedHat, you may want to disable defrag for hugepages.

You do this on CentOS 6 to disable it (the process is similar on CentOS 7 and RedHat but you may have to google it) ->

echo never > /sys/kernel/mm/transparent_hugepage/defrag 
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
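
To see what a node is currently set to before changing anything, a read-only sketch along these lines may help. It assumes the sysfs directory used in the commands above (pass a different `base` if your kernel exposes it under another name), and the function names are hypothetical:

```python
import re
from pathlib import Path


def active_choice(text):
    """Return the bracketed option, e.g. 'never' from 'always madvise [never]'.

    Files holding a plain value such as '0' are returned stripped.
    """
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else text.strip()


def thp_defrag_status(base="/sys/kernel/mm/transparent_hugepage"):
    """Read the two defrag settings if the files exist on this machine."""
    status = {}
    for rel in ("defrag", "khugepaged/defrag"):
        path = Path(base) / rel
        if path.exists():
            status[rel] = active_choice(path.read_text())
    return status


if __name__ == "__main__":
    print(thp_defrag_status())
```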

--Carson




From carsonhh at gmail.com  Tue May 23 15:34:15 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:34:15 -0600
Subject: [maker-devel] augustus exon calling ~
In-Reply-To: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
References: <CAJb_6LSEknrL9oNUozj9=OQihk-rSKZbo0W+TAq+-4agZ_h1FA@mail.gmail.com>
Message-ID: <EDDC2B86-8D94-4700-93D0-09CEAF358C3C@gmail.com>

EVM works extremely well when the evidence closely matches the predictions and there are no assembly anomalies affecting the ORF. Otherwise, EVM performs very poorly. Also, I would not set unmask=1; it adds noise to the calls.

Note that in all the cases given, the gene models are from Augustus (MAKER doesn't make predictions itself). MAKER just provides hints that Augustus can use for the second call set. Hints boost the score a model gets whenever a feature matches the hint. What you see as Augustus match/match_part features are just a reference for what Augustus calls without hints.

So if I tell Augustus there is probably an exon/intron at location X, then any model that includes that exon/intron gets a higher score, which causes Augustus to keep models that match the hints and report those over models that don't. However, if there is an issue with the evidence (e.g. a falsely merged mRNA-seq assembly) or with the genome assembly (a base change that generates an early stop codon or causes a frameshift), then Augustus may choose to truncate or skip an exon in order to capture the bonus from downstream hints. In those cases there is probably no workable model that captures the exact intron/exon structure, because that structure breaks the ORF at some point, so Augustus instead produces the best model it can while capturing as many hint bonuses as it can.
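
To make the trade-off concrete, here is a toy Python sketch of the idea. It is not Augustus's real scoring function: the bonus value, the base scores, and the coordinates are all made up, and exons only count as matching a hint when the coordinates are identical.

```python
# Toy illustration only: this is NOT Augustus's real scoring function.
# Exons are (start, end) tuples; every number below is made up.

HINT_BONUS = 5.0  # hypothetical bonus per exon that matches a hint

def score(model_exons, base_score, hints):
    """Base score plus a fixed bonus for each exon that exactly matches a hint."""
    matches = sum(1 for exon in model_exons if exon in hints)
    return base_score + HINT_BONUS * matches

hints = {(100, 200), (300, 400), (500, 600)}

# Candidate A follows every hint, but suppose exon (300, 400) carries a
# frameshift, so the ORF breaks and the base score is heavily penalised.
candidate_a = [(100, 200), (300, 400), (500, 600)]
# Candidate B skips the damaged exon and keeps an intact ORF.
candidate_b = [(100, 200), (500, 600)]

score_a = score(candidate_a, base_score=-20.0, hints=hints)  # -20 + 3 * 5
score_b = score(candidate_b, base_score=2.0, hints=hints)    #   2 + 2 * 5

best = "B (skips the exon)" if score_b > score_a else "A (matches every hint)"
print(best)
```

With these made-up numbers the model that skips the damaged exon wins, even though it matches fewer hints, because the broken ORF costs more than the extra hint bonus gains.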

That being said, look for any odd hint sources, such as very poor protein or transcript evidence alignments. Eliminating bad hints will improve performance (for example, if you are using mRNA-seq assemblies, Trinity has a jaccard_clip option which helps avoid false merging of transcript evidence). Or, if an organism you used for protein evidence consistently produces bad protein alignments, you may want to drop it from the evidence completely.

Finally, training Augustus on the genome being annotated will help improve performance (note that just because a species is closely related in evolutionary space does not mean that its HMMs will perform well; it's a common fallacy about ab initio prediction discussed in the SNAP paper). Also try adding another gene predictor like SNAP to see if it hurts or helps.

--Carson


> On May 21, 2017, at 1:48 AM, Salim Bougouffa <mjfi2sb3 at gmail.com> wrote:
> 
> Hi Maker folks,
> 
> I have several issues with a plant genome annotation that I am currently doing but perhaps the most recurrent issues are:
> 
> 1/ CDSs that are missed where significant rna-seq evidence is there (figure artemis01)
> 2/ vice versa where one or two exons are added without rna-seq evidence/intron hints (figure artemis02)
> 
> info about the runs:
> 1/ using augustus with a pre-existing model for a related plant that has high homology to the one I am annotating
> 2/ unmask=1 (seems to do better than unmask=0; is this a good thing to do?)
> 3/ evm=1 (seems to perform better than evm=0)
> 4/ repeatmasking (denovo + repbase)
> 
> Best,
> /SB
> 
> <artemis01.png><artemis02.png>


From nathan.ricks at gmail.com  Tue May 23 15:23:49 2017
From: nathan.ricks at gmail.com (Nathan Ricks)
Date: Tue, 23 May 2017 15:23:49 -0600
Subject: [maker-devel] maker_functional_gff
Message-ID: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>

I've been working with maker and trying to use the maker_functional_gff to
create an annotated .gff file. However, whenever I run the command, the
following pops up, and just continues for a long time.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58363.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58367.
Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58367.
^CUse of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58368.


Nathan Ricks

From d.ence at ufl.edu  Tue May 23 15:39:32 2017
From: d.ence at ufl.edu (Ence,daniel)
Date: Tue, 23 May 2017 21:39:32 +0000
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <9388406A-302C-4B19-9F35-D56C06CC9582@mail.ufl.edu>

Hi Nathan, can you send the command line that you're using that gives the error?

Thanks,
Daniel Ence

> On May 23, 2017, at 5:23 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> I've been working with maker and trying to use the maker_functional_gff to create an annotated .gff file. However, whenever I run the command, the following pops up, and just continues for a long time.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58367.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58367.
> ^CUse of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58368.
> 
> 
> Nathan Ricks


From carsonhh at gmail.com  Tue May 23 15:44:05 2017
From: carsonhh at gmail.com (Carson Holt)
Date: Tue, 23 May 2017 15:44:05 -0600
Subject: [maker-devel] maker_functional_gff
In-Reply-To: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
References: <CAJfcix1BmK6LyE7F=sF10ARr81k9FSm3eKgNgLMZbP-X7=C8=A@mail.gmail.com>
Message-ID: <25B003D7-6A19-4486-B21F-71070F00A580@gmail.com>

The blast report you gave it is in the wrong format, is partial/truncated, or you provided the files in the wrong order. Basically, it received an empty line from the file at some point.

The blast report must be in tabular format, which is "wu-blast -mformat 2" or "ncbi-blast -outfmt 6".

Also, the script only supports blast results against UniProt/Swiss-Prot.
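
A quick way to find the offending line is to scan the report before handing it to the script. A minimal Python sketch, assuming the default 12-column layout of ncbi-blast -outfmt 6 (wu-blast tabular output may have a different column count, so change expected_columns for it); check_blast_tabular and the sample rows are made up for illustration and are not part of MAKER:

```python
def check_blast_tabular(lines, expected_columns=12):
    """Return (line_number, problem) pairs for lines that are not valid tabular rows.

    expected_columns=12 matches the default column set of NCBI BLAST+ -outfmt 6;
    adjust it if the report was written with a custom column list.
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        row = line.rstrip("\n")
        fields = row.split("\t")
        if not row.strip():
            problems.append((number, "empty line"))
        elif row.startswith("#"):
            problems.append((number, "comment line (looks like -outfmt 7)"))
        elif len(fields) != expected_columns:
            problems.append((number, f"{len(fields)} columns"))
    return problems

# Made-up sample: one good row, one empty line, one line from a non-tabular report.
sample = [
    "gene1-mRNA-1\tsp|P12345|AATM_RABIT\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t400\n",
    "\n",
    "Query= gene2-mRNA-1\n",
]
for number, problem in check_blast_tabular(sample):
    print(f"line {number}: {problem}")
```

For a real report, pass an open file handle instead of the sample list. Any line it flags is a candidate for the "uninitialized value $qid" warnings.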

--Carson


> On May 23, 2017, at 3:23 PM, Nathan Ricks <nathan.ricks at gmail.com> wrote:
> 
> I've been working with maker and trying to use the maker_functional_gff to create an annotated .gff file. However, whenever I run the command, the following pops up, and just continues for a long time.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58363.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58367.
> Use of uninitialized value $qid in hash element at ./maker_functional_gff line 170, <$IN> line 58367.
> ^CUse of uninitialized value $qid in hash element at ./maker_functional_gff line 165, <$IN> line 58368.
> 
> 
> Nathan Ricks


From yuejiaxing at gmail.com  Fri May 26 03:28:48 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 11:28:48 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
Message-ID: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>

Hi Michael,

This is a follow-up on the snoscan issue. The snoscan_meth option seems to
have been removed from the current maker_opts.ctl template file
(v2.31.9). This option used to be there according to this post (
https://www.biostars.org/p/217240/). I manually specified this option in my
maker_opts.ctl file, but I don't think maker recognized it:


STATUS: Parsing control files...
WARNING: Invalid option 'snoscan_meth' in control file maker_opts.ctl
...

Do you know if there is a way to work around this problem? Thanks!

Best,
Jia-Xing


On Mon, May 15, 2017 at 7:14 PM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:

> Hi Michael,
>
> Many thanks for the information! I will specify the "snoscan_meth" file
> and give it another try then. I mainly want to use maker to annotate
> protein-coding genes and tRNAs, but it would be nice to have snoRNAs
> reasonably annotated as well.
> Thanks again and have a great day!
>
> Best,
> Jia-Xing
>
>
> On Mon, May 15, 2017 at 6:28 PM, Michael Campbell <
> michael.s.campbell1 at gmail.com> wrote:
>
>> Hi Jia-Xing,
>>
>> That has been my experience in the past as well. For the non-coding RNAs,
>> tRNA-scan is very accurate, while snoscan seems to be quite sensitive but
>> not very specific. Did you give it a "snoscan_meth" file? Giving it
>> a snoscan_meth file will help with accuracy. The biggest gains in accuracy
>> are from small RNA-seq data. In the paper where we used snoscan on maize, we
>> didn't keep any snoRNA predictions that didn't have support from small
>> RNA-seq data; in practical terms, we got rid of anything with an AED of 1.
>>
>> I hope this helps,
>> Mike
>>
>> On May 15, 2017, at 7:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
>>
>> Hello,
>>
>> I configured snoscan (v.0.9.1) for my maker installation (v2.31.9) and
>> ran the annotation for a yeast (S. cerevisiae) genome. I think the
>> annotation went well with regard to tRNAs and protein-coding genes but I am
>> not sure about snoRNAs. I found that multiple overlapping snoRNA genes were
>> annotated by maker, as the example below shows. I was wondering if this is
>> expected. If not, what might have caused this problem, and is there a way to
>> work around it? Thanks in advance!
>>
>> chrIX   maker   gene    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49
>> chrIX   maker   snoRNA  4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.49;Name=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|90|0
>> chrIX   maker   exon    4328    4416    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1:exon:12260;Parent=snoscan-chrIX-noncoding-gene-0.49-snoRNA-1
>> chrIX   maker   gene    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50
>> chrIX   maker   snoRNA  4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.50;Name=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|190|0
>> chrIX   maker   exon    4375    4563    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1:exon:12261;Parent=snoscan-chrIX-noncoding-gene-0.50-snoRNA-1
>> chrIX   maker   gene    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51
>> chrIX   maker   snoRNA  4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.51;Name=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|88|0
>> chrIX   maker   exon    4375    4461    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1:exon:12262;Parent=snoscan-chrIX-noncoding-gene-0.51-snoRNA-1
>> chrIX   maker   gene    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52
>> chrIX   maker   snoRNA  4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.52;Name=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|118|0
>> chrIX   maker   exon    4375    4491    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1:exon:12263;Parent=snoscan-chrIX-noncoding-gene-0.52-snoRNA-1
>> chrIX   maker   gene    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53
>> chrIX   maker   snoRNA  4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;Parent=snoscan-chrIX-noncoding-gene-0.53;Name=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1;_AED=1.00;_eAED=1.00;_QI=0|-1|0|0|-1|0|1|127|0
>> chrIX   maker   exon    4375    4500    .       +       .       ID=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1:exon:12264;Parent=snoscan-chrIX-noncoding-gene-0.53-snoRNA-1
>>
>> Best,
>> Jia-Xing
>>
>>
>> --
>> Jia-Xing Yue
>>
>> Population Genomics and Complex Traits Group
>> Tour Pasteur 8eme etage
>> Faculté de Médecine
>> Institute for Research on Cancer and Aging, Nice (IRCAN)
>> CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
>> 28 Avenue de Valombrose
>> 06107 NICE Cedex 2
>> France
>>
>> Personal website: http://www.iamphioxus.org/
>>
>>
>>
>>
>
>
>
>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/

From michael.s.campbell1 at gmail.com  Fri May 26 07:54:44 2017
From: michael.s.campbell1 at gmail.com (Michael Campbell)
Date: Fri, 26 May 2017 09:54:44 -0400
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
Message-ID: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>

Hi Jia-Xing,

v2.31.9 may not have had that option. I know that it is in the v3.00.0 version, so your best option may be to update.
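
In the meantime, the filter described earlier in this thread (dropping snoRNA predictions with an AED of 1) can be applied to the GFF3 after the run. A minimal Python sketch, assuming each snoscan gene has a single snoRNA child as in the output you posted; the function names and the shortened IDs in the sample are made up for illustration:

```python
def parse_attributes(column9):
    """Turn 'ID=x;Parent=y;_AED=1.00' into a dict."""
    pairs = (item.split("=", 1) for item in column9.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}

def drop_unsupported_snornas(gff_lines, max_aed=1.0):
    """Remove snoRNA features with _AED >= max_aed, plus their gene and exons.

    Assumes each affected gene has a single snoRNA child.
    """
    rows = [line.rstrip("\n") for line in gff_lines]
    dropped = set()
    # First pass: collect the IDs of unsupported snoRNAs and their parent genes.
    for row in rows:
        fields = row.split("\t")
        if len(fields) == 9 and fields[2] == "snoRNA":
            attrs = parse_attributes(fields[8])
            if float(attrs.get("_AED", 0)) >= max_aed:
                for key in ("ID", "Parent"):
                    if key in attrs:
                        dropped.add(attrs[key])
    # Second pass: keep everything that is not one of those features or a child.
    kept = []
    for row in rows:
        fields = row.split("\t")
        if len(fields) == 9:
            attrs = parse_attributes(fields[8])
            if attrs.get("ID") in dropped or attrs.get("Parent") in dropped:
                continue
        kept.append(row)
    return kept

# Made-up sample: g49 has no evidence support (_AED=1.00), g60 does.
sample = [
    "##gff-version 3",
    "chrIX\tmaker\tgene\t4328\t4416\t.\t+\t.\tID=g49;Name=g49",
    "chrIX\tmaker\tsnoRNA\t4328\t4416\t.\t+\t.\tID=g49-snoRNA-1;Parent=g49;_AED=1.00",
    "chrIX\tmaker\texon\t4328\t4416\t.\t+\t.\tID=g49-snoRNA-1:exon:1;Parent=g49-snoRNA-1",
    "chrIX\tmaker\tgene\t9000\t9100\t.\t+\t.\tID=g60;Name=g60",
    "chrIX\tmaker\tsnoRNA\t9000\t9100\t.\t+\t.\tID=g60-snoRNA-1;Parent=g60;_AED=0.25",
    "chrIX\tmaker\texon\t9000\t9100\t.\t+\t.\tID=g60-snoRNA-1:exon:1;Parent=g60-snoRNA-1",
]
kept = drop_unsupported_snornas(sample)
print("\n".join(kept))
```

Note that without small RNA-seq evidence every snoscan prediction will have an AED of 1, so this filter would remove all of them.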

Thanks,
Mike
> On May 26, 2017, at 5:28 AM, Jia-Xing Yue <yuejiaxing at gmail.com> wrote:
> 
> Hi Michael,
> 
> This is a follow-up on the snoscan issue. The snoscan_meth option seems to have been removed from the current maker_opts.ctl template file (v2.31.9). This option used to be there according to this post (https://www.biostars.org/p/217240/). I manually specified this option in my maker_opts.ctl file, but I don't think maker recognized it:
> 
> 
> STATUS: Parsing control files...
> WARNING: Invalid option 'snoscan_meth' in control file maker_opts.ctl
> ...
> 
> Do you know if there is a way to work around this problem? Thanks!
> 
> Best,
> Jia-Xing
> 
>  
> 


From yuejiaxing at gmail.com  Fri May 26 08:20:03 2017
From: yuejiaxing at gmail.com (Jia-Xing Yue)
Date: Fri, 26 May 2017 16:20:03 +0200
Subject: [maker-devel] multiple overlapped snoRNA genes got annotated by
 maker
In-Reply-To: <6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
References: <CAETgyPSWXrZPO=nrkgT5MomgJELf-0ze7yu0A_NCsVqTnLYkDA@mail.gmail.com>
	<0AC3F89F-28EB-4E2A-ADDE-2DF8BD625416@gmail.com>
	<CAETgyPQOwmGsXPGyAcUcEOf_-LHoOvqzJQuKHLTocRYE6tyVtg@mail.gmail.com>
	<CAETgyPSsjRpskkgEs0TY=xbXz0J2N9E0wGQV2S2mVCpej6ZaCg@mail.gmail.com>
	<6461FDD0-BE78-403A-9FEF-E71C3D24F2CA@gmail.com>
Message-ID: <CAETgyPT_K8DDaujyUVYaRhaoKZaCT4MSrAi4DPczZHSkaTpDsQ@mail.gmail.com>

I see. Thanks Michael!

Best,
Jia-Xing

On Fri, May 26, 2017 at 3:54 PM, Michael Campbell <
michael.s.campbell1 at gmail.com> wrote:

> Hi Jia-Xing,
>
> v2.31.9 may not have had that option. I know that it is in the v3.00.0
> version, so your best option may be to update.
>
> Thanks,
> Mike
>


-- 
Jia-Xing Yue

Population Genomics and Complex Traits Group
Tour Pasteur 8eme etage
Faculté de Médecine
Institute for Research on Cancer and Aging, Nice (IRCAN)
CNRS UMR 7284 - INSERM U 1081 - Université Côte d'Azur (UCA)
28 Avenue de Valombrose
06107 NICE Cedex 2
France

Personal website: http://www.iamphioxus.org/