[maker-devel] Maker consensus

Carson Holt carsonhh at gmail.com
Fri May 10 12:29:32 MDT 2013


You can use any species Augustus already has.  If it doesn't have yours, then
you train it yourself.  The species folder sits under the directory pointed to
by the AUGUSTUS_CONFIG_PATH environment variable, and is usually
.../augustus/config/species
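
To see which species a given Augustus install already provides, something like the following could work. It is a minimal sketch, assuming the usual layout where each trained species is a subdirectory of the species folder; the function name list_augustus_species is made up for illustration.

```python
import os
from pathlib import Path


def list_augustus_species(config_path=None):
    """Return sorted species names found under <config_path>/species."""
    # Fall back to the AUGUSTUS_CONFIG_PATH environment variable.
    config_path = config_path or os.environ.get("AUGUSTUS_CONFIG_PATH")
    if not config_path:
        raise RuntimeError("AUGUSTUS_CONFIG_PATH is not set")
    species_dir = Path(config_path) / "species"
    if not species_dir.is_dir():
        raise FileNotFoundError(f"no species directory at {species_dir}")
    # Assumption: each trained species is one subdirectory.
    return sorted(p.name for p in species_dir.iterdir() if p.is_dir())
```

If chicken shows up in that list, setting augustus_species=chicken in maker_opts.ctl should select it; maker_exe.ctl only holds the path to the augustus executable.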

Thanks,
Carson


From:  Diana LeDuc <diana_leduc at eva.mpg.de>
Reply-To:  Diana LeDuc <diana_leduc at eva.mpg.de>
Date:  Friday, 10 May, 2013 2:16 PM
To:  <maker-devel at yandell-lab.org>, Carson Holt <carsonhh at gmail.com>
Cc:  Torsten Schoeneberg <torsten.schoeneberg at medizin.uni-leipzig.de>,
Gabriel Renaud <gabriel_renaud at eva.mpg.de>, Janet Kelso <kelso at eva.mpg.de>
Subject:  Re: [maker-devel] Maker consensus

  
   
 Hi Carson, 
  
   
  
 In maker_exe.ctl I would have to provide the path to Augustus. Augustus has
a training set for chicken that I would use. Is it possible to specify the
species I want to use, or is training Augustus myself the only way?
  
   
  
 Thank you! 
  
   
  
 Best, 
  
   
  
 Diana 
  
 On May 10, 2013 at 7:51 PM Carson Holt <carsonhh at gmail.com> wrote:
  
  
>   
>  OK.  You just ran the evidence and didn't give a gene predictor.  You need to
> provide an HMM file for SNAP or a species for Augustus, or for rough annotations
> you can set protein2genome=1 and est2genome=1.  This will try to generate
> models directly from the alignments.
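
The options named above live in maker_opts.ctl as key=value lines with a trailing #comment. A minimal sketch of switching them on from a script follows; set_ctl_options is a hypothetical helper, and the sample text is an abbreviated stand-in for a real control file, not a copy of one.

```python
import re


def set_ctl_options(ctl_text, updates):
    """Return ctl_text with 'key=value' lines changed for keys in updates."""
    out = []
    for line in ctl_text.splitlines():
        # key, then value up to an optional '#comment' tail
        m = re.match(r"^(\w+)=([^#]*?)(\s*#.*)?$", line)
        if m and m.group(1) in updates:
            comment = m.group(3) or ""
            line = f"{m.group(1)}={updates[m.group(1)]}{comment}"
        out.append(line)
    return "\n".join(out) + "\n"


# Abbreviated, illustrative sample (comments shortened by hand).
sample = (
    "#-----Gene Prediction\n"
    "snaphmm= #SNAP HMM file\n"
    "augustus_species= #Augustus species model\n"
    "est2genome=0 #infer predictions from ESTs, 1 = yes, 0 = no\n"
    "protein2genome=0 #infer predictions from proteins, 1 = yes, 0 = no\n"
)
print(set_ctl_options(sample, {"est2genome": 1, "protein2genome": 1}))
```

Lines whose key is not in the update dictionary, including section headers, pass through unchanged, so the file can still be compared against the original.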
>   
>    
>   
>  If you provide a gene predictor, then MAKER can talk to it about the evidence
> alignments so it can make the best gene call for the region.  Then there will be
> gene/mRNA/exon models in the GFF3 file and entries in the proteins.fasta and
> transcripts.fasta files.  If you need to train a predictor, you can train SNAP
> using the maker2zff script and the SNAP documentation or the MAKER GMOD
> tutorial.  If you want to train Augustus, Jason Stajich wrote an excellent
> explanation as well as tools in a previous list message.
>   
>    
>   
>   
>  list msg -  http://brie4.cshl.edu/pipermail/gmod-help/2012-June/001724.html
>   
>  Script is in this github repo -
>   
> https://github.com/hyphaltip/genome-scripts/blob/master/gene_prediction/zff2augustus_gbk.pl
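
For the SNAP route, one training round is usually written as a short chain of commands. The sketch below only builds the command lines and does not run anything. The sequence follows what is commonly shown in the MAKER tutorial; the flank size of 1000, the default file names, and the helper name snap_training_commands are assumptions to check against the SNAP documentation.

```python
import shlex


def snap_training_commands(gff="genome.all.gff", name="my_genome"):
    """Return shell command lines for one round of SNAP training."""
    return [
        # Convert MAKER gene models to ZFF (writes genome.ann / genome.dna).
        f"maker2zff {shlex.quote(gff)}",
        # Sort models into categories, keeping 1000 bp of flank (assumed).
        "fathom -categorize 1000 genome.ann genome.dna",
        # Export the unique genes on the plus strand.
        "fathom -export 1000 -plus uni.ann uni.dna",
        # Estimate parameters from the exported models.
        "forge export.ann export.dna",
        # Assemble the parameter files into a single HMM.
        f"hmm-assembler.pl {shlex.quote(name)} . > {shlex.quote(name)}.hmm",
    ]


for cmd in snap_training_commands():
    print(cmd)
```

The resulting HMM file is what the snaphmm option in maker_opts.ctl would point to for the next MAKER run.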
>   
>    
>   
>  Thanks, 
>   
>  Carson 
>   
>    
>   
>    
>   
>    
>   
>  From:  Diana LeDuc < diana_leduc at eva.mpg.de>
>  Reply-To:  Diana LeDuc < diana_leduc at eva.mpg.de>
>  Date:  Friday, 10 May, 2013 1:41 PM
>  To:  < maker-devel at yandell-lab.org>, Carson Holt < carsonhh at gmail.com>
>  Cc:  Torsten Schoeneberg < torsten.schoeneberg at medizin.uni-leipzig.de>,
> Gabriel Renaud < gabriel_renaud at eva.mpg.de>, Janet Kelso < kelso at eva.mpg.de>
>  Subject:  Re: [maker-devel] Maker consensus
>   
>    
>   
>   
>   
>  Hi Carson, 
>   
>    
>   
>  Thank you for the quick answer.
>   
>  I ran gff3_merge to merge all the GFF files, and this resulted in a GFF file
> with fields of this type:
>   
>  scaffold32239   blastx  protein_match   22905   34500   174     +       .       ID=scaffold32239:hit:976144;Name=ENSTGUG00000000198|ENSTGUT00000000219|DSCAML1-2039;
>  scaffold32239   blastx  match_part      22905   23045   174     +       .       ID=scaffold32239:hsp:2806529;Parent=scaffold32239:hit:976144;Name=ENSTGUG00000000198|ENSTGUT00000000219|DSCAML1-2039;Target=ENSTGUG00000000198|ENSTGUT00000000219|DSCAML1-2039 172 218;Gap=M47;
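
Records like these are nine tab-separated columns, with the last column holding key=value pairs separated by semicolons. A minimal parsing sketch in Python, assuming tab-delimited input as in a real GFF3 file (the mail archive shows spaces); parse_gff3_line is an illustrative name, not part of MAKER.

```python
from urllib.parse import unquote


def parse_gff3_line(line):
    """Split one GFF3 feature line into a dict with parsed attributes."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for field in attrs.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            # GFF3 attribute values may be percent-encoded.
            attributes[key.strip()] = unquote(value.strip())
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "score": score,      # kept as text because it may be "."
        "strand": strand,
        "phase": phase,
        "attributes": attributes,
    }
```

For the match_part line above, the result would carry type "match_part", start 22905, end 23045, and a Parent attribute pointing at the protein_match hit.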
>   
>  In comparison to the dpp_contig test file, I am missing est2genome evidence,
> most probably because my EST data set is pretty poor. I do have blastx and
> protein2genome evidence, though.
>   
>    
>   
>  My goal is to extract the genes that could be annotated on the scaffolds. In
> the GFF files the hits overlap most of the time. I can visualize this
> properly in Apollo: for example, one scaffold hits the DSCAML gene in both
> zebra finch and chicken. But extracting from the GFF the coordinates over
> which the scaffold matches this annotated gene is difficult. Manually curating
> the genes is also not an option, since I am trying to do this for a 1.7 Gb
> genome.
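
One way to get a single coordinate range per gene out of overlapping hits is to group them by scaffold, strand and gene symbol, then merge overlapping intervals. This is only a rough sketch and not how MAKER itself builds consensus models. It assumes the symbol is the last "|" field of the Name with a trailing "-digits" suffix removed; the second and third hits in the example are hypothetical.

```python
import re
from collections import defaultdict


def gene_symbol(name):
    """Guess a gene symbol from a 'GENE_ID|TRANSCRIPT_ID|SYMBOL-NNN' name."""
    last = name.split("|")[-1]
    return re.sub(r"-\d+$", "", last)  # assumption: suffix is '-digits'


def merge_hits(hits):
    """Merge overlapping (scaffold, start, end, strand, name) hits per gene."""
    grouped = defaultdict(list)
    for scaffold, start, end, strand, name in hits:
        grouped[(scaffold, strand, gene_symbol(name))].append((start, end))
    merged = []
    for (scaffold, strand, symbol), spans in sorted(grouped.items()):
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:          # overlaps the current range
                cur_end = max(cur_end, end)
            else:                         # gap: close range, start a new one
                merged.append((scaffold, cur_start, cur_end, strand, symbol))
                cur_start, cur_end = start, end
        merged.append((scaffold, cur_start, cur_end, strand, symbol))
    return merged


hits = [
    ("scaffold32239", 22905, 34500, "+",
     "ENSTGUG00000000198|ENSTGUT00000000219|DSCAML1-2039"),
    # Hypothetical hits from another species, for illustration only.
    ("scaffold32239", 30000, 36000, "+", "hypo_gene|hypo_tx|DSCAML1-201"),
    ("scaffold32239", 50000, 51000, "+", "hypo_gene|hypo_tx|DSCAML1-201"),
]
for row in merge_hits(hits):
    print(*row, sep="\t")
```

Hits that do not overlap stay as separate ranges, so a gene split across distant regions of a scaffold is reported more than once rather than silently bridged.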
>   
>    
>   
>  I hope this explains better what we are after.
>   
>    
>   
>  Thank you once again.
>   
>    
>   
>  Best regards, 
>   
>    
>   
>  Diana  
> On May 10, 2013 at 6:13 PM Carson Holt < carsonhh at gmail.com> wrote:
>   
>   
>>   
>>  I'm sorry, I don't understand question 1.  You are missing the resulting
>> FASTA files, correct?  Did your resulting GFF3 file have any features of type
>> "gene"?  Did you run fasta_merge after running gff3_merge?
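
A quick way to answer the "any features of type gene?" question is to tally column 3 of the merged GFF3. A minimal sketch, assuming a tab-delimited file; count_feature_types is an illustrative name.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Count GFF3 feature types (column 3), stopping at any ##FASTA section."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break                      # sequence data follows, no features
        if not line.strip() or line.startswith("#"):
            continue                   # skip blanks, comments, directives
        cols = line.split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts
```

Called on an open file handle, a result with protein_match and match_part entries but no gene entry would match the evidence-only run described in this thread.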
>>   
>>    
>>   
>>  Could you give me more details on what you are trying to do, so I can take a
>> stab at question 2 as well.
>>   
>>    
>>   
>>  Thanks, 
>>   
>>  Carson 
>>   
>>    
>>   
>>    
>>   
>>    
>>   
>>  From:  Diana LeDuc <  diana_leduc at eva.mpg.de>
>>   Reply-To:  Diana LeDuc <  diana_leduc at eva.mpg.de>
>>   Date:  Friday, 10 May, 2013 10:44 AM
>>   To:  <  maker-devel at yandell-lab.org>
>>   Cc:  Gabriel Renaud <  gabriel_renaud at eva.mpg.de>, Janet Kelso <
>> kelso at eva.mpg.de>, Torsten Schoeneberg <
>> torsten.schoeneberg at medizin.uni-leipzig.de>
>>   Subject:  [maker-devel] Maker consensus
>>   
>>    
>>   
>>   
>>   
>>   
>> 
>> Dear maker developers,
>>   
>> 
>> I am a PhD student working on de novo assembly and annotation of a bird
>> genome. I used MAKER as the annotation pipeline, which ran very well, and I
>> obtained different annotations with evidence from the Augustus gene predictor,
>> a small EST dataset from my organism, and protein sequences from chicken,
>> turkey and zebra finch. I could combine the different GFF files from different
>> scaffolds into one GFF file with annotations for the entire genome.
>>   
>> 
>> I now have two questions:
>>   
>> 
>> 1. What could be the reason that I haven't gotten the protein.fasta and
>> transcript.fasta files?
>>   
>> 
>> 2. How can I obtain a consensus gene list from the different evidence in MAKER?
>> What I actually need is the scaffold, coordinates and annotation (gene
>> name) according to the 3 other bird species.
>>  Thank you in advance.
>>   
>>    
>>   
>>  Best regards, 
>>   
>>    
>>   
>>  Diana Le Duc 
>>   
>>    
>>   
>>  --  
>>   
>> Max Planck Institute for Evolutionary Anthropology
>> Department of Evolutionary Genetics
>> Deutscher Platz 6
>> D-04103 Leipzig 
>>   
>> Phone +49 (0)341-3550-554
>>   www.eva.mpg.de <http://www.eva.mpg.de>
>>   
>>   
>   
>   
>   
>   
>   
  
  
 

