[maker-devel] The est_gff option

Carson Holt carsonhh at gmail.com
Wed Aug 1 20:50:14 MDT 2012


The file is generated any time you provide GFF3 input.  It is just a quick
way to pull out the features for a region via SQLite.  It's generated at the
start of a MAKER run.
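
To see what such a file holds, here is a minimal Python sketch using only the standard sqlite3 module. It lists each table and its row count and assumes nothing about the schema MAKER uses, which is not described in this thread. The function name summarize_sqlite is made up for illustration.

```python
import sqlite3
from pathlib import Path


def summarize_sqlite(path):
    """Return {table_name: row_count} for every table in an SQLite file."""
    # Open read-only via a file: URI so the inspection cannot modify the file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Table names come from sqlite_master itself; quote them defensively.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                "SELECT COUNT(*) FROM " + quoted).fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling summarize_sqlite("scaffold_56.db") on the file mentioned in the question would show which tables exist and how many rows each has. The table names themselves are whatever MAKER wrote.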

Thanks,
Carson

From:  <Sean.Li at csiro.au>
Date:  Wednesday, 1 August, 2012 10:45 PM
To:  Carson Holt <carsonhh at gmail.com>, <maker-devel at yandell-lab.org>
Subject:  RE: [maker-devel] The est_gff option

Dear Carson,
 
Thank you for your explanation! Now I understand how the RNA-Seq data has been
interpreted in MAKER.  I checked for the tag est_gff and only found it in a
binary db file, scaffold_56.db, which should be an SQLite file. I guess this is
right? Will this file only be generated when EST data has been provided, or
when I set alt_splice to 1? The db file didn't exist for my first run of
MAKER, when no EST options were switched on.
 
Best regards,
Sean   
 

From: Carson Holt [mailto:carsonhh at gmail.com]
Sent: Thursday, 2 August 2012 11:40 AM
To: Li, Sean (CMIS, Acton); maker-devel at yandell-lab.org
Subject: Re: [maker-devel] The est_gff option
 

The mRNA-seq reads from GFF3 don't need to be blasted/exonerate polished
because they are already aligned; they are just popped into an alignment
object in memory and integrated directly. The run.log file is used primarily
to track finished calls to external programs (i.e. to avoid partial results
files on failure), so since there is nothing to run, there is no run.log
entry.  You will just get a message on STDERR saying that GFF3 is being read.
The GFF3 results are combined with any other results produced within MAKER
and are then used to generate hints for the gene predictors you run as to the
location of introns/exons.  This happens during the step where you see SNAP
or Augustus running again and again.  They also become part of the AED score
used to select from the pool of gene models, and can be used to add UTR to
the resulting gene models.  If you specify alt_splice=1 they can even be used
to infer alternate splice forms.  For reference purposes, they will end up in
the MAKER results with the tag est_gff:Cufflinks.
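
One way to confirm that the evidence made it into the output is to count the features carrying that tag. The following is a minimal Python sketch, assuming the tag appears in the source column (column 2) of the GFF3 that MAKER writes. That placement is my assumption here, and the function name count_sources is made up for illustration.

```python
from collections import Counter


def count_sources(gff_lines, prefix="est_gff:"):
    """Count GFF3 features whose source column starts with `prefix`.

    Returns a Counter keyed by (source, feature_type).
    """
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a well-formed nine-column feature line
        if cols[1].startswith(prefix):
            counts[(cols[1], cols[2])] += 1
    return counts
```

Passing an open file handle for the output GFF3 to count_sources should report how many features of each type came from the est_gff input. An empty result would suggest that the evidence was not read, or that the tag is stored somewhere other than column 2.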

Thanks,
Carson

From: <Sean.Li at csiro.au>
Date: Wednesday, 1 August, 2012 9:21 PM
To: <maker-devel at yandell-lab.org>
Subject: [maker-devel] The est_gff option

 

Hello,
 
We are trying to add RNA-Seq data to a MAKER run. As they have already been
properly aligned to our draft scaffolds by Cufflinks, we just gave the
appropriate GFF file to the est_gff option. The GFF file has the following
format:
 
...
scaffold_56     Cufflinks       match_part      248833  249471  .       +       .       ID=1:TCONS_00039698:exon-1;Name=1:TCONS_00039698;Parent=1:TCONS_00039698;Target=1:TCONS_00039698 121525597 121526235 +;
scaffold_56     Cufflinks       match_part      253262  253362  .       +       .       ID=1:TCONS_00039698:exon-2;Name=1:TCONS_00039698;Parent=1:TCONS_00039698;Target=1:TCONS_00039698 121526236 121526336 +;
...
 
All the prediction algorithms were trained before running MAKER, and we also
provided protein homology evidence.  Checking the run.log file, I couldn't
find any records showing how the RNA-Seq data were used in the prediction
process (no blastn or exonerate for EST).  Did I miss something, so that the
RNA-Seq evidence wasn't included in the prediction? Do I need to turn
est2genome on? I suspect this option is only used when the algorithms aren't
trained properly.  Or do we need to set the values of pcov_blastn and
eval_blastn a bit lower?
 
Thanks! 
 
Regards,
Sean  
 
 
 