[maker-devel] The est_gff option

Sean.Li at csiro.au Sean.Li at csiro.au
Wed Aug 1 20:45:10 MDT 2012


Dear Carson,

Thank you for your explanation! Now I understand how the RNA-Seq data is interpreted in MAKER. I checked for the tag est_gff and only found it in a binary db file, scaffold_56.db, which should be a SQLite file. I guess this is right? Is this file only generated when EST data has been provided, or when I set alt_splice to 1? The db file didn't exist for my first MAKER run, when no EST options were switched on.
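One way to test the guess that a .db file is SQLite is to look at its first 16 bytes, since every SQLite 3 database file starts with the string "SQLite format 3" followed by a NUL byte. The sketch below is only an illustration: the helper name looks_like_sqlite and the temporary demo files are made up here, and nothing in it is specific to MAKER or to scaffold_56.db.

```python
import os
import sqlite3
import tempfile

# Every SQLite 3 database file begins with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"

def looks_like_sqlite(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    with open(path, "rb") as fh:
        return fh.read(16) == SQLITE_MAGIC

# Demo on throwaway files (hypothetical names, not MAKER output).
tmpdir = tempfile.mkdtemp()

db_path = os.path.join(tmpdir, "demo.db")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE t (x INTEGER)")  # forces the file to be written
conn.commit()
conn.close()

txt_path = os.path.join(tmpdir, "demo.txt")
with open(txt_path, "w") as fh:
    fh.write("not a database\n")

print(looks_like_sqlite(db_path), looks_like_sqlite(txt_path))
```

To apply it to a real file, call looks_like_sqlite with the path of the .db file in question.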

Best regards,
Sean

From: Carson Holt [mailto:carsonhh at gmail.com]
Sent: Thursday, 2 August 2012 11:40 AM
To: Li, Sean (CMIS, Acton); maker-devel at yandell-lab.org
Subject: Re: [maker-devel] The est_gff option

The mRNA-seq reads from GFF3 don't need to be blasted/exonerate polished because they are already aligned (so no run.log entry is generated); they are just popped into an alignment object in memory and integrated directly. The run.log file is used primarily to track finished calls to external programs (i.e. to avoid partial results files on failure). So since there is nothing to run, there is no run.log entry. You will just get a message in STDERR saying that GFF3 is being read. The GFF3 results are combined with any other results produced within MAKER and are then used to generate hints for the gene predictors you run as to the location of introns/exons. This happens during the step where you see SNAP or Augustus running again and again. They also feed into the AED score used for selecting from the pool of gene models, and can be used to add UTR to the resulting gene models. If you specify alt_splice=1 they can even be used to infer alternate splice forms. For reference purposes, they will end up in the MAKER results with the tag est_gff:Cufflinks.
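Since the evidence ends up in the results with the tag est_gff:Cufflinks, a quick check is to count the features in the output GFF3 by their source column (column 2). The following is a minimal sketch, not part of MAKER: the function name count_sources is hypothetical, and the sample rows are invented purely to make the example self-contained.

```python
from collections import Counter

def count_sources(lines):
    """Count GFF3 feature lines by the value of their source column."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 2:
            counts[cols[1]] += 1
    return counts

# Invented sample rows for illustration only (not real MAKER output).
rows = [
    ("scaffold_56", "est_gff:Cufflinks", "match_part", "248833", "249471", ".", "+", ".", "ID=a"),
    ("scaffold_56", "est_gff:Cufflinks", "match_part", "253262", "253362", ".", "+", ".", "ID=b"),
    ("scaffold_56", "maker", "gene", "248000", "254000", ".", "+", ".", "ID=c"),
]
sample = ["##gff-version 3\n"] + ["\t".join(r) + "\n" for r in rows]

counts = count_sources(sample)
for source, n in sorted(counts.items()):
    print(source, n)
```

With a real output file, pass an open file handle instead of the sample list; a count of zero for est_gff:Cufflinks would suggest the evidence was not read.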

Thanks,
Carson



From: <Sean.Li at csiro.au>
Date: Wednesday, 1 August, 2012 9:21 PM
To: <maker-devel at yandell-lab.org>
Subject: [maker-devel] The est_gff option

Hello,

We are trying to add RNA-Seq data to a MAKER run. As the data have already been aligned to our draft scaffolds by Cufflinks, we simply passed the corresponding GFF file to the est_gff option. The GFF file has the following format:

....
scaffold_56     Cufflinks       match_part      248833  249471  .       +       .       ID=1:TCONS_00039698:exon-1;Name=1:TCONS_00039698;Parent=1:TCONS_00039698;Target=1:TCONS_00039698 121525597 121526235 +;
scaffold_56     Cufflinks       match_part      253262  253362  .       +       .       ID=1:TCONS_00039698:exon-2;Name=1:TCONS_00039698;Parent=1:TCONS_00039698;Target=1:TCONS_00039698 121526236 121526336 +;
....
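For readers unfamiliar with the layout, the lines above follow the nine-column, tab-separated GFF3 convention, with key=value attributes in the last column. Below is a small sketch of how one such line can be taken apart. The function name parse_gff3_line is hypothetical, and the example assumes the columns are separated by tabs as in a real GFF3 file (the email rendering shows spaces).

```python
def parse_gff3_line(line):
    """Split one GFF3 feature line into named columns plus an attribute dict."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns, got %d" % len(cols))
    seqid, source, ftype, start, end, score, strand, phase, attrs = cols
    attributes = {}
    for pair in attrs.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate the trailing ';' seen in the lines above
        key, _, value = pair.partition("=")
        attributes[key] = value
    return {
        "seqid": seqid,
        "source": source,
        "type": ftype,
        "start": int(start),
        "end": int(end),
        "strand": strand,
        "attributes": attributes,
    }

# First line of the excerpt above, rebuilt with explicit tabs.
example = "\t".join([
    "scaffold_56", "Cufflinks", "match_part", "248833", "249471", ".", "+", ".",
    "ID=1:TCONS_00039698:exon-1;Name=1:TCONS_00039698;"
    "Parent=1:TCONS_00039698;Target=1:TCONS_00039698 121525597 121526235 +;",
])
feature = parse_gff3_line(example)
print(feature["type"], feature["attributes"]["Parent"])
```

Note that the Target value keeps its internal spaces, which is why the line is split on tabs rather than on arbitrary whitespace.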

All the prediction algorithms were trained before running MAKER, and we also provided protein homology evidence. Checking the run.log file, I couldn't find any records showing how the RNA-Seq data was used in the prediction process (no blastn or exonerate for EST). Is there anything I missed that kept the RNA-Seq evidence from being included in the prediction? Do I need to turn est2genome on? I suspect this option is only used when the algorithms aren't trained properly. Or do we need to set the values of pcov_blastn and eval_blastn a bit lower?

Thanks!

Regards,
Sean


