[maker-devel] The est_gff option
Sean.Li at csiro.au
Wed Aug 1 20:45:10 MDT 2012
Dear Carson,
Thank you for your explanation! I now understand how MAKER interprets the RNA-Seq data. I searched for the est_gff tag and found it only in a binary db file, scaffold_56.db, which appears to be a SQLite file. Is that expected? Is this file generated only when EST data is provided or when alt_splice is set to 1? I ask because the db file did not exist after my first MAKER run, when no EST options were switched on.
Best regards,
Sean
From: Carson Holt [mailto:carsonhh at gmail.com]
Sent: Thursday, 2 August 2012 11:40 AM
To: Li, Sean (CMIS, Acton); maker-devel at yandell-lab.org
Subject: Re: [maker-devel] The est_gff option
The mRNA-seq reads from GFF3 don't need to be blasted or exonerate-polished because they are already aligned, so no run.log entry is generated. They are just loaded into an alignment object in memory and integrated directly. The run.log file is used primarily to track finished calls to external programs (i.e. to avoid partial results files on failure), so since there is nothing to run, there is no run.log entry. You will just see a message on STDERR saying that the GFF3 is being read.
The GFF3 results are combined with any other results produced within MAKER and are then used to generate hints about the location of introns and exons for the gene predictors you run. This happens during the step where you see SNAP or Augustus running again and again. They also contribute to the AED score used for selecting from the pool of gene models, and can be used to add UTRs to the resulting gene models. If you specify alt_splice=1, they can even be used to infer alternate splice forms. For reference purposes, they will end up in the MAKER results with the tag est_gff:Cufflinks.
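One way to confirm that the RNA-Seq evidence was passed through is to count features by source in the output GFF3. The sketch below is illustrative only: it assumes the est_gff:Cufflinks tag appears in the second (source) column of the output, and the function name `count_sources` and the file name in the usage note are hypothetical, not part of MAKER.

```python
from collections import Counter


def count_sources(lines, prefix="est_gff"):
    """Count GFF3 features whose source column starts with `prefix`.

    Assumption: the evidence tag (e.g. est_gff:Cufflinks) is stored in
    column 2 of a tab-separated GFF3 file.
    """
    counts = Counter()
    for line in lines:
        # Stop at the optional embedded FASTA section of a GFF3 file.
        if line.startswith("##FASTA"):
            break
        # Skip comments, directives, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 9 and cols[1].startswith(prefix):
            counts[cols[1]] += 1
    return counts
```

With a hypothetical output file, this could be used as `with open("scaffold_56.gff") as fh: print(count_sources(fh))`. An empty result would suggest the evidence was not read, although that conclusion depends on the assumption about the source column.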
Thanks,
Carson
From: <Sean.Li at csiro.au>
Date: Wednesday, 1 August, 2012 9:21 PM
To: <maker-devel at yandell-lab.org>
Subject: [maker-devel] The est_gff option
Hello,
We are trying to add RNA-Seq data to our MAKER run. As the reads have already been aligned to our draft scaffolds by Cufflinks, we simply gave the corresponding GFF file to the est_gff option. The GFF file has the following format:
....
scaffold_56 Cufflinks match_part 248833 249471 . + . ID=1:TCONS_00039698:exon-1;Name=1:TCONS_00039698;Parent=1:TCONS_00039698;Target=1:TCONS_00039698 121525597 121526235 +;
scaffold_56 Cufflinks match_part 253262 253362 . + . ID=1:TCONS_00039698:exon-2;Name=1:TCONS_00039698;Parent=1:TCONS_00039698;Target=1:TCONS_00039698 121526236 121526336 +;
....
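To make the structure of this evidence concrete, the following sketch groups match_part lines by their Parent attribute and derives the implied intron. It is not MAKER code: the function names are hypothetical, and it assumes the columns are tab-separated as in standard GFF3 (the Target attribute contains spaces, so splitting on whitespace would not work).

```python
from collections import defaultdict

# The two lines from the message, rebuilt with tab separators (assumed).
EXAMPLE = [
    "\t".join(["scaffold_56", "Cufflinks", "match_part", "248833", "249471",
               ".", "+", ".",
               "ID=1:TCONS_00039698:exon-1;Name=1:TCONS_00039698;"
               "Parent=1:TCONS_00039698;"
               "Target=1:TCONS_00039698 121525597 121526235 +;"]),
    "\t".join(["scaffold_56", "Cufflinks", "match_part", "253262", "253362",
               ".", "+", ".",
               "ID=1:TCONS_00039698:exon-2;Name=1:TCONS_00039698;"
               "Parent=1:TCONS_00039698;"
               "Target=1:TCONS_00039698 121526236 121526336 +;"]),
]


def parse_attributes(field):
    """Split a GFF3 attribute column (key=value;key=value;) into a dict."""
    attrs = {}
    for pair in field.strip().strip(";").split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def exons_by_parent(lines):
    """Group match_part intervals by (sequence ID, Parent attribute)."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "match_part":
            continue
        parent = parse_attributes(cols[8]).get("Parent")
        if parent is None:
            continue
        groups[(cols[0], parent)].append((int(cols[3]), int(cols[4])))
    return {key: sorted(parts) for key, parts in groups.items()}


def introns(exons):
    """Return gaps between consecutive sorted exons (1-based, inclusive)."""
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


groups = exons_by_parent(EXAMPLE)
exons = groups[("scaffold_56", "1:TCONS_00039698")]
print(exons)           # [(248833, 249471), (253262, 253362)]
print(introns(exons))  # [(249472, 253261)]
```

The two exons share one Parent, so they form a single transcript alignment with one intron between them.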
All the gene prediction algorithms were trained before running MAKER, and we also provided protein homology evidence. Looking through the run.log file, I couldn't find any records showing how the RNA-Seq data was used in the prediction process (there are no blastn or exonerate entries for ESTs). Have I missed something, so that the RNA-Seq evidence was not included in the prediction? Do I need to turn est2genome on? I suspected this option is only meant for cases where the algorithms have not been trained properly. Or do we need to lower the values of pcov_blastn and eval_blastn?
Thanks!
Regards,
Sean