[maker-devel] Problem extracting fasta from a GFF file generated with MAKER

Barry Moore barry.utah at gmail.com
Tue Mar 25 11:51:38 MDT 2014


Hi Diana,

There is a Perl library, the Genome Annotation Library (GAL), that is designed to make writing code like this easy.  I just added a script to the library called gal_CDS_sequence, which you would run like this:

gal_CDS_sequence --translate genes.gff3 genome.fasta

GAL is meant to make quick scripts like this easy to write, so if you're comfortable with a bit of Perl, you can modify the existing scripts or write new ones that search, iterate over, and traverse the relationships between features in GFF3 files.
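
For anyone who wants to see the logic that a CDS extractor has to get right, here is a minimal sketch in plain Python. It is not GAL and not the code behind gal_CDS_sequence; the function names (read_fasta, parse_cds, cds_sequence, translate) are made up for illustration. It assumes a GFF3 file whose CDS rows carry a Parent attribute, and it uses the standard genetic code. The points it covers are the usual places where extraction goes wrong: GFF3 coordinates are 1-based and inclusive, minus-strand CDSs must be reverse-complemented after splicing, and the phase of the first CDS segment says how many leading bases to skip before the first complete codon.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def revcomp(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def read_fasta(handle):
    """Read FASTA lines into {sequence_id: sequence}."""
    genome, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                genome[name] = "".join(chunks)
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        genome[name] = "".join(chunks)
    return genome


def parse_cds(gff_lines):
    """Group CDS rows by Parent ID as (seqid, start, end, strand, phase)."""
    cds = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "CDS":
            continue
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                cds[parent].append(
                    (cols[0], int(cols[3]), int(cols[4]), cols[6], cols[7])
                )
    return cds


def cds_sequence(segments, genome):
    """Splice the CDS segments of one transcript, 5' to 3' on its own strand."""
    ordered = sorted(segments, key=lambda s: s[1])
    # GFF3 is 1-based inclusive, Python slices are 0-based half-open.
    seq = "".join(genome[s[0]][s[1] - 1:s[2]] for s in ordered)
    if ordered[0][3] == "-":
        seq = revcomp(seq)
        first = ordered[-1]  # highest coordinate is the 5' end on the minus strand
    else:
        first = ordered[0]
    phase = int(first[4]) if first[4].isdigit() else 0
    return seq[phase:]  # skip partial leading codon, if any


def translate(seq):
    """Translate complete codons; unknown codons become X, stops become *."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )
```

With something like this, a protein whose translation ends in "*" has its stop codon inside the annotated CDS, and a "*" anywhere before the last position is a sign that the strand, the segment order, or the phase was handled incorrectly (or that the gene model itself is suspect).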

You can access the library here:

http://www.sequenceontology.org/software/GAL.html

Support for GAL is available via the SO mailing list:

https://lists.sourceforge.net/lists/listinfo/song-devel

Hope that helps,

Barry

On Mar 24, 2014, at 5:11 PM, Diana Garnica Moreno wrote:

> Hi there,
> 
> We recently assembled a fungal genome using MAKER and got the gene models and the corresponding transcripts, predicted proteins, and GFF files. However, the predicted proteins do not include the stop codon, so I cannot tell which proteins are complete and which are incomplete at the 3' end. To solve that, I have used different programs to extract the FASTA sequences of the CDSs from the GFF file and the genome sequence. The problem is that with the tools I have tested I get the right sequence for some of the proteins and wrong sequences for others (with multiple stop codons, for example). I am not sure why this happens, and since it happens with different tools (several Python scripts and even gffread from Cufflinks), I do not know where the problem is. Could you please give me some advice on how to extract the right sequences with the stop codons included?
> 
> Thanks!
> 
> Diana

Barry Moore
Research Scientist
Dept. of Human Genetics
University of Utah
Salt Lake City, UT 84112
--------------------------------------------
(801) 585-3543



