[maker-devel] gff pass thru problem and unsupported EST nucleotides

Carson Holt carsonhh at gmail.com
Mon Feb 24 11:34:28 MST 2014


Actually that is not true.  CDS IDs can be the same or different.  MAKER
doesn’t care either way.  Both are valid in GFF3.  Having the same ID just
allows them to be put together by some GMOD viewers without having to go
through a container feature.

—Carson
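
To illustrate the point about CDS IDs, here is a minimal Python sketch. It is not MAKER code. It shows how a consumer that groups CDS rows by their Parent attribute gets the same result whether the rows share one ID or carry distinct IDs. The function names, the contig name "ctg1", and the IDs are hypothetical.

```python
from collections import defaultdict

def cds_by_parent(gff_text):
    """Group CDS intervals by Parent, without looking at the CDS ID."""
    groups = defaultdict(list)
    for line in gff_text.splitlines():
        cols = line.split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != "CDS":
            continue
        # Column 9 holds key=value pairs separated by semicolons.
        attrs = dict(a.split("=", 1) for a in cols[8].split(";") if "=" in a)
        # A feature may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                groups[parent].append((int(cols[3]), int(cols[4])))
    return dict(groups)

def cds_line(start, end, phase, cds_id):
    """Build one hypothetical CDS row for the example mRNA 'mRNA1'."""
    return "\t".join(["ctg1", "maker", "CDS", str(start), str(end),
                      ".", "+", str(phase), f"ID={cds_id};Parent=mRNA1"])

# Style 1: every CDS row of the transcript shares one ID.
shared = "\n".join([cds_line(100, 200, 0, "mRNA1-cds"),
                    cds_line(300, 400, 1, "mRNA1-cds")])
# Style 2: every CDS row has its own ID.
distinct = "\n".join([cds_line(100, 200, 0, "mRNA1-cds1"),
                      cds_line(300, 400, 1, "mRNA1-cds2")])

print(cds_by_parent(shared) == cds_by_parent(distinct))  # True
```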

On 2/24/14, 11:31 AM, "Daniel Ence" <dence at genetics.utah.edu> wrote:

>Hi Megan, 
>
>One problem with the GFF3 that you attached is that the IDs for the CDS
>features are built incorrectly. All of the CDS features for a given mRNA
>or transcript should have the same ID. The CDS features in your GFF3 have
>IDs that use the exon name.
>
>You can fix it with this command-line perl:
>cat part_passthru.gff | perl -ane 'if(/\tCDS\t/){ chomp;
>/Parent=([^;\s]+)/; my $parent=$1; s/ID=([^\;]+)/ID=$parent-cds/; print
>"$_\n"}else{print $_}' > fixed.gff3
>
>It just fixes the ID attributes in all of the CDS features. Try it on the
>test gff3 you sent and let me know if it works. I can't test it myself
>without the fasta file that you are annotating.
>
>Thanks,
>Daniel
>
>Daniel Ence
>Graduate Student
>Eccles Institute of Human Genetics
>University of Utah
>15 North 2030 East, Room 2100
>Salt Lake City, UT 84112-5330
>________________________________________
>From: maker-devel [maker-devel-bounces at yandell-lab.org] on behalf of
>Carson Holt [carsonhh at gmail.com]
>Sent: Monday, February 24, 2014 11:18 AM
>To: Megan; maker-devel at yandell-lab.org
>Subject: Re: [maker-devel] gff pass thru problem and unsupported EST
>nucleotides
>
>The -fix_nucleotides flag is added to the command line (i.e. maker
>-fix_nucleotides).  It is there so you are aware that there is an
>issue with your fasta file that will cause things downstream to fail.
>MAKER can fix the errors for you, but first it gives a warning designed to
>make you look at the file and validate it.  Why would you want to do this?
> For example, what if you provided protein sequence to the EST option
>accidentally, you wouldn’t want MAKER to just proceed.  You want a warning
>so you can check first.  If your file is in fact EST data, then set the
>flag and those characters will be changed to N’s in the fixed fasta
>sequence, otherwise those characters will cause errors in downstream tools
>like exonerate, and even some downstream GMOD tools, so they can’t be
>allowed to remain as is.
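
The replacement described above can be sketched in Python. This is only an illustration of turning the ambiguity codes named in the warning (RYKMSWBDHV) into N. It is not MAKER's implementation of -fix_nucleotides. The function name is hypothetical, and preserving letter case is an assumption.

```python
# Sketch only: not MAKER's implementation of -fix_nucleotides.
AMBIGUOUS = "RYKMSWBDHV"  # the codes listed in the MAKER warning

# Map each ambiguity code to N, keeping upper/lower case (an assumption).
_TABLE = str.maketrans(
    AMBIGUOUS + AMBIGUOUS.lower(),
    "N" * len(AMBIGUOUS) + "n" * len(AMBIGUOUS),
)

def mask_ambiguous(fasta_text):
    """Return FASTA text with ambiguity codes in sequence lines set to N."""
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            out.append(line)  # header lines are left untouched
        else:
            out.append(line.translate(_TABLE))
    return "\n".join(out) + "\n"

example = ">est1 RYK sample\nACGTRYKM\nacgtswbd\n"
print(mask_ambiguous(example), end="")
```

Every sequence is kept and only the individual characters change, so ESTs that contain ambiguity codes are not discarded.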
>
>For the GFF3 file, there is almost definitely a logic issue in the file
>(the modENCODE validator won’t check for those).  This can be from prior
>manipulation of the GFF3 file.  For example, IDs for a gene that are the
>same across two contigs (technically valid but a logic error).  The GFF3
>error message will normally give the ID of the feature causing the issue.
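
One way to look for the kind of logic error mentioned above (the same ID used on two contigs) is a small Python check like the sketch below. It is not part of MAKER or of any validator. The function name and the sample IDs are hypothetical.

```python
from collections import defaultdict

def ids_on_multiple_seqs(gff_text):
    """Map each ID seen on more than one seqid to the sorted list of seqids."""
    seen = defaultdict(set)
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature rows
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip rows that are not 9-column feature lines
        for attr in cols[8].split(";"):
            key, _, value = attr.partition("=")
            if key.strip() == "ID" and value:
                seen[value].add(cols[0])
    return {fid: sorted(seqs) for fid, seqs in seen.items() if len(seqs) > 1}

sample = (
    "##gff-version 3\n"
    "ctg1\tmaker\tgene\t1\t500\t.\t+\t.\tID=gene1;Name=a\n"
    "ctg2\tmaker\tgene\t1\t300\t.\t-\t.\tID=gene1;Name=b\n"
    "ctg2\tmaker\tgene\t400\t900\t.\t-\t.\tID=gene2\n"
)
print(ids_on_multiple_seqs(sample))
```

Rows that repeat an ID on the same contig, such as the CDS rows of one transcript, are not reported, since only IDs that span more than one seqid are flagged.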
>
>I could also take a look for you.  You can upload the GFF3 file here —>
>http://weatherby.genetics.utah.edu/cgi-bin/mwas/bug.cgi
>Click on 'new guest account', then e-mail me back your guest ID so I know
>which files to review.
>
>Thanks,
>Carson
>
>
>
>On 2/24/14, 12:02 AM, "Megan" <hedgyx at yahoo.com> wrote:
>
>>Maker folks,
>>I am re-annotating a single contig and I am having a few problems.
>>
>>First, I am having trouble passing through a Maker derived gff (from
>>Maker 2.09, with some modifications to gene names and functional
>>information added).  The gff file passes the modencode validator but
>>Maker always fails on the first gene in the file, regardless of which
>>gene comes first.  So it appears to be a systematic error across the
>>entire file.  The Maker error is "Check your input GFF3 file for errors!
>>(from GFFDB)".   I have tried Maker 2.10 and 2.31, using both genome_gff
>>with model_pass=1 and pred_gff.  Attached is a gff with the first 2
>>genes.
>>
>>Second, when I updated to Maker 2.31, Maker now complains that my EST
>>fasta file has nucleotides that are not supported [RYKMSWBDHV].  It
>>suggests "set -fix_nucleotides on the command line to fix this
>>automatically".  Is -fix_nucleotides a Maker flag?  What exactly does
>>it do?  Does it remove the entire sequence or replace ambiguous bases
>>with a randomly selected one?  Half of my 20k ESTs contain these
>>characters, so I don't want to throw them out entirely.
>>
>>Also, just curious, has Maker never supported these characters but just
>>never complained?  I used this EST data set with Maker 2.09.  I did note
>>poor EST coverage, but thought it was an issue with the EST data itself.
>>
>>I appreciate any suggestions.
>>Thanks,
>>Megan
>>_______________________________________________
>>maker-devel mailing list
>>maker-devel at box290.bluehost.com
>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>





