[maker-devel] thousands of array-refs in merged .gff after 'gff3_merge'

Carson Holt carsonhh at gmail.com
Tue Apr 22 14:35:31 MDT 2014


You can provide a comma-separated list of files to est_gff.  Also, in my
experience Cufflinks gives far better results than TopHat.  TopHat tends to
produce a lot of false positives that adversely affect the overall quality of
the gene models, so I usually recommend that people use the Cufflinks output
and not even include the TopHat results in their run.
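
As a sketch of the comma-separated form, the est_gff line in maker_opts.ctl
might look like the fragment below. Both file names are hypothetical
placeholders, and writing the list without spaces around the comma is an
assumption rather than something stated in this thread.

```
#-----EST Evidence (for best results provide a file for at least one)
est_gff=cufflinks_sample1.gff3,cufflinks_sample2.gff3 #hypothetical file names, comma-separated
```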

Thanks,
Carson


From:  Michael Seidl <michael.seidl at wur.nl>
Date:  Tuesday, April 22, 2014 at 2:30 PM
To:  Carson Holt <carsonhh at gmail.com>
Subject:  Re: [maker-devel] thousands of array-refs in merged .gff after
'gff3_merge'

Dear Carson,

Thanks a lot, I will try it. More importantly, you pointed me to a mistake in
my procedure, which will make me rerun MAKER anyway :p

I want MAKER to use the tophat.gff alongside the Cufflinks ESTs (fa + gff) as
well as a protein.fa. I currently provide them as follows:

#-----EST Evidence (for best results provide a file for at least one)
est=/home/michael/data/side/alternaria/maker_annotation/Alternaria-CBS-916.96/transcripts.cds.fa #set of ESTs or assembled mRNA-seq
altest= #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/home/michael/data/side/alternaria/maker_annotation/Alternaria-CBS-916.96/transcripts.gff3 #aligned ESTs or mRNA-seq from a
altest_gff= #aligned ESTs from a closely related species in GFF3 format

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=/home/michael/data/side/alternaria/maker_annotation/fungal_proteins.fa #protein sequence file in fasta format (i.e. from mu
protein_gff= #aligned protein homology evidence from an external GFF3 file


Can I give the tophat.gff as altest_gff, or does MAKER internally treat
est_gff and altest_gff differently? Sorry for the question, but I had not
yet realized that other_gff is ignored during the MAKER analysis.

Thanks a lot 
Michael


On Tue, Apr 22, 2014 at 9:10 PM, Carson Holt <carsonhh at gmail.com> wrote:
> The issue was indeed caused by a bug in using the other_gff= file option.
> Could you place the attached file in .../maker/lib/?  Then you can rerun MAKER
> to test whether it fixes the problem ('maker -a' gives a fast rerun without
> repeating the analyses).
> 
> Alternatively, if you don't feel like rerunning everything, you can filter
> out the lines using --> grep -v "ARRAY" file.gff
> 
> The other_gff file is not used in any part of the analysis; it is just a
> convenience option that prints any text given to it into the final GFF3 file.
> Filtering those lines out therefore gives the same result as leaving other_gff
> blank when running MAKER. You can then use 'gff3_merge -s tophat.gff
> merged_genome.gff' to merge the desired extra lines back into your file
> outside of MAKER.
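
The filtering step described above could be sketched as a small shell
function. This is only an illustration: the function name strip_array_refs and
the file names in the usage comments are hypothetical, and the pattern assumes
each stray line consists solely of ARRAY(0x<hex>), which is stricter than the
plain grep -v "ARRAY" quoted above. If the stray text shares a line with other
content, the broader pattern from the thread is the one to use.

```shell
#!/bin/sh
# Sketch only: strip stray Perl array-reference lines such as
# "ARRAY(0x41f6ea0)" from a GFF3 file.
# strip_array_refs is a hypothetical helper, not part of MAKER.
strip_array_refs() {
    # -v inverts the match; -E enables extended regular expressions.
    # Only lines made up entirely of ARRAY(0x<hex digits>) are dropped.
    grep -v -E '^ARRAY\(0x[0-9a-fA-F]+\)$' "$1"
}

# Example usage (file names are placeholders):
#   strip_array_refs merged_genome.gff > merged_genome.clean.gff
#   grep -c "ARRAY" merged_genome.clean.gff   # count of lines still matching
```

Anchoring the pattern to the whole line avoids discarding a legitimate feature
line that merely contains the text "ARRAY" in one of its attributes.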
> 
> Thanks,
> Carson
> 
> 
> 
> From: Michael Seidl <michael.seidl at wur.nl>
> Date: Tuesday, April 22, 2014 at 12:29 PM
> To: Carson Holt <carsonhh at gmail.com>
> Subject: Re: [maker-devel] thousands of array-refs in merged .gff after
> 'gff3_merge'
> 
> Hi Carson,
> 
> I uploaded the files as an archive.
> 
> Thanks
> Michael
> 
> 
> On Tue, Apr 22, 2014 at 5:04 PM, Carson Holt <carsonhh at gmail.com> wrote:
> In the base maker.output directory for the job, there will be a file with a
> .db extension.  Could you send that as well?  I'm leaning towards this being
> something odd happening with the GFF3 files used as input.  Particularly the
> other_gff= file.  Could you upload this file as well -->
> /home/michael/data/side/alternaria/maker_annotation/Alternaria-CBS-916.96/tophat.gff3.
> 
> --Carson
> 
> 
> From: Michael Seidl <michael.seidl at wur.nl>
> Date: Tuesday, April 22, 2014 at 8:56 AM
> To: Carson Holt <carsonhh at gmail.com>
> Subject: Re: [maker-devel] thousands of array-refs in merged .gff after
> 'gff3_merge'
> 
> Should be uploading right now...
> 
> Thanks Michael
> 
> 
> 
> On Tue, Apr 22, 2014 at 4:46 PM, Carson Holt <carsonhh at gmail.com> wrote:
> Could you pack up this directory for me --> /84/ED/scaffold3.1/ and upload it
> here --> http://weatherby.genetics.utah.edu/cgi-bin/mwas/bug.cgi
> 
> Thanks,
> Carson
> 
> 
> From: Michael Seidl <michael.seidl at wur.nl>
> Date: Tuesday, April 22, 2014 at 8:43 AM
> To: Carson Holt <carsonhh at gmail.com>
> Cc: "maker-devel at yandell-lab.org" <maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] thousands of array-refs in merged .gff after
> 'gff3_merge'
> 
> 
> On Tue, Apr 22, 2014 at 4:39 PM, Carson Holt <carsonhh at gmail.com> wrote:
> any
> 
> Dear Carson,
> 
> maker -version returns 2.31. Yes, also the individual scaffolds seem to
> contain ARRAY refs, e.g.
> find -name "*gff"  | xargs grep "ARRAY":
> 
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0x41f6ea0)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0xb87d888)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0xd343528)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0xb12fc48)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0xde02488)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0x8d4c698)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0x447a8a0)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0x4390048)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0xdbb4e00)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0xe3f1790)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0x438d570)
> ./84/ED/scaffold3.1/theVoid.scaffold3.1/evidence_6.gff:ARRAY(0xae00088
> 
> Cheers
> M
> 
> 
> 
> 
> 
> --
> Michael F Seidl, PhD
> Research Fellow (Postdoc)
> Laboratory of Phytopathology
> Wageningen University
> P.O. Box 8025, 6700 EE Wageningen
> Wageningen Campus, building 107 (Radix)
> Droevendaalsesteeg 1, 6708 PB Wageningen
> 
> Tel.: +31-317-481288
> Fax: +31-317-483412
> 
> Email: michael.seidl at wur.nl
> Website: http://www.php.wur.nl/UK/
> Twitter: @MFSeidl
> 
> www.disclaimer-uk.wur.nl <http://www.disclaimer-uk.wur.nl>
> 
> 
> 
> 



-- 
Michael F Seidl, PhD
Research Fellow (Postdoc)
Laboratory of Phytopathology
Wageningen University
P.O. Box 8025, 6700 EE Wageningen
Wageningen Campus, building 107 (Radix)
Droevendaalsesteeg 1, 6708 PB Wageningen
 
Tel.: +31-317-481288
Fax: +31-317-483412
 
Email: michael.seidl at wur.nl
Website: http://www.php.wur.nl/UK/
Twitter: @MFSeidl
 
www.disclaimer-uk.wur.nl <http://www.disclaimer-uk.wur.nl/>

