[maker-devel] How to preserve human-friendly IDs when reannotating

Jeremy Semeiks jeremy.semeiks at utsw.edu
Tue Sep 11 17:56:23 MDT 2012


maker 2.26.

And I have verified for myself that the three options I mentioned below
suffice to preserve human-friendly IDs when reannotating.
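
For anyone finding this thread later, here is a minimal sketch of how those
three settings might look in maker_opts.ctl. The GFF filename is a
placeholder, and the comments are explanatory rather than copied from
MAKER's own template.

```
# maker_opts.ctl (excerpt)
maker_gff=my_human_friendly.gff  # placeholder: GFF3 written by the previous maker_map_ids run
model_pass=1                     # pass the gene models in maker_gff through to this run
map_forward=1                    # carry names forward from the old models
```

As noted in the quoted message below, setting model_gff to a GFF containing
only models should be equivalent to the maker_gff/model_pass pair.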

Thanks,
J

On Tue, Sep 11, 2012 at 3:33 PM, Carson Holt <carsonhh at gmail.com> wrote:

> Which version of MAKER are you running (maker --version)?
>
> --Carson
>
>
>
> From: Jeremy Semeiks <jeremy.semeiks at utsw.edu>
> Date: Monday, 10 September, 2012 11:02 AM
> To: Carson Holt <carsonhh at gmail.com>
> Cc: <maker-devel at yandell-lab.org>
> Subject: Re: [maker-devel] How to preserve human-friendly IDs when
> reannotating
>
> OK, thanks. So if I understand correctly, to preserve human-friendly IDs
> requires setting just three options: map_forward=1,
> maker_gff=<my_human_friendly.gff>, and model_pass=1. (Or instead of the
> last two I could equivalently just set model_gff to a GFF containing only
> models.)
>
> A couple new issues came up when I tried to run with these options. I
> started maker like this:
>
> /usr/bin/time mpiexec -n 10 maker -q < /dev/null > maker.oe 2>&1
>
> 1. I get a bunch of messages like the following, with varying line numbers:
>
> DBD::SQLite::db do failed: database is locked at
> /home/jrs/maker-2.26-beta/bin/../lib/GFFDB.pm line 186.
>
> I saw that this came up in another thread
> <https://groups.google.com/forum/?fromgroups=#!topic/maker-devel/TscBgbQfBX4>,
> but I'm not sure it was ever resolved, nor whether it will affect my
> reannotation results (as I'm not sure what "your GFF3 results will not be
> integrated" means). This error did not come up the last time I ran maker
> for reannotation with similar options in a different directory. And both my
> current directory and my tmp directory are locally mounted, ie not NFS.
>
> 2. Both in this run and in previous runs, I get a lot of lines like this,
> seemingly at random:
>
> Warning: unable to close filehandle DF properly.
>
>
> On Mon, Sep 10, 2012 at 6:01 AM, Carson Holt <carsonhh at gmail.com> wrote:
>
>> The map_forward option requires that the pass option for the gene models
>> be turned on.  Otherwise you will have to do some spatial overlap test
>> outside of MAKER.
>>
>> If you have a new assembly, you can try mapping the old models onto the
>> new assembly using the old transcripts as input to the est= and setting
>> est2genome=1 (nothing else set, i.e., no repeat masking etc.).  Then there is
>> an undocumented option that is still a little buggy (hence why it is still
>> undocumented).  Add the line est_forward=1 to your control files.  This
>> tells MAKER to copy names from the ESTs, build the models directly from
>> their alignment, and to do other things to try and make a 1 to 1 match
>> across the genome.  You will have to manually check that it is 1 to 1 in
>> the end (as I said still a little buggy and hence undocumented).  Use the
>> resulting file as input to the model_gff option on a separate run with
>> map_forward=1 for additional reannotation with more evidence, etc., where you
>> want to still be able to map names forward.
>>
>> From: Jeremy Semeiks <jeremy.semeiks at utsw.edu>
>> Date: Sunday, 9 September, 2012 3:49 PM
>> To: <maker-devel at yandell-lab.org>
>> Subject: [maker-devel] How to preserve human-friendly IDs when
>> reannotating
>>
>> Hi all,
>>
>> I have sequenced some novel fungal genomes, and I am annotating them with
>> maker-2.26-beta. The entire project is pretty iterative, in the sense that
>> I first get some seemingly-sane annotation sets, then analyze and compare
>> the proteomes biologically, then reannotate when new data comes in or as I
>> learn more about how maker works. Because I have already attached
>> biological meaning to some of my proteins, I would like to retain the same
>> human-friendly IDs across annotations. Eg, if maker suddenly finds 1,000
>> new proteins on a reannotation run because I turned on keep_preds, then I
>> don't want the transcript formerly known as mymold_09652T0 to become
>> mymold_10698T0 when I run maker_map_ids; I want to keep it named
>> mymold_09652T0.
>>
>> So, is there any built-in way to preserve human-friendly IDs, or do I
>> need to write my own script for this? I have tried setting map_forward=1
>> and maker_gff=<the GFF file output by the previous run of maker_map_ids>,
>> but setting these seems to preserve neither the human-friendly IDs nor even
>> the original IDs. (Eg, protein
>> "genemark-scaffold353-processed-gene-0.9-mRNA-1" changed its name to
>> "genemark-scaffold353-processed-gene-0.6-mRNA-1" when reannotated.) I
>> haven't turned on any of the *_pass options, eg protein_pass; would this be
>> relevant?
>>
>> Extra credit question: I am making some mate-pair libraries for these
>> fungi; when I re-assemble, that will completely change my scaffold names.
>> Is there any easy way to preserve human-friendly transcript names in this
>> case? As with the above simpler case, I think it would be pretty easy to
>> transfer 90% of the names just by doing an all-vs-all blastp between two
>> annotation sets and fishing out the best hits, but the remaining 10% might
>> be a headache.
>>
>> Thanks,
>> Jeremy
>> Grad student, Grishin lab
>> UT Southwestern, Dallas TX
>> 510.385.8959
>>
>
>
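
On the extra-credit question quoted above (transferring names between two
annotation sets via an all-vs-all blastp), a rough sketch of the "fish out
the best hits" step might look like the following. It assumes both searches
were saved in BLAST's standard 12-column tabular format, with the bitscore
in the last column, and it keeps only reciprocal best hits. The function
names are hypothetical and none of this is part of MAKER.

```python
def best_hits(blast_lines):
    """Return {query_id: (subject_id, bitscore)}, keeping the top bitscore per query.

    Assumes 12-column tabular BLAST output: query in column 1,
    subject in column 2, bitscore in column 12.
    """
    best = {}
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(old_vs_new_lines, new_vs_old_lines):
    """Map new_id -> old_id for pairs that are each other's best hit."""
    forward = best_hits(old_vs_new_lines)   # old protein -> its best new protein
    backward = best_hits(new_vs_old_lines)  # new protein -> its best old protein
    mapping = {}
    for old_id, (new_id, _score) in forward.items():
        if new_id in backward and backward[new_id][0] == old_id:
            mapping[new_id] = old_id
    return mapping
```

Passing two open file handles (old-vs-new and new-vs-old results) returns a
dict of new IDs to old IDs. Proteins without a reciprocal best hit are left
out of the dict, which is roughly the "remaining 10%" that would still need
manual attention.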