[maker-devel] How to preserve human-friendly IDs when reannotating

Jeremy Semeiks jeremy.semeiks at utsw.edu
Mon Sep 10 09:02:25 MDT 2012


OK, thanks. So if I understand correctly, preserving human-friendly IDs
requires setting just three options: map_forward=1,
maker_gff=<my_human_friendly.gff>, and model_pass=1. (Or, instead of the
last two, I could equivalently set model_gff to a GFF containing only
models.)
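
As a sanity check before launching a long run, the option combination above could be verified with a small script. This is only a sketch: it assumes the control file uses `key=value` lines with optional trailing `#` comments, and `parse_ctl` and `names_will_map_forward` are hypothetical helper names, not part of MAKER.

```python
def parse_ctl(text):
    """Parse control-file text (assumed key=value with '#' comments) into a dict."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def names_will_map_forward(opts):
    """True if map_forward=1 and models are passed through by either route."""
    if opts.get("map_forward") != "1":
        return False
    via_maker_gff = bool(opts.get("maker_gff")) and opts.get("model_pass") == "1"
    via_model_gff = bool(opts.get("model_gff"))
    return via_maker_gff or via_model_gff


# Illustrative control-file excerpt; the file name is a placeholder.
example = """\
map_forward=1 #map names forward
maker_gff=my_human_friendly.gff
model_pass=1
model_gff=
"""
print(names_will_map_forward(parse_ctl(example)))  # prints True
```

The check only confirms that the options are set as described in this thread; it says nothing about whether MAKER will actually find a matching model for each name.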

A couple of new issues came up when I tried to run with these options. I
started maker like this:

/usr/bin/time mpiexec -n 10 maker -q < /dev/null > maker.oe 2>&1

1. I get a bunch of messages like the following, with varying line numbers:

DBD::SQLite::db do failed: database is locked at
/home/jrs/maker-2.26-beta/bin/../lib/GFFDB.pm line 186.

I saw that this came up in another thread
<https://groups.google.com/forum/?fromgroups=#!topic/maker-devel/TscBgbQfBX4>,
but I'm not sure it was ever resolved, nor whether it will affect my
reannotation results (as I'm not sure what "your GFF3 results will not be
integrated" means). This error did not come up the last time I ran maker
for reannotation with similar options in a different directory. Both my
current directory and my tmp directory are locally mounted, i.e. not NFS.

2. Both in this run and in previous runs, I get a lot of lines like this,
seemingly at random:

Warning: unable to close filehandle DF properly.


On Mon, Sep 10, 2012 at 6:01 AM, Carson Holt <carsonhh at gmail.com> wrote:

> The map_forward option requires that the pass option for the gene models
> be turned on.  Otherwise you will have to do some spatial overlap test
> outside of MAKER.
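
A spatial overlap test of the kind mentioned above might look roughly like the following. This is a minimal sketch under stated assumptions: models are given as hypothetical `(id, seqid, strand, start, end)` tuples with 1-based inclusive coordinates, and the 50% `min_fraction` cutoff is an arbitrary illustration, not a MAKER default.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length of the overlap between two 1-based inclusive intervals (0 if none)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def map_by_overlap(old_models, new_models, min_fraction=0.5):
    """Assign each new model the old ID it overlaps most on the same scaffold and strand."""
    mapping = {}
    for new_id, seqid, strand, start, end in new_models:
        best_id, best_len = None, 0
        for old_id, o_seqid, o_strand, o_start, o_end in old_models:
            if (o_seqid, o_strand) != (seqid, strand):
                continue
            length = overlap(start, end, o_start, o_end)
            if length > best_len:
                best_id, best_len = old_id, length
        # Require the overlap to cover at least min_fraction of the new model.
        if best_id is not None and best_len >= min_fraction * (end - start + 1):
            mapping[new_id] = best_id
    return mapping


# Made-up coordinates for illustration only.
old = [("mymold_09652", "scaffold353", "+", 1000, 2000)]
new = [
    ("gene-0.6", "scaffold353", "+", 1100, 2100),
    ("gene-0.7", "scaffold353", "-", 1100, 2100),
]
print(map_by_overlap(old, new))  # prints {'gene-0.6': 'mymold_09652'}
```

Two new models can still claim the same old ID under this scheme, so the result would need a uniqueness check before any renaming.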
>
> If you have a new assembly, you can try mapping the old models onto the
> new assembly using the old transcripts as input to the est= option and
> setting est2genome=1 (nothing else set, i.e. no repeat masking etc.).  Then
> there is an undocumented option that is still a little buggy (which is why
> it is still undocumented).  Add the line est_forward=1 to your control
> files.  This tells MAKER to copy names from the ESTs, build the models
> directly from their alignment, and to do other things to try and make a
> 1 to 1 match across the genome.  You will have to manually check that it
> is 1 to 1 in the end (as I said, still a little buggy and hence
> undocumented).  Use the resulting file as input to the model_gff option on
> a separate run with map_forward=1 for additional reannotation with more
> evidence, etc. where you want to still be able to map names forward.
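
The manual 1 to 1 check described above could be partly automated along these lines. This is only a sketch: it assumes the transferred name ends up in the `Name` attribute of `mRNA` features in the resulting GFF3, which should be confirmed against the actual output, and `name_counts` and `check_one_to_one` are hypothetical helpers.

```python
from collections import Counter


def name_counts(gff_text, feature_type="mRNA"):
    """Count how often each Name attribute occurs on features of the given type."""
    counts = Counter()
    for line in gff_text.splitlines():
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        if "Name" in attrs:
            counts[attrs["Name"]] += 1
    return counts


def check_one_to_one(gff_text, expected_names):
    """Return (names seen more than once, expected names never seen)."""
    counts = name_counts(gff_text)
    duplicated = sorted(n for n, c in counts.items() if c > 1)
    missing = sorted(set(expected_names) - set(counts))
    return duplicated, missing


def row(start, end, attrs):
    """Build one illustrative GFF3 mRNA line (made-up coordinates)."""
    return "\t".join(["scaffold1", "maker", "mRNA", str(start), str(end), ".", "+", ".", attrs])


gff = "\n".join([
    "##gff-version 3",
    row(100, 900, "ID=a;Name=mymold_09652T0"),
    row(5000, 5900, "ID=b;Name=mymold_09652T0"),
    row(7000, 7900, "ID=c;Name=mymold_00010T0"),
])
expected = ["mymold_09652T0", "mymold_00010T0", "mymold_00011T0"]
print(check_one_to_one(gff, expected))
# prints (['mymold_09652T0'], ['mymold_00011T0'])
```

Duplicated names point to transcripts that aligned to more than one locus; missing names point to transcripts that did not map at all.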
>
> From: Jeremy Semeiks <jeremy.semeiks at utsw.edu>
> Date: Sunday, 9 September, 2012 3:49 PM
> To: <maker-devel at yandell-lab.org>
> Subject: [maker-devel] How to preserve human-friendly IDs when
> reannotating
>
> Hi all,
>
> I have sequenced some novel fungal genomes, and I am annotating them with
> maker-2.26-beta. The entire project is pretty iterative, in the sense that
> I first get some seemingly-sane annotation sets, then analyze and compare
> the proteomes biologically, then reannotate when new data comes in or as I
> learn more about how maker works. Because I have already attached
> biological meaning to some of my proteins, I would like to retain the same
> human-friendly IDs across annotations. E.g., if maker suddenly finds 1,000
> new proteins on a reannotation run because I turned on keep_preds, then I
> don't want the transcript formerly known as mymold_09652T0 to become
> mymold_10698T0 when I run maker_map_ids; I want to keep it named
> mymold_09652T0.
>
> So, is there any built-in way to preserve human-friendly IDs, or do I need
> to write my own script for this? I have tried setting map_forward=1 and
> maker_gff=<the GFF file output by the previous run of maker_map_ids>, but
> setting these seems to preserve neither the human-friendly IDs nor even the
> original IDs. (E.g., protein "genemark-scaffold353-processed-gene-0.9-mRNA-1"
> changed its name to "genemark-scaffold353-processed-gene-0.6-mRNA-1" when
> reannotated.) I haven't turned on any of the *_pass options, e.g.
> protein_pass; would this be relevant?
>
> Extra credit question: I am making some mate-pair libraries for these
> fungi; when I re-assemble, that will completely change my scaffold names.
> Is there any easy way to preserve human-friendly transcript names in this
> case? As with the above simpler case, I think it would be pretty easy to
> transfer 90% of the names just by doing an all-vs-all blastp between two
> annotation sets and fishing out the best hits, but the remaining 10% might
> be a headache.
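
The best-hit idea above can be sketched as follows. The assumptions here are that blastp was run with new proteins as queries against old proteins and written in the standard 12-column tabular format (`-outfmt 6`, bitscore in the last column); `best_hits` and `transfer_names` are hypothetical helpers, and the IDs in the example are made up.

```python
from collections import defaultdict


def best_hits(tabular_text):
    """Return {query: (subject, bitscore)}, keeping the top-scoring hit per query."""
    best = {}
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def transfer_names(tabular_text):
    """Map new IDs to old names only where exactly one new protein claims the old name."""
    claimed = defaultdict(list)
    for new_id, (old_id, _score) in best_hits(tabular_text).items():
        claimed[old_id].append(new_id)
    mapping = {news[0]: old for old, news in claimed.items() if len(news) == 1}
    ambiguous = {old: sorted(news) for old, news in claimed.items() if len(news) > 1}
    return mapping, ambiguous


# Columns: qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
rows = [
    ("new_0001", "mymold_09652T0", 98.0, 300, 5, 0, 1, 300, 1, 300, "1e-150", 550),
    ("new_0001", "mymold_00010T0", 40.0, 120, 70, 2, 1, 120, 1, 120, "1e-10", 80),
    ("new_0002", "mymold_00010T0", 95.0, 250, 12, 0, 1, 250, 1, 250, "1e-120", 400),
    ("new_0003", "mymold_00010T0", 94.0, 250, 15, 0, 1, 250, 1, 250, "1e-118", 390),
]
hits = "\n".join("\t".join(str(v) for v in r) for r in rows)
mapping, ambiguous = transfer_names(hits)
print(mapping)    # prints {'new_0001': 'mymold_09652T0'}
print(ambiguous)  # prints {'mymold_00010T0': ['new_0002', 'new_0003']}
```

The ambiguous set corresponds to the hard remainder mentioned above: paralogs and split or merged models that a one-directional best hit cannot resolve.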
>
> Thanks,
> Jeremy
> Grad student, Grishin lab
> UT Southwestern, Dallas TX
> 510.385.8959