[maker-devel] How to preserve human-friendly IDs when reannotating

Carson Holt carsonhh at gmail.com
Fri Sep 21 09:42:46 MDT 2012


That's good to know :-)

--Carson



From:  Jeremy Semeiks <jeremy.semeiks at utsw.edu>
Date:  Friday, 21 September, 2012 11:39 AM
To:  Carson Holt <carsonhh at gmail.com>
Cc:  <maker-devel at yandell-lab.org>
Subject:  Re: [maker-devel] How to preserve human-friendly IDs when
reannotating

For the record: After analyzing these runs, I have confirmed that neither
error I described (ie, "DBD::SQLite::db do failed" and "Warning: unable to
close filehandle DF properly") affects maker's protein output in any way I
can detect.
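
One way to make a check like this concrete is to compare the protein FASTA files from two runs record by record. The following is only a minimal sketch: it assumes each run's proteins have been collected into a single FASTA file and that record IDs are stable between the runs being compared. The function names are hypothetical, and this is not necessarily how the comparison above was done.

```python
def read_fasta(lines):
    """Parse FASTA lines into {record_id: sequence}.

    The record ID is taken to be the first whitespace-delimited word
    of the header line (an assumption about how the IDs are written).
    """
    records = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else ""
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def compare_proteomes(old, new):
    """Return (IDs only in old, IDs only in new, shared IDs whose sequences differ)."""
    only_old = sorted(set(old) - set(new))
    only_new = sorted(set(new) - set(old))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return only_old, only_new, changed
```

To use it, open the two FASTA files, pass each file object to read_fasta, and pass the resulting dictionaries to compare_proteomes. Three empty lists would mean no detectable difference at the protein level.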

Thanks,
J

On Thu, Sep 13, 2012 at 8:11 AM, Carson Holt <carsonhh at gmail.com> wrote:
> The error --> DBD::SQLite::db do failed: database is locked at
> /home/jrs/maker-2.26-beta/bin/../lib/GFFDB.pm line 186.
> 
> The location of the specific error you are getting is probably benign.  It is
> a failure to alter the default cache size for the database.  The database
> should already be populated.  I'm planning on removing the SQLite database
> entirely in the future, perhaps in favor of something like tabix-based
> indexing of the GFF3 file.
> 
> Thanks,
> Carson
> 
> 
> 
> From:  Jeremy Semeiks <jeremy.semeiks at utsw.edu>
> Date:  Tuesday, 11 September, 2012 7:56 PM
> 
> To:  Carson Holt <carsonhh at gmail.com>
> Cc:  <maker-devel at yandell-lab.org>
> Subject:  Re: [maker-devel] How to preserve human-friendly IDs when
> reannotating
> 
> maker 2.26.
> 
> And I have verified for myself that the three options I mentioned below
> suffice to preserve human-friendly IDs when reannotating.
> 
> Thanks,
> J
> 
> On Tue, Sep 11, 2012 at 3:33 PM, Carson Holt <carsonhh at gmail.com> wrote:
>> Which version of MAKER are you running (maker --version)?
>> 
>> --Carson
>> 
>> 
>> 
>> From:  Jeremy Semeiks <jeremy.semeiks at utsw.edu>
>> Date:  Monday, 10 September, 2012 11:02 AM
>> To:  Carson Holt <carsonhh at gmail.com>
>> Cc:  <maker-devel at yandell-lab.org>
>> Subject:  Re: [maker-devel] How to preserve human-friendly IDs when
>> reannotating
>> 
>> OK, thanks. So if I understand correctly, to preserve human-friendly IDs
>> requires setting just three options: map_forward=1,
>> maker_gff=<my_human_friendly.gff>, and model_pass=1. (Or instead of the last
>> two I could equivalently just set model_gff to a GFF containing only models.)
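
The combination of options summarized above can be checked mechanically before launching a run. This is a minimal sketch that assumes the control file uses the usual "key=value #comment" layout. The helper names are hypothetical and nothing here is part of MAKER itself.

```python
def parse_ctl(lines):
    """Parse 'key=value #comment' lines from a control file into a dict."""
    opts = {}
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and section headers
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


def id_preservation_ok(opts):
    """True if the option combination described in the thread is present.

    Either map_forward=1 with maker_gff set and model_pass=1,
    or map_forward=1 with model_gff set.
    """
    if opts.get("map_forward") != "1":
        return False
    via_maker_gff = bool(opts.get("maker_gff")) and opts.get("model_pass") == "1"
    via_model_gff = bool(opts.get("model_gff"))
    return via_maker_gff or via_model_gff
```

For example, id_preservation_ok(parse_ctl(open("maker_opts.ctl"))) would return False if model_pass were left at 0 and model_gff were empty.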
>> 
>> A couple new issues came up when I tried to run with these options. I started
>> maker like this:
>> 
>> /usr/bin/time mpiexec -n 10 maker -q < /dev/null > maker.oe 2>&1
>> 
>> 1. I get a bunch of messages as follows, but with variable line number:
>> 
>> DBD::SQLite::db do failed: database is locked at
>> /home/jrs/maker-2.26-beta/bin/../lib/GFFDB.pm line 186.
>> 
>> I saw that this came up in another thread
>> <https://groups.google.com/forum/?fromgroups=#!topic/maker-devel/TscBgbQfBX4>
>> , but I'm not sure it was ever resolved, nor whether it will affect my
>> reannotation results (as I'm not sure what "your GFF3 results will not be
>> integrated" means). This error did not come up the last time I ran maker for
>> reannotation with similar options in a different directory. And both my
>> current directory and my tmp directory are locally mounted, ie not NFS.
>> 
>> 2. Both in this run and in previous runs, I get a lot of lines like this,
>> seemingly at random:
>> 
>> Warning: unable to close filehandle DF properly.
>> 
>> 
>> On Mon, Sep 10, 2012 at 6:01 AM, Carson Holt <carsonhh at gmail.com> wrote:
>>> The map_forward option requires that the pass option for the gene models be
>>> turned on.  Otherwise you will have to do some spatial overlap test outside
>>> of MAKER.
>>> 
>>> If you have a new assembly, you can try mapping the old models onto the new
>>> assembly using the old transcripts as input to the est= and setting
>>> est2genome=1 (nothing else set, i.e., no repeat masking etc.).  Then there is
>>> an undocumented option that is still a little buggy (hence why it is still
>>> undocumented).  Add the line est_forward=1 to your control files.  This
>>> tells MAKER to copy names from the ESTs, build the models directly from
>>> their alignment, and to do other things to try and make a 1 to 1 match
>>> across the genome.  You will have to manually check that it is 1 to 1 in the
>>> end (as I said still a little buggy and hence undocumented).  Use the
>>> resulting file as input to the model_gff option on a separate run with
>>> map_forward=1 for additional reannotation with more evidence, etc., where you
>>> want to still be able to map names forward.
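
The manual 1-to-1 check mentioned above could be partly automated along these lines. This is a sketch under stated assumptions: it assumes the resulting GFF3 has mRNA features whose Name attribute (falling back to ID) carries the name copied from the input transcript, which is not confirmed in this thread. The function names are hypothetical.

```python
from collections import Counter


def mrna_names(gff_lines):
    """Yield the Name attribute (or ID if Name is absent) of each mRNA feature."""
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence section, no more features
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        name = attrs.get("Name") or attrs.get("ID")
        if name:
            yield name


def one_to_one_report(expected_names, gff_lines):
    """Return (missing, duplicated, unexpected) names relative to the input transcripts."""
    expected = set(expected_names)
    counts = Counter(mrna_names(gff_lines))
    missing = sorted(n for n in expected if counts[n] == 0)
    duplicated = sorted(n for n in expected if counts[n] > 1)
    unexpected = sorted(set(counts) - expected)
    return missing, duplicated, unexpected
```

Here expected_names would be the IDs of the old transcripts given to est=. Any name reported as missing or duplicated is a place where the mapping is not 1 to 1 and needs a manual look.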
>>> 
>>> From:  Jeremy Semeiks <jeremy.semeiks at utsw.edu>
>>> Date:  Sunday, 9 September, 2012 3:49 PM
>>> To:  <maker-devel at yandell-lab.org>
>>> Subject:  [maker-devel] How to preserve human-friendly IDs when reannotating
>>> 
>>> Hi all,
>>> 
>>> I have sequenced some novel fungal genomes, and I am annotating them with
>>> maker-2.26-beta. The entire project is pretty iterative, in the sense that I
>>> first get some seemingly-sane annotation sets, then analyze and compare the
>>> proteomes biologically, then reannotate when new data comes in or as I learn
>>> more about how maker works. Because I have already attached biological
>>> meaning to some of my proteins, I would like to retain the same
>>> human-friendly IDs across annotations. Eg, if maker suddenly finds 1,000 new
>>> proteins on a reannotation run because I turned on keep_preds, then I don't
>>> want the transcript formerly known as mymold_09652T0 to become
>>> mymold_10698T0 when I run maker_map_ids; I want to keep it named
>>> mymold_09652T0.
>>> 
>>> So, is there any built-in way to preserve human-friendly IDs, or do I need
>>> to write my own script for this? I have tried setting map_forward=1 and
>>> maker_gff=<the GFF file output by the previous run of maker_map_ids>, but
>>> setting these seems to preserve neither the human-friendly IDs nor even the
>>> original IDs. (Eg, protein "genemark-scaffold353-processed-gene-0.9-mRNA-1"
>>> changed its name to "genemark-scaffold353-processed-gene-0.6-mRNA-1" when
>>> reannotated.) I haven't turned on any of the *_pass options, eg
>>> protein_pass; would this be relevant?
>>> 
>>> Extra credit question: I am making some mate-pair libraries for these fungi;
>>> when I re-assemble, that will completely change my scaffold names. Is there
>>> any easy way to preserve human-friendly transcript names in this case? As
>>> with the above simpler case, I think it would be pretty easy to transfer 90%
>>> of the names just by doing an all-vs-all blastp between two annotation sets
>>> and fishing out the best hits, but the remaining 10% might be a headache.
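
The blastp idea above can be sketched as follows. Note that this uses reciprocal best hits, a stricter variant of simply taking the best hit in one direction. It assumes two BLAST tabular (outfmt 6) result files, one searching new proteins against old and one the reverse, with the bit score in the twelfth column. The function names are hypothetical.

```python
def best_hits(blast_lines):
    """Map each query to its highest-bitscore subject from tabular BLAST lines."""
    best = {}
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        query, subject, bitscore = cols[0], cols[1], float(cols[11])
        # Keep the first hit seen on ties; replace only on a strictly better score.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(new_vs_old, old_vs_new):
    """Return {new_id: old_id} for pairs that are each other's best hit."""
    forward = best_hits(new_vs_old)
    backward = best_hits(old_vs_new)
    return {new: old for new, old in forward.items() if backward.get(old) == new}
```

The returned dictionary could then drive a renaming step. Proteins absent from it are the residue that needs manual attention, for example split or merged models and close paralogs.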
>>> 
>>> Thanks,
>>> Jeremy
>>> Grad student, Grishin lab
>>> UT Southwestern, Dallas TX
>>> 510.385.8959
>> 
> 


