[maker-devel] errors in final gff
Anurag Priyam
a.priyam at qmul.ac.uk
Wed Jun 25 15:38:17 MDT 2014
The -a option looks like just the thing I need.
I will forward the concerns about NFS to our IT team, and will definitely
use MPI for parallelisation next time.
Thanks a lot :).
-- Priyam
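
For checking the rebuilt GFF3 afterwards, here is a minimal Python sketch that
looks for the kinds of errors reported in this thread: repeated ID= values,
Parent= values that point at no feature, and lines that do not have nine
tab-separated columns (such as a truncated line). The function names are made
up for illustration and are not part of MAKER or gff3_merge. It flags every
repeated ID, so multi-line features that legitimately share an ID under the
GFF3 spec would be reported too.

```python
from collections import Counter
from urllib.parse import unquote


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key -> list of values."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" not in field:
            continue
        key, value = field.split("=", 1)
        # Multiple values (e.g. several parents) are comma-separated.
        attrs[key.strip()] = [unquote(v) for v in value.split(",")]
    return attrs


def check_gff3(lines):
    """Return (duplicate_ids, orphan_parents, malformed_line_numbers).

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    id_counts = Counter()
    parents = set()
    malformed = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # blank lines, comments and directives
        columns = line.split("\t")
        if len(columns) != 9:
            malformed.append(number)  # e.g. a truncated feature line
            continue
        attrs = parse_attributes(columns[8])
        for feature_id in attrs.get("ID", []):
            id_counts[feature_id] += 1
        parents.update(attrs.get("Parent", []))
    duplicates = sorted(i for i, n in id_counts.items() if n > 1)
    orphans = sorted(parents - set(id_counts))
    return duplicates, orphans, malformed
```

Passing an open file handle, for example check_gff3(open("merged.gff")) where
merged.gff is a placeholder name, returns the three lists for inspection.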
On Thu, Jun 26, 2014 at 2:56 AM, Carson Holt <carsonhh at gmail.com> wrote:
> Maybe if it died in a weird way some of the processes could have continued
> briefly without active locks, but I'd more likely attribute this to NFS
> weirdness. Because of how network storage works, some implementations
> take shortcuts (like returning success on an IO operation even though it
> has not completed and may even fail later on). Or an IO operation can be
> buffered and completed several seconds later (the process that called the
> write operation may not even be active anymore). This is extremely common
> on NFS. You should probably just start MAKER fewer times in the same
> directory on your system. You may also want to start a single MAKER job
> (you should use MPI to parallelize it though), and use the -a flag. This
> will cause that job to just rebuild the current GFF3 and FASTA files.
> That way you can clean up your current results without having to rerun
> everything. It should run relatively quickly since MAKER will be able to
> make use of the existing BLAST reports etc. that are already there
> (exonerate will run again though, but it shouldn't take too long).
>
> --Carson
>
>
> On 6/25/14, 3:11 PM, "Anurag Priyam" <a.priyam at qmul.ac.uk> wrote:
>
>>Mmm ... I didn't use the -nolock option. But I did launch some 10 MAKER
>>processes in the same directory.
>>
>>I feel it's unlikely that my file system doesn't allow hardlinks
>>because a few processes quit earlier than the others, saying something
>>to the tune of "Another MAKER process is processing this scaffold
>>already."
>>
>>I remember one process in particular had _just_ crashed. I don't
>>remember how: I might have Ctrl-C'ed by mistake instead of detaching
>>screen? admin killed it? temporary system glitch? Could this have
>>caused the same issue?
>>
>>-- Priyam
>>
>>
>>On Wed, Jun 25, 2014 at 1:35 AM, Carson Holt <carsonhh at gmail.com> wrote:
>>> Thanks. For the first two --> scaffold00002:hit:1026:1.3.0.12
>>>
>>> The value 1026 is held in a global iterator, so it cannot repeat the same
>>> value during the life of the process. And 1.3.0.12 is generated from the
>>> point in the code where the ID is being generated. This means that two
>>> distinct processes had to write to the same file at the same point in the
>>> code, which should normally be impossible.
>>>
>>> However, there are ways to make this happen. First, if you turn file locks
>>> off (the -nolock option) and then run MAKER multiple times on the same
>>> dataset, you can get process collisions (because you disabled the locks
>>> that stop this). If your NFS file system does not support hard links
>>> (FhGFS for example) then you cannot lock the files (which is the same as
>>> setting -nolock). Or you have other serious IO failures over NFS. Note
>>> that by NFS I mean your network-mounted storage.
>>>
>>> The last example you give shows the preceding line being truncated. This
>>> suggests that two processes are trying to write to the same file
>>> simultaneously (inserting lines in between other lines), or serious IO
>>> failures are occurring where writes are not completing but true is being
>>> returned for the operations (can happen on unreliable NFS implementations).
>>>
>>> So in summary, either your NFS storage implementation is giving IO errors,
>>> you have run MAKER with -nolock set and then started MAKER multiple times
>>> in the same directory (process collisions), or your NFS implementation
>>> doesn't support hard links and won't allow MAKER to lock files (process
>>> collisions). If it is one of the latter two, you will have to make sure
>>> you never start MAKER more than once simultaneously on the same dataset.
>>> You can still run via MPI for parallelization, but you won't be able to
>>> start a second MPI process while the first one is still running.
>>>
>>> Thanks,
>>> Carson
>>>
>>>
>>> On 6/24/14, 12:56 PM, "Anurag Priyam" <a.priyam at qmul.ac.uk> wrote:
>>>
>>>>I am sorry. I have updated the gist -
>>>>https://gist.github.com/yeban/ffaf5cd419639dd073a7.
>>>>1. The first two chunks contain the annotations with duplicate ids. (4
>>>>rows)
>>>>2. The last chunk contains the annotations that refer to a
>>>>non-existent parent. And what looks like an incomplete line of
>>>>annotation (I forgot to state this in my original email).
>>>>
>>>>No, I didn't use est_forward. I am not passing in any old data via GFF3.
>>>>
>>>>-- Priyam
>>>>
>>>>On Sat, Jun 21, 2014 at 3:26 AM, Carson Holt <carsonhh at gmail.com> wrote:
>>>>> Also note that ID= must be unique. Name= does not have to be, and won't
>>>>> be if the same protein or repeat element aligns to more than one
>>>>> location, for example.
>>>>>
>>>>> Thanks,
>>>>> Carson
>>>>>
>>>>>
>>>>> On 6/20/14, 3:50 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>>>>>
>>>>>>Did you use est_forward? Also, in the example you showed, all the IDs are
>>>>>>unique (one says hit and the other hsp in the ID, so they are different).
>>>>>>Could you find the non-unique IDs causing the error?
>>>>>>
>>>>>>--Carson
>>>>>>
>>>>>>
>>>>>>On 6/19/14, 2:05 AM, "Anurag Priyam" <anurag08priyam at gmail.com> wrote:
>>>>>>
>>>>>>>I used est_gff= option, which refers to a GFF file generated by
>>>>>>>cufflinks2gff3. The erroneous annotations didn't come from this GFF.
>>>>>>>
>>>>>>>-- Priyam
>>>>>>>
>>>>>>>On Thu, Jun 19, 2014 at 3:03 AM, Carson Holt <carsonhh at gmail.com> wrote:
>>>>>>>> Are you passing in old data via GFF3?
>>>>>>>>
>>>>>>>> --Carson
>>>>>>>>
>>>>>>>>
>>>>>>>> On 6/18/14, 12:15 PM, "Anurag Priyam" <anurag08priyam at gmail.com> wrote:
>>>>>>>>
>>>>>>>>>It's version 2.31.
>>>>>>>>>
>>>>>>>>>-- Priyam
>>>>>>>>>
>>>>>>>>>On Wed, Jun 18, 2014 at 11:41 PM, Carson Holt <carsonhh at gmail.com> wrote:
>>>>>>>>>> What MAKER version are you using?
>>>>>>>>>>
>>>>>>>>>> --Carson
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 6/18/14, 11:44 AM, "Anurag Priyam" <a.priyam at qmul.ac.uk> wrote:
>>>>>>>>>>
>>>>>>>>>>>Hi,
>>>>>>>>>>>
>>>>>>>>>>>I compiled all annotations generated by MAKER into a single GFF file
>>>>>>>>>>>using the gff3_merge script distributed with MAKER. While formatting
>>>>>>>>>>>this GFF for use with JBrowse, I found a few errors:
>>>>>>>>>>>
>>>>>>>>>>>1. Three instances where two features were assigned the same id.
>>>>>>>>>>>2. One instance where a group of three subfeatures refer to a
>>>>>>>>>>>non-existent parent.
>>>>>>>>>>>
>>>>>>>>>>>Here is the relevant portion of the GFF file:
>>>>>>>>>>>https://gist.github.com/yeban/ffaf5cd419639dd073a7
>>>>>>>>>>>
>>>>>>>>>>>I worked around the issue temporarily for the job at hand, but I am
>>>>>>>>>>>left wondering why these errors would creep in.
>>>>>>>>>>>
>>>>>>>>>>>-- Priyam
>>>>>>>>>>>
>>>>>>>>>>>_______________________________________________
>>>>>>>>>>>maker-devel mailing list
>>>>>>>>>>>maker-devel at box290.bluehost.com
>>>>>>>>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>
>>>
>
>
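
The point made in the thread about hard links and file locks can be
illustrated with a minimal Python sketch of the general hard-link locking
technique. This is an assumption-laden illustration, not MAKER's actual
locking code: the function names, the ".lock" suffix and the owner string are
all hypothetical. The idea is that creating a hard link either succeeds or
fails as a single operation, so only one process can create the lock name. On
a file system without hard-link support the link call itself raises an error,
so no lock can be taken at all.

```python
import os
import tempfile


def try_lock(target, owner):
    """Try to lock `target` by hard-linking a private file to target + '.lock'.

    Returns True if this process now holds the lock, False if another
    process already holds it. Any other OSError (for example on a file
    system that does not support hard links) is left to propagate.
    """
    lock_path = target + ".lock"
    directory = os.path.dirname(os.path.abspath(target))
    # Write the owner into a uniquely named private file first.
    fd, private = tempfile.mkstemp(prefix=".lock-", dir=directory)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(owner + "\n")
        try:
            # Fails with FileExistsError if lock_path is already there.
            os.link(private, lock_path)
        except FileExistsError:
            return False
        return True
    finally:
        # The private name is no longer needed; the lock name (if created)
        # still refers to the same file.
        os.unlink(private)


def unlock(target):
    """Release the lock taken by try_lock."""
    os.unlink(target + ".lock")
```

A second process calling try_lock on the same target gets False until the
first one calls unlock, which is the behaviour that disappears when locking is
turned off or cannot work on the underlying storage.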