[maker-devel] errors in final gff

Carson Holt carsonhh at gmail.com
Wed Jun 25 15:26:45 MDT 2014


If it died in an unusual way, some of its processes could have continued
briefly without active locks, but I'd more likely attribute this to NFS
weirdness.  Because of how network storage works, some implementations
take shortcuts, like returning success on an IO operation even though it
has not completed and may even fail later on. An IO operation can also be
buffered and completed several seconds later (the process that called the
write operation may not even be active anymore). This is extremely common
on NFS.

You should probably just start fewer MAKER instances in the same
directory on your system.  You may also want to start a single MAKER job
with the -a flag (you should use MPI to parallelize it, though).  This
will cause that job to just rebuild the current GFF3 and FASTA files.
That way you can clean up your current results without having to rerun
everything.  It should run relatively quickly, since MAKER will be able to
reuse the existing BLAST reports etc. that are already there (exonerate
will run again, but it shouldn't take too long).
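
To confirm that a rebuilt GFF3 is clean, the three symptoms reported in
this thread (repeated ID= values, Parent= values that point at no
feature, and truncated lines) can be checked with a short script.  What
follows is a minimal Python sketch, not part of MAKER.  The names
check_gff3, parse_attributes and multiline_types are made up for
illustration.  It assumes tab-separated, 9-column GFF3, and it assumes
that CDS rows may legitimately share an ID across segments, so those are
not counted as duplicates.

```python
from collections import Counter


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key -> raw value."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def check_gff3(lines, multiline_types=("CDS",)):
    """Return (duplicate_ids, orphan_parents, malformed_line_numbers).

    `lines` is any iterable of text lines, e.g. an open file handle.
    Feature types listed in `multiline_types` are allowed to repeat an ID.
    """
    id_counts = Counter()   # how often each ID is used (multiline types excluded)
    seen_ids = set()        # every ID that exists anywhere in the file
    parent_refs = []        # (line number, referenced parent ID)
    malformed = []          # line numbers that do not have 9 columns

    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break           # embedded sequence follows; no more features
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            malformed.append(lineno)
            continue
        attrs = parse_attributes(cols[8])
        feature_id = attrs.get("ID")
        if feature_id is not None:
            seen_ids.add(feature_id)
            if cols[2] not in multiline_types:
                id_counts[feature_id] += 1
        if "Parent" in attrs:
            for parent in attrs["Parent"].split(","):
                parent_refs.append((lineno, parent))

    duplicates = sorted(i for i, n in id_counts.items() if n > 1)
    # Parents are resolved only after the whole file is read, so a child
    # listed before its parent is not reported as an orphan.
    orphans = [(n, p) for n, p in parent_refs if p not in seen_ids]
    return duplicates, orphans, malformed
```

Passing an open handle on the merged file, for example
check_gff3(open("merged.gff")) where the file name is a placeholder,
returns the three lists; all three being empty means none of the
symptoms above were found.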

--Carson


On 6/25/14, 3:11 PM, "Anurag Priyam" <a.priyam at qmul.ac.uk> wrote:

>Mmm ... I didn't use -nolock option. But I did launch some 10 MAKER
>processes in the same directory.
>
>I feel it's unlikely that my file system doesn't allow hardlinks
>because a few processes quit earlier than the others, saying something
>to the tune of "Another MAKER process is processing this scaffold
>already."
>
>I remember one process in particular had _just_ crashed. I don't
>remember how: I might have Ctrl-C'ed by mistake instead of detaching
>screen? admin killed it? temporary system glitch? Could this have
>caused the same issue?
>
>-- Priyam
>
>
>On Wed, Jun 25, 2014 at 1:35 AM, Carson Holt <carsonhh at gmail.com> wrote:
>> Thanks. For the first two --> scaffold00002:hit:1026:1.3.0.12
>>
>> The value 1026 is held in a global iterator, so it cannot repeat the
>> same value during the life of the process. And 1.3.0.12 is derived
>> from the point in the code where the ID is generated.  This means that
>> two distinct processes had to write to the same file at the same point
>> in the code, which should normally be impossible.
>>
>> However, there are ways to make this happen.  First, if you turn file
>> locks off (the -nolock option) and then run MAKER multiple times on the
>> same dataset, you can get process collisions (because you disabled the
>> locks that stop this).  If your NFS file system does not support hard
>> links (FhGFS for example), then you cannot lock the files (which is the
>> same as setting -nolock).  Or you have other serious IO failures over
>> NFS. Note that NFS is your Network Mounted Storage.
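
Whether a given directory supports hard links at all can be probed
directly.  The following is a minimal Python sketch, not MAKER's own
locking code; the function name supports_hard_links is made up for
illustration.  It only tests the precondition described above, by
creating a scratch file in the directory and trying to hard-link it.

```python
import os
import tempfile


def supports_hard_links(directory):
    """Return True if a hard link can be created inside `directory`."""
    fd, target = tempfile.mkstemp(prefix="linktest_", dir=directory)
    os.close(fd)
    link = target + ".lnk"
    try:
        os.link(target, link)
        # After a successful link the file has two directory entries.
        return os.stat(target).st_nlink == 2
    except OSError:
        # The filesystem refused the link (or the stat call failed).
        return False
    finally:
        # Remove the scratch files whether or not the link succeeded.
        for path in (link, target):
            if os.path.exists(path):
                os.remove(path)
```

Running it against the directory MAKER writes to tells you whether the
second scenario is even possible on your storage.  A True result does
not rule out the delayed or falsely reported writes discussed in this
thread.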
>>
>> The last example you give shows the preceding line being truncated.
>> This suggests that two processes are trying to write to the same file
>> simultaneously (inserting lines in between other lines), or serious IO
>> failures are occurring where writes are not completing but true is
>> being returned for the operations (can happen on unreliable NFS
>> implementations).
>>
>> So in summary, either your NFS storage implementation is giving IO
>> errors, you have run MAKER with -nolock set and then started MAKER
>> multiple times in the same directory (process collisions), or your NFS
>> implementation doesn't support hard links and won't allow MAKER to lock
>> files (process collisions).  If it is one of the latter two, you will
>> have to make sure you never start MAKER more than once simultaneously
>> on the same dataset. You can still run via MPI for parallelization, but
>> you won't be able to start a second MPI process while the first one is
>> still running.
>>
>> Thanks,
>> Carson
>>
>>
>> On 6/24/14, 12:56 PM, "Anurag Priyam" <a.priyam at qmul.ac.uk> wrote:
>>
>>>I am sorry. I have updated the gist -
>>>https://gist.github.com/yeban/ffaf5cd419639dd073a7.
>>>1. The first two chunks contain the annotations with duplicate ids. (4
>>>rows)
>>>2. The last chunk contains the annotations that refer to a
>>>non-existent parent. And what looks like an incomplete line of
>>>annotation (I forgot to state this in my original email).
>>>
>>>No, I didn't use est_forward. I am not passing in any old data via GFF3.
>>>
>>>-- Priyam
>>>
>>>On Sat, Jun 21, 2014 at 3:26 AM, Carson Holt <carsonhh at gmail.com> wrote:
>>>> Also note that ID= must be unique. Name= does not have to be, and
>>>> won't be if the same protein or repeat element aligns to more than
>>>> one location, for example.
>>>>
>>>> Thanks,
>>>> Carson
>>>>
>>>>
>>>> On 6/20/14, 3:50 PM, "Carson Holt" <carsonhh at gmail.com> wrote:
>>>>
>>>>>Did you use est_forward?  Also, in the example you showed, all the
>>>>>IDs are unique (one says hit and the other hsp in the ID, so they are
>>>>>different). Could you find the non-unique IDs causing the error?
>>>>>
>>>>>--Carson
>>>>>
>>>>>
>>>>>On 6/19/14, 2:05 AM, "Anurag Priyam" <anurag08priyam at gmail.com> wrote:
>>>>>
>>>>>>I used est_gff= option, which refers to a GFF file generated by
>>>>>>cufflinks2gff3. The erroneous annotations didn't come from this GFF.
>>>>>>
>>>>>>-- Priyam
>>>>>>
>>>>>>On Thu, Jun 19, 2014 at 3:03 AM, Carson Holt <carsonhh at gmail.com> wrote:
>>>>>>> Are you passing in old data via GFF3?
>>>>>>>
>>>>>>> --Carson
>>>>>>>
>>>>>>>
>>>>>>> On 6/18/14, 12:15 PM, "Anurag Priyam" <anurag08priyam at gmail.com> wrote:
>>>>>>>
>>>>>>>>It's version 2.31.
>>>>>>>>
>>>>>>>>-- Priyam
>>>>>>>>
>>>>>>>>On Wed, Jun 18, 2014 at 11:41 PM, Carson Holt <carsonhh at gmail.com> wrote:
>>>>>>>>> What MAKER version are you using?
>>>>>>>>>
>>>>>>>>> --Carson
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 6/18/14, 11:44 AM, "Anurag Priyam" <a.priyam at qmul.ac.uk> wrote:
>>>>>>>>>
>>>>>>>>>>Hi,
>>>>>>>>>>
>>>>>>>>>>I compiled all annotations generated by MAKER into a single GFF
>>>>>>>>>>file using the gff3_merge script distributed with MAKER. While
>>>>>>>>>>formatting this GFF for use with JBrowse, I found a few errors:
>>>>>>>>>>
>>>>>>>>>>1. Three instances where two features were assigned the same id.
>>>>>>>>>>2. One instance where a group of three subfeatures refer to a
>>>>>>>>>>non-existent parent.
>>>>>>>>>>
>>>>>>>>>>Here is the relevant portion of the GFF file:
>>>>>>>>>>https://gist.github.com/yeban/ffaf5cd419639dd073a7
>>>>>>>>>>
>>>>>>>>>>I worked around the issue temporarily for the job at hand, but I
>>>>>>>>>>am left wondering why these errors would creep in.
>>>>>>>>>>
>>>>>>>>>>-- Priyam
>>>>>>>>>>
>>>>>>>>>>_______________________________________________
>>>>>>>>>>maker-devel mailing list
>>>>>>>>>>maker-devel at box290.bluehost.com
>>>>>>>>>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org