[maker-devel] Error from Pseudo gene identification scripts

Quanwei Zhang qwzhang0601 at gmail.com
Mon Dec 11 09:47:30 MST 2017


Hi Nick,

Thank you. The error does seem to be caused by the sequence names:
"pseudo_wrap.py" now runs after I changed the names. But a few things
still seem strange.
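
On the renaming itself: the thread does not record exactly how the names
were changed, so the following is only a minimal sketch of one way to do it.
It assumes the problem is extra text after the sequence ID on the ">" line
(as suggested further down in this thread) and that Python 3 is available.
The function name clean_fasta_headers and the example record are
illustrative; they are not part of the pseudogene pipeline.

```python
import io


def clean_fasta_headers(src, dst):
    """Copy FASTA text from src to dst, keeping only the sequence ID
    (the first whitespace-delimited token) on each header line.

    Returns the list of IDs written, in file order.
    """
    ids = []
    for line in src:
        if line.startswith(">"):
            fields = line[1:].split()
            # An empty header (">" alone) is kept empty rather than guessed at.
            seq_id = fields[0] if fields else ""
            ids.append(seq_id)
            dst.write(">" + seq_id + "\n")
        else:
            # Sequence lines are copied unchanged.
            dst.write(line)
    return ids


if __name__ == "__main__":
    # Hypothetical record: the description after the ID is made up.
    example = io.StringIO(
        ">maker-Contig2656-snap-gene-1.9-mRNA-1 protein AED:0.10\n"
        "MKTAYIAKQR\n"
    )
    out = io.StringIO()
    print(clean_fasta_headers(example, out))
    print(out.getvalue(), end="")
```

The same function works on real files by passing open file handles instead
of the in-memory buffers. If the BLAST results were generated from the old
names, they would presumably need to be regenerated so that the IDs match.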

(1) The file "log_step1" shows the following messages at the top. Do
they matter?

NOTE: DatabaseOp not imported
NOTE: DatabaseOp not imported
...

(2) It also shows the following messages:
Done!
sh: /gs/gsfs0/users/qzhang/tools/maker2_pseudogene/fasta36.3.8e/bin//tfasty34: No such file or directory
sh: /gs/gsfs0/users/qzhang/tools/maker2_pseudogene/fasta36.3.8e/bin//tfasty34: No such file or directory
sh: /gs/gsfs0/users/qzhang/tools/maker2_pseudogene/fasta36.3.8e/bin//tfasty34: No such file or directory
....

However, when I look in the folder
"/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/fasta36.3.8e/bin/", there
is no file named "tfasty34". Instead, there is a file named "tfasty36":
fasta36  fastf36  fastm36  fasts36  fastx36  fasty36  ggsearch36
glsearch36  lalign36  map_db  README  ssearch36  tfastf36  tfastm36
tfasts36  tfastx36  tfasty36
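
A quick way to confirm this kind of mismatch is to check the FASTA bin
directory for the program name the log asks for and list any binaries that
differ only in their version suffix. This is only a sketch: find_program is
a made-up helper, not part of the pipeline, and the directory is the one
shown in the log above.

```python
import os


def find_program(bin_dir, wanted):
    """Return (found, candidates) for a program name in bin_dir.

    found      -- True if bin_dir/wanted exists as a file
    candidates -- other files with the same name once trailing version
                  digits are stripped (e.g. tfasty36 for tfasty34)
    """
    stem = wanted.rstrip("0123456789")
    names = sorted(os.listdir(bin_dir))
    found = os.path.isfile(os.path.join(bin_dir, wanted))
    candidates = [
        n for n in names
        if n != wanted and n.rstrip("0123456789") == stem
    ]
    return found, candidates


if __name__ == "__main__":
    bin_dir = "/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/fasta36.3.8e/bin/"
    # Only run the check where the directory actually exists.
    if os.path.isdir(bin_dir):
        print(find_program(bin_dir, "tfasty34"))
```

Whether the pipeline can simply be pointed at the newer binary is a
question for its authors; the check only shows which names are present.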

Do you have any suggestions on this?

Best
Quanwei

2017-12-11 10:18 GMT-05:00 Quanwei Zhang <qwzhang0601 at gmail.com>:

> Thank you Nick. The output is shown below.
>
> Traceback (most recent call last):
>   File "/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/pseudo_pkg//ParseBlast.py", line 4330, in <module>
>     parse.get_qualified4(blast,fasta,E,I,L,P,Q)
>   File "/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/pseudo_pkg//ParseBlast.py", line 3050, in get_qualified4
>     N = sizes[L[0]]
> KeyError: 'maker-Contig2656-snap-gene-1.9-mRNA-1'
> Traceback (most recent call last):
>   File "/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/pseudo_pkg/script_step3b.py", line 98, in <module>
>     oup.write("%s\t%s\t%s\t%s\n" % (all_contigs[i][0][0],
> IndexError: list index out of range
> Done!
>
> NOTE: DatabaseOp not imported
> Program  : tfasty34
> Pair list: ../maker2.blastn.m_parsed_G500.PE_I500.PS1.pairs
> Fasta1   : ../../prediction2_final.proteins.fasta
> Fasta2   : ../maker2.blastn.m_parsed_G500.PE_I500.PS1.subj_coord.fa
> Fasta dir: /gs/gsfs0/users/qzhang/tools/maker2_pseudogene/fasta36.3.8e/bin/
> Working d:
> Flags    : -A -m 3 -q
> E thres  : 1.0
> Read gene pairs...
>  0 pairs
> Read fasta files...
> Do sw...
> Done!
> Read BLOSUM50 matrix...
> Read the sw.out file...
> Compare sequences:
>  total: 0 alignments
> Done!
> Check parameter file...
>  default: ml_t=30
>  default: ev_t=5
>  default: ml_p=0.05
>  default: id_t=40
> Filter ../maker2.blastn.m...
>  E:5 I:40 L:30 P:0.05
> Get pseudoexons...
>  pseudoexon file: ../maker2.blastn.m_parsed_G500.PE
> Get phase 1 pseudogene...
>  phase1 ps file: ../maker2.blastn.m_parsed_G500.PE_I500.PS1
> Get pair file and subject coordinates...
>  pair file  : ../maker2.blastn.m_parsed_G500.PE_I500.PS1.pairs
>  coordinates: ../maker2.blastn.m_parsed_G500.PE_I500.PS1.subj_coord
> Get phase 1 pseudogene sequences...
>  phase 1 ps sequence: ../maker2.blastn.m_parsed_G500.PE_I500.PS1.subj_coord.fa
> Find stop and framshifts...
>  Smith-Waterman outputs: ../maker2.blastn.m_parsed_G500.PE_I500.PS1_pairs.sw.*
>  Final output: ../maker2.blastn.m_parsed_G500.PE_I500.PS1_pairs.sw.out.disable_count
>
> The pseudogene pipeline has finished!
>
>
> Best
> Quanwei
>
> 2017-12-11 10:01 GMT-05:00 <panchyni at msu.edu>:
>
>> Quanwei, in addition to checking the sequence file, can you also send
>> me the standard output of the run (this is the text that normally prints
>> to the terminal unless you pipe it somewhere else)? This would help
>> in diagnosing the problem, but Shin-Han is likely correct that it is an
>> issue in the name formatting.
>>
>> Nick
>>
>> Quoting Shin-Han Shiu <shius at msu.edu>:
>>
>> > Hi Mike and Carson, we will take over from here. Thanks for referring
>> > the message to us.
>> >
>> > Quanwei, it looks like for some reason your input sequence file is
>> > missing "maker-Contig2656-snap-gene-1.9-mRNA-1". This can be an issue
>> > with the sequence name, since the code uses a space as a delimiter in
>> > places. Can you check your sequence file for this sequence and let us
>> > know what the name after ">" looks like?
>> >
>> > Nick, sorry for bugging you. Do you have any input on this?
>> >
>> > Shinhan
>> >
>> >
>> > On 12/10/2017 8:37 PM, Michael Campbell wrote:
>> >> Hi Quanwei,
>> >>
>> >> My guess would be a file format issue, but the code has evolved
>> >> since I worked with it. The last time I ran it, the fasta header
>> >> had to contain only the sequence ID without a space after it. That
>> >> was the big gotcha that I remember.
>> >>
>> >> I've cc'd Shin-Han Shiu on this one. The pipeline was developed in his
>> >> lab.
>> >>
>> >> Thanks,
>> >> Mike
>> >>
>> >> On Dec 8, 2017, at 8:46 AM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
>> >>
>> >>> Thank you Carson and Michael.
>> >>>
>> >>> Best
>> >>> Quanwei
>> >>>
>> >>> 2017-12-07 23:42 GMT-05:00 Carson Holt <carsonhh at gmail.com>:
>> >>>
>> >>>    I'm going to CC Michael Campbell on this. I wasn't really
>> >>>    involved with any of the pseudogene accessory scripts and
>> >>>    protocols that went with the MAKER-P publication, nor have I
>> >>>    really been involved with pseudogene annotation in general. So
>> >>>    Michael might have more insight here.
>> >>>
>> >>>    --Carson
>> >>>
>> >>>>    On Dec 7, 2017, at 2:44 PM, Quanwei Zhang <qwzhang0601 at gmail.com> wrote:
>> >>>>
>> >>>>    Hello:
>> >>>>
>> >>>>    I am trying to identify pseudogenes following
>> >>>>    http://shiulab.plantbiology.msu.edu/index.php/Protocol:Pseudogene
>> >>>>
>> >>>>    After getting the BLAST result, I tried to scan for pseudogenes
>> >>>>    with the command "python pseudo_wrap.py parameter", but I got the
>> >>>>    following errors. Do you have any ideas or suggestions about
>> >>>>    them? Thanks.
>> >>>>
>> >>>>    ##below shows reported errors
>> >>>>    Traceback (most recent call last):
>> >>>>      File "/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/pseudo_pkg//ParseBlast.py", line 4330, in <module>
>> >>>>        parse.get_qualified4(blast,fasta,E,I,L,P,Q)
>> >>>>      File "/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/pseudo_pkg//ParseBlast.py", line 3050, in get_qualified4
>> >>>>        N = sizes[L[0]]
>> >>>>    KeyError: 'maker-Contig2656-snap-gene-1.9-mRNA-1'
>> >>>>    Traceback (most recent call last):
>> >>>>      File "/gs/gsfs0/users/qzhang/tools/maker2_pseudogene/pseudo_pkg/script_step3b.py", line 98, in <module>
>> >>>>        oup.write("%s\t%s\t%s\t%s\n" % (all_contigs[i][0][0],
>> >>>>    IndexError: list index out of range
>> >>>>    Done!
>> >>>>
>> >>>>    Best
>> >>>>    Quanwei
>> >>>
>> >>>
>> >
>> > --
>> > --------------------------------------
>> > Shin-Han Shiu
>> > Michigan State University
>> > Department of Plant Biology
>> > 2265 Mol Plant Sci Bldg
>> > (TEL) +1-517-353-7196
>> > http://goo.gl/keiHZX
>> > --------------------------------------
>> >
>> >
>>
>>
>