[maker-devel] [EXTERNAL] Re: Troubleshooting Maker failure

Jacques Dainat jacques.dainat at gmail.com
Fri Sep 24 02:41:49 MDT 2021


Hi Kevin,

I have checked the 2 files.
About pasa_predictions.gff3
<http://genomes.ua.edu/Kocot/2021-09-10_Maker/pasa_predictions.gff3>:
It is parsed without any problem by agat_convert_sp_gxf2gxf.pl in about
10 minutes. MAKER (>= v3) should be able to use it as is, but you might
prefer to convert it into match/match_part style using
agat_sp_alignment_output_style.pl.
The match/match_part style works in all MAKER versions.
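
A quick way to see which style a file currently uses is to tally the
feature types in its third column. This is only a sketch:
count_feature_types is a hypothetical helper name, not part of AGAT or
MAKER, and it assumes a tab-separated GFF3 file.

```shell
# Hypothetical helper (not part of AGAT or MAKER): tally the feature types
# found in column 3 of a tab-separated GFF3 file.
count_feature_types() {
    # Skip directive/comment lines and lines without 9 columns,
    # then print "type<TAB>count" sorted by type.
    awk -F'\t' '!/^#/ && NF >= 9 { n[$3]++ }
                END { for (t in n) print t "\t" n[t] }' "$1" | LC_ALL=C sort
}
```

For example, `count_feature_types pasa_predictions.gff3`. A file that is
already in match/match_part style would be expected to list only a
match-type feature and match_part.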
About protein_alignments.gff3
<http://genomes.ua.edu/Kocot/2021-09-10_Maker/protein_alignments.gff3>:
This file is more problematic. It contains only one feature type
(nucleotide_to_protein_match), which AGAT treats as level1 (i.e. like a
gene) and for which it therefore expects sub-features that the file does
not provide. On top of that, the ID attribute is supposed to be unique,
and here it is not. Here are the commands you should run to get a proper
file for MAKER:
First
```
sed 's/nucleotide_to_protein_match/match_part/' protein_alignments.gff3 |
sed 's/ID=/Parent=/' > protein_alignments_repared.gff3
```
Then
```
agat_convert_sp_gxf2gxf.pl --gff protein_alignments_repared.gff3 \
  -o protein_alignments_clean.gff3
```
Then you should be good with a proper match/match_part file.
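
To check the ID problem yourself, before or after the conversion, something
like the following can list ID values that appear on more than one feature
line. It is a minimal sketch, not an AGAT or MAKER feature:
list_duplicate_ids is a hypothetical helper name, and it assumes a
tab-separated file with attributes in column 9. Note that GFF3 does allow
one feature to span several lines with the same ID, so a hit is a warning
sign rather than proof of a broken file.

```shell
# Hypothetical helper (not part of AGAT or MAKER): print every ID value that
# occurs on more than one feature line of a GFF3 file, with its count.
list_duplicate_ids() {
    awk -F'\t' '
        /^#/ || NF < 9 { next }              # skip directives, comments, blanks
        {
            n = split($9, attrs, ";")        # column 9 holds key=value pairs
            for (i = 1; i <= n; i++)
                if (attrs[i] ~ /^ID=/) seen[substr(attrs[i], 4)]++
        }
        END { for (id in seen) if (seen[id] > 1) print id "\t" seen[id] }
    ' "$1" | LC_ALL=C sort
}
```

For example, `list_duplicate_ids protein_alignments.gff3`. Empty output
means no ID value was seen twice.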

Best,

/Jacques


> Hi Carson and Jacques,
>
> Thank you both very much for the help. It looks like my PASA and exonerate
> (via Funannotate) gff3 files are not correctly formatted for maker.  I've
> uploaded them here just in case it might be helpful for you to see them,
> but I will try to figure out how to reformat them correctly following
> Jacques's advice.
> http://genomes.ua.edu/Kocot/2021-09-10_Maker/pasa_predictions.gff3
> http://genomes.ua.edu/Kocot/2021-09-10_Maker/protein_alignments.gff3
>
> Is there a standalone tool or Maker feature I'm not seeing that can assess
> whether a gff3 file is correctly formatted for Maker?
>
> Thank you again!
> Kevin
> ------------------------------
> *From:* Jacques Dainat <jacques.dainat at nbis.se>
> *Sent:* Wednesday, September 22, 2021 2:09 PM
> *To:* Kevin Kocot <kmkocot at ua.edu>
> *Cc:* maker-devel at yandell-lab.org <maker-devel at yandell-lab.org>; Carson
> Holt <carsonhh at gmail.com>
> *Subject:* [EXTERNAL] Re: [maker-devel] Troubleshooting Maker failure
>
> Hi Kevin,
>
> About using AGAT (agat_convert_sp_gxf2gxf.pl), there are two possible
> reasons for getting empty output files.
> I) The feature types (3rd column) are not yet handled by AGAT. You can
> tell AGAT how to deal with them.
> See
> https://agat.readthedocs.io/en/latest/troubleshooting.html#agat-throws-features-out-because-the-feature-type-is-not-yet-taken-into-account
> II) The features are thrown out by AGAT because child features are missing
> (e.g. a gene feature expects at least one transcript linked to it). See
> https://agat.readthedocs.io/en/latest/troubleshooting.html#agat-throws-features-out-because-child-features-are-not-provided
>
> I invite you to open an issue in the AGAT GitHub repository.
>
> Once the file is parsed correctly you can use the script
> agat_sp_alignment_output_style.pl to turn level1 feature types (e.g.
> gene) and level2 feature types (e.g. mRNA) into match and match_part
> features respectively, which MAKER may prefer.
>
> Best regards,
>
> Jacques Dainat, Ph.D.
>
>
> On 22 Sep 2021, at 18:21, Carson Holt <carsonhh at gmail.com> wrote:
>
> I’d have to see the GFF, but in general you should organize sequence
> alignments as match/match_part features.
>
> Here is an example from the GFF3 format specification:
>
> ctg123 . cDNA_match  1200  9000  .        .  .  ID=cDNA00001
> ctg123 . match_part  1200  3200  2.2e-30  +  .  ID=match00002;Parent=cDNA00001;Target=mjm1123.5 5 506;Gap=M301 D1499 M201
> ctg123 . match_part  7000  9000  7.4e-32  -  .  ID=match00003;Parent=cDNA00001;Target=mjm1123.3 1 502;Gap=M101 D1499 M401
>
> Also make sure you are not inadvertently using GFF2 or GTF. They are not
> backwards compatible with GFF3.
>
> —Carson
>
>
> On Sep 22, 2021, at 8:06 AM, Kevin Kocot <kmkocot at ua.edu> wrote:
>
> Thanks Carson,
>
> I think I see the problem now. Here's what I'm getting:
>
> -----
> STATUS: Parsing control files...
> STATUS: Processing and indexing input FASTA files...
> STATUS: Setting up database for any GFF3 input...
> A data structure will be created for you at:
>
> /home/wirenia/Desktop/2021-08-11_MAKER_Dreissena_rostriformis/PGA_assembly_shortened_headers.fasta.maker.output/PGA_assembly_shortened_headers.fasta_datastore
>
> To access files for individual sequences use the datastore index:
>
> /home/wirenia/Desktop/2021-08-11_MAKER_Dreissena_rostriformis/PGA_assembly_shortened_headers.fasta.maker.output/PGA_assembly_shortened_headers.fasta_master_datastore_index.log
>
> STATUS: Now running MAKER...
> examining contents of the fasta file and run log
>
>
>
> --Next Contig--
>
> Processing run.log file...
> #---------------------------------------------------------------------
> Now retrying the contig!!
> SeqID: PGA_scaffold0
> Length: 141759199
> Tries: 5!!
> #---------------------------------------------------------------------
>
>
> setting up GFF3 output and fasta chunks
> prepare section files
> Gathering GFF3 input into hits - chunk:0
> ERROR: Non-unique top level ID for match.19561.56
> While this is technically legal in GFF3, it usually
> indicates a poorly fomatted GFF3 file (perhaps you
> tried to merge two GFF3 files without accounting for
> unique IDs).  MAKER will not handle these correctly.
>
> --> rank=NA, hostname=wirenia
> ERROR: Failed while prepare section files
> ERROR: Chunk failed at level:12, tier_type:3
> FAILED CONTIG:PGA_scaffold0
>
> ERROR: Chunk failed at level:4, tier_type:0
> FAILED CONTIG:PGA_scaffold0
>
> examining contents of the fasta file and run log
> -----
>
> It looks like maker doesn't like the format of the exonerate gff3 I am
> using. I tried 'fixing' it with agat_convert_sp_gxf2gxf.pl, which seemed
> to work on my Braker output, but that just produced an empty gff3 file for
> both my exonerate and PASA gff3 files. Any advice on how to prepare
> exonerate or PASA gff3 files for Maker?
>
> Thanks!
> Kevin
> ------------------------------
> *From:* Carson Holt <carsonhh at gmail.com>
> *Sent:* Monday, September 20, 2021 8:06 AM
> *To:* Kevin Kocot <kmkocot at ua.edu>
> *Cc:* maker-devel at yandell-lab.org <maker-devel at yandell-lab.org>
> *Subject:* [EXTERNAL] Re: [maker-devel] Troubleshooting Maker failure
>
> Hi Kevin,
>
> The files are already being skipped because of previous failures. Can you
> increase the try count (-t on the command line) to something like 6, and
> send me the STDERR after it generates a new failure.
>
> —Carson
>
> On Sep 20, 2021, at 4:56 AM, Kevin Kocot <kmkocot at ua.edu> wrote:
>
>
> Thanks Carson! I've uploaded that file here:
> http://genomes.ua.edu/Kocot/2021-09-10_Maker/round1_run_maker_try2.log
>
>
> ------------------------------
> *From:* Carson Holt
> *Sent:* Tuesday, September 14, 2021 9:51 PM
> *To:* Kevin Kocot
> *Cc:* maker-devel at yandell-lab.org
> *Subject:* [EXTERNAL] Re: [maker-devel] Troubleshooting Maker failure
>
> What I really need is the captured STDERR from the failed run.
>
> —Carson
>
> On Sep 10, 2021, at 9:19 AM, Kevin Kocot <kmkocot at ua.edu> wrote:
>
> Hi Carson and all,
>
> I ran Maker 3.01.03 on a chromosome-level mollusc genome assembly using
> some evidence I previously generated with Funannotate and BRAKER2, but
> Maker is not completing successfully. Every scaffold in the
> datastore_index.log file has both STARTED and FAILED statuses. I can't
> figure out where the problem lies, though. Running ./Build status indicates
> all the dependencies are there (I’m not using MPI). The run.log files just
> end with "DIED RANK 0" and "DIED COUNT 3." Is there a way to tell which
> (if any) of the dependencies is misbehaving here (they all seem to run fine
> independently) or if my evidence .gff3 files are not correctly formatted?
>
> I’ve uploaded a zipped sample output folder as well as my config files and
> the datastore.log file here: http://genomes.ua.edu/Kocot/2021-09-10_Maker/
>
> Any guidance on what might be the issue would be greatly appreciated.
>
> Thanks!
> Kevin
>
>
>
> *Kevin M. Kocot *he/him/his
>
> Associate Professor & Curator of Invertebrates
> Department of Biological Sciences & Alabama Museum of Natural History
> The University of Alabama <https://www.ua.edu/>
> 307 Mary Harmon Bryant Hall
> Box 870344
> Tuscaloosa, AL 35487
> phone 205-348-4052 | fax 205-348-4039
> kmkocot at ua.edu | www.kocotlab.com
> https://uasystem.zoom.us/j/3755490727
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at yandell-lab.org
> http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org

-- 
Jacques Dainat