[maker-devel] Error trying to submit genome to ncbi
Carson Holt
carsonhh at gmail.com
Thu Nov 2 14:48:40 MDT 2017
If you modified the fasta files (to remove N’s, etc.) after they were annotated, that would generate a mismatch between the GFF3 coordinates and the fasta sequence.
Have you modified or split contigs in the assembly in any way? I seem to remember you posting an issue about the fasta submission to NCBI previously.
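
One way to spot such a mismatch is to compare every GFF3 feature against the sequence it points to. The following is only a minimal Python sketch, assuming a plain FASTA file and a tab-delimited GFF3 file. The helper names `read_fasta` and `check_gff3` and the toy contigs are hypothetical illustrations, not part of MAKER or any NCBI tool.

```python
import io

def read_fasta(handle):
    """Return {sequence_id: sequence} from a FASTA file handle."""
    seqs, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                seqs[name] = "".join(chunks)
            # Use the first word of the header as the sequence ID.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        seqs[name] = "".join(chunks)
    return seqs

def check_gff3(handle, seqs):
    """Yield (seqid, type, start, end, problem) for suspicious features."""
    for line in handle:
        if line.startswith("##FASTA"):
            break  # an embedded FASTA section ends the feature lines
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue
        seqid, ftype = cols[0], cols[2]
        start, end = int(cols[3]), int(cols[4])  # GFF3 is 1-based, inclusive
        seq = seqs.get(seqid)
        if seq is None:
            yield (seqid, ftype, start, end, "sequence not in FASTA")
        elif start < 1 or end > len(seq):
            yield (seqid, ftype, start, end, "extends past end of sequence")
        elif seq[start - 1] in "Nn" or seq[end - 1] in "Nn":
            yield (seqid, ftype, start, end, "begins or ends inside a gap")

# Toy data (hypothetical): one 12 bp contig with a 4 bp run of N in the middle.
fasta_text = ">contig_1\nACGTNNNNACGT\n"
rows = [
    ("contig_1", "maker", "gene", "1", "4", ".", "+", ".", "ID=g1"),
    ("contig_1", "maker", "gene", "3", "6", ".", "+", ".", "ID=g2"),
    ("contig_1", "maker", "gene", "9", "20", ".", "+", ".", "ID=g3"),
    ("contig_2", "maker", "gene", "1", "3", ".", "+", ".", "ID=g4"),
]
gff_text = "##gff-version 3\n" + "".join("\t".join(r) + "\n" for r in rows)

sequences = read_fasta(io.StringIO(fasta_text))
problems = list(check_gff3(io.StringIO(gff_text), sequences))
for problem in problems:
    print(*problem, sep="\t")
```

If a check like this reports many features past the end of a contig or landing on N, the fasta was probably edited after annotation and the GFF3 coordinates no longer apply to it.
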
—Carson
> On Nov 2, 2017, at 2:46 PM, Daniel Ence <dandence at gmail.com> wrote:
>
> These gene features with the “nonfunctional due to frameshift” note indeed do not have other features associated with them in the tbl files. Is this reflected in the gff3 files that MAKER produced for these annotations? I’m not certain how MAKER would make a gene without a CDS or mRNA, but identifying those discrepancies would be a place to start understanding what has happened.
>
>
>
>> On Nov 2, 2017, at 4:30 PM, Emmanuel Nnadi <eennadi at gmail.com> wrote:
>>
>> Hi Daniel,
>>
>> This is the mail they sent to me
>>
>> [1] Please remove any N nucleotides from the beginning or end of the sequence.
>>
>> [2] No feature should begin or end inside a gap. Instead the feature should
>> be made partial at the gap boundary.
>>
>> [3] Coding regions should not be 5' partial if they begin with the start
>> methionine. If this is an internal methionine in the translation then
>> it is fine if they are partial. Conversely, all coding regions
>> must have a stop codon or be 3' partial.
>>
>> [4] You have a large number of gene features that are not associated
>> with other features. Please include on these features in the
>> gene description field some description of what the gene would
>> have encoded.
>>
>> A feature table example of this is:
>>
>> <41156  >40652  gene
>>                 gene_desc  transposon
>>                 locus_tag  CR513_45338
>>                 note       nonfunctional due to frameshift
>>
>> [5] Every coding region must have a corresponding mRNA and in
>> every case the mRNA product name must match exactly that of the
>> CDS feature.
>>
>> 2 coding regions do not have an mRNA
>> ORIG/combined_1-5000.sqn:CDS cytochrome c oxidase subunit 2 (contig_100:<38458-39198, 40429->40623) CR513_00692
>> ORIG/combined_1-5000.sqn:CDS cytochrome c oxidase subunit 1 (contig_100:c>113064-111485, c111245-111221) CR513_00691
>>
>> So I went to the .tbl file and searched for “nonfunctional due to frameshift”. There are quite a lot of them, and I have two more .tbl files.
>>
>> I used GAG to remove the N’s and to add start and stop codons, but NCBI still complained.
>>
>> I have run out of ideas.
>>
>> Please help me.
>>
>>
>>
>>
>>
>>
>> Nnadi Nnaemeka Emmanuel
>> Department of Microbiology,
>> Faculty of Natural and Applied Science,
>> Plateau State University, Bokkos, Plateau State, Nigeria.
>> Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications
>> On Thu, Nov 2, 2017 at 9:24 PM, Daniel Ence <dandence at gmail.com> wrote:
>> Hi, thank you for sending me your data, but which ones are the offending genes that NCBI is complaining about? Can you identify the problem that NCBI is reporting in some subset of the gene features?
>>
>> ~Daniel
>>
>>
>>
>>
>>> On Nov 2, 2017, at 4:20 PM, Emmanuel Nnadi <eennadi at gmail.com> wrote:
>>>
>>> Hi Daniel, thanks for your reply.
>>>
>>> I have attached my .tbl file.
>>>
>>> You will see that
>>> <77753  >77549  gene
>>>                 locus_tag  CR513_00193
>>>                 gene       AtMg00820
>>>                 note       nonfunctional due to frameshift
>>>
>>> is another example.
>>>
>>> It's becoming frustrating.
>>>
>>> I have not posted the two errors before
>>> [1] Please remove any N nucleotides from the beginning or end of the sequence.
>>>
>>> [2] No feature should begin or end inside a gap. Instead the feature should
>>> be made partial at the gap boundary.
>>>
>>> [3] Coding regions should not be 5' partial if they begin with the start
>>> methionine. If this is an internal methionine in the translation then
>>> it is fine if they are partial. Conversely, all coding regions
>>> must have a stop codon or be 3' partial.
>>>
>>> Nnadi Nnaemeka Emmanuel
>>> Department of Microbiology,
>>> Faculty of Natural and Applied Science,
>>> Plateau State University, Bokkos, Plateau State, Nigeria.
>>> Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications
>>> On Thu, Nov 2, 2017 at 9:08 PM, Daniel Ence <dandence at gmail.com> wrote:
>>> Hi, I think you’ve posted before about issues 1 and 2 from NCBI. The note for issue 3 from NCBI sounds like there are gene features that don’t have associated transcript, CDS or exon features. I’m not certain how that could result from MAKER. It might be something that someone else created (manually or with another tool) and then passed to MAKER in a GFF file. In the example included in your email, it looks like these offending genes are transposons that have been annotated as genes. If that is the case for the rest of the offending genes, then I would suggest changing the “type” field (column 3) from “gene” to something else, perhaps “transposable_element”.
>>>
>>> ~Daniel
>>>
>>>
>>>> On Nov 2, 2017, at 3:51 PM, Emmanuel Nnadi <eennadi at gmail.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I am trying to submit a genome that I annotated using MAKER, and NCBI sent back these errors:
>>>> [1] Please remove any N nucleotides from the beginning or end of the sequence.
>>>> [2] No feature should begin or end inside a gap. Instead the feature should
>>>> be made partial at the gap boundary.
>>>>
>>>> [3] Coding regions should not be 5' partial if they begin with the start
>>>> methionine. If this is an internal methionine in the translation then
>>>> it is fine if they are partial. Conversely, all coding regions
>>>> must have a stop codon or be 3' partial.
>>>> You have a large number of gene features that are not associated
>>>> with other features. Please include on these features in the
>>>> gene description field some description of what the gene would
>>>> have encoded.
>>>>
>>>> A feature table example of this is:
>>>>
>>>> <41156  >40652  gene
>>>>                 gene_desc  transposon
>>>>                 locus_tag  CR513_45338
>>>>                 note       nonfunctional due to frameshift
>>>>
>>>> Please, how can I use MAKER to solve this problem?
>>>>
>>>>
>>>> Nnadi Nnaemeka Emmanuel
>>>> Department of Microbiology,
>>>> Faculty of Natural and Applied Science,
>>>> Plateau State University, Bokkos, Plateau State, Nigeria.
>>>> Publications: https://www.researchgate.net/profile/Emmanuel_Nnadi/publications
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>
>>>
>>> <combined_5001-10000.tbl>
>>
>>
>