[maker-devel] How do I extract matrix from annotation

Carson Holt carsonhh at gmail.com
Wed Oct 20 00:07:19 MDT 2021


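This is an older email that looks like it never got an answer. Briefly, you need to use Linux command-line tools to evaluate the GFF3 file. The feature type is in the third tab-delimited column, so you can count features by matching on it.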

Examples:

#count protein coding transcripts
cat annotations.gff | grep -c -P "\tmRNA\t"

#count tRNA transcripts
cat annotations.gff | grep -c -P "\ttRNA\t"

#count snoRNA transcripts
cat annotations.gff | grep -c -P "\tsnoRNA\t"
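
If you want all feature types at once rather than one grep per type, something like the following may work. It is only a sketch, assuming a POSIX awk is available and that the file is standard nine-column GFF3; the function name count_feature_types is made up for illustration.

```shell
# Tally every feature type (GFF3 column 3) in a single pass.
# count_feature_types is a hypothetical helper name, not part of MAKER.
count_feature_types() {
    awk -F'\t' '
        /^##FASTA/     { exit }   # embedded sequence follows; stop reading
        /^#/ || NF < 9 { next }   # skip comments, directives, short lines
        { n[$3]++ }               # column 3 holds the feature type
        END { for (t in n) printf "%s\t%d\n", t, n[t] }
    ' "$1" | sort
}
```

Calling it as `count_feature_types annotations.gff` prints one line per feature type (gene, mRNA, tRNA, and so on) followed by its count.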


You can also pull out the Parent attribute of each transcript and count unique entries to look at genes instead of transcripts. Example:

#count protein coding genes
cat annotations.gff | grep -P "\tmRNA\t" | perl -ne 'print "$1\n" if /Parent=([^;\n]+)/' | sort | uniq | grep -c ""
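
The same idea can be wrapped up so it works for any transcript type (tRNA, rRNA, etc.). This is a rough sketch, assuming POSIX awk and that each transcript line carries a single Parent= value; count_genes_with is a hypothetical name chosen for the example.

```shell
# Count distinct parent genes of transcripts of a given type.
# $1 = transcript feature type (e.g. mRNA, tRNA, rRNA), $2 = GFF3 file
# count_genes_with is a hypothetical helper name, not part of MAKER.
count_genes_with() {
    awk -F'\t' -v type="$1" '
        /^##FASTA/ { exit }       # embedded sequence follows; stop reading
        $3 == type && match($9, /Parent=[^;]+/) {
            # drop the leading "Parent=" (7 characters), keep the gene ID
            print substr($9, RSTART + 7, RLENGTH - 7)
        }
    ' "$2" | sort -u | wc -l
}
```

For example, `count_genes_with mRNA annotations.gff` should give the same number as the pipeline above, and `count_genes_with tRNA annotations.gff` the number of tRNA genes.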


You can also try tools like SOBA from the Sequence Ontology that give statistics on GFF3 features: http://www.sequenceontology.org/cgi-bin/soba.cgi

—Carson



> On Jul 27, 2021, at 9:15 AM, Emmanuel Nnadi <eennadi at gmail.com> wrote:
> 
> Hello Carson, Greetings from Nigeria.
> Please how can I extract these matrix from my annotations?
> 
> Number of protein-coding genes in the assembled tea plant genome; those with known proteins and/or domains; annotation of noncoding RNA genes; ribosomal RNA genes; number of transfer RNA genes; number of transcription factor genes; and simple sequence
> 
> Thanks
> 
> Nnaemeka Emmanuel Nnadi,Ph.D
> Department of Microbiology,
> Faculty of Natural and Applied Science,
> Plateau State University, Bokkos, Plateau State, Nigeria.
> +2348068124819
> Publications:
> https://orcid.org/0000-0002-8424-4859
> https://www.researchgate.net/profile/Emmanuel_Nnadi/publications