<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">This is an older email that looks like it never got an answer. Briefly, you need to use Linux command-line tools to evaluate the GFF3 file.<div class=""><br class=""></div><div class="">Examples:<div class=""><br class=""></div><div class="">#count protein coding transcripts</div><div class="">cat annotations.gff | grep -c -P "\tmRNA\t"<br class=""><div class=""><div><br class=""></div><div><div class="">#count tRNA transcripts</div><div class="">cat annotations.gff | grep -c -P "\ttRNA\t"</div></div><div><br class=""></div><div><div class="">#count snoRNA transcripts</div><div class="">cat annotations.gff | grep -c -P "\tsnoRNA\t"</div></div><div><br class=""></div><div><br class=""></div><div>You can also pull out the Parent attribute and count unique entries to look at genes instead of transcripts. Example:</div><div><br class=""></div><div>#count protein coding genes (no -c on the first grep, so the matching lines are passed on to perl)</div><div>cat annotations.gff | grep -P "\tmRNA\t" | perl -ane '/Parent=([^\;\n]+)/; print "$1\n"' | sort | uniq | grep -c ""</div><div><br class=""></div><div><br class=""></div><div>You can also try tools like SOBA from the Sequence Ontology, which gives statistics on GFF3 features: <a href="http://www.sequenceontology.org/cgi-bin/soba.cgi" class="">http://www.sequenceontology.org/cgi-bin/soba.cgi</a></div><div><br class=""></div><div>--Carson</div><div><br class=""></div><div><br class=""></div><div><br class=""><blockquote type="cite" class=""><div class="">On Jul 27, 2021, at 9:15 AM, Emmanuel Nnadi <<a href="mailto:eennadi@gmail.com" class="">eennadi@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div dir="ltr" class="">Hello Carson, Greetings from Nigeria.<div class="">Please, how can I extract these metrics from my annotations?</div><div class=""><br class=""></div><div 
class=""><span style="font-size:12pt;font-family:'Times New Roman',serif" class="">Number of protein-coding genes in the assembled tea plant genome; those with known
proteins and/or domains; annotation of noncoding RNA
genes; ribosomal RNA genes; number of </span><span style="font-family:'Times New Roman',serif;font-size:12pt" class=""> transfer RNA genes; number of transcription factor
genes; and </span><span style="font-family:'Times New Roman',serif;font-size:12pt" class=""> simple sequence </span></div><div class=""><span style="font-family:'Times New Roman',serif;font-size:12pt" class=""><br class=""></span></div><div class=""><font face="Times New Roman, serif" class=""><span style="font-size:16px" class="">Thanks</span></font></div><div class=""><font face="Times New Roman, serif" class=""><span style="font-size:16px" class=""><br class=""></span></font></div><div class=""><div class=""><div dir="ltr" data-smartmail="gmail_signature" class=""><div dir="ltr" class=""><div class=""><div dir="ltr" class="">Nnaemeka Emmanuel Nnadi,Ph.D<div class="">Department of Microbiology,</div><div class="">Faculty of Natural and Applied Science,</div><div class="">Plateau State University, Bokkos, Plateau State, Nigeria.</div><div class="">+2348068124819</div><div class="">Publications: </div><div class=""><br class=""></div><div class=""><a href="https://orcid.org/0000-0002-8424-4859" target="_blank" class="">https://orcid.org/0000-0002-8424-4859</a><br class=""></div><div class=""><br class=""></div><div class=""> <a href="https://www.researchgate.net/profile/Emmanuel_Nnadi/publications" target="_blank" class="">https://www.researchgate.net/profile/Emmanuel_Nnadi/publications</a></div></div></div></div></div></div></div></div>
</div></blockquote></div><br class=""></div></div></div></body></html>