<div dir="ltr"><div><div>Hello:<br><br></div>Thank you for all your previous comments and suggestions. We annotated a new rodent species using the MAKER2 pipeline. The assembly is about 3.2 Gb with an N50 of 24.3 Mb. I included all scaffolds longer than 300 bp for gene annotation (about 250k scaffolds). <br><br>For repeat masking, we also built a species-specific library. We used both transcriptome and protein sequences as evidence (including 10k reviewed mammalian and 340k predicted rodent protein sequences from UniProt). We predicted 28,800 genes with AED<1 (the "default" gene set). <br></div><div><br></div><div>Of the 28,800 predicted proteins, about 90% have an AED value below 0.5, and 74% have domains identified by InterProScan. The genome seems to be well annotated, but I still feel that 28,800 protein-coding genes are too many for a rodent species. Do you think this gene set is good enough for downstream analyses (e.g., gene family expansion analysis, positive selection analysis)? Or should I filter further to bring the number of genes closer to the expected number (e.g., 22,000)?<br></div><div><br></div><div>Thanks</div><div><br></div><div>Best</div><div>Quanwei<br></div><div><br></div>
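<div>P.S. To make the filtering question concrete, here is a minimal Python sketch of one possible filter. It assumes the MAKER GFF3 stores each model's score as an "_AED=" attribute on its mRNA line. The function names, the cutoff passed as max_aed, and the rule of also requiring an InterProScan domain hit are illustrative assumptions only, not a recommended criterion.</div>
<pre>
```python
import re

# Assumed layout: MAKER writes "_AED=..." in column 9 of each mRNA feature.
AED_PATTERN = re.compile(r"(?:^|;)_AED=([0-9.]+)")
ID_PATTERN = re.compile(r"(?:^|;)ID=([^;]+)")


def mrna_aed_scores(gff_lines):
    """Yield (mRNA ID, AED) for every mRNA line that carries an _AED attribute."""
    for line in gff_lines:
        if line.startswith("#"):
            continue  # skip GFF3 directives and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "mRNA":
            continue  # only mRNA features are scored here
        aed_match = AED_PATTERN.search(fields[8])
        if aed_match is None:
            continue
        id_match = ID_PATTERN.search(fields[8])
        mrna_id = id_match.group(1) if id_match else ""
        yield mrna_id, float(aed_match.group(1))


def keep_ids(gff_lines, max_aed, ids_with_domain):
    """Hypothetical filter: keep mRNAs with AED at most max_aed AND a domain hit."""
    return [
        mrna_id
        for mrna_id, aed in mrna_aed_scores(gff_lines)
        if max_aed >= aed and mrna_id in ids_with_domain
    ]
```
</pre>
<div>Here ids_with_domain would be the set of mRNA IDs that have at least one InterProScan hit, collected separately. Counting the IDs that keep_ids returns for a few cutoffs would show how quickly the gene number drops toward 22,000.</div>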
</div>