[maker-devel] collecting protein sequences as evidences
Quanwei Zhang
qwzhang0601 at gmail.com
Tue Jan 31 10:36:13 MST 2017
What is the best way to collect protein sequences to use as evidence for
gene annotation of a de novo genome assembly?
(1) My first choice is to get human and mouse protein sequences from
UniProt. Here I am not sure whether I should download the reviewed
entries (i.e., Swiss-Prot) or the automatically annotated ones (i.e.,
TrEMBL).
(2) On the other hand, I can also get protein sequences from NCBI. Should I
simply merge those FASTA files with the UniProt ones? Does it matter if the
merged set contains redundant sequences? Also, protein sequences from
different sources may not be of the same quality. Do I need to filter or
clean them in some way before I integrate them?
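
To make the question in (2) concrete, here is a minimal sketch of what a
simple merge with exact-duplicate removal could look like. It is only an
illustration and not part of MAKER: the function names parse_fasta and
merge_unique are hypothetical, and it removes only identical sequences, not
near-identical ones such as isoforms or close orthologs.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_unique(sources: Iterable[Iterable[str]]) -> List[Tuple[str, str]]:
    """Merge FASTA sources, keeping the first header seen for each sequence.

    Sequences are compared after upper-casing and stripping a trailing '*'
    (stop symbol); the normalized sequence is what gets returned.
    """
    seen: Dict[str, str] = {}  # normalized sequence -> first header
    for lines in sources:
        for header, seq in parse_fasta(lines):
            key = seq.upper().rstrip("*")
            if key and key not in seen:
                seen[key] = header
    return [(header, seq) for seq, header in seen.items()]
```

Each source can be any iterable of lines, for example an open file handle
for a downloaded UniProt or NCBI FASTA file. Sources listed first win when
the same sequence appears more than once, so putting the reviewed set first
would keep its headers.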
Many thanks
Best
Quanwei