[maker-devel] loading scaffold features into chado
Scott Cain
scott at scottcain.net
Tue Mar 20 11:25:04 MDT 2012
Hi Claudia,
I agree with everything that Carson wrote, except about name
searching--it's a little trickier in Chado. What you probably want to
do is implement full text searching. See:
http://gmod.org/wiki/Chado_Full_Text_Search
for more information on setting it up and maintaining it.
Scott
On Tue, Mar 20, 2012 at 1:13 PM, Carson Holt <carsonhh at gmail.com> wrote:
>
>>I have 2 concerns. The first is about representing scaffold
>>features in Chado and GBrowse. I noticed that the Sequence Ontology uses
>>the term supercontig, so if my assembly generated scaffolds named
>>"scaffold", should I change the names to supercontigs so that Chado
>>recognizes the terms?
>
> Yes. You must use valid SO terms. It is a requirement of GFF3, and Chado
> will enforce this requirement on loading a GFF3 file (note Chado will even
> go as far as to check the validity of the Ontology_term= attribute in GFF3
> if you use it). You can decide to use contig or supercontig as your
> sequence feature. It doesn't really matter unless you are placing both
> into the database as separate features (i.e., you have a supercontig as the
> parent feature and then you enter contigs individually as children of the
> supercontig).
>
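Before loading, it can help to see which values actually appear in column 3 so they can be checked against the Sequence Ontology. The following is only a minimal sketch in Python: the function name count_feature_types and the sample lines are made up for illustration, and it does not itself validate anything against SO.

```python
from collections import Counter


def count_feature_types(gff_lines):
    """Count the values found in GFF3 column 3 (the feature type).

    Hypothetical helper for illustration. It skips blank lines and
    comment/directive lines, and stops at the ##FASTA directive, after
    which a GFF3 file holds sequence data rather than features.
    """
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 3:
            counts[cols[2]] += 1
    return counts


# Made-up sample lines, not taken from a real MAKER run.
example = [
    "##gff-version 3\n",
    "scaffold001\t.\tcontig\t1\t5000\t.\t.\t.\tID=scaffold001\n",
    "scaffold001\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1\n",
    "scaffold001\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n",
]

for feature_type, n in sorted(count_feature_types(example).items()):
    print(feature_type, n)
```

For a real file, passing an open file handle (for example open("file.gff")) in place of the sample list should work the same way, since the function only iterates over lines.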
>
>>
>>Related to my first question, Maker does not know that the contigs
>>are actually scaffolds/supercontigs when annotating, so Maker will
>>still set the "type" field (column 3 of the GFF3) to 'contig'. How
>>can Maker be made to change this naming convention, either before
>>annotation or after?
>
> Not really important unless you plan on making contigs children of the
> supercontig. But you can always do a search and replace. -->
> perl -pe 's/\tcontig\t/\tsupercontig\t/' file.gff > new_file.gff
>
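The search and replace above matches the first tab-delimited "contig" anywhere on a line, so it would also hit column 2 if the source field happened to be "contig". A slightly more conservative variant touches only column 3. This is a rough sketch in Python, not part of MAKER or Chado; the function name rename_feature_type and the sample lines are assumptions made for the example.

```python
def rename_feature_type(gff_lines, old="contig", new="supercontig"):
    """Yield GFF3 lines with column 3 changed from `old` to `new`.

    Hypothetical helper for illustration. Comment/directive lines and
    everything from the ##FASTA directive onward are passed through
    unchanged, as are lines whose type is not `old`.
    """
    in_fasta = False
    for line in gff_lines:
        if line.startswith("##FASTA"):
            in_fasta = True
        if in_fasta or line.startswith("#"):
            yield line
            continue
        cols = line.split("\t")
        # Only the type column (index 2) is compared and rewritten.
        if len(cols) >= 3 and cols[2] == old:
            cols[2] = new
        yield "\t".join(cols)


# Made-up sample lines, not taken from a real MAKER run.
sample = [
    "##gff-version 3\n",
    "scaffold001\t.\tcontig\t1\t5000\t.\t.\t.\tID=scaffold001\n",
    "scaffold001\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1\n",
]

for out_line in rename_feature_type(sample):
    print(out_line, end="")
```

To convert a file, the same generator could be fed an open input handle and its output written with writelines() on an output handle.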
>
>>
>>Consequently, I am having problems pulling up gene features in GBrowse
>>when doing a generic gene search: I must provide the Maker-generated
>>unique gene ID in the GBrowse search bar, or the known sequence ID (i.e.,
>>'scaffold001'), which is not useful for someone who does not have this
>>information.
>>I do not have this problem when my seq_id and 'type' feature ID
>>match, as in the true case of 'contigs'. There I can do a generic gene
>>search in GBrowse with the term 'maker', and GBrowse will return all the
>>associated Maker-generated gene calls.
>
> See "Adjusting GBrowse Name Searches" in the GBrowse tutorial -->
> http://gmod.org/gbrowse2/tutorial/tutorial.html#naming
>
>
> Thanks,
> Carson
>
>>
>>Thank you for any guidance resolving these concerns,
>>Claudia
>>
>>
>>
>>--
>>Claudia DiNatale
>>Master's Candidate
>>The Crosby Lab
>>University of Windsor
>>519-253-3000 ext: 4755
>>
>>
>>_______________________________________________
>>maker-devel mailing list
>>maker-devel at box290.bluehost.com
>>http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D. scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research