[maker-devel] loading scaffold features into chado

claudia dinatal at uwindsor.ca
Tue Mar 20 12:03:27 MDT 2012


Hi,

  I have enabled full text searching and I still have this problem, which
is another reason for concern. So I wondered: if I changed all the IDs in
the GFF3 file to supercontigs, perhaps Chado would better link all the
terms, annotations, and FASTA files. Although, I realize that the seq_id
(column 1) shouldn't need to be specific, since the 'type' term would take
care of designating the feature type, no?
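
To make the column 1 versus column 3 distinction concrete, here is a
minimal Python sketch. It is not part of MAKER or Chado; the function name
retype_features and the sample GFF3 lines are made up for illustration. It
rewrites only the 'type' column from contig to supercontig and leaves the
seq_id untouched, on the assumption that the loader keys on the SO term in
column 3 rather than on the sequence name.

```python
def retype_features(lines, old_type="contig", new_type="supercontig"):
    """Yield GFF3 lines with column 3 changed from old_type to new_type.

    Only the type column is edited; the seq_id in column 1 is left as-is.
    """
    for line in lines:
        # Pass comments, directives, and blank lines through unchanged.
        if line.startswith("#") or not line.strip():
            yield line
            continue
        ending = "\n" if line.endswith("\n") else ""
        cols = line.rstrip("\n").split("\t")
        # A GFF3 feature line has exactly 9 tab-separated columns;
        # anything else (e.g. FASTA sequence) is passed through.
        if len(cols) == 9 and cols[2] == old_type:
            cols[2] = new_type
            line = "\t".join(cols) + ending
        yield line


# Hypothetical sample input, for illustration only.
example = [
    "##gff-version 3\n",
    "scaffold001\t.\tcontig\t1\t5000\t.\t.\t.\tID=scaffold001;Name=scaffold001\n",
    "scaffold001\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=gene1\n",
]
for out_line in retype_features(example):
    print(out_line, end="")
```

In the output, the second line keeps scaffold001 in column 1 while column 3
becomes supercontig; the directive and the gene line are unchanged.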

Claudia



On 20/03/2012 1:25 PM, Scott Cain wrote:
> Hi Claudia,
>
> I agree with everything that Carson wrote, except about name
> searching--it's a little trickier in Chado.  What you probably want to
> do is implement full text searching.  See:
>
>    http://gmod.org/wiki/Chado_Full_Text_Search
>
> for more information on setting it up and maintaining it.
>
> Scott
>
>
> On Tue, Mar 20, 2012 at 1:13 PM, Carson Holt <carsonhh at gmail.com> wrote:
>>> I have 2 concerns, the first is:  regarding representing scaffold
>>> features in chado and gbrowse. I noticed that the Sequence ontology uses
>>> the term supercontig and so if my assembly generated scaffolds entitled
>>> "scaffold" should I change the names to supercontigs so that chado
>>> recognizes the terms?
>> Yes.  You must use valid SO terms.  It is a requirement of GFF3, and Chado
>> will enforce this requirement on loading a GFF3 file (note Chado will even
>> go as far as to check the validity of the Ontology_term= attribute in GFF3
>> if you use it).  You can decide to use contig or supercontig as your
>> sequence feature.  It doesn't really matter unless you are placing both
>> into the database as separate features (i.e., you have a supercontig as the
>> parent feature and then you enter contigs individually as children of the
>> supercontig).
>>
>>
>>> Corresponding to my first question, Maker does not know that the contigs
>>> are actually scaffold/supercontigs when annotating and so Maker will
>>> still call the "type" feature or column 3 in the GFF3, a 'contig', how
>>> can Maker be implemented to change this naming convention before
>>> annotation, or after?
>> Not really important unless you plan on making contigs children of the
>> supercontig.  But you can always do a search and replace:
>>
>>   cat file.gff | perl -ane 's/\tcontig\t/\tsupercontig\t/s; print $_' > new_file.gff
>>
>>
>>> Consequently, I am having problems pulling up gene features in Gbrowse
>>> when doing a generic gene search, and I must provide the maker generated
>>> unique-gene_id in the gbrowse search bar or the known sequence id, i.e.
>>> 'scaffold001', which is not useful for someone who does not have this
>>> information.
>>> ---- I do not have this problem when my seq_id, and 'type' feature id
>>> match in the true case of 'contigs'. I can do a generic gene search in
>>> gbrowse with the term 'maker' and gbrowse will provide me all the
>>> associated maker generated gene calls.
>> See "Adjusting GBrowse Name Searches" in the GBrowse tutorial -->
>> http://gmod.org/gbrowse2/tutorial/tutorial.html#naming
>>
>>
>> Thanks,
>> Carson
>>
>>> Thank you for any guidance resolving these concerns,
>>> Claudia
>>>
>>>
>>>
>>> --
>>> Claudia DiNatale
>>> Master's Candidate
>>> The Crosby Lab
>>> University of Windsor
>>> 519-253-3000 ext: 4755
>>>
>>>
>>> _______________________________________________
>>> maker-devel mailing list
>>> maker-devel at box290.bluehost.com
>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>
>


-- 
Claudia DiNatale
Master's Candidate
The Crosby Lab
University of Windsor
519-253-3000 ext: 4755




