[maker-devel] loading scaffold features into chado

Scott Cain scott at scottcain.net
Tue Mar 20 13:50:55 MDT 2012


Hi Claudia,

Can you post a sample of the GFF that shows what you are looking for and not finding?

Scott



On Mar 20, 2012, at 2:03 PM, claudia <dinatal at uwindsor.ca> wrote:

> Hi,
> 
> I have enabled full text searching and I still have this problem, which is another reason for concern. So I wondered whether, if I changed all the IDs in the GFF3 file to supercontigs, Chado would better link all the terms, annotations, and FASTA files. Although I realize that the seq_id (column 1) shouldn't need to be specific, since the 'type' term would take care of designating the feature type, no?
> 
> Claudia
> 
> 
> 
> On 20/03/2012 1:25 PM, Scott Cain wrote:
>> Hi Claudia,
>> 
>> I agree with everything that Carson wrote, except about name
>> searching--it's a little trickier in Chado.  What you probably want to
>> do is implement full text searching.  See:
>> 
>>   http://gmod.org/wiki/Chado_Full_Text_Search
>> 
>> for more information on setting it up and maintaining it.
>> 
>> Scott
>> 
>> 
>> On Tue, Mar 20, 2012 at 1:13 PM, Carson Holt <carsonhh at gmail.com> wrote:
>>>> I have 2 concerns. The first is about representing scaffold
>>>> features in Chado and GBrowse. I noticed that the Sequence Ontology uses
>>>> the term supercontig, so if my assembly generated scaffolds named
>>>> "scaffold", should I change the names to supercontigs so that Chado
>>>> recognizes the terms?
>>> Yes.  You must use valid SO terms.  It is a requirement of GFF3, and Chado
>>> will enforce this requirement on loading a GFF3 file (note Chado will even
>>> go as far as to check the validity of the Ontology_term= attribute in GFF3
>>> if you use it).  You can decide to use contig or supercontig as your
>>> sequence feature.  It doesn't really matter unless you are placing both
>>> into the database as separate features (i.e., you have a supercontig as the
>>> parent feature and then you enter contigs individually as children of the
>>> supercontig).
>>> 
>>> 
>>>> Related to my first question, MAKER does not know that the contigs
>>>> are actually scaffolds/supercontigs when annotating, so MAKER will
>>>> still set the "type" field (column 3 in the GFF3) to 'contig'. How
>>>> can MAKER be made to change this naming convention, before
>>>> annotation or after?
>>> Not really important unless you plan on making contigs children of the
>>> supercontig.  But you can always do a search and replace:
>>> perl -pe 's/\tcontig\t/\tsupercontig\t/' file.gff > new_file.gff
>>> 
>>> 
>>>> Consequently, I am having problems pulling up gene features in GBrowse
>>>> when doing a generic gene search. I must provide the MAKER-generated
>>>> unique gene ID in the GBrowse search bar, or the known sequence ID, i.e.,
>>>> 'scaffold001', which is not useful for someone who does not have this
>>>> information.
>>>> I do not have this problem when my seq_id and 'type' feature
>>>> match, as in the true case of 'contigs'. There I can do a generic gene
>>>> search in GBrowse with the term 'maker' and GBrowse will return all the
>>>> associated MAKER-generated gene calls.
>>> See "Adjusting GBrowse Name Searches" in the GBrowse tutorial -->
>>> http://gmod.org/gbrowse2/tutorial/tutorial.html#naming
>>> 
>>> 
>>> Thanks,
>>> Carson
>>> 
>>>> Thank you for any guidance resolving these concerns,
>>>> Claudia
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Claudia DiNatale
>>>> Master's Candidate
>>>> The Crosby Lab
>>>> University of Windsor
>>>> 519-253-3000 ext: 4755
>>>> 
>>>> 
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>> 
>> 
>> 
> 
> 
> -- 
> Claudia DiNatale
> Master's Candidate
> The Crosby Lab
> University of Windsor
> 519-253-3000 ext: 4755
> 
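
The search and replace on the GFF3 type column suggested in the quoted thread can also be sketched in Python (used here instead of Perl purely for illustration). This is a minimal sketch, not part of MAKER or Chado: the function name rename_feature_type and the sample records are hypothetical. Unlike a plain text substitution, it only touches column 3 of nine-column feature lines, so a seq_id such as 'scaffold001' in column 1 is left alone, and it assumes any embedded sequence follows a ##FASTA directive.

```python
def rename_feature_type(lines, old="contig", new="supercontig"):
    """Yield GFF3 lines with the type column (column 3) changed from old to new.

    Hypothetical helper for illustration only. Comment lines, directives,
    blank lines, and everything after a ##FASTA directive pass through unchanged.
    """
    in_fasta = False
    for line in lines:
        if line.startswith("##FASTA"):
            in_fasta = True
        if in_fasta or line.startswith("#") or not line.strip():
            yield line
            continue
        body = line.rstrip("\r\n")
        eol = line[len(body):]          # preserve the original line ending
        cols = body.split("\t")
        # A GFF3 feature line has exactly nine tab-separated columns.
        if len(cols) == 9 and cols[2] == old:
            cols[2] = new
            yield "\t".join(cols) + eol
        else:
            yield line


# Made-up sample records, only to show the shape of the input.
sample = (
    "##gff-version 3\n"
    "scaffold001\t.\tcontig\t1\t5000\t.\t.\t.\tID=scaffold001;Name=scaffold001\n"
    "scaffold001\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=gene1\n"
)

converted = list(rename_feature_type(sample.splitlines(keepends=True)))
print("".join(converted), end="")
```

To convert a file, the same generator could be fed an open file handle and its output written to a new file, mirroring file.gff and new_file.gff in the one-liner.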
