[maker-devel] loading scaffold features into chado
Scott Cain
scott at scottcain.net
Tue Mar 20 13:50:55 MDT 2012
Hi Claudia,
Can you post a sample of the gff that shows what you are looking for and not finding?
Scott
On Mar 20, 2012, at 2:03 PM, claudia <dinatal at uwindsor.ca> wrote:
> Hi,
>
> I have enabled full text searching and I still have this problem, which is another reason for concern. So I wondered: if I changed all the IDs in the GFF3 file to supercontigs, would Chado then better link all the terms, annotations, and FASTA files? Although I realize that the seq_id (column 1) shouldn't need to be specific, since the 'type' term would take care of designating the feature type, no?
>
> Claudia
>
>
>
> On 20/03/2012 1:25 PM, Scott Cain wrote:
>> Hi Claudia,
>>
>> I agree with everything that Carson wrote, except about name
>> searching--it's a little trickier in Chado. What you probably want to
>> do is implement full text searching. See:
>>
>> http://gmod.org/wiki/Chado_Full_Text_Search
>>
>> for more information on setting it up and maintaining it.
>>
>> Scott
>>
>>
>> On Tue, Mar 20, 2012 at 1:13 PM, Carson Holt <carsonhh at gmail.com> wrote:
>>>> I have 2 concerns. The first is about representing scaffold
>>>> features in Chado and GBrowse. I noticed that the Sequence Ontology uses
>>>> the term supercontig, so if my assembly generated scaffolds named
>>>> "scaffold", should I change the names to supercontigs so that Chado
>>>> recognizes the terms?
>>> Yes. You must use valid SO terms. It is a requirement of GFF3, and Chado
>>> will enforce this requirement on loading a GFF3 file (note Chado will even
>>> go as far as to check the validity of the Ontology_term= attribute in GFF3
>>> if you use it). You can decide to use contig or supercontig as your
>>> sequence feature. It doesn't really matter unless you are placing both
>>> into the database as separate features (i.e., you have a supercontig as the
>>> parent feature and then you enter contigs individually as children of the
>>> supercontig).
>>>
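Before loading, it can help to list which column 3 types in the GFF3 fall outside the terms you expect. The following is a minimal sketch, not Chado's own validation: `ALLOWED_TYPES` is a small hand-picked set of Sequence Ontology terms chosen for illustration, `count_unknown_types` is a hypothetical helper, and the sample lines are made up. The authoritative check is still whatever SO release is loaded into your Chado instance.

```python
from collections import Counter

# Assumption: a small hand-picked subset of Sequence Ontology feature types.
# Replace it with the terms from the SO release loaded into your database.
ALLOWED_TYPES = {"contig", "supercontig", "gene", "mRNA", "exon", "CDS"}


def count_unknown_types(gff_lines, allowed=ALLOWED_TYPES):
    """Count column 3 types that are not in `allowed`."""
    unknown = Counter()
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a 9-column feature line
        if fields[2] not in allowed:
            unknown[fields[2]] += 1
    return dict(unknown)


# Hypothetical sample lines, for illustration only.
sample = [
    "##gff-version 3",
    "scaffold001\t.\tscaffold\t1\t5000\t.\t.\t.\tID=scaffold001;Name=scaffold001",
    "scaffold001\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1;Name=gene1",
]
print(count_unknown_types(sample))
```

An empty result only means every type is in the set you supplied; it says nothing about terms the set itself gets wrong.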
>>>
>>>> Related to my first question, Maker does not know that the contigs
>>>> are actually scaffolds/supercontigs when annotating, so Maker will
>>>> still call the "type" feature (column 3 in the GFF3) a 'contig'. How
>>>> can Maker be made to change this naming convention, either before
>>>> annotation or after?
>>> Not really important unless you plan on making contigs children of the
>>> supercontig. But you can always do a search and replace. -->
>>> perl -pe 's/\tcontig\t/\tsupercontig\t/' file.gff > new_file.gff
>>>
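The search and replace above rewrites the first tab-delimited "contig" it finds on each line, whichever column that is. If you want the change limited strictly to column 3, something like the following sketch may be safer. `retype_features` is a hypothetical helper name, the sample line is made up, and the file names in the closing comment are just the ones from the one-liner.

```python
def retype_features(gff_lines, old="contig", new="supercontig"):
    """Yield GFF3 lines with column 3 changed from `old` to `new`."""
    in_fasta = False
    for line in gff_lines:
        if line.startswith("##FASTA"):
            in_fasta = True  # everything after this is sequence data
        if in_fasta or line.startswith("#"):
            yield line  # pass directives, comments, and FASTA through
            continue
        fields = line.split("\t")
        # Only touch well-formed 9-column feature lines whose type matches.
        if len(fields) == 9 and fields[2] == old:
            fields[2] = new
            line = "\t".join(fields)
        yield line


# Hypothetical sample line, for illustration only.
sample = ["scaffold001\t.\tcontig\t1\t5000\t.\t.\t.\tID=scaffold001\n"]
print("".join(retype_features(sample)), end="")

# To convert a file (names taken from the one-liner above):
# with open("file.gff") as src, open("new_file.gff", "w") as dst:
#     dst.writelines(retype_features(src))
```

Line endings are preserved because the trailing newline stays attached to the last field when splitting on tabs.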
>>>
>>>> Consequently, I am having problems pulling up gene features in GBrowse
>>>> when doing a generic gene search. I must enter the Maker-generated
>>>> unique gene ID or the known sequence ID (i.e., 'scaffold001') in the
>>>> GBrowse search bar, which is not useful for someone who does not have
>>>> this information.
>>>> I do not have this problem when my seq_id and 'type' feature ID
>>>> match, as in the true case of 'contigs'. There I can do a generic gene
>>>> search in GBrowse with the term 'maker', and GBrowse will give me all
>>>> the associated Maker-generated gene calls.
>>> See "Adjusting GBrowse Name Searches" in the GBrowse tutorial -->
>>> http://gmod.org/gbrowse2/tutorial/tutorial.html#naming
>>>
>>>
>>> Thanks,
>>> Carson
>>>
>>>> Thank you for any guidance resolving these concerns,
>>>> Claudia
>>>>
>>>>
>>>>
>>>> --
>>>> Claudia DiNatale
>>>> Master's Candidate
>>>> The Crosby Lab
>>>> University of Windsor
>>>> 519-253-3000 ext: 4755
>>>>
>>>>
>>>> _______________________________________________
>>>> maker-devel mailing list
>>>> maker-devel at box290.bluehost.com
>>>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>>
>>
>>
>