[maker-devel] maker output- transcripts.fasta and proteins.fasta files missing
Carson Holt
carsonhh at gmail.com
Thu Mar 13 10:55:40 MDT 2014
The second time, it should have just started where it left off, so it would
run faster (because the processing from the previous job counted towards the
second one). The archived output you sent me had 21,183 proteins and
transcripts. If you are using fasta_merge to collect them, just make
sure the datastore.index file is not truncated or corrupt; otherwise it won’t
collect all the FASTAs from every contig. You can rebuild the
datastore.index using the -dsindex flag with MAKER if you want to check
that. Also, you can have MAKER regenerate results without rerunning
BLAST etc. by using the -a flag if you just want to recalculate all results
quickly (it rebuilds all FASTA and GFF3 without redoing most of the analysis).
—Carson
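One way to check the index for truncation before running fasta_merge is sketched below. It assumes the master datastore index is a tab-separated log with three columns (contig name, output directory, status) and that a completed contig ends with a FINISHED entry. That layout is an assumption, not something stated in this thread, so verify it against your own file. The function names are made up for illustration.

```python
def summarize_index(lines):
    """Parse index lines; return (last status per contig, malformed lines).

    Assumed layout (not confirmed by the thread): three tab-separated
    columns per line: contig name, output directory, status.
    """
    last_status = {}
    malformed = []
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            # A cut-off final line is a typical sign of a truncated index.
            malformed.append((lineno, line))
            continue
        contig, _directory, status = fields
        last_status[contig] = status  # later entries override earlier ones
    return last_status, malformed


def unfinished(last_status, done="FINISHED"):
    """Return contigs whose most recent status is not the assumed 'done' value."""
    return sorted(c for c, s in last_status.items() if s != done)
```

To run it on a real file, pass an open file handle, for example `summarize_index(open(path))`, where `path` points at your own index log. Any malformed lines or unfinished contigs suggest rebuilding the index with -dsindex before merging.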
From: dhivya arasappan <darasappan at gmail.com>
Date: Thursday, March 13, 2014 at 10:47 AM
To: Carson Holt <carsonhh at gmail.com>
Cc: Daniel Ence <dence at genetics.utah.edu>, "maker-devel at yandell-lab.org"
<maker-devel at yandell-lab.org>
Subject: Re: maker output- transcripts.fasta and proteins.fasta files
missing
Thanks Carson for the response. I understand that est2genome=1 does not use
any ab initio gene predictions, but simply identifies ests based on
alignment. I'm a little confused because I ran maker on my assembly before,
using the same parameters (including est2genome=1). I got a very good
result with > 20,000 transcripts and proteins.
Then I was able to get an improved assembly, where many scaffolds were
combined into superscaffolds. So I reran maker on this assembly. Same
parameters, same transcriptome and proteins files. Now, I see such
drastically different results: Only 500+ genes and transcripts. My
scaffolds are now bigger than before, so I'm not sure how this is happening.
These were the results I sent you.
Another odd thing I noticed (and I am hesitant to report this because
perhaps it is due to some sort of error on my part): I ran maker on the
improved assembly the first time and maker did not complete in the 48 hours
I allocated. But I had 19,000+ transcripts in the unfinished output. When
I reran maker, just changing the time allocated, it completed much faster
but gave far fewer transcripts and proteins as output. Could
something like this happen? If not, then I'm guessing I must have changed
something, although I'm pretty sure that I did not change anything other than
the time allocated. I've attached the transcripts and proteins files from the
first time I ran maker on my improved assembly.
Thanks again for your help
Dhivya
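To make the comparison between runs concrete, the records in each merged file can be counted directly. This is a minimal sketch that counts header lines only. The helper names are illustrative, and any file paths you pass in are your own, since none are given in this thread.

```python
def count_fasta_records(lines):
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def count_fasta_file(path):
    """Count the records in a FASTA file on disk."""
    with open(path) as handle:
        return count_fasta_records(handle)
```

Running `count_fasta_file` on the merged transcripts and proteins files from each run gives numbers that can be compared side by side, which helps separate a genuinely smaller annotation from an incomplete merge.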
On Mar 13, 2014, at 11:14 AM, Carson Holt wrote:
> Note that protein/transcript FASTAs are only created when there are gene models to
> output to those files (so their absence means there were no gene models for
> that contig). Most sequences without protein/transcript FASTAs in your sample
> are very short and thus don’t contain anything. What is left either has no
> est2genome results or has est2genome alignments without a sufficient open
> reading frame to be turned into a gene model (false merging of regions by
> Trinity can cause this, so make sure you use the jaccard index option when
> assembling reads with Trinity to avoid this).
>
> You are using only the est2genome=1 option. This will result in a limited set
> of genes that can be used for training SNAP/Augustus (so not getting results
> on all contigs is expected). You really won’t get much as far as results
> until you have one of the ab initio predictors turned on.
>
> Thanks,
> Carson
>
>
> From: dhivya arasappan <darasappan at gmail.com>
> Date: Tuesday, March 11, 2014 at 8:52 AM
> To: Carson Holt <carsonhh at gmail.com>
> Cc: Daniel Ence <dence at genetics.utah.edu>
> Subject: Re: maker output- transcripts.fasta and proteins.fasta files missing
>
> Alright done. My username is daras
>
> Thanks
> Dhivya
>
> On Mar 10, 2014, at 5:10 PM, Carson Holt wrote:
>
>> Input and compressed file of output.
>>
>> Thanks,
>> Carson
>>
>> From: dhivya arasappan <darasappan at gmail.com>
>> Date: Monday, March 10, 2014 at 2:09 PM
>> To: Carson Holt <carsonhh at gmail.com>
>> Cc: Daniel Ence <dence at genetics.utah.edu>
>> Subject: Re: maker output- transcripts.fasta and proteins.fasta files
>> missing
>>
>> Hi Carson,
>>
>> Do you mean the whole maker output?
>>
>> Thanks
>> dhivya
>>
>> On Mar 10, 2014, at 4:55 PM, Carson Holt wrote:
>>
>>> Could you upload everything here —>
>>> http://weatherby.genetics.utah.edu/cgi-bin/mwas/bug.cgi
>>>
>>> Then send us the link generated or your user ID.
>>>
>>> Thanks,
>>> Carson
>>>
>>>
>>>
>>> From: dhivya arasappan <darasappan at gmail.com>
>>> Date: Monday, March 10, 2014 at 1:50 PM
>>> To: Carson Holt <carsonhh at gmail.com>, Daniel Ence <dence at genetics.utah.edu>
>>> Subject: Fwd: maker output- transcripts.fasta and proteins.fasta files
>>> missing
>>>
>>> Hi Carson and Daniel,
>>>
>>> I'm sending this across to you separately since maker list is blocking my
>>> email due to attachment size.
>>>
>>> As always, thanks for any guidance you can provide.
>>> Dhivya
>>>
>>>
>>> Begin forwarded message:
>>>
>>>> From: dhivya arasappan <darasappan at gmail.com>
>>>> Date: March 10, 2014 3:14:03 PM CDT
>>>> To: maker-devel at yandell-lab.org
>>>> Subject: maker output- transcripts.fasta and proteins.fasta files missing
>>>>
>>>>
>>>> Hello,
>>>>
>>>> I've been running maker with different assembly files, reference files, etc.,
>>>> and I check the output by:
>>>>
>>>> 1. concatenating the gff files
>>>> 2. concatenating the *transcripts.fasta files
>>>> 3. concatenating the *proteins.fasta files
>>>>
>>>> I'm noticing that when I ran maker twice with the same parameters, the second
>>>> time around, many of the output subdirectories do not have a
>>>> *transcripts.fasta or *proteins.fasta file in them.
>>>> There are 251 subdirectories and only 97 of them have all 3 output files.
>>>> Maker log looks ok to me, but I've attached it here as well.
>>>>
>>>> What could be the reason for this?
>>>>
>>>> Thanks
>>>> dhivya
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
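The check described in the original message at the bottom of this thread (counting which output subdirectories hold all three files) can be sketched as follows. The `*transcripts.fasta` and `*proteins.fasta` patterns come from that message; the `*.gff` pattern and the function name are assumptions made for illustration. As noted in the reply above, a missing FASTA is not necessarily an error: it can simply mean the contig produced no gene models.

```python
import os
from fnmatch import fnmatch

# File patterns expected in each per-contig output directory.
# "*.gff" is an assumed naming convention; adjust it to match your output.
REQUIRED = ("*.gff", "*transcripts.fasta", "*proteins.fasta")


def missing_outputs(root):
    """Map each directory that holds a GFF file to the required patterns it lacks."""
    report = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        if not any(fnmatch(name, "*.gff") for name in filenames):
            continue  # not treated as a per-contig output directory
        report[dirpath] = [
            pattern
            for pattern in REQUIRED
            if not any(fnmatch(name, pattern) for name in filenames)
        ]
    return report
```

Directories mapped to an empty list have all three files. Comparing the others against contig length is a quick way to see whether the gaps are confined to very short sequences.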