chromosomes longer than 2.1 GB lead to crash #353

Open
MarioStanke opened this issue Aug 11, 2022 · 2 comments

@MarioStanke

Apparently, this happens because sequence positions are stored in a 4-byte signed int, which cannot represent positions of 2^31 or larger.
Error message:

examining piece 1..-1928087540 (-1928087540 bp)
terminate called after throwing an instance of 'std::bad_alloc'
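
The negative end position in the log is consistent with a coordinate above 2^31 being truncated to 32 bits. A minimal sketch of that arithmetic, assuming the printed value is the low 32 bits of the true coordinate (`wrapToInt32` is a hypothetical helper for illustration, not a function in AUGUSTUS):

```cpp
#include <cstdint>

// Hypothetical helper (not part of AUGUSTUS): the value a 32-bit signed int
// would hold after a non-negative 64-bit coordinate is truncated to 32 bits.
constexpr std::int64_t wrapToInt32(std::int64_t pos) {
    // Keep the low 32 bits, then reinterpret them as two's complement.
    const std::int64_t low = pos & 0xFFFFFFFFLL;
    return low >= 0x80000000LL ? low - 0x100000000LL : low;
}
```

Under this assumption, `wrapToInt32(2366879756LL)` evaluates to `-1928087540`, the value shown in the log, which would correspond to a sequence of 2,366,879,756 bp. A negative length passed on to an allocation is one plausible route to `std::bad_alloc`, though the thread does not confirm the exact call site.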
@MarioStanke

Apparently, this is not completely solved, at least when predictions are requested on the complete chromosome in one run (rather than using --predictionStart and --predictionEnd).

$AUGUSTUS --species=rice --softmasking=0 --protein=on --codingseq=on --progress=true --gff3=on --alternatives-from-evidence=false --alternatives-from-sampling=false --extrinsicCfgFile=$EXCFFILE $GENOME_PART

This leads to a segmentation fault after ~10k minutes of compute time:

examining piece 2147286171..-2147481126 
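
Here the start position is still below 2^31 (2147483648) while the end is negative, which suggests that only the end of this piece crossed the int limit. A small sketch that undoes the wrap, assuming exactly one 32-bit wrap-around and inclusive coordinates (both helper names are hypothetical):

```cpp
#include <cstdint>

// Hypothetical helper: recover the coordinate that a wrapped 32-bit value
// stands for, assuming it wrapped around exactly once.
constexpr std::int64_t unwrapFromInt32(std::int32_t logged) {
    return logged < 0 ? static_cast<std::int64_t>(logged) + 0x100000000LL
                      : static_cast<std::int64_t>(logged);
}

// Length of a piece with inclusive begin and end coordinates.
constexpr std::int64_t pieceLength(std::int64_t begin, std::int64_t end) {
    return end - begin + 1;
}
```

With these assumptions, `unwrapFromInt32(-2147481126)` gives 2147486170, and `pieceLength(2147286171, 2147486170)` gives 200000, so the failing piece would be the first one whose end lies beyond 2^31 - 1.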

@piroyon

piroyon commented Aug 20, 2022

How about changing the types of beginPos, endPos, seqlen, and restlen, as well as the beginPos parameter and the return type of getNextCutEndPoint, from int to long in namgene.cc?

diff namgene.cc namgene.cc.org 
536,537c536,537
<   long endPos, beginPos;
<   long seqlen = strlen(dna);
---
>   int endPos, beginPos;
>   int seqlen = strlen(dna);
972,973c972,973
< long NAMGene::getNextCutEndPoint(const char *dna, long beginPos, int maxstep, SequenceFeatureCollection& sfc){
<   long restlen = strlen(dna+beginPos);
---
> int NAMGene::getNextCutEndPoint(const char *dna, int beginPos, int maxstep, SequenceFeatureCollection& sfc){
>   int restlen = strlen(dna+beginPos);

Using long would increase the memory requirements.
I haven't encountered this error, so sorry if it doesn't work.
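
One caveat on the type choice: `long` is 64 bits wide on typical 64-bit Linux and macOS builds but only 32 bits on 64-bit Windows, and `strlen` itself returns `std::size_t`. A fixed-width type such as `std::int64_t` avoids that platform dependence. The sketch below illustrates cutting a sequence into pieces with 64-bit positions; `SeqPos`, `nextCutEnd`, and `countPieces` are hypothetical names, and the cut rule is a simplified stand-in, not the logic of `NAMGene::getNextCutEndPoint`:

```cpp
#include <cstdint>

// Hypothetical 64-bit position type (1-based, inclusive coordinates).
using SeqPos = std::int64_t;

// Simplified stand-in for a cut-point search: the piece starting at beginPos
// ends after at most maxstep bases or at the end of the sequence.
// Requires maxstep >= 1 and 1 <= beginPos <= seqlen.
constexpr SeqPos nextCutEnd(SeqPos seqlen, SeqPos beginPos, SeqPos maxstep) {
    return (seqlen - beginPos < maxstep - 1) ? seqlen : beginPos + maxstep - 1;
}

// Count the pieces needed to cover the whole sequence.
constexpr SeqPos countPieces(SeqPos seqlen, SeqPos maxstep) {
    SeqPos pieces = 0;
    SeqPos begin = 1;
    while (begin <= seqlen) {
        begin = nextCutEnd(seqlen, begin, maxstep) + 1;
        ++pieces;
    }
    return pieces;
}
```

With 64-bit positions, `nextCutEnd(2366879756LL, 2147286171LL, 200000)` returns 2147486170 instead of wrapping to a negative value. Every variable that stores or receives such a position would need the wider type as well, since a single remaining `int` in the chain truncates the value again.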
