Index out of range errors #166

Open
RagnarGrootKoerkamp opened this issue Apr 21, 2022 · 2 comments

@RagnarGrootKoerkamp

I'm getting some index out of range errors, possibly because of setting the same value (or too close?) for -min and -max:

-min 10000 -max 10000:

2022-04-21 13:17:35: Start simulation of aligned reads
Process Process-1:
Traceback (most recent call last):
  File "/home/philae/.local/share/miniconda3/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/philae/.local/share/miniconda3/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/philae/.local/share/miniconda3/bin/simulator.py", line 1293, in simulation_aligned_genome
    remainder = int(remainder_lengths[each_read])
IndexError: list index out of range

and

-min 900000 -max 1100000:

2022-04-21 13:19:34: Start simulation of aligned reads
Process Process-1:
Traceback (most recent call last):
  File "/home/philae/.local/share/miniconda3/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/philae/.local/share/miniconda3/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/philae/.local/share/miniconda3/bin/simulator.py", line 1294, in simulation_aligned_genome
    head_vs_ht_ratio = head_vs_ht_ratio_list[each_read]
IndexError: list index out of range
@SaberHQ
Member

SaberHQ commented Apr 21, 2022

With the first case, obviously it is not logical to set min and max length equal to each other. With your second case scenario, I suspect that the reference genome you are using is smaller than the read lengths you specified. May I ask whether you are using the pre-trained models or if you trained your own model?

@RagnarGrootKoerkamp
Author

RagnarGrootKoerkamp commented Apr 21, 2022

> With the first case, obviously it is not logical to set min and max length equal to each other.

Hmm OK, that wasn't obvious to me. I would like to generate some reads to test a pairwise aligner I'm working on, and for benchmarking it is nice to have reads of a specific length. I changed it to some interval around that length and it works now. Anyway, displaying a warning instead of just crashing would be nice ;)
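
For illustration, the kind of check I have in mind could look roughly like this. It is only a sketch: `check_length_bounds` and its `min_span` parameter are hypothetical names, not part of NanoSim, and whether NanoSim actually needs a wider gap than one base is not something I know.

```python
import warnings


def check_length_bounds(min_len, max_len, min_span=1):
    """Return True if the bounds leave room to sample a read length.

    Hypothetical helper, not NanoSim code. `min_span` is an assumed
    minimum gap between the two bounds.
    """
    if min_len > max_len:
        # Inverted bounds can never be satisfied, so fail loudly.
        raise ValueError(f"min_len ({min_len}) exceeds max_len ({max_len})")
    if max_len - min_len < min_span:
        # Report the problem instead of failing later with an IndexError.
        warnings.warn(
            f"min_len ({min_len}) and max_len ({max_len}) are too close; "
            "consider widening the interval"
        )
        return False
    return True
```

Calling `check_length_bounds(10000, 10000)` would emit the warning and return `False`, while something like `check_length_bounds(9000, 11000)` would return `True`.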

> With your second case scenario, I suspect that the reference genome you are using is smaller than the read lengths you specified.

Oh right, that may well be the case. I am using some human genome reference but I noticed my fasta file also has some shorter sequences in addition to the long chromosomes. Again, a warning message would be nice.
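
A quick way to confirm this on my side, independent of NanoSim, would be to list the reference sequences shorter than the requested maximum read length. The sketch below assumes a plain, uncompressed FASTA file; `sequence_lengths` and `too_short` are hypothetical helper names.

```python
def sequence_lengths(fasta_lines):
    """Map each FASTA record name to its total sequence length."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def too_short(fasta_lines, max_len):
    """Return the sorted names of records shorter than max_len."""
    lengths = sequence_lengths(fasta_lines)
    return sorted(name for name, length in lengths.items() if length < max_len)
```

Passing an open file handle for `input/reference/human.fa` together with the `--max_len` value to `too_short` would list the short sequences that might be behind the second error.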

> May I ask whether you are using the pre-trained models or if you trained your own model?

I'm using pre-trained models, since I don't have direct access to reads.

My full NanoSim invocation is this, where {..} will be substituted by snakemake:

    simulator.py genome \
    --ref_g input/reference/human.fa \
    --output input/simulated/human-x{wildcards.x}-n{wildcards.n} \
    -dna_type linear \
    --model_prefix ../../nanosim/pre-trained_models/human_NA12878_DNA_FAB49712_guppy/training \
    --min_len {params.min} \
    --median_len {wildcards.n} \
    --max_len {params.max} \
    --sd_len 1.05 \
    --number {params.generate_x} \
    --strandness 1 \
    --seed 314151 \
    --num_threads 6
