Write reads to disk immediately instead of caching them in RAM #210

Open
wants to merge 1 commit into
base: master


@sebschmi sebschmi commented Aug 3, 2021

First of all, thank you for making ISS. I find it very fast and easy to use, especially because it ships with error models.

While trying it out, I noticed that it uses a lot of RAM, which seemed odd for a read simulator, especially since its memory usage keeps growing over time. I think I found the cause and a fix for it.

When generating reads, ISS first stores all of them in a Python list in RAM. Only after generating all reads does it write them to disk.

Writing each read to disk immediately after it is generated is much more memory-efficient, so that is what I did: I moved the read generation code into a generator function, reads_generator, which I pass to to_fastq.

As a result, the memory usage is now small and stays constant during generation.
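
A minimal sketch of the idea, not the actual ISS code: the names reads_generator and to_fastq are taken from the description above, but their signatures, the random-sequence logic, and the placeholder quality string are assumptions made purely for illustration. The point is that the writer consumes the generator lazily, so only one read needs to be held in memory at a time.

```python
import io
import random


def reads_generator(n_reads, read_length=10, seed=0):
    """Yield FASTQ-like records one at a time instead of building a list.

    Hypothetical stand-in for the real read simulation: the signature and
    the random-sequence logic are assumptions for illustration only.
    """
    rng = random.Random(seed)
    for i in range(n_reads):
        sequence = "".join(rng.choice("ACGT") for _ in range(read_length))
        quality = "I" * read_length  # constant placeholder quality string
        yield f"read_{i}", sequence, quality


def to_fastq(reads, handle):
    """Write each record as soon as it is produced; return the count written."""
    count = 0
    for name, sequence, quality in reads:
        handle.write(f"@{name}\n{sequence}\n+\n{quality}\n")
        count += 1
    return count


# Usage: io.StringIO stands in for an open output file.
buffer = io.StringIO()
written = to_fastq(reads_generator(3), buffer)
```

Had the reads been collected with something like list(reads_generator(n)) before writing, memory would grow with the number of reads; passing the generator directly keeps it flat.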


@Tj-Idowu Tj-Idowu left a comment


This worked great. Thank you
