Options / suggestions for how to simulate nCats data? #200

Open
tfenne opened this issue Nov 29, 2023 · 1 comment


tfenne commented Nov 29, 2023

Hi - I'm trying to simulate data similar to that generated by the nCATS protocol.

Specifically, I would like to specify one or more small regions (on the order of 1-50 bp) where all reads should start, rather than having start positions distributed randomly throughout the genome.

I don't see any options to constrain read locations, so I think I'll have to:
i) Generate small FASTA files that begin where I want reads to start and extend for 100-200 kb
ii) Simulate a lot of reads from those files
iii) Filter the simulated reads down to those that start within the region I want

I'm guessing (ii) and (iii) will be rather slow, and I'm wondering if you have better suggestions for how to proceed? Thanks!
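For reference, step (i) could be sketched roughly as below. This is only an illustration using the Python standard library, not part of NanoSim; the helper names `read_fasta`, `extract_window`, and `format_fasta` are made up for this example, coordinates are assumed to be 0-based, and only a forward-strand window is handled.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first whitespace-delimited token as the contig name.
            name, chunks = line[1:].split()[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_window(lines: Iterable[str], contig: str, start: int, length: int) -> str:
    """Return up to `length` bases of `contig` beginning at 0-based `start`."""
    for name, seq in read_fasta(lines):
        if name == contig:
            return seq[start:start + length]
    raise KeyError(f"contig {contig!r} not found")


def format_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format one record as FASTA text, wrapped at `width` bases per line."""
    body = [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join([f">{name}"] + body) + "\n"
```

An open file handle can be passed as `lines`, and the result of `format_fasta` written to a new file that is then used as the reference for simulation. Reads simulated from such a window will still start at random offsets within it, so the filtering in step (iii) is still needed.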

SaberHQ (Member) commented Dec 5, 2023

Thank you @tfenne for using NanoSim.

NanoSim currently does not have such a feature. It would be interesting to explore adding it in a future release. However, I cannot promise that we will work on it, nor give an approximate timeframe.

In the meantime, I would suggest following the approach you described: generate a lot of reads and then filter them by location. NanoSim is fairly fast at generating reads, and you should be able to generate millions of reads within a day.
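A minimal sketch of the filtering step follows. It assumes each simulated read name has the layout `<ref>_<start>_<aligned|unaligned>_<index>_<F|R>_<head>_<middle>_<tail>`; that layout is an assumption here and should be checked against the headers in your own output before relying on it. The names `ReadOrigin`, `parse_read_name`, and `filter_by_start` are hypothetical helpers for illustration, not NanoSim functions.

```python
from typing import Iterable, Iterator, NamedTuple


class ReadOrigin(NamedTuple):
    ref: str     # reference (contig) name
    start: int   # start position as reported in the read name
    kind: str    # "aligned" or "unaligned" (assumed field)
    strand: str  # "F" or "R" (assumed field)


def parse_read_name(name: str) -> ReadOrigin:
    """Parse a read name under the ASSUMED 8-field, underscore-separated layout.

    Splitting from the right lets the reference name itself contain underscores.
    """
    token = name.lstrip(">@").split()[0]
    fields = token.rsplit("_", 7)
    if len(fields) != 8:
        raise ValueError(f"unexpected read name layout: {name!r}")
    ref, start, kind, _index, strand, _head, _middle, _tail = fields
    return ReadOrigin(ref, int(start), kind, strand)


def filter_by_start(names: Iterable[str], ref: str, lo: int, hi: int) -> Iterator[str]:
    """Yield read names whose reported start lies in the half-open range [lo, hi)."""
    for name in names:
        origin = parse_read_name(name)
        if origin.ref == ref and lo <= origin.start < hi:
            yield name
```

If the reads were simulated from a small FASTA window, the reported start is relative to that window, so a target region at its beginning would be something like `lo=0, hi=50`. The sketch compares the reported position only and ignores strand; whether reverse-strand reads need separate handling depends on how the position is defined in the output.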

I will keep you updated on this.
Thanks, Saber.
