Input from stdin #29

Open
koopkaup opened this issue Jan 26, 2018 · 2 comments

Comments

@koopkaup

Is it possible to read the input file from standard input?
For example, all my data is compressed, and it would be more convenient to pipe the gunzip output straight into nonpareil's stdin.

@lmrodriguezr
Owner

lmrodriguezr commented Jan 26, 2018

Hello @koopkaup
Unfortunately, that would require major changes in the code, because the input files are read multiple times:

  • For -T kmer or -T alignment on one machine: There is an initial pass over the file to sample query reads and count total reads, and a second pass to run the comparisons.
  • For -T alignment with MPI: Each machine makes its own pass over the file, rather than having the data sent directly to the worker nodes, to reduce bandwidth use.
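
To illustrate why a two-pass reader cannot consume a pipe directly, here is a minimal Python sketch (not Nonpareil's actual code). It assumes FASTA input, and the helper names `count_reads`, `iter_reads`, and `decompress_to_temp` are hypothetical. A stream on stdin can only be read once, so the sketch first materializes the gzip data as a regular file that can be reopened for each pass.

```python
import gzip
import os
import shutil
import tempfile


def decompress_to_temp(gz_path):
    """Write the decompressed contents of gz_path to a temporary file.

    Hypothetical helper: the result is a regular file that can be
    opened as many times as needed. The caller should delete it.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".fa")
    with gzip.open(gz_path, "rb") as src, os.fdopen(fd, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return tmp_path


def count_reads(path):
    """First pass: count FASTA records (lines starting with '>')."""
    with open(path) as fh:
        return sum(1 for line in fh if line.startswith(">"))


def iter_reads(path):
    """Second pass: yield (header, sequence) pairs from a FASTA file."""
    header, seq = None, []
    with open(path) as fh:
        for line in fh:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(seq)
                header, seq = line[1:], []
            elif header is not None:
                seq.append(line)
    if header is not None:
        yield header, "".join(seq)
```

Each function opens the path itself, which is exactly what a pipe does not allow: once the first pass has drained stdin, there is nothing left for the second pass to read.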

However, I think we could implement an option to read directly from compressed files (gzip / bzip2). What do you think, @gunturus?

M

@gunturus
Collaborator

For random sampling, we randomly move to a position in the file. So, this will require us to have the file uncompressed to begin with.
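
As a rough illustration of the seek-based sampling described above, here is a Python sketch. It is an assumption about the general technique, not Nonpareil's implementation, and `sample_read_at_random_offset` is a hypothetical name. It jumps to a random byte offset in an uncompressed FASTA file and returns the next complete record.

```python
import os
import random


def sample_read_at_random_offset(path, rng=random):
    """Return (header, sequence) for the record following a random offset.

    Sketch only: records that follow long records are picked more often,
    so this is not a uniform sample over reads.
    """
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("empty file")
    with open(path, "rb") as fh:
        fh.seek(rng.randrange(size))  # needs a seekable, uncompressed file
        fh.readline()                 # discard the partial line we landed in
        wrapped = False
        while True:
            line = fh.readline()
            if not line:              # end of file: wrap around once
                if wrapped:
                    raise ValueError("no FASTA header found")
                fh.seek(0)
                wrapped = True
                continue
            if line.startswith(b">"):
                break
        header = line[1:].strip().decode()
        seq = []
        for line in fh:               # collect lines until the next header
            if line.startswith(b">"):
                break
            seq.append(line.strip().decode())
        return header, "".join(seq)
```

The `fh.seek` call is the step that rules out stdin and plain gzip streams: a pipe cannot seek at all, and reaching an arbitrary offset in a gzip stream means decompressing everything before it.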
