Obtaining diversity estimate #38

Open
snayfach opened this issue Aug 28, 2019 · 3 comments

@snayfach

Thanks for the great documentation and software design. Installation and running were super easy.

One question - how do I obtain the Nonpareil sequence diversity estimate? I may have missed it, but I couldn't find this info in the documentation. I assume I get this from the 'diversity' slot in the Nonpareil object after running the Nonpareil.curve function. Is that correct?

@lmrodriguezr
Owner

lmrodriguezr commented Aug 28, 2019

Hello @snayfach
Thanks! I'm glad to hear the docs/interfaces were clear 😃

Yes, you're correct. The diversity estimate is stored in the diversity slot of the Nonpareil.Curve object. You can access it directly with $diversity, or you can see it along with the rest of the estimates using summary(np).
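
A minimal sketch of both access routes, assuming a hypothetical path to a .npo file produced by the nonpareil command-line tool. The `plot = FALSE` argument is used here on the assumption that it suppresses the plot while still fitting the curve, so check it against your installed version:

```r
library(Nonpareil)

# Hypothetical path: replace with your own .npo file
npo_file <- "/path/to/output.npo"

# Fit the curve; plot = FALSE is assumed to skip drawing the plot
np <- Nonpareil.curve(npo_file, plot = FALSE)

# Diversity estimate on its own
np$diversity

# Diversity alongside the rest of the estimates for this curve
summary(np)
```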

[I'm gonna leave this comment open until I update the documentation to include this, please feel free to add any comments]

@lmrodriguezr lmrodriguezr self-assigned this Aug 28, 2019
@snayfach
Author

Thanks! I'd suggest adding this info to the docs for those who are impatient and using the tool just for this value :)

@handibles

Thanks to the Devs for the great package.

I would second the above - a discussion / mention of the diversity metric in the docs would be extremely helpful, as it's not clear if this is even available in the current version (in my case, it is the aspect I am most interested in). As above, the Nd value is available in R via:

```r
library(Nonpareil)  # package name is case-sensitive
samp <- '/path/to/output.npo'
Nonpareil.curve(samp)$diversity
```

My understanding of the theory is, at best, partial. I presume it is not possible to estimate Nd without estimating coverage at all depths, but the NP3.0 paper implies that estimating coverage is not important for estimating diversity:

"Since the shapes of the Nonpareil curves from replicates and subsamples closely resemble each other regardless of coverage (3), we propose Nd as a coverage-independent measurement of the diversity of the sampled community."

If so, could the diversity estimate of kmers then be a separate function (R/C++)?
