distinct minimizer count integer overflow with large reference fasta #1192

Open
cjprybol opened this issue Apr 14, 2024 · 0 comments

Hi @lh3,

I'm using slices of NCBI's NT database for PacBio HiFi read mapping. I noticed that the distinct minimizer count reported on stderr appears to have overflowed its integer type. Would you expect this to affect the trustworthiness of the mapping results, or is it just a cosmetic issue in the stderr reporting?

A prior mapping run against nt_others, where the minimizer count is positive and the singleton percentage is below 100:

[M::mm_idx_stat::128.816*2.87] distinct minimizers: 232724554 (61.91% are singletons); average occurrences: 2.649; average spacing: 10.138; total length: 6250598365

Mapping against nt_prok, where the minimizer count, singleton percentage, and average occurrences are all negative:

[M::mm_idx_stat::4819.440*3.47] distinct minimizers: -814309831 (-245.31% are singletons); average occurrences: -30.207; average spacing: 10.032; total length: 246767835107
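
As a rough sanity check on the nt_prok numbers, here is a minimal sketch of what the count might be if it is held in a 32-bit signed integer that wrapped around exactly once. Both of those points are assumptions on my part; I have not confirmed them against the minimap2 source. The helper `unwrap_int32` is hypothetical, and the total minimizer count is only estimated from the reported total length and average spacing.

```python
# Assumption: the "distinct minimizers" value was stored in a 32-bit signed
# integer that wrapped around exactly once. Not verified against the source.
reported = -814309831          # "distinct minimizers" from the nt_prok log line
total_length = 246767835107    # "total length" from the same log line
avg_spacing = 10.032           # "average spacing" from the same log line


def unwrap_int32(value: int, wraps: int = 1) -> int:
    """Undo `wraps` wraparounds of a 32-bit two's-complement counter."""
    return value + wraps * 2**32


recovered = unwrap_int32(reported)

# Estimate of the total (non-distinct) minimizer count, derived from the
# reported total length and average spacing; approximate due to rounding.
approx_total_minimizers = total_length / avg_spacing

print("recovered distinct minimizers:", recovered)  # 3480657465
print("avg occurrences with reported count:", approx_total_minimizers / reported)
print("avg occurrences with recovered count:", approx_total_minimizers / recovered)
```

Dividing the estimated total by the negative count gives roughly the -30.207 shown in the log, which suggests that only the distinct count wrapped and that the other totals are tracked in a wider type. That is an inference from the numbers, not something I have checked in the code.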

I believe I am using the latest release: [M::main] Version: 2.28-r1209

Thank you,
Cameron
