Errors in the encoding of some bits #70

Open
salvogalati opened this issue Nov 9, 2022 · 1 comment

Comments

@salvogalati

Hi,
I was checking the code behind the PubChem fingerprint generation.
I compared fingerprints calculated with your code against those calculated with PyFingerprint, which uses the CDK library, and noticed some differences.
First, for bits in the range 0-98, SMARTS patterns are not used. When carbons are counted, for example, only aliphatic carbons are considered, because the corresponding key is C.
As a result, the count, and therefore the encoding, is incorrect for molecules that contain aromatic carbons.
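To illustrate the first point, here is a minimal sketch in plain Python. It is not the code of this repository or of CDK: `count_carbons` and `ATOM_TOKEN` are hypothetical names, and the tokenizer is a toy that only handles common SMILES atoms. It only shows how matching the literal symbol `C` misses the aromatic carbons, which SMILES writes as lowercase `c`.

```python
import re

# Toy SMILES atom tokenizer (an assumption for illustration, not a full parser):
# bracket atoms first, then two-letter halogens, then single-letter atoms.
ATOM_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")


def count_carbons(smiles: str, include_aromatic: bool = True) -> int:
    """Count carbon atoms in a SMILES string.

    With include_aromatic=False only the uppercase symbol "C" is matched,
    which mimics a lookup keyed on "C" alone.
    """
    count = 0
    for token in ATOM_TOKEN.findall(smiles):
        if token.startswith("["):
            # Skip an optional isotope number, then read the element symbol.
            match = re.match(r"\[\d*([A-Z][a-z]?|[a-z]{1,2})", token)
            symbol = match.group(1) if match else ""
        else:
            symbol = token
        if symbol == "C" or (include_aromatic and symbol == "c"):
            count += 1
    return count


# Toluene, written with an aromatic ring: one aliphatic and six aromatic carbons.
print(count_carbons("Cc1ccccc1"))                          # 7
print(count_carbons("Cc1ccccc1", include_aromatic=False))  # 1
```

A count-based bit such as ">= 2 C" would then be set or unset depending on which of the two counts is used.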
The second point concerns the bits in the range 115-231. Here two conditions have to be met at once: for example, bits 116 and 117 are defined as ">= 1 saturated or aromatic carbon-only ring size 3" and ">= 1 saturated or aromatic nitrogen-containing ring size 3", respectively. A cyclopropane ring should therefore set bit 116 but not bit 117. With your code, however, both bits are set.
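The behaviour I would expect for the second point can be sketched as follows. This is only an illustration under my own assumptions: the `Ring` class and the two predicate functions are hypothetical and do not come from this repository. Ring perception and aromaticity detection are assumed to have been done already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ring:
    """Hypothetical ring description: element symbols plus bond character."""
    atoms: tuple
    saturated: bool
    aromatic: bool


def carbon_only_ring(ring: Ring, size: int) -> bool:
    # Both conditions must hold: the ring class (saturated or aromatic, of the
    # given size) AND the composition (every ring atom is carbon).
    return (len(ring.atoms) == size
            and (ring.saturated or ring.aromatic)
            and all(atom == "C" for atom in ring.atoms))


def nitrogen_containing_ring(ring: Ring, size: int) -> bool:
    # Same ring class, but the composition test asks for at least one nitrogen.
    return (len(ring.atoms) == size
            and (ring.saturated or ring.aromatic)
            and "N" in ring.atoms)


cyclopropane = Ring(atoms=("C", "C", "C"), saturated=True, aromatic=False)
aziridine = Ring(atoms=("C", "C", "N"), saturated=True, aromatic=False)

print(carbon_only_ring(cyclopropane, 3))          # True  (bit 116 expected on)
print(nitrogen_containing_ring(cyclopropane, 3))  # False (bit 117 expected off)
print(nitrogen_containing_ring(aziridine, 3))     # True
```

Checking only the ring class and size, without the composition test, would set both bits for cyclopropane, which matches what I observe.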

I hope the bugs I reported can be corrected; otherwise, I would be glad to have an explanation of my mistake.

Thank you for your help,
Salvatore

@nbehrnd

nbehrnd commented Nov 9, 2022 via email
