Pre-condition violations for BCUT descriptor calculations #7364

Open
paulsonak opened this issue Apr 17, 2024 · 0 comments

paulsonak commented Apr 17, 2024

Describe the bug
I get pre-condition violations for every molecule I've tested when calculating BCUT descriptors. Our standardization code adds explicit hydrogens to each molecule first. The violation is printed in my notebook several times per molecule, and sometimes more than once per descriptor, which produces thousands of lines of output and eventually crashes the notebook. The descriptors that trigger it are:

('BCUT2D_MWHI',
 'BCUT2D_MWLOW',
 'BCUT2D_CHGHI',
 'BCUT2D_CHGLO',
 'BCUT2D_LOGPHI',
 'BCUT2D_LOGPLOW',
 'BCUT2D_MRHI',
 'BCUT2D_MRLOW')

And the violation repeatedly reads:

****
Pre-condition Violation
bad result vector size
Violation occurred on line 40 in file /project/build/temp.linux-aarch64-cpython-39/rdkit/Code/GraphMol/Descriptors/Crippen.cpp
Failed Expression: logpContribs.size() == mol.getNumAtoms() && mrContribs.size() == mol.getNumAtoms()
----------
Stacktrace:
----------
****

It also seems to depend on whether AddHs is called first and/or whether the descriptors are passed as an iterable rather than as a single descriptor, although I don't know why that's the case.
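
Since only the eight `BCUT2D_*` descriptors listed above trigger the violation, one possible stopgap is to leave them out when building the calculator. The following is a minimal sketch, not a fix: `drop_bcut` and `make_calculator_without_bcut` are hypothetical helper names, and the sketch assumes that every affected descriptor shares the `BCUT2D_` prefix, as the list in this report suggests.

```python
BCUT_PREFIX = "BCUT2D_"  # assumption: all affected descriptors share this prefix


def drop_bcut(names):
    """Return the descriptor names that do not start with the BCUT2D_ prefix."""
    return [name for name in names if not name.startswith(BCUT_PREFIX)]


def make_calculator_without_bcut():
    """Build a descriptor calculator that skips the BCUT2D descriptors."""
    # Imported inside the function so the name filter above works without RDKit.
    from rdkit.Chem import Descriptors
    from rdkit.ML.Descriptors import MoleculeDescriptors

    names = drop_bcut([name for name, _ in Descriptors._descList])
    return MoleculeDescriptors.MolecularDescriptorCalculator(names)
```

Calling `make_calculator_without_bcut().CalcDescriptors(mol)` would then return every other descriptor. This only avoids the failing code path; the BCUT values themselves remain unavailable.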

To Reproduce

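from rdkit import Chem
from rdkit.Chem import Descriptors
from rdkit.ML.Descriptors import MoleculeDescriptors

smiles = 'C#CCC(C(=O)c1ccc(C)cc1)N1CCCC1'

# Parse the molecule and add explicit hydrogens, as our standardization code does
mol = Chem.MolFromSmiles(smiles)
mol = Chem.AddHs(mol)

# Build a calculator over every descriptor name in Descriptors._descList
calc = MoleculeDescriptors.MolecularDescriptorCalculator([x[0] for x in Descriptors._descList])
desc_names = calc.GetDescriptorNames()

# Rebuild the calculator with one of the affected BCUT descriptors and compute it
calc = MoleculeDescriptors.MolecularDescriptorCalculator(['BCUT2D_MWHI'])
descriptors = calc.CalcDescriptors(mol)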
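
To narrow down whether AddHs is the trigger, the affected descriptors can be called one at a time on the same molecule with and without explicit hydrogens. The following is a rough diagnostic sketch: `probe` is a hypothetical helper, it looks the functions up through `Descriptors._descList` as the reproduction does, and it assumes the violation surfaces in Python as an ordinary exception.

```python
BCUT_NAMES = (
    "BCUT2D_MWHI", "BCUT2D_MWLOW", "BCUT2D_CHGHI", "BCUT2D_CHGLO",
    "BCUT2D_LOGPHI", "BCUT2D_LOGPLOW", "BCUT2D_MRHI", "BCUT2D_MRLOW",
)


def probe(mol, names=BCUT_NAMES, functions=None):
    """Call each named descriptor on mol; return {name: value or exception}."""
    if functions is None:
        # Default lookup table: descriptor name -> function, taken from RDKit.
        from rdkit.Chem import Descriptors
        functions = dict(Descriptors._descList)
    results = {}
    for name in names:
        try:
            results[name] = functions[name](mol)
        except Exception as exc:  # keep the exception so failures can be compared
            results[name] = exc
    return results


# Example (requires RDKit):
#   from rdkit import Chem
#   mol = Chem.MolFromSmiles('C#CCC(C(=O)c1ccc(C)cc1)N1CCCC1')
#   print(probe(mol))              # without explicit hydrogens
#   print(probe(Chem.AddHs(mol)))  # with explicit hydrogens
```

Comparing the two result dictionaries shows which descriptors fail only once hydrogens have been added.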

Expected behavior
I'd like the warnings not to appear and not to crash my notebook.
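
Until the underlying cause is resolved, the flood of output can probably be silenced with RDKit's logger controls. The following is a minimal sketch that assumes the violation text is written to the `rdApp.error` log channel, which has not been confirmed in this report. `quiet_rdkit` is a hypothetical helper, and its `logger` parameter exists only so that the helper can be exercised without RDKit.

```python
from contextlib import contextmanager


@contextmanager
def quiet_rdkit(channel="rdApp.error", logger=None):
    """Disable an RDKit log channel inside a with-block, then re-enable it."""
    if logger is None:
        # Default to RDKit's own logger controls.
        from rdkit import RDLogger as logger
    logger.DisableLog(channel)
    try:
        yield
    finally:
        # Note: this re-enables the channel even if it was disabled beforehand.
        logger.EnableLog(channel)


# Example (requires RDKit):
#   with quiet_rdkit():
#       descriptors = calc.CalcDescriptors(mol)
```

This hides the messages only. The descriptor call still fails internally, so any BCUT values returned under it should not be trusted.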

Configuration (please complete the following information):

  • RDKit version: 2023.9.5 (it happens with 2023.9.X but not with 2023.3.3)
  • OS: Ubuntu 23.04 (Lunar) ARM64, running in a VM on an Apple M2 Max
  • Python version (if relevant):
    Python 3.9.17 (main, Aug 21 2023, 10:36:40)
    [GCC 12.3.0] on linux
  • Are you using conda? No
  • If you are using conda, which channel did you install the rdkit from? NA
  • If you are not using conda: how did you install the RDKit? pip install rdkit==2023.9.5

Additional context
I believe it is NOT ARM-specific since my colleagues see the same issue on regular Linux/Ubuntu with the 2023.9.X versions.

@paulsonak paulsonak added the bug label Apr 17, 2024