RegistrationHash.GetMolLayers does not distinguish atropisomers #7367

Closed
ricrogz opened this issue Apr 17, 2024 · 1 comment · Fixed by #7426

ricrogz commented Apr 17, 2024

The registration hash should be able to distinguish between atropisomers, but that's currently not the case:

In [1]: from rdkit import Chem

In [2]: from rdkit.Chem import RegistrationHash

# This is one of the examples from the initial atropisomers PR
In [3]: m = Chem.MolFromMolFile('RP-6306_atrop1.sdf')

In [4]: b = m.GetBondWithIdx(3)

# The atropisomer was correctly identified
In [5]: b.GetStereo()
Out[5]: rdkit.Chem.rdchem.BondStereo.STEREOATROPCW

In [6]: hash1 = RegistrationHash.GetMolLayers(m, enable_tautomer_hash_v2=True)

# No atropisomer info in the hash: for SMILES, it's encoded in the CX extension, which is not part of the hash
In [7]: print(hash1)
{
<HashLayer.CANONICAL_SMILES: 1>: 'Cc1cc2c(C(N)=O)c(N)n(-c3c(C)ccc(O)c3C)c2nc1C', 
<HashLayer.ESCAPE: 2>: '', 
<HashLayer.FORMULA: 3>: 'C18H20N4O2', 
<HashLayer.NO_STEREO_SMILES: 4>: 'Cc1cc2c(C(N)=O)c(N)n(-c3c(C)ccc(O)c3C)c2nc1C', 
<HashLayer.NO_STEREO_TAUTOMER_HASH: 5>: '[C]:[C]1:[N]:[C]2:[C](:[C]:[C]:1-[CH3]):[C](:[C](:[N]):[O]):[C](:[N]):[N]:2:[C]1:[C](-[CH3]):[C]:[C]:[C](:[O]):[C]:1-[CH3]_11_0', 
<HashLayer.SGROUP_DATA: 6>: '[]', 
<HashLayer.TAUTOMER_HASH: 7>: '[C]:[C]1:[N]:[C]2:[C](:[C]:[C]:1-[CH3]):[C](:[C](:[N]):[O]):[C](:[N]):[N]:2:[C]1:[C](-[CH3]):[C]:[C]:[C](:[O]):[C]:1-[CH3]_11_0'
}

# Now, reverse the atropisomer
In [8]: b.SetStereo(Chem.BondStereo.STEREOATROPCCW)

In [9]: hash2 = RegistrationHash.GetMolLayers(m, enable_tautomer_hash_v2=True)

# No atropisomer info either
In [10]: print(hash2)
{
<HashLayer.CANONICAL_SMILES: 1>: 'Cc1cc2c(C(N)=O)c(N)n(-c3c(C)ccc(O)c3C)c2nc1C', 
<HashLayer.ESCAPE: 2>: '', 
<HashLayer.FORMULA: 3>: 'C18H20N4O2', 
<HashLayer.NO_STEREO_SMILES: 4>: 'Cc1cc2c(C(N)=O)c(N)n(-c3c(C)ccc(O)c3C)c2nc1C', 
<HashLayer.NO_STEREO_TAUTOMER_HASH: 5>: '[C]:[C]1:[N]:[C]2:[C](:[C]:[C]:1-[CH3]):[C](:[C](:[N]):[O]):[C](:[N]):[N]:2:[C]1:[C](-[CH3]):[C]:[C]:[C](:[O]):[C]:1-[CH3]_11_0', 
<HashLayer.SGROUP_DATA: 6>: '[]', 
<HashLayer.TAUTOMER_HASH: 7>: '[C]:[C]1:[N]:[C]2:[C](:[C]:[C]:1-[CH3]):[C](:[C](:[N]):[O]):[C](:[N]):[N]:2:[C]1:[C](-[CH3]):[C]:[C]:[C](:[O]):[C]:1-[CH3]_11_0'
}

# We'd want these to be different to deduplicate the atropisomers, but ...
In [11]: hash1 != hash2
Out[11]: False
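
When comparing two results like these, it can help to see which layers (if any) differ rather than only whether the dicts are equal. The following is a minimal sketch, not part of RDKit: `differing_layers` is a hypothetical helper, and `Layer` is a stand-in for `RegistrationHash.HashLayer` so that the example runs without RDKit installed. The layer values are copied from the output above.

```python
from enum import Enum


class Layer(Enum):
    """Stand-in for RegistrationHash.HashLayer (assumption: used only so this runs without RDKit)."""
    CANONICAL_SMILES = 1
    FORMULA = 3
    NO_STEREO_SMILES = 4


def differing_layers(layers_a, layers_b):
    """Return the sorted names of the layers whose values differ between two layer dicts."""
    keys = set(layers_a) | set(layers_b)
    return sorted(k.name for k in keys if layers_a.get(k) != layers_b.get(k))


# Values copied from the session above: both atropisomers produce identical layers.
smiles = 'Cc1cc2c(C(N)=O)c(N)n(-c3c(C)ccc(O)c3C)c2nc1C'
layers_cw = {
    Layer.CANONICAL_SMILES: smiles,
    Layer.FORMULA: 'C18H20N4O2',
    Layer.NO_STEREO_SMILES: smiles,
}
layers_ccw = dict(layers_cw)

# An empty list means no layer tells the two atropisomers apart.
print(differing_layers(layers_cw, layers_ccw))
```

Since the keys returned by `GetMolLayers` are enum members (as the printed output shows), the same helper should in principle accept `hash1` and `hash2` from the session directly. Once atropisomers are distinguished, one would expect at least the stereo-aware layers to show up in the result.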

ricrogz added the bug label Apr 17, 2024

ricrogz commented Apr 19, 2024

The fix for this is trivial once #7301 is merged; we'll have to wait for it.

ricrogz self-assigned this Apr 22, 2024
ricrogz mentioned this issue May 8, 2024
greglandrum added this to the 2024_03_3 milestone May 16, 2024
greglandrum pushed a commit that referenced this issue May 16, 2024:
* make wedgeBondsFromAtropisomers get symm SSSR if not present

* add atropisomer wedging ring info test

* add a test

* update test