Describe the bug
When we canonicalize atropisomers, some atropisomers lose the bond stereo, while others do not. Most do not lose it. We've identified some minimal examples where they do lose it.
To Reproduce
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

enumerator = rdMolStandardize.TautomerEnumerator()
for mol in Chem.SDMolSupplier("min.sdf"):
    # Distinct bond-stereo values before canonicalization
    pre_canon = set(bond.GetStereo() for bond in mol.GetBonds())
    canonical = enumerator.Canonicalize(mol)
    # Distinct bond-stereo values after canonicalization
    post_canon = set(bond.GetStereo() for bond in canonical.GetBonds())
    # A smaller second number means a stereo value was lost
    print(len(pre_canon), len(post_canon))
Expected behavior
The atropisomer bond stereo should not be lost during canonicalization.
Screenshots
min.sdf
ChemDraw03222416162D
0 0 0 0 0 0 V3000
M V30 BEGIN CTAB
M V30 COUNTS 18 19 0 0 0
M V30 BEGIN ATOM
M V30 1 C -2.492152 -0.616602 0.000000 0
M V30 2 C -2.492152 -1.438737 0.000000 0
M V30 3 C -1.779687 -1.849805 0.000000 0
M V30 4 C -1.067987 -1.438737 0.000000 0
M V30 5 C -1.067987 -0.616602 0.000000 0
M V30 6 C -1.779723 -0.205597 0.000000 0
M V30 7 N -1.779723 0.616539 0.000000 0
M V30 8 C -0.355996 0.616602 0.000000 0
M V30 9 C -0.355996 -0.205534 0.000000 0
M V30 10 C 0.356352 -0.616808 0.000000 0
M V30 11 C 1.068171 -0.205534 0.000000 0
M V30 12 C 1.068171 0.616602 0.000000 0
M V30 13 C 0.356471 1.027671 0.000000 0
M V30 14 C 1.780162 1.027670 0.000000 0
M V30 15 C 1.780162 1.849805 0.000000 0
M V30 16 Cl -1.067987 1.027670 0.000000 0
M V30 17 F 0.356352 -1.438945 0.000000 0
M V30 18 O 2.492152 0.616602 0.000000 0
M V30 END ATOM
M V30 BEGIN BOND
M V30 1 2 1 2
M V30 2 1 2 3 CFG=3
M V30 3 2 3 4
M V30 4 1 5 4 CFG=3
M V30 5 2 5 6
M V30 6 1 6 1
M V30 7 1 6 7
M V30 8 2 8 9
M V30 9 1 9 10
M V30 10 2 10 11
M V30 11 1 11 12
M V30 12 2 12 13
M V30 13 1 13 8
M V30 14 1 12 14
M V30 15 1 14 15
M V30 16 1 8 16
M V30 17 1 10 17
M V30 18 2 14 18
M V30 19 1 5 9
M V30 END BOND
M V30 END CTAB
M END
$$$$
min.sdf
ChemDraw03222416162D
0 0 0 0 0 0 V3000
M V30 BEGIN CTAB
M V30 COUNTS 18 19 0 0 0
M V30 BEGIN ATOM
M V30 1 C -2.492100 -0.616993 0.000000 0
M V30 2 C -2.492100 -1.439099 0.000000 0
M V30 3 C -1.779659 -1.849384 0.000000 0
M V30 4 C -1.067986 -1.439099 0.000000 0
M V30 5 C -1.067986 -0.616993 0.000000 0
M V30 6 C -1.779861 -0.205526 0.000000 0
M V30 7 N -1.779861 0.616610 0.000000 0
M V30 8 C -0.355996 0.616181 0.000000 0
M V30 9 C -0.355996 -0.205926 0.000000 0
M V30 10 C 0.356657 -0.617376 0.000000 0
M V30 11 C 1.068119 -0.205926 0.000000 0
M V30 12 C 1.068119 0.616181 0.000000 0
M V30 13 C 0.356444 1.026468 0.000000 0
M V30 14 C 1.780109 1.027249 0.000000 0
M V30 15 C 1.780109 1.849384 0.000000 0
M V30 16 Cl -1.067986 1.027249 0.000000 0
M V30 17 F 0.356657 -1.439512 0.000000 0
M V30 18 O 2.492100 0.616181 0.000000 0
M V30 END ATOM
M V30 BEGIN BOND
M V30 1 2 1 2
M V30 2 1 2 3 CFG=1
M V30 3 2 3 4
M V30 4 1 5 4 CFG=1
M V30 5 2 5 6
M V30 6 1 6 1
M V30 7 1 6 7
M V30 8 2 8 9
M V30 9 1 9 10
M V30 10 2 10 11
M V30 11 1 11 12
M V30 12 2 12 13
M V30 13 1 13 8
M V30 14 1 12 14
M V30 15 1 14 15
M V30 16 1 8 16
M V30 17 1 10 17
M V30 18 2 14 18
M V30 19 1 5 9
M V30 END BOND
M V30 END CTAB
M END
$$$$
min.sdf
ChemDraw03222416162D
0 0 0 0 0 0 V3000
M V30 BEGIN CTAB
M V30 COUNTS 17 18 0 0 0
M V30 BEGIN ATOM
M V30 1 C -2.136157 -0.616602 0.000000 0
M V30 2 C -2.136157 -1.438737 0.000000 0
M V30 3 C -1.423691 -1.849805 0.000000 0
M V30 4 C -0.711991 -1.438737 0.000000 0
M V30 5 C -0.711991 -0.616602 0.000000 0
M V30 6 C -1.423727 -0.205597 0.000000 0
M V30 7 N -1.423727 0.616538 0.000000 0
M V30 8 C -0.000001 0.616602 0.000000 0
M V30 9 C -0.000001 -0.205534 0.000000 0
M V30 10 C 0.712348 -0.616809 0.000000 0
M V30 11 C 1.424167 -0.205534 0.000000 0
M V30 12 C 1.424167 0.616602 0.000000 0
M V30 13 C 0.712467 1.027670 0.000000 0
M V30 14 C 2.136157 1.027669 0.000000 0
M V30 15 C 2.136157 1.849805 0.000000 0
M V30 16 Cl -0.711991 1.027669 0.000000 0
M V30 17 F 0.712348 -1.438945 0.000000 0
M V30 END ATOM
M V30 BEGIN BOND
M V30 1 2 1 2
M V30 2 1 2 3 CFG=3
M V30 3 2 3 4
M V30 4 1 5 4 CFG=3
M V30 5 2 5 6
M V30 6 1 6 1
M V30 7 1 6 7
M V30 8 2 8 9
M V30 9 1 9 10
M V30 10 2 10 11
M V30 11 1 11 12
M V30 12 2 12 13
M V30 13 1 13 8
M V30 14 1 12 14
M V30 15 1 14 15
M V30 16 1 8 16
M V30 17 1 10 17
M V30 18 1 5 9
M V30 END BOND
M V30 END CTAB
M END
$$$$
min.sdf
ChemDraw03222416162D
0 0 0 0 0 0 V3000
M V30 BEGIN CTAB
M V30 COUNTS 17 18 0 0 0
M V30 BEGIN ATOM
M V30 1 C -2.136104 -0.616993 0.000000 0
M V30 2 C -2.136104 -1.439097 0.000000 0
M V30 3 C -1.423663 -1.849384 0.000000 0
M V30 4 C -0.711992 -1.439097 0.000000 0
M V30 5 C -0.711992 -0.616993 0.000000 0
M V30 6 C -1.423865 -0.205526 0.000000 0
M V30 7 N -1.423865 0.616610 0.000000 0
M V30 8 C -0.000001 0.616180 0.000000 0
M V30 9 C -0.000001 -0.205926 0.000000 0
M V30 10 C 0.712652 -0.617376 0.000000 0
M V30 11 C 1.424113 -0.205926 0.000000 0
M V30 12 C 1.424113 0.616180 0.000000 0
M V30 13 C 0.712440 1.026468 0.000000 0
M V30 14 C 2.136104 1.027248 0.000000 0
M V30 15 C 2.136104 1.849384 0.000000 0
M V30 16 Cl -0.711992 1.027248 0.000000 0
M V30 17 F 0.712652 -1.439512 0.000000 0
M V30 END ATOM
M V30 BEGIN BOND
M V30 1 2 1 2
M V30 2 1 2 3 CFG=1
M V30 3 2 3 4
M V30 4 1 5 4 CFG=1
M V30 5 2 5 6
M V30 6 1 6 1
M V30 7 1 6 7
M V30 8 2 8 9
M V30 9 1 9 10
M V30 10 2 10 11
M V30 11 1 11 12
M V30 12 2 12 13
M V30 13 1 13 8
M V30 14 1 12 14
M V30 15 1 14 15
M V30 16 1 8 16
M V30 17 1 10 17
M V30 18 1 5 9
M V30 END BOND
M V30 END CTAB
M END
$$$$
Configuration (please complete the following information):
RDKit version: 2024.3.1b1
OS: Ubuntu 20.04
Python version (if relevant): 3.11 (not sure if relevant)
Are you using conda? no
If you are using conda, which channel did you install the rdkit from? n/a
Additional context
I have other possible examples where the stereo is not identified; this is a minimized example that hopefully shows the same problem as my other cases.
Hi @pechersky,
Thanks for testing out the beta of the new release!
Since the atropisomer support is new, this is new territory for us, but I believe that what happens in this case is correct.
Here are the tautomers which result from enumerating your first molecule:
In three of those tautomers, the atropisomeric bond has been converted to a double bond, which will scramble the atropisomerism.
The second molecule only has one tautomer (the starting structure), so this scrambling doesn't happen.
One can argue about the rules for enumerating tautomers (and people do!), but given the rules which are being used, I think this is the correct behavior. Seem reasonable?
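A minimal sketch of how one might check this for a given molecule, not the code used to produce the result above. It assumes that atom indices are preserved across enumerated tautomers and that `str()` of an RDKit bond type yields its name (for example `SINGLE`). The helper name `tautomers_breaking_axis` is hypothetical. In the first record of min.sdf the biaryl bond is bond 19, between atoms 5 and 9, which corresponds to 0-based indices 4 and 8.

```python
def tautomers_breaking_axis(tautomers, begin_idx, end_idx):
    """Return the tautomers in which the given bond is no longer a single bond.

    `tautomers` is any iterable of molecule-like objects that provide
    GetBondBetweenAtoms(i, j), e.g. the result of
    TautomerEnumerator.Enumerate(mol).
    """
    broken = []
    for taut in tautomers:
        bond = taut.GetBondBetweenAtoms(begin_idx, end_idx)
        # Assumption: str(bond.GetBondType()) is the enum name, e.g. "SINGLE".
        if bond is not None and str(bond.GetBondType()) != "SINGLE":
            broken.append(taut)
    return broken


# Usage with RDKit (assumes rdkit is installed and `mol` is already loaded):
#   from rdkit.Chem.MolStandardize import rdMolStandardize
#   enumerator = rdMolStandardize.TautomerEnumerator()
#   broken = tautomers_breaking_axis(enumerator.Enumerate(mol), 4, 8)
#   print(len(broken), "tautomers change the order of the biaryl bond")
```

If the returned list is non-empty, at least one tautomer has changed the bond order of the axis, which is the situation described above.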
Thanks for the quick response. I completely agree that, given the current rules for enumerating tautomers, the atropisomerism will be lost when the bond is converted to a double bond. In our use case, we can choose to record the atropisomerism before canonicalization, so this isn't a blocker.
However, this behavior seems unexpected in two ways. First, when tautomerization during canonicalization touches E/Z-defined double bonds, SetRemoveBondStereo(False) retains E/Z in the canonicomer; here, toggling the flag has no effect. Second, and this is where it gets tricky, I think that in atropisomeric systems the orbitals are not conjugated, so shifting the bond orders around would not occur readily. I understand why this shouldn't be a factor in applying the existing tautomerization rules during canonicalization: it would mean that either all biaryls need to be assessed first for potential atropisomerism (if drawn with a flat biaryl bond) or that stereoindicated biaryls get treated differently. Once again, in our use case, we will bypass the issue by capturing stereo prior to canonicalization, and we hope not to have tautomeric versions of the same atropisomers around.
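A rough sketch of the "capture stereo before canonicalization" workaround, under stated assumptions: that atom indices are unchanged by `Canonicalize`, and that the atropisomer stereo values print with names containing `ATROP` (as in `STEREOATROPCW` and `STEREOATROPCCW`). The helper names `atrop_bonds` and `lost_atrop_bonds` are hypothetical and not part of RDKit.

```python
def atrop_bonds(mol):
    """Map (low_idx, high_idx) -> stereo label for bonds flagged as atropisomeric."""
    found = {}
    for bond in mol.GetBonds():
        label = str(bond.GetStereo())
        # Assumption: atropisomer stereo values have "ATROP" in their name.
        if "ATROP" in label:
            key = tuple(sorted((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx())))
            found[key] = label
    return found


def lost_atrop_bonds(before_mol, after_mol):
    """Return the atropisomeric bonds of before_mol that are not flagged in after_mol."""
    before = atrop_bonds(before_mol)
    after = atrop_bonds(after_mol)
    return {key: label for key, label in before.items() if key not in after}


# Usage with RDKit (assumes rdkit is installed and `mol` is already loaded):
#   from rdkit.Chem.MolStandardize import rdMolStandardize
#   enumerator = rdMolStandardize.TautomerEnumerator()
#   lost = lost_atrop_bonds(mol, enumerator.Canonicalize(mol))
```

Comparing per-bond records rather than the size of the set of stereo values also shows which bond lost its label, not just that one did.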
For a separate, upcoming issue, we have identified that atropisomerism isn't recognized in biaryl macrocyclic systems -- we're still minimizing the test cases. Thank you again for your quick response!