Replies: 4 comments
-
Hi @peastman, that's an interesting use case that I don't think we necessarily planned for. It is an interesting one though and it would be nice if it actually worked. The use case we envisioned is that you start with a molecule that has 3D coordinates, and then call either I will fix the bug you reported w.r.t. the handling of unspecified bonds (#7299), but in the meantime, I think you can probably make your code work by always setting bonds to SINGLE (you do this with the optional argument to
|
Beta Was this translation helpful? Give feedback.
-
Thanks! I'd also tried the pathway you suggest of loading coordinates and letting it identify the bonds automatically. The problem I had with that is that it isn't reliable. It sometimes misidentifies which atoms are bonded to each other. Since I already know exactly which ones are bonded, I hoped this would be more robust by eliminating a potential point of failure. By the way, this is also related to #6501, which is another manifestation of the same problem: the bonds are already known, but it ignores the known bonds, tries to identify them based on coordinates, and sometimes gets them wrong. |
Beta Was this translation helpful? Give feedback.
-
I wrote the following routine to test it out. It takes a SMILES string and creates a Mol from it. It then creates a second Mol by copying over just the atoms and bonds and calls def compare_molecules(smiles):
original_mol = Chem.MolFromSmiles(smiles)
original_mol = Chem.AddHs(original_mol)
new_mol = Chem.EditableMol(Chem.Mol())
for atom in original_mol.GetAtoms():
a = Chem.Atom(atom.GetAtomicNum())
a.SetNoImplicit(True)
new_mol.AddAtom(a)
for bond in original_mol.GetBonds():
new_mol.AddBond(bond.GetBeginAtomIdx(), bond.GetEndAtomIdx(), Chem.BondType.SINGLE)
new_mol = new_mol.GetMol()
charge = sum(atom.GetFormalCharge() for atom in original_mol.GetAtoms())
rdDetermineBonds.DetermineBondOrders(new_mol, charge=charge, embedChiral=False)
original_smiles = Chem.MolToSmiles(original_mol)
new_smiles = Chem.MolToSmiles(new_mol)
if original_smiles != new_smiles:
print('Error:', smiles)
print(original_smiles)
print(new_smiles)
return False
return True I tested it on thousands of SMILES strings drawn from PubChem, and over 90% of them give a perfect match, so that's encouraging. Even when they don't match, it isn't clear to me that the result is wrong. I'm not a chemist, but in a lot of cases the result of There are some clear problems though. I found a couple of cases where it just hangs. It seems to be going into an infinite loop or something. Try this: compare_molecules('[N-]=[N+]=NCC1OC(OS(=O)(=O)[O-])C(OS(=O)(=O)[O-])C(OS(=O)(=O)[O-])C1OS(=O)(=O)[O-]') It hangs and I have to force quit the process. There also are some cases where the results look very questionable to me. For example, try this: compare_molecules('[NH-]c1n[n+](-c2ccc(Cl)c(Cl)c2)no1') It produces the output
It puts the positive charge onto the oxygen rather than the nitrogen, which seems unlikely to be correct. |
Beta Was this translation helpful? Give feedback.
-
I'm also finding a lot of cases where it fails because it makes invalid assumptions about the possible valence of different elements.
|
Beta Was this translation helpful? Give feedback.
-
I'm trying to use
DetermineBondOrders()
, but I can't figure out how to make it work correctly. The following script builds a methane molecule and tries to determine the bond orders.It fails to determine them. Here is the output.
Can anyone tell me what I'm doing wrong?
Thanks!
Beta Was this translation helpful? Give feedback.
All reactions