Keto-enol tautomerization and general Carboxylic acid tautomer considerations #6822
Unanswered
bwolfe-benchling
asked this question in
General
I am curious about the logic of keto-enol tautomerization in RDKit. The SMARTS pattern used for 1,3 keto-enol allows for aldehydes and ketones, but also for esters and acids, which are not likely to undergo tautomerization.
Maybe likelihood isn't a factor.
But then why does the 1,5 keto-enol pattern only allow for ketones, and not aldehydes, esters, or acids?
It seems like the 1,5 should at least allow aldehydes, and there could be some debate as to whether 1,5 should include ester and acid or 1,3 should exclude them.
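One way to check which carbonyl classes a given SMARTS pattern admits is to run it against one small representative of each class. The sketch below does that. The pattern is an illustrative stand-in written for this example (an alpha carbon bearing a hydrogen, attached to a C=O), not a copy of the pattern in RDKit's transform table, and the four test molecules are arbitrary choices.

```python
from rdkit import Chem

# Illustrative pattern only (an assumption, NOT RDKit's own transform SMARTS):
# an sp3 carbon with at least one H, single-bonded to a carbonyl carbon.
pattern = Chem.MolFromSmarts("[CX4;!H0]-[CX3]=[OX1]")

# One small representative per carbonyl class (arbitrary examples).
examples = {
    "aldehyde (propanal)": "CCC=O",
    "ketone (acetone)": "CC(C)=O",
    "ester (methyl acetate)": "CC(=O)OC",
    "acid (acetic acid)": "CC(=O)O",
}

# Record whether the pattern matches each representative.
matches = {
    name: Chem.MolFromSmiles(smiles).HasSubstructMatch(pattern)
    for name, smiles in examples.items()
}

for name, hit in matches.items():
    print(f"{name}: {hit}")
```

A pattern this loose matches all four classes, since nothing in it restricts what else is attached to the carbonyl carbon. Excluding esters and acids would take an extra constraint on that atom's other neighbor.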
Carboxylic acids in general
In terms of excluding COOH from tautomerization in general, not just keto-enol: some compounds, like PubChem CID 5479494, spend a LOT of enumeration cycles (~9,000 tautomers if limits are increased) generating geminal vinylic diol structures (80% of the generated tautomer forms). These are pretty ugly, and they turn the choice of a canonical tautomer into picking the best out of many bad structures, especially if the tautomer and/or transform count is limited.
Besides keto-enol shifts, these are largely due to various types of carboxylic acid endpoint-based shifts, like this subset:
In initial experiments where I adjusted the SMARTS patterns to disallow COOH as a tautomer H endpoint, that same compound (PubChem CID 5479494), which originally generated about 9,000 tautomers, generates only 6. I guess that indicates that many reasonable tautomer forms that don't involve an enol form of COOH can only be reached via an intermediate tautomer form that does involve one. I don't know if that's good or bad.
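For anyone who wants to reproduce this kind of count, here is a minimal sketch of how the geminal vinylic diol share could be measured with raised limits. It uses acetic acid as a small stand-in rather than CID 5479494. The helper name `count_gem_vinylic_diols`, the limit values, and the SMARTS used to flag the motif are all assumptions made for this example. It only counts the forms after enumeration and does not modify the transform patterns as in the experiment described above.

```python
from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

# Hypothetical SMARTS for a geminal vinylic diol (the enol form of COOH):
# an sp2 carbon double-bonded to carbon and carrying two hydroxyl groups.
GEM_VINYLIC_DIOL = Chem.MolFromSmarts("[CX3]=[CX3]([OX2H1])[OX2H1]")


def count_gem_vinylic_diols(smiles, max_tautomers=10000, max_transforms=10000):
    """Return (total tautomers, tautomers containing the flagged motif)."""
    mol = Chem.MolFromSmiles(smiles)

    enumerator = rdMolStandardize.TautomerEnumerator()
    # Raise the enumeration limits; these values are arbitrary for the sketch.
    enumerator.SetMaxTautomers(max_tautomers)
    enumerator.SetMaxTransforms(max_transforms)

    tautomers = list(enumerator.Enumerate(mol))
    flagged = [t for t in tautomers if t.HasSubstructMatch(GEM_VINYLIC_DIOL)]
    return len(tautomers), len(flagged)


# Small stand-in molecule (acetic acid), not the compound discussed above.
total, flagged = count_gem_vinylic_diols("CC(=O)O")
print(f"{flagged} of {total} enumerated tautomers contain the motif")
```

Running the same helper on a larger molecule would show how much of the enumeration budget goes to these forms.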