2023_03_1 (Q1 2023) Release #6327
I see that Torsion fingerprint generation changed between 2022_09 and 2023_03. The release notes say "Fix TorsionFingerprints for 15 membered rings", but the fingerprint for CHEMBL2448329, which has only a 9-membered ring, also changed. Overall there are 136 differences in ChEMBL 30 (starting from a previously generated SMILES file, not directly from the SDF distribution). All of them contain a boron. Almost all of them have a B-P ([CHEMBL4524079](https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL4524079/), [CHEMBL4524080](https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL4524080/), and CHEMBL4103340 do not). Is this expected? Here's a reproducible example, starting with 2022.09.5 (these all use Python 3.11 and Boost 1.82.0).
Using 2023.03.1
The differences:
The 136 differing records are: CHEMBL2448329, CHEMBL4299778, CHEMBL4300169, CHEMBL4299873, CHEMBL2448332, CHEMBL2448333, CHEMBL4300689, CHEMBL4300879, CHEMBL122888, CHEMBL4301413, CHEMBL4301911, CHEMBL4302231, CHEMBL4302317, CHEMBL4302850, CHEMBL2448334, CHEMBL2448335, CHEMBL2448336, CHEMBL2448337, CHEMBL2448338, CHEMBL2448339, CHEMBL2086772, CHEMBL223397, CHEMBL223954, CHEMBL224851, CHEMBL4454496, CHEMBL2448345, CHEMBL2029004, CHEMBL2029005, CHEMBL388190, CHEMBL3350429, CHEMBL3350430, CHEMBL3350428, CHEMBL4524079, CHEMBL4524080, CHEMBL484082, CHEMBL4538289, CHEMBL4551891, CHEMBL1802095, CHEMBL1802094, CHEMBL601711, CHEMBL610595, CHEMBL611086, CHEMBL2028999, CHEMBL2029001, CHEMBL2029003, CHEMBL2373179, CHEMBL2029007, CHEMBL2029008, CHEMBL2029009, CHEMBL2029010, CHEMBL2029006, CHEMBL2021421, CHEMBL2029000, CHEMBL2029002, CHEMBL2086768, CHEMBL2086769, CHEMBL2086766, CHEMBL2086773, CHEMBL2068733, CHEMBL2086763, CHEMBL2086764, CHEMBL2086765, CHEMBL2086762, CHEMBL2068734, CHEMBL2086761, CHEMBL2181938, CHEMBL2181939, CHEMBL2181940, CHEMBL2181941, CHEMBL2181943, CHEMBL2181934, CHEMBL2181942, CHEMBL2309024, CHEMBL2368436, CHEMBL2368437, CHEMBL2368438, CHEMBL2368440, CHEMBL2368441, CHEMBL2368442, CHEMBL2368443, CHEMBL2368444, CHEMBL2368445, CHEMBL2364572, CHEMBL2374533, CHEMBL2386492, CHEMBL2386496, CHEMBL2386497, CHEMBL2386498, CHEMBL2386499, CHEMBL2386342, CHEMBL2386493, CHEMBL2386494, CHEMBL2386495, CHEMBL2386341, CHEMBL2386489, CHEMBL2448386, CHEMBL2448387, CHEMBL2448390, CHEMBL2448391, CHEMBL2448397, CHEMBL2448398, CHEMBL2448399, CHEMBL2448400, CHEMBL2448442, CHEMBL2448443, CHEMBL2448444, CHEMBL2448445, CHEMBL2448446, CHEMBL3087162, CHEMBL3087163, CHEMBL3087164, CHEMBL3087165, CHEMBL3087158, CHEMBL3087159, CHEMBL3087160, CHEMBL3087161, CHEMBL3087166, CHEMBL3087167, CHEMBL3392138, CHEMBL4299192, CHEMBL4299194, CHEMBL4299198, CHEMBL4299199, CHEMBL4299204, CHEMBL4299205, CHEMBL4299206, CHEMBL4299207, CHEMBL4299208, CHEMBL4299209, CHEMBL3603848, CHEMBL3604561, 
CHEMBL3604558, CHEMBL3604560, CHEMBL3558885, CHEMBL3558886, CHEMBL4103340.
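For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how a list of differing records could be produced. It assumes each RDKit version was used to write one line per record in a made-up `<record id><TAB><bit>:<count>,...` format. That format, the function names `parse_dump` and `differing_records`, and the sample data are all hypothetical and are not taken from the original reproduction.

```python
def parse_dump(text):
    """Parse lines of '<record id><TAB><bit>:<count>,...' into a dict.

    The line format is an assumption made for this sketch.
    """
    records = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record_id, _, body = line.partition("\t")
        counts = {}
        for item in body.split(","):
            if item:
                bit, _, count = item.partition(":")
                counts[int(bit)] = int(count)
        records[record_id] = counts
    return records


def differing_records(old_text, new_text):
    """Return the ids present in both dumps whose fingerprints differ."""
    old, new = parse_dump(old_text), parse_dump(new_text)
    return [rid for rid in old if rid in new and old[rid] != new[rid]]


# Made-up data for illustration only; these are not real ChEMBL fingerprints.
old_dump = "ID_A\t10:1,42:2\nID_B\t7:1\n"
new_dump = "ID_A\t10:1,43:2\nID_B\t7:1\n"
print(differing_records(old_dump, new_dump))
```

Keeping the comparison separate from fingerprint generation means each dump can be written in its own environment, one per RDKit version, and compared afterwards.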
Torsional fingerprints using includeChirality=1 have changed quite a bit between 2022_09_5 and 2023_03_1. Which of the above items led to that change? Here's a reproducible example (it occurs with both SDF and SMILES input). Both runs use Python 3.11 and Boost 1.82.0:
Here are the differences in the bits:
The includeChirality=0 fingerprints are identical between these two RDKit versions.
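A bit-level comparison like the one described here can be sketched in a few lines. This assumes each fingerprint is available as a plain `{bit: count}` dict, for example the result of `GetNonzeroElements()` on an RDKit sparse count vector. The helper name `diff_bits` and the numbers below are invented for illustration and are not the actual bits from either release.

```python
def diff_bits(old_fp, new_fp):
    """Split the differences between two {bit: count} dicts into three groups."""
    only_old = {b: c for b, c in old_fp.items() if b not in new_fp}
    only_new = {b: c for b, c in new_fp.items() if b not in old_fp}
    changed = {
        b: (old_fp[b], new_fp[b])
        for b in old_fp
        if b in new_fp and old_fp[b] != new_fp[b]
    }
    return only_old, only_new, changed


# Invented example values; not output from any RDKit version.
old_fp = {101: 1, 202: 2, 303: 1}
new_fp = {101: 1, 202: 3, 404: 1}

only_old, only_new, changed = diff_bits(old_fp, new_fp)
print("only in old:", only_old)
print("only in new:", only_new)
print("count changed (old, new):", changed)
```

Reporting bits that vanished, bits that appeared, and bits whose count changed as separate groups makes it easier to tell a hashing change from a change in which torsions are enumerated.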
Last one: I noticed a change in the AtomPair fingerprint when using fromAtoms, but I didn't see anything about it in the release notes. Here's a test program which I'll use to show the issue. It starts by downloading and storing a copy of PubChem record 9425015 into
With 2022.09.5:
With 2023.03.1, the
Release_2023.03.1
(Changes relative to Release_2022.09.1)
Acknowledgements
(Note: I'm no longer attempting to manually curate names. If you would like to
see your contribution acknowledged with your name, please set your name in
GitHub)
Michael Banck, Christopher Von Bargen, Jason Biggs, Jonathan Bisson, Jacob
Bloom, Shang Chien, David Cosgrove, Iren Azra Azra Coskun, Andrew Dalke, Eloy
Félix, Peter Gedeck, Desmond Gilmour, Mosè Giordano, Emanuele Guidotti, Tad
Hurst, Gareth Jones, Calvin Josenhans, Maria Kadukova, Brian Kelley, Joos
Kiener, Chris Kuenneth, Martin Larralde, Krzysztof Maziarz, Jeremy Monat, Michel
Moreau, Rocco Moretti, Lucas Morin, Dan Nealschneider, Noel O'Boyle, Vladas
Oleinikovas, Rachael Pirie, Ricardo Rodriguez-Schmidt, Vincent F. Scalfani,
Gregor Simm, Marco Stenta, Georgi Stoychev, Paolo Tosco, Kazuya Ujihara,
Riccardo Vianello, Franz Waibl, Rachel Walker, Patrick Walters,
'dangthatsright', 'mihalyszabo88', 'Deltaus', 'radchenkods',
'josh-collaborationspharma', 'jkh', 'yamasakih'
Highlights
Backwards incompatible changes
includeDativeBonds which can be used to change this behavior
bond.GetBondDir() == Bond.BondDir.EITHERDOUBLE) now have their BondStereo set to Bond.BondStereo.STEREOANY and the BondDir information removed by default when molecules are parsed or AssignStereochemistry() is called with the cleanIt argument set to True.
Bug Fixes:
(github issue GetSubstructMatches uniquify and maxMatches don't work well together #888 from adalke)
(github issue DrawRDKBits raised RDKit error when it applied to the compounds that contains imidazole. #2164 from yamasakih)
(github issue MolFromMol2File: O.co2 atom type correctness check ignores phosphate groups #3246 from chmnk)
(github issue Segfault with coordgen v3.0.0 #4845 from kienerj)
(github issue Segfault with coordgen v3.0.0 #4845 from lucasmorin222)
(github issue Dative bond and alkali and alkaline earth metals #5120 from marcostenta)
(github issue RGD Stereochemistry in decomposed structure is not copied to the matching core #5613 from jones-gareth)
(github issue fp.ToList() fails for empty molecule #5677 from baoilleach)
(github issue SMILES and SMARTS parse bonds in a different order #5683 from ricrogz)
(github issue postgresql makefile needs to be updated to use c++17 #5685 from mbanck)
(github issue Exception raised when reading very large SMILES file #5692 from DavidACosgrove)
(github pull Update warning message about aromaticity detection #5696 from d-b-w)
(github pull stop building catch_main when tests are disabled #5697 from greglandrum)
(github pull Make PandasTools.RGroupDecompositionToFrame re-entrant #5698 from greglandrum)
(github issue PandasTools.RGroupDecompositionToFrame() should call ChangeMoleculeRendering() #5702 from greglandrum)
(github issue MolDraw2D should automatically set bond highlight color when atom colors are changed #5704 from greglandrum)
(github pull Use correct _WIN32 macro for checking Windows target #5710 from giordano)
(github pull Environment not set properly in chirality tests for MinGW builds #5711 from ptosco)
(github pull windows.h header should be lowercase #5712 from ptosco)
(github pull Fixes bond index parsing for w/c/t/ctu labels in CXSMILES/CXSMARTS #5722 from ricrogz)
(github pull Fix a deprecation warning in pythonTestDirRoot #5723 from ricrogz)
(github pull allowNontetrahedralChiralty should be honored when reading/writing SMILES #5728 from greglandrum)
(github pull CFFI/MinimalLib fixes #5729 from ptosco)
(github pull Allow setting custom FREETYPE_LIBRARY/FREETYPE_INCLUDE_DIRS through CMake #5730 from ptosco)
(github issue Missing update path for postgreSQL from 3.8 to 4.2 #5734 from Deltaus)
(github pull Avoid passing a NULL pointer to CanSmiles() #5750 from ptosco)
(github issue CDXML reader incorrectly sets stereo on crossed double bonds #5752 from baoilleach)
(github issue R atom label information lost in molfile if not backed by a M RGP entry #5763 from eloyfelix)
(github issue Missing monomer labels when depicting MON SGroups #5767 from eloyfelix)
(github issue Wrongly oriented SGroup bracket #5768 from eloyfelix)
(github pull Adjust LocaleSwitcher on Windows when RDK_BUILD_THREADSAFE_SSS not set #5783 from roccomoretti)
(github issue KekulizationException in tautomer canonicalization #5784 from d-b-w)
(github issue ChemicalReactionToRxnBlock ignores separateAgents if forceV3000 is true #5785 from jacobbloom)
(github pull extend the allowed valences of the alkali earths #5786 from greglandrum)
(github issue Minimallib build (rdkit-js) not working for release 2022.09.2 #5792 from MichelML)
(github pull Remove dependency on MSVC runtime DLL in MinGW builds #5800 from ptosco)
(github pull Update macOS target platform to 10.13 #5802 from ptosco)
(github issue R# atom label information lost in molfile if not handled by the RGP spec #5810 from eloyfelix)
(github issue Stop using recommonmark in the documentation #5812 from greglandrum)
(github issue Properties with new lines can create invalid SDFiles #5827 from bp-kelley)
(github pull Allow building PgSQL RPM and DEB packages #5836 from ptosco)
(github issue Additional output is incorrect when FP count simulation is active #5838 from ptosco)
(github issue Explicit valence check fails for certain SMILES #5849 from josh-collaborationspharma)
(github pull Set emsdk path for freetype in emscripten builds #5857 from ptosco)
(github issue DrawMorganBit fails by default #5863 from eguidotti)
(github pull Fix #5810 in V2000 mol files. #5864 from eloyfelix)
(github pull Chemical drawings should be automatically enabled on Colab #5868 from kuelumbus)
(github pull use enhanced stereo when uniquifying in SimpleEnum #5874 from greglandrum)
(github issue Conformer Generation Fails for three-coordinate Ns with specified stereo #5883 from gncs)
(github pull Fix documentation example for KeyFromPropHolder #5886 from gedeck)
(github pull Allow unrecognized atom types when strictParsing=False #5891 from greglandrum)
(github issue DetermineBonds assigning methyl carbon as tetrahedral center #5894 from jasondbiggs)
(github issue numpy.float is no longer supported and causes exceptions #5895 from PatWalters)
(github issue moldraw2DTest1 failure when building on aarch64 #5899 from vfscalfani)
(github issue DetermineBondOrders running out of memory on medium-sized disconnected structure #5902 from jasondbiggs)
(github pull clear MDL Rgroup labels from core atoms when we aren't using them #5904 from greglandrum)
(github issue Conformer generator produces strange structure for substituted butadiene #5913 from gncs)
(github issue MHFPEncoder::Distance doesn't compute a (Jaccard) distance #5919 from althonos)
(github pull AvalonTools: Avoid that trailing garbage pollutes the fmemopen buffer #5928 from ptosco)
(github issue "not" queries in molfiles get inverted #5930 from d-b-w)
(github issue CalcTPSA() doesn't use options when caching #5941 from greglandrum)
(github issue Bad drawing of end points for dative bonds #5943 from DavidACosgrove)
(github issue Extremes of drawn ellipses not being calculated correctly. #5947 from DavidACosgrove)
(github issue Arrow heads of dative bonds are different sizes #5949 from DavidACosgrove)
(github pull stop caching ring-finding results #5955 from greglandrum)
(github issue Wrong bond endpoint when connecting to wedge bond in 2D image #5963 from stgeo)
(github pull Tiny change to get demo.html to load in legacy browsers #5964 from ptosco)
(github pull detect bad double bond stereo in conformer generation #5967 from greglandrum)
(github issue drawing code should not generate kekulization errors #5974 from greglandrum)
(github pull Adjust expected test results for newer freetype versions #5979 from greglandrum)
(github issue CanonicalRankAtomsInFragment example in the documentation is not reproducible #5986 from chmnk)
(github pull Exception in RegistrationHash for molecules with bad bond directions #5987 from d-b-w)
(github pull Updated the GetMolHash docstring for accuracy #5988 from irenazra)
(github pull Fix a problem with pickling molecules with more than 255 rings #5992 from greglandrum)
(github pull Support Python 3.11 #5994 from greglandrum)
(github issue Incorrect disconnection of CC(=O)O[Mg]OC(=O)C #5997 from DavidACosgrove)
(github issue PostgreSQL autovacuum stuck when molecules with query features are stored in mol columns #6002 from mihalyszabo88)
(github pull Remove and from C++ headers #6003 from d-b-w)
(github issue [PH3] incorrectly recognized as potential stereo center #6011 from greglandrum)
(github issue Potential nontetrahedral stereo is recognized when nontetrahedral stereo is disabled. #6012 from greglandrum)
(github issue MolEnumerator is not propagating query information to molecules #6014 from greglandrum)
(github issue Reactions do not propagate query information to products #6015 from greglandrum)
(github issue Error rendering to very small canvas #6025 from DavidACosgrove)
(github issue Bad double bond drawn for collinear atoms #6027 from DavidACosgrove)
(github pull Fix some minor leaks #6029 from ricrogz)
(github issue Cannot draw molecule which includes an atom with a [!#X] query (for any X) #6033 from ShangChien)
(github issue FragmentOnBonds may create unexpected radicals #6034 from ricrogz)
(github issue Calling MurckoScaffold on molecule causes bug in pickling #6036 from dangthatsright)
(github pull Bump maeparser and coordgen versions #6039 from ricrogz)
(github issue enhanced stereo is still included in CXSMILES if isomericSmiles=False #6040 from greglandrum)
(github issue Issues with ACS1996 drawing mode on a small canvas #6041 from DavidACosgrove)
(github issue Cyclobutyl group in a macrocycle triggers a stereo center #6049 from cdvonbargen)
(github issue stereogroups not combined when parsing CXSMILES #6050 from greglandrum)
(github issue Regression in depicting molecules with MDL query atoms #6054 from ptosco)
(github issue Bad drawing of ferrocene #6058 from DavidACosgrove)
(github pull Remove check for ring information from Atom::Match #6063 from fwaibl)
(github pull Correct docstring for minFontSize. #6066 from DavidACosgrove)
(github pull Minor code cleanup #6101 from ptosco)
(github issue Dummy atoms should not be considered to be metals for M and MH queries #6106 from greglandrum)
(github issue Drawing in ACS mode crops small images #6111 from DavidACosgrove)
(github issue Drawing in ACS1996 mode throws ValueError: Bad Conformer Id if molecule has no coords. #6112 from DavidACosgrove)
(github issue Single atom or queries with hydrogens shouldn't trigger warning in mergeQueryHs #6119 from bp-kelley)
(github issue DetermineBonds fails for single H atom #6121 from gncs)
(github pull MinimalLib: avoid that assignStereochemistry() fails when removeHs=true #6134 from ptosco)
(github issue Round-tripping a reaction through pickle changes the outputs from RunReactants #6138 from kmaziarz)
(github issue RGD and enhanced stereochemistry #6146 from jones-gareth)
(github issue MaeMolSupplier requires bond block #6153 from cdvonbargen)
(github issue Incorrect doule bond drawing with MolDraw2DSVG #6160 from radchenkods)
(github pull BondDir not cleared from bonds that aren't stereoactive #6162 from greglandrum)
(github issue Crossed bond not correctly drawn #6170 from ptosco)
(github issue ReactionFromRxnBlock fails on bond with reacting center status set #6195 from jones-gareth)
(github issue Possible regression in the atom/bond highlighting code #6200 from ptosco)
(github pull Updated README to build the PostgreSQL cartridge + bug fix #6214 from ptosco)
(github issue Atoms may get flagged with non-tetrahedral stereo even when it is not allowed #6217 from ricrogz)
(github pull Fix TorsionFingerprints for 15 membered rings #6228 from kazuyaujihara)
(github pull Fix build warnings #6235 from ricrogz)
(github issue Tri-coordinate atom with implicit + neighbor H atom is found potentially chiral #6239 from ricrogz)
(github issue DativeBondsToHaptic doesn't set _MolFileBondEndPts correctly. #6252 from DavidACosgrove)
(github issue Round-tripping ferrocene through HapticBondsToDatives loses drawing highlights. #6253 from DavidACosgrove)
(github pull Using Chiral Tag instead of CIPCode to ensure preservation of chirality in addHs #6259 from HalflingHelper)
(github pull Update assignSubstructureFilters.py #6270 from OleinikovasV)
(github pull deal with deprecated DataFrame.append method #6272 from greglandrum)
(github issue compile-time error with GCC 12.2.1 on Fedora 36 #6274 from rvianello)
(github pull Fix UnitTestPandasTools for running without pandas installed. #6299 from roccomoretti)
(github pull Aromatic dashes look bad #6303 from greglandrum)
Cleanup work:
(github pull Do deprecations for the 2023.03 release #5675 from greglandrum)
(github pull run clang_format #5676 from greglandrum)
(github pull Cleanup work on documentation Makefile #5804 from greglandrum)
(github pull Refactor RGD moving function implementations from header to source files #5958 from jones-gareth)
(github pull Disable POPCNT on M1 #6081 from bjonnh-work)
(github pull Remove spurious full stops from warnings. #6124 from DavidACosgrove)
(github pull Reformat Python code for 2023.03 release #6294 from ricrogz)
(github pull Reformat C/C++ code ahead of 2023.03 release #6295 from ricrogz)
New Features and Enhancements:
(github issue mol V3000: multicenter dative bond #5121 from marcostenta)
(github pull add molecular filter examples #5647 from RPirie96)
(github pull Use templates in RDKit coordinate generation #5643 from rachelnwalker)
(github pull add MACCS fp to the MinimalLib #5707 from eloyfelix)
(github pull Enable additional parameters in prepareAndDrawMolecule() and expose them to CFFI/MinimalLib #5731 from ptosco)
(github pull add includeRedundantEnvironments support to GetMorganGenerator #5732 from greglandrum)
(github pull FingerprintGenerator refactoring #5748 from greglandrum)
(github pull Expose RDLog to SWIG wrappers #5749 from ptosco)
(github pull Add a timeout protection for CIP calculations #5772 from tadhurst-cdd)
(github pull Expose getMolFrags in CFFI and MinimalLib #5774 from ptosco)
(github pull Get the wrappers working with SWIG 4.0 #5795 from greglandrum)
(github pull Update AvalonTools to version 2.0.4a #5796 from ptosco)
(github pull Add early example of drawing a molecule to Getting Started with the RDKit in Python #5803 from bertiewooster)
(github pull Enable get_molblock(details_json) from MinimalLib #5806 from ptosco)
(github pull Improvements to PandasTools.SaveXlsxFromFrame #5835 from ptosco)
(github pull swap boost::tuple to std::tuple #5851 from greglandrum)
(github pull Make it easy to calculate all 2D descriptors #5892 from greglandrum)
(github pull Introduces AvgIpc descriptor #5896 from greglandrum)
(github pull Add SMILES to each group abbreviation in Cookbook #5908 from bertiewooster)
(github pull Support SubstanceGroups and StereoGroups in JSON #5909 from greglandrum)
(github pull Add info about mergeHs to README. #5910 from DavidACosgrove)
(github pull Cookbook - update entry 1 and add entries 38 and 39 #5918 from vfscalfani)
(github pull Allow the sources of conformer generation failures to be retrieved #5960 from greglandrum)
(github pull Create getExperimentalTorsions() function #5969 from greglandrum)
(github pull Molblock wedging improvements #5981 from ptosco)
(github pull MinimalLib JS functions to add/remove Hs in place #5984 from ptosco)
(github pull Adds Pat Walter's Chembl Filters extraction to the FilterCatalog #5991 from bp-kelley)
(github pull Add depiction coordinates to molzip #5993 from jones-gareth)
(github pull Enable using STL algorithms on ROMol atoms and bonds #5995 from ptosco)
(github pull Enable building MinimalLib as a plain JS file for usage in legacy/headless browsers #5999 from ptosco)
(github pull Allow WriteSDF to create v3000 SDF files #6004 from jkhales)
(github issue add maxRecursiveMatches to SubstructMatchParameters #6017 from greglandrum)
(github pull Expose fingerprint generator options to python #6024 from greglandrum)
(github pull Allow SMARTS of zero order bonds to match zero order bonds #6037 from d-b-w)
(github pull Change IUPAC metal->non-metal single bonds to dative #6038 from DavidACosgrove)
(github pull Add canonicalization of stereo groups (enhanced stereo) #6051 from greglandrum)
(github pull Improve MaeMolSupplier API #6053 from ricrogz)
(github pull Enable optional visualization of complex query atoms in a more compact form #6056 from ptosco)
(github pull Start a Maestro file (.mae) writer. #6069 from ricrogz)
(github pull Expose some stereochemistry-related functions to SWIG wrappers #6075 from ptosco)
(github pull Add option to only include shortest paths for topological torsion fingerprints #6090 from greglandrum)
(github pull Enable "smilesSymbol" substitution in SMARTS #6096 from ricrogz)
(github pull Add the option to wedge two bonds at chiral centers #6108 from greglandrum)
(github pull Another minor code cleanup #6109 from ptosco)
(github pull A few SWIG tweaks #6110 from ptosco)
(github pull Stereochemistry-related SWIG updates #6127 from ptosco)
(github pull SWIG pickling improvements (and other cleanup) #6133 from ptosco)
(github pull Improved handling of organometallics #6139 from DavidACosgrove)
(github pull expose two missing QueryAtom types to python #6158 from greglandrum)
(github pull Support Pseudoatoms like Pol and Mod in the RegistrationHash #6175 from irenazra)
(github pull Better name for areBondsLinear. #6196 from DavidACosgrove)
(github pull add features to allow drawing molecules in arbitrary positions on a large canvas #6210 from greglandrum)
(github issue Support chirality when determining if a molecule is a reaction reactant #6211 from jones-gareth)
(github issue rdMolHash.MolHash function should allow customization of the CXSmiles via Chem.CXSmilesFields #6224 from irenazra)
(github pull Updated README for cartridge installation into conda PostgreSQL #6236 from ptosco)
(github issue Add a function to translate the MDL chiral flag into enhanced stereo groups #6241 from ricrogz)
(github pull Add support for generic matching in the PgSQL cartridge #6269 from bjonnh-work)
(github pull allowOptionalAttachments should also include terminal query atoms matching hydrogen #6280 from ptosco)
(github pull Exposed queryColour in MolDrawOptions #6282 from ptosco)
(github pull Add a new tautomer mol hash #6289 from glandrum)
(github pull has_coords() now reports whether coords are 2D or 3D if present #6297 from ptosco)
(github pull Improve the installation/testing instructions for linux/conda #6298 from roccomoretti)
Code removed in this release:
The SmilesParserParams option useLegacyStereo has been removed. Please use SetUseLegacyStereoPerception() instead.
which used to take several individual parameters have been removed. Please use the versions which take a single JSON string parameter.
The PrintAsBase64PNGString function in PandasTools has been removed. Please use PrintAsImageString instead.
This discussion was created from the release 2023_03_1 (Q1 2023) Release.