Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No coords atropisomers - fix smiles output of atrop wedges after reordering #7418

Merged
merged 17 commits into from
May 7, 2024
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
107 changes: 107 additions & 0 deletions Code/GraphMol/CIPLabeler/catch_tests.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
#include <GraphMol/FileParsers/FileParsers.h>
#include <GraphMol/MarvinParse/MarvinParser.h>
#include <GraphMol/test_fixtures.h>
#include <GraphMol/Substruct/SubstructMatch.h>

#include "CIPLabeler.h"
#include "Digraph.h"
Expand Down Expand Up @@ -840,3 +841,109 @@ TEST_CASE("upsertTest", "[basic]") {
CHECK(smilesOut != "");
}
}

TEST_CASE("atropisomers", "[basic]") {
SECTION("atropisomers1") {
UseLegacyStereoPerceptionFixture useLegacy(false);

std::string rdbase = getenv("RDBASE");

std::vector<std::string> files = {
"BMS-986142_atrop5.sdf", "BMS-986142_atrop1.sdf",
"BMS-986142_atrop2.sdf", "JDQ443_atrop1.sdf",
"JDQ443_atrop2.sdf", "Mrtx1719_atrop1.sdf",
"Mrtx1719_atrop2.sdf", "RP-6306_atrop1.sdf",
"RP-6306_atrop2.sdf", "Sotorasib_atrop1.sdf",
"Sotorasib_atrop2.sdf", "ZM374979_atrop1.sdf",
"ZM374979_atrop2.sdf"};

for (auto file : files) {
std::cout << "File: " << file << std::endl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't include output like this in the tests, it makes the running the tests really noisy and doesn't add anything. If there's an informative message that you want to output when a test fails, you can do that with catch2's INFO() macro.

I know that I approved several older PRs that also have a large amount of output, but I plan to do a PR to clean that up in the not-too-distant future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You fixed that one example, but there are a number of other outputs to std::cout in that file and in the other tests added for this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got them all - probably one I didn't put in.

auto fName =
rdbase + "/Code/GraphMol/FileParsers/test_data/atropisomers/" + file;

auto molsdf = MolFileToMol(fName, true, true, true);
REQUIRE(molsdf);
CIPLabeler::assignCIPLabels(*molsdf, 100000);

std::map<std::pair<unsigned int, unsigned int>, std::string> CIPVals;
for (auto bond : molsdf->bonds()) {
auto a1 = bond->getBeginAtomIdx();
auto a2 = bond->getEndAtomIdx();
if (a1 > a2) {
std::swap(a1, a2);
}

if (bond->hasProp(common_properties::_CIPCode)) {
CIPVals[std::make_pair(a1, a2)] =
bond->getProp<std::string>(common_properties::_CIPCode);
} else {
CIPVals[std::make_pair(a1, a2)] = "N/A";
}
}

molsdf->clearConformers();

SmilesWriteParams wp;
wp.canonical = false;
wp.doKekule = false;
unsigned int flags =
RDKit::SmilesWrite::CXSmilesFields::CX_MOLFILE_VALUES |
RDKit::SmilesWrite::CXSmilesFields::CX_ATOM_PROPS |
RDKit::SmilesWrite::CXSmilesFields::CX_BOND_CFG |
RDKit::SmilesWrite::CXSmilesFields::CX_BOND_ATROPISOMER;
SmilesParserParams pp;
pp.allowCXSMILES = true;
pp.sanitize = true;

auto smi = MolToCXSmiles(*molsdf, wp, flags);
auto newMol = SmilesToMol(smi, pp);
CIPLabeler::assignCIPLabels(*newMol, 100000);

std::map<std::pair<unsigned int, unsigned int>, std::string> newCIPVals;
for (auto bond : newMol->bonds()) {
auto a1 = bond->getBeginAtomIdx();
auto a2 = bond->getEndAtomIdx();
if (a1 > a2) {
std::swap(a1, a2);
}
if (bond->hasProp(common_properties::_CIPCode)) {
newCIPVals[std::make_pair(a1, a2)] =
bond->getProp<std::string>(common_properties::_CIPCode);
} else {
newCIPVals[std::make_pair(a1, a2)] = "N/A";
}
}

RDKit::SubstructMatchParameters params;

auto match = RDKit::SubstructMatch(*molsdf, *newMol, params);

for (auto thisBond : newMol->bonds()) {
unsigned int a1 = thisBond->getBeginAtomIdx();
unsigned int a2 = thisBond->getEndAtomIdx();
if (a1 > a2) {
std::swap(a1, a2);
}

unsigned int a1match = match[0][a1].second;
unsigned int a2match = match[0][a2].second;

if (a1match > a2match) {
std::swap(a1match, a2match);
}
if (CIPVals[std::make_pair(a1match, a2match)] !=
newCIPVals[std::make_pair(a1, a2)]) {
std::cout << "Bond " << thisBond->getBeginAtomIdx() << " - "
<< thisBond->getEndAtomIdx() << " has changed from "
<< CIPVals[std::make_pair(a1match, a2match)] << " to "
<< newCIPVals[std::make_pair(a1, a2)] << std::endl;
}
CHECK(CIPVals[std::make_pair(a1match, a2match)] ==
newCIPVals[std::make_pair(a1, a2)]);
}

std::cout << "Done" << std::endl;
}
}
}
79 changes: 74 additions & 5 deletions Code/GraphMol/SmilesParse/CXSmilesOps.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2049,11 +2049,80 @@ std::string get_bond_config_block(
bool isAnAtropisomer = false;

const Atom *firstAtom = bond->getBeginAtom();
for (auto bondNbr : mol.atomBonds(firstAtom)) {
if (bondNbr->getStereo() == Bond::BondStereo::STEREOATROPCW ||
bondNbr->getStereo() == Bond::BondStereo::STEREOATROPCCW) {
isAnAtropisomer = true;
break;
if (bd == Bond::BondDir::BEGINDASH || bd == Bond::BondDir::BEGINWEDGE) {
for (auto bondNbr : mol.atomBonds(firstAtom)) {
if (bondNbr->getIdx() == bond->getIdx()) {
continue; // a bond is not its own neighbor
}
if (bondNbr->getStereo() == Bond::BondStereo::STEREOATROPCW ||
bondNbr->getStereo() == Bond::BondStereo::STEREOATROPCCW) {
isAnAtropisomer = true;

// if it is for an atropisomer and there are no coords, check to see
// if the wedge needs to be flipped based on the smiles reordering
if (!coordsIncluded && isAnAtropisomer) {
Atropisomers::AtropAtomAndBondVec atomAndBondVecs[2];
if (!Atropisomers::getAtropisomerAtomsAndBonds(
bondNbr, atomAndBondVecs, mol)) {
throw RDKit::SmilesParseException(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't be a SmilesParseException since we aren't parsing SMILES at this point.
This should either be a ValueError if it's something which could happen but shouldn't or handled using the RDKit's CHECK_INVARIANT() if this only happens in the case of an RDKit bug.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed to value error

"Internal error - should not occur");
// should not happend
} else {
unsigned int swaps = 0;

unsigned int firstReorderedIdx =
std::find(atomOrder.begin(), atomOrder.end(),
bondNbr->getBeginAtom()->getIdx()) -
atomOrder.begin();
unsigned int secondReorderedIdx =
std::find(atomOrder.begin(), atomOrder.end(),
bondNbr->getEndAtom()->getIdx()) -
atomOrder.begin();
if (firstReorderedIdx > secondReorderedIdx) {
++swaps;
}

for (unsigned int bondAtomIndex = 0; bondAtomIndex < 2;
++bondAtomIndex) {
if (atomAndBondVecs[bondAtomIndex].first == firstAtom)
continue; // swapped atoms on the side where the wedge bond
// is does NOT change the wedge bond
if (atomAndBondVecs[bondAtomIndex].second.size() == 2) {
unsigned int firstOtherAtomIdx =
atomAndBondVecs[bondAtomIndex]
.second[0]
->getOtherAtom(atomAndBondVecs[bondAtomIndex].first)
->getIdx();
unsigned int secondOtherAtomIdx =
atomAndBondVecs[bondAtomIndex]
.second[1]
->getOtherAtom(atomAndBondVecs[bondAtomIndex].first)
->getIdx();

unsigned int firstReorderedAtomIdx =
std::find(atomOrder.begin(), atomOrder.end(),
firstOtherAtomIdx) -
atomOrder.begin();
unsigned int secondReorderedAtomIdx =
std::find(atomOrder.begin(), atomOrder.end(),
secondOtherAtomIdx) -
atomOrder.begin();

if (firstReorderedAtomIdx > secondReorderedAtomIdx) {
++swaps;
}
}
}
if (swaps % 2) {
bd = (bd == Bond::BondDir::BEGINWEDGE)
? Bond::BondDir::BEGINDASH
: Bond::BondDir::BEGINWEDGE;
}
}
}

break;
}
}
}

Expand Down