New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove redundant SSSR rings based on bond paths #2648
base: master
Are you sure you want to change the base?
Conversation
Since this would be relevant for 3.2, and we can't guarantee canonicalization across feature changes, I think we're okay wrt backwards compatibility of canonicalization. I would regenerate the result files as far as the tests. |
Is there an easy way to regenerate all result files? |
I found a simple way to update the result files for SMARTS, formula and formal charge tests. But not for the canonstable test. |
Weirdly enough, the smartstest is ok when I run it locally. I cannot find a good reason for that. I also get more changes than expected in the formalcharge test results file so something is up with that as well... I'll need to look more at this. |
for formalcharge, formula and smarts test
(One is unstable atm, giving a failed test)
Two fails in the tests. One is in testobconv_writers.py and I cannot see how that test has ever worked since the x- and y-coordinates in the expected output are the same. ( openbabel/test/testobconv_writers.py Lines 2882 to 2888 in 32cf131
The other is a bigger problem. It's in canonstabletest and the problem is that the failing test there isn't stable. $ BABEL_LIBDIR=build/lib/ build/bin/obabel -ismi -ocan -:"C1[C@@H]2C[C@H]3C[C@H]1C[C@](C2)(C3)Nc1nc(nc(n1)N[C@]12C[C@@H]3C[C@@H](C2)C[C@H](C1)C3)NC12C[C@@H]3C[C@@H](C2)C[C@H](C1)C3"
C1[C@H]2C[C@@H]3C[C@H]1C[C@@](C2)(C3)Nc1nc(nc(n1)N[C@]12C[C@@H]3C[C@@H](C2)C[C@@H](C1)C3)NC12C[C@@H]3C[C@@H](C2)C[C@H](C1)C3
1 molecule converted
$ BABEL_LIBDIR=build/lib/ build/bin/obabel -ismi -ocan -:"C1[C@H]2C[C@@H]3C[C@H]1C[C@@](C2)(C3)Nc1nc(nc(n1)N[C@]12C[C@@H]3C[C@@H](C2)C[C@@H](C1)C3)NC12C[C@@H]3C[C@@H](C2)C[C@H](C1)C3"
C1[C@@H]2C[C@H]3C[C@H]1C[C@](C2)(C3)Nc1nc(nc(n1)N[C@]12C[C@@H]3C[C@@H](C2)C[C@H](C1)C3)NC12C[C@@H]3C[C@@H](C2)C[C@H](C1)C3 So somehow, there seem to be an edge case in the canonicalization algo that the new SSSR ring code breaks. I don't know the canon code well enough to understand why atm but I will try to look into it. But maybe someone more versed into it can help out? |
OK, found the source of the failing pcjson-test. There was a bug in that format, fixed in #2596. However, this test was not updated to match. I will add a PR for that... |
I found the problem to the failing canonical test. There was a bug in my new code when there were multiple ring systems in the molecule. It is now fixed and thereby I could also revert the changes to the canonstable file meaning that there are no longer any changes to the canonicalization. |
As found in #2641 (comment), the current SSSR has a problem when removing redundant rings. The overlap is currently based on atoms instead of bonds, leading to incorrect results.
This code includes a fix for that problem, but I do not know how efficient the solution is.
In addition, I have a few questions/problems regarding the testing. I added a test case to
ringtest.cpp
where I thought it belonged. However, since several other tests uses the filetest/files/attype.00.smi
this leads to a bunch of failing tests. Would it be best to rework "my test" to a regression test or regenerate the result files for the other failing tests (such as formalcharge and formula etc)?In addition, this change seems to have affected canonicalization, at least I get a ton of errors in test_canonstable. If that code relies on SSSR it would be reasonable to expect this and in that case maybe a "backwards-compability-switch" so that the canonicalization could continue to call the old code that although not correct for SSSR-purposes should be good enough to produce stable canonicalization.