Can I get the largest substructure that contains a specific substructure? #7311
-
Hello, I have a question regarding a unique use case. I have two molecules that share a substructure A. I would like to find their largest common substructure, let's call it substructure B, that contains substructure A within it. Is there a built-in way to do this in RDKit? Thanks!
-
Yes, I believe what you are referring to is the maximum common substructure. There is an example in the RDKit Getting Started Guide here: https://www.rdkit.org/docs/GettingStartedInPython.html#maximum-common-substructure
Vin
-
I think the RASCAL MCES code can help you here.
[code example not preserved]
which gives
[output not preserved]
Mapping the SMARTS strings back onto the molecules gives
[image not preserved]
where the blue bonds are the unmatched ones.
You would need to add some extra code to fragment the molecules at the bonds you don't want and then pick out the fragment that matches the substructure you care about. I believe (though I can't prove it) that in a fragmented system like this, the largest common piece that has your substructure will always be the largest available such fragment.
See https://greglandrum.github.io/rdkit-blog/posts/2023-11-08-introducingrascalmces.html for more information about RASCAL.
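The "pick out the fragment" step can be sketched without any RDKit calls, as a connected-component search over the matched bonds. This is only an illustration under stated assumptions, not the code from the reply: the function name `component_containing` is hypothetical, the matched bonds are assumed to be already translated into pairs of atom indices for one molecule, and the atoms of substructure A are assumed to come from something like a `GetSubstructMatch` call on that same molecule. The bond list in the example is made up for demonstration.

```python
from collections import defaultdict, deque


def component_containing(matched_bonds, required_atoms):
    """Return the set of atoms connected through matched bonds that
    contains every required atom, or None if no single fragment does.

    matched_bonds: iterable of (atom_idx, atom_idx) pairs (assumed input).
    required_atoms: atom indices of substructure A (assumed input).
    """
    # Build an adjacency map using only the matched (common) bonds;
    # unmatched bonds are simply absent, which fragments the molecule.
    adjacency = defaultdict(set)
    for a, b in matched_bonds:
        adjacency[a].add(b)
        adjacency[b].add(a)

    required = set(required_atoms)
    if not required:
        return None

    start = next(iter(required))
    if start not in adjacency:
        # The starting atom has no matched bonds at all.
        return None

    # Breadth-first search from one atom of substructure A.
    seen = {start}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbour in adjacency[atom]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    # Accept the fragment only if all of substructure A lies inside it.
    return seen if required <= seen else None


# Made-up example: a six-membered ring (atoms 0-5) with one substituent
# atom (6), plus a separate matched fragment (atoms 8-9).
bonds = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0), (5, 6), (8, 9)]
fragment = component_containing(bonds, [0, 1, 2])
split = component_containing(bonds, [0, 8])
```

Here `fragment` is the ring plus its substituent, and `split` is `None` because atoms 0 and 8 sit in different fragments. Running the same search on the second molecule, with its own atom indices, would give the corresponding fragment there.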