discussion: Repeat querying #48

Open
oslopanda opened this issue Mar 31, 2023 · 1 comment

@oslopanda
Contributor

Hello,

Thank you for the nice work!
While using Xerus, I get the impression that the database is being queried repeatedly.
For instance, for a system with elements A, B, C, and D,
the program queries the combinations A, B, C, D, (AB), (AC), (AD), (BC), (CD), ...
However, if you just query for (ABCD), wouldn't you get all the combinations from the database? Or am I totally wrong?
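
To make the count concrete, here is a small sketch of the enumeration described above. It is not Xerus code; the function name `chemical_subsystems` and its parameters are hypothetical and only illustrate how many separate queries a per-combination approach implies.

```python
from itertools import combinations


def chemical_subsystems(elements, max_size=None):
    """Return every non-empty element combination up to max_size (hypothetical helper)."""
    elements = sorted(set(elements))
    max_size = max_size or len(elements)
    return [
        combo
        for size in range(1, max_size + 1)
        for combo in combinations(elements, size)
    ]


subsystems = chemical_subsystems(["A", "B", "C", "D"])
print(len(subsystems))  # 15 = 2**4 - 1 separate queries
```

For four elements this gives 15 subsystems, so one query per combination means 15 requests per provider instead of one.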

@pedrobcst
Owner

Hey,

Sorry for the late reply.

Indeed, we query by each combination of elements instead of querying the full list of elements at once. I do not remember exactly why I did it this way (the queriers themselves were one of the first things I coded, and it is actually due to a refactor), but I believe it was for the following reasons:

  • A few of the providers did not support querying the elements in an inclusive manner for a given number of species. I.e., if you asked for elements A, B, C, D with up to 4 elements, meaning you want A, AB, ABC, etc., one of the providers (or a couple, not sure) would return, for AB with a maximum of 3 elements, AB plus something else such as EF (sorry if it's confusing). For example: if you gave [Sr, Ca, Ba, O] with a maximum of 4, some providers would return, for Sr, Ca, all ternary compounds that contain Sr and Ca plus something else, and likewise for all the other possible combinations. The way to circumvent this was to precisely build up all the possible element combinations and query each chemical space separately.

  • I wanted to keep track of the element combinations so that, when querying overlapping spaces, already downloaded spaces could be skipped. Thinking back, this is quite pointless, since this can be checked after querying.
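
Both points could in principle be handled after a single inclusive query. The following is only a minimal sketch under assumptions, not how Xerus works today: the entry layout (`id`, `elements`), the function name `group_by_system`, and the `already_downloaded` set are all hypothetical. It drops entries containing elements outside the requested set and skips chemical systems that were downloaded earlier.

```python
from collections import defaultdict


def group_by_system(entries, allowed_elements, max_size, already_downloaded=frozenset()):
    """Filter inclusive-query results and group them by chemical system (hypothetical helper)."""
    allowed = set(allowed_elements)
    grouped = defaultdict(list)
    for entry in entries:
        elements = set(entry["elements"])
        # Drop entries that contain foreign elements or too many species.
        if not elements <= allowed or len(elements) > max_size:
            continue
        system = "-".join(sorted(elements))
        # Skip chemical spaces that were already fetched in an earlier run.
        if system in already_downloaded:
            continue
        grouped[system].append(entry["id"])
    return dict(grouped)


# Made-up entries standing in for a provider response.
entries = [
    {"id": "e1", "elements": ["Sr", "O"]},
    {"id": "e2", "elements": ["Sr", "Ca", "O"]},
    {"id": "e3", "elements": ["Sr", "Ca", "Fe"]},  # contains Fe, not requested
    {"id": "e4", "elements": ["Ba", "O"]},  # system already downloaded
]
result = group_by_system(entries, ["Sr", "Ca", "Ba", "O"], 4, already_downloaded={"Ba-O"})
print(result)
```

This only helps for providers that return at least everything in the requested space; providers that cannot be queried inclusively would still need the per-combination approach.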

I believe the queriers can be revisited and rewritten to be more efficient. In particular, the COD querying should be moved entirely to the OPTIMADE interface, which I believe would give more flexibility. This can be reinvestigated; right now I don't have much free time to work on Xerus (sorry!), but I would be glad to review and check any PR. Any other improvements are also welcome!

@pedrobcst pedrobcst self-assigned this Apr 2, 2023