Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve FhirEngine Search performance for sorting #2547

Open
LZRS opened this issue May 17, 2024 · 7 comments · May be fixed by #2553
Open

Improve FhirEngine Search performance for sorting #2547

LZRS opened this issue May 17, 2024 · 7 comments · May be fixed by #2553
Assignees
Labels
effort:small Small effort - 2 days P1 High priority issue

Comments

@LZRS
Copy link
Collaborator

LZRS commented May 17, 2024

Describe the bug
When loading around 7k patients ordered by Patient.Name, the db query takes around 25s
Without sorting the query takes around 35ms

This is when using

fhirEngine.search<Patient> {
        filter(Patient.ACTIVE, { value = of(true) })
        sort(Patient.NAME, Order.ASCENDING)
        count = PaginationConstant.DEFAULT_PAGE_SIZE
        from = currentPage * PaginationConstant.DEFAULT_PAGE_SIZE
      }

that generates SQL query

SELECT a.resourceUuid, a.serializedResource
FROM ResourceEntity a
         LEFT JOIN StringIndexEntity b
                   ON a.resourceType = b.resourceType AND a.resourceUuid = b.resourceUuid AND b.index_name = 'name'
WHERE a.resourceType = 'Patient'
  AND a.resourceUuid IN (SELECT resourceUuid
                         FROM TokenIndexEntity
                         WHERE resourceType = 'Patient'
                           AND index_name = 'active'
                           AND index_value = 'true')
ORDER BY b.index_value ASC
LIMIT 20 OFFSET 0

Tested out that removing the ORDER BY b.index_value ASC reduces the latency of the query to milliseconds

Expected behavior
The usecase of loading resources with sort should be more performant

Additional context
Here's a link to the resources db containing around 7k patients, that can be used for testing
https://drive.google.com/file/d/1TgPw36BFI8BijsUPc1JkQE3l3RXANYzT/view?usp=drive_link

Would you like to work on the issue?
Please state if this issue should be assigned to you or who you think could help to solve this issue.

@LZRS
Copy link
Collaborator Author

LZRS commented May 17, 2024

@aditya-07 @MJ1998 here's the ticket about the issue on performance we're experiencing

@MJ1998
Copy link
Collaborator

MJ1998 commented May 21, 2024

Can you tell me how does it perform without filter of active Patients ?

@MJ1998
Copy link
Collaborator

MJ1998 commented May 21, 2024

One improvement I can think of is removing a.resourceType = b.resourceType clause in JOIN statement as we are filtering resourceType using WHERE clause.
But not sure if this is feasible or even if it's gonna improve the performance

@ndegwamartin
Copy link
Collaborator

It is worth noting that the above latency difference is observable even on non-device sqlite db browsers like SQLiteStudio.

@ndegwamartin
Copy link
Collaborator

Can you tell me how does it perform without filter of active Patients ?

Removing the subquery that performs this filter has no impact on the performance.

@ndegwamartin
Copy link
Collaborator

One improvement I can think of is removing a.resourceType = b.resourceType clause in JOIN statement as we are filtering resourceType using WHERE clause. But not sure if this is feasible or even if it's gonna improve the performance

Removing this does improve the performance as well, executing the query in milliseconds cc LZRS

@MJ1998 MJ1998 self-assigned this May 27, 2024
@MJ1998 MJ1998 added the P1 High priority issue label May 27, 2024
@MJ1998 MJ1998 linked a pull request May 27, 2024 that will close this issue
7 tasks
@aditya-07 aditya-07 added the effort:small Small effort - 2 days label May 30, 2024
@jingtang10
Copy link
Collaborator

thank you very much for catching this @LZRS! very happy for the improvement in @MJ1998 's pr.

just one comment for clarification and potential future work: #2553 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort:small Small effort - 2 days P1 High priority issue
Projects
Status: New
Development

Successfully merging a pull request may close this issue.

5 participants