Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch reprepare speculatively #531

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

wprzytula
Copy link
Collaborator

@wprzytula wprzytula commented Aug 23, 2022

Extracted repreparing as a method on Connection to avoid duplication. Also, after one statement in a batch is found to be unprepared, we speculatively prepare all prepared statements in the batch, as there is likelihood that once one of them becomes unprepared, the others do too.
Fixes: #529

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I added appropriate Fixes: annotations to PR description.

As reprepare logic was duplicated (in prepared statements and in
batches), I've extracted it as a separate function.
@wprzytula
Copy link
Collaborator Author

I've noticed I perform all prepares sequentially, effectively contradicting the purpose of this PR. Will change it into concurrent reprepare.

@wprzytula wprzytula force-pushed the batch-reprepare-speculatively branch from 6597b1c to 0041016 Compare August 23, 2022 15:47
@wprzytula
Copy link
Collaborator Author

v2: made reprepares concurrent.

Before, each unprepared statement in a batch caused a full roundtrip
to Scylla. We instead speculatively prepare them all, as there is strong
likelihood that once one of them becomes unprepared, the others
do too. This change may significantly reduce latency in such case.
@psarna
Copy link
Contributor

psarna commented Aug 25, 2022

On the other hand, this is pessimizing for situations where the number of invalidated statements is orders of magnitude smaller than the number of statements in the batch.

We currently implement a variant of LRU with two pools (ref: scylladb/scylladb@1a9c6d9fd3) and this speculative preparation may interact badly with the algorithm by pushing other prepared statements from the first, probationary buffer, even though lots of them might have already been prepared.

So, in order to judge how we should proceed with this, here's question 1: was this pull request based on an existing issue that somebody (or our test case) discovered, or was it a result of code inspection? If it's not based on a real issue, we at least need a couple of tests to prove that it makes sense, comparing what happens when a minority and majority of batch statements are not prepared, respectively.

@psarna
Copy link
Contributor

psarna commented Aug 25, 2022

(btw, the commit which deduplicates repreparing logic is good on its own anyway, so we can definitely apply that one regardless of the result of this discussion)

_ => None,
})
.map(|prepared_statement| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probable that a single batch will have multiple instances of the same prepared query, only with different arguments - for example imagine 100 inserts to the same table. It would be good to deduplicate them before repreparing.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you mean deduplicating by prepared_id, am I right?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup

@wprzytula
Copy link
Collaborator Author

On the other hand, this is pessimizing for situations where the number of invalidated statements is orders of magnitude smaller than the number of statements in the batch.

We currently implement a variant of LRU with two pools (ref: scylladb/scylladb@1a9c6d9fd3) and this speculative preparation may interact badly with the algorithm by pushing other prepared statements from the first, probationary buffer, even though lots of them might have already been prepared.

So, in order to judge how we should proceed with this, here's question 1: was this pull request based on an existing issue that somebody (or our test case) discovered, or was it a result of code inspection? If it's not based on a real issue, we at least need a couple of tests to prove that it makes sense, comparing what happens when a minority and majority of batch statements are not prepared, respectively.

It was based solely on code inspection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reprepare all prepared statements in batch speculatively after one is found to be Unprepared
3 participants