Change initializing behaviour #3584

Open
wants to merge 2 commits into
base: dev

AlexandruCihodaru
Contributor

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • [] Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

@generall
Member

Hey @AlexandruCihodaru, unfortunately we can't remove Initializing right away: it would create problems during a cluster version upgrade, when some nodes run the latest version while others don't.

Here is a step-by-step plan of how we should proceed: #3450

@AlexandruCihodaru
Contributor Author

> Hey @AlexandruCihodaru, unfortunately we can't remove Initializing right away: it would create problems during a cluster version upgrade, when some nodes run the latest version while others don't.
>
> Here is a step-by-step plan of how we should proceed: #3450

Hello @generall, I have seen that issue and was planning to follow that list. In this first PR I attempted to address the first bullet:
"Make Initializing behave as Active in all situations"

@generall
Member

> In this first PR I attempted to address the first bullet "Make Initializing behave as Active in all situations"

Oh, ok. Let me have another look then

@@ -650,8 +650,10 @@ impl ShardReplicaSet {
                     .await?;
             }
             ReplicaState::Initializing => {
-                self.set_local(local_shard, Some(ReplicaState::Initializing))
+                // Same as `Active`, we report a failure to consensus
+                self.set_local(local_shard, Some(ReplicaState::Active))
Member

I don't think we can set the local state to Active here, even if Active and Initializing will behave the same.

The general rule here is that we only change the state of a shard if there is a consensus operation saying to do so. Otherwise there is a risk of shard states going out of sync between nodes.

Consider the following scenario:

  • Node A, running the newest version, receives a snapshot with ReplicaState::Initializing and changes it to Active.
  • Node B, running an older version, still thinks that Node A has the shard in the Initializing state.
  • Since Node A has a local state of Active, it will never trigger consensus to change the state.
  • Node B will consider Node A stuck in the Initializing state indefinitely, even after an update to a newer version.

Luckily, it should not be a big deal to keep the Initializing state, as we have a procedure to convert Initializing into Active if it happens to be a local one.
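
To make the scenario above concrete, here is a minimal Rust sketch of two nodes that each keep their own view of a replica's state. It is only an illustration under simplifying assumptions: `NodeView`, `apply_consensus`, `set_local_only` and `simulate` are hypothetical names, not Qdrant's API, and consensus is modeled as simply applying the same change on every node.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ReplicaState {
    Initializing,
    Active,
}

/// One node's view of the replica states of a single shard, keyed by peer name.
/// (Hypothetical type, for illustration only.)
#[derive(Default)]
struct NodeView {
    replicas: HashMap<&'static str, ReplicaState>,
}

impl NodeView {
    /// Applies a state change that was committed through consensus.
    fn apply_consensus(&mut self, peer: &'static str, state: ReplicaState) {
        self.replicas.insert(peer, state);
    }

    /// The problematic shortcut: a local-only change that no other node sees.
    fn set_local_only(&mut self, peer: &'static str, state: ReplicaState) {
        self.replicas.insert(peer, state);
    }
}

/// Runs the scenario and returns (Node A's view of A, Node B's view of A).
fn simulate(local_shortcut: bool) -> (ReplicaState, ReplicaState) {
    let mut node_a = NodeView::default();
    let mut node_b = NodeView::default();

    // Both nodes learn that A's replica is Initializing.
    node_a.apply_consensus("A", ReplicaState::Initializing);
    node_b.apply_consensus("A", ReplicaState::Initializing);

    if local_shortcut {
        // Node A flips its own state without going through consensus.
        node_a.set_local_only("A", ReplicaState::Active);
    } else {
        // The change is committed, so every node applies it.
        for node in [&mut node_a, &mut node_b] {
            node.apply_consensus("A", ReplicaState::Active);
        }
    }
    (node_a.replicas["A"], node_b.replicas["A"])
}

fn main() {
    println!("local shortcut: {:?}", simulate(true));
    println!("via consensus:  {:?}", simulate(false));
}
```

With the local shortcut the two views diverge, and nothing in the model ever brings Node B's view back in line. When the change goes through the shared path, both views agree.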

Contributor Author

I see, thank you for your explanation. I will revert this commit, since there won't be a case where a peer failure needs to be reported.

Considering that Initializing is going to be deprecated, we should skip
shards that are in Initializing state when looking for unhealthy shards.

Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com>

After the changes in qdrant#3318 replicas should not be in Initializing state
anymore, hence we should try to deactivate replicas that are now in the
Initializing state if the error is not transient.

Signed-off-by: Alexandru Cihodaru <alexandru.cihodaru@gmail.com>
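
The intent of the two commit messages above can be sketched roughly as follows. This is not the code from the commits: the reduced `ReplicaState` enum, the helpers `is_unhealthy` and `should_deactivate`, and the choice to count only `Dead` replicas as unhealthy are all assumptions made for illustration.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ReplicaState {
    Initializing,
    Active,
    Dead,
}

/// Assumption: only `Dead` replicas count as unhealthy in this sketch.
/// `Initializing` is skipped, in the same way as `Active`.
fn is_unhealthy(state: ReplicaState) -> bool {
    match state {
        ReplicaState::Active | ReplicaState::Initializing => false,
        ReplicaState::Dead => true,
    }
}

/// Returns the peer ids of replicas considered unhealthy.
fn unhealthy_replicas(replicas: &[(u64, ReplicaState)]) -> Vec<u64> {
    replicas
        .iter()
        .filter(|(_, state)| is_unhealthy(*state))
        .map(|(peer, _)| *peer)
        .collect()
}

/// Assumption: after a failed operation, a replica is a candidate for
/// deactivation only if the error is not transient, and `Initializing`
/// replicas are treated like `Active` ones.
fn should_deactivate(state: ReplicaState, error_is_transient: bool) -> bool {
    !error_is_transient
        && matches!(state, ReplicaState::Active | ReplicaState::Initializing)
}

fn main() {
    let replicas = [
        (1, ReplicaState::Active),
        (2, ReplicaState::Initializing),
        (3, ReplicaState::Dead),
    ];
    println!("unhealthy peers: {:?}", unhealthy_replicas(&replicas));
    println!(
        "deactivate Initializing on a permanent error: {}",
        should_deactivate(ReplicaState::Initializing, false)
    );
}
```

In both helpers `Initializing` is handled exactly like `Active`, which matches the first step of the plan quoted earlier in the conversation.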
2 participants