release-24.1: streamingccl: mark cutback retention jobs as successful #124055
Merged
blathers-crl (bot) force-pushed the blathers/backport-release-24.1-123934 branch from 02f6a24 to c6db70f on May 13, 2024 17:17
blathers-crl (bot) added the blathers-backport (This is a backport that Blathers created automatically.) and O-robot (Originated from a bot.) labels on May 13, 2024
Thanks for opening a backport. Please check the backport criteria before merging:
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
Also, please add a brief release justification to the body of your PR to justify this |
blathers-crl (bot) added the backport (Label PRs that are backports to older release branches) label on May 13, 2024
jbowens approved these changes on May 13, 2024
Backport 1/1 commits from #123934 on behalf of @dt.
/cc @cockroachdb/release
Previously, we started creating a stream producer job in the destination cluster when replication cutover completed. The job preserves history as of the cutover time in case another cluster subsequently wants to start replicating as of that time (e.g. to reverse the direction of replication), or in case the promoted cluster wants to revert to the cutover time as part of a demotion back to a standby.
However, this placeholder job is, by design, never actually used by replication: it exists only to keep the option open for some other replication job to be started. It is therefore never heartbeated or marked as no longer needed due to successful completion of replication, which causes it to be marked as FAILED when it expires.
This change sets the initial status so that the job is created already indicating that replication succeeded. When it expires, it is therefore marked as successful instead of failed, avoiding the spurious 'failures' that one observes in the job system surfaces.
Release note (enterprise change): History Retention jobs created at the completion of cluster replication no longer erroneously indicate that they failed when they expire.
Epic: none.
Release justification: