release-24.1: streamingccl: mark cutback retention jobs as successful #124055

Merged
merged 2 commits into from
May 26, 2024

Conversation

blathers-crl[bot]

@blathers-crl blathers-crl bot commented May 13, 2024

Backport 1/1 commits from #123934 on behalf of @dt.

/cc @cockroachdb/release


Previously we started creating a stream producer job in the destination cluster when we completed replication cutover, to preserve the history as of that cutover time in case another cluster would subsequently want to start replicating as of that time, e.g. reversing the direction of replication, or in case the promoted cluster would want to revert to the cutover time as part of a demotion back to a standby.

However, this placeholder job is, by design, never actually used by replication -- it exists only to keep the option open for some other replication job to be started -- and thus is never heartbeated or marked as no longer needed due to successful completion of replication, causing it to be marked as FAILED when it expires.

This changes the initial status so that it is created already indicating that replication succeeded. Thus when it expires, it is marked as successful instead of failed, avoiding the spurious 'failures' that one observes in the job system surfaces.

Release note (enterprise change): History Retention jobs created at the completion of cluster replication no longer erroneously indicate they failed when they expire.

Epic: none.


Release justification:

@blathers-crl blathers-crl bot requested a review from a team as a code owner May 13, 2024 17:17
@blathers-crl blathers-crl bot force-pushed the blathers/backport-release-24.1-123934 branch from 02f6a24 to c6db70f Compare May 13, 2024 17:17
@blathers-crl blathers-crl bot added blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. labels May 13, 2024
@blathers-crl blathers-crl bot requested review from msbutler and removed request for a team May 13, 2024 17:17
@blathers-crl blathers-crl bot requested a review from stevendanna May 13, 2024 17:17
Author

blathers-crl bot commented May 13, 2024

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Backports should only be created for serious issues or test-only changes.
  • Backports should not break backwards-compatibility.
  • Backports should change as little code as possible.
  • Backports should not change on-disk formats or node communication protocols.
  • Backports should not add new functionality (except as defined here).
  • Backports must not add, edit, or otherwise modify cluster versions; or add version gates.
  • All backports must be reviewed by the owning area's TL and one additional TL. For more information as to how that review should be conducted, please consult the backport policy.
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters. State changes must be further protected such that nodes running old binaries will not be negatively impacted by the new state (with a mixed version test added).
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.
  • Your backport must be accompanied by a post to the appropriate Slack channel (#db-backports-point-releases or #db-backports-XX-X-release) for awareness and discussion.

Also, please add a brief release justification to the body of your PR to justify this backport.

@blathers-crl blathers-crl bot added the backport Label PR's that are backports to older release branches label May 13, 2024
@cockroach-teamcity
Member

This change is Reviewable

@dt dt requested a review from jbowens May 13, 2024 17:36
@msbutler
Collaborator

@dt could you add the following patch to this pr? #124162

As of #123934, the producer job succeeds instead of failing. This patch teaches
some test infra about this.

Fixes #124139
Fixes #124138
Fixes #124151
Fixes #124137

Release note: none
@dt dt merged commit f199555 into release-24.1 May 26, 2024
18 of 20 checks passed
@dt dt deleted the blathers/backport-release-24.1-123934 branch May 26, 2024 11:20
Labels
backport Label PR's that are backports to older release branches blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot.

4 participants