Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS Batch Operator ignores job definition retry strategy #39607

Closed
2 tasks done
0x26res opened this issue May 14, 2024 · 1 comment
Closed
2 tasks done

AWS Batch Operator ignores job definition retry strategy #39607

0x26res opened this issue May 14, 2024 · 1 comment
Labels
area:providers kind:bug This is a clearly a bug provider:amazon-aws AWS/Amazon - related issues

Comments

@0x26res
Copy link
Contributor

0x26res commented May 14, 2024

Apache Airflow version

2.9.1

If "Other Airflow 2 version" selected, which one?

No response

What happened?

There's been 2 recent changes to the AWS BatchOperator:

They've added the option to specify the retry strategy for the job.

Unfortunately for someone who relies on the default retry strategy, which can be set/customized in the AWS Batch job definition, this value is now ignored

What you think should happen instead?

When the user does not specify a retry_strategy in constructor BatchOperator it should default to None. Then it will automatically fall back to the job definition retry strategy.

It should NOT be set to {"attempt": 1} as it is currently the case.

How to reproduce

  • Create an aws job definition with a retry strategy, for example:
  "retryStrategy": {
    "attempts": 5,
    "evaluateOnExit": [
      {
        "onStatusReason": "Essential container in task exited*",
        "action": "exit"
      },
      {
        "onStatusReason": "*",
        "action": "retry"
      }
    ]
  }
  • submit an aws BatchOperator with no retry_strategy or retry_strategy=None
  • The retry strategy has been overwritten to this:
  "retryStrategy": {
    "attempts": 1,
    "evaluateOnExit": []
  },

Operating System

MWAA

Versions of Apache Airflow Providers

Using https://raw.githubusercontent.com/apache/airflow/constraints-2.8.1/constraints-3.11.txt

So:

  • apache-airflow-providers-amazon==8.16.0
  • airflow=2.8.1

But it is still happening with the latest version as far as I can tell:

self.retry_strategy = retry_strategy or {}

Deployment

Official Apache Airflow Helm Chart

Deployment details

MWAA

Anything else?

Here's a quick workaround that should work even with older version of the BatchOperator that don't have retry_strategy:

    if hasattr(operator, "retry_strategy"):
        operator.retry_strategy = None

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@0x26res 0x26res added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels May 14, 2024
@RNHTTR RNHTTR added provider:amazon-aws AWS/Amazon - related issues area:providers and removed area:core needs-triage label for new issues that we didn't triage yet labels May 14, 2024
@0x26res
Copy link
Contributor Author

0x26res commented May 16, 2024

Fixed with #39608

@0x26res 0x26res closed this as completed May 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:providers kind:bug This is a clearly a bug provider:amazon-aws AWS/Amazon - related issues
Projects
None yet
Development

No branches or pull requests

2 participants