Auto-materialization of non-existing partitions #21812

Open
Ludaestro opened this issue May 13, 2024 · 0 comments
Labels
area: auto-materialize (Related to Auto Materialization), type: bug (Something isn't working)

Ludaestro commented May 13, 2024

Dagster version

1.7.3

What's the issue?

I had an asset with a weekly partitions definition whose start date was 1900-01-01. All of its partitions back to 1900-01-01 had been backfilled or marked as backfilled. When I changed the asset to use daily partitions with a start date of 2024-04-20 and enabled auto-materialization (it had been enabled previously as well), Dagster started requesting materializations of partitions before 2024-04-20. I had also set "all_partitions" to "False" in my cron schedule, but it still seems to try to materialize all missing partitions.

I was forced to disable the rule "materialize_on_missing" to prevent it from doing that.

I also got the following GraphQL error when opening the "Automation" page of that asset:

Operation name: GetEvaluationsQuery

Message: 'NoneType' object has no attribute 'start'

Path: ["assetConditionEvaluationRecordsOrError"]

Locations: [{"line":19,"column":3}]

Stack Trace:
  File "/usr/local/lib/python3.10/site-packages/graphql/execution/execute.py", line 521, in execute_field
    result = resolve_fn(source, info, **args)
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/schema/roots/query.py", line 1157, in resolve_assetConditionEvaluationRecordsOrError
    return fetch_asset_condition_evaluation_records_for_asset_key(
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/implementation/fetch_asset_condition_evaluations.py", line 109, in fetch_asset_condition_evaluation_records_for_asset_key
    return _get_graphene_records_from_evaluations(
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/implementation/fetch_asset_condition_evaluations.py", line 58, in _get_graphene_records_from_evaluations
    records=[
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/implementation/fetch_asset_condition_evaluations.py", line 59, in <listcomp>
    GrapheneAssetConditionEvaluationRecord(
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/schema/asset_condition_evaluations.py", line 307, in __init__
    evaluation=GrapheneAssetConditionEvaluation(
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/schema/asset_condition_evaluations.py", line 253, in __init__
    evaluationNodes = [
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/schema/asset_condition_evaluations.py", line 254, in <listcomp>
    GraphenePartitionedAssetConditionEvaluationNode(evaluation, partitions_def)
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/schema/asset_condition_evaluations.py", line 159, in __init__
    trueSubset=GrapheneAssetSubset(evaluation.true_subset),
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/schema/asset_condition_evaluations.py", line 75, in __init__
    subsetValue=GrapheneAssetSubsetValue(asset_subset.subset_value),
  File "/usr/local/lib/python3.10/site-packages/dagster_graphql/schema/asset_condition_evaluations.py", line 49, in __init__
    for start, end in value.get_partition_key_ranges(value.partitions_def)
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/definitions/time_window_partitions.py", line 1731, in get_partition_key_ranges
    return [
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/definitions/time_window_partitions.py", line 1734, in <listcomp>
    ).get_partition_key_range_for_time_window(window)
  File "/usr/local/lib/python3.10/site-packages/dagster/_core/definitions/time_window_partitions.py", line 745, in get_partition_key_range_for_time_window
    cast(TimeWindow, self.get_prev_partition_window(time_window.end)).start.timestamp()
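The last frame calls `.start` on the result of `get_prev_partition_window(...)` after a `cast`, which does not check for `None`. A minimal sketch of that failure mode follows. It assumes the lookup returns `None` when the requested time falls before the first partition of the new daily definition. `Window` and `prev_daily_window` are hypothetical stand-ins written for illustration, not Dagster code.

```python
from datetime import date, timedelta
from typing import NamedTuple, Optional


class Window(NamedTuple):
    """Hypothetical stand-in for a partition time window."""
    start: date
    end: date


def prev_daily_window(partitions_start: date, current: date) -> Optional[Window]:
    """Return the daily window ending at `current`, or None if that
    window would start before the partitions definition begins."""
    start = current - timedelta(days=1)
    if start < partitions_start:
        return None  # no partition exists this far back
    return Window(start, current)


new_start = date(2024, 4, 20)  # start date of the new daily definition

# A time inside the new definition resolves to a window.
inside = prev_daily_window(new_start, date(2024, 5, 13))

# A time taken from an old weekly window (assumed here) does not.
outside = prev_daily_window(new_start, date(1900, 1, 8))

try:
    outside.start  # same access pattern as the last frame of the trace
except AttributeError as exc:
    error_message = str(exc)
```

If this reading is right, any stored evaluation that still references time windows before 2024-04-20 would trigger the error whenever the page tries to render it.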

This might be related to this GitHub issue: Auto-materialization daemon re-runs all partitions if time window is extended to the past

What did you expect to happen?

I expected it to materialize only the latest partition rather than try to catch up, and not to materialize partitions that don't exist.

How to reproduce?

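  • Enable auto-materialization on an asset.
  • Give the asset a weekly partitions definition (start date 1900-01-01 in my case) and let it materialize.
  • Change the asset to a daily partitions definition with a later start date (2024-04-20 in my case), so that the new partition time window no longer includes all of the earlier materializations.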
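The mismatch these steps create can be sketched with plain date arithmetic. This is only an illustration that assumes partition keys are ISO dates of each window's start. `partition_keys` is a hypothetical helper, not a Dagster API, and the "today" value is taken from the date this issue was opened.

```python
from datetime import date, timedelta
from typing import List


def partition_keys(start: date, end: date, step_days: int) -> List[str]:
    """List the start dates of all complete windows between start and end."""
    keys = []
    current = start
    step = timedelta(days=step_days)
    while current + step <= end:
        keys.append(current.isoformat())
        current += step
    return keys


today = date(2024, 5, 13)

# Keys materialized under the old weekly definition.
old_weekly = partition_keys(date(1900, 1, 1), today, step_days=7)

# Keys that exist under the new daily definition.
new_daily = set(partition_keys(date(2024, 4, 20), today, step_days=1))

# Old keys with no counterpart in the new definition. These are the
# partitions that should not be requested after the change.
stale = [key for key in old_weekly if key not in new_daily]
```

Every key in `stale` predates 2024-04-20, which is the range in which the unexpected materialization requests appeared.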

Deployment type

Dagster Helm chart

Deployment details

No response

Additional information

No response

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

Ludaestro added the "type: bug" label May 13, 2024
Ludaestro changed the title from "Auto-materialisation of non-existing partitions" to "Auto-materialization of non-existing partitions" May 13, 2024
garethbrickman added the "area: auto-materialize" label May 13, 2024