[core][experimental] Handle NCCL errors in accelerated DAGs #45307
Labels
accelerated-dag
enhancement
Request for new feature and/or capability
P1
Issue that should be fixed within a few weeks
usability
Milestone
Description
Handle:
Ideally, actors participating in the DAG should still be usable after the error is thrown.
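
A minimal plain-Python sketch of the behavior requested above, assuming nothing about the eventual implementation. It makes no Ray or NCCL calls. `Worker`, `CollectiveError`, and `run_dag` are hypothetical names introduced only for illustration. The failure is simulated with a flag.

```python
class CollectiveError(RuntimeError):
    """Hypothetical stand-in for an error surfaced from a failed NCCL operation."""


class Worker:
    """Hypothetical stand-in for an actor participating in the DAG."""

    def collective_step(self, value, fail=False):
        # Simulate a communication failure in the middle of a DAG execution.
        if fail:
            raise CollectiveError("simulated communication failure")
        return value * 2

    def ping(self):
        # An ordinary method call made outside the DAG.
        return "ok"


def run_dag(worker, value, fail=False):
    """Run one step and report a failure to the caller instead of hanging."""
    try:
        return ("ok", worker.collective_step(value, fail=fail))
    except CollectiveError as exc:
        return ("error", str(exc))


worker = Worker()
failed = run_dag(worker, 1, fail=True)  # the error reaches the caller
after = worker.ping()                   # the worker still answers afterwards
retry = run_dag(worker, 21)             # and can run later work
```

The point of the sketch is the last three lines: the error is reported to the caller, and the same worker object keeps serving calls rather than having to be torn down.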
Use case
No response