Releases: Netflix/metaflow
2.2.8 (Mar 15th, 2021)
Metaflow 2.2.8 Release Notes
The Metaflow 2.2.8 release is a minor patch release.
Bugs
Fix @environment behavior for conflicting attribute values
Metaflow was incorrectly handling environment variables passed through the @environment decorator in some specific instances. When the @environment decorator is specified over multiple steps, the actual environment that's available to any step is the union of the attributes of all the @environment decorators, which is incorrect behavior. For example, in the following workflow:
from metaflow import FlowSpec, step, batch, environment
import os

class LinearFlow(FlowSpec):

    @environment(vars={'var': os.getenv('var_1')})
    @step
    def start(self):
        print(os.getenv('var'))
        self.next(self.a)

    @environment(vars={'var': os.getenv('var_2')})
    @step
    def a(self):
        print(os.getenv('var'))
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == '__main__':
    LinearFlow()
var_1=foo var_2=bar python flow.py run
will result in
Metaflow 2.2.7.post10+gitb7d4c48 executing LinearFlow for user:savin
Validating your flow...
The graph looks good!
Running pylint...
Pylint is happy!
2021-03-12 20:46:04.161 Workflow starting (run-id 6810):
2021-03-12 20:46:04.614 [6810/start/86638 (pid 10997)] Task is starting.
2021-03-12 20:46:06.783 [6810/start/86638 (pid 10997)] foo
2021-03-12 20:46:07.815 [6810/start/86638 (pid 10997)] Task finished successfully.
2021-03-12 20:46:08.390 [6810/a/86639 (pid 11003)] Task is starting.
2021-03-12 20:46:10.649 [6810/a/86639 (pid 11003)] foo
2021-03-12 20:46:11.550 [6810/a/86639 (pid 11003)] Task finished successfully.
2021-03-12 20:46:12.145 [6810/end/86640 (pid 11009)] Task is starting.
2021-03-12 20:46:15.382 [6810/end/86640 (pid 11009)] Task finished successfully.
2021-03-12 20:46:15.563 Done!
Note the output for step a, which should have been bar. PR #452 fixes the issue.
Fix environment is not callable error when using @environment
Using @environment would often result in an error from pylint: E1102: environment is not callable (not-callable). Users were getting around this issue by launching their flows with --no-pylint. PR #451 fixes this issue.
2.2.7 (Feb 8th, 2021)
Metaflow 2.2.7 Release Notes
The Metaflow 2.2.7 release is a minor patch release.
Bugs
Handle for-eaches properly for AWS Step Functions workflows running on AWS Fargate
Workflows orchestrated by AWS Step Functions were failing to properly execute for-each steps on AWS Fargate. The culprit was a lack of access to instance metadata on ECS. Metaflow instantiates a connection to Amazon DynamoDB to keep track of for-each cardinality. This connection requires knowledge of the region that the job executes in, which is made available via instance metadata on EC2 but unfortunately is not available on ECS (for AWS Fargate). This fix introduces the necessary checks for inferring the region correctly for tasks executing on AWS Fargate. Note that after the recent changes to Amazon S3's consistency model, the Amazon DynamoDB dependency is no longer needed and will be done away with in a subsequent release. PR: #436
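The region check described above could look roughly like the following minimal sketch. It assumes that tasks on AWS Fargate see the region through the AWS_REGION (or AWS_DEFAULT_REGION) environment variable. The function name infer_region and its fallback parameter are hypothetical illustrations, not Metaflow's actual API.

```python
import os


def infer_region(environ=None, fallback=None):
    """Return the AWS region for the current task.

    Assumption: on AWS Fargate the region is exposed through the AWS_REGION
    (or AWS_DEFAULT_REGION) environment variable. `fallback` is a hypothetical
    zero-argument callable (for example an EC2 instance-metadata lookup) that
    is used only when neither variable is set.
    """
    environ = os.environ if environ is None else environ
    for name in ("AWS_REGION", "AWS_DEFAULT_REGION"):
        region = environ.get(name)
        if region:
            return region
    if fallback is not None:
        return fallback()
    raise RuntimeError("Unable to infer the AWS region for this task")
```

Checking the environment first means the instance-metadata lookup is only attempted where it is expected to exist.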
2.2.6 (Jan 26th, 2021)
Metaflow 2.2.6 Release Notes
The Metaflow 2.2.6 release is a minor patch release.
Features
Support AWS Fargate as compute backend for Metaflow tasks launched on AWS Batch
At AWS re:Invent 2020, AWS announced support for AWS Fargate as a compute backend (in addition to EC2) for AWS Batch. With this feature, Metaflow users can now submit their Metaflow jobs to AWS Batch Job Queues which are connected to AWS Fargate Compute Environments as well. By setting the environment variable METAFLOW_ECS_FARGATE_EXECUTION_ROLE, users can configure the ecsTaskExecutionRole for the AWS Batch container and AWS Fargate agent. PR: #402
Support shared_memory, max_swap, swappiness attributes for Metaflow tasks launched on AWS Batch
The @batch decorator now supports shared_memory, max_swap, and swappiness attributes for Metaflow tasks launched on AWS Batch to provide a greater degree of control over memory management. PR: #408
Support wider very-wide workflows on top of AWS Step Functions
The metaflow_version: and runtime: tags are now available for all packaged executions and remote executions as well. This ensures that every run logged by Metaflow will have metaflow_version and runtime system tags available. PR: #403
Bug Fixes
Assign tags to Run objects generated through AWS Step Functions executions
Run objects generated by flows executed on top of AWS Step Functions were missing the tags assigned to the flow, even though the tags were correctly persisted to tasks. This release fixes the issue and brings the tagging behavior in line with that of local flow executions. PR: #386
Pipe all workflow set-up logs to stderr
Execution set-up logs for @conda and IncludeFile were being piped to stdout, which made manipulating the output of commands like python flow.py step-functions create --only-json a bit difficult. This release moves the workflow set-up logs to stderr. PR: #379
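The separation of streams can be illustrated with a minimal sketch: progress messages go to stderr, and only the machine-readable payload goes to stdout. The names log and emit_definition and the message texts are hypothetical and not taken from Metaflow's code.

```python
import json
import sys


def log(message, stream=None):
    """Write a human-readable set-up message to stderr, never to stdout."""
    print(message, file=stream if stream is not None else sys.stderr)


def emit_definition(definition, out=None, err=None):
    """Send progress messages to stderr and only the JSON payload to stdout."""
    out = sys.stdout if out is None else out
    log("Bootstrapping environment...", err)
    log("Including file...", err)
    out.write(json.dumps(definition) + "\n")
```

With this layout, redirecting stdout to a file or piping it into a JSON parser yields only the JSON document, while the set-up messages remain visible on the terminal.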
Handle null assignment to IncludeFile properly
A workflow executed without a required IncludeFile parameter would fail when the parameter was referenced inside the flow. This release fixes the issue by assigning a null value to the parameter in such cases. PR: #421
2.2.5 (Nov 11th, 2020)
Metaflow 2.2.5 Release Notes
The Metaflow 2.2.5 release is a minor patch release.
- Features
- Log metaflow_version: and runtime: tag for all executions
- Bug Fixes
- Handle inconsistently cased file system issue when creating @conda environments on macOS for linux-64
Features
Log metaflow_version: and runtime: tag for all executions
The metaflow_version: and runtime: tags are now available for all packaged executions and remote executions as well. This ensures that every run logged by Metaflow will have metaflow_version and runtime system tags available. PR: #376, #375
Bug Fixes
Handle inconsistently cased file system issue when creating @conda environments on macOS for linux-64
Conda fails to correctly set up environments for linux-64 packages on macOS at times due to inconsistently cased filesystems. Environment creation is needed to collect the necessary metadata for correctly setting up the conda environment on AWS Batch. This fix simply ignores the error-checks that conda throws while setting up the environments on macOS when the intended destination is AWS Batch. PR: #377
2.2.4 (Oct 28th, 2020)
Metaflow 2.2.4 Release Notes
The Metaflow 2.2.4 release is a minor patch release.
- Features
- Metaflow is now compliant with AWS GovCloud & AWS CN regions
- Bug Fixes
- Address a bug with overriding the default value for IncludeFile
- Port AWS region check for AWS DynamoDb from curl to requests
Features
Metaflow is now compliant with AWS GovCloud & AWS CN regions
AWS GovCloud & AWS CN users can now enjoy all the features of Metaflow within their region partition with no change on their end. PR: #364
Bug Fixes
Address a bug with overriding the default value for IncludeFile
Metaflow v2.1.0 introduced a bug in IncludeFile functionality which prevented users from overriding the default value specified. PR: #346
Port AWS region check for AWS DynamoDb from curl to requests
Metaflow's AWS Step Functions integration relies on AWS DynamoDb to manage foreach constructs. Metaflow was leveraging curl at runtime to detect the region for AWS DynamoDb. Some Docker images don't have curl installed by default; moving to requests (a Metaflow dependency) fixes the issue. PR: #343
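A region lookup built on requests rather than curl might look like the sketch below. It assumes the EC2 instance metadata service is reachable without a session token (IMDSv1). It also assumes a standard availability zone name, that is, the region followed by a single letter. Both function names are hypothetical and do not come from Metaflow's code.

```python
def region_from_availability_zone(zone):
    """Derive a region from an availability zone, e.g. us-west-2a -> us-west-2.

    Assumption: the zone is a standard one whose name is the region name
    followed by a single letter.
    """
    zone = zone.strip()
    return zone[:-1]


def fetch_region(timeout=5):
    """Ask the EC2 instance metadata service for the zone and derive the region."""
    # Third-party dependency, imported lazily so the parser above stays standalone.
    import requests

    url = "http://169.254.169.254/latest/meta-data/placement/availability-zone"
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return region_from_availability_zone(response.text)
```

Keeping the parsing separate from the HTTP call makes the string handling easy to check without network access.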
2.2.3 (Sep 8th, 2020)
Metaflow 2.2.3 Release Notes
The Metaflow 2.2.3 release is a minor patch release.
- Bug Fixes
- Fix #305 : Default 'help' for parameters was not handled properly
- Pin the conda library versions for metaflow default dependencies based on the Python version
- Add conda bin path to the PATH environment variable during Metaflow step execution
- Fix a typo in metaflow/debug.py
Bug Fixes
Fix #305 : Default 'help' for parameters was not handled properly
Fixes the issue where the default help for parameters was not handled properly (#305): a flow would fail because the default value of IncludeFile's help argument is None. PR: #318
Pin the conda library versions for metaflow default dependencies based on the Python version.
The previously pinned library versions do not work with Python 3.8. There are now two sets of version combinations, which should work for Python 2.7, 3.5, 3.6, 3.7, and 3.8. PR: #308
Add conda bin path to the PATH environment variable during Metaflow step execution
Previously, executables installed in the conda environment were not visible inside Metaflow steps. This release fixes the issue by appending the conda bin path to the PATH environment variable. PR: #307
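Appending a directory to PATH can be sketched as follows. The helper name append_to_path is hypothetical, and Metaflow's own implementation may differ; this only illustrates the idea of making a conda environment's bin directory visible to a step.

```python
import os


def append_to_path(bin_dir, environ=None):
    """Append bin_dir to PATH (if not already present) and return the new value."""
    environ = os.environ if environ is None else environ
    # Drop empty entries so an unset or empty PATH does not produce a stray separator.
    entries = [p for p in environ.get("PATH", "").split(os.pathsep) if p]
    if bin_dir not in entries:
        entries.append(bin_dir)
    environ["PATH"] = os.pathsep.join(entries)
    return environ["PATH"]
```

Because the directory is appended rather than prepended, executables already on PATH keep precedence over those from the conda environment.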
Fix a typo in metaflow/debug.py
A typo fix. PR: #304
2.2.2 (Aug 20th, 2020)
Metaflow 2.2.2 Release Notes
The Metaflow 2.2.2 release is a minor patch release.
- Bug Fixes
- Fix a regression introduced in 2.2.1 related to Conda environments
- Clarify Pandas requirements for Tutorial Episode 04
- Fix an issue with the metadata service
Bug Fixes
Fix a regression with Conda
Metaflow 2.2.1 included a commit which was merged too early and broke the use of Conda. This release reverses this patch.
Clarify Pandas version needed for Episode 04
Recent versions of Pandas are not backward compatible with the one used in the tutorial; a small comment was added to warn of this fact.
Fix an issue with the metadata service
In some cases, the metadata service would not properly create runs or tasks.
2.2.1 (Aug 17th, 2020)
Metaflow 2.2.1 Release Notes
The Metaflow 2.2.1 release is a minor patch release.
- Features
- Add include parameter to merge_artifacts.
- Bug Fixes
- Fix a regression introduced in 2.1 related to S3 datatools
- Fix an issue where Conda execution would fail if the Conda environment was not writeable
- Fix the behavior of uploading artifacts to the S3 datastore in case of retries
Features
Add include parameter for merge_artifacts
You can now explicitly specify the artifacts to be merged by the merge_artifacts method, as opposed to just specifying the ones that should not be merged.
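The difference between listing what to merge and listing what to skip can be illustrated with a simplified plain-Python model. This is not Metaflow's implementation: the function merge_artifacts_model is hypothetical, branches are modeled as plain dictionaries, and the handling of conflicts and of combining include with exclude is an assumption of this sketch.

```python
def merge_artifacts_model(branches, include=None, exclude=None):
    """Simplified model of merging artifacts from parallel branches at a join.

    `branches` is a list of dicts mapping artifact names to values. When
    `include` is given, only the listed names are considered; otherwise every
    name not listed in `exclude` is considered. A considered name whose value
    differs between branches is reported as a conflict.
    """
    include = list(include or [])
    exclude = list(exclude or [])
    if include and exclude:
        # Assumption of this sketch: the two selection modes cannot be combined.
        raise ValueError("include and exclude are mutually exclusive")

    merged, conflicts = {}, set()
    for artifacts in branches:
        for name, value in artifacts.items():
            if include and name not in include:
                continue
            if name in exclude:
                continue
            if name in merged and merged[name] != value:
                conflicts.add(name)
            merged.setdefault(name, value)
    if conflicts:
        raise ValueError("conflicting artifacts: %s" % sorted(conflicts))
    return merged
```

In this model, include=["x", "y"] selects exactly those two artifacts, whereas exclude=["tmp"] selects everything except tmp.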
Bug Fixes
Fix a regression with datatools
Fixes the regression described in #285.
Fix an issue with Conda in certain environments
In some cases, Conda is installed system wide and the user cannot write to its installation directory. This was causing issues when trying to use the Conda environment. Fixes #179.
Fix an issue with the S3 datastore in case of retries
Retries were not properly handled when uploading artifacts to the S3 datastore. This fix addresses this issue.
2.2.0 (Aug 4th, 2020)
Metaflow 2.2.0 Release Notes
The Metaflow 2.2.0 release is a minor release and introduces Metaflow's support for R lang.
- Features
- Support for R lang.
Features
Support for R lang.
This release provides an idiomatic API to access Metaflow in R lang. It piggybacks on the Pythonic implementation as the backend providing most of the functionality previously accessible to the Python community. With this release, R users can structure their code as a metaflow flow. Metaflow will snapshot the code, data, and dependencies automatically in a content-addressed datastore allowing for resuming of workflows, reproducing past results, and inspecting anything about the workflow e.g. in a notebook or RStudio IDE. Additionally, without any changes to their workflows, users can now execute code on AWS Batch and interact with Amazon S3 seamlessly.
2.1.1 (Jul 30th, 2020)
Metaflow 2.1.1 Release Notes
The Metaflow 2.1.1 release is a minor patch release.
- Bug Fixes
- Handle race condition for /step endpoint of metadata service.
Bug Fixes
Handle race condition for /step endpoint of metadata service.
The foreach step in AWS Step Functions launches multiple AWS Batch tasks, each of which tries to register the step metadata if it doesn't already exist. This can result in a race condition and cause the task to fail. This patch properly handles the 409 response from the service.
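One way to tolerate such a race is to treat a 409 (Conflict) response as "someone else already registered this step" instead of as a failure. The following is a minimal sketch under that assumption. The function register_step and the injected post callable are hypothetical stand-ins for the real metadata client.

```python
def register_step(post, url, payload):
    """Register step metadata, treating "already exists" as success.

    `post` is a hypothetical callable that performs an HTTP POST of `payload`
    to `url` and returns the HTTP status code; it stands in for the real
    HTTP client.
    """
    status = post(url, payload)
    if status in (200, 201):
        return "created"
    if status == 409:
        # Another task registered the same step first; nothing left to do.
        return "exists"
    raise RuntimeError("metadata service returned HTTP %d" % status)
```

With this handling, whichever task loses the race continues normally, since the step metadata it needs is already in place.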