This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Generated data splits should be tracked along with model outputs, and not stored with data. #286

Open
yalaudah opened this issue Apr 23, 2020 · 0 comments
Labels
Type: Enhancement This an enhancement to an existing feature

@yalaudah
Contributor

yalaudah commented Apr 23, 2020

The data splits generated by the prepare_dutchf3.py script should be stored and tracked along with the outputs of each model run (logs, snapshots, configs, etc.), not separately. This means the split generation should run as part of the data prep in the training scripts on every run, rather than once up front. We should also make sure that all the required parameters (e.g. stride or section_stride) are stored in the config files.

Otherwise, newer model runs might use older data splits, and there is no way to track which data split was used with a model.

  • Note: This also requires changes to the Docker implementation, and to the README file.
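
A minimal sketch of what this could look like, assuming the split logic can be called from the training script. Everything here is hypothetical and for illustration only: the function names (`make_section_split`, `prepare_run_splits`), the `splits/` sub-directory, the `split_params.json` file name, and the simple shuffle-based split are stand-ins, not the actual logic or interface of prepare_dutchf3.py. The point is only that the split files and the parameters that produced them end up inside the run's own output directory.

```python
import hashlib
import json
import random
from pathlib import Path


def make_section_split(n_sections, section_stride, val_fraction, seed):
    """Hypothetical stand-in for the real split logic in prepare_dutchf3.py."""
    # Take every `section_stride`-th section index, then shuffle reproducibly.
    sections = list(range(0, n_sections, section_stride))
    rng = random.Random(seed)
    rng.shuffle(sections)
    n_val = int(len(sections) * val_fraction)
    return {"train": sorted(sections[n_val:]), "val": sorted(sections[:n_val])}


def prepare_run_splits(run_dir, n_sections, section_stride, val_fraction=0.2, seed=0):
    """Write the split and its parameters into the run's output directory."""
    split_dir = Path(run_dir) / "splits"  # assumed layout, next to logs/snapshots
    split_dir.mkdir(parents=True, exist_ok=True)

    params = {
        "n_sections": n_sections,
        "section_stride": section_stride,
        "val_fraction": val_fraction,
        "seed": seed,
    }
    split = make_section_split(**params)

    # One text file per subset, one section index per line.
    for name, ids in split.items():
        (split_dir / f"{name}.txt").write_text("\n".join(map(str, ids)) + "\n")

    # Store the parameters plus a fingerprint of the split, so a model run
    # can always be matched to the exact split it was trained on.
    digest = hashlib.sha256(json.dumps(split, sort_keys=True).encode()).hexdigest()
    record = {"params": params, "split_sha256": digest}
    (split_dir / "split_params.json").write_text(json.dumps(record, indent=2))
    return split, record
```

In a training script, a call such as `prepare_run_splits(output_dir, n_sections, section_stride)` would run before the data loaders are built, with the values read from the run's config file. Two runs whose `split_sha256` values differ were trained on different splits, which makes stale splits easy to spot.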
@yalaudah yalaudah added the Type: Enhancement This an enhancement to an existing feature label Apr 23, 2020
@yalaudah yalaudah added this to the V0.4 [Datasets] milestone Apr 23, 2020
@yalaudah yalaudah added this to Mn: Backlog in Manganese May 28, 2020
Projects
Iron
Awaiting triage
Development

No branches or pull requests

2 participants