Different metadata keys between sims run manually vs through Fireworks #1323

Open
ggsun opened this issue Oct 27, 2022 · 4 comments

Comments

@ggsun
Contributor

ggsun commented Oct 27, 2022

In many analysis scripts, we use the values stored in the metadata/metadata.json file to determine certain properties of the simulation that was run (total number of gens, seeds, whether certain options were turned on, etc.). Through the work that @rjuenemann has been doing on her project, we noticed that the keys in this metadata file differ between sims run with the manual runscripts and sims run through Fireworks. When we run simulations with the manual runscripts, the runParca.py script first generates the metadata file with certain ParCa-specific keys. The runSim.py script then overwrites it with its own metadata file, whose keys mostly overlap with the ParCa metadata but do not include some of the ParCa-specific keys (e.g. operons). When sims are run through Fireworks, the metadata file is constructed from the workflow options and written just once, when the workflow is set up. This difference has mostly caused problems when our analysis scripts need the data from the ParCa-specific keys, such as when we want an analysis script to run only if the operon option was turned on in the ParCa.

Possible ways to address this issue would be to:

(i) Have the ParCa and the simulation output separate metadata files, and the analysis scripts read in both files
(ii) Have the runSim.py script read in the existing metadata.json file written by runParca.py, and add to this file instead of overwriting
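
As a rough illustration of option (ii), the read-modify-write step could look something like the sketch below. The helper name `update_metadata`, the example keys, and the `'on'` value are hypothetical and not taken from the actual runscripts; the only assumption about the real code is that metadata.json holds a flat JSON object.

```python
import json
import os
import tempfile


def update_metadata(metadata_path, sim_metadata):
    """Merge sim_metadata into the JSON file at metadata_path.

    Hypothetical helper: keys already in the file (e.g. ParCa-specific
    ones) are kept unless sim_metadata provides a new value for them.
    """
    merged = {}
    if os.path.exists(metadata_path):
        with open(metadata_path) as f:
            merged = json.load(f)

    # Sim values win on overlapping keys; ParCa-only keys survive.
    merged.update(sim_metadata)

    # Write to a temp file in the same directory, then swap it in, so an
    # interrupted run does not leave a half-written metadata file.
    dir_name = os.path.dirname(metadata_path) or '.'
    fd, tmp_path = tempfile.mkstemp(dir=dir_name, suffix='.tmp')
    with os.fdopen(fd, 'w') as f:
        json.dump(merged, f, indent=2, sort_keys=True)
    os.replace(tmp_path, metadata_path)
    return merged
```

With this kind of merge, a key such as operons written by the ParCa step would still be present after the sim step adds its own keys.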

@1fish2
Contributor

1fish2 commented Oct 28, 2022

Good points. Also:

  • Analysis scripts should perhaps look at sim_data.operons_on to determine if operons are on, like in models/ecoli/sim/variants/apply_variant.py.
    (The incantation getattr(sim_data, 'operons_on', '') is probably just to handle sim_data files generated before the operon code was merged in.)
  • It's sensible for runSim.py and runDaughter.py to extend the metadata.json file. Still, these scripts only have info on one manual sim run at a time. There's a --total-gens TOTAL_GENS arg to put a little more info in the metadata.json file, and that could be replaced by read-modify-write code for the metadata. However, passing different parameters like --growth-rate-noise to different sim generations could exceed what the metadata structure can represent.
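
To make the first point concrete, here is a minimal sketch of an analysis-side check based on the getattr pattern quoted above. The function name `operons_enabled` is hypothetical, and `SimpleNamespace` objects stand in for real sim_data objects, whose actual type and attribute values are not shown in this thread.

```python
from types import SimpleNamespace


def operons_enabled(sim_data):
    """Return True if this sim_data reports operons as on.

    sim_data files generated before the operon code was merged lack the
    attribute, so fall back to '' (falsy), as in the quoted incantation.
    """
    return bool(getattr(sim_data, 'operons_on', ''))


# Stand-ins for real sim_data objects (assumption for illustration only).
new_sim_data = SimpleNamespace(operons_on=True)
old_sim_data = SimpleNamespace()  # no operons_on attribute

print(operons_enabled(new_sim_data))  # True
print(operons_enabled(old_sim_data))  # False
```

An analysis script could return early when this check is False, without depending on which keys ended up in metadata.json.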

@1fish2
Contributor

1fish2 commented Oct 28, 2022

GitHub says Your GitHub Enterprise trial has expired. Is someone attending to this?

@ggsun
Contributor Author

ggsun commented Oct 28, 2022

> GitHub says Your GitHub Enterprise trial has expired. Is someone attending to this?

I'll be working with Markus to upgrade our organizational account back to the next tier today.

@ggsun
Contributor Author

ggsun commented Oct 28, 2022

> > GitHub says Your GitHub Enterprise trial has expired. Is someone attending to this?
>
> I'll be working with Markus to upgrade our organizational account back to the next tier today.

We just restored our lab's account to the GitHub Teams tier, which should bring back all the features that we had been using in this tier (Wiki etc.).
