AnalysisParcaTask fizzling when running sims with COMPRESS_OUTPUT=1 #1419

Open
rjuenemann opened this issue Dec 13, 2023 · 1 comment

@rjuenemann
Contributor

I ran the following on Sherlock, on the ng-trl-eff-shift-variant-only branch, in preparation for PR #1415:

DESC="sherlock_internal_shift_metadata_test" VARIANT="new_gene_expression_and_translation_efficiency_internal_shift" FIRST_VARIANT_INDEX=2 LAST_VARIANT_INDEX=2 N_GENS=4 NEW_GENES="gfp" PLOTS=ACTIVE COMPRESS_OUTPUT=1 RAISE_ON_TIME_LIMIT=1 WC_ANALYZE_FAST=1 python runscripts/fireworks/fw_queue.py

I noticed the AnalysisParcaTask fizzled:

lpad get_fws
WARNING (aesara.tensor.blas): Using NumPy C-API based implementation for BLAS functions.
[
    {
        "fw_id": 1,
        "created_on": "2023-12-05T20:43:09.522596",
        "updated_on": "2023-12-06T01:51:41.448942",
        "state": "COMPLETED",
        "name": "AnalysisSingleTask__Var_2__Seed_0__Gen_3__Cell_0"
    },
    {
        "fw_id": 2,
        "created_on": "2023-12-05T20:43:09.522475",
        "updated_on": "2023-12-06T02:02:38.083444",
        "state": "COMPLETED",
        "name": "ScriptTask_compression_simulation__Seed_0__Gen_3__Cell_0"
    },
    {
        "fw_id": 3,
        "created_on": "2023-12-05T20:43:09.522372",
        "updated_on": "2023-12-06T01:39:09.968905",
        "state": "COMPLETED",
        "name": "SimulationTask__Var_02__Seed_0__Gen_3__Cell_0"
    },
    {
        "fw_id": 4,
        "created_on": "2023-12-05T20:43:09.522171",
        "updated_on": "2023-12-06T00:14:16.978444",
        "state": "COMPLETED",
        "name": "AnalysisSingleTask__Var_2__Seed_0__Gen_2__Cell_0"
    },
    {
        "fw_id": 5,
        "created_on": "2023-12-05T20:43:09.522048",
        "updated_on": "2023-12-06T01:58:06.264906",
        "state": "COMPLETED",
        "name": "ScriptTask_compression_simulation__Seed_0__Gen_2__Cell_0"
    },
    {
        "fw_id": 6,
        "created_on": "2023-12-05T20:43:09.521947",
        "updated_on": "2023-12-06T00:05:26.230629",
        "state": "COMPLETED",
        "name": "SimulationTask__Var_02__Seed_0__Gen_2__Cell_0"
    },
    {
        "fw_id": 7,
        "created_on": "2023-12-05T20:43:09.521706",
        "updated_on": "2023-12-05T23:11:12.747786",
        "state": "COMPLETED",
        "name": "AnalysisSingleTask__Var_2__Seed_0__Gen_1__Cell_0"
    },
    {
        "fw_id": 8,
        "created_on": "2023-12-05T20:43:09.521582",
        "updated_on": "2023-12-06T01:57:57.157442",
        "state": "COMPLETED",
        "name": "ScriptTask_compression_simulation__Seed_0__Gen_1__Cell_0"
    },
    {
        "fw_id": 9,
        "created_on": "2023-12-05T20:43:09.521478",
        "updated_on": "2023-12-05T22:58:25.945713",
        "state": "COMPLETED",
        "name": "SimulationTask__Var_02__Seed_0__Gen_1__Cell_0"
    },
    {
        "fw_id": 10,
        "created_on": "2023-12-05T20:43:09.521253",
        "updated_on": "2023-12-05T22:29:31.901319",
        "state": "COMPLETED",
        "name": "AnalysisSingleTask__Var_2__Seed_0__Gen_0__Cell_0"
    },
    {
        "fw_id": 11,
        "created_on": "2023-12-05T20:43:09.521113",
        "updated_on": "2023-12-06T01:57:51.547185",
        "state": "COMPLETED",
        "name": "ScriptTask_compression_simulation__Seed_0__Gen_0__Cell_0"
    },
    {
        "fw_id": 12,
        "created_on": "2023-12-05T20:43:09.521013",
        "updated_on": "2023-12-05T22:16:52.316777",
        "state": "COMPLETED",
        "name": "SimulationTask__Var_02__Seed_0__Gen_0__Cell_0"
    },
    {
        "fw_id": 13,
        "created_on": "2023-12-05T20:43:09.520819",
        "updated_on": "2023-12-06T01:46:25.366586",
        "state": "COMPLETED",
        "name": "AnalysisMultiGenTask__Var_02__Seed_000000"
    },
    {
        "fw_id": 14,
        "created_on": "2023-12-05T20:43:09.520692",
        "updated_on": "2023-12-06T01:46:18.284736",
        "state": "COMPLETED",
        "name": "AnalysisCohortTask__Var_02"
    },
    {
        "fw_id": 15,
        "created_on": "2023-12-05T20:43:09.520588",
        "updated_on": "2023-12-06T01:58:28.094396",
        "state": "COMPLETED",
        "name": "ScriptTask_compression_variant_KB"
    },
    {
        "fw_id": 16,
        "created_on": "2023-12-05T20:43:09.520495",
        "updated_on": "2023-12-05T21:42:58.831782",
        "state": "COMPLETED",
        "name": "VariantSimDataTask__new_gene_expression_and_translation_efficiency_internal_shift_000002"
    },
    {
        "fw_id": 17,
        "created_on": "2023-12-05T20:43:09.520357",
        "updated_on": "2023-12-06T01:45:05.873231",
        "state": "COMPLETED",
        "name": "AnalysisVariantTask"
    },
    {
        "fw_id": 18,
        "created_on": "2023-12-05T20:43:09.520232",
        "updated_on": "2023-12-05T21:44:22.102542",
        "state": "FIZZLED",
        "name": "AnalysisParcaTask"
    },
    {
        "fw_id": 19,
        "created_on": "2023-12-05T20:43:09.520117",
        "updated_on": "2023-12-05T20:43:09.520120",
        "name": "ScriptTask_compression_validation_data",
        "state": "WAITING"
    },
    {
        "fw_id": 20,
        "created_on": "2023-12-05T20:43:09.520013",
        "updated_on": "2023-12-05T21:05:50.347589",
        "state": "COMPLETED",
        "name": "InitValidationData"
    },
    {
        "fw_id": 21,
        "created_on": "2023-12-05T20:43:09.519919",
        "updated_on": "2023-12-05T21:15:37.365250",
        "state": "COMPLETED",
        "name": "ScriptTask_compression_validation_data_raw"
    },
    {
        "fw_id": 22,
        "created_on": "2023-12-05T20:43:09.519828",
        "updated_on": "2023-12-05T20:52:09.182471",
        "state": "COMPLETED",
        "name": "InitValidationDataRaw"
    },
    {
        "fw_id": 23,
        "created_on": "2023-12-05T20:43:09.519744",
        "updated_on": "2023-12-05T20:43:09.519746",
        "name": "ScriptTask_compression_sim_data",
        "state": "WAITING"
    },
    {
        "fw_id": 24,
        "created_on": "2023-12-05T20:43:09.519648",
        "updated_on": "2023-12-05T21:42:46.238209",
        "state": "COMPLETED",
        "name": "ScriptTask_compression_raw_data"
    },
    {
        "fw_id": 25,
        "created_on": "2023-12-05T20:43:09.519537",
        "updated_on": "2023-12-05T21:31:30.875954",
        "state": "COMPLETED",
        "name": "CalculateSimData"
    },
    {
        "fw_id": 26,
        "created_on": "2023-12-05T20:43:09.519369",
        "updated_on": "2023-12-05T20:52:14.081648",
        "state": "COMPLETED",
        "name": "InitRawData"
    }
]

with the following error:

Traceback (most recent call last):
  File "/home/users/rjuene/wcEcoli/wholecell/fireworks/firetasks/analysisBase.py", line 236, in run_plot
    plot_class.main(*args, cpus=1, analysis_paths=analysis_paths)
  File "/home/users/rjuene/wcEcoli/models/ecoli/analysis/analysisPlot.py", line 166, in main
    instance.plot(inputDir, plotOutDir, plotOutFileName, simDataFile,
  File "/home/users/rjuene/wcEcoli/models/ecoli/analysis/analysisPlot.py", line 156, in plot
    do_plot()
  File "/home/users/rjuene/wcEcoli/models/ecoli/analysis/analysisPlot.py", line 143, in do_plot
    self.do_plot(inputDir, plotOutDir, plotOutFileName, simDataFile,
  File "/home/users/rjuene/wcEcoli/models/ecoli/analysis/parca/fold_changes.py", line 20, in do_plot
    with open(os.path.join(input_dir, constants.SERIALIZED_RAW_DATA), 'rb') as f:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/users/rjuene/wcEcoli/out/20231205.124309__sherlock_internal_shift_metadata_test/kb/rawData.cPickle'

@ggsun and I suspect that the Parca output was compressed before the AnalysisParcaTask could run, so the task could not find the file it needs. Indeed, rawData.cPickle.bz2 is present in out/kb, but rawData.cPickle is not.
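To illustrate the suspected failure mode, here is a minimal sketch in Python. It assumes only that the compression step leaves `rawData.cPickle.bz2` behind in place of `rawData.cPickle`. The helper `open_maybe_compressed` is hypothetical and not part of wcEcoli; it shows how a reader could tell the two cases apart, and is not a proposed fix.

```python
import bz2
import os
import pickle
import tempfile


def open_maybe_compressed(path):
    """Open `path` for binary reading, falling back to `path + '.bz2'`.

    Hypothetical helper for illustration only; not part of wcEcoli.
    """
    if os.path.exists(path):
        return open(path, 'rb')
    compressed = path + '.bz2'
    if os.path.exists(compressed):
        return bz2.open(compressed, 'rb')
    raise FileNotFoundError(path)


def demo():
    """Simulate the compression task finishing before the analysis reads."""
    with tempfile.TemporaryDirectory() as kb_dir:
        raw_path = os.path.join(kb_dir, 'rawData.cPickle')
        # Stand-in for the compression task: only the .bz2 file is left behind.
        with bz2.open(raw_path + '.bz2', 'wb') as f:
            pickle.dump({'example': 1}, f)

        # The plain file is gone, which is what the traceback reports.
        plain_exists = os.path.exists(raw_path)
        with open_maybe_compressed(raw_path) as f:
            data = pickle.load(f)
        return plain_exists, data
```

Calling `demo()` reproduces the situation in a temporary directory: the plain pickle is absent, but the data is still recoverable from the `.bz2` file.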

@1fish2
Contributor

1fish2 commented Dec 13, 2023

Good hypothesis! Indeed, fw_queue creates the task ScriptTask_compression_raw_data depending only on the completion of InitRawData, and later adds a link so that it also depends on the InitValidationData task. This compression task runs bzip2. In a local manual run, bzip2 replaced the 15 MB rawData.cPickle with a 3.5 MB rawData.cPickle.bz2.

(lpad get_fws has an option --display_format {all,more,less,ids,count,reservations}. Picking more or all would probably show the task dependency links, which would confirm whether they match this expectation.)

So (going by the variables in fw_queue rather than the task names) fw_parca_analysis should be another "parent" (dependency, prerequisite) of the fw_raw_data_compression task.

The code is almost there.

    self.add_links(fw_parca_analysis,
        fw_sim_data_1_compression, fw_validation_data_compression)

^^^ This makes fw_parca_analysis a parent of fw_sim_data_1_compression and fw_validation_data_compression, that is, don't run those two compression tasks until fw_parca_analysis completes.

Just add fw_raw_data_compression as another arg to add_links().
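A toy model of the dependency links may make the effect of that extra argument concrete. This is a sketch under stated assumptions, not the fw_queue implementation: `ToyWorkflow` is a hypothetical stand-in, the names `fw_init_raw_data` and `fw_init_validation_data` are assumed for illustration, and only the links mentioned in this thread are modeled.

```python
from collections import defaultdict


class ToyWorkflow:
    """Minimal stand-in for a workflow builder; NOT the fw_queue code."""

    def __init__(self):
        self.parents = defaultdict(set)  # child task -> set of parent tasks

    def add_links(self, parent, *children):
        # Same meaning as described above: each child waits for `parent`.
        for child in children:
            self.parents[child].add(parent)

    def must_run_before(self, first, second):
        """True if the links force `first` to complete before `second`."""
        stack, seen = [second], set()
        while stack:
            task = stack.pop()
            for parent in self.parents.get(task, ()):
                if parent == first:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False


def build(with_fix):
    wf = ToyWorkflow()
    # Links described in this thread (task variable names partly assumed).
    wf.add_links('fw_init_raw_data', 'fw_raw_data_compression')
    wf.add_links('fw_init_validation_data', 'fw_raw_data_compression')
    children = ['fw_sim_data_1_compression', 'fw_validation_data_compression']
    if with_fix:
        # The proposed change: one more argument to add_links().
        children.append('fw_raw_data_compression')
    wf.add_links('fw_parca_analysis', *children)
    return wf
```

Without the fix, `build(False).must_run_before('fw_parca_analysis', 'fw_raw_data_compression')` is false, so nothing orders the two tasks and the compression can win the race. With the fix, the same query is true.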

If this symptom is currently reproducible (the compression task could happen to run late enough sometimes to avoid the symptom), it's a good time to test the fix.

This raises other questions:

  • Is it new that AnalysisParcaTask reads the raw data? If not, why is this problem only showing up now?
  • Are there more missing dependency links?
  • wcm.py would need the same additional dependency, but on checking the code, it doesn't add any compression tasks.
