Analysis class cleanup to support future extension of the hli #3852
base: main
Conversation
Codecov Report

@@           Coverage Diff            @@
##           master    #3852    +/-  ##
=======================================
  Coverage   93.78%   93.78%
=======================================
  Files         162      163      +1
  Lines       20138    20222     +84
=======================================
+ Hits        18886    18965     +79
- Misses       1252     1257      +5
Thanks a lot @QRemy, I think this goes exactly in the right direction! I especially like the improved code organization and the possibility to extend the Analysis pipeline with a registry.
I have left a few general comments with proposals for the API and some remaining questions for now. I think it might be good to discuss this in a bit more detail again, maybe in a dedicated meeting next week?
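The registry idea mentioned above could work along the following lines. This is a minimal sketch only; the names (`ANALYSIS_STEP_REGISTRY`, `register_step`, `get_step`, `DataReductionStep`) are illustrative stand-ins, not the actual Gammapy API.

```python
# Minimal sketch of an analysis-step registry, as one possible way to
# make the Analysis pipeline user-extensible. All names are illustrative.

ANALYSIS_STEP_REGISTRY = {}


def register_step(cls):
    """Class decorator that records a step class under its ``tag``."""
    ANALYSIS_STEP_REGISTRY[cls.tag] = cls
    return cls


@register_step
class DataReductionStep:
    tag = "data-reduction"

    def run(self, datasets):
        # data reduction would happen here; pass datasets through for the sketch
        return datasets


def get_step(tag):
    """Look up a step class by tag, so user code can register new steps."""
    return ANALYSIS_STEP_REGISTRY[tag]
```

With this pattern, a user-defined step only needs the decorator and a unique `tag` to become available to the pipeline.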
gammapy/analysis/steps.py
Outdated
requires_datasets = False
requires_models = False

def __init__(self, analysis, name=None, overwrite=True):
On first look, taking the Analysis class on __init__ does not make sense to me. This introduces a "cross dependency", while in fact I think it can be hierarchical, in the sense that the Analysis class is built from AnalysisStep classes. I think this can be resolved by slightly refactoring the API of the AnalysisStep class, along the lines of:
class AnalysisStep:
    """Analysis step class"""

    tag = "analysis-step"

    def __init__(self, analysis_sub_config, overwrite=True, log=None):
        self.config = analysis_sub_config
        self.overwrite = overwrite

        if log is None:
            log = logging.getLogger(__name__)

        self.log = log

    @property
    def maker_config(self):
        # translate analysis sub config to Gammapy API config here
        return config

    def run(self, datasets, models=None):
        maker = Maker(**self.maker_config)
        # returning might be optional... changes could happen in place
        return datasets
Maybe, at a minimum, you could take the AnalysisConfig on init and pass the Analysis to run().
For now I changed it to take the AnalysisConfig on init and pass the Analysis to run(), but in the next PR I will introduce a specific AnalysisProducts container to return outputs and pass data references on run().
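The interim API agreed here (config on init, Analysis passed to run()) could look roughly like the following. This is a sketch under simplified assumptions: the `products` attribute and the stand-in `Analysis` class are hypothetical, not the real Gammapy classes.

```python
import logging


class AnalysisStep:
    """Sketch: a step that takes the config on init and the Analysis
    object at run time. All names are illustrative only."""

    tag = "analysis-step"

    def __init__(self, config, overwrite=True, log=None):
        self.config = config
        self.overwrite = overwrite
        self.log = log or logging.getLogger(__name__)

    def run(self, analysis):
        # the step reads inputs from, and writes products back to,
        # the analysis object passed at run time
        self.log.info("Running step %s", self.tag)
        analysis.products[self.tag] = "done"
        return analysis


class Analysis:
    """Stand-in for the high-level Analysis class."""

    def __init__(self, config):
        self.config = config
        self.products = {}
```

This keeps the dependency one-directional at construction time: steps hold only their sub-config, and the full Analysis state is supplied when run() is called.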
def __init__(self, analysis, name=None, overwrite=True):
    self.analysis = analysis
    self.overwrite = overwrite
    self._name = make_name(name)
What is the name attribute used for? Is it to generate the dataset name later? But what happens for multiple datasets?
For now it is not used, but I had in mind that it would be used to select data products from specific steps.
self.analysis.datasets = Datasets([stacked])


def make_energy_axis(axis, name="energy"):
I haven't thought this through, but we could maybe even introduce an API like MapAxis.from_analysis_config(config=) and Maker.from_analysis_config() ...
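The from_analysis_config() idea could take a shape like the sketch below. The `EnergyAxis` class here is a deliberately simplified stand-in for Gammapy's MapAxis, and the config keys (`min`, `max`, `nbins`) are assumed for illustration.

```python
class EnergyAxis:
    """Stand-in for MapAxis, to sketch a from_analysis_config factory.
    Names and config keys are illustrative, not the Gammapy API."""

    def __init__(self, edges, name="energy"):
        self.edges = edges
        self.name = name

    @classmethod
    def from_analysis_config(cls, config, name="energy"):
        # translate the high-level config (min, max, nbins) into
        # log-spaced bin edges, as is conventional for energy axes
        lo, hi, n = config["min"], config["max"], config["nbins"]
        edges = [lo * (hi / lo) ** (i / n) for i in range(n + 1)]
        return cls(edges, name=name)
```

A factory like this would keep the config-to-API translation in one place per class, instead of spreading it across helper functions like make_energy_axis().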
Thanks @QRemy. This is a very ambitious change but it is definitely very interesting.
See some inline comments.
To make progress possible and have an intermediate working solution, it might be possible to pass the Analysis object to each step rather than passing it on init. Would that be OK @adonath?
path = make_path(obs_settings.obs_file)
ids = list(Table.read(path, format="ascii", data_start=0).columns[0])
selected_obs_table = self.datastore.obs_table.select_obs_id(ids)

def run(self, steps=None, overwrite=None, **kwargs):
add docstring
What does the overwrite option do?
I had in mind that each AnalysisStep could have a read/write method, but for now this is used only in the DataReductionAnalysisStep, to read the datasets if the file exists.
ids = list(Table.read(path, format="ascii", data_start=0).columns[0])
selected_obs_table = self.datastore.obs_table.select_obs_id(ids)

def run(self, steps=None, overwrite=None, **kwargs):
    if steps is None:
Maybe step creation should be done in another method?
It means they would have to be kept in memory and attached to the analysis class, which affects where the information is stored. I will try this later, after sorting out the input/output of the analysis steps.
gammapy/analysis/core.py
Outdated
def run_fit(self):
    """Fitting reduced datasets to model."""
    if not self.models:

def check_datasets(self):
Is dataset reading and/or model creation and setting a specific analysis step?
No, but it could be. For now, reading of config.general.datasets_file is done through the data-selection step, if the file exists and if overwrite is False.
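The overwrite behavior described here (reuse datasets written earlier unless overwrite is requested) could be factored into a small helper like the sketch below. The helper name and the `reader` callable are hypothetical, not part of the PR.

```python
from pathlib import Path


def load_existing_datasets(datasets_file, overwrite, reader):
    """Sketch: return previously written datasets unless overwrite is set.

    ``reader`` is any callable that loads datasets from a path; its
    name and signature are illustrative, not the Gammapy API.
    """
    path = Path(datasets_file)
    if path.exists() and not overwrite:
        # reuse the datasets written by a previous run
        return reader(path)
    # either the file is missing or the caller asked to overwrite it
    return None
```

A step's run() could then fall back to the reduction machinery only when this helper returns None, which keeps the caching decision in one testable place.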
gammapy/analysis/steps.py
Outdated
    )
]

self.analysis.check_datasets()
Why is it needed explicitly here?
Removed it for now, but in a future PR I will reintroduce a similar system to check that the data required for each step are well defined.
This PR implements the basis of a large refactoring of the HLI. A number of design choices will have to be made, and a fully expanded and improved HLI is an objective for v2.0.
Adds the remaining changes proposed in #3788 related to the cleanup of the analysis class: