Consistency between MapDataset.stack and Datasets.stack_reduce #5261

registerrier · 2024-05-15T16:49:22Z

Currently Datasets.stack_reduce() will first apply the to_masked() method to the first dataset.
Using MapDataset.stack the mask is not applied to self, the assumption being that the first dataset used is empty.
While this is probably true in general, this might lead to some issues. At least the docstring should more clearly state this.
Note also that MapDatasetOnOff.stack applies the mask to self.counts_off, which is then inconsistent.

gammapy/gammapy/datasets/map.py

Line 2601 in 011cefe

total_off.stack(

See associated PRs: #3058, #3164

Is there any reason for this specific behavior?


AtreyeeS · 2024-05-15T17:12:30Z

If I remember correctly, the behaviour in MapDataset.stack is intentional. It was the cleanest way to be able to stack maps during the data reduction process while handling the safe mask correctly.
This is included in the Note here https://docs.gammapy.org/1.2/user-guide/datasets/index.html#stacking-multiple-datasets - "To properly handle masks, it is necessary to stack onto an empty dataset."
Datasets.stack_reduce was introduced to prevent users from stacking onto non-empty maps. The idea was that general users should only use Datasets.stack_reduce and not MapDataset.stack. I guess we did not expose these consistently.

Not sure what is happening in MapDatasetOnOff.
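
To make the difference concrete, here is a minimal pure-Python sketch. It does not use gammapy: `ToyDataset`, its `stack` and `to_masked` methods, and the free function `stack_reduce` are hypothetical stand-ins that only mimic the behaviour described in this thread (the in-place stack masks `other` but not `self`, while the reduce helper masks the first dataset before stacking).

```python
from dataclasses import dataclass


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a map dataset: per-bin counts and a safe mask."""

    counts: list
    mask_safe: list

    def to_masked(self):
        # Return a copy with the counts outside the safe mask set to zero.
        return ToyDataset(
            counts=[c if m else 0 for c, m in zip(self.counts, self.mask_safe)],
            mask_safe=list(self.mask_safe),
        )

    def stack(self, other):
        # In-place stack: only `other` is masked, `self` is taken as it is.
        self.counts = [
            c + (o if m else 0)
            for c, o, m in zip(self.counts, other.counts, other.mask_safe)
        ]
        self.mask_safe = [a or b for a, b in zip(self.mask_safe, other.mask_safe)]


def stack_reduce(datasets):
    # Mask the first dataset, then stack the remaining ones onto it.
    stacked = datasets[0].to_masked()
    for dataset in datasets[1:]:
        stacked.stack(dataset)
    return stacked


def make_datasets():
    first = ToyDataset(counts=[5, 7, 9], mask_safe=[False, True, True])
    second = ToyDataset(counts=[1, 1, 1], mask_safe=[True, True, False])
    return [first, second]


# 1) Stack directly onto a non-empty dataset: the unsafe first bin of
#    `first` (5 counts) leaks into the result.
first, second = make_datasets()
first.stack(second)
direct = first.counts

# 2) Use the reduce helper: the first dataset is masked beforehand.
reduced = stack_reduce(make_datasets()).counts

# 3) Stack everything onto an empty dataset: same result as the reduce helper.
empty = ToyDataset(counts=[0, 0, 0], mask_safe=[False, False, False])
for dataset in make_datasets():
    empty.stack(dataset)

print(direct, reduced, empty.counts)
```

In this toy model, only the direct stack onto a non-empty dataset keeps counts from outside the safe mask, which is the case the documentation note warns about.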

adonath · 2024-05-15T18:10:12Z

> Using MapDataset.stack the mask is not applied to self, the assumption being that the first dataset used is empty.

If I remember correctly, the first implementation of MapDataset.stack() had the in-place application of the mask. This led to a lot of complex code, because one needed to apply the mask to all quantities in the dataset. The same code was then repeated for the "other" dataset. It was much simpler to introduce the .to_masked() method, with all the mask application in a single place.

Applying the mask for in-place stacking would mean always applying it, because one cannot know whether the stacking happens for the first time. This means that, except for the first time, one would unnecessarily apply the (stacked) mask over and over again.

But independently, stack_reduce() is the higher level method anyway. So I think there is no need for consistency. However the behavior of the on-off dataset should be fixed.
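
The redundancy argument above can be illustrated with a small pure-Python sketch. It is not gammapy code: `apply_mask` and `stack_masking_self` are hypothetical helpers that model an in-place stack which masks "self" on every call, and the sketch records whether that masking step changed anything.

```python
def apply_mask(counts, mask):
    """Zero every bin where the mask is False."""
    return [c if m else 0 for c, m in zip(counts, mask)]


def stack_masking_self(total, total_mask, other, other_mask):
    """Hypothetical in-place variant that masks `total` on every call."""
    masked_total = apply_mask(total, total_mask)
    changed = masked_total != total  # did masking "self" do any work?
    masked_other = apply_mask(other, other_mask)
    new_total = [a + b for a, b in zip(masked_total, masked_other)]
    new_mask = [a or b for a, b in zip(total_mask, other_mask)]
    return new_total, new_mask, changed


# A non-empty starting dataset whose first bin lies outside its safe mask.
total, mask = [5, 7, 9], [False, True, True]

# Two further datasets to stack, each given as (counts, safe mask).
others = [
    ([1, 1, 1], [True, True, False]),
    ([2, 2, 2], [False, True, True]),
]

self_mask_changed = []
for counts, other_mask in others:
    total, mask, changed = stack_masking_self(total, mask, counts, other_mask)
    self_mask_changed.append(changed)

print(total, self_mask_changed)
```

In this toy model, masking "self" only does something on the first call. Afterwards the stacked counts are already zero outside the stacked mask, so every later application leaves them unchanged.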

registerrier · 2024-05-15T18:58:26Z

OK thanks. Not re-masking all the time seems a reasonable approach indeed. While solving #5245, I will try to adapt the docstring and the behaviour of MapDatasetOnOff.stack().

registerrier added the question label May 15, 2024

registerrier added this to the 1.2.1 milestone May 16, 2024

registerrier linked a pull request May 17, 2024 that will close this issue

Modify acceptance stacking behavior #5270

