
Activate LI L2 accumulated products gridding by default #2789

Open · wants to merge 4 commits into main
Conversation

@ameraner (Member) commented Apr 22, 2024

This PR changes the default behaviour of the LI L2 reader so that the accumulated products are delivered automatically as 2-d gridded arrays.

Background and current state:
The LI L2 data consist of two types of products:

  • point products (LE, LEF, LFL, LGR): "classic" lightning products consisting of values, each with a lat-lon coordinate
  • accumulated (and gridded) products (AF, AFR, AFA): the result of temporal accumulation (e.g. over 30 seconds), regridded to the FCI 2 km grid by the ground processor in order to facilitate synergistic usage together with FCI

Both product types are actually stored in the NetCDF files as 1-d arrays. In the case of the accumulated products, each entry in the 1-d array corresponds to a specific pixel in the FCI grid (the corresponding column and row numbers are given in the file).
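The row/column mapping described above can be sketched as follows (a minimal NumPy illustration, not the reader's actual code; the array names and the tiny 5x5 grid are invented for the example):

```python
import numpy as np

# Hypothetical 1-d accumulated product: three values, each with the
# row/column index of its target pixel in a (here tiny, 5x5) grid,
# as stored alongside the data in the file.
values = np.array([1.5, 2.0, 0.5])
rows = np.array([0, 2, 4])
cols = np.array([1, 3, 0])

# Start from an all-NaN 2-d grid and scatter the 1-d entries
# into their assigned pixels.
grid = np.full((5, 5), np.nan)
grid[rows, cols] = values
```

Since each 1-d entry carries its own grid position, this scatter is exact and involves no interpolation.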

The mapping of these points onto the FCI grid, to obtain a 2-d array with an AreaDefinition, is already implemented in the reader, but needs to be activated by passing the reader kwarg with_area_definition=True; the default is currently False. This is because, during the first implementation of the reader, it was decided to follow the rationale that "the reader shall return data as it is stored in the file and not modify it".

Why do we want to change this (set with_area_definition=True as default)?
The reasons are multiple:

  • the accumulated products are designed and computed to be used on the grid. It is more intuitive for a user if the reader returns an accumulated product already on the 2-d grid, ready to be used, rather than a 1-d array that needs remapping in a specific way.
  • activating the automatic gridding by default avoids the risk of users resampling the 1-d points to a 2-d grid with methods that can introduce artifacts (e.g. nearest-neighbour or bilinear resampling, which introduce "pixel bleeding" effects).
  • passing reader_kwargs through the Scene object is a rather advanced operation, and requires the user to find and fully understand the reader documentation, which is not always trivial. It also introduces an extra step that is otherwise not required in standard Satpy API usage.
  • it simplifies the usage and combination with FCI, e.g. by allowing the generation of multi-sensor composites out of the box, with the same handling as the FCI data.
  • advanced users who would like to keep the data as 1-d arrays for further customised processing can of course still do so via the Scene reader_kwargs interface.
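For reference, the opt-out mentioned in the last bullet would look roughly like this (a hedged sketch: the filename is a placeholder and the Scene calls are commented out, since they need real LI L2 files on disk; 'li_l2_nc' is the Satpy reader name from this PR):

```python
# Opting out of the default gridding via the Scene reader_kwargs interface,
# to keep the accumulated products as 1-d arrays.
reader_kwargs = {"with_area_definition": False}

# from satpy import Scene
# scn = Scene(filenames=["<LI-2-AF file>.nc"], reader="li_l2_nc",
#             reader_kwargs=reader_kwargs)
# scn.load([...])  # datasets then come back as 1-d arrays
```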

For these reasons, we think this is a justified exception to the "reader doesn't modify data" guideline: it applies an operation that is expected to be performed anyway for the product to be usable, improving user-friendliness.

Note that the LI L2 accumulated products have not been widely distributed so far, and attention on them will really start with the LI pre-operational dissemination (hopefully in a couple of months' time), so we think this is a good time to introduce the change.

  • Tests added
  • Fully documented

@ameraner ameraner self-assigned this Apr 22, 2024
@ameraner ameraner added the enhancement, documentation, component:readers and backwards-incompatibility labels Apr 22, 2024
codecov bot commented Apr 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.94%. Comparing base (4faccbb) to head (98993a4).
Report is 112 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2789      +/-   ##
==========================================
+ Coverage   95.93%   95.94%   +0.01%     
==========================================
  Files         377      379       +2     
  Lines       53541    53693     +152     
==========================================
+ Hits        51362    51515     +153     
+ Misses       2179     2178       -1     
Flag Coverage Δ
behaviourtests 4.09% <0.00%> (-0.01%) ⬇️
unittests 96.04% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@coveralls commented Apr 22, 2024

Pull Request Test Coverage Report for Build 8802581731

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 0.0%

Totals Coverage Status
Change from base Build 8787439081: 0.0%
Covered Lines: 0
Relevant Lines: 0


@sjoro (Collaborator) left a comment
Thank you @ameraner for the PR! Tiny suggested editorials for the docstrings, otherwise LGTM.

satpy/readers/li_l2_nc.py — review thread resolved
satpy/readers/li_l2_nc.py — review thread resolved (outdated)
@sjoro (Collaborator) commented Apr 23, 2024

I agree with @ameraner and have discussed this issue with him earlier, so I'm good with this change.

@pnuu (Member) left a comment

LGTM.

@gerritholl (Collaborator) commented

I fully support that this should be true by default. We are already modifying the data in most readers, because by default, we use calibration=radiance or calibration=brightness_temperature, not calibration=counts, which would be the data as they are in the file.

But I'm not sure if this should be a reader kwarg, or a DataID key (like for calibration or polarisation). Probably I'm rather late with this thought because the gridding feature already exists, just isn't activated by default.

@pnuu (Member) commented Apr 23, 2024

activating the automatic regridding by default avoids the risk of users resampling the 1-d points to a 2-d grid with methods that can introduce artifacts (e.g. nearest or bilinear that introduce "pixel bleeding" effects)

How big of an issue do you think the pixel bleeding will be when resampling the 2D array to a local area definition? Or should that be ideally done from the non-gridded data (LFL)?

@sjoro sjoro self-requested a review April 23, 2024 14:56
@sjoro (Collaborator) left a comment
LGTM!

@ameraner (Member, Author) commented

How big of an issue do you think the pixel bleeding will be when resampling the 2D array to a local area definition? Or should that be ideally done from the non-gridded data (LFL)?

That is indeed a good point, and it's a bit tricky... My understanding:

The accumulated products show the extent of the measured lightning events in the FCI grid. Ideally, therefore, one would stay in the native geostationary projection (possibly using the native resampler to up/downscale the grid if needed) to avoid resampling artefacts. If the projection needs to be changed, any of the existing resampling algorithms will end up changing the area covered by the lightning (since we map pixels between grids using only the pixel centre position and not the pixel footprint area; correct me if I'm wrong here). The bucket resampler does the best job, in that it at least avoids the pixel bleeding effect, which could artificially increase the area covered by the lightning and possibly cause a wrong interpretation of the scene.

For the point products: the group and flash products represent spatial and temporal clusters of events; however, we only get one lat-lon coordinate for each group or flash (the average location of the group/flash), so we lose the spatial extent information anyway. For this reason, I think the bucket resampler also does the best job in this case, since it maps one group/flash to exactly one pixel in the target grid, so to speak maintaining the "lat-lon marker" functionality.
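The "one group/flash to exactly one pixel" behaviour of a bucket-count approach can be illustrated with a toy target grid (pure NumPy with invented coordinates on an equirectangular grid; the real pyresample bucket resampler works on arbitrary area definitions):

```python
import numpy as np

# Toy target grid: 10x10 pixels covering lon 0..10 deg, lat 0..10 deg.
nrows, ncols = 10, 10
lon0, lat0, res = 0.0, 10.0, 1.0  # upper-left corner and pixel size (deg)

# Hypothetical flash centroid locations (lon, lat).
lons = np.array([0.5, 3.2, 3.4, 9.9])
lats = np.array([9.5, 6.1, 6.2, 0.1])

# Each point falls into exactly one bucket (pixel): no bleeding into
# neighbouring pixels, unlike nearest-neighbour with a search radius.
cols = ((lons - lon0) // res).astype(int)
rows = ((lat0 - lats) // res).astype(int)

counts = np.zeros((nrows, ncols), dtype=int)
np.add.at(counts, (rows, cols), 1)  # accumulate per-pixel counts
```

Note that the two nearby points land in the same pixel and are counted there, while every other pixel stays untouched.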

I hope this is understandable/makes some sense. Otherwise I'm happy to discuss this further with examples and in front of a flipchart, e.g. at the user days :) What is sure is that there's still a lot to be experimented with using this data!

@pnuu (Member) commented Apr 23, 2024

Thanks @ameraner for the insight! I just today got my MTG/LI processing chain completely functional (needed the new Satpy version for S3 support), and I'm using resampler='bucket_count' to aggregate the LFL data to a 4 km grid. I'm putting the resulting images to WMS so they can be used as an overlay with any data.

I'll include my LI setup in my PUD presentation in June.

@sjoro (Collaborator) commented Apr 24, 2024

@pnuu it can be really significant; we see this when visualising GII BUFR data, which is essentially "point data" and needs to be put on an MSG grid before viewing the result. I did the first experiments with nearest-neighbour resampling, and it all depends on the search radius: too big and we get round blobs of data bleeding into neighbouring pixels, too small and we get gaps in the data. Bucket resampling does the best job, or simply converting the given lon-lats to rows and columns first and using those to index the data points into the correct locations. We know from encoding the GII binary (gridded) product to BUFR that each GII data point has only one location where it should go.
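The direct lon/lat-to-row/col indexing described here is essentially the scatter below (a toy sketch with an invented 0.5-degree equirectangular grid and made-up point values; real MSG grids are geostationary, so the actual conversion would go through the projection):

```python
import numpy as np

# Toy 0.5-degree grid covering lon 0..5 deg, lat 0..5 deg.
res = 0.5
nrows = ncols = 10
lat_top, lon_left = 5.0, 0.0

def to_row_col(lon, lat):
    """Convert a lon/lat to the single grid cell it belongs to."""
    col = int((lon - lon_left) // res)
    row = int((lat_top - lat) // res)
    return row, col

# Hypothetical point data: (lon, lat, value) triplets.
points = [(0.2, 4.9, 7.0), (2.6, 2.4, 3.5)]

grid = np.full((nrows, ncols), np.nan)
for lon, lat, val in points:
    r, c = to_row_col(lon, lat)
    grid[r, c] = val  # each point goes to exactly one location
```

In contrast to radius-based nearest-neighbour search, there is no tuning parameter here: the mapping is fully determined by the grid geometry.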

Labels: backwards-incompatibility, component:readers, documentation, enhancement

5 participants