
Activate LI L2 accumulated products gridding by default #2789

Open · wants to merge 4 commits into main
Conversation

@ameraner (Member) commented Apr 22, 2024

This PR changes the default behaviour of the LI L2 reader so that the accumulated products are delivered automatically as 2-d gridded arrays.

Background and current state:
The LI L2 data consist of two types of products:

  • point products (LE, LEF, LFL, LGR): "classic" lightning products consisting of values, each with a lat-lon coordinate
  • accumulated (and gridded) products (AF, AFR, AFA): the result of temporal accumulation (e.g. over 30 seconds), regridded to the FCI 2 km grid by the ground processor in order to facilitate synergistic usage together with FCI

Both product types are actually stored in the NetCDF files as 1-d arrays. In the case of the accumulated products, each entry in the 1-d array corresponds to a specific pixel in the FCI grid (the corresponding column and row numbers are given in the file).
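The row/column mapping described above can be sketched as follows (a minimal NumPy illustration, not the reader's actual code; the array names and the tiny 5x5 grid are invented for the example):

```python
import numpy as np

# Hypothetical 1-d accumulated product: three values, each with the
# row/column index of its target pixel in a (here tiny, 5x5) grid,
# as stored alongside the data in the file.
values = np.array([1.5, 2.0, 0.5])
rows = np.array([0, 2, 4])
cols = np.array([1, 3, 0])

# Start from an all-NaN 2-d grid and scatter the 1-d entries
# into their assigned pixels.
grid = np.full((5, 5), np.nan)
grid[rows, cols] = values
```

Since each 1-d entry carries its own grid position, this scatter is exact and involves no interpolation.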

The mapping of these points onto the FCI grid, to obtain a 2-d array with an AreaDefinition, is already implemented in the reader, but needs to be activated by passing the reader kwarg with_area_definition=True; the default is currently False. This is because, during the first implementation of the reader, it was decided to follow the rationale that "the reader shall return data as it is stored in the file and not modify it".

Why do we want to change this (set with_area_definition=True as default)?
The reasons are multiple:

  • the accumulated products are designed and computed to be used on the grid. It is more intuitive for a user if the reader returns an accumulated product already on the 2-d grid, ready to be used, rather than a 1-d array that needs remapping in a specific way.
  • activating the automatic gridding by default avoids the risk of users resampling the 1-d points to a 2-d grid with methods that can introduce artifacts (e.g. nearest-neighbour or bilinear resampling, which introduce "pixel bleeding" effects).
  • passing reader_kwargs through the Scene object is a rather advanced operation, and requires the user to find and fully understand the reader documentation, which is not always trivial. It also introduces an extra step that is otherwise not required in standard Satpy API usage.
  • it simplifies the usage and combination with FCI, e.g. by allowing the generation of multi-sensor composites out of the box, with the same handling as the FCI data.
  • advanced users who would like to keep the data as 1-d arrays for further customised processing can of course still do so via the Scene reader_kwargs interface.
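For reference, the opt-out mentioned in the last bullet would look roughly like this (a hedged sketch: the filename is a placeholder and the Scene calls are commented out, since they need real LI L2 files on disk; 'li_l2_nc' is the Satpy reader name from this PR):

```python
# Opting out of the default gridding via the Scene reader_kwargs interface,
# to keep the accumulated products as 1-d arrays.
reader_kwargs = {"with_area_definition": False}

# from satpy import Scene
# scn = Scene(filenames=["<LI-2-AF file>.nc"], reader="li_l2_nc",
#             reader_kwargs=reader_kwargs)
# scn.load([...])  # datasets then come back as 1-d arrays
```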

For these reasons, we think this is a justified exception to the "reader doesn't modify data" guideline: it applies an operation that is expected to be performed anyway for the product to be usable, improving user-friendliness.

Note that the LI L2 accumulated products have not been widely distributed so far, and attention on them will really start with the LI pre-operational dissemination (hopefully in a couple of months' time), so we think this is a good time to introduce the change.

  • Tests added
  • Fully documented

@ameraner ameraner self-assigned this Apr 22, 2024
@ameraner ameraner added the enhancement, documentation, component:readers and backwards-incompatibility labels Apr 22, 2024
codecov bot commented Apr 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.94%. Comparing base (4faccbb) to head (98993a4).
Report is 112 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2789      +/-   ##
==========================================
+ Coverage   95.93%   95.94%   +0.01%     
==========================================
  Files         377      379       +2     
  Lines       53541    53693     +152     
==========================================
+ Hits        51362    51515     +153     
+ Misses       2179     2178       -1     
Flag Coverage Δ
behaviourtests 4.09% <0.00%> (-0.01%) ⬇️
unittests 96.04% <100.00%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.


@coveralls commented Apr 22, 2024

Pull Request Test Coverage Report for Build 8802581731

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 0.0%

Totals Coverage Status
Change from base Build 8787439081: 0.0%
Covered Lines: 0
Relevant Lines: 0


@sjoro (Collaborator) left a comment
Thank you @ameraner for the PR! Tiny suggested editorials for the docstrings, otherwise LGTM.

satpy/readers/li_l2_nc.py — review thread resolved
satpy/readers/li_l2_nc.py — review thread resolved (outdated)
@sjoro (Collaborator) commented Apr 23, 2024

I agree with @ameraner and have discussed this issue with him earlier, so I'm good with this change.

@pnuu (Member) left a comment

LGTM.

@gerritholl (Collaborator) commented

I fully support that this should be true by default. We are already modifying the data in most readers, because by default, we use calibration=radiance or calibration=brightness_temperature, not calibration=counts, which would be the data as they are in the file.

But I'm not sure if this should be a reader kwarg, or a DataID key (like for calibration or polarisation). Probably I'm rather late with this thought because the gridding feature already exists, just isn't activated by default.

@pnuu (Member) commented Apr 23, 2024

activating the automatic regridding by default avoids the risk of users resampling the 1-d points to a 2-d grid with methods that can introduce artifacts (e.g. nearest or bilinear that introduce "pixel bleeding" effects)

How big of an issue do you think the pixel bleeding will be when resampling the 2D array to a local area definition? Or should that be ideally done from the non-gridded data (LFL)?

@sjoro sjoro self-requested a review April 23, 2024 14:56
@sjoro (Collaborator) left a comment
LGTM!

@ameraner (Member, Author) commented

How big of an issue do you think the pixel bleeding will be when resampling the 2D array to a local area definition? Or should that be ideally done from the non-gridded data (LFL)?

That is indeed a good point, and it's a bit tricky... My understanding:

The accumulated products show the extent of the measured lightning events in the FCI grid. Ideally, therefore, one would stay in the native geostationary projection (possibly using the native resampler to up/downscale the grid if needed) to avoid resampling artefacts. If the projection needs to be changed, any of the existing resampling algorithms will end up changing the area covered by the lightning (since we map pixels between grids using only the pixel centre position and not the pixel footprint area; correct me if I'm wrong here). The bucket resampler does the best job, in that it at least avoids the pixel bleeding effect, which could artificially increase the area covered by the lightning and possibly cause a wrong interpretation of the scene.

For the point products: the group and flash products represent spatial and temporal clusters of events; however, we only get one lat-lon coordinate for each group or flash (the average location of the group/flash), so we lose the spatial extent information anyway. For this reason, I think the bucket resampler also does the best job in this case, since it maps one group/flash to exactly one pixel in the target grid, so to speak maintaining the "lat-lon marker" functionality.
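The "one group/flash to exactly one pixel" behaviour of a bucket-count approach can be illustrated with a toy target grid (pure NumPy with invented coordinates on an equirectangular grid; the real pyresample bucket resampler works on arbitrary area definitions):

```python
import numpy as np

# Toy target grid: 10x10 pixels covering lon 0..10 deg, lat 0..10 deg.
nrows, ncols = 10, 10
lon0, lat0, res = 0.0, 10.0, 1.0  # upper-left corner and pixel size (deg)

# Hypothetical flash centroid locations (lon, lat).
lons = np.array([0.5, 3.2, 3.4, 9.9])
lats = np.array([9.5, 6.1, 6.2, 0.1])

# Each point falls into exactly one bucket (pixel): no bleeding into
# neighbouring pixels, unlike nearest-neighbour with a search radius.
cols = ((lons - lon0) // res).astype(int)
rows = ((lat0 - lats) // res).astype(int)

counts = np.zeros((nrows, ncols), dtype=int)
np.add.at(counts, (rows, cols), 1)  # accumulate per-pixel counts
```

Note that the two nearby points land in the same pixel and are counted there, while every other pixel stays untouched.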

I hope this is understandable/makes some sense. Otherwise I'm happy to discuss this further with examples and in front of a flipchart, e.g. at the user days :) What is sure is that there's still a lot to be experimented with using this data!

@pnuu (Member) commented Apr 23, 2024

Thanks @ameraner for the insight! I just today got my MTG/LI processing chain completely functional (needed the new Satpy version for S3 support), and I'm using resampler='bucket_count' to aggregate the LFL data to a 4 km grid. I'm putting the resulting images to WMS so they can be used as an overlay with any data.

I'll include my LI setup in my PUD presentation in June.

@sjoro (Collaborator) commented Apr 24, 2024

@pnuu it can be really significant; we see this when visualising GII BUFR data, which is essentially "point data" and needs to be put on an MSG grid before viewing the result. I did the first experiments with nearest-neighbour resampling, and it all depends on the search radius: too big and we get round blobs of data bleeding into neighbouring pixels, too small and we get gaps in the data. Bucket resampling does the best job, or simply converting the given lon-lats to rows and columns first and using those to index the data points into the correct locations. We know from encoding the GII binary (gridded) product to BUFR that each GII data point has only one location where it should go.
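The direct lon/lat-to-row/col indexing described here is essentially the scatter below (a toy sketch with an invented 0.5-degree equirectangular grid and made-up point values; real MSG grids are geostationary, so the actual conversion would go through the projection):

```python
import numpy as np

# Toy 0.5-degree grid covering lon 0..5 deg, lat 0..5 deg.
res = 0.5
nrows = ncols = 10
lat_top, lon_left = 5.0, 0.0

def to_row_col(lon, lat):
    """Convert a lon/lat to the single grid cell it belongs to."""
    col = int((lon - lon_left) // res)
    row = int((lat_top - lat) // res)
    return row, col

# Hypothetical point data: (lon, lat, value) triplets.
points = [(0.2, 4.9, 7.0), (2.6, 2.4, 3.5)]

grid = np.full((nrows, ncols), np.nan)
for lon, lat, val in points:
    r, c = to_row_col(lon, lat)
    grid[r, c] = val  # each point goes to exactly one location
```

In contrast to radius-based nearest-neighbour search, there is no tuning parameter here: the mapping is fully determined by the grid geometry.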

Labels: backwards-incompatibility, component:readers, documentation, enhancement

5 participants