Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add decimal rounding to geostationary area utilities for better consistency #2573

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

djhoese
Copy link
Member

@djhoese djhoese commented Sep 13, 2023

As discussed a little in #2560, the AHI HSD reader (which uses the geos utilities) produces inconsistent area extents between resolutions (usually not between time steps). In the ABI readers I put extra work into rounding the calculations so that they produced the same output between resolutions. I did a quick test at a few points in the geos utilities to see if something similar could be done for AHI.

The numbers going in to these calculations are all pretty simple and float friendly (ex. 0.5), but then there is a 2**16 thrown in there, then I tried rounding some things and it didn't really change much. It wasn't until I rounded extents near the np.deg2rad that things started being a little more consistent. For AHI I saw the best results rounding after np.deg2rad and found that rounding to 7 decimal places was still producing consistent results. What I don't like is changing the number of decimal places can change the results of the Y dimension 10s of meters for each segment.

Anyway, here's the results of this PR:

Area ID: FLDK
Description: AHI FLDK area
Projection ID: geosh8
Projection: {'a': '6378137', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257024882273', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 11000
Number of rows: 11000
Area extent: (-5500000.8562, -5500000.8562, 5500000.8562, 5500000.8562)
Area ID: FLDK
Description: AHI FLDK area
Projection ID: geosh8
Projection: {'a': '6378137', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257024882273', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 22000
Number of rows: 22000
Area extent: (-5500000.8562, -5500000.8562, 5500000.8562, 5500000.8562)
Area ID: FLDK
Description: AHI FLDK area
Projection ID: geosh8
Projection: {'a': '6378137', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257024882273', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 5500
Number of rows: 5500
Area extent: (-5500000.8562, -5500000.8562, 5500000.8562, 5500000.8562)

And for the record, here's what main produces:

Area ID: FLDK
Description: AHI FLDK area
Projection ID: geosh8
Projection: {'a': '6378137', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257024882273', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 11000
Number of rows: 11000
Area extent: (-5500000.0355, -5500000.0355, 5500000.0355, 5500000.0355)
Area ID: FLDK
Description: AHI FLDK area
Projection ID: geosh8
Projection: {'a': '6378137', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257024882273', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 22000
Number of rows: 22000
Area extent: (-5499999.9684, -5499999.9684, 5499999.9684, 5499999.9684)
Area ID: FLDK
Description: AHI FLDK area
Projection ID: geosh8
Projection: {'a': '6378137', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257024882273', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 5500
Number of rows: 5500
Area extent: (-5499999.9012, -5499999.9012, 5499999.9012, 5499999.9012)
  • Closes #xxxx
  • Tests added
  • Fully documented
  • Add your name to AUTHORS.md if not there already

@djhoese
Copy link
Member Author

djhoese commented Sep 14, 2023

I saw AGRI reader failures so tested with a real data case. Here's the old areas:

Area ID: DISK_1000m
Description: FY-4 DISK area
Projection ID: FY-4, 1000m
Projection: {'a': '6378140', 'h': '35786000', 'lon_0': '104.699996948242', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257223562999', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 10992
Number of rows: 10992
Area extent: (-5496021.076, -5496021.076, 5496021.076, 5496021.076)
Area ID: DISK_500m
Description: FY-4 DISK area
Projection ID: FY-4, 500m
Projection: {'a': '6378140', 'h': '35786000', 'lon_0': '104.699996948242', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257223562999', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 21984
Number of rows: 21984
Area extent: (-5496021.0089, -5496021.0089, 5496021.0089, 5496021.0089)
Area ID: DISK_2000m
Description: FY-4 DISK area
Projection ID: FY-4, 2000m
Projection: {'a': '6378140', 'h': '35786000', 'lon_0': '104.699996948242', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257223562999', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 5496
Number of rows: 5496
Area extent: (-5496021.2103, -5496021.2103, 5496021.2103, 5496021.2103)

New:

Area ID: DISK_1000m
Description: FY-4 DISK area
Projection ID: FY-4, 1000m
Projection: {'a': '6378140', 'h': '35786000', 'lon_0': '104.699996948242', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257223562999', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 10992
Number of rows: 10992
Area extent: (-5496021.0372, -5496021.0372, 5496021.0372, 5496021.0372)
Area ID: DISK_500m
Description: FY-4 DISK area
Projection ID: FY-4, 500m
Projection: {'a': '6378140', 'h': '35786000', 'lon_0': '104.699996948242', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257223562999', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 21984
Number of rows: 21984
Area extent: (-5496021.0372, -5496021.0372, 5496021.0372, 5496021.0372)
Area ID: DISK_2000m
Description: FY-4 DISK area
Projection ID: FY-4, 2000m
Projection: {'a': '6378140', 'h': '35786000', 'lon_0': '104.699996948242', 'no_defs': 'None', 'proj': 'geos', 'rf': '298.257223562999', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 5496
Number of rows: 5496
Area extent: (-5496021.0372, -5496021.0372, 5496021.0372, 5496021.0372)

So now these are consistent as well.

@djhoese
Copy link
Member Author

djhoese commented Sep 14, 2023

I had one SEVIRI HRIT case on my local machine. Old:

Area ID: msg_seviri_fes_1km
Description: MSG SEVIRI Full Earth Scanning service area definition with 1 km resolution
Projection: {'a': '6378169', 'h': '35785831', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'rf': '295.488065897014', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 11136
Number of rows: 11136
Area extent: (5567747.741, 5569748.4127, -5569748.4127, -5567747.741)
Area ID: msg_seviri_fes_3km
Description: MSG SEVIRI Full Earth Scanning service area definition with 3 km resolution
Projection: {'a': '6378169', 'h': '35785831', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'rf': '295.488065897014', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 3712
Number of rows: 3712
Area extent: (5568748.2834, 5568748.6867, -5568748.6867, -5568748.2834)

New:

Area ID: msg_seviri_fes_1km
Description: MSG SEVIRI Full Earth Scanning service area definition with 1 km resolution
Projection: {'a': '6378169', 'h': '35785831', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'rf': '295.488065897014', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 11136
Number of rows: 11136
Area extent: (5567746.247, 5569749.1062, -5569749.1062, -5567746.247)
Area ID: msg_seviri_fes_3km
Description: MSG SEVIRI Full Earth Scanning service area definition with 3 km resolution
Projection: {'a': '6378169', 'h': '35785831', 'lon_0': '0', 'no_defs': 'None', 'proj': 'geos', 'rf': '295.488065897014', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 3712
Number of rows: 3712
Area extent: (5568748.2503, 5568747.1029, -5568747.1029, -5568748.2503)

So 1.5 meters for the HRV area I think that is? And 0.03 meters for the 3km bands. I do notice that in the new version the HRV area is actually further shifted from the 3km area than it was in the old one.

@mraspaud
Copy link
Member

I'd like @ameraner to confirm for the seviri part.

@ameraner
Copy link
Member

ameraner commented Oct 9, 2023

Unfortuately, I'm not too happy about this, tbh. It causes some issues:

  • the area extent computation seems to be a delicate topic in the community, so in the past we have put quite some effort to make sure that the SEVIRI/FCI area extents returned by the readers are at the best possible accuracy and following the exact instructions in the user guides, to be on the safe side when users ask "why is my area extent different than yours?". This PR adds a rounding step in a position that is not so easily traceable and explainable..
  • Since the usage of the make_ext is not consistent across readers, this causes inconsistencies. For example the area extents of the SEVIRI HRIT will diverge more from the ones computed from the SEVIRI native of NetCDF format. Similar thing for FCI, the area extents will be different if computed in the L1 reader or L2 reader.
  • It also raises the question of what area extents we should put in the areas.yaml... should it be the highest possible accuracy or the same as returned by the reader?

We actually have a similar issue as the one you're trying to solve in the case of the FCI grids, where different resolutions have slightly (in the order of millimeters) different area extents only because of numerical precision constraints (multiplying angle_step*nb_pixels is not the same as (angle_step/2) * (nb_pixels*2)). Our approach was to keep the area extents as they are computed using the values provided in the files for each resolution, following the approach "keep what the data gives you", even though we could in principle quantify which resolution produces the most accurate extents and use these across the grids.
I would be open to change the approach on this, if having consistent extents helps some applications (what is actually your motivation for homogenising this? is some satpy/pyresample functionality disturbed by this?), particularly since we're talking mostly about sub-meter changes that for geos do not really matter in a practical sense. But I would personally not solve it with a generic rounding approach, but rather case-by-case using a best numerical precision rationale.

Bottom line, I'm not against your changes in general, but I would also endorse @sfinkens idea in slack to make this optional, and then let the developer of each reader decide how the computation should be handled case-by-case, considering different format issues etc. of the specific instrument.

@djhoese
Copy link
Member Author

djhoese commented Oct 9, 2023

A suggestion from @sfinkens on slack was to make the rounding optional. I think this can be doable for this particular utility function and only be used for AHI at the moment. I don't think it should be a reader-level kwarg. To me it is more a problem that AHI isn't consistent and the unfortunate side effect of this is that AHI passes its inconsistent parameters to a shared utility function. Rounding these parameters produced way too much of a difference if I recall correctly so the rounding went into the utility function. I'll see if I have time to move to this optional rounding keyword argument solution.

@djhoese
Copy link
Member Author

djhoese commented Oct 9, 2023

Oops I didn't see your big comment @ameraner until I had already posted my previous one. As I eluded to in my comment, I am OK handling this in a reader by reader basis. I understand (and feel similarly about) that putting a "random" rounding operation in the middle of the math on calculating extents just seems wrong and like cheating just to get the answer I want.

As for why I want these to be consistent: I think my main motivation is the theoretical extents of the data/instrument. For ABI (and AHI and similar instruments) the footprint of a coarse resolution pixel is meant to (at least my understanding) align perfectly with the footprints of multiple finer resolution pixels. For example, 4 500m pixels fit into one 1km pixel. That is, with the same exact extents. When I updated the ABI L1b reader to use some level of rounding it was to match this theoretical expectation: the pixels should be aligned between resolutions and the center point of the geostationary reference disk is perfectly half way between the edges of the image. For ABI this was a decently easy argument to make because for whatever reason the X and Y coordinate variables were given 32-bit float scale factors and offsets which are insufficient to accurately describe the coordinates across a full disk grid using the advertised operations (data * factor + offset). For AHI HSD, it might not be that easy of an excuse.

As for my concerns: I worry that equality checks and boundary operations will fail between resolutions when put to the test. For the native resampler and maybe others we worked around this iirc by checking of the extents of two area definitions were "close enough". Additionally, having them be inconsistent means that per-pixel resolutions are different too. And that doing any operation to convert from one area to another will strongly depend on which area you use as a starting point. For example, producing a 4km reduced resolution area definition will have different extents depending on if we start with 500m or 1km or 2km versions of the instrument's area. And if we went from 500m to compute 1km (say in an aggregation step), then there is a very good chance that the aggregated 500m data on a 1km area will not be the same as the 1km area of the data loaded from the data files.

@djhoese djhoese requested a review from sfinkens as a code owner October 9, 2023 19:01
@djhoese
Copy link
Member Author

djhoese commented Oct 9, 2023

Ok I think the tests should be updated now, but the more and more I work on this the less pleased I am. I never disagreed with you guys (@simonrp84 and @ameraner) I just want consistent extents. I just don't know enough about the math I think to know what's the most correct way to get what I want (rounding an offset feels a lot better than rounding an intermediate radian angle).

@codecov
Copy link

codecov bot commented Oct 9, 2023

Codecov Report

Merging #2573 (2e12275) into main (f018f12) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #2573   +/-   ##
=======================================
  Coverage   94.92%   94.92%           
=======================================
  Files         352      352           
  Lines       51225    51233    +8     
=======================================
+ Hits        48624    48632    +8     
  Misses       2601     2601           
Flag Coverage Δ
behaviourtests 4.27% <0.00%> (-0.01%) ⬇️
unittests 95.54% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
satpy/readers/_geos_area.py 100.00% <100.00%> (ø)
satpy/readers/ahi_hsd.py 99.29% <100.00%> (ø)
satpy/tests/reader_tests/test_ahi_hsd.py 99.68% <100.00%> (ø)

@djhoese djhoese force-pushed the bugfix-consistent-ahi-extents branch from 3ea3ef5 to 2e12275 Compare October 9, 2023 20:10
@coveralls
Copy link

Pull Request Test Coverage Report for Build 6461251984

  • 16 of 16 (100.0%) changed or added relevant lines in 3 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.001%) to 95.493%

Totals Coverage Status
Change from base Build 6426189025: 0.001%
Covered Lines: 48755
Relevant Lines: 51056

💛 - Coveralls

Copy link
Member

@ameraner ameraner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just an optional improvement suggestion for the docstrings.

And noted regarding your rationale, makes sense. We'll consider doing the same as for FCI, as described above.

Comment on lines +53 to +54
round: Round values by 7 decimal places to avoid inconsistencies
with floating point calculations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion, to improve clearness

Suggested change
round: Round values by 7 decimal places to avoid inconsistencies
with floating point calculations.
round: Round intermediate area extent angle values by 7 decimal places
to avoid inconsistencies with floating point calculations.

(the same to the get_area_extent docstring).

I also notice now that the docstring here is wrong, since the input values are in deg instead of the documented m.

@@ -41,7 +41,7 @@ def get_xy_from_linecol(line, col, offsets, factors):
return x__, y__


def make_ext(ll_x, ur_x, ll_y, ur_y, h):
def make_ext(ll_x, ur_x, ll_y, ur_y, h, round=False):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about passing the number of decimals? Like

def make_ext(..., round_decimals=None):
    ...

my_ext = make_ext(..., round_decimals=4)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a huge fan of this because the rounding is technically in an arbitrary location. Putting the rounding somewhere else results in different results. The generic round kwarg was meant to convey "don't worry about it, we'll round where and how we have to".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so in this case how about calling it something more along the lines of align_corners or eg snap_extents_to_closest_meter? Because that's the goal, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eh I wish it was as simple as getting to the nearest meter, but in tests that's not what is going on. In ABI the pixel resolution is not a nice round metered number, it is something like 1015.9 meters for the 1km data (this is off the top of my head, no idea if that's accurate). For AHI, I'm not sure it follows the same assumptions/design as ABI down to this level of detail. @simonrp84 might know more details, but I know I've had trouble googling it in the past. If we had a theoretical goal for what the extents would be then we could be more specific about numbers being passed in: the pixels should be 1002.15 meters so the left extent should be (num_cols / 2 * -1002.15).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I never did change the keyword argument to take an integer number of decimal places and I don't think we ever decided on a name for that keyword argument that I considered accurate. As mentioned above I'm not rounding to the nearest/closest meter and I'm not necessarily aligning corners although that is kind of the goal. I think there is conflicting hopes for this functionality.

On one side we don't want a generic non-flexible hardcoded rounding that may be sensor/reader specific. On the other side of the argument, we might not want developers to have to think too hard about all this decimal/rounding number "fudging" so a generic "do the rounding" argument would be nice...but maybe we do want developers to have to think about this even if it is hacky.

Copy link
Member

@mraspaud mraspaud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, I think it's a good idea to let the user provide the number of decimals as @sfinkens is proposing.

@mraspaud
Copy link
Member

mraspaud commented Jan 9, 2024

Should we merge this? It looks like giving more flexibility to the rounding might not be so good after all, so I think we can merge this as soon as the conflicts are resolved.

@djhoese
Copy link
Member Author

djhoese commented Jan 10, 2024

I actually forgot about this. I think the conversation on slack led me to think that this wasn't a good idea. Maybe I'm misremembering.

@sfinkens
Copy link
Member

I think it's useful

@mraspaud
Copy link
Member

@djhoese I'll let you fix the conflicts and press the merge button

@mraspaud mraspaud added this to the v0.48.0 milestone Apr 17, 2024
@djhoese
Copy link
Member Author

djhoese commented Apr 17, 2024

@mraspaud as mentioned in the above comments, my understanding was that as-is this isn't good enough.

@mraspaud
Copy link
Member

Ok, should we close this for now then? Or convert it to draft?

@djhoese djhoese marked this pull request as draft April 17, 2024 14:54
@djhoese
Copy link
Member Author

djhoese commented Apr 17, 2024

Converted to draft.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants