POC: Handle std names and aliases (#5257) #5313

Open
wants to merge 8 commits into
base: main

larsbarring
Contributor

🚀 Pull Request

Description

This is a "proof-of-concept" PR to address how to better handle standard name aliases. It consists of the following elements:

  • Reorganising iris/std_names.py slightly so that it has a separate dict for aliases (generated by the updated tools/generate_std_names.py).
    It now also includes some table version information (see #5255, 'CF global attribute "Conventions": expose to users, and include CF standard name table version') and a separate dict for the standard name descriptions (optional at generation time).
  • Adding a new std_name_table.py containing the following functions:
    • get_convention -- return a tentative Conventions string
    • set_alias_processing -- define how to handle aliases:
      "keep" - current behaviour, treat aliases in the same way as currently valid standard names,
      "warn" - issue a warning (default), otherwise as "keep",
      "replace" - silently update aliases to current standard names.
    • get_description -- return the standard name description if available
    • check_valid_standard_name -- check if a name is a standard name or an alias, and do the translation if requested as defined by set_alias_processing
  • std_name_table is [naively] imported in iris.__init__
  • common/mixin._get_valid_standard_name is modified to use check_valid_std_name
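
The three alias modes listed above can be illustrated with a minimal sketch. It is not the code in this PR: the table contents, the alias `made_up_old_name`, and the module-level `_alias_mode` variable are assumptions made purely for illustration, and only the function names and mode strings are taken from the description.

```python
import warnings

# Hypothetical stand-ins for the generated tables. The real tables in
# iris/std_names.py are generated from the CF XML file and may be laid
# out differently; "made_up_old_name" is not a real CF alias.
STD_NAMES = {"air_temperature": {"canonical_units": "K"}}
ALIASES = {"made_up_old_name": "air_temperature"}

_VALID_MODES = ("keep", "warn", "replace")
_alias_mode = "warn"  # "warn" is the default mode named in the description


def set_alias_processing(mode):
    """Select how aliases are handled: 'keep', 'warn' or 'replace'."""
    global _alias_mode
    if mode not in _VALID_MODES:
        raise ValueError(f"mode must be one of {_VALID_MODES}, got {mode!r}")
    _alias_mode = mode


def check_valid_standard_name(name):
    """Return the name to use, or raise ValueError if it is unknown."""
    if name in STD_NAMES:
        return name
    if name in ALIASES:
        if _alias_mode == "replace":
            # Silently translate the alias to the current standard name.
            return ALIASES[name]
        if _alias_mode == "warn":
            warnings.warn(
                f"{name!r} is an alias; the current standard name is "
                f"{ALIASES[name]!r}",
                stacklevel=2,
            )
        # "keep" and "warn" both leave the alias unchanged.
        return name
    raise ValueError(f"{name!r} is not a valid standard name or alias")
```

With this sketch, `set_alias_processing("replace")` followed by `check_valid_standard_name("made_up_old_name")` returns `"air_temperature"`, while in "keep" or "warn" mode the alias is returned unchanged.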

No unit tests have been added (it would be good to first get some feedback on whether this POC is a reasonable approach).


Consult Iris pull request check list

@codecov

codecov bot commented May 11, 2023

Codecov Report

Attention: Patch coverage is 45.00000% with 33 lines in your changes are missing coverage. Please review.

Project coverage is 89.19%. Comparing base (a3931f6) to head (d2216e7).
Report is 300 commits behind head on main.

Files Patch % Lines
lib/iris/std_name_table.py 35.29% 30 Missing and 3 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5313      +/-   ##
==========================================
- Coverage   89.31%   89.19%   -0.13%     
==========================================
  Files          89       90       +1     
  Lines       22375    22430      +55     
  Branches     5368     5383      +15     
==========================================
+ Hits        19985    20007      +22     
- Misses       1640     1670      +30     
- Partials      750      753       +3     


cf-convention/discuss#229
Standard names: hard to spot "typo" in
`surface_upward_mass_flux_of_methane_due_to_emission_from_fires`
    * new optional argument -d/--descr for including standard name descriptions
    * including the description of all standard names
    * including standard name table version and related info.

    * now contains several variables:
      - variables (starting with underscore) used in the processing
      - VERSION (dict)
      - CONVENTIONS_STRING (str)
      - STD_NAME (dict)
      - ALIASES (dict)
      - optionally DESCRIPTION (dict)

    harmonize quotation marks to single quotes in description string
@CLAassistant

CLAassistant commented Apr 8, 2024

CLA assistant check
All committers have signed the CLA.

@pp-mo
Member

pp-mo commented May 1, 2024

@SciTools/peloton is this still a live issue, and if so, could you sign the CLA @larsbarring ?

@larsbarring
Contributor Author

I thought I had already signed the CLA because of one or two previous minor contributions. Anyway, now done. Whether it is alive or not I am not sure. I have not made any further effort since back then as I am not sure whether it is a reasonable approach in the context of Iris.

@pp-mo
Member

pp-mo commented May 1, 2024

I thought I had already signed the CLA because of one or two previous minor contributions. Anyway, now done. Whether it is alive or not I am not sure. I have not made any further effort since back then ...

Ok thanks!

I am not sure whether it is a reasonable approach in the context of Iris.

Well, I guess we'll take a look + see about it
@larsbarring are you at least clear that something like this would still be useful ?

@larsbarring
Contributor Author

Yes, I think that this would still be useful. In the context of CF, an aliased standard name is typically regarded as deprecated. Software should be able to read data that has an aliased standard name, but new data should use the replacement name. Obviously, there are judgements to be made here regarding how to deal with this in practice. But an aliased standard name should not be considered just an alternative at the same level as a standard name.

Also, note that there will likely be some minor changes to the standard name xml file format as of next version, possibly also backported to all previous versions.

@schlunma
Contributor

schlunma commented May 2, 2024

A proper way to handle standard name aliases would also be useful for ESMValTool, see ESMValGroup/ESMValCore#1985.

One issue we currently face is merging cubes with different standard names, which is (in the current iris version) not even allowed if the standard names are aliases of each other.
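
One possible way around this, sketched under assumptions and not taken from Iris or ESMValCore, is to map every name to its current standard name before comparing. The `ALIASES` dict and the helpers `canonical_name` and `same_quantity` are hypothetical, and `made_up_old_name` is not a real CF alias.

```python
# Hypothetical alias table mapping deprecated names to current ones.
ALIASES = {"made_up_old_name": "air_temperature"}


def canonical_name(name):
    """Return the current standard name for an alias, else the name itself."""
    return ALIASES.get(name, name)


def same_quantity(name_a, name_b):
    """True if both names refer to the same current standard name."""
    return canonical_name(name_a) == canonical_name(name_b)
```

Here `same_quantity("made_up_old_name", "air_temperature")` is true, so two datasets that differ only by an aliased name could have their names harmonised to `canonical_name(...)` before a merge is attempted.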

@bouweandela
Member

@LisaBock this may be of interest to you.


5 participants