Different recognized coordinates when calling variable by standard name #357

Open
kthyng opened this issue Aug 9, 2022 · 13 comments
Labels
documentation Improvements or additions to documentation

Comments

@kthyng
Contributor

kthyng commented Aug 9, 2022

I know this isn't good form, but I am going to describe my problem to see if anyone has an idea of a direction to go, without a good example to start. I am using several libraries together and making an example case seems difficult.

Here is the base question though:
ds['zeta'].cf['longitude'] worked but ds.cf['sea_surface_height_above_mean_sea_level'].cf['longitude'] did not

In other words, when I used the variable name to access a variable in my dataset, cf-xarray knew the mapping for longitude. But when I referred to the variable by its standard name, which cf-xarray recognized, cf-xarray did not know the mapping for longitude. This seems weird, right? Any ideas of what could be wrong?

@dcherian
Contributor

dcherian commented Aug 9, 2022

If longitude is not a dimension coordinate, I think you'll need longitude in the coordinates attribute of ds.zeta. I bet ds.cf["zeta"].cf["longitude"] also does not work?

Can you add this to the FAQ if you have time: https://cf-xarray.readthedocs.io/en/latest/faq.html?

@kthyng
Contributor Author

kthyng commented Aug 9, 2022

Yes ds.cf["zeta"].cf["longitude"] also did not work.

You should have an autoresponder for my questions that says it is always the coordinates attribute. And I'll try to add to the FAQ tomorrow!

@kthyng
Contributor Author

kthyng commented Aug 10, 2022

Should the coordinates attribute trump everything else for interpreting metadata? It is hidden from view (tucked into .encoding) and I didn't know about it until I started using cf-xarray, which is why I always forget about it. I couldn't find it in the docs at all so far, though there are a lot of mentions of "coordinates" so I might have missed it. I find it very tricky, whereas interpreting standard_names, units, etc. seems much more straightforward.

@dcherian
Contributor

To attach coordinate variables when you pull out a DataArray, Xarray checks whether each coordinate's dimensions are a subset of the variable's dimensions (roughly set(coord.dims) <= set(var.dims)). cf-xarray is more "clever" and parses the coordinates and ancillary_variables attributes. You're right, it would be good to improve the docs on this point.

Comparing popds.cf["UVEL"] and popds["UVEL"] should be useful (popds is in cf_xarray.datasets). The first will have ULAT, ULONG which is what we really want, the second will have ULAT, ULONG, TLAT, TLONG because all of those variables have the same dimensions.
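A plain-Python sketch of the two attachment rules described above may help. It does not use xarray or cf-xarray: the dictionaries only mimic the layout of popds, and the dimension names nlat and nlon are assumptions made for illustration.

```python
# Stand-in for a dataset: variable name -> dimension names.
# The dimension names ("nlat", "nlon") are assumed for illustration.
dims = {
    "UVEL": ("nlat", "nlon"),
    "ULAT": ("nlat", "nlon"),
    "ULONG": ("nlat", "nlon"),
    "TLAT": ("nlat", "nlon"),
    "TLONG": ("nlat", "nlon"),
}
coord_names = {"ULAT", "ULONG", "TLAT", "TLONG"}
attrs = {"UVEL": {"coordinates": "ULONG ULAT"}}


def attach_by_dims(var):
    """Dimension rule: keep every coordinate whose dims are a subset of var's dims."""
    return sorted(c for c in coord_names if set(dims[c]) <= set(dims[var]))


def attach_by_attribute(var):
    """Attribute rule: keep only names listed in var's `coordinates` attribute."""
    listed = attrs[var].get("coordinates", "").split()
    return sorted(name for name in listed if name in dims)


print(attach_by_dims("UVEL"))       # all four 2D coordinates match
print(attach_by_attribute("UVEL"))  # only the U-grid coordinates are listed
```

Under the dimension rule every 2D coordinate matches UVEL, so the T-grid coordinates come along too. Under the attribute rule only the names listed in coordinates are attached.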

> It is hidden from view (tucked into .encoding)

This is where a nice HTML repr for .cf would be awesome. It would just show all CF attributes from both .attrs and .encoding.
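As a rough sketch of what such a combined view could collect, the hypothetical helper below merges two plain dictionaries standing in for .attrs and .encoding. The helper name, the short key list, and the sample values are assumptions for illustration; none of this is part of cf-xarray.

```python
from collections import ChainMap

# A small, illustrative subset of CF attribute names (not exhaustive).
CF_KEYS = ("standard_name", "units", "coordinates", "ancillary_variables")


def cf_attributes(attrs, encoding):
    """Collect CF-related keys from both mappings; attrs win on conflict."""
    merged = ChainMap(attrs, encoding)
    return {key: merged[key] for key in CF_KEYS if key in merged}


# Sample values are made up; "coordinates" sits in encoding, as it does
# after decoding with decode_coords=True.
attrs = {"standard_name": "sea_surface_height_above_mean_sea_level"}
encoding = {"coordinates": "lon_rho lat_rho", "dtype": "float32"}

print(cf_attributes(attrs, encoding))
```

Because .attrs and .encoding on a DataArray are ordinary dictionaries, the same function could be called as cf_attributes(ds["zeta"].attrs, ds["zeta"].encoding).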

@kthyng
Contributor Author

kthyng commented Aug 10, 2022

How is the coordinates attribute set originally? Is it subsequently modified? I'm not sure what should be in it for any given variable. For example, zeta in xr.tutorial.open_dataset('ROMS_example.nc') has 'coordinates': 'lon_rho hc h Vtransform lat_rho' though I had thought it would be ocean_time lat_rho lon_rho.

@dcherian
Contributor

> How is the coordinates attribute set originally? Is it subsequently modified?

It's set in the netCDF dataset and is never modified. Xarray moves it to encoding if decode_coords=True in open_*dataset.

That ROMS attribute looks incomplete IMO: hc, h, Vtransform for zeta seems weird but I guess those are needed to calculate it? Including the names of dimensions in coordinates is unnecessary IIRC but it's not wrong to do so.
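To make that reading of the ROMS attribute concrete, here is a small stdlib-only sketch that compares a coordinates string against the names one expected. The function name is hypothetical, and the dimensions given for zeta are an assumption about the ROMS example file.

```python
def audit_coordinates(coordinates, var_dims, expected):
    """Compare a `coordinates` attribute string with the names one expected."""
    listed = coordinates.split()
    return {
        # Expected names that are neither listed nor dimensions of the variable.
        # Dimension names are skipped because listing them is unnecessary.
        "missing": [n for n in expected if n not in listed and n not in var_dims],
        # Dimension names that were listed anyway (harmless, but redundant).
        "dimension_names_listed": [n for n in listed if n in var_dims],
        # Listed names that were not expected.
        "extra": [n for n in listed if n not in expected],
    }


report = audit_coordinates(
    "lon_rho hc h Vtransform lat_rho",             # attribute quoted in the thread
    var_dims=("ocean_time", "eta_rho", "xi_rho"),  # assumed dims of zeta
    expected=("ocean_time", "lat_rho", "lon_rho"),
)
print(report)
```

With these inputs nothing is reported as missing, since ocean_time is a dimension of zeta, and hc, h, and Vtransform show up as the unexpected extras.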

@kthyng
Contributor Author

kthyng commented Aug 10, 2022

I have gotten stuck in the past where the coordinates are either missing or wrong. One time this came up, I think, was when calculating the z coordinates, which should then be included in the coordinates attribute, shouldn't they?

@dcherian
Contributor

Yes, the 4D z variable is a great use for the coordinates attribute.

@kthyng
Contributor Author

kthyng commented Aug 10, 2022

I found it too difficult to consistently modify the coordinates attribute to include the new z variables so I removed the coordinates attribute from everything, but I think that is not a reliable fix either. What do you suggest?

@dcherian
Contributor

I think the only option is to keep coordinates up to date (if you want to propagate that z variable, for example).

This might be another use case for #253: automatically updating the coordinates attribute for various operations.

So ds.sel(latitude=4, drop=True) would remove the name of the latitude variable from coordinates attributes on all DataArrays where applicable.
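Until something like that exists, the bookkeeping has to be done by hand. The two helpers below are a minimal sketch of it on the attribute string itself. They are hypothetical, not part of xarray or cf-xarray, and the variable name z_rho is made up for illustration.

```python
def add_coordinate_names(coordinates, *names):
    """Append names that are not already listed, preserving the existing order."""
    listed = coordinates.split()
    for name in names:
        if name not in listed:
            listed.append(name)
    return " ".join(listed)


def drop_coordinate_names(coordinates, *names):
    """Remove the given names, for example after a coordinate has been dropped."""
    return " ".join(n for n in coordinates.split() if n not in names)


coords = "lon_rho lat_rho"
coords = add_coordinate_names(coords, "z_rho")     # after computing a new z variable
print(coords)
coords = drop_coordinate_names(coords, "lat_rho")  # after selecting away latitude
print(coords)
```

The result would then be written back to wherever the attribute lives for each affected variable, which may be .attrs or .encoding depending on how the dataset was opened.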

@kthyng
Contributor Author

kthyng commented Aug 11, 2022

Thanks @dcherian! I'll close this but yes I do think updating coordinates attributes would be hugely useful.

@kthyng kthyng closed this as completed Aug 11, 2022
@dcherian dcherian added the documentation Improvements or additions to documentation label Aug 11, 2022
@dcherian
Contributor

dcherian commented Aug 11, 2022

Reopening, since it would be nice to add this to the documentation.

@dcherian dcherian reopened this Aug 11, 2022
@kthyng
Contributor Author

kthyng commented Aug 15, 2022

Sorry I have to kick this down the road a bit, but it's on my to do list.


2 participants