Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endpoint extrapolation not properly handled. #613

Open
Marston opened this issue Jun 15, 2023 · 2 comments
Open

Endpoint extrapolation not properly handled. #613

Marston opened this issue Jun 15, 2023 · 2 comments
Labels

Comments

@Marston
Copy link

Marston commented Jun 15, 2023

I've been using XGCM to transform some UM data from pressure levels to height above mean sea level using the following function:

    def z_interp(self, ida, gmh, clevs):
        """
        Leverages XGCM to do a vertical coordinate transform

        Args:
            ida (xarray): input array to be transformed as a dataarray
            gmh (xarray): input transform array (new coord values on old coord) as dataarray
            clevs (numpy array): 1D vertical grid

        Returns:
            xarray: new ida transformed to new vertical coordinates
        """
        # Xgcm expects the data as a xarray dataset
        ds = xr.Dataset({'gmh': gmh, 'varb': ida})
        # vertical transform setup: coord name to transform, center values for the grid, and this is not periodic
        grid = Grid(ds, coords={'lev': {'center':'lev'}}, periodic=False)
        # transform. Note that the lev is replaced with gmh by the transformation process
        transformed = grid.transform(ds.varb, 'lev', clevs, target_data=ds.gmh, 
                                     method='linear', bypass_checks=True, mask_edges=False)
        # rename the gmh coordinate to lev
        transformed = transformed.rename({"gmh": "lev"})
        # return the output with the same dim order as input
        return transformed.transpose('time','lev','lat','lon')

While I like the speed with which this accomplishes the task, the lowest level (near surface) is not properly represented.
Here's an example of XGCM results at the edge near the surface:
bad_edge_interp

Here's the same data but this time using function:

 def interpolate_column_scipy(self, column_data, G_column, msllevs, kind):
        f = intp1d(G_column, column_data, kind=kind, bounds_error=False, fill_value='extrapolate')
        return f(msllevs)

good_edge_interp

I suspect that XGCM is functioning as it should but I'm not sure. Is there a way to make it extrapolate at the edges?

@jbusecke
Copy link
Contributor

Hi @Marston, thanks for opening this issue.

IIRC the current algorithms do not support extrapolation. There are many methods that one could possibly use to extrapolate data, and implementing these might add maintenance burden.

I wonder if one could work around this using xarray's pad. I would try to pad the data before you transform it (there is a linear extrapolation method I think), and then apply transform. The one caveat on this is that if you are working with dask arrays, this will create multiple chunks along the core dimension (i.e. the vertical), which you would have to deal with.

cc @TomNicholas this might be another reason to run .transform through the grid_ufunc logic?

@Marston
Copy link
Author

Marston commented Jun 21, 2023

Hi @jbusecke
What is IIRC? Where can I read it?
Currently, I'm using scipy to do the job. What scipy has over XGCM is that it is simple to setup and can do other types of interpolations and not just linear, not to mention extrapolation. But it is very slow to do one profile at a time - even if I use vector programming. I'm not familiar with padding the data. I'll have to look into some custom code that can maybe use numba to speed up the processing time, but this is not my preferred path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants