[BUG] Issue in grid_ufunc when original_arg is a dictionary #581

Open
andrewdelman opened this issue Feb 3, 2023 · 2 comments
@andrewdelman

I encountered this apparent bug in v0.8.1 when calling diff_2d_vector. It originates in grid_ufunc.py (around line 1022):

        original_arg_chunks = original_arg.variable.chunksizes

This gives the error:

'dict' object has no attribute 'variable'

original_arg is expected to be an xarray DataArray, but here it is a dictionary whose value is an xarray DataArray. (This does not seem to happen with the regular xgcm diff.) My inelegant workaround, which seems to work, replaces the original_arg_chunks assignment with:

        # Vector calls pass a dict wrapping the DataArray; take its first value.
        if isinstance(original_arg, dict):
            original_arg_chunks = tuple(original_arg.values())[0].variable.chunksizes
        else:
            original_arg_chunks = original_arg.variable.chunksizes
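
The same unwrapping could be pulled into a small helper so it can be reused wherever a component arrives wrapped in a dict. This is only a sketch: `unwrap_component` is a hypothetical name, not part of xgcm, and it assumes each dict holds exactly one component, as in the `{x_axis_name: vector[x_axis_name]}` calls that xgcm makes for vector functions. Stand-in objects replace real DataArrays so the snippet runs without xarray installed.

```python
from types import SimpleNamespace


def unwrap_component(arg):
    """Return the array held by a single-entry {axis_name: array} dict.

    Hypothetical helper, not part of xgcm. Non-dict arguments pass through.
    """
    if isinstance(arg, dict):
        if len(arg) != 1:
            raise ValueError(f"expected exactly one component, got {len(arg)}")
        return next(iter(arg.values()))
    return arg


# Stand-in for an xarray DataArray: only the attribute used above is modelled.
fake_da = SimpleNamespace(variable=SimpleNamespace(chunksizes={"j": (45, 45)}))

# Both the wrapped and the bare argument now yield the same chunk mapping.
chunks_from_dict = unwrap_component({"X": fake_da}).variable.chunksizes
chunks_from_array = unwrap_component(fake_da).variable.chunksizes
```

Raising on dicts with more than one entry is a deliberate choice here: silently taking the first value would hide a case the workaround was never meant to cover.
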
@jbusecke
Contributor

jbusecke commented May 11, 2023

Here is a reproducer of what I believe is the same issue (big thanks to @jdldeauna). The data live in a requester-pays bucket, so unfortunately it only works on the Pangeo deployments:

import intake
from ecco_v4_py.ecco_utils import get_llc_grid

cat = intake.open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean.yaml")
ecco_ds = cat.ECCOv4r3.to_dask()
ecco_ds = ecco_ds.rename({'face':'tile'})
xgcm_grid = get_llc_grid(ecco_ds)
yfld = ecco_ds.oceTAUY.isel(time=0)
xfld = ecco_ds.oceTAUX.isel(time=0)
velc = xgcm_grid.interp_2d_vector({'X': xfld, 'Y': yfld}, boundary='fill')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[7], line 10
      8 yfld = ecco_ds.oceTAUY.isel(time=0)
      9 xfld = ecco_ds.oceTAUX.isel(time=0)
---> 10 velc = xgcm_grid.interp_2d_vector({'X': xfld, 'Y': yfld}, boundary='fill')

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid.py:1323, in Grid.interp_2d_vector(self, vector, **kwargs)
   1286 def interp_2d_vector(self, vector, **kwargs):
   1287     """
   1288     Interpolate a 2D vector to the intermediate grid point. This method is
   1289     only necessary for complex grid topologies.
   (...)
   1320         are interpolated vector components along each axis
   1321     """
-> 1323     return self._apply_vector_function(self.interp, vector, **kwargs)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid.py:1250, in Grid._apply_vector_function(self, function, vector, **kwargs)
   1247 x_axis_name, y_axis_name = list(vector)
   1249 # apply for each component
-> 1250 x_component = function(
   1251     {x_axis_name: vector[x_axis_name]},
   1252     x_axis_name,
   1253     other_component={y_axis_name: vector[y_axis_name]},
   1254     **kwargs,
   1255 )
   1257 y_component = function(
   1258     {y_axis_name: vector[y_axis_name]},
   1259     y_axis_name,
   1260     other_component={x_axis_name: vector[x_axis_name]},
   1261     **kwargs,
   1262 )
   1263 return {x_axis_name: x_component, y_axis_name: y_component}

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid.py:905, in Grid.interp(self, da, axis, **kwargs)
    853 def interp(self, da, axis, **kwargs):
    854     """
    855     Interpolate neighboring points to the intermediate grid point along
    856     this axis.
   (...)
    903     >>> grid.interp(da, ["X", "Y"], periodic={"X": True, "Y": False})
    904     """
--> 905     return self._1d_grid_ufunc_dispatch("interp", da, axis, **kwargs)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid.py:698, in Grid._1d_grid_ufunc_dispatch(self, funcname, data, axis, to, keep_coords, metric_weighted, other_component, **kwargs)
    695 else:
    696     map_overlap = False
--> 698 array = grid_ufunc(
    699     self,
    700     array,
    701     axis=[(ax_name,)],
    702     keep_coords=keep_coords,
    703     dask=dask,
    704     map_overlap=map_overlap,
    705     other_component=other_component,
    706     **remaining_kwargs,
    707 )
    709 if ax_metric_weighted:
    710     metric = self.get_metric(array, ax_metric_weighted)

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid_ufunc.py:462, in GridUFunc.__call__(self, grid, axis, *args, **kwargs)
    460 map_overlap = kwargs.pop("map_overlap", self.map_overlap)
    461 pad_before_func = kwargs.pop("pad_before_func", self.pad_before_func)
--> 462 return apply_as_grid_ufunc(
    463     self.ufunc,
    464     *args,
    465     axis=axis,
    466     grid=grid,
    467     signature=self.signature,
    468     boundary_width=self.boundary_width,
    469     boundary=boundary,
    470     dask=dask,
    471     map_overlap=map_overlap,
    472     pad_before_func=pad_before_func,
    473     **kwargs,
    474 )

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid_ufunc.py:768, in apply_as_grid_ufunc(func, axis, grid, signature, boundary_width, boundary, fill_value, keep_coords, dask, map_overlap, pad_before_func, other_component, *args, **kwargs)
    765 # For most ufuncs we want to pad before applying, but for some (especially cumsum) we must apply then pad
    766 # TODO could we bind a bunch of these arguments into a namedtuple/dataclass or something to save space?
    767 if pad_before_func:
--> 768     rechunked_padded_args = _pad_then_rechunk(
    769         args,
    770         grid,
    771         in_core_dims,
    772         boundary_width_real_axes,
    773         boundary,
    774         fill_value,
    775         other_component,
    776     )
    777     results = _apply(
    778         mapped_func,
    779         rechunked_padded_args,
   (...)
    785         **kwargs,
    786     )
    787 else:  # pad after func

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid_ufunc.py:903, in _pad_then_rechunk(args, grid, in_core_dims, boundary_width_real_axes, boundary, fill_value, other_component)
    886 padded_args = [
    887     pad(
    888         a,
   (...)
    895     for a, oc in zip(args, other_component)
    896 ]
    898 if any(
    899     _has_chunked_core_dims(padded_arg, core_dims)
    900     for padded_arg, core_dims in zip(padded_args, in_core_dims)
    901 ):
    902     # merge any lonely chunks on either end created by padding
--> 903     rechunked_padded_args = _rechunk_to_merge_in_boundary_chunks(
    904         padded_args,
    905         args,
    906         boundary_width_real_axes,
    907         grid,
    908     )
    909 else:
    910     rechunked_padded_args = padded_args

File /srv/conda/envs/notebook/lib/python3.10/site-packages/xgcm/grid_ufunc.py:1024, in _rechunk_to_merge_in_boundary_chunks(padded_args, original_args, boundary_width_real_axes, grid)
   1022 rechunked_padded_args = []
   1023 for padded_arg, original_arg in zip(padded_args, original_args):
-> 1024     original_arg_chunks = original_arg.variable.chunksizes
   1025     merged_boundary_chunks = _get_chunk_pattern_for_merging_boundary(
   1026         grid,
   1027         padded_arg,
   1028         original_arg_chunks,
   1029         boundary_width_real_axes,
   1030     )
   1031     rechunked_arg = padded_arg.chunk(merged_boundary_chunks)

AttributeError: 'dict' object has no attribute 'variable'

Working on a fix now.

jdldeauna added a commit to jdldeauna/xgcm that referenced this issue May 12, 2023
Quick fix for Issue xgcm#581
@andrewdelman
Author

andrewdelman commented Apr 18, 2024

I recently encountered more errors in grid_ufunc.py, this time in the _map_func_over_core_dims function, when the entries of the original_args list are dictionaries. I tried fixes similar to the one above (replacing arg with tuple(arg.values())[0], i.e. the first value of the arg dictionary). That seemed to resolve the problem there, but it then triggered what looks like a dimension chunking mismatch in dask.array. For example, in dask/array/blockwise.py:

    270 elif isinstance(adjust_chunks[ind], (tuple, list)):
    271     if len(adjust_chunks[ind]) != len(chunks[i]):
--> 272         raise ValueError(
    273             f"Dimension {i} has {len(chunks[i])} blocks, adjust_chunks "
    274             f"specified with {len(adjust_chunks[ind])} blocks"
    275         )
    276     chunks[i] = tuple(adjust_chunks[ind])
    277 else:

ValueError: Dimension 0 has 13 blocks, adjust_chunks specified with 12 blocks

So the fix I tried before is no longer sufficient, at least for me. I'm not sure whether other users can reproduce this issue when calling diff_2d_vector.
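
To narrow down which dimension disagrees, a check along these lines might help. It is a debugging aid only, not a diagnosis of the cause: `block_count_mismatches` is a hypothetical helper, and the layouts below are made up to mirror the 13-versus-12 block counts in the error. It assumes chunk layouts are available as mappings from dimension name to a tuple of chunk lengths, which is the form xarray's `chunksizes` returns.

```python
def block_count_mismatches(chunks_a, chunks_b):
    """Return {dim: (n_blocks_a, n_blocks_b)} for shared dims whose block counts differ.

    Hypothetical helper. Both arguments map dimension name -> tuple of chunk
    lengths, the kind of mapping that xarray's `chunksizes` returns.
    """
    return {
        dim: (len(chunks_a[dim]), len(chunks_b[dim]))
        for dim in chunks_a.keys() & chunks_b.keys()
        if len(chunks_a[dim]) != len(chunks_b[dim])
    }


# Made-up layouts, chosen only to mirror the block counts in the error message.
layout_a = {"dim_0": (1,) * 13, "dim_1": (90,)}
layout_b = {"dim_0": (1,) * 12, "dim_1": (90,)}

mismatches = block_count_mismatches(layout_a, layout_b)
```

With real data, the two mappings would come from the `chunksizes` of the arrays being compared, for example an argument before and after padding.
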
