isel(multi_index_level_name = MultiIndex.level) corrupts the MultiIndex #8952

Open
5 tasks done
dcherian opened this issue Apr 17, 2024 · 1 comment

Comments

@dcherian
Contributor

dcherian commented Apr 17, 2024

What happened?

From #8951

If d is a MultiIndex-ed dataset with levels (x, y, z), and m is a dataset with a single coord x, then m.isel(x=d.x) builds a dataset with a MultiIndex with levels (y, z); the x level is lost. This seems like it should work.

cc @benbovy

What did you expect to happen?

No response

Minimal Complete Verifiable Example

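import numpy as np
import pandas as pd
import xarray as xr

xr.set_options(use_flox=True)

# Build a 100-row frame: x repeats 0..9, while y, z and v run 0..99.
test = pd.DataFrame()
test["x"] = np.arange(100) % 10
test["y"] = np.arange(100)
test["z"] = np.arange(100)
test["v"] = np.arange(100)

# Combine x, y and z into one MultiIndex along the "index" dimension.
d = xr.Dataset.from_dataframe(test)
d = d.set_index(index=["x", "y", "z"])
print(d)

# Group means: a dataset with a single coordinate x.
m = d.groupby("x").mean()
print(m)

# Compare the indexes of d with those of m indexed by d.x.
print(d.xindexes)
print(m.isel(x=d.x).xindexes)

# This call raises the ValueError shown at the end of the output.
xr.align(d, m.isel(x=d.x))
#res = d.groupby("x") - m
#print(res)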
<xarray.Dataset>
Dimensions:  (index: 100)
Coordinates:
  * index    (index) object MultiIndex
  * x        (index) int64 0 1 2 3 4 5 6 7 8 9 0 1 2 ... 8 9 0 1 2 3 4 5 6 7 8 9
  * y        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
  * z        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
Data variables:
    v        (index) int64 0 1 2 3 4 5 6 7 8 9 ... 90 91 92 93 94 95 96 97 98 99
<xarray.Dataset>
Dimensions:  (x: 10)
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8 9
Data variables:
    v        (x) float64 45.0 46.0 47.0 48.0 49.0 50.0 51.0 52.0 53.0 54.0
Indexes:
  ┌ index    PandasMultiIndex
  │ x
  │ y
  └ z
Indexes:
  ┌ index    PandasMultiIndex
  │ y
  └ z
ValueError...

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

No response

Anything else we need to know?

No response

Environment

@benbovy
Member

benbovy commented Apr 18, 2024

I think this occurs with fancy indexing of an xarray object (i.e., passing another DataArray as the indexer argument to isel) when the same coordinate name is found in both the indexed object and the indexer.

Remove the name conflict and it works fine, e.g.,

xr.align(d, m.rename(x="w").isel(w=d.x))
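
For readers who want to try the workaround end to end, here is a minimal, self-contained sketch that rebuilds d and m as in the report and then applies the rename before indexing. The name "w" is arbitrary, and the variable names m_renamed, broadcast_mean, d_aligned and mean_aligned are introduced here for illustration only. It has been reasoned through rather than run against every xarray release, so behavior may differ between versions.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same data as in the report: x cycles through 0..9, y, z and v run 0..99.
test = pd.DataFrame(
    {
        "x": np.arange(100) % 10,
        "y": np.arange(100),
        "z": np.arange(100),
        "v": np.arange(100),
    }
)
d = xr.Dataset.from_dataframe(test).set_index(index=["x", "y", "z"])
m = d.groupby("x").mean()

# Rename the dimension coordinate of m so that it no longer shares a name
# with the "x" level carried by the indexer d.x ("w" is an arbitrary choice).
m_renamed = m.rename(x="w")

# Fancy indexing: the indexer d.x brings its own coordinates along.
broadcast_mean = m_renamed.isel(w=d.x)
print(broadcast_mean.xindexes)

# With the name conflict removed, alignment is expected to succeed.
d_aligned, mean_aligned = xr.align(d, broadcast_mean)
print(mean_aligned)
```

The printed indexes can be compared with d.xindexes to check that the x, y and z levels are all still present.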

In such a case, the coordinate in the indexer should probably be passed to the result instead of the one found in the indexed object (this is not the current behavior, although I haven't checked how the coordinates are merged in the result).
