Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scalar Indexing with more than one dim broken #331

Open
tensionhead opened this issue Aug 10, 2022 · 2 comments
Open

Scalar Indexing with more than one dim broken #331

tensionhead opened this issue Aug 10, 2022 · 2 comments
Labels
Bug An error that is serious but does not break (parts of) the package. However, it clearly impedes the Minor Bug A flaw in the code or documentation that is clearly an error but does neither impede package functio

Comments

@tensionhead
Copy link
Contributor

The Problem

Trying to find why the --full tests where always failing for test_csd_arithmetic I stumbled upon a general issue which surprisingly was not covered by any other test so far. The simplest way to reproduce would be:

spec = spy.SpectralData(data=np.ones((10,10,10,10)), samplerate=1)
spec.show(foi=2, channel=0)
>>> TypeError: Only one indexing vector or array is currently allowed for advanced selection

Except for the trial selection, any other scalar selection of two dimensions causes the same error, e.g.:

spec.show(foi=2, toi=0)
spec.show(channel=2, toi=0)
# ..an so on

That this is not an issue about show can be demonstrated by manually applying and evaluating a selection with two scalar selectors:

spec.selectdata(foi=2, channel=0, inplace=True)
spec._preview_trial(0).idx
>>> (slice(0, 10, None), slice(0, 10, 1), [2], [0])

Using this tuple to index the underlying hdf5 dataset does not work and throws the same TypeError because of the nested scalars:

spec.data[spec._preview_trial(0).idx]
>>> TypeError: Only one indexing vector or array is currently allowed for advanced selection

Inside the computational routine this is the exact mechanism how to extract the index tuples for constructing the source layouts:

for tk, trialno in enumerate(self.trialList):
trial = data._preview_trial(trialno)

Hence the source of the problem seems to be how _preview_trial() constructs it's index tuples, which sent me down the rabbit hole of FauxTrials and the general Selector class. Trying to 'un-nest' scalar indices produced multiple other issues, maybe @pantaray has an idea how to best address this issue?

Solution

The same index tuple with unnested scalars is totally fine:

# note the un-nested final two scalar indices
idx = (slice(0, 10, None), slice(0, 10, 1), 2, 0)
spec.data[idx]
>>> array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       ...

As this is merely an edge case, and fancy indexing with more than one index array probably will never work due to hdf5's limitations (sth like .selectdata(foi=[0, 2], channel=[0, 1, 5] also runs into the same problems), this isn't too critical and I would remove the mentioned csd arithmetic test for now.

What should be done is to at least catch such unvalid (because not-realizable) selections, and not let it only crash once the idx get actually used to index the hdf5 dataset, in e.g. show or a computational routine.

@tensionhead tensionhead added the Bug An error that is serious but does not break (parts of) the package. However, it clearly impedes the label Aug 10, 2022
tensionhead added a commit that referenced this issue Aug 10, 2022
@pantaray
Copy link
Member

I've stumbled upon this when we first introduced CrossSpectralData but didn't have the time pursue this further. I think the main culprit (at least for the case laid out above) is here:

selection = [selection]

Every scalar selection is automatically converted to a single-element list. Removing this line should automatically fix the problem you're seeing, however, it will break many things downstream, since every code-piece that deals with selections assumes that each component of idx is either a slice or a list. That's not to say that this presumption is impossible to rectify, but it will require tinkering in more than one place (all the _preview_trial class methods for instance).

@tensionhead
Copy link
Contributor Author

Yes thanks, I ended up at exactly the same spot 🙂 But I also made the experience that it breaks multiple things further downstream, so catching such invalid selections probably is the best way right now imho. A selection basically is a promise that the hdf5 also can deliver said pieces of data represented by these idx tuples, and also in other cases as outlined above these can't always be kept. This issue should be around at least since the introduction of SpectralData.

@dfsp-spirit dfsp-spirit added the Minor Bug A flaw in the code or documentation that is clearly an error but does neither impede package functio label Sep 23, 2022
@tensionhead tensionhead added this to Under Consideration in Codebase Improvements via automation Oct 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug An error that is serious but does not break (parts of) the package. However, it clearly impedes the Minor Bug A flaw in the code or documentation that is clearly an error but does neither impede package functio
Projects
Codebase Improvements
Under Consideration
Development

No branches or pull requests

3 participants