Scalar Indexing with more than one dim broken #331

tensionhead · 2022-08-10T09:51:09Z

The Problem

Trying to find why the --full tests where always failing for test_csd_arithmetic I stumbled upon a general issue which surprisingly was not covered by any other test so far. The simplest way to reproduce would be:

spec = spy.SpectralData(data=np.ones((10,10,10,10)), samplerate=1)
spec.show(foi=2, channel=0)
>>> TypeError: Only one indexing vector or array is currently allowed for advanced selection

Except for the trial selection, any other scalar selection of two dimensions causes the same error, e.g.:

spec.show(foi=2, toi=0)
spec.show(channel=2, toi=0)
# ..an so on

That this is not an issue about show can be demonstrated by manually applying and evaluating a selection with two scalar selectors:

spec.selectdata(foi=2, channel=0, inplace=True)
spec._preview_trial(0).idx
>>> (slice(0, 10, None), slice(0, 10, 1), [2], [0])

Using this tuple to index the underlying hdf5 dataset does not work and throws the same TypeError because of the nested scalars:

spec.data[spec._preview_trial(0).idx]
>>> TypeError: Only one indexing vector or array is currently allowed for advanced selection

Inside the computational routine this is the exact mechanism how to extract the index tuples for constructing the source layouts:

syncopy/syncopy/shared/computational_routine.py

Lines 283 to 284 in d35a411

    
           for tk, trialno in enumerate(self.trialList): 
        
               trial = data._preview_trial(trialno)

Hence the source of the problem seems to be how _preview_trial() constructs it's index tuples, which sent me down the rabbit hole of FauxTrials and the general Selector class. Trying to 'un-nest' scalar indices produced multiple other issues, maybe @pantaray has an idea how to best address this issue?

Solution

The same index tuple with unnested scalars is totally fine:

# note the un-nested final two scalar indices
idx = (slice(0, 10, None), slice(0, 10, 1), 2, 0)
spec.data[idx]
>>> array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1., 1., 1., 1., 1.],
       ...

As this is merely an edge case, and fancy indexing with more than one index array probably will never work due to hdf5's limitations (sth like .selectdata(foi=[0, 2], channel=[0, 1, 5] also runs into the same problems), this isn't too critical and I would remove the mentioned csd arithmetic test for now.

What should be done is to at least catch such unvalid (because not-realizable) selections, and not let it only crash once the idx get actually used to index the hdf5 dataset, in e.g. show or a computational routine.

The text was updated successfully, but these errors were encountered:

pantaray · 2022-08-10T10:21:41Z

I've stumbled upon this when we first introduced CrossSpectralData but didn't have the time pursue this further. I think the main culprit (at least for the case laid out above) is here:

syncopy/syncopy/datatype/base_data.py

Line 1667 in d35a411

selection = [selection]

Every scalar selection is automatically converted to a single-element list. Removing this line should automatically fix the problem you're seeing, however, it will break many things downstream, since every code-piece that deals with selections assumes that each component of idx is either a slice or a list. That's not to say that this presumption is impossible to rectify, but it will require tinkering in more than one place (all the _preview_trial class methods for instance).

tensionhead · 2022-08-10T12:56:32Z

Yes thanks, I ended up at exactly the same spot 🙂 But I also made the experience that it breaks multiple things further downstream, so catching such invalid selections probably is the best way right now imho. A selection basically is a promise that the hdf5 also can deliver said pieces of data represented by these idx tuples, and also in other cases as outlined above these can't always be kept. This issue should be around at least since the introduction of SpectralData.

tensionhead added the Bug An error that is serious but does not break (parts of) the package. However, it clearly impedes the label Aug 10, 2022

tensionhead added a commit that referenced this issue Aug 10, 2022

FIX: full tests, see #331

875d47f

dfsp-spirit added the Minor Bug A flaw in the code or documentation that is clearly an error but does neither impede package functio label Sep 23, 2022

tensionhead added this to Under Consideration in Codebase Improvements via automation Oct 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scalar Indexing with more than one dim broken #331

Scalar Indexing with more than one dim broken #331

tensionhead commented Aug 10, 2022

pantaray commented Aug 10, 2022

tensionhead commented Aug 10, 2022

Scalar Indexing with more than one dim broken #331

Scalar Indexing with more than one dim broken #331

Comments

tensionhead commented Aug 10, 2022

The Problem

Solution

pantaray commented Aug 10, 2022

tensionhead commented Aug 10, 2022