Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor consistency issue with the way source_space["nearest_dist"] is saved #12459

Open
papadop opened this issue Feb 23, 2024 · 3 comments
Open
Labels

Comments

@papadop
Copy link
Contributor

papadop commented Feb 23, 2024

Description of the problem

The code in _source_space.py write the "nearest_dist" item as:
write_float_matrix(
fid, FIFF.FIFF_MNE_SOURCE_SPACE_NEAREST_DIST, this["nearest_dist"]
)

But in all the examples, the data type stored is a list of floats (which is a fiff type different from an array).
I guess the python code manages to read them back because it is weakly typed, but it would be better to be consistent and not multiply the data types.

The correction is probably as simple as (I chose the vector version because this is what exists in test files and because this piece of data is indeed a vector):
write_float(fid, FIFF.FIFF_MNE_SOURCE_SPACE_NEAREST_DIST, this["nearest_dist"]).

If the array version is preferred, it might be better to add some test files that contain arrays....

Steps to reproduce

Difficult to see without modifying the code. Add some print of type info in _read_tag_header (tag.py untested):
def _read_tag_header(fid, pos):
    """Read only the header of a Tag."""
    fid.seek(pos, 0)
    s = fid.read(16)
    if len(s) != 16:
        where = fid.tell() - len(s)
        extra = f" in file {fid.name}" if hasattr(fid, "name") else ""
        warn(f"Invalid tag with only {len(s)}/16 bytes at position {where}{extra}")
        return None
    # struct.unpack faster than np.frombuffer, saves ~10% of time some places
    kind, type_, size, next_ = struct.unpack(">iIii", s)
    if kind==FIFF_MNE_SOURCE_SPACE_NEAREST_DIST:
        if type_==FIFF.FIFFT_FLOAT and size!=4:
            print("Float Vector")
        if type_==FIFF.FIFFT_MATRIX:
            print("Float Array")
    return Tag(kind, type_, size, next_, pos)

Then use:

from mne.minimum_norm import read_inverse_operator
inv = read_inverse_operator('DATA_PATH/sample_audvis-eeg-oct-6-eeg-inv.fif')
mne.minimum_norm.write_inverse_operator('/tmp/toto-inv.fiff',inv)
inv1 = read_inverse_operator('/tmp/toto-inv.fiff')

Link to data

No response

Expected results

The two read should show the same type.

Float Vector
Float Vector

Actual results

Float Vector
Float Array

Additional information

Platform Linux-6.6.13-200.fc39.x86_64-x86_64-with-glibc2.38
Python 3.12.1 (main, Dec 18 2023, 00:00:00) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)]
Executable /usr/bin/python
CPU (20 cores)
Memory 62.5 GB

Core
├☑ mne 1.6.1 (latest release)
├☑ numpy 1.24.4 (OpenBLAS 0.3.21 with 20 threads)
├☑ scipy 1.11.1
├☑ matplotlib 3.8.2 (backend=QtAgg)
├☑ pooch 1.5.2
└☑ jinja2 3.1.3

Numerical (optional)
├☑ sklearn 1.3.0
├☑ nibabel 5.1.0
├☑ nilearn 0.10.3
├☑ pandas 1.5.3
└☐ unavailable numba, dipy, openmeeg, cupy

Visualization (optional)
├☑ pyvista 0.42.3 (OpenGL 4.6 (Core Profile) Mesa 23.3.5 via Mesa Intel(R) Graphics (ADL GT2))
├☑ pyvistaqt 0.11.0
├☑ vtk 9.3.0
├☑ qtpy 2.4.0 (PyQt5=5.15.12)
├☑ pyqtgraph 0.13.3
└☐ unavailable ipympl, mne-qt-browser, ipywidgets, trame_client, trame_server, trame_vtk, trame_vuetify

Ecosystem (optional)
└☐ unavailable mne-bids, mne-nirs, mne-features, mne-connectivity, mne-icalabel, mne-bids-pipeline
None

@papadop papadop added the BUG label Feb 23, 2024
@papadop
Copy link
Contributor Author

papadop commented Feb 23, 2024

Similar problem with dist_limit (FIFF.FIFF_MNE_SOURCE_SPACE_DIST_LIMIT), which is a float and is saved as a matrix....

Saved as:

write_float_matrix(
fid,
FIFF.FIFF_MNE_SOURCE_SPACE_DIST_LIMIT,
np.array(this["dist_limit"], float),
)

But read as:
res["dist_limit"] = tag2.data.item() # USe of item() to remove the array...

@larsoner
Copy link
Member

We probably need to keep the array type to be compatible with MNE-C. It's a bit wasteful to use an array for something that is always a singleton but the harm is pretty minimal. It would be hard to change the spec and expectations of MNE-C.

At the Python end I don't mind exposing it to users as a float because it's more convenient for them that way. We do conversions like this elsewhere -- the info['meas_date'] on disk is a length-2 tuple of int but we convert to a datetime object for convenience of understanding (and to fit better in the Python ecosystem). So I think internally we're consistent, too, by making the Python interface people deal with as simple as possible so they don't have to think about how data are actually stored on disk.

So in short I think there is nothing to change with the file format (backward compatibility issues) or how MNE-Python interprets things (we try to make things easy for people), but anyone else dealing with FIF format should be made aware (somehow) that quirks like this exist.

@papadop
Copy link
Contributor Author

papadop commented Feb 27, 2024

Ah. I did not realize that there was a MNE-C issue. I guess that's why also covariance matrices can be read as sparse matrices but are only saved as dense (or diagonal ones). It does not help that MNE-C is closed source or at least that we do not have some MNE-C database of files which can be tested for compliance....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants