default bandpass parameters #533

martinwimpff · 2023-12-08T10:02:13Z

Lines 291 to 311 in 4b297b9

    
    class MotorImagery(SinglePass):
        """N-class motor imagery.

        Metric is 'roc-auc' if 2 classes and 'accuracy' if more

        Parameters
        -----------
        events: List of str
            event labels used to filter datasets (e.g. if only motor imagery is
            desired).
        n_classes: int,
            number of classes each dataset must have. If events is given,
            requires all imagery sorts to be within the events list.
        fmin: float (default 8)
            cutoff frequency (Hz) for the high pass filter
        fmax: float (default 32)
            cutoff frequency (Hz) for the low pass filter

Is there any reason why the default bandpass values are 8 and 32?
In my opinion these should be None i.e. the default is to not use a filter at all.


bruAristimunha · 2023-12-08T14:26:37Z

Hey @martinwimpff!

Yes! There is a reason! It comes down to MOABB's philosophy on reproducibility, a concept shaped by @vinay-jayaram and @alexandrebarachant.

About five years back, before MOABB, articles often played around with things like bandpass, baseline, etc. for different datasets and methods. This made it impossible to compare them fairly. Authors usually had some extra tricks that made their method seem way better.

So, MOABB stepped in. They said, "Hold up! Let's treat all the datasets the same way, right from the start." They processed the raw data uniformly and checked the methods in the same way, whether within-session, cross-session, or cross-subject evaluation.

Just a heads-up, this preprocessing happens only when you're using the motor-imagery/ssvep/cvep/p300 paradigm object, and the band interval selection was based on studies by Fabian Lotte (BCI) or other similar literature for each paradigm.

If you prefer, you can easily remove the bandpass by tweaking the object's values. In Braindecode, you get the raw data without this preprocessing because we grab it using the dataset object.
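
To make the "tweaking the object's values" part concrete, here is a minimal sketch. `ParadigmSketch` is a hypothetical stand-in written for illustration, not MOABB's class; it only mirrors the `fmin`/`fmax` parameters and defaults from the docstring quoted in the issue. Whether the real `MotorImagery` accepts `None` for both cutoffs is an assumption that is not verified here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParadigmSketch:
    """Hypothetical stand-in for a paradigm object (not MOABB's real class)."""

    fmin: Optional[float] = 8.0   # high-pass cutoff in Hz, default taken from the docstring
    fmax: Optional[float] = 32.0  # low-pass cutoff in Hz, default taken from the docstring

    def describe(self) -> str:
        """Return a one-line summary of the filtering this configuration implies."""
        if self.fmin is None and self.fmax is None:
            return "no filter"
        if self.fmin is None:
            return f"low-pass at {self.fmax:g} Hz"
        if self.fmax is None:
            return f"high-pass at {self.fmin:g} Hz"
        return f"band-pass {self.fmin:g}-{self.fmax:g} Hz"


default = ParadigmSketch()                         # keeps the 8-32 Hz defaults
wide = ParadigmSketch(fmin=4, fmax=40)             # a wider band, set explicitly
unfiltered = ParadigmSketch(fmin=None, fmax=None)  # assumption: None means "do not filter"

for paradigm in (default, wide, unfiltered):
    print(paradigm.describe())
```

With the real library, the equivalent would be passing `fmin` and `fmax` when constructing the paradigm, since the docstring lists both as parameters.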

bruAristimunha · 2023-12-08T14:26:50Z

Tagging @sylvchev if you want to complement

martinwimpff · 2023-12-08T20:09:11Z

Hi Bruno,

thanks for the fast response!
I get the reproducibility point and I fully understand the comparison point.
However, the 8-32Hz bandpass is far from perfect for most DL models (at least for the "most important" dataset, the BCIC IV 2a dataset). Therefore I fear that many models will not get the best results using the standard 8-32Hz bandpass. If people find this out, they will stop using MOABB's standard configuration, which would go against the original intention of MOABB.

Best,
Martin

sylvchev · 2024-01-10T21:08:06Z

I understand your concern. Do you have any results, or do you know of any publication that investigates the influence of bandpass filters on DL models across several models (and on more than just BCIC IV 2a)?
Anyway, those values are only the defaults and can be changed if needed.

vinay-jayaram · 2024-01-10T21:38:21Z

If there are results suggesting that the bandpass is not helpful in DL situations then that's a good argument to change the default, but otherwise the original definition was to deal with the fact that the 8-32 bandpass also mitigates movement and muscle artifacts. Especially for DL it's important -- if you want to make a claim about brain interfacing and not simply EEG -- to provide evidence or use methods to convince the reader that the models aren't taking advantage of non-brain information as well.
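
As a toy illustration of the artifact point, the sketch below builds a synthetic one-second signal and measures how much of its power falls inside a given band. Everything in it is an assumption made for illustration: the sampling rate, the three sinusoids (a 2 Hz "movement-like" drift, a 12 Hz "mu-like" rhythm, a 50 Hz "muscle-like" component), and the idealized brick-wall band selection via a plain DFT. Real muscle activity is broadband and overlaps the passband, so this overstates how cleanly a bandpass separates brain from non-brain signal.

```python
import cmath
import math

SFREQ = 128  # sampling rate in Hz (illustrative)
N = 128      # one second of samples, so DFT bins are exactly 1 Hz apart


def synth_signal():
    """Sum of three sinusoids given as (frequency in Hz, amplitude) pairs."""
    components = [(2.0, 3.0), (12.0, 1.0), (50.0, 2.0)]
    return [
        sum(a * math.sin(2 * math.pi * f * t / SFREQ) for f, a in components)
        for t in range(N)
    ]


def band_power(x, sfreq, fmin, fmax):
    """Mean power of x carried by DFT bins whose frequency lies in [fmin, fmax]."""
    n = len(x)
    power = 0.0
    for k in range(n):
        freq = k * sfreq / n
        if freq > sfreq / 2:
            freq = sfreq - freq  # fold negative-frequency bins onto positive ones
        if fmin <= freq <= fmax:
            coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            power += abs(coeff) ** 2 / n**2  # Parseval: all bins sum to mean(x**2)
    return power


signal = synth_signal()
total = band_power(signal, SFREQ, 0, SFREQ / 2)
for fmin, fmax in [(8, 32), (4, 40)]:
    kept = band_power(signal, SFREQ, fmin, fmax)
    print(f"{fmin}-{fmax} Hz keeps {kept / total:.1%} of the total power")
```

In this constructed case both bands discard the 2 Hz and 50 Hz components and keep only the 12 Hz one, so the example says nothing about which band is better for DL models; it only shows the out-of-band rejection that motivates filtering in the first place.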

I also strongly agree with Sylvain -- a single dataset should not be considered more important than any other (especially the BCIC datasets! They've been overfitted to for decades) unless it's large enough to offer population coverage or was recorded on the same hardware setup as a planned closed-loop study.

martinwimpff · 2024-01-12T07:14:07Z

Thanks for your responses @sylvchev @vinay-jayaram!
@sylvchev no official publication, just personal experience.
@vinay-jayaram I get your point and those artifacts might be present as discussed in this publication. However, they don't use the BCIC datasets and they use a 4-40Hz filter and discard the first second after the cue to "overcome" this issue. Their investigation is good but not complete.

DL often uses 4-40Hz BP whereas CSP & Co. tend to use smaller bandwidths like 8-32Hz. I personally like it when the default parameters don't change the original data at all such that every modification of the original data is an "active choice" which should then be mentioned in the paper.

Finally, I think this is more about personal preference so it is not necessary to change the default parameters. However, I still see the problem that people may be discouraged from using MOABB as it was intended.
The solution for future datasets would be to define the preprocessing (e.g. 8-32Hz BP) upfront.

sylvchev · 2024-01-12T11:58:12Z

Do you obtain a noticeable difference with 4-40 Hz instead of 8-32Hz and with what kind of DL models?

Regarding leaving the data as is and making the filtering part more visible (with a preprocessing step), I understand your point, but the MOABB community is very diverse. We provide the data "as is" with the dataset object, and you can do what you want with it (useful for Neuroscience folks). If you are more in ML and don't know about EEG, the paradigms are there to ensure that the preprocessing is correct and that you get an ndarray that is easy to handle.

Indeed, DL blurred the lines with end-to-end approaches. With @PierreGtch we are adding the possibility of batch preprocessing: you define it once for all your data and save the transformed dataset for further reuse, without needing to apply the preprocessing steps again.

Only a few users are knowledgeable in both EEG and ML, and they know how the data is processed or know how to find the information.

bruAristimunha added the question label Dec 8, 2023
