[WIP] Adding a A large EEG database with users' profile information for motor imagery Brain-Computer Interface dataset #404

Open
wants to merge 27 commits into
base: develop

Conversation

Sara04
Collaborator

@Sara04 Sara04 commented Jun 21, 2023

This PR adds a new MI dataset, https://zenodo.org/record/7554429, as proposed in #1.

Member

@sylvchev sylvchev left a comment

Do not forget to also add the dataset information to the documentation (docs/sources/datasets.rst). You also need to update the whats_new.rst in the same folder.


subj_id = self.db_id + str(subject + self.db_idx_off[self.db_id])

ch_names = [
Member

To avoid a long list like that, you could use `# fmt: off` / `# fmt: on`, as explained here.
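
As a minimal sketch of this suggestion: Black leaves everything between a `# fmt: off` comment and the matching `# fmt: on` comment untouched, so a long list can stay packed with several items per line instead of being exploded to one item per line. The channel names below are placeholders chosen for illustration and are not the dataset's actual montage.

```python
# fmt: off
# Black will not reformat the lines between the two markers.
ch_names = [
    "Fz", "FC5", "FC3", "FC1", "FCz", "FC2", "FC4", "FC6",
    "C5", "C3", "C1", "Cz", "C2", "C4", "C6",
]
# fmt: on

# Code after "# fmt: on" is formatted by Black as usual.
print(len(ch_names))
```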

}


class Dreyer2023(BaseDataset):
Member

You may rename this class to Dreyer2023Base to make clear that it is the parent class, and you could indicate in the docstring that it should not be used directly; use `Dreyer2023A`, `Dreyer2023B` or `Dreyer2023C` instead.
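
A rough sketch of what this renaming could look like is given below. It is a simplified stand-in, not the PR's code: the real class inherits from `BaseDataset` and takes more arguments, the constructor signature is hypothetical, and the runtime guard is an illustrative addition that goes beyond the docstring note requested above.

```python
class Dreyer2023Base:
    """Parent class of the Dreyer2023 datasets.

    This class should not be used directly; use Dreyer2023A, Dreyer2023B
    or Dreyer2023C instead.
    """

    def __init__(self, db_id):
        # Illustrative guard (an assumption, not part of the PR): refuse
        # direct instantiation of the parent class.
        if type(self) is Dreyer2023Base:
            raise TypeError(
                "Dreyer2023Base is a parent class; "
                "use Dreyer2023A, Dreyer2023B or Dreyer2023C instead."
            )
        self.db_id = db_id


class Dreyer2023A(Dreyer2023Base):
    """Dataset A (docstring with its own summary and description goes here)."""

    def __init__(self):
        super().__init__(db_id="A")


dataset = Dreyer2023A()
print(dataset.db_id)
```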

Comment on lines 114 to 133
.. admonition:: Dataset summary

========== ======= ======= ========== ================= ============ =============== ===========
Name       #Subj   #Chan   #Classes   #Trials / class   Trials len   Sampling rate   #Sessions
========== ======= ======= ========== ================= ============ =============== ===========
Dreyer2023 87      27      2          20                5s           512 Hz          6
========== ======= ======= ========== ================= ============ =============== ===========

===================
Dataset description
===================
A large EEG database with users' profile information for motor imagery
Brain-Computer Interface research

Data collectors : Appriou Aurélien; Caselli Damien; Benaroch Camille;
Yamamoto Sayu Maria; Roc Aline; Lotte Fabien;
Dreyer Pauline; Pillette Léa
Data manager : Dreyer Pauline
Project leader : Lotte Fabien
Project members : Rimbert Sébastien; Monseigne Thibaut
Member

The dataset summary + description should be removed here and put in each dataset (A, B and C) with the correct number of subjects. Keep only the information that this is the parent class and that it should not be instantiated directly.

Collaborator Author

Thanks, @sylvchev! I moved the dataset summary and description to each of the A, B and C datasets. Since the recording procedure is the same, a large portion of the description is shared by all three datasets.

from sklearn.pipeline import make_pipeline

import moabb
from moabb.datasets import Dreyer2023A, Dreyer2023B, Dreyer2023C
Member

Suggested change
- from moabb.datasets import Dreyer2023A, Dreyer2023B, Dreyer2023C
+ from moabb.datasets import Dreyer2023A

B & C not used here

from moabb.paradigms import MotorImagery


dreyer2023 = Dreyer2023A()
Member

Suggested change
- dreyer2023 = Dreyer2023A()
+ dreyer2023 = Dreyer2023A()
+ dreyer2023.subject_list = dreyer2023.subject_list[:6]

Generating the documentation uses a lot of computational resources, both CPU and network bandwidth. We try to limit as much as possible the datasets downloaded for generating the docs (they are shared between the examples). As the users' profile information included in this dataset could give some interesting insight, we could consider adding this dataset to the documentation, but it should use only a selection of subjects to keep the bandwidth usage within reasonable limits. Could you adapt the above line to select only a few representative individuals for your analysis below? If you could keep the number of subjects between 4 and 8, it would really lower the doc building time.
Cherry picking subjects for the documentation is allowed ^^
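
To illustrate the idea without downloading anything, here is a small sketch that restricts a dataset's `subject_list`, either by slicing or by cherry-picking specific subjects. `StandInDataset` is a hypothetical stand-in that only mimics the `subject_list` attribute, and both the subject count and the picked IDs are arbitrary values chosen for the example.

```python
class StandInDataset:
    """Hypothetical stand-in exposing only a `subject_list` attribute."""

    def __init__(self, n_subjects):
        # Subjects are numbered from 1, as an assumption for this sketch.
        self.subject_list = list(range(1, n_subjects + 1))


# Option 1: keep the first six subjects.
first_six = StandInDataset(n_subjects=20)
first_six.subject_list = first_six.subject_list[:6]

# Option 2: cherry-pick a few subjects (arbitrary IDs for illustration).
wanted = {2, 7, 11, 16}
picked = StandInDataset(n_subjects=20)
picked.subject_list = [s for s in picked.subject_list if s in wanted]

print(first_six.subject_list)
print(picked.subject_list)
```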

Collaborator Author

Thanks! I've selected 4 subjects (to include all combinations of participants' and experimenters' genders). However, since the data of all subjects are zipped together, they all need to be downloaded at once. Is there a solution for this?

Member

No, unfortunately.
The only possibility I see is to make this example a static one: if you rename the file to Dreyer_clf_scores_vs_subj_info.py (without the plot_ prefix), Sphinx won't execute this example and will only display the code. If you want, you could generate a figure on your computer and upload it to a new folder, examples/images/; you could then include the resulting image in your example code (see for example this page or that page).

@bruAristimunha bruAristimunha mentioned this pull request Aug 1, 2023
4 tasks
@PierreGtch PierreGtch added this to the 0.6.0 milestone Aug 2, 2023
@sylvchev sylvchev modified the milestones: 0.6.0, 1.0.1 Sep 8, 2023
@Sara04 Sara04 marked this pull request as ready for review November 18, 2023 14:20
@bruAristimunha
Collaborator

Can you review @sylvchev?

@bruAristimunha bruAristimunha changed the title Adding a new dataset [WIP] Adding a A large EEG database with users' profile information for motor imagery Brain-Computer Interface dataset Apr 11, 2024