Short description of dataset and use case(s): 3064 T1-weighted contrast-enhanced images from 233 patients with three kinds of brain tumor: meningioma (708 slices), glioma (1426 slices), and pituitary tumor (930 slices).
This brain tumor dataset contains 3064 T1-weighted contrast-enhanced images
from 233 patients with three kinds of brain tumor: meningioma (708 slices),
glioma (1426 slices), and pituitary tumor (930 slices). Due to the file size
limit of the repository, we split the whole dataset into 4 subsets and archived
them in 4 .zip files, each containing 766 slices. The 5-fold
cross-validation indices are also provided.
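As a quick sanity check on the numbers quoted above, the per-class slice counts and the four archives should both add up to the stated total. This is only an arithmetic sketch using the figures from the description; the dictionary name is made up for illustration.

```python
# Slice counts per tumor type, copied from the dataset description.
slices_per_class = {"meningioma": 708, "glioma": 1426, "pituitary tumor": 930}

total = sum(slices_per_class.values())

# The three classes should account for every image in the dataset.
assert total == 3064

# Four .zip archives with 766 slices each should cover the same total.
assert 4 * 766 == total

# Class balance, useful when choosing metrics or sampling strategies.
for name, count in slices_per_class.items():
    print(f"{name}: {count} slices ({count / total:.1%})")
```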
The data is organized in MATLAB data format (.mat files). Each file stores a struct
containing the following fields for an image:
cjdata.label: 1 for meningioma, 2 for glioma, 3 for pituitary tumor
cjdata.PID: patient ID
cjdata.image: image data
cjdata.tumorBorder: a vector storing the coordinates of discrete points on the tumor border,
for example [x1, y1, x2, y2, ...], where x1, y1 are planar coordinates on the tumor border.
It was generated by manually delineating the tumor border, so it can be used to generate
a binary image of the tumor mask.
cjdata.tumorMask: a binary image with 1s indicating tumor region
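The relationship between cjdata.tumorBorder and cjdata.tumorMask can be sketched in plain Python as a polygon fill. This is a minimal illustration, not the authors' method: the function names are hypothetical, it assumes x is the column and y is the row, and it treats the coordinates as 0-based pixel positions (values exported from MATLAB may need shifting). It fills pixels using an even-odd ray-casting test.

```python
from typing import List, Sequence, Tuple

# Label encoding as stated in the field list above.
LABEL_NAMES = {1: "meningioma", 2: "glioma", 3: "pituitary tumor"}


def border_to_points(border: Sequence[float]) -> List[Tuple[float, float]]:
    """Turn the flat [x1, y1, x2, y2, ...] vector into (x, y) pairs."""
    if len(border) % 2 != 0:
        raise ValueError("border must hold an even number of values")
    return [(border[i], border[i + 1]) for i in range(0, len(border), 2)]


def point_in_polygon(x: float, y: float,
                     polygon: Sequence[Tuple[float, float]]) -> bool:
    """Even-odd ray casting: count edge crossings to the right of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the polygon
        # Only edges that straddle the horizontal line through y can cross it.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def border_to_mask(border: Sequence[float],
                   height: int, width: int) -> List[List[int]]:
    """Rasterize the border into a height x width grid of 0s and 1s.

    Assumption: x indexes columns and y indexes rows.
    """
    polygon = border_to_points(border)
    return [
        [1 if point_in_polygon(col, row, polygon) else 0
         for col in range(width)]
        for row in range(height)
    ]


# Toy example: a square border on a 6 x 6 grid.
mask = border_to_mask([1, 1, 4, 1, 4, 4, 1, 4], height=6, width=6)
for row in mask:
    print("".join(str(v) for v in row))
```

For full-size images, a vectorized polygon-fill routine from an image library would be much faster; the pure-Python loop is only meant to show the idea.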
This data was used in the following papers:
Cheng, Jun, et al. "Enhanced Performance of Brain Tumor Classification via Tumor Region Augmentation
and Partition." PloS one 10.10 (2015).
Cheng, Jun, et al. "Retrieval of Brain Tumors by Adaptive Spatial Pooling and Fisher Vector
Representation." PloS one 11.6 (2016). MATLAB source code is available on GitHub: https://github.com/chengjun583/brainTumorRetrieval
Jun Cheng
School of Biomedical Engineering
Southern Medical University, Guangzhou, China
Email: chengjun583@qq.com
Folks who would also like to see this dataset in tensorflow/datasets, please thumbs-up so the developers can know which requests to prioritize.
Hello @BirkhoffLee and thank you for raising this issue!
Are you planning to add this dataset to TFDS yourself? If yes, you can follow this guide to adding a dataset.
As an example, you can refer to this recent commit that introduced the Databricks Dolly dataset.
I'd love to, but I have a few questions:
Removal of some data. I currently use the dataset in an image classification research project. The original dataset was published in MATLAB format. I have extracted the images as .PNG files, which drops some of the data in the original dataset. Can I keep it as-is in the TFDS repo? To be more specific, I would retain only cjdata.label and cjdata.image.
Training split. The original dataset does not split the data for training and testing. How am I supposed to handle it in this repo?
Hosting. Does the TFDS / Tensorflow project offer any place to store the dataset files? I do not see other datasets hosted here.
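On the training split question, one option worth considering is a patient-level split based on cjdata.PID, so that slices from the same patient never land in both training and test data. The sketch below only illustrates that idea; it is not something TFDS requires, and the function name, the 20% test fraction, and the seed are assumptions. The 5-fold cross-validation indices shipped with the dataset could be used instead.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def split_by_patient(patient_ids: Sequence[Hashable],
                     test_fraction: float = 0.2,
                     seed: int = 0) -> Tuple[List[int], List[int]]:
    """Return (train_indices, test_indices) with no patient in both sets.

    patient_ids[i] is the patient ID (cjdata.PID) of slice i.
    """
    # Group slice indices by patient.
    by_patient = defaultdict(list)
    for index, pid in enumerate(patient_ids):
        by_patient[pid].append(index)

    # Shuffle patients (not slices) deterministically.
    patients = sorted(by_patient)
    random.Random(seed).shuffle(patients)

    # Hold out a fraction of the patients, at least one.
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train, test = [], []
    for pid, indices in by_patient.items():
        (test if pid in test_patients else train).extend(indices)
    return sorted(train), sorted(test)


# Toy example with five made-up patient IDs and eight slices.
pids = ["a", "a", "b", "c", "c", "c", "d", "e"]
train_idx, test_idx = split_by_patient(pids)
print("train:", train_idx)
print("test:", test_idx)
```

Because whole patients are held out, the test set size in slices will only approximate the requested fraction.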
I'm new to the field, so apologies for any naive questions above. I do wish to contribute to this repo, though, because it makes research a lot easier. Many thanks :-)
And if you'd like to contribute the dataset (thank you!), see our guide to adding a dataset.