This repository contains various useful scripts for data science projects.
- A Jupyter boilerplate for data science projects.
  - Functionalities:
    - Collapsible sections
    - Auto-reload of imported files when they are modified
    - Ignore warnings
    - Set the maximum number of rows and columns shown when printing a complete dataframe
    - Change cell width
    - Install Jupyter dark theme
- Save/load pickle files
- Send email in Python with attachments
  - Allows multiple receivers
  - Allows multiple attachments
- Random forest code with random search and grid search for finding optimal hyperparameters.
- fastText code with autotune for finding optimal hyperparameters.
- Bagging for NLP data with fastText, also used for EDA to check which categories the model performs well on.
- RF classification complete notebook (source: fastai ml1).
- Data Interpretation using RF complete notebook (source: fastai ml1).
- The fastai functions required to run the notebooks without installing fastai.
- Notebook for finding an optimal validation set. This technique is very useful for Kaggle competitions, where the test score is only known from the Kaggle score after submission. It helps in finding a validation set that behaves like Kaggle's hidden test set.
- RF extrapolation issue demo notebook.
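
As a rough illustration of the pickle and email helpers listed above, here is a minimal sketch using only the Python standard library. The function names (`save_pickle`, `load_pickle`, `build_email`) and the SMTP host in the comment are hypothetical and chosen for this example; the scripts in this repository may be organized differently.

```python
import mimetypes
import pickle
from email.message import EmailMessage
from pathlib import Path


def save_pickle(obj, path):
    """Serialize obj to path (hypothetical helper name)."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_pickle(path):
    """Load a pickled object. Only unpickle files you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def build_email(sender, receivers, subject, body, attachments=()):
    """Build one message for several receivers with any number of attachments."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(receivers)  # multiple receivers in one header
    msg["Subject"] = subject
    msg.set_content(body)
    for path in map(Path, attachments):
        # Guess the MIME type from the file name, falling back to generic binary.
        ctype, encoding = mimetypes.guess_type(path.name)
        if ctype is None or encoding is not None:
            ctype = "application/octet-stream"
        maintype, subtype = ctype.split("/", 1)
        msg.add_attachment(
            path.read_bytes(),
            maintype=maintype,
            subtype=subtype,
            filename=path.name,
        )
    return msg


# Sending is left to the caller. Assuming an SMTP server that supports STARTTLS
# (host, port, and credentials below are placeholders):
#
#   import smtplib
#   with smtplib.SMTP("smtp.example.com", 587) as server:
#       server.starttls()
#       server.login(user, password)
#       server.send_message(msg)
```

Building the message separately from sending it keeps the helper easy to test without network access.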