tracevec

Training word embedding models (Word2Vec and Doc2Vec) on the electricity consumption of various home appliances.

Requirements

Python 3.8.10+
Pip / Anaconda
Jupyter Notebook

Other necessary dependencies are:

numpy, scipy, torch, pandas, seaborn, matplotlib, scikit_learn, gensim and matplotlib_venn

All dependencies and their versions are listed in the requirements.txt file. They can be installed with the setup.py script:

$ git clone https://github.com/mpinta/tracevec
$ cd tracevec
$ python setup.py install

Usage

The project consists of five connected parts:

Training word embedding models (using the Gensim topic modelling library)
Clustering (of Doc2Vec vectors)
Classification (of the electrical device type using Doc2Vec vectors)
Prediction (of the next electricity consumption category using Word2Vec vectors)
RNN forecasting (of the next electricity consumption category using an RNN with GRU)

First, prepare your Pip or Anaconda environment and make sure all of the dependencies above are installed. Then open the tracevec.ipynb notebook, which stores and describes all the results of our training and model analysis. You can also run and modify the code yourself, as it is thoroughly commented. Our trained Word2Vec and Doc2Vec models are in the models directory, so you can skip the model training part if you don't want to create new ones.
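To give a feel for the kind of input a word embedding model works on, here is a minimal sketch of how raw power readings could be turned into "sentences" of consumption category tokens. The bin edges, the token names (c0, c1, ...) and the sample traces are hypothetical and chosen only for illustration; the notebook defines its own categories, which may differ.

```python
from bisect import bisect_right
from typing import List, Sequence

# Hypothetical bin edges in watts; the notebook's real category
# boundaries may differ.
BIN_EDGES_W = [5.0, 50.0, 500.0]


def to_category(power_w: float, edges: Sequence[float] = BIN_EDGES_W) -> str:
    """Map one power reading to a category token such as 'c0' .. 'c3'."""
    # bisect_right counts how many edges are <= the reading,
    # which is the index of the bin the reading falls into.
    return f"c{bisect_right(edges, power_w)}"


def to_sentence(readings_w: Sequence[float]) -> List[str]:
    """Turn one appliance trace into a 'sentence' of category tokens."""
    return [to_category(p) for p in readings_w]


# Made-up traces, one list of readings (in watts) per appliance.
traces = {
    "kettle": [0.0, 1800.0, 1795.0, 2.0],
    "lamp": [0.0, 40.0, 41.0, 0.0],
}

# One token list per trace.
sentences = [to_sentence(t) for t in traces.values()]
print(sentences)
```

A list of token lists like `sentences` is the shape Gensim's Word2Vec accepts as its training corpus; for Doc2Vec, each token list would additionally be wrapped in a TaggedDocument with a tag such as the appliance name.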

Datasets

All data sets required to run the code are included in the repository. If you are running the code without the included data sets, you only need to clone the tracebase repository, which is the project's main data set, into the datasets directory. All the other derived data sets (consumptions, samples, forecast-train and forecast-test) are created step by step by the notebook code itself. The tracebase data set is not our property and is used only as a dependency (submodule); we appreciate the work done by its authors. Make sure to initialize the submodule with:

$ git submodule init
$ git submodule update
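An uninitialized submodule leaves an empty directory behind, which can lead to confusing errors later in the notebook. The following sketch checks for that before any data is loaded. The path datasets/tracebase is an assumption (the text above only says the data set lives in the datasets directory), and submodule_ready is a hypothetical helper, not part of the repository.

```python
from pathlib import Path


def submodule_ready(path: Path) -> bool:
    """Return True if `path` is a directory containing at least one entry."""
    return path.is_dir() and any(path.iterdir())


# Assumed location of the tracebase submodule; adjust to match .gitmodules.
tracebase_dir = Path("datasets") / "tracebase"

if not submodule_ready(tracebase_dir):
    print(f"{tracebase_dir} is missing or empty; run the git submodule commands first.")
```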

Publications

The code was originally used in the following publications:

Pintarič, Matic (2022).
S strojnim učenjem podprta analiza vzorcev vektorizirane porabe električne energije [Machine-learning-supported analysis of vectorized electricity consumption patterns].
Maribor: University of Maribor, Faculty of Electrical Engineering and Computer Science.

Acknowledgements

Contains information from the tracebase data set, which is made available at http://www.tracebase.org under the Open Database License (ODbL).
