Benchmark of Smart Meter Data Analytics

This benchmark evaluates the following five representative technologies:

Matlab -- A traditional analytics tool;
MADlib -- In-database (PostgreSQL) analytics library;
System C -- In-memory column store (the name is omitted due to a licensing issue);
Spark -- Main-memory-based distributed computing framework;
Hive -- A distributed data warehouse system on Hadoop.

Matlab, MADlib, and System C are benchmarked in a centralized environment, i.e., on a single server, while Spark and Hive are benchmarked in a distributed environment, i.e., on a cluster of multiple servers.

The benchmark covers the following four representative smart meter data analytics algorithms:

3-Line -- a model that uses three linear regression lines to fit the relationship between meter readings and weather temperatures (see [1]);
PAR -- a periodic auto-regression model for extracting daily consumption trends that occur regardless of the outdoor temperature (see [2]);
Histogram -- used to understand the consumption variability of an energy consumer;
Cosine similarity -- used to find groups of similar consumers, e.g., according to energy consumption.
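
To make the two simpler algorithms concrete, here is a minimal Java sketch of a histogram and a cosine similarity computation over meter readings. It is not taken from the benchmark's source code: the class name MeterAnalyticsSketch, the method names, the choice of equal-width buckets, and the convention of returning 0 for an all-zero vector are all assumptions made for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only. Class and method names are hypothetical and
// are not taken from the benchmark's implementation.
final class MeterAnalyticsSketch {

    // Equal-width histogram (an assumption): counts how many readings fall
    // into each of `buckets` intervals spanning [min, max] of the readings.
    static int[] histogram(double[] readings, int buckets) {
        if (readings.length == 0 || buckets <= 0) {
            throw new IllegalArgumentException("need readings and buckets > 0");
        }
        double min = Arrays.stream(readings).min().getAsDouble();
        double max = Arrays.stream(readings).max().getAsDouble();
        double width = (max - min) / buckets;
        int[] counts = new int[buckets];
        for (double r : readings) {
            // If all readings are equal (width == 0), everything goes to bucket 0.
            int idx = (width == 0) ? 0 : (int) ((r - min) / width);
            // The maximum reading would land one past the end, so clamp it.
            counts[Math.min(idx, buckets - 1)]++;
        }
        return counts;
    }

    // Cosine similarity between two consumption vectors of equal length,
    // e.g., the hourly readings of two consumers over the same period.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0; // assumed convention for an all-zero vector
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        double[] consumerA = {0.0, 1.0, 2.0, 3.0, 4.0};
        double[] consumerB = {0.0, 2.0, 4.0, 6.0, 8.0};
        System.out.println(Arrays.toString(histogram(consumerA, 2)));
        System.out.println(cosineSimilarity(consumerA, consumerB));
    }
}
```

In a benchmark setting, the histogram would be computed once per consumer, while cosine similarity would be computed for pairs of consumers, which is why the latter tends to dominate the cost as the number of consumers grows.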

Synthetic Data sets

To use this benchmark, users can generate smart meter time series data with this data generator.

Installation and Usage

The implementations are written in MATLAB for Matlab, plpgsql for MADlib, Q for System C, and Java for Spark and Hive. The installation guidelines are as follows:

MADlib: First install the MADlib library in PostgreSQL, then use the script to create the tables for storing the time series data, and finally install the functions of the benchmarking algorithms.
Spark and Hive: Use the Maven command mvn package to compile the Java program and package it into a jar library.

Go to the corresponding folder of each technology and execute ./run.sh to run the algorithms. Note: the run.sh shell scripts need to be customized according to the user's specific settings.

Publication

X. Liu, L. Golab, W. Golab, I. F. Ilyas, and S. Jin. Smart Meter Data Analytics: Systems, Algorithms and Benchmarking. ACM Transactions on Database Systems (TODS), 42(1), 2016.
X. Liu, L. Golab, W. Golab, and I. F. Ilyas. Benchmarking Smart Meter Data Analytics. In Proc. of the 18th International Conference on Extending Database Technology (EDBT), pp. 385-396, 2015.

Reference

[1] B. J. Birt, G. R. Newsham, I. Beausoleil-Morrison, M. M. Armstrong, N. Saldanha, and I. H. Rowlands, Disaggregating Categories of Electrical Energy End-use from Whole-house Hourly Data, Energy and Buildings, 50:93-102, 2012.

[2] O. Ardakanian, N. Koochakzadeh, R. P. Singh, L. Golab, and S. Keshav, Computing Electricity Consumption Profiles from Household Smart Meter Data, in EnDM Workshop on Energy Data Management, pp. 140-147, 2014.

[3] M. Arlitt, et al. IoTAbench: an Internet of Things Analytics benchmark. In Proc. of ICPE, 2015.
