Final project of the course "Data and Information Quality" held by Prof. Cappiello @ Politecnico di Milano

Tecnarca/data-quality-analyzer

Data Quality Analyzer

This program shows how to analyze the quality of a provided dataset and how to run preliminary analyses and comparisons on chunks of the data, in order to prepare it for data mining tasks.
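As an illustration of the kind of quality measure such an analysis can report, here is a minimal sketch that computes completeness and a duplicate-row ratio over a chunk of rows. It is not taken from the project: the function names `completeness` and `duplicate_ratio`, and the convention that `None` or an empty string marks a missing value, are assumptions made for this example.

```python
from typing import Dict, List, Optional

# A row maps column names to values; None or "" is treated as missing (assumption).
Row = Dict[str, Optional[str]]


def completeness(rows: List[Row]) -> float:
    """Return the fraction of cells that are not missing."""
    total = sum(len(row) for row in rows)
    if total == 0:
        return 1.0  # an empty chunk has no missing cells
    filled = sum(1 for row in rows for value in row.values() if value not in (None, ""))
    return filled / total


def duplicate_ratio(rows: List[Row]) -> float:
    """Return the fraction of rows that exactly repeat an earlier row."""
    if not rows:
        return 0.0
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))  # order-independent, hashable view of the row
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates / len(rows)


if __name__ == "__main__":
    chunk = [
        {"a": "1", "b": "x"},
        {"a": "1", "b": "x"},   # exact duplicate of the first row
        {"a": "2", "b": None},  # missing value
        {"a": "", "b": "y"},    # empty string counted as missing
    ]
    print(completeness(chunk), duplicate_ratio(chunk))
```

Computing the same measures on different chunks makes it possible to compare them before choosing which data to pass on to a mining task.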

We were provided with an almost perfect dataset (no missing values, no duplicated rows, etc.), so we had to dirty it a little before feeding it to the web application for analysis.
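The actual dirtying procedure lives in the notebook mentioned under Usage. The following is only a hypothetical sketch of the general idea, injecting missing values and duplicated rows into a clean dataset. The function `dirty` and its parameters `missing_rate`, `duplicate_rate`, and `seed` are assumptions for this example, not the project's code.

```python
import random
from typing import Dict, List, Optional

Row = Dict[str, Optional[str]]


def dirty(rows: List[Row], missing_rate: float = 0.1,
          duplicate_rate: float = 0.05, seed: int = 0) -> List[Row]:
    """Return a copy of rows with some cells blanked and some rows duplicated."""
    rng = random.Random(seed)  # seeded so the dirtying is reproducible
    dirtied: List[Row] = []
    for row in rows:
        # Replace each cell with None with probability missing_rate.
        new_row = {key: (None if rng.random() < missing_rate else value)
                   for key, value in row.items()}
        dirtied.append(new_row)
        # Append an exact copy of the row with probability duplicate_rate.
        if rng.random() < duplicate_rate:
            dirtied.append(dict(new_row))
    return dirtied


if __name__ == "__main__":
    clean = [{"id": str(i), "city": "Milano"} for i in range(5)]
    for row in dirty(clean, missing_rate=0.2, duplicate_rate=0.2, seed=42):
        print(row)
```

The input list is left unmodified, so the clean and dirtied versions can be compared side by side.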

Usage

To see how we dirtied our datasets and how we created the quality attributes, run this notebook:

jupyter notebook Orginal-Data/File\ Conversion.ipynb 

To start the webapp that contains the Data Quality Analyzer, simply run:

python webapp.py

Note: This command also starts a Flask server on port 5000. To access it, open this page in your browser:

http://localhost:5000/query

Built with


Students: Giacomo Astolfi, Leonardo Febbo. Project for the course "Data and Information Quality" held at Politecnico di Milano by Prof. Cinzia Cappiello, A.Y. 2018/2019.
