kb-Anonymity-Data-Protection-and-Privacy

Final project for the course Data Protection & Privacy: an implementation of the KB-anonymization technique, a framework for anonymizing data for testing purposes.

Getting started

To run the project, it is sufficient to clone or download this repository with the command:

git clone https://github.com/A-725-K/kb-Anonimity-Data-Protection-and-Privacy.git

Our project relies on the Z3 solver. If you don't have it installed, please refer to its main page.

How to launch the program

You only have to run this command from your terminal:

python3 main.py [-h] -i INPUT_FILE -o OUTPUT_FILE -a ALGORITHM -k K -c CONFIG_FILE

where:

-i: the input dataset, in JSON format
-o: the output file, written in JSON format
-a: the technique applied to enhance the anonymization of the data
- P-F: same Path, no Field repeat
- P-T: same Path, no Tuple repeat
-k: the degree of anonymization to apply to the data
-c: a configuration file containing the range constraints to apply to the fields of the tuples in the dataset
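
The options above map naturally onto an argparse parser. The following is only a sketch of how such a parser could be declared, not the code in main.py: the function name build_parser, the dest names, the help strings, and the choice to restrict -a to P-F and P-T are all assumptions made for illustration.

```python
import argparse


def build_parser():
    # Hypothetical reconstruction of the command line shown above;
    # the real main.py may declare its options differently.
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-i", dest="input_file", metavar="INPUT_FILE",
                        required=True, help="input dataset in JSON format")
    parser.add_argument("-o", dest="output_file", metavar="OUTPUT_FILE",
                        required=True, help="output file, written in JSON format")
    parser.add_argument("-a", dest="algorithm", metavar="ALGORITHM",
                        required=True, choices=["P-F", "P-T"],
                        help="anonymization technique")
    parser.add_argument("-k", dest="k", metavar="K", type=int,
                        required=True, help="degree of anonymization")
    parser.add_argument("-c", dest="config_file", metavar="CONFIG_FILE",
                        required=True, help="file with the range constraints")
    return parser


# Parse an explicit argument list instead of sys.argv (file names are made up).
args = build_parser().parse_args(
    ["-i", "data.json", "-o", "out.json", "-a", "P-T", "-k", "3", "-c", "configs.txt"]
)
print(args.algorithm, args.k)
```

Declaring -k with type=int means a non-numeric value is rejected before the anonymization starts.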

Otherwise you can simply launch the test_runner utility:

cd utilities
./test_runner

How the repo works

datasets: it contains all the data used in our experiments, and a bash script to gather them through an open API
kb_anonymization: the core of the program, containing the library we propose
mappings: each file contains a map that encodes some values as integers
main.py: the entry point of the program; users may want to modify it depending on their needs
p_test.py: the system under test (SUT); users have to encode their own program following this example
stat: contains plots of the results produced by the test runner
utilities
- configs.txt: an example of configuration file, it must follow a specific syntax
- json_reader.py: a utility to parse the dataset, the user should modify it depending on their data
- draw_graphics.py: a script that plots the results of the algorithms executed in batch
- test_runner.sh: a simple script to perform some experiments with different parameters to understand the behavior of the algorithm

1. p_test format
p_test must contain a function called P_Test, which simulates the behaviour of the system we want to test. It takes a raw tuple and a list of constraints (initially empty) as input. A constraint is a triple (field, operation symbol, value).
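
As an illustration, a P_Test function could look like the sketch below. It is not the p_test.py shipped in the repository. It assumes that the raw tuple is a dict, that P_Test appends one constraint per branch it takes, and that it returns the constraint list. The fields "age" and "income" and their thresholds are invented.

```python
def P_Test(raw_tuple, constraints):
    # Hypothetical system under test: each branch taken appends the
    # condition that led there as a (field, operation symbol, value) triple.
    if raw_tuple["age"] >= 18:
        constraints.append(("age", ">=", 18))
        if raw_tuple["income"] > 30000:
            constraints.append(("income", ">", 30000))
        else:
            constraints.append(("income", "<=", 30000))
    else:
        constraints.append(("age", "<", 18))
    return constraints


# The constraint list starts empty and is filled while the tuple is processed.
path = P_Test({"age": 42, "income": 12000}, [])
print(path)
```

Under these assumptions, two tuples that yield the same list of triples follow the same path through the program.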

2. configs format
In this file the user specifies the range constraints for each field of a tuple. The first row must contain all the fields present in the dataset, as strings. Each following row must use one of two syntaxes. If the constraints relate to a single field:

field:(([op_symbol value]+),?)+

otherwise, if the constraints involve two related fields:

#field1 op_symbol field2

A comma separates conditions to be combined with OR, while whitespace separates conditions to be combined with AND.
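
To make the two row formats concrete, here is a small parsing sketch. It is not the parser used by the project. The set of operator symbols, the function names parse_field_constraint and parse_relation, and the sample rows are assumptions; the syntax above does not say which symbols are supported or whether a space separates an operator from its value, so the sketch accepts both spellings.

```python
import re

# Assumed operator set; the supported symbols are not listed in this README.
_CONDITION = re.compile(r"(<=|>=|==|!=|<|>)\s*([^\s,]+)")


def parse_field_constraint(line):
    """Parse 'field:cond cond,cond' into (field, OR-list of AND-lists)."""
    field, _, body = line.partition(":")
    or_groups = []
    for group in body.split(","):  # a comma separates OR alternatives
        and_conditions = _CONDITION.findall(group)  # whitespace separates AND terms
        if and_conditions:
            or_groups.append(and_conditions)
    return field.strip(), or_groups


def parse_relation(line):
    """Parse '#field1 op_symbol field2' into (field1, op_symbol, field2)."""
    field1, op_symbol, field2 = line.lstrip("#").split()
    return field1, op_symbol, field2


# Hypothetical rows: (age >= 18 AND age <= 65) OR age == 0
print(parse_field_constraint("age:>= 18 <= 65,== 0"))
print(parse_relation("#start_year <= end_year"))
```

Values are kept as strings here; converting them to integers is left to the caller.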

Authors

Andrea Canepa - Computer Science, UNIGE - Data Protection and Privacy a.y. 2019/2020
Alessio Ravera - Computer Science, UNIGE - Data Protection and Privacy a.y. 2019/2020
