This is a simple package that replaces the sensitive data in a CSV file with fake data and writes the result to a new file.
This allows the new data to be used for training, testing or analytics, without compromising private information.
Run the following command to install the package:
pip install datamizer
1- Instantiate the Datamizer class, passing the path to the CSV file and, optionally, the CSV delimiter.
from datamizer import Datamizer
csv_datamize = Datamizer('file.csv')
2- Use fake() to anonymize the columns with sensitive data, passing the column, the provider, and optionally the consistent argument.
csv_datamize.fake('Username', 'user_name', consistent=True)
csv_datamize.fake('First name', 'first_name', consistent=True)
csv_datamize.fake('Last name', 'last_name', consistent=True)
csv_datamize.fake('email', 'email', consistent=True)
csv_datamize.fake('Money', 'pricetag')
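The consistent argument is not described in more detail here. One plausible reading is that equal original values are replaced by the same fake value on every row, so relationships between rows survive anonymization. The sketch below illustrates that idea with the standard library only; it is not Datamizer's actual implementation, and ConsistentFaker and _make_fake are hypothetical names made up for illustration.

```python
import random
import string


class ConsistentFaker:
    """Hypothetical helper: maps each original value to one stable fake value."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)  # seeded so repeated runs give the same fakes
        self._cache = {}                 # original value -> fake value

    def _make_fake(self):
        # Stand-in for a real provider such as 'user_name'.
        return "user_" + "".join(self._rng.choices(string.ascii_lowercase, k=8))

    def fake(self, original):
        # Generate a fake value only the first time an original value is seen.
        if original not in self._cache:
            self._cache[original] = self._make_fake()
        return self._cache[original]


faker = ConsistentFaker()
column = ["alice", "bob", "alice"]
faked = [faker.fake(value) for value in column]
print(faked)  # both "alice" rows receive the same fake user name
```

Without a cache like this, each row would get an unrelated fake value, which is presumably what happens for columns such as Money where consistent is not set.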
3- Write a new CSV file with the fake data, passing the path to the new file and optionally index=True to include the index.
csv_datamize.write_csv('users.csv')
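After writing the new file, it can be worth checking that no original value survived in an anonymized column. The following is a minimal sketch of such a check using only the standard csv module; leaked_values is a hypothetical helper, not part of Datamizer, and the in-memory sample data is made up for illustration.

```python
import csv
import io


def leaked_values(original_file, anonymized_file, column):
    """Return the values of `column` that appear in both files (hypothetical check)."""
    original = {row[column] for row in csv.DictReader(original_file)}
    anonymized = {row[column] for row in csv.DictReader(anonymized_file)}
    return original & anonymized


# In-memory stand-ins for the original and the anonymized CSV files.
original_csv = io.StringIO("email,Money\nann@example.com,10\nbob@example.com,20\n")
anonymized_csv = io.StringIO("email,Money\nx1@example.org,$5.00\nbob@example.com,$7.50\n")

leaks = leaked_values(original_csv, anonymized_csv, "email")
print(leaks)  # {'bob@example.com'}: this address was not replaced
```

With real files, the same function can be called with open('file.csv', newline='') and open('users.csv', newline='') in place of the StringIO objects. An empty result means no original value of that column appears in the new file.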