Open source data anonymization and synthetic data orchestration for developers. Create high fidelity synthetic data and sync it across your environments.
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
Synthesizer - a code for creating synthetic astrophysical spectra
Synthetic Patient Population Simulator
Design, conduct and analyze results of AI-powered surveys and experiments. Simulate social science and market research with large numbers of AI agents and LLMs.
Metrics to evaluate quality and efficacy of synthetic datasets.
A library to model multivariate data using copulas.
⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.
Synthetic data generation for tabular data
UnrealCV: Connecting Computer Vision to Unreal Engine
awesome synthetic (text) datasets
Python package for the generation and evaluation of synthetic time-series data.
This project allows users to generate synthetic videos from CAD models, including .npy files with additional information. Models are loaded dynamically into a Blender scene, and the camera smoothly moves along spherical points to create the final video.
Plugin for metasyn that prevents data from leaking.
Synthetic data generators for tabular and time-series data
A library to generate synthetic tabular or RDF data using Conditional Generative Adversarial Networks (GANs) combined with Differential Privacy techniques.
Open-source version of the TDspora synthetic data generation algorithm.
Synthetic tree point cloud dataset with ground truth skeletons.
BENERATOR is a leading software solution to generate, obfuscate, pseudonymize and migrate data for development, testing, and training purposes with a model-driven approach.
PostgreSQL database anonymization tool