Collection of Synthetic Data for Use in Data Loss Prevention (DLP) or Data Security Posture Management (DSPM) Testing, Validation, and Assurance. This repository contains sample data and scripts to generate new data.

samerfarida/SyntheticDataHub

Synthetic Data Hub

This repository provides synthetic data collections for testing and validating the effectiveness of Data Loss Prevention (DLP) and Data Security Posture Management (DSPM) solutions, along with other use cases described in the project wiki.

It includes sample datasets and scripts to generate new data, covering categories such as Personally Identifiable Information (PII), Human Resources (HR), Payment Card Industry (PCI) data, Protected Health Information (PHI), and others.

The provided Python scripts let you customize the generated data: you can choose the output file type (e.g., 'json', 'csv', 'excel', 'word', 'pdf', or 'txt'), the locale (e.g., en_US, fr_FR), the record count, and other parameters, so you can tailor test scenarios to your specific use cases.
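
To illustrate how locale, record count, and file type can combine, here is a minimal standard-library sketch. It is not the repository's own script: the function names (`generate_records`, `render`), the field names, and the tiny per-locale name pools are all hypothetical and exist only for this example, and only 'json' and 'csv' output are covered.

```python
import csv
import io
import json
import random

# Hypothetical per-locale name pools, for illustration only.
NAME_POOLS = {
    "en_US": (["Alice", "Robert", "Maria"], ["Smith", "Johnson", "Lee"]),
    "fr_FR": (["Camille", "Louis", "Manon"], ["Martin", "Bernard", "Dubois"]),
}

# Hypothetical record layout used by this sketch.
FIELDS = ["id", "locale", "name", "card_like_number"]


def generate_records(locale="en_US", count=5, seed=None):
    """Return `count` synthetic person records for the given locale."""
    if locale not in NAME_POOLS:
        raise ValueError(f"unsupported locale: {locale}")
    rng = random.Random(seed)  # a seed makes the output reproducible
    first_names, last_names = NAME_POOLS[locale]
    records = []
    for i in range(1, count + 1):
        records.append({
            "id": i,
            "locale": locale,
            "name": f"{rng.choice(first_names)} {rng.choice(last_names)}",
            # 16 random digits shaped like a card number; no Luhn check is applied.
            "card_like_number": "".join(rng.choice("0123456789") for _ in range(16)),
        })
    return records


def render(records, file_type="json"):
    """Serialize records as a 'json' or 'csv' string."""
    if file_type == "json":
        return json.dumps(records, indent=2, ensure_ascii=False)
    if file_type == "csv":
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)
        return buffer.getvalue()
    raise ValueError(f"unsupported file type: {file_type}")


if __name__ == "__main__":
    # Three French-locale records, printed as CSV.
    print(render(generate_records(locale="fr_FR", count=3, seed=42), "csv"))
```

Passing a fixed `seed` keeps the generated records stable between runs, which can help when comparing DLP or DSPM detection results across tool versions.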
