The Statice software generates synthetic data: new data points that reflect the structure and statistical properties of the real data. Synthetic data generated by Statice can be used just like real data, but in a legally compliant manner.
Read more about what synthetic data is and how it works here.
Statice can be used like any regular Python library. This is the most flexible way to integrate synthetic data into an existing data pipeline.
Alternatively, we ship Statice with a CLI wrapper and pre-packaged dependencies.
Install Statice as a Python library or as a standalone application. We provide a Python package, an executable, a Docker image, and images for major cloud providers.
Statice will analyze your data and detect the datatypes, shapes and dependencies. Statice comes with a special plugin system which allows integration of various custom datatypes, specific to your business, such as IBAN and phone numbers.
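To make the idea of pluggable datatype detection concrete, here is a minimal sketch in plain Python. It is not Statice's plugin API: the `DETECTORS` registry, the `register` decorator, and `detect_type` are hypothetical names chosen for illustration. The IBAN check uses the standard mod-97 checksum, and the phone check is a deliberately loose E.164-style pattern.

```python
import re
from typing import Callable, Dict, Iterable

# Hypothetical registry (not Statice's API): type name -> predicate on one value.
DETECTORS: Dict[str, Callable[[str], bool]] = {}


def register(name: str):
    """Decorator that adds a detector under the given type name."""
    def wrap(fn: Callable[[str], bool]) -> Callable[[str], bool]:
        DETECTORS[name] = fn
        return fn
    return wrap


@register("iban")
def is_iban(value: str) -> bool:
    # Mod-97 check: move the first four characters to the end,
    # map letters to 10..35, and require a remainder of 1.
    s = value.replace(" ", "").upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", s):
        return False
    rearranged = s[4:] + s[:4]
    digits = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(digits) % 97 == 1


@register("phone")
def is_phone(value: str) -> bool:
    # Loose E.164-style pattern; real phone validation is country-specific.
    s = re.sub(r"[ \-()]", "", value)
    return re.fullmatch(r"\+[1-9]\d{6,14}", s) is not None


def detect_type(values: Iterable[str]) -> str:
    """Return the first registered type that every non-empty value matches."""
    vals = [v for v in values if v]
    for name, check in DETECTORS.items():
        if vals and all(check(v) for v in vals):
            return name
    return "text"  # fallback when no custom detector applies
```

With this sketch, a column is labeled with a custom type only if every non-empty value passes that type's check; a new business-specific type is added by registering one more predicate.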
In this step, Statice knows how to map your data to the internal datatypes needed to train a set of generative models we developed. There's a lot happening at this stage: data encoding, structural learning, hyperparameter optimization, model fitting and synthetic data sampling.
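The fit-then-sample idea can be illustrated with a deliberately tiny toy model. This is not one of Statice's generative models: it assumes every column is categorical, learns each column's value frequencies independently, and ignores the dependencies between columns that structural learning would capture. The names `fit_marginals` and `sample` are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List, Sequence

Row = Dict[str, str]


def fit_marginals(rows: Sequence[Row]) -> Dict[str, Counter]:
    """Toy 'model fitting': count how often each value occurs per column."""
    model: Dict[str, Counter] = {}
    for row in rows:
        for col, value in row.items():
            model.setdefault(col, Counter())[value] += 1
    return model


def sample(model: Dict[str, Counter], n: int, seed: int = 0) -> List[Row]:
    """Draw n synthetic rows, sampling each column independently."""
    rng = random.Random(seed)  # seeded for reproducible output
    columns = {c: (list(cnt.keys()), list(cnt.values())) for c, cnt in model.items()}
    return [
        {c: rng.choices(vals, weights=w, k=1)[0] for c, (vals, w) in columns.items()}
        for _ in range(n)
    ]


real = [
    {"city": "Berlin", "plan": "basic"},
    {"city": "Berlin", "plan": "pro"},
    {"city": "Munich", "plan": "basic"},
]
model = fit_marginals(real)
synthetic = sample(model, n=5, seed=1)
```

Every sampled row is a new combination drawn from the learned frequencies rather than a copy of an input row by construction, although a toy like this can still reproduce real rows by chance.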
Now the synthetic data is ready to use. Additionally, Statice will evaluate the utility and privacy of the newly generated data. The software generates a report with different evaluation methods, ranging from simple statistical checks to machine learning evaluations and simulated privacy attacks.
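As an illustration of what the simplest checks in such a report might look like, the sketch below computes one utility metric and one crude privacy metric. Both functions are assumptions made for this example, not Statice's evaluation code: `total_variation` compares the value frequencies of a single column, and `exact_match_rate` measures how many synthetic rows are verbatim copies of real rows.

```python
from collections import Counter
from typing import Dict, Sequence


def total_variation(real: Sequence[str], synth: Sequence[str]) -> float:
    """Utility check: total variation distance between two columns (0 = identical)."""
    p, q = Counter(real), Counter(synth)
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p[k] / len(real) - q[k] / len(synth)) for k in keys)


def exact_match_rate(real_rows: Sequence[Dict[str, str]],
                     synth_rows: Sequence[Dict[str, str]]) -> float:
    """Crude privacy check: share of synthetic rows that copy a real row exactly."""
    real_set = {tuple(sorted(r.items())) for r in real_rows}
    hits = sum(tuple(sorted(r.items())) in real_set for r in synth_rows)
    return hits / len(synth_rows)
```

A low distance suggests the synthetic column preserves the real distribution, while a high exact-match rate is a warning sign that real records may be leaking. Simulated privacy attacks go well beyond this kind of exact-copy check.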
Statice is based on state-of-the-art machine learning and privacy research, which enables businesses to explore data while staying compliant with modern privacy laws.
On-premises integration of Statice gives you full control of the data, as it never leaves your data center.
Statice generates privacy-preserving synthetic data, which can be freely shared across your organization, while staying compliant.
Statice is implemented as a library and command line application, which makes it easy to integrate into your existing data flow.
Statice can handle large amounts of data. We have already shown that Statice can easily process data sets with tens of millions of entries and over 5000 dimensions.