Statice's synthetic data technology is now part of Anonos Data Embassy, the award-winning data security and privacy solution.
We are kicking off our series of Statice team interviews with Beatrice and Benjamin, both part of our commercial team. They are in charge of developing our activities and supporting our clients from the definition of the project to the deployment of the POC or the first project.
Beatrice is our business development representative. She has been with the team since 2019 and is responsible for researching and contacting relevant stakeholders to discuss their data challenges. She has hands-on technical knowledge and an upbeat commercial attitude that truly make her a crucial asset to our team. With her experience, she understands the problems companies and data teams face today.
Benjamin is our head of business development. He joined the Statice adventure in early 2019. Since then, he has worked with hundreds of teams to help them overcome their data access and privacy challenges. He has spent the last eight years working in data-heavy B2B startups and scaleups. Today, his ability to understand the reality of enterprise challenges and his excitement about data-driven innovation make him an essential partner for our clients.
Beatrice: I’d like to begin by saying that this is my first job ever and I couldn’t have wished for a better one. Being part of the commercial team at Statice is a wonderful experience. Our team is always dedicated to bridging the gap between the product we offer at Statice and the clients we support. It contributes to the feeling that we are a partner in the true sense of the word.
Ben: Being on the commercial team of any early-stage company is great fun! Our day-to-day is quite varied, and can range from designing and implementing campaigns to educate companies on the value of proper data protection through anonymization, to working intensively with client teams to discover, scope, and plan implementations and use cases for the Statice software. We’re also the team who spends the most time listening to real-life challenges from our customers, so we play a facilitator role for the product team as well, bringing them new feature requests and client feedback.
Beatrice: The companies we work with include large enterprises that process huge amounts of customer data and face problems such as sharing data across borders, training machine learning models on customer data, and driving innovation through product development, among others. We’ve basically worked with most, if not all, profiles of companies in finance, insurance, and healthcare.
Ben: I’d add that while we generally work with large(r) companies, in the end, data privacy is a big issue for all companies that collect, store, and use sensitive data, and a company doesn’t necessarily need to have a big headcount or to be doing crazy revenues to have lots of customers or users. As Beatrice mentioned in her answer, we do see the most interest from finance, healthcare and insurance, as these industries by default have a lot of sensitive data on their customers (or patients). But for example, automotive, mobility, and consumer electronics are some other verticals that are gathering more and more data via their products as these products get “smart”, and are thus facing increasing challenges with regard to data protection.
Beatrice: In my experience, I’ve noticed that companies have become more privacy-conscious in 2020, but from what I could tell, it was mainly because of the legal regulations.
Ben: I’d also agree with Beatrice that we’ve seen organizations become more privacy-conscious in 2020. Something else we noticed is the operationalization of data protection and privacy going deeper into organizations. Where in 2018/19 we spoke a lot with innovation teams, in 2020 it seems that synthetic data for privacy has become a “production” topic for large companies, and so we’re now working more with people “in the weeds” of the topic, from data science and engineering teams, to product teams, and data protection and security.
Ben: I feel like we could do a whole story just on this! I think the events of the last two decades or so, in terms of huge portions of the global population coming online, 9/11 and the free-for-all data collection that was allowed after it, Google and targeted advertising, Facebook et al. figuring out that the collective data exhaust of online humanity was worth more than gold, ML/AI, and the resulting data ethics/AI ethics/Right to be forgotten themes have all played huge parts in getting us where we are today.
Beatrice: On a less macro scale, the rise of machine learning definitely contributed to this. Why? Well, because machine learning cannot exist without data, and nowadays machine learning is widespread. In large enterprises, for example, it plays an important role in decision-making processes. Given this, questions were raised as to why individuals would agree to let corporations feed their data into such models and give up their privacy. This has been a legitimate question, and teams worldwide are deploying privacy-enhancing technologies (such as synthetic data) to unlock the full potential of the digital era and protect our privacy and right to individuality.
Ben: From a theoretical perspective, any company that collects sensitive data (which is almost every consumer-facing business, and plenty of others to boot) and wants to use this data safely needs to have tools to manage privacy. Practically, companies tend to actually pay attention to this requirement once they have a combination of scale (for example, that regulators would look at them) and significant business or financial risk (of not innovating, or of being non-compliant). What that boils down to in terms of a pattern is large companies in primarily B2C verticals, with data science teams who are driving value from sensitive data.
Beatrice: Industry dynamics have also been an indicator. From what we’ve seen, the competitive landscape is a great incentive for companies. For instance, if a bank uses its customer data to launch a new service that solves a common problem, the other banks would ideally follow quickly or risk losing their customers. The problem arises when the other banks are impeded by data access and privacy issues. So, what I would consider a critical need for a solution is the case in which innovation within a company is greatly weighed down by data privacy.
Ben: I think the biggest challenge is the diversity of comprehension with regard to data science topics, which is partially caused by the large number of stakeholders present in decision making around data topics in large companies. A significant additional challenge is the presence of (and therefore, unfortunately, the interaction with) complex legacy IT systems.
The former challenge is relatively easy to tackle with educational material prepared specifically for stakeholders of various backgrounds. Lawyers, IT security specialists, and data scientists/engineers all have different contexts on data protection and data value, and we aim to cater to these differences with our material, as well as through 1:1 sessions and workshops which we run with each new partner. The latter challenge is sometimes harder to overcome, but we work very closely with developer and infrastructure teams to ensure that no matter what their stack, we can get Statice humming away safely on their premises.
Ben: I think one big thing that was really apparent this year is that even a lot of the biggest firms in finance, insurance, healthcare and automotive are not fully on top of privacy. We only very rarely meet teams where all stakeholders understand the relevant sections of GDPR, and understand for example where data protection technologies are suitable, and where they’re not. It amazes me that in 2020, we still hear “but we can just delete PII and it’s fine, right?” quite often (ed. note: see our post on the shortcomings of data masking for more details).
Beatrice: Our feeling is that, at least for machine learning applications, privacy-preserving synthetic data will increasingly become an alternative to real customer data. The main reason for that is the fact that more and more teams are using machine learning to automate and improve operations and processes and a privacy-enhancing technology like synthetic data significantly reduces the risks of data processing for organizations.
Ben: I agree with Beatrice. Of course, there are some use cases that will always need some level of access to real data, but I do believe that synthetic data is becoming a key technology in data science, and will remain one for a long time. At some point, we might see some emerging technologies (homomorphic encryption, SMPC, etc) better fitting specific use cases or offering other guarantees of privacy, but for now, I see privacy-preserving synthetic data generation as a capability that is and will remain pretty important for enterprises looking to drive value from data (which should be most of them!).
Beatrice: Self-endorsement notice: the Statice blog!
Ben: It’s not specific to PET, but I find a lot of interesting pieces on KDnuggets, more on the technical side of things, and on the legal and regulatory aspects, IAPP publishes a lot of good content. Otherwise, as Beatrice mentioned, we put a lot of effort into our blog, as well as collecting a lot of industry news into our monthly newsletter ;).
We hope you enjoyed this interview. Stay tuned for more! You can connect with Ben and Beatrice on LinkedIn and get in touch with them to discuss your project.