An Interview With Dr. Khaled El Emam, Senior Vice President And General Manager At Replica Analytics

Below is our recent interview with Dr. Khaled El Emam, Senior Vice President and General Manager at Replica Analytics:

Q: Could you provide our readers with a brief introduction to your company?

A: Certainly. Replica Analytics is the premier science-based synthetic data generation technology provider to the healthcare industry. I co-founded Replica Analytics three years ago, after working for about 20 years in the anonymization and de-identification space, developing privacy-enhancing technologies.

Many business and technology analysts predict that synthetic data generation (SDG) is the future of data sharing because it addresses many of the challenges with sharing and reusing personal data, challenges that stem largely from privacy concerns. Synthetic data generation involves a machine learning model that learns the statistical patterns and properties of real data and then creates a synthetic dataset that maintains features of the original dataset, but with no one-to-one mapping back to a person. The synthetic data is considered non-identifiable, but because of the high quality of the data, you can come to the same conclusions as you would with the real data. We help our clients share, reuse, amplify and augment data with these privacy- and utility-preserving methodologies.
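The idea described above, a model that learns statistical patterns from real data and then samples new records from what it learned, can be illustrated with a deliberately small Python sketch. It is not Replica Analytics' method: it assumes a simple bivariate normal model, and the function names (`fit_model`, `generate`) and the made-up patient data are hypothetical.

```python
import math
import random
import statistics


def fit_model(rows):
    """Learn means, standard deviations, and the correlation of two numeric columns."""
    xs = [x for x, _ in rows]
    ys = [y for _, y in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
    # Sample covariance divided by both standard deviations gives Pearson's r.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows) / (len(rows) - 1)
    return {"mean_x": mean_x, "mean_y": mean_y,
            "sd_x": sd_x, "sd_y": sd_y, "corr": cov / (sd_x * sd_y)}


def generate(model, n, seed=None):
    """Draw n new records from the fitted bivariate normal distribution."""
    rng = random.Random(seed)
    rho = model["corr"]
    residual = math.sqrt(max(0.0, 1 - rho ** 2))
    records = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = model["mean_x"] + model["sd_x"] * z1
        # Mixing z1 into y reproduces the learned correlation between columns.
        y = model["mean_y"] + model["sd_y"] * (rho * z1 + residual * z2)
        records.append((x, y))
    return records


# Hypothetical "real" data: (age, systolic blood pressure) for 500 made-up patients.
source_rng = random.Random(0)
real = []
for _ in range(500):
    age = source_rng.gauss(50, 10)
    real.append((age, 90 + 0.6 * age + source_rng.gauss(0, 8)))

model = fit_model(real)
synthetic = generate(model, 5000, seed=1)
refit = fit_model(synthetic)  # statistics of the synthetic data, for comparison
```

Comparing `refit` with `model` shows that the synthetic records reproduce the learned means and correlation even though each record is freshly sampled rather than copied from a real row. A toy model like this says nothing about real-world privacy risk or utility, which need to be assessed separately.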

Q: Any highlights on your recent announcement?

A: Yes, we appreciate your interest. We recently announced an important partnership with Critical Path Institute (C-Path) that involves leveraging synthetic data to help accelerate research into rare diseases. Datasets for rare diseases are often quite small, which only heightens privacy and accessibility challenges, so this is an important use case for synthetic data. We’re seeing a growing opportunity to partner with organizations like C-Path to generate datasets that are fit-for-purpose.

I’d like to highlight another key announcement as well, which may be of interest to your readers. Earlier this year, we were acquired by Aetion, the leading regulatory-grade real-world evidence (RWE) provider. It’s an indicator of how rapidly adoption of synthetic data is growing. We operate as a subsidiary of Aetion, and the idea behind the acquisition is that Replica complements Aetion’s technology portfolio by opening access to previously inaccessible real-world data (RWD).

Q: Can you give us more insights into your offering?

A: Our Replica Synthesis technology, as I mentioned earlier, uses artificial intelligence (AI) to generate synthetic data from real datasets. It’s a largely automated process, which saves our clients time and effort, and is one of the things that distinguishes it from traditional de-identification methods. It is scalable – so we can handle small and large datasets, as well as simpler and more complex ones. We provide a comprehensive report demonstrating the utility – the quality – of the synthetic data we produce.

Our Privacy Assurance Technology, meanwhile, provides the evidence and confidence our clients need to demonstrate compliance with various legal regimes for data protection. We can accurately assess disclosure and privacy risks against widely accepted thresholds for identifiability, and the results are outlined in a detailed report.

Through our Simulator Exchange, the generative models we create during the synthesis process can be saved as simulators, which data users can use to generate synthetic data on demand, whenever they need it and as much as they require, without needing access to the original data.
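The general pattern of saving a fitted generative model and sampling from it later, without the source data, can be sketched as follows. This is only an illustration under simplifying assumptions, not the Simulator Exchange itself: the `Simulator` class, its JSON format, and the made-up lab values are all hypothetical, and the "model" is just a one-column normal distribution.

```python
import json
import random
import statistics


class Simulator:
    """Hypothetical saved generator: holds fitted parameters only, never the source records."""

    def __init__(self, mean, stdev):
        self.mean = mean
        self.stdev = stdev

    @classmethod
    def fit(cls, values):
        """Learn the parameters from the original values."""
        return cls(statistics.fmean(values), statistics.stdev(values))

    def to_json(self):
        """Serialize only the learned parameters."""
        return json.dumps({"mean": self.mean, "stdev": self.stdev})

    @classmethod
    def from_json(cls, text):
        """Rebuild a simulator from its saved parameters."""
        params = json.loads(text)
        return cls(params["mean"], params["stdev"])

    def generate(self, n, seed=None):
        """Sample n synthetic values from the stored distribution."""
        rng = random.Random(seed)
        return [rng.gauss(self.mean, self.stdev) for _ in range(n)]


# Data holder: fit once on made-up lab values, then share only the saved parameters.
original = [5.1, 4.8, 6.2, 5.5, 5.9, 4.6, 5.3, 6.0]
saved = Simulator.fit(original).to_json()
del original  # the data user below never sees the source records

# Data user: restore the simulator and request as much synthetic data as needed.
simulator = Simulator.from_json(saved)
small_batch = simulator.generate(10, seed=1)
large_batch = simulator.generate(1000, seed=2)
```

The design point is the separation of roles: fitting happens once where the original data lives, while generation can be repeated anywhere the saved parameters are available, at whatever volume the user asks for.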

Q: What can we expect from your company in the next 6 months? What are your plans?

A: You can expect us to grow as a company to meet the demand for synthetic data. We are actively recruiting for several positions, including machine learning engineers, software engineers and data scientists. You will see us continue to refine and improve our products and methodologies so that our solutions remain recognized and in demand as this sector develops. You can also expect that we will continue to offer thought leadership and education in this area through, for example, papers, webinars, tutorials and research. We also intend to establish frameworks for evaluating the privacy risks and utility of synthetic data. And we will be making more case studies and example applications available to continue strengthening the case, and the supporting evidence, that synthetic data is fit-for-purpose.

Q: What is the best thing about your company that people might not know about?

A: Here, I will highlight a couple of things, but they are connected. We are long-time and trusted leaders in developing privacy-enhancing technologies, and our work is very much science-based. We are deliberately transparent with our research, and we educate and publish widely in this area. For this reason, we do not suggest synthetic data generation is a panacea, but it certainly has many advantages over traditional methods for creating non-identifiable data.
