Synthetic data could soon become the next “big thing” in the healthcare and life sciences industry. According to Gartner, “by 2024, 60% of the data used for the development of AI and analytics projects will be synthetically generated.”
In this article, you will learn about:
- Synthetic data
- The challenges of patient recruitment in clinical trials
- The value of synthetic data in clinical trials
- The “fairness” of synthetic data
- The potential of synthetic data in the future of clinical trials
Synthetic data
Synthetic data is annotated information generated by computer simulations or algorithms as an alternative to real-world data. Put another way, synthetic data is created digitally rather than collected or measured in the real world, although it is often based on real-world data.
Some of the methods to generate synthetic data include:
- Statistically rigorous sampling from real-world data
- Generative modeling
- Simulation scenarios where models and processes interact to create entirely new datasets of events
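To make the first two methods concrete, here is a minimal Python sketch. The records, column meanings, and function names are hypothetical and chosen only for illustration. It resamples existing records with replacement, then fits a very simple generative model (independent normal distributions per column) and draws new values from it. Real generative models are far richer; in particular, this one ignores correlations between columns.

```python
import random
import statistics

# Hypothetical "real-world" records: (age in years, systolic blood pressure in mmHg).
REAL_RECORDS = [(54, 132), (61, 141), (47, 125), (69, 150), (58, 137), (52, 129)]


def bootstrap_sample(records, n, rng):
    """Sampling approach: draw n records from the real data with replacement."""
    return [rng.choice(records) for _ in range(n)]


def fit_gaussian(values):
    """Fit a one-dimensional normal distribution and return (mean, stdev)."""
    return statistics.mean(values), statistics.stdev(values)


def generate_from_model(records, n, rng):
    """Generative approach: fit one Gaussian per column, then draw new values.

    Assumption: columns are treated as independent, which real data rarely is.
    """
    age_mu, age_sd = fit_gaussian([r[0] for r in records])
    bp_mu, bp_sd = fit_gaussian([r[1] for r in records])
    return [
        (round(rng.gauss(age_mu, age_sd)), round(rng.gauss(bp_mu, bp_sd)))
        for _ in range(n)
    ]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
resampled = bootstrap_sample(REAL_RECORDS, 5, rng)
generated = generate_from_model(REAL_RECORDS, 5, rng)
```

The resampled rows are always copies of real records, while the generated rows are new value combinations that only share the fitted averages and spread with the original data.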
The challenge of patient recruitment in clinical trials
In drug development, the safety and efficacy of treatment are assessed during clinical trials. Recruiting patients who meet eligibility criteria is one of the most significant challenges that contribute to extended timelines and high costs of clinical trials.
Globally, 80% of clinical trials fail to enroll patients on time. Some sources suggest that each day of delay in a drug launch costs sponsors between $600,000 and $8 million, so failures in timely patient enrollment contribute significantly to the cost of clinical trials and the overall cost of bringing new treatments to the market.
The value of synthetic data in clinical trials
In randomized clinical trials (RCTs), patients are randomly assigned to one of two groups. Patients in the experimental arm receive the treatment, while patients in the control arm receive a placebo or the standard of care.
The use of synthetic data offers the potential to replace the randomized control arm – made up of patients receiving a placebo or the standard of care – with a synthetic control arm, also known as an external control arm, made up of synthetic data. While synthetic control arms may not be a suitable solution for all clinical trials, they hold great promise, particularly where patient populations are challenging to recruit or assess in randomized clinical trials.
Synthetic control arms can be derived from real-world data (RWD) such as electronic health records (EHR), claims and prescription databases, wearables data, and other de-identified patient health records sources.
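As a rough illustration of how a control arm might be assembled from real-world records, the Python sketch below filters a pool of de-identified patients by eligibility criteria and then pairs each treated patient with the closest eligible record. Everything here is an assumption made for the example: the `Patient` fields, the eligibility thresholds, and the greedy matching on disease stage and age. Real analyses typically match on many more covariates, for instance with propensity scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patient:
    patient_id: str
    age: int
    stage: int  # hypothetical disease stage


def is_eligible(p, min_age=18, max_age=75, allowed_stages=(2, 3)):
    """Hypothetical eligibility criteria mirroring the trial protocol."""
    return min_age <= p.age <= max_age and p.stage in allowed_stages


def build_external_control_arm(treated, rwd_pool):
    """Greedy 1:1 matching without replacement: same stage, closest age."""
    available = [p for p in rwd_pool if is_eligible(p)]
    matches = {}
    for t in treated:
        candidates = [p for p in available if p.stage == t.stage]
        if not candidates:
            continue  # no comparable real-world record for this patient
        best = min(candidates, key=lambda p: abs(p.age - t.age))
        matches[t.patient_id] = best
        available.remove(best)  # each record is used at most once
    return matches


treated = [Patient("T1", 60, 2), Patient("T2", 45, 3)]
rwd_pool = [
    Patient("R1", 58, 2),
    Patient("R2", 80, 2),  # excluded: above the age limit
    Patient("R3", 47, 3),
    Patient("R4", 63, 2),
    Patient("R5", 44, 1),  # excluded: stage not allowed
]
matches = build_external_control_arm(treated, rwd_pool)
```

In this toy pool, T1 is paired with R1 and T2 with R3, while the two ineligible records never enter the control arm.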
The use of synthetic control arms helps to address the ethical challenge of directing patients to clinical trials in which they may receive a placebo rather than an active agent. Since patients with severe diseases, such as cancer, may decide not to participate in a clinical trial knowing the possibility of being randomized into a control group, synthetic control arms improve the chances of recruiting the required number of qualifying patients. Additionally, synthetic control arms offer unique value for clinical trials in rare diseases, where the number of eligible patients is small and patient recruitment therefore presents an even greater challenge.
By leveraging synthetic control arms, organizations can significantly improve their enrollment timelines and reduce the cost of patient recruitment, ultimately accelerating time-to-market for novel treatments.
The “fairness” of synthetic data
Health inequities among different patient populations are a persistent problem in public health. Inequitable representation of patient subpopulations in synthetic data could lead to inaccurate analysis, where conclusions and predictive models do not represent the real world. Any application of synthetic data in clinical trials must be accompanied by efforts to measure fairness, so that machine learning models can be developed to create more equitable synthetic healthcare datasets.
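One simple way to start measuring representation is to compare each subgroup's share in the synthetic dataset with its share in the real population. The Python sketch below does exactly that. The subgroup labels, the counts, and the 10-percentage-point threshold are hypothetical, and this check is only one of many possible fairness measures.

```python
from collections import Counter


def subgroup_shares(labels):
    """Return each subgroup's share of the dataset as a fraction of 1."""
    counts = Counter(labels)
    total = len(labels)
    return {group: count / total for group, count in counts.items()}


def representation_gaps(real_labels, synthetic_labels):
    """Synthetic share minus real share, per subgroup.

    Positive values mean the subgroup is over-represented in the synthetic data.
    """
    real = subgroup_shares(real_labels)
    synth = subgroup_shares(synthetic_labels)
    groups = set(real) | set(synth)
    return {g: synth.get(g, 0.0) - real.get(g, 0.0) for g in groups}


# Hypothetical subgroup labels for 100 real and 100 synthetic patients.
real = ["A"] * 50 + ["B"] * 30 + ["C"] * 20
synthetic = ["A"] * 65 + ["B"] * 30 + ["C"] * 5

gaps = representation_gaps(real, synthetic)
# Flag subgroups whose share differs by more than 10 percentage points.
flagged = sorted(g for g, gap in gaps.items() if abs(gap) > 0.10)
```

Here subgroup A is over-represented and subgroup C is under-represented, so both are flagged, while subgroup B matches the real population.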
The potential of synthetic data in the future of clinical trials
Could the use of synthetic data in clinical trials go beyond synthetic control arms? Yes, it could. In silico clinical trials use patient-specific models to create virtual cohorts for testing the safety and efficacy of new drugs or medical devices. Examples include:
- A virtual clinical trial, conducted by researchers at the University of Leeds in the UK, using parameters from a real patient population and in silico modeling to investigate the use of a flow diverter device in brain aneurysms.
- VICTRE (Virtual Imaging Clinical Trial for Regulatory Evaluation) study, an in silico clinical imaging trial evaluating digital breast tomosynthesis (DBT) as a replacement for digital mammography (DM).
While in silico clinical trials are unlikely to replace clinical trials with real patients any time soon, the methodology can be used to model clinical trials as a way to predict outcomes. With the probability of success (POS) of clinical trials often in the single digits, synthetic data offers an opportunity to optimize study design and increase the POS.
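To show what modeling a trial to predict outcomes can look like in its simplest form, the Python sketch below simulates many two-arm trials and counts how often the treatment arm beats the control arm on a standard two-proportion z-test. The response rates, sample sizes, and function names are assumptions made for the example. Note that the quantity estimated here is statistical power under the assumed response rates, which is only one ingredient of a trial's overall probability of success.

```python
import math
import random


def simulate_trial(n_per_arm, p_control, p_treatment, rng):
    """Simulate responder counts in a control arm and a treatment arm."""
    control = sum(rng.random() < p_control for _ in range(n_per_arm))
    treated = sum(rng.random() < p_treatment for _ in range(n_per_arm))
    return control, treated


def is_success(control, treated, n_per_arm, z_crit=1.96):
    """One-sided two-proportion z-test using the pooled response rate."""
    pooled = (control + treated) / (2 * n_per_arm)
    se = math.sqrt(2 * pooled * (1 - pooled) / n_per_arm)
    if se == 0:
        return False  # all or no patients responded: no detectable difference
    z = (treated - control) / n_per_arm / se
    return z > z_crit


def estimate_success_rate(n_per_arm, p_control, p_treatment, n_sims=2000, seed=0):
    """Fraction of simulated trials that meet the success criterion."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_sims):
        control, treated = simulate_trial(n_per_arm, p_control, p_treatment, rng)
        wins += is_success(control, treated, n_per_arm)
    return wins / n_sims


# Hypothetical design question: how much does a larger trial help?
small_design = estimate_success_rate(n_per_arm=50, p_control=0.30, p_treatment=0.45)
large_design = estimate_success_rate(n_per_arm=200, p_control=0.30, p_treatment=0.45)
```

Comparing `small_design` with `large_design` shows how a study team could explore the trade-off between enrollment size and the chance of a positive result before recruiting a single patient.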
As organizations explore data-driven strategies to bring new treatments to the market faster and at lower cost, add precision to drug development, and improve patient outcomes, the application of synthetic data is a strategy worth considering.
Ready to optimize your clinical trials?
At Globant, we combine our healthcare and life sciences expertise with a strong data and artificial intelligence (AI) practice and our partners’ capabilities to help clients optimize their clinical trials. Building synthetic control arms is one of the elements of protocol optimization that can significantly increase the probability of success of clinical trials. Get in touch to discuss your challenges and needs.