As success in the current pharma and biotech landscape becomes increasingly determined by patient outcomes and personalised therapies, the integration and mastery of digital technologies will be essential to improving the efficiency of clinical trial operations.

To harness the full potential of emerging technologies, sponsors will have to recognise that these innovations have a greater impact when used together than any one technology does alone. They will also have to recognise that new digital technologies will continue to generate exponentially larger and more complex datasets.

The volume of health data collected by providers, insurers, researchers and others is doubling every 12 to 14 months (1). According to a 2014 report by consulting firm IDC, 153 exabytes (one exabyte = one billion gigabytes) of data were produced in 2013, and an estimated 2,314 exabytes will be produced in 2020, representing an overall rate of increase of about 48 percent annually (2).
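As a quick check on the annual rate implied by the two IDC figures, the compound growth rate can be worked out directly. The following is a minimal sketch; the function name `annual_growth_rate` is an illustrative choice, and the only inputs are the 2013 and 2020 volumes quoted above.

```python
def annual_growth_rate(start_volume, end_volume, years):
    """Compound annual growth rate between two volumes measured `years` apart."""
    return (end_volume / start_volume) ** (1 / years) - 1


# Figures quoted in the text: 153 exabytes (2013) and 2,314 exabytes (2020).
rate = annual_growth_rate(153, 2314, 2020 - 2013)

# Works out to between 47 and 48 percent per year.
print(f"Implied annual growth: {rate:.1%}")
```

The same function can be reused to compare growth estimates from other sources, provided the start and end volumes are expressed in the same unit.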

Moreover, the ability to expand processing and analytic infrastructure to keep pace with this acceleration will be essential to maximising the value of the data (3). Creating and maintaining this capability represents a significant challenge for sponsors. As such, the first step to taking advantage of the ensuing digital transformation is learning how best to utilise Big Data.

The diversity of Big Data sources, and the sheer volume of data they produce, require an enormous effort to evaluate, normalise and structure the data for analysis. The nature, sources and quality of data also influence its value and use. Below is an exploration of some data sources and a brief summary of their potential for improving clinical R&D performance.


Structured clinical data

The quality and value of structured clinical data, which includes data from current and past clinical trials, peer-reviewed studies, and real-world evidence (RWE) from registries, vary depending on collection method and rigour.

Clinical trial data can be used for a wide range of purposes. For example, data captured automatically from trials can identify unanticipated safety issues and flag anomalies that might indicate protocol deviations, early enough to prevent costly study delays or failures. They can also support earlier go/no-go decisions and adaptive study changes, and can potentially help close out studies faster, cutting timelines.

In other instances, historical trial data may be used as an external control arm in an active trial, while registry data can help identify possible label extensions and can make the case for reimbursement — particularly when combined with other RWE. 


Traditional clinical data

Traditional clinical data, such as electronic health records (EHRs), can be valuable for guiding study design and exclusion criteria, as well as selecting promising study sites. In addition, they can be useful in identifying patients at high risk of developing chronic diseases, particularly when merged with genetic data. Moreover, they are helpful in producing RWE for value-based payment models. Other forms of traditional clinical data include lab, pharmacy and insurance information. 


Emerging RWD sources 

Real-world data (RWD) sources include mobile clinical monitors, patient-reported outcome apps and the internet of medical things, as well as imaging, genomic and molecular studies. While there are challenges to using devices in clinical studies (for example cybersecurity, usability and durability, in addition to the cost of expertise in design and digital endpoint validation), the payoff can be significant. The volume and granularity of data from devices can establish efficacy in shorter time periods. Further, devices can improve trial efficiency by making studies more convenient, reducing clinic visits and improving patient retention.

Similarly, genomic, proteomic and imaging studies can be used for diagnosis, monitoring and therapy development. By making new molecule development more efficient, these technologies could increase approval rates, multiplying R&D efficiency.


Emerging supplemental, environmental, economic and social data 

These types of Big Data include everything from environmental data to insurance status, education and income markers that may influence therapy response and study success. For example, pollution can affect asthma or COPD, producing a response that might otherwise be attributed to a trial medication. Supplemental information helps filter out noise in datasets, potentially reducing the duration and size of trials. Similarly, general educational level and health literacy can affect therapy compliance, and can inform the design of therapies that are more likely to be successful in the real world.


Deploying Big Data 

The increased granularity, specificity and volume of Big Data have the capacity to increase clinical R&D efficiency. Harnessing this potential requires significant infrastructure, as well as the expertise and judgment to determine when and how best to deploy it.

Today, technology giants are providing the necessary infrastructure and support. Yet clinical trial expertise is also essential to transform clinical trial efficiency. For example, aggregating retrospective data on trial site performance can help select sites more likely to successfully recruit patients for future studies. Addressing recruitment challenges is critical to cutting the costs of opening sites that never see a patient, and to keeping studies on track. In another instance, data evaluating the performance of trial designs can help streamline current trial protocols involving similar conditions or test therapies.
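To make the site-selection example concrete, the sketch below aggregates past recruitment results per site and ranks sites by the share of their enrolment target they actually met. Everything in it is an assumption for illustration: the record layout, the site names, the numbers and the function name `rank_sites` are hypothetical, and a real analysis would draw on many more factors than enrolment alone.

```python
from collections import defaultdict

# Hypothetical historical records: (site_id, patients_targeted, patients_enrolled).
# Each tuple represents one past study at one site; the values are invented.
HISTORY = [
    ("site_a", 20, 18),
    ("site_b", 15, 0),
    ("site_a", 10, 9),
    ("site_c", 30, 12),
    ("site_b", 10, 2),
]


def rank_sites(history):
    """Rank sites by cumulative enrolment rate across all past studies."""
    totals = defaultdict(lambda: [0, 0])  # site_id -> [targeted, enrolled]
    for site_id, targeted, enrolled in history:
        totals[site_id][0] += targeted
        totals[site_id][1] += enrolled

    # Skip sites with no recorded target to avoid dividing by zero.
    rates = {
        site_id: enrolled / targeted
        for site_id, (targeted, enrolled) in totals.items()
        if targeted > 0
    }
    # Highest enrolment rate first.
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


ranking = rank_sites(HISTORY)
for site_id, enrolment_rate in ranking:
    print(f"{site_id}: {enrolment_rate:.0%} of target enrolled")
```

In this invented dataset, a site that repeatedly enrols few or no patients falls to the bottom of the ranking, which is the kind of signal that could inform whether to open that site for a future study.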

Identifying and addressing these study needs using Big Data greatly improves trial efficiency and builds competence and confidence in applying digital technologies needed to tackle more complicated issues as they arise.

As Big Data is the raw material of digital transformation, how can sponsors make the best use of it? The answer may lie in artificial intelligence (AI)-powered capabilities, including pattern recognition and evolutionary modelling, that are essential to gather, analyse and harness the growing masses of data. Potential applications of AI in clinical trials include automating routine data entry functions, analysing EHR data to find suitable study participants and sites, monitoring patient compliance, and modelling potential new molecules and therapies.

But what, exactly, is AI? And how can it be developed and used to transform clinical trials, while adhering to the rigorous standards required to demonstrate drug safety and efficacy? To discover the answers, read our latest white paper on how digital technologies have the potential to improve pharma ROI.



(1) Feinleib D. Big Data Bootcamp. Springer; 2014. The Big Data Landscape; pp. 15–34.

(2) The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things. EMC Digital Universe with Research and Analysis by IDC, April 2014.

(3) Dinov ID. Volume and Value of Big Healthcare Data. J Med Stat Inform. 2016;4:3.