Scientific knowledge, as measured by numbers of papers published, has been estimated to double every 17.3 years. However, it takes an average of about 17 years for health and medical research – going from basic lab studies on cell cultures and animals to clinical trials in people – to result in actual changes patients see in the clinic.
The typical process of medical research is generally not well equipped to respond effectively to quickly evolving pandemics. This has been especially evident during the COVID-19 pandemic, in part because the virus that causes COVID-19 mutates frequently. Scientists and public health officials are often left scrambling to develop and test new treatments to match emerging variants.
Fortunately, scientists may be able to bypass the typical research timeline and study treatments and interventions as they are used in the clinic nearly in real time by leveraging a common source of existing data – electronic medical records, or EMRs.
We are a team composed of an epidemiologist, pharmacist and cardiologist at the University of Pittsburgh Medical Center. During the COVID-19 pandemic, we realized the need to quickly study and disseminate accurate information on the most effective treatment approaches, especially for patients at high risk of hospitalization and death. In our recently published research, we used EMR data to show that early treatment with one or more of five different monoclonal antibodies substantially reduced the risk of hospitalization or death compared with delayed or no treatment.
Using EMR data for research
In the U.S., health care systems typically use EMR systems for documenting patient care and for administrative purposes like billing. While data collection is not uniform, these systems typically contain detailed records that can include sociodemographic information, medical history, test results, surgical and other procedures, prescriptions and billing charges.
Unlike single-payer health care systems, such as those in the U.K. and Scandinavian countries, which integrate data into a single EMR system, many large health care systems in the U.S. collect patient data using multiple EMR systems.
Having multiple EMR systems adds a layer of complexity to using such data to conduct scientific research. To address this, the University of Pittsburgh Medical Center developed and maintains a clinical data warehouse that compiles and harmonizes data across the seven different EMR systems its 40 hospitals and outpatient clinics use.
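To give a sense of what harmonizing involves, here is a minimal sketch in Python. It is not the warehouse's actual code. The system names, field names, date formats and sex codes are all hypothetical, chosen only to show how records that describe the same kind of information in different ways can be mapped onto one shared layout.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class PatientRecord:
    """Shared layout that every source system is mapped onto (hypothetical)."""
    patient_id: str
    birth_date: date
    sex: str  # "F", "M", or "U" for unknown


# Hypothetical field names and date formats for two imaginary EMR systems.
FIELD_MAPS = {
    "system_a": {"id": "pat_id", "dob": "dob", "sex": "gender",
                 "dob_format": "%Y-%m-%d"},
    "system_b": {"id": "MRN", "dob": "BirthDate", "sex": "Sex",
                 "dob_format": "%m/%d/%Y"},
}

# Hypothetical spellings of sex codes, folded to one vocabulary.
SEX_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def harmonize(source: str, raw: dict) -> PatientRecord:
    """Convert one raw record from a named source system to the shared layout."""
    fmap = FIELD_MAPS[source]
    birth = datetime.strptime(raw[fmap["dob"]], fmap["dob_format"]).date()
    sex = SEX_CODES.get(str(raw[fmap["sex"]]).strip().lower(), "U")
    # Prefix the identifier with the source so IDs from different systems cannot collide.
    patient_id = f"{source}:{raw[fmap['id']]}"
    return PatientRecord(patient_id=patient_id, birth_date=birth, sex=sex)


record_a = harmonize("system_a", {"pat_id": "123", "dob": "1950-04-02", "gender": "Female"})
record_b = harmonize("system_b", {"MRN": 77, "BirthDate": "04/02/1950", "Sex": "M"})
```

A real warehouse would also have to reconcile diagnosis and medication codes, units of measurement and duplicate patients across systems, none of which this sketch attempts.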
Emulating clinical trials
Using EMR data for research is not new. More recently, researchers have been looking into ways to use these large health data systems to emulate randomized controlled trials, which are considered the gold standard study design yet are often costly and take years to complete.
Using this emulation framework, our team used the EMR data infrastructure at our institution to evaluate five different monoclonal antibodies for which the Food and Drug Administration granted emergency use authorization to treat COVID-19. Monoclonal antibodies are human-made proteins designed to prevent a pathogen – in this case the virus that causes COVID-19 – from entering human cells, replicating and causing serious illness. Initially the authorizations were based on clinical trial data. But as the virus mutated, subsequent evaluations based on cell culture studies suggested a loss of effectiveness.
We wanted to confirm that the findings of cell-based studies applied to actual patients. So we evaluated anonymous clinical data from 2,571 patients treated with these monoclonal antibodies within two days of COVID-19 infection, matching them with data from 5,135 patients with COVID-19 who were eligible for but either did not receive these treatments or received them three or more days after infection.
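The matching step can be pictured with a short Python sketch. This article does not describe the matching method in detail, so the approach below is an assumption: simple exact matching on a few characteristics, pairing each treated patient with up to two untreated patients. The function name, the field names and the patients themselves are hypothetical.

```python
from collections import defaultdict


def match_controls(treated, controls, keys, ratio=2):
    """Pair each treated patient with up to `ratio` unused controls that
    share the same values for every characteristic listed in `keys`."""
    pool = defaultdict(list)
    for control in controls:
        pool[tuple(control[k] for k in keys)].append(control)

    matches = {}
    for patient in treated:
        stratum = pool[tuple(patient[k] for k in keys)]
        take = min(ratio, len(stratum))
        # pop() removes each chosen control so it cannot be matched twice.
        matches[patient["id"]] = [stratum.pop(0)["id"] for _ in range(take)]
    return matches


# Hypothetical patients, for illustration only.
treated = [
    {"id": "t1", "age_band": "65+", "immunocompromised": True},
    {"id": "t2", "age_band": "18-64", "immunocompromised": False},
]
controls = [
    {"id": "c1", "age_band": "65+", "immunocompromised": True},
    {"id": "c2", "age_band": "65+", "immunocompromised": True},
    {"id": "c3", "age_band": "65+", "immunocompromised": True},
    {"id": "c4", "age_band": "18-64", "immunocompromised": False},
]

matches = match_controls(treated, controls, keys=("age_band", "immunocompromised"))
```

Matching on shared characteristics is meant to make the two groups comparable, so that a difference in outcomes is more plausibly due to the treatment than to differences in who received it.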
We found that overall, people who received monoclonal antibodies within two days of a positive COVID-19 test had a 39% lower risk of hospitalization or death than those who did not receive the treatment or received it later. In addition, among patients with compromised immune systems, early treatment was associated with a 55% lower risk of hospitalization or death, regardless of age.
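For readers who want to see what a percentage like this means arithmetically, here is a minimal Python sketch of a crude relative risk reduction. The counts are hypothetical, picked only so the result comes out to 39%. They are not the study's data, and the published estimates may rest on more elaborate statistical adjustment than this simple ratio.

```python
def relative_risk_reduction(events_treated, n_treated, events_control, n_control):
    """Crude relative risk reduction: 1 minus the ratio of the two groups' risks."""
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    return 1 - risk_treated / risk_control


# Hypothetical counts: 61 of 1,000 treated patients and 100 of 1,000
# untreated patients are hospitalized or die.
rrr = relative_risk_reduction(61, 1000, 100, 1000)
print(f"Relative risk reduction: {rrr:.0%}")
```

In this made-up example, the risk falls from 10% to 6.1%, which is a 39% relative reduction but a 3.9 percentage point absolute reduction.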
Our near-real-time analysis of COVID-19 patients treated with monoclonal antibodies during the pandemic confirmed the findings of the cell culture studies. Our findings suggest that by using data in this way, researchers may be able to evaluate treatments in times of urgency without having to perform clinical trials.
Appropriate EMR data use
Many health care institutions have EMR systems that researchers can harness to rapidly answer important research questions as they arise. However, because these clinical data are not collected specifically for research purposes, researchers need to carefully design their studies and use rigorous data validation and analysis. They also need to take great care to harmonize data from different EMR systems, select appropriate patient samples and minimize sources of potential bias.
New pandemics and significant public health challenges are likely to emerge abruptly and in unpredictable ways. Given the treasure trove of data routinely collected across U.S. health care systems, we believe that careful use of these data can help answer urgent health questions in ways that are representative of who’s actually receiving care.
Erin McCreary has served on scientific advisory boards for Shionogi, Inc and Merck.
Kevin Kip and Oscar Marroquin do not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and have disclosed no relevant affiliations beyond their academic appointment.