[In this two-part series, Dale Sanders examines the state of disease surveillance in healthcare, focusing on how data should be collected, stored and leveraged to help identify outbreaks.]
We could easily justify a 30-page paper on this topic, but I don’t have time for that, so while it’s hot, I’m going to offer a few thoughts and observations on disease surveillance based upon my experience in the trenches of healthcare data. I hope that others with similar experiences will share their thoughts by submitting a comment.
Here’s a summary of the current options available for monitoring data that could help identify disease outbreaks. Additional details about these options appear later in the blog, and as you’ll find, the options are not great.
- Monitoring chief complaint/reason for admission data in Admit, Discharge, and Transfer (ADT) data streams.
- Monitoring coded data collected in EHRs.
- Monitoring billing data.
Federal Meaningful Use regulations require that EHRs be able to submit syndromic data to a surveillance system, but I have a feeling that requirement is going to run into data quality problems, for the reasons described below.
Key Concepts: Data Quality and Data Profiles
One of the key concepts that underlies what I’m about to discuss is data quality. Poor data quality translates into imprecise decision-making and imprecise responses to a situation. The equation for data quality is:
Data Quality = Completeness x Validity
The higher your data quality, the more precise your understanding of the situation at hand, and the more precise your decisions and reaction can be.
“Completeness” is exactly as the word implies — how complete and granular is the data you have about a patient? The metaphor that I like to use is that of a low vs. high-resolution picture. The higher the resolution of the picture, the more detail you can see and understand about the subject, while a low-resolution picture leaves you guessing about the finer details. Highly “complete” data is equivalent to higher resolution.
“Validity” is a little more difficult to describe, but in short, it relates to the context of the situation in which the data is collected, as well as the accuracy of the data. If a nurse measures a patient’s temperature and enters it into an EMR, that satisfies the concept of completeness — the data has been captured. If that nurse enters the wrong temperature, we have a violation of the concept of data validity.
Timeliness of data is also a dimension of validity. In the case of charting a patient’s temperature, if a nurse enters the correct temperature in the chart, but enters it 4 hours after actually taking the patient’s temperature, that data lacks temporal validity. To be valid, the data must be timely, relative to the decision-making or action associated with the data.
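To make the equation concrete, here is a minimal sketch of how completeness and validity might be scored for a single patient record. Everything in it is an assumption made for illustration: the expected fields, the plausibility ranges, the one-hour charting threshold, and the record layout are hypothetical, not a standard. Note also that a range check is only a crude stand-in for accuracy; it cannot catch a wrong temperature that still looks plausible.

```python
from datetime import datetime, timedelta

# Hypothetical list of data points we expect for a patient.
EXPECTED_FIELDS = ["temperature_c", "heart_rate", "respiratory_rate", "systolic_bp"]

# Hypothetical plausibility ranges, a crude proxy for accuracy.
PLAUSIBLE_RANGES = {
    "temperature_c": (30.0, 45.0),
    "heart_rate": (20, 250),
    "respiratory_rate": (4, 80),
    "systolic_bp": (40, 300),
}

# Assumed timeliness threshold between measuring and charting.
MAX_CHARTING_DELAY = timedelta(hours=1)


def captured_fields(record):
    """Expected fields that actually have an observation in the record."""
    return [f for f in EXPECTED_FIELDS if record.get(f) is not None]


def completeness(record):
    """Fraction of expected fields that were captured at all."""
    return len(captured_fields(record)) / len(EXPECTED_FIELDS)


def validity(record):
    """Fraction of captured fields that are plausible AND charted on time."""
    captured = captured_fields(record)
    if not captured:
        return 0.0
    valid = 0
    for name in captured:
        obs = record[name]
        low, high = PLAUSIBLE_RANGES[name]
        in_range = low <= obs["value"] <= high
        timely = obs["charted_at"] - obs["measured_at"] <= MAX_CHARTING_DELAY
        if in_range and timely:
            valid += 1
    return valid / len(captured)


def data_quality(record):
    """Data Quality = Completeness x Validity."""
    return completeness(record) * validity(record)


# Example: two of four fields captured; the temperature was charted 4 hours late.
t0 = datetime(2014, 10, 1, 8, 0)
record = {
    "temperature_c": {"value": 38.9, "measured_at": t0,
                      "charted_at": t0 + timedelta(hours=4)},
    "heart_rate": {"value": 96, "measured_at": t0,
                   "charted_at": t0 + timedelta(minutes=5)},
}
print(completeness(record), validity(record), data_quality(record))  # 0.5 0.5 0.25
```

Because the two factors multiply, a record that is half complete and half valid scores 0.25, not 0.5. A weakness in either dimension drags the whole score down.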
In addition to Data Quality, the other key concept is the notion of a “Data Profile” for a patient and disease type. A simple data profile for a patient is pretty straightforward: name, gender, age, height, weight, address. To round out the data quality of that profile, we also like to collect current medications, known allergies, past surgeries, family history of disease, and chronic conditions.
The next pass through the data profile is the collection of data associated with vitals and labs: temperature, heart rate, respiratory rate, blood pressure, and basic blood and urine labs. Each additional pass adds new data points about a patient (i.e., a higher resolution picture) and contributes to that patient’s data profile, and hopefully, the data quality as well.
Diseases also have a data profile, based upon commonly acknowledged symptoms and, hopefully, very discrete lab results or other diagnostics, such as those from imaging. The first symptom of Ebola is a fever, followed days later by worsening fever, bleeding, and vomiting. Unfortunately, the initial data profile of a patient with Ebola (data profile = fever) overlaps with hundreds of other disease states. Without more complete and valid data (that is, a higher resolution picture), the initial clinical data picture of Ebola looks like a multitude of diseases.
To raise the threshold of concern for declaring a patient “Ebola possible,” the next data point beyond fever, absent a pathology assessment for the virus, is a data point about the patient’s socio-physical environment prior to the onset of fever. Does this patient have a fever AND has this patient been exposed to another patient or group of patients who were infectious with Ebola? I’m oversimplifying a bit, but the precision and accuracy of the data profile come from chaining a series of these AND statements. Any time we see a pattern of Boolean statements as a tool for describing a situation, we should immediately see an opportunity for computer-assisted decision-making.
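The chain of AND statements can be sketched in a few lines of code. This is an illustration only: the `PatientSnapshot` fields, the 38.0 °C fever cutoff, and the function names are hypothetical, and a real profile would carry far more conditions than these two.

```python
from dataclasses import dataclass

FEVER_THRESHOLD_C = 38.0  # assumed cutoff, for illustration only


@dataclass
class PatientSnapshot:
    """Hypothetical slice of a patient's data profile."""
    temperature_c: float
    exposed_to_infectious_patient: bool


def has_fever(patient):
    return patient.temperature_c >= FEVER_THRESHOLD_C


def had_exposure(patient):
    return patient.exposed_to_infectious_patient


# A disease profile is a list of conditions that must ALL hold (Boolean AND).
EBOLA_POSSIBLE = [has_fever, had_exposure]


def matches_profile(patient, conditions):
    """True only when every condition in the profile is satisfied."""
    return all(condition(patient) for condition in conditions)


fever_only = PatientSnapshot(temperature_c=39.1, exposed_to_infectious_patient=False)
fever_and_exposure = PatientSnapshot(temperature_c=39.1, exposed_to_infectious_patient=True)

print(matches_profile(fever_only, EBOLA_POSSIBLE))          # False
print(matches_profile(fever_and_exposure, EBOLA_POSSIBLE))  # True
```

Each condition appended to the list narrows the set of patients who match, which is the higher resolution picture in code form: fever alone matches a multitude of diseases, while fever AND exposure matches far fewer.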
A Data Profile Alerting Engine Fed by an EDW
Every healthcare system in the U.S. should possess a generalized data-profile-alerting engine that is fed by an enterprise data warehouse (EDW) that could, in turn, feed analytic output to the EHR at the point of care. Within that alerting engine, a healthcare system would be capable of creating any number of profiles for “Patients Like This” (I notice that Epic has trademarked that phrase, but I’ve been using it, too, since about 2001). Those of you who are familiar with Theradoc can see the conceptual overlap with what I’m calling a patient profile alerting engine. But there are also significant differences in the concept and, especially, the implementation models.
The profiler would sit in the background, passively watching the stream of data into the EDW until reaching a tipping point in its predictive algorithm, at which time it would declare, “This patient has a [%] probability of [this disease or condition].” Declaring the likelihood of a patient with an infectious disease, however, is not good enough. We must also use the alerting engine and our data to recommend the action and intervention to take following the prediction.
In the case of Ebola, the alerting engine would notify the immediate clinical team and the Infectious Disease SWAT team to isolate and sterilize everyone within contact. At Intermountain, we called these alerting engines, embedded in the EHR and EDW, “Medical Logic Modules.” Trained teams of informaticists, clinicians, data engineers, and pharmacists would define the logic inside these modules, monitor them over time, and adjust as necessary depending upon the evolution of the data profile for the disease or condition (see the above diagram).
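As a rough sketch of the idea, the engine below accumulates data points per patient, re-scores each configured module as new data arrives, and emits an alert with recommended actions once a module's threshold is crossed. None of this reflects how Intermountain's modules were actually built: the class names, the toy scoring function, and the probabilities (0.8, 0.1) are placeholders invented for illustration, not a validated model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class LogicModule:
    """One disease or condition profile (hypothetical structure)."""
    condition: str
    score: Callable[[Dict[str, object]], float]  # probability in [0, 1]
    threshold: float                             # tipping point for alerting
    recommended_actions: List[str]


@dataclass
class Alert:
    patient_id: str
    condition: str
    probability: float
    recommended_actions: List[str]


class ProfileAlertingEngine:
    """Passively watches incoming data points and alerts at the tipping point."""

    def __init__(self, modules):
        self.modules = modules
        self.profiles = {}            # patient_id -> accumulated data points
        self.already_alerted = set()  # (patient_id, condition) pairs

    def observe(self, patient_id, data_point):
        """Merge one new data point into the profile and re-score every module."""
        profile = self.profiles.setdefault(patient_id, {})
        profile.update(data_point)
        alerts = []
        for module in self.modules:
            key = (patient_id, module.condition)
            if key in self.already_alerted:
                continue  # alert once per patient and condition
            probability = module.score(profile)
            if probability >= module.threshold:
                self.already_alerted.add(key)
                alerts.append(Alert(patient_id, module.condition,
                                    probability, module.recommended_actions))
        return alerts


def toy_ebola_score(profile):
    """Placeholder scoring logic; the numbers are invented, not clinical."""
    fever = profile.get("temperature_c", 0.0) >= 38.0
    exposed = profile.get("exposure_to_infectious_patient", False)
    if fever and exposed:
        return 0.8
    if fever or exposed:
        return 0.1
    return 0.0


engine = ProfileAlertingEngine([
    LogicModule(
        condition="Ebola possible",
        score=toy_ebola_score,
        threshold=0.5,
        recommended_actions=["Notify the immediate clinical team",
                             "Notify the infectious disease team",
                             "Isolate the patient"],
    )
])

first = engine.observe("patient-1", {"temperature_c": 39.2})
second = engine.observe("patient-1", {"exposure_to_infectious_patient": True})
print(len(first), len(second))  # 0 1
```

Keeping the scoring logic and the recommended actions inside the module, rather than in the engine, is what would let a clinical team adjust one disease profile over time without touching the rest of the system.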
That’s the long-term goal — a patient profile alerting system attached to an EDW, configured with Medical Logic Modules, and passing that data to an EHR so that the analytics of the EDW facilitates better decisions at the point of care.
[Part 2 of this series will examine the options for disease surveillance in the current health IT environment.]