A Suggestion About Predictive Analytics

Dale Sanders, Senior Technology Advisor & CIO Mentor, Cayman Islands Health Services Authority

Suggestive Analytics. The power of suggestion meets the power of data. But, before we talk about that, let’s contrast it to the buzz phrase of the day — Predictive Analytics.

I’d like to have a dime for every white paper, blog, journal article and marketing brochure I’ve seen in the past six months that cheers a coming healthcare revolution as a result of “Predictive Analytics.” The only thing more prevalent is the term “Big Data.” The latter fascination and buzz is rooted in Freudian reference, I’m quite sure, so of course every male CIO in healthcare wants it. :) But what of the former? What about predictive analytics?  On what do we base this sudden love affair with predictive analytics?  Not much, I’d say.

Wikipedia has a fairly lengthy definition of “Predictive Analytics.”  As defined there, “Predictive analytics encompasses a variety of statistical techniques from modeling, machine learning, data mining and game theory that analyze current and historical facts to make predictions about future events.”

Seems to me that predictive analytics in healthcare tends to come in two flavors, too easy and too hard, with a neglected middle ground.  Do we really need predictive analytics and Big Data to know that a 32-year-old sedentary patient who smokes and has a BMI of 30 is at high risk for multiple chronic diseases?  What about a patient living alone, over 65, post-CABG: do we need Watson to tell us that such a patient is at high risk for readmission?  Hardly… and yet we are surrounded by such patients.  The problem isn’t that we don’t know.  The problem is that we don’t intervene.

At the other extreme are the patient outliers and rare cases.  As much as we would hope and try, no computer algorithm in the near future is going to predict the impending stroke of a 34-year-old patient who is a competitive triathlete with no family history of cardiovascular disease, yet we pursue such predictive scenarios and celebrate enormously if we come within a country mile of pulling it off in a randomized trial.  We ignore the forest of high-risk patients that surrounds us, in pursuit of the perfect and isolated tree.

The predictive middle ground of data that we choose to ignore at the point of care is genetic and family history.  Depending on which report you believe, we know of at least 150 genetic and family history markers that have a profound impact on the outcome of care. Think about it: we can predict and manage care with genuine data-driven decisions in these cases, but we ignore this middle ground.  In part, that is because the data is not readily available in EMRs at the point of care (a different problem altogether, and one worth discussing).  In part, we ignore these opportunities for the same reason we ignore the forest of “too easy” scenarios.

Predicting clinical outcomes and risk is not the problem.  The problem lies in our inability and unwillingness to intervene — personally, culturally, and operationally — when we see the opportunity. What healthcare system is operationally capable or even culturally willing to assign a caregiver at home to ensure that the 65-year-old post-CABG patient will not be readmitted?  How many women are willing to be tested for BRCA1 and BRCA2 mutations that raise the risk of breast cancer by 60% … and take action if they are affected?  I won’t even talk about our inability to do something about the skyrocketing incidence of obesity and diabetes.  It’s not that we don’t know, it’s that we are generally incapable of intervening.

Does this cynicism mean I have no interest or hope in predictive analytics for healthcare?  No.  We need to continue inching along technically and culturally with the concepts until, someday, the two will intersect.  In the meantime, we should be realistic and look for other analytic opportunities that are within the grasp of healthcare and already surround us in other parts of our lives — which brings me back to this notion of Suggestive Analytics.

I first saw the power of suggestive analytics in a program called the Antibiotic Assistant at LDS Hospital, thanks to colleagues Dave Classen and Scott Evans.  The Assistant is a complex algorithm which predicts and ranks the best course of antibiotic therapy for inpatients, given the profile of their lab, micro and pathology results, and general demographics.  It’s a very impressive, and early, example of predictive analytics.  However, to me, the equally impressive story is its use of suggestive analytics.

When the program was first released, only the predicted efficacy of the rank-ordered antibiotic protocols was presented to physicians.  Naturally, physicians always chose the highest ranked protocol, even if its predicted efficacy differed from that of the second best choice by only a tenth of a percentage point, and even though that second best choice might be 10 times less expensive.  The breakthrough came when Scott and David revealed the cost of those antibiotic protocols to the physicians, thus suggesting to physicians that they also consider cost alongside predicted efficacy when making their decision.  The benefits to clinical outcomes stayed virtually the same, but costs dropped from an average of $123 per dose to $52.
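
To make the distinction concrete, here is a minimal sketch of the idea in Python. It is not the Antibiotic Assistant's actual logic, which is not described here in any detail. The protocol names, efficacy figures, costs, and the `tolerance` threshold are all hypothetical, chosen only to illustrate how showing cost next to predicted efficacy changes what the decision maker sees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    name: str                  # hypothetical protocol label
    predicted_efficacy: float  # hypothetical predicted efficacy, 0..1
    cost_per_dose: float       # hypothetical cost in dollars


def rank_by_efficacy(protocols):
    """Predictive step: order protocols from highest to lowest predicted efficacy."""
    return sorted(protocols, key=lambda p: p.predicted_efficacy, reverse=True)


def suggest(protocols, tolerance=0.005):
    """Suggestive step: show cost beside efficacy and flag the cheapest option
    whose efficacy is within `tolerance` of the top-ranked one.
    Nothing is chosen automatically; the clinician still decides."""
    ranked = rank_by_efficacy(protocols)
    best = ranked[0]
    comparable = [
        p for p in ranked
        if best.predicted_efficacy - p.predicted_efficacy <= tolerance
    ]
    cheapest = min(comparable, key=lambda p: p.cost_per_dose)
    rows = []
    for p in ranked:
        flag = "  <- comparable efficacy, lowest cost" if p is cheapest else ""
        rows.append(
            f"{p.name:<12}{p.predicted_efficacy:>7.1%}  ${p.cost_per_dose:>7.2f}{flag}"
        )
    return cheapest, rows


# Hypothetical data: A and B differ by a tenth of a percentage point,
# but A costs ten times as much per dose.
protocols = [
    Protocol("Protocol A", 0.942, 120.00),
    Protocol("Protocol B", 0.941, 12.00),
    Protocol("Protocol C", 0.870, 8.00),
]
cheapest, rows = suggest(protocols)
print("\n".join(rows))
```

Ranking by efficacy alone puts Protocol A first. Once cost is displayed alongside, the nearly equivalent and far cheaper Protocol B is the one that stands out, and the physician remains free to pick either.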

As e-commerce consumers, we are surrounded by Suggestive Analytics.  Amazon was among the first to influence our behavior by using data in this capacity.  They surround our transaction — e.g., buying a book — with data-driven suggestions that affect our purchasing behavior — customer ratings, frequency of purchase by other consumers, commonly bundled and related purchases and products, availability of the product, arrival date of the shipment, new vs. used prices.

Richard Thaler and Cass Sunstein wrote a great book (Nudge:  Improving Decisions About Health, Wealth, and Happiness) that gives example after fascinating example of what amounts to suggestive analytics, even though they never specifically use the term.

The difference between predictive and suggestive analytics is summarized quite easily:  under predictive analytics, Amazon would fill your shopping cart for you, based upon using predictive data mining algorithms.  Under suggestive analytics, you get to fill your own shopping cart.  Would I prefer that Amazon predict my shopping habits for me and streamline the whole process without my intervention?  Absolutely.  Is it reasonably possible in the near future?  Absolutely not.

Predictive analytics is certainly appealing in concept but, right now, it is little more than a marketing term, another adjective of hype.  Our healthcare industry would be better served to borrow concepts from the world of e-commerce and social networking, and embrace a new concept of Suggestive Analytics at the point of care to nudge our behavior in desired directions.

Dale Sanders also serves as VP of Healthcare Quality Catalyst




    Thanks for sharing your insights. As a former CEO of a predictive analytics company, and currently leading a new ‘analytics-centric’ leading edge, personal health and wellness company, among other activities, I am pleased to also contribute my perspectives here.
    I like your idea of contrasting Suggestive vs. Predictive Analytics- there is obvious proven benefit in using analytics to improve quality at the point of patient care.
    With regard to predictive analytics, I am pleased to offer comments:
    –Predictive analytics is often muddled in with other statistical tools, as you say; it is often difficult to appreciate and understand just how powerful these tools are and what their specific contribution is
    –Rather than saying there are two flavors of PA, “easy and hard”, I suggest a better approach is to say there are two PA target opportunity areas in health care (and also in other sectors):
    Using PA to analyze the “known unknowns” – all of the patient treatment enhancements you described fall into this category- addressing known issues and processes, using analytics to improve processes, quality of care, and doing this more efficiently and at lower cost.
    Using PA to analyze the “unknown unknowns” – this is the real power of predictive analytics and I believe really offers high upside for all health care players, and patients as well
    –Look at the magnitude of today’s health care issues. As one example, increasing complexity of medication regimens used by patients, coupled with a fragmented health care system involving multiple prescribers, has made the occurrence of serious drug-drug interactions more likely today than ever before. For example, one study suggests Preventable Adverse Drug Events injure 1.5 million people a year, cost the U.S. healthcare system $3.5 billion and result in an estimated 44,000 to 98,000 deaths every year. Some studies show even higher numbers.
    –Our aging population exacerbates the above issues. Studies show 41 percent of seniors take 5 or more prescription medications, and more than half have 2 or more prescribing physicians. And 24 percent, about 1 out of 4, of seniors with 3 or more chronic conditions have not shared information with their health care providers during the last 12 months. No wonder the cost of medication errors among seniors on Medicare is estimated at almost $900 million.

    –We can use the real power of PA to better understand the “unknown-unknown” drivers here that are impacting our health care system, and create powerful new tools, improved processes and do this more efficiently while improving patient care.

    –The “unknown-unknown” data I would like to see addresses questions such as: why do we have adverse drug events; what are the rules we should be looking at and changing to reduce these events, save lives, and reduce health costs; and what are the underlying drivers and patterns for adverse drug events? Do these vary by geography, treatment modalities, user demographics, specific types of medical facilities, maybe how and where medical practitioners are trained? “Unknown-unknowns” may, for example, identify certain treatment modalities and drug regimens used by select groups of medical professionals which drive adverse drug events. Predictive analytics, an inductive rather than deductive process, offers a powerful tool to help us identify these and many other critical underlying health care drivers.

    I agree there are many PA projects that today may seem academic, but I do see great possibilities to improve our health care system, using powerful new predictive analytics computing tools and platforms coupled with more traditional analytics (both suggestive and deductive ‘rule based’ analytics), to dramatically improve the quality of our health care system. These new analytics and tools will address clinical issues such as the growing problem of adverse drug events, as well as addressing Medicare and other health care claims fraud and errors.

    We are making progress, but I still believe we can be doing much more to achieve significant improvement in our nation’s health care system, and it is very clear to me that predictive analytics and other tools, with the proper vision and commitments, will play a substantive role.

    Paul Silverman, Managing Partner, Gemini Business Group, Adjunct Professor, R.H. Smith School of Business in the University of Maryland.

    • Dale Sanders says:

      Very thoughtful and appreciated comments, Paul. Thank you for contributing! The unknown unknowns are definitely the most interesting and have the highest potential. I would like to see more predictive analytic attention given to what I call the “small n, high impact” diseases that affect a small population, but have a significantly higher social impact on families and communities– like ALS, for example.

      In healthcare, I’ve noticed that we tend to grab “the next big thing” and all follow it without really knowing what we are following– we do it with technology, medications, and half-baked results from clinical trials. I’ve seen three RFPs in recent weeks that make predictive analytics a firm requirement, but when asked to clarify, the authors haven’t a clue– and that lack of understanding opens the doors to snake oil vendors and investments that don’t pan out.

      I worked for several years in the USAF, then TRW. At one time, TRW managed more data than any single organization in the world– they managed the CIA data center, including data content; played a significant role in managing the NSA’s databases; and also owned the largest consumer credit data reporting system. In those settings, with what amounted to unlimited budgets, I learned a lot about the overall design concepts, strengths, and limitations of predictive analytics, especially when those analytics try to predict the nature of human nature. The technology is much better today than then, but the concepts remain largely the same. We could predict with all sorts of accuracy the future of hardware and weapons related events, but our batting average with human beings was pretty bad. The best we could do was correlate past events by analyzing backwards– all these human events led up to this event. Hindsight was definitely 20/20. But we were terrible at analyzing and predicting ahead, prospectively, where human behavior played a major role in the scenario.

      So, in closing, smart people like you should continue to push us down the path of predictive analytics because eventually, it will pay off. In the meantime, we should also focus on interventions where the outcome is easy to predict; and take advantage of the suggestive analytic concepts described so well in Nudge which address the tricky issues of influencing human behavior with data.

      Thanks again!

  2. corasharma says:

    Hi Dale,

    The topic of Predictive Analytics has been around for decades, and what I find interesting is whether, now that we have EHR data, this influx of clinical data will make predictions easier… or harder?

    Take for example predicting something “too hard”. That competitive triathlete — can his pending hospitalization be better predicted from new, ‘better’ variables gleaned from clinical data? I can imagine his physician adding ‘patient seems depressed’ into a note, or jotting down something about family history of stroke… More nuanced variables that claims data don’t capture.

    However, with claims data we were pretty certain about the ICD/procedure codes, but for clinical data, there is so much that remains unstructured, not to mention the pervasive data quality issues (that is a subject of another post).

    Also, another point/question. When it comes to analytics vendors bundling predictive capabilities into their products, I really wonder why HCOs would use this bundled predictive software rather than best-of-breed SPSS, SAS, R, Excel, etc.? Can Stat 101 do just as good a job with a lot less data?

    Thanks for your article!

  3. Dale Sanders says:

    Thanks Cora; great questions and thoughts. Like NLP, healthcare loves to “chase the asymptote” of predictive analytics, expecting results that are not possible within the context of our existing data ecosystem.

    I liken our current environment to calculus and sampling theories for analog waveforms. The Nyquist theorem relates the frequency of an analog waveform to the sampling rate required to digitally reconstruct that analog waveform. The higher the frequency of the analog waveform, such as a high C on a trumpet, the higher the Nyquist rate required to digitally reproduce it. Human beings and healthcare are, in essence, a system of complex analog waveforms. We are not digital. If we are going to understand the human health experience, we need a very complex, high rate Nyquist digital sampling model. We need to digitize the human being and human experience much more than we do now, and it won’t come from a physician or nurse entering data into an EMR. At best, we collect 20 digital samples (e.g. CPT, labs, weight, BP, height) in most healthcare settings that are then interpreted by a human mind. Digital imaging is great, but it doesn’t count as a Nyquist sample because we still rely so much on the human mind to make sense of those digital images. That’s never going to result in much of anything accurate or impressive, in terms of predictive modeling.

    We must collect more like 100,000 or 1,000,000 digital samples that are interpreted by a computer and presented for assessment to a human mind. A physician’s interpretation in a note is not digital. It’s a subjective assessment; another analog waveform. A definitive lab test that diagnoses ALS or an ejection fraction that diagnoses heart failure or an MRI that concludes algorithmically “the ACL is torn”– those are digital samples. We need orders of magnitude more digital samples– mathematical models– of humanity. Physics is the most digital and predictable science. Healthcare isn’t even close, by comparison, as a field of science. Maybe someday, after we’ve wired and sampled and modeled the human being enough, we’ll be able to approach the physics of human health and predictive analytics will be more useful… and predictable itself.

    I totally agree… we should be leveraging existing tools like the ones you mention, not reinventing the wheel. Contrary to popular belief, breakthroughs in science come from incremental growth, building on existing ideas and tools and borrowing concepts from other areas. SAS, SPSS, R, etc are great tools and we should leverage their capabilities, incrementally, not try to build something new. These tools and Stat 101 can take us a long way on the curve of value, without chasing the asymptote.

    Thanks again Cora! Looking forward to further discussions with you.
