The Transformative Potential of Clinical AI Rests on a New “Implementation Science”


We are at the cusp of a new era in the digital transformation of healthcare.

While the last decade was spent transitioning from paper to electronic health record (EHR) systems, the next decade will be dedicated to using those systems to deliver higher-quality, more efficient, and equitable care. We are at this transition moment because of advances in artificial intelligence (most recently in generative AI), the availability of digitized EHR data that serves as the foundation for clinical AI models, and EHRs that can deliver the outputs of AI models into clinical workflows. 

It is easy to be skeptical about the potential for digital transformation to change healthcare for the better, particularly for clinicians. Rates of physician burnout are sky-high, and much of this can be attributed to problems related to the EHR, including documentation burden and the relentless challenge of managing the EHR inbox. Part of this dissatisfaction stems from the fact that frontline clinicians have spent the time and effort to learn how to use EHRs, and much of their day is spent entering data into them; yet, until now, they have received relatively little benefit in return. In the new era, however, our clinicians will transition from being the enablers of future transformation to the beneficiaries of that transformation.

While clinical AI models themselves are now plentiful, we are just beginning to understand what it will take to achieve robust clinical integration and impact. There is a non-trivial set of challenges, but two stand out as most urgent because they are hard problems that resources alone will not solve. Said another way, there is a missing science needed to inform our efforts to translate these breathtaking new tools into systems that actually make healthcare better. This science might be referred to as clinical AI implementation science.

We see two major questions that need to be tackled in this new field. The first is how to know when, where, and how to deliver AI model output into clinical workflow. While AI is a new type of tool in some ways (particularly generative AI), it is also just another form of clinical decision support. We don't often call it that because our experience to date has given most forms of computerized clinical decision support a bad name, evoking alert fatigue, poor targeting, and clunky user interfaces. While certain qualities of generative AI should allow us to do better, there remains real risk that AI tools will suffer these same shortcomings. This is because we still lack a robust process for identifying when a clinician (or care team) needs help and then offering new information that isn't perceived as wrong or distracting.

While it is nearly impossible to imagine how an AI tool could know what the clinician is thinking, we could do a much better job of inferring it. We can use the EHR to observe what data the clinician has reviewed, what tests they have ordered, and even how tired they may be (or at least how long they have been at work). Armed with knowledge like this, we can then build AI tools that explicitly take the clinician's "thinking trajectory" into account. The Center for Clinical Informatics & Improvement Research (CLIIR) within the new Division of Clinical Informatics & Digital Transformation (DoC-IT) is a national leader in leveraging these novel EHR data elements, and is eager to partner with others in our department and at UCSF to use AI to improve clinical reasoning and create decision support that truly makes a difference for patients and clinicians.

Similarly, Amazon, Google, and others routinely perform so-called A/B testing to determine which digital strategies (down to font size and color) achieve the best user experience and associated outcomes. In healthcare, we have massively underutilized this technique. We now have the ability to A/B test different user interface designs, determining when and how in the workflow to present new information so that it is least disruptive. We have recently built tools within APeX to support this, and they are available through our APeX-enabled Research Program. We need to do more of this style of testing, thereby identifying generalizable lessons that help us integrate new technology into workflows in a way that offers the best chance of seamless use and clinician acceptance.
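To make the idea concrete, the core of such an A/B test can be sketched in a few lines: clinicians are deterministically randomized to one of two alert designs, and acceptance rates in the two arms are compared with a two-proportion z-test. This is a minimal illustration under stated assumptions, not an APeX implementation; the trial label, arm names, and counts below are invented for the example.

```python
import hashlib
import math

# Hypothetical sketch: randomize clinicians 50/50 to one of two alert
# designs ("A" or "B") and compare how often each arm's alerts are accepted.
# The trial label and all counts are illustrative, not real data.

def assign_arm(clinician_id: int, trial: str = "alert-design-trial") -> str:
    """Stable 50/50 assignment: hash the (trial, clinician) pair so the
    same clinician always sees the same design across sessions."""
    digest = hashlib.md5(f"{trial}:{clinician_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

def two_proportion_z(accepts_a: int, n_a: int,
                     accepts_b: int, n_b: int) -> float:
    """z statistic for the difference in acceptance rates between arms,
    using the pooled-proportion standard error."""
    p_a, p_b = accepts_a / n_a, accepts_b / n_b
    p_pool = (accepts_a + accepts_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Illustrative counts: design A accepted 180/1000 times, design B 230/1000.
z = two_proportion_z(180, 1000, 230, 1000)
print(f"z = {z:.2f}")  # |z| > 1.96 suggests a real difference at p < 0.05
```

In practice the randomization unit (clinician, clinic, or encounter) and the outcome (acceptance, override, time-to-action) are design choices, and the same deterministic-hashing trick keeps each unit's experience consistent for the life of the experiment.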

A second major question is how to address automation bias that ultimately results in de-skilling. In contrast to other fields that developed and implemented AI systems guiding processes from start to finish (such as aviation and manufacturing), healthcare is approaching AI in more targeted ways – deploying models that accomplish defined tasks (e.g., drafting a note) or predict specific clinical targets (e.g., sepsis onset, medication side effects). As AI takes on varied components of the clinical encounter, clinicians will likely experience AI evolving from a distinct "new" input into something more pervasive and subtle. This evolution is conceptually appealing: we want AI to fade into the background, acting as a subtle co-pilot that supports clinicians and brings together the strengths of human and artificial intelligence. In practice, however, we have little understanding of how to combine human and artificial intelligence in ways that allow clinicians to focus on the domains in which they are uniquely competent (and even what these domains are – empathy? rare cases? patient preference-guided treatment selection?) while allowing the technology to offer the greatest benefit.

Then there are related concerns about where we will need to maintain at least some level of clinician oversight of AI and allow the clinician to take over if needed. In both of these areas, there are substantive concerns about whether these are reasonable expectations for clinicians. In general, humans are terrible at tasks that require vigilance without substantive engagement, and they become de-skilled if they do not routinely use particular knowledge or procedural techniques.

All of this is to say that, while AI creates massive potential to improve healthcare, there is fundamental work we need to do to address several important challenges. Further, this work cuts across mission areas: where our approach to education needs to change to accommodate how AI will integrate into clinical work, how our health system can integrate AI into routine clinical care safely, effectively, and equitably, and where our research can advance the new field of clinical AI implementation science. This combination will challenge academic medical centers to work in new ways. At UCSF, in our department, and in our new division, DoC-IT, we welcome these challenges and are optimistic that we can help lead the digital transformation of healthcare, for the betterment of patients and clinicians.

Julia Adler-Milstein, PhD                                                                 
Professor and Chief, Division of Clinical Informatics and Digital Transformation (DoC-IT) 
Director of the Center for Clinical Informatics and Improvement Research (CLIIR)

Bob Wachter, MD
Professor and Chair, Department of Medicine