AI in Healthcare Series: Tracking and Trusting AI in Medicine, Dr. Shantanu Nundy
By Stanford Online
Key Concepts
- AI Adoption in Daily Life: People are increasingly using AI tools for various tasks, including writing, seeking information, and analysis.
- Healthcare AI Use Cases: A significant portion of ChatGPT conversations (5-10%) relate to health, indicating consumer interest.
- AI Sophistication and Prompt Engineering: Users are moving from basic copy-editing to more complex analytical tasks by developing sophisticated prompts.
- AI Slop and Content Quality: A concern exists about the proliferation of low-quality, generic AI-generated content, particularly on social media.
- Video and Image Generation: Advancements in AI video generation (e.g., Sora 2, Veo 3) are rapid, raising concerns about content curation and potential misuse (e.g., deepfakes of doctors).
- Democratization and Cost Reduction: AI capabilities are becoming cheaper and more accessible, accelerating adoption across industries.
- Healthcare Access and Equity: AI has the potential to address significant healthcare access issues, such as the large number of people without regular medical care and those living in medical deserts.
- Medical Errors and Alert Fatigue: AI-driven alerts in healthcare can be ignored due to alert fatigue, mirroring the "boy who cried wolf" phenomenon, and raising complex legal and workflow challenges.
- FDA's Role in AI Regulation: The FDA is actively engaging with AI, focusing on both promoting innovation and protecting patients, with a growing emphasis on real-world monitoring.
- Counterfactual Analysis in Healthcare: Evaluating AI's impact requires comparing its performance not just against ideal scenarios but also against existing harms caused by current healthcare limitations and errors.
- Real-World Evidence (RWE) and Post-Market Surveillance: The FDA is exploring how to leverage RWE for AI monitoring, recognizing that AI models evolve and may not perform as expected in diverse real-world patient populations.
- Data Infrastructure for AI Evaluation: A significant challenge is the lack of standardized data infrastructure within Electronic Health Records (EHRs) to track AI usage, inputs, outputs, and versions, hindering effective RWE analysis.
- Risk-Based Frameworks and Prioritization: The FDA is applying risk-based frameworks to prioritize AI oversight, distinguishing between regulated and non-regulated AI tools and focusing on areas with the greatest potential impact.
- Collaboration and Innovation: The AI and healthcare communities are actively collaborating with the FDA to develop frameworks for safe and effective AI deployment, emphasizing a shared goal of improving healthcare access and quality.
AI Adoption and User Behavior
The discussion begins by highlighting the widespread adoption of AI tools in daily life, citing OpenAI's data showing millions of conversations. A significant finding is that 5% to 10% of ChatGPT conversations relate to health, indicating a strong consumer interest in using AI for health-related queries and practical guidance. This trend is mirrored by other companies like Google, where health queries are a top consumer concern.
Key Points:
- Personal AI Journey: Dr. Shantanu Nundy describes his personal evolution from using AI as a copy editor and ghostwriter to an analyst, emphasizing the development of more complex prompts for higher-order tasks.
- Writing as a Dominant Use Case: Writing is identified as a primary use of AI, with an expectation that personalized emails will increasingly be AI-generated.
- Internet Behavior Transfer: Matt Lungren suggests that internet behaviors, such as seeking information, are transferring to AI models.
- "AI Slop" and Credibility: A critique is raised about the prevalence of generic, uninspired AI-generated content, particularly on platforms like LinkedIn and Twitter, which can lead to a loss of credibility for the user. This "AI slop" is a concern for readers seeking more authentic content.
- Video and Image Generation Advancements: The rapid progress in AI video generation (e.g., OpenAI's Sora 2, Google's Veo 3) is noted, with implications for content curation and potential misuse, such as creating deepfakes of real doctors.
Democratization and Cost Reduction in AI
The conversation shifts to the increasing affordability and accessibility of AI technologies, a crucial factor for broader adoption, especially in healthcare.
Key Points:
- "State of AI" Report: The "State of AI" report by Nathan Benaich is highlighted as a valuable resource for understanding the past year's developments, particularly the trend of decreasing costs and increasing democratization of AI capabilities.
- Cost Reduction and Access: Falling model costs are seen as the most important trend because lower cost translates directly into wider access, particularly in countries with healthcare disparities.
- Open-Source Models: The race for fully open-source AI models is mentioned as a driver of competition and cost reduction.
- Ubiquitous Access: As AI becomes ubiquitous, the initial cost barriers are diminishing, with the expectation that competition will continue to drive down prices.
Challenges and Concerns in Healthcare AI
Despite the promising advancements, significant challenges and concerns are emerging regarding the integration of AI into healthcare.
Key Points:
- The Case of the Ignored Sepsis Alert: A New York Times article detailing a college-aged patient who died after an ER visit for flu-like symptoms is discussed. A sepsis alert was present but ignored by the provider, highlighting alert fatigue and the sociotechnical aspects of AI implementation.
- Legal and Workflow Implications: The case raises questions about legal standards, how often doctors actually act on alerts, and even the counterintuitive idea of disabling alerts to avoid liability.
- Alert Fatigue Analogy: Alert fatigue is compared to the constant honking in some cities, where the noise fades into the background, and to the "boy who cried wolf" scenario.
- Fear of AI-Caused Harm: There's a fundamental fear that AI-related errors causing harm will be scrutinized more heavily than physician-caused harm or traditional accidents, even if AI demonstrably saves lives overall.
- Simulated vs. Real-World Testing: A critical point is made that testing AI in simulated environments cannot anticipate the full range of real-world prompts and scenarios. The FDA's focus is on the "long tail" of potential issues that only surface with widespread use.
- The "Long Tail" of AI Performance: The concept of the "long tail" is explained using the example of Google searches, where a significant percentage of searches are novel. This highlights the challenge of ensuring AI performance across a vast and unpredictable range of real-world applications.
- Understanding Outcomes: A key challenge is developing methods to understand the actual outcomes of AI use in real-world settings, which requires allowing these systems to operate and iterating based on observed performance.
The FDA's Perspective and Real-World Monitoring
Dr. Shantanu Nundy, an advisor to the FDA, shares insights into the agency's approach to AI in healthcare.
Key Points:
- FDA Mission and Urgency: The FDA's mission is to promote and protect public health. Dr. Nundy emphasizes the need to balance the "protect" aspect with actively "promoting" the adoption of beneficial AI.
- Five Key Statistics: Dr. Nundy uses five statistics to underscore the urgency and need for AI in healthcare:
  - 100 million people in the US lack regular medical care.
  - 75 million live in medical deserts.
  - Medical error is the third leading cause of death.
  - 95% of rare diseases lack FDA-approved treatments.
  - US life expectancy is flat and lags behind other OECD countries.
- Counterfactual Analysis: The importance of considering the "counterfactual" – the existing harms and limitations in healthcare – when evaluating AI is stressed. The current system already experiences significant medical errors, akin to a daily jumbo jet crash.
- Real-World Monitoring (RWM): The FDA has released guidance on real-world monitoring for AI, recognizing that AI models change rapidly and may not perform as expected in diverse patient populations compared to clinical trial participants.
- Evolving Regulatory Science: The FDA acknowledges that the science of evaluating and monitoring AI is still evolving and requires collaboration across academia, industry, and regulatory bodies.
- Digital Trials and Accelerated Market Entry: The possibility of conducting trials digitally and accelerating market entry for AI-powered medical devices is discussed, with the potential to bypass traditional bottlenecks without sacrificing safety.
- Collaborative Approach: The AI and healthcare community is characterized by a collaborative spirit, with a shared goal of ensuring safe and effective AI access.
Data Infrastructure and System-Level Tracking
A significant hurdle in effectively monitoring AI in healthcare is the lack of robust data infrastructure.
Key Points:
- EHR Data Limitations: Even if all EHR data flowed to the FDA, evaluating and monitoring AI would remain difficult because AI usage is often not captured within current EHR systems.
- Lack of Unique Identifiers: Unlike medications, which carry unique identifiers (National Drug Codes, or NDCs), AI tools, their versions, and their specific usage within patient encounters are not consistently tracked.
- The "Plumbing" Problem: Beyond the methodological aspects, there's a fundamental need for the "plumbing" – the data infrastructure – to support the evaluation and monitoring of AI.
- Health System AI Inventory: Many health systems are struggling to even identify all the AI tools they are using, let alone track their deployment and impact. This lack of visibility can lead to security risks and governance challenges.
- Time-Stamped Patient Encounters: For effective post-market surveillance, it's crucial to have time-stamped data at the individual patient encounter level, detailing the AI tool used, its version, the inputs, and the outputs. This allows for retrospective analysis of AI performance and patient benefit.
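The encounter-level record described above could be sketched roughly as follows. This is only an illustration, not a schema proposed by the FDA or the speakers: the class name `AIUsageEvent`, its field names, the `clinician_flagged` field, the helper `flag_rate_by_version`, and all example values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AIUsageEvent:
    """One AI tool invocation within a single patient encounter (hypothetical schema)."""
    encounter_id: str    # identifier of the patient encounter
    tool_id: str         # hypothetical unique identifier for the AI tool
    tool_version: str    # version of the model or software that ran
    timestamp: datetime  # when the tool produced its output
    inputs: dict         # what the tool was given
    output: str          # what the tool returned
    clinician_flagged: bool = False  # True if a clinician reported the output as problematic


def flag_rate_by_version(events):
    """Return {(tool_id, tool_version): fraction of events flagged by clinicians}."""
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for event in events:
        key = (event.tool_id, event.tool_version)
        totals[key] += 1
        flagged[key] += int(event.clinician_flagged)
    return {key: flagged[key] / totals[key] for key in totals}


# Made-up events for illustration only; not real patient data.
example_events = [
    AIUsageEvent("enc-001", "sepsis-alert", "1.0",
                 datetime(2025, 1, 5, 14, 30, tzinfo=timezone.utc),
                 {"heart_rate": 118}, "alert"),
    AIUsageEvent("enc-002", "sepsis-alert", "1.0",
                 datetime(2025, 1, 6, 9, 15, tzinfo=timezone.utc),
                 {"heart_rate": 82}, "alert", clinician_flagged=True),
    AIUsageEvent("enc-003", "sepsis-alert", "1.1",
                 datetime(2025, 2, 1, 11, 0, tzinfo=timezone.utc),
                 {"heart_rate": 95}, "no alert"),
]

rates = flag_rate_by_version(example_events)
```

Keeping the tool version on every record is what makes a retrospective comparison between versions possible; without it, a change in the flag rate could not be attributed to a specific release.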
Prioritization and Future Directions
The conversation concludes with a discussion on how to prioritize the vast landscape of AI challenges and the need for collective reporting and collaboration.
Key Points:
- Risk-Based Frameworks: The FDA uses risk-based frameworks to categorize AI tools, narrowing the set that requires intensive oversight.
- Extending Existing Frameworks: The FDA is looking to extend existing frameworks, such as the unique device identifiers used for implants, to AI.
- Surgeon and Clinician Reporting: Similar to how surgeons report faulty devices, a system for clinicians to flag problematic AI outputs or insufficient alerts is proposed as a way to gather signals.
- Collaborative Problem-Solving: The emphasis is on bringing people together to solve these complex problems, recognizing that the innovation community can contribute to developing solutions for data infrastructure and monitoring.
- Call for Collaboration: The FDA is actively seeking collaboration and encourages those with progress in AI evaluation and monitoring to engage with their requests for information.
Synthesis and Conclusion
The discussion underscores the transformative potential of AI in healthcare, driven by rapid technological advancements and increasing accessibility. However, it also highlights critical challenges related to user adoption, content quality, and, most importantly, the safe and effective integration of AI into clinical practice. The FDA's proactive engagement, coupled with a growing understanding of the need for robust real-world monitoring and data infrastructure, signals a commitment to navigating these complexities. The conversation emphasizes that addressing the existing harms in healthcare and ensuring equitable access to high-quality AI requires a collaborative, iterative, and data-driven approach, moving beyond simulated environments to understand AI's true impact in the real world. The "plumbing" of data infrastructure and the development of sophisticated monitoring frameworks are identified as crucial, albeit less glamorous, components for realizing AI's full potential in improving patient outcomes.