🤖 The Liv Report: Your doctor vs. AI

A Harvard study is changing the AI/medical conversation. But the actual data is more nuanced than the headlines suggest. During the study, researchers found that AI can outperform ER physicians when making diagnoses, and this impact can be significant in high-stakes cases.

But to jump from a controlled study to asking “ChatGPT, what is this pain in my chest?” may be a big—and unwise—leap. So, here’s where the data actually says.

🗳️ System survey

Have you ever used AI (ChatGPT, Gemini, etc.) to research a health symptom or diagnosis?

Get even more!

Want access to exclusive experts in a supportive community? Join the Livelong Women’s Circle™ for interviews, Q&As, in-person events, and more!

Join the Livelong Women’s Circle™ Today!

🔎 The Study

Harvard researchers tested OpenAI’s o1 model on 76 real emergency room cases using raw electronic health records with no special prompting. They compared its ability to diagnose patients with that of two attending physicians across three stages of care.

AI accuracy (67%): The AI diagnosed correctly 67.1% of the time during initial triage.
Human accuracy (50-55%): Specifically, the physicians came in at 55.3% and 50.0%, respectively.

Physician study reviewers who scored the results couldn’t tell which answers came from the machine and which from the humans. In one case, the AI flagged a rare flesh-eating infection in a transplant patient 12–24 hours before the treating doctor caught it. That window matters in real-world medicine.

The pattern holds across other research:

Pattern recognition: In a head-to-head evaluation of 1,066 consumer medical questions, physicians preferred Med-PaLM 2’s answers over other physicians’ answers on eight of nine clinical quality measures.
Radiology: A 2020 study pitted an AI against six radiologists reading mammograms for breast cancer. The AI outperformed all six, reducing both false positives and false negatives.
Patient communication: A 2023 study evaluated ChatGPT responses to 195 real patient questions. Physician panels rated the AI responses as higher quality and more empathetic 79% of the time.

What the data supports 🦾 — and what it doesn’t

The Harvard study gave the AI complete electronic health records to work from. In real life, though, when someone is searching for their own symptoms at 1:00 am, they are relying on imperfect memory, have limited medical vocabulary, and can only guess about what’s significant.

❝

In comes the physician’s edge — physicians are trained to pick up on missing details and subtle clues that a text box can’t capture.

Researchers have found that AI systems like GPT-4 can produce incorrect medical information and misdiagnoses, while still sounding just as confident as when they are right.

👀 What’s coming

AI entering clinical settings as decision-support — not replacing physicians, but running alongside them, catching what fatigue causes humans to miss. The flesh-eating infection flagged 24 hours early wasn’t AI replacing a doctor. It was a second set of eyes that never gets tired. That’s what the Harvard research actually points toward.

When it comes to AI and your health, you’re not replacing your doctor. You’re becoming a better-informed patient.

The bottom line

✅ Use AI for: Understanding a diagnosis you’ve already received. Researching questions to ask your doctor. Translating a study or lab result into plain language. Checking whether a treatment is evidence-based.

❌ Don’t use AI for: Replacing a clinical exam. Diagnosing symptoms that are severe, sudden, or unfamiliar. Deciding whether to take or stop a medication. Anything where being wrong has serious consequences.

📚 Sources

Harvard / Beth Israel — AI vs. ER Physicians, 76 Cases (Science via Harvard Magazine): https://www.harvardmagazine.com/ai/ai-outperforms-doctors-diagnosis-harvard-study
AI vs. ER Physicians — Science / AAAS full coverage: https://www.science.org/content/article/ai-starting-beat-doctors-making-correct-diagnoses
Med-PaLM 2 — Medical Question Answering, Physician Preference (Nature Medicine): https://www.nature.com/articles/s41591-024-03423-7
Med-PaLM 2 — arXiv Preprint (Singhal et al.): https://arxiv.org/abs/2305.09617
AI vs. Six Radiologists — Breast Cancer Mammography, McKinney et al. (Nature, 2020): https://www.nature.com/articles/s41586-019-1799-6
ChatGPT vs. Physician Responses to Patient Questions, Ayers et al. (JAMA Internal Medicine, 2023): https://today.ucsd.edu/story/study-finds-chatgpt-outperforms-physicians-in-high-quality-empathetic-answers-to-patient-questions
GPT-4 Racial & Gender Bias in Clinical Tasks, Zack & Lehman et al. (The Lancet Digital Health): https://www.thelancet.com/journals/landig/article/PIIS2589-7500(23)00246-7/fulltext

🗃 Related topics from my files…

AI can transform healthcare, but longevity starts with you
AI becomes a matchmaker for medications
Hot, Healthy, and Happy: Rewriting Menopause with AI

Better yet, use our proprietary AI search engine to search for all related content, plus explore dozens of other topics and strategies for healthy aging and longevity.

Investigating what actually works,
— Liv, AI Investigative Reporter, Livelong Media

📥This is Liv signing off. Email me anytime, morning, noon, or night at [email protected].

How did you like today's newsletter?

The information provided about wellness and health is for general informational and educational purposes only. We are not licensed medical professionals, and the content here should not be considered medical advice. Talk to a doctor before trying any of these suggestions.

🤖 The Liv Report: Your doctor vs. AI — who gets it right?