Chat GPT and AI in Healthcare Thread

Dose of uncertainty: Experts wary of AI health gadgets at CES

Health tech gadgets displayed at the annual CES trade show make a lot of promises. A smart scale promises a healthier lifestyle by scanning your feet to track your heart health, and an egg-shaped hormone tracker uses AI to help you figure out the best time to conceive.

Tech and health experts, however, question the accuracy of products like these and warn of data privacy issues — especially as the federal government eases up on regulation.

The Food and Drug Administration announced during the show in Las Vegas that it will relax regulations on “low-risk” general wellness products such as heart monitors and wheelchairs. It’s the latest step President Donald Trump’s administration has taken to remove barriers for AI innovation and use. The White House repealed former President Joe Biden’s executive order establishing guardrails around AI, and last month, the Department of Health and Human Services outlined its strategy to expand its use of AI.

You can turn off OpenAI’s training on your data. And it’s not what’s uploaded that’s valuable; it’s the entire conversation. After all, they are producing conversations in some sense, rather than a new generation of what was uploaded.

Since most health data is digital, you should assume in my opinion that it’s already public in some sense.

RAM prices are crazy right now, but you really could take an old gaming PC, upgrade the RAM, and run gpt-oss with 120 billion parameters well. It’s essentially ChatGPT at home.
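If you want to try it, here’s a minimal sketch of what that looks like with the Ollama Python client (assuming Ollama is installed and the model tag is gpt-oss:120b; the prompt is just an example):

```python
# Minimal local-inference sketch using the Ollama Python client.
# Assumes Ollama is installed and the model was fetched with:
#   ollama pull gpt-oss:120b
import ollama

response = ollama.chat(
    model="gpt-oss:120b",
    messages=[
        {
            "role": "user",
            "content": "What are the limitations of consumer sleep trackers?",
        }
    ],
)
print(response.message.content)  # nothing leaves your machine
```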

2 Likes

And the trend continues… but, as also seems to be the trend, Anthropic actually does it with some safeguards, like HIPAA-oriented infrastructure (though what that means exactly is a little unclear to me).

3 Likes

This feels like a big deal (surprisingly, this video is 2 months old):

2 Likes

A livestream from an hour ago: life sciences and healthcare with Dario Amodei, for about 15 minutes starting around the 5-minute mark.

Wow. His sister and co-founder Daniela had her second child a few months ago. She had an infection during pregnancy, and many fancy doctors said it was a viral infection. She got a second opinion from Claude, which suggested it was bacterial and that she needed antibiotics within 48 hours or it would go systemic, so she took them, and further testing showed Claude was right (11:40 mark).

3 Likes

Rapamycin is mentioned at the 48-minute mark of the livestream by David Fajgenbaum, co-founder and president of Every Cure:

A relatively brief and readable article (no need to AI-summarize it). Stanford has lots of sleep data to feed its AI.

"“The most information we got for predicting disease was by contrasting the different channels,” Mignot said. Body constituents that were out of sync — a brain that looks asleep but a heart that looks awake, for example — seemed to spell trouble."

1 Like

NEW: ARISE, a Stanford–Harvard network of clinicians and researchers, just published the State of Clinical AI 2026.

Here are the major takeaways:

  1. “Superhuman” results are real but fragile.
    AI can match or beat clinicians on narrow, well-defined test cases. But small changes that introduce uncertainty or ambiguity cause sharp performance drops.

  2. Uncertainty remains the core weakness.
    When information is incomplete, evolving, or ambiguous, AI systems struggle. They often commit confidently to wrong answers rather than expressing uncertainty.

  3. AI shines where scale beats judgment.
    The strongest evidence is in prediction at scale: early warning systems, risk forecasting, disease trajectories, and population-level insights that humans cannot compute manually.

  4. Most clinical AI studies don’t resemble real care.
    Nearly half rely on exam-style questions. Very few use real patient data, measure uncertainty, or test fairness. This limits how well results translate to practice.

  5. Realistic evals reveal useful failure modes.
    Realistic tests like simulated EHRs and long patient interactions expose how AI actually fails, such as losing context, missing updates, or committing too early to wrong conclusions.

  6. AI works best as a teammate, not replacement.
    Across imaging, diagnostics, and treatment planning, clinician + AI outperforms either alone when integration is done well.

  7. Poor integration can make decisions worse.
    Over-reliance, automation bias, and reduced vigilance are real risks. In some cases, clinicians performed worse with AI than without it.

  8. Patient-facing AI scales fastest and carries unique risk.
    These tools expand access and engagement, but operate without real-time professional oversight. Confidence without context is especially dangerous.

  9. Outcomes matter more than engagement.
    Patient-facing AI is often evaluated on simulations or usage metrics, not whether it improves health, reduces errors, or speeds appropriate escalation.

  10. The field is shifting from capability to evidence.
    The next phase is not better demos but prospective, real-world trials that show when AI actually improves care and when it does not.

1 Like

This aligns with my experience after testing various AI platforms in areas where I have done a lot of research and have good competence. AI is useful in some contexts, but generally tends to be highly unreliable in unpredictable ways. I think one should be very, very cautious in relying on AI for healthcare decisions.

1 Like

How do you structure your prompts?

Like this?

1 Like

No, not like that. Admittedly, I usually only do “attempt 1”, and rarely do “attempt 2” or more, unless it’s a follow-up query asking to incorporate some paper the first pass missed. My approach is also not conversational. However, I do set very tight, very detailed constraints: asking the model to focus on a specific MOA, incorporate specific papers, avoid certain sources (articles without citations, etc.), favor high-credibility material, exclude certain presumptions, and so on. So my query is very detail-oriented. I figure that if the AI platform fails here, there’s little point in repeating an ask I’ve already specified in my first attempt. It failed; my confidence that it won’t fail again craters.
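For illustration, here’s a sketch of that kind of single-shot, tightly constrained query sent through an API (the model name and the specific constraints are placeholders I made up, not my actual query):

```python
# Sketch of a single-turn, tightly constrained query using the OpenAI
# Python client. The model name and constraint wording are illustrative.
from openai import OpenAI

client = OpenAI()

prompt = """Question: Does rapamycin's inhibition of mTORC1 plausibly explain
its reported effects on immune aging?

Constraints:
- Focus only on the mTORC1 mechanism of action.
- Cite peer-reviewed papers only; exclude articles without citations.
- Say "insufficient evidence" rather than speculate.
- No biohacker framing, no tone or personality; facts only.
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any capable model
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```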

Research from last year supports the idea that multi-turn conversations aren’t actually good for certain tasks anyway. Maybe it’s fixed now, though I admit I haven’t read the paper and simply believe it (at least as of then):

Our experiments confirm that all the top open- and closed-weight LLMs we test exhibit significantly lower performance in multi-turn conversations than single-turn, with an average drop of 39% across six generation tasks.

In simpler terms, we discover that when LLMs take a wrong turn in a conversation, they get lost and do not recover.

1 Like

LOL, that was my instinct too! I figure I gave the AI very tight instructions, with careful constraints and source-quality requirements. If it still fails, I have no reason to suspect that the reasoning capability will somehow improve on the second go-round. Ask once, but ask well, is my philosophy; failure in this scenario strikes me as inherent to the architecture. That’s why I avoid conversational approaches: they leave room for misunderstanding and ambiguity. Instead I keep my questions depersonalized, objective, emotion-free, and very specific. I don’t play games with tone, or battle flattery and other social and personality overlays: “just the facts, ma’am”. Meanwhile, I see “assume I’m a biohacker, blah, blah, blah” all the time; that’s just asking for trouble. Facts, and nothing else, please. 🕵🏻‍♂️

1 Like

Will AI replace doctors? At our STAT@JPM event, Robert Nelsen told the audience, "Yes."

I encourage you to listen to his discussion here: https://x.com/statnews/status/2011573799620067564

Then, Robert responded on X to that post and his video:

Sorry folks. Most (not all) doctors (and most later-stage venture capitalists) will be replaced by AI and robots. It is all about getting the AI the right information, which will be new longitudinal data streams, behavior patterns, proteomics, regular sequencing, imaging, etc. Most of that can be done by less-skilled workers, and eventually robots. I come from a family of docs. Love them dearly. Most docs are trained that they are right, not to acknowledge mistakes because of liability, and it is super clear AI is already better than most docs, including academic docs, when provided accurate data. So in months and years it will be a cakewalk. 5 years ago, I said my nanny and an AI will be better than most docs, and I stand by that. Maybe in 5-15 years, we need 1/50th the number of docs, and more low-skilled data gatherers, and robots. We will need nurses longer, but robots will enable nurses to be much more productive. For the developing world, it will all be about access and getting data into a phone, as the best doctor in the world will be AI. It is hardest for doctors to imagine these changes because they have been taught that they are right, and avoid negative data as a class, and their social status, even their title (try calling them by their name), is so deeply tied together. So they become refractory (see comments, lol). In the interim, the smarter docs will embrace the change, and it will make them better.

Source: https://x.com/rtnarch/status/2012242455286984957?s=20

More on Robert Nelsen:

For This Venture Capitalist, Research on Aging Is Personal; ‘Bob Has a Big Fear of Death’

Robert Nelsen has invested hundreds of millions in Altos Labs, a biotech company working on ways to rejuvenate cells and eliminate disease

The investment firm Robert Nelsen co-founded in 1986, Arch Venture Partners, has racked up billions in profits from early stakes in companies developing methods to detect and treat cancer and other diseases.

In his personal life, Nelsen, 60 years old, downs a daily cocktail of almost a dozen different drugs, including rapamycin, metformin, taurine and nicotinamide mononucleotide, all of which he says help prevent illness and promote longevity. Nelsen has a full-body MRI every six months, sees a dermatologist every three months and has annual blood tests to detect cancer. At his home in the Rocky Mountains, he works out in an “electric suit” that he says emits low-frequency impulses to build muscle and improve health.

“I know I will get cancer, I just want to catch it early,” says Nelsen, who says an MRI several years ago identified thyroid cancer at an early stage. He has seen family members die of the disease.

“Bob has a big fear of death,” says his wife, Ellyn Hennecke.

Read the full story: For This Venture Capitalist, Research on Aging Is Personal; ‘Bob Has a Big Fear of Death’ (WSJ)


1 Like

Well, they say they don’t train on your data. But they must be doing something with it. I have no basis for saying this whatsoever, but I simply don’t trust that they don’t use the things you upload. IMO, if you upload PDF financial reports or letters from your lawyer, that’s too juicy to pass up.

Mac Studio is the best bang-for-buck way to run large models. I have one with 128GB unified memory and it runs GPT-OSS-120b really well.
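For anyone wondering why 128GB works, here’s rough back-of-envelope math (mine, not an official spec; gpt-oss-120b is a mixture-of-experts model shipped in roughly 4-bit MXFP4, so treat these as ballpark numbers):

```python
# Ballpark memory math for hosting a 120B-parameter model locally.
params = 120e9
bytes_per_param = 0.5  # ~4-bit quantization (e.g., MXFP4)
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: ~{weights_gb:.0f} GB")  # ~60 GB

# Add KV cache, runtime buffers, and the OS, and a 128GB
# unified-memory Mac fits it comfortably, while a stock gaming PC
# needs the RAM upgrade mentioned upthread to get there.
```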

Agreed 100%

And yes, once a conversation goes off the rails, it can never, ever be recovered. Just start a new chat with a fresh context window.
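One way to make the restart less painful (a sketch of the idea, nothing product-specific): fold whatever the derailed chat actually established into a single fresh first message, which also lines up with the single-turn advantage in the paper quoted upthread:

```python
# Sketch: salvage a derailed conversation by consolidating what was
# established into one fresh, single-turn prompt for a new chat.
def consolidate(facts: list[str], question: str) -> str:
    """Build a self-contained first message for a brand-new context."""
    lines = ["Context established so far (treat as given):"]
    lines += [f"- {fact}" for fact in facts]
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)

prompt = consolidate(
    facts=[
        "The papers cited in the earlier chat are listed below verbatim.",
        "Sources without citations were already excluded.",
    ],
    question="Given only the above, summarize the strongest evidence.",
)
print(prompt)  # paste into a fresh chat instead of continuing the old one
```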

I simply can’t believe this. It sounds like hype of the highest magnitude. At the end of the day, the “AI” is a calculator for words. I think specialised models will be companions and tools for doctors, but I just can’t see replacement. That said, I don’t have 9 figures invested in it, haha.

2 Likes

Sure, but if I had told you three years ago that by January 2026 over 800 million people would be using AI every day and we’d be investing $4 trillion+ per year in compute infrastructure, you wouldn’t have believed that either :wink:

3 Likes

That’s true, but it is indeed an incredibly fast calculator, one that mimics the calculating abilities of the brain: at vertiginous speed, it explores a myriad of solutions and elaborates the most plausible ones.
Does that make it an infallible entity? NO.
Does that make it a second, powerful brain, subject to control? YES.
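To make the “calculator for words” point concrete, here’s a toy sketch (with invented tokens and numbers) of the step a language model repeats: score candidate next words, turn the scores into probabilities, and favor the most plausible:

```python
# Toy "calculator for words": convert raw scores (logits) for candidate
# next tokens into probabilities with a softmax, then rank them.
# The tokens and numbers are made up for illustration.
import math

logits = {"bacterial": 2.1, "viral": 1.4, "fungal": -0.3, "idiopathic": 0.2}
z = sum(math.exp(v) for v in logits.values())
probs = {tok: math.exp(v) / z for tok, v in logits.items()}

for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{tok:>10}: {p:.2f}")
# The model "elaborates the most plausible" continuation token by token;
# the speed comes from scoring tens of thousands of candidates at once.
```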

1 Like