“Recipients of the Galleri test did not show a significant reduction in the cancers diagnosed at Stage 3, when the disease would have grown or spread near its original site, or Stage 4, when the cancer would have spread to other parts of the body, according to the company.”
“did not show a significant reduction in the cancers”
To me, this says it’s still reducing late-stage cancers, just not by a significant enough number. Meaning it’s not worth it for the population at large, as with a lot of screening, but it could still be useful at the individual level … right?
From the language alone, I don’t think you can tell whether there was any difference at all. The pre-specified endpoint would have been a statistically significant reduction in stage 3 and stage 4 diagnoses (i.e. catching cancers earlier). It’s quite standard to say that a trial “didn’t show a significant reduction” even if there was no difference at all.
The NYT article also mentions that they did find some benefits in the secondary endpoints. However, we have less faith in those, because a trial is designed and powered around its primary endpoint and every extra endpoint adds another comparison. Basically, the more things you measure, the more likely you are to find “something” significant, and the more likely that finding is just chance.
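To put a rough number on the “more measurements” point, here is a minimal sketch. It assumes every endpoint is tested independently at the conventional 0.05 threshold and that there is no real effect anywhere. Neither assumption comes from the trial itself, and the function name is just illustrative.

```python
def chance_of_false_positive(num_endpoints: int, alpha: float = 0.05) -> float:
    """Probability that at least one of `num_endpoints` independent tests
    looks 'significant' purely by chance when no real effect exists."""
    return 1 - (1 - alpha) ** num_endpoints


# Hypothetical endpoint counts, chosen only to show how fast the risk grows.
for k in (1, 5, 10, 20):
    print(f"{k:>2} endpoints: {chance_of_false_positive(k):.1%}")
```

Under those assumptions the chance of at least one spurious “hit” goes from 5% with a single endpoint to about 40% with ten, which is why a lone significant secondary endpoint gets treated with caution.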
I definitely don’t understand all the nuances, but one big problem for screening tests is that they have such huge barriers to overcome to be “successful”, especially at a population level. They not only need to classify you correctly (i.e. true positives and true negatives), they also need to keep false positives and false negatives to a minimum. After that, the information needs to be actionable. So your Galleri test flags breast cancer, then what? You go for other scans, which also have their own specificities and sensitivities, and maybe a biopsy. And then you get a diagnosis. And does that diagnosis affect the treatment plan and patient survival? It’s a lot of hoops to jump through. Sometimes early diagnosis extends survival on paper but not in reality, because the clock simply starts earlier (lead-time bias).
Sometimes it’s actively harmful, and indolent diseases end up being treated when they didn’t need to be. And sometimes you won’t even have an opportunity to act: an aggressive lung cancer, glioma, etc. could establish itself and then wipe you out between tests.
That said, my overall interpretation is that this looks like a really well-designed trial. They recruited a bunch of people in the 50-77 age bracket, tested them yearly and referred them for further diagnosis if the result was positive. And they compared them to participants who didn’t have the Galleri test performed. End result: no significant shift towards earlier diagnosis in the Galleri group. To me, that basically means this product doesn’t work for the purpose we all hoped it would.
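On the false-positive hurdle mentioned above, a quick sketch of why it bites so hard at the population level. All the numbers here (prevalence, sensitivity, specificity) are made up for illustration and are not Galleri’s figures or anything from the trial; the calculation is just Bayes’ rule for positive predictive value.

```python
def positive_predictive_value(prevalence: float,
                              sensitivity: float,
                              specificity: float) -> float:
    """Share of positive results that are true positives (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test: 50% sensitivity, 99.5% specificity.
# Hypothetical prevalences: 1 in 100 and 1 in 1000 people screened.
for prevalence in (0.01, 0.001):
    ppv = positive_predictive_value(prevalence, 0.50, 0.995)
    print(f"prevalence {prevalence:.1%}: PPV {ppv:.1%}")
```

With these invented inputs, roughly half of positives are real when 1 in 100 people has the disease, and fewer than 1 in 10 are real when it is 1 in 1000. Even a very specific test generates a lot of follow-up scans and biopsies when the condition is rare.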
Interesting article, and I agree that the NYT article misses many important elements. The “dilution effect” is real: competing screening tests are already being used (colonoscopies etc.).
But I think there are quite a few inaccuracies. It says there is no screening test for gastric cancer, but I literally had a gastroscopy done today. Sure, it’s invasive, but gastric cancer is very slow growing and you can catch it and cure it early, just like colorectal. She also says liver cancer has no screening test, but it does: alpha-fetoprotein and PIVKA-II blood tests, and abdominal ultrasound. It also sets up a very silly comparison of survival rates for stage 1 versus stage 4, when clearly this test really isn’t effective at finding stage 1 cancers anyway.
I do like the attempt to approximate the cost-to-benefit ratio though. She comes out with:
“Combining the probability of developing an unscreenable cancer (~10% over 20 years), the probability of Galleri detecting it (~78% weighted average for deadly cancers), and the probability that earlier detection meaningfully changes the outcome (~40-50%), annual Galleri screening gives a given individual roughly a 3-4% chance of altering what would otherwise be a fatal diagnosis. Whether that probability justifies $19,000 over two decades is a personal calculation[…]”
I’m not sure about the 40-50% change in outcome. That sounds high to me, and she doesn’t seem to have a citation for that.
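For what it’s worth, the quoted 3-4% is just the three probabilities multiplied together, so it’s easy to see how much the answer leans on that last factor. A minimal sketch: the 10% and 78% are taken from the quote as given, the 40% and 50% are her range, and the lower values (10-30%) are my own hypothetical alternatives, not figures from any source.

```python
def chance_of_altered_outcome(p_cancer: float,
                              p_detect: float,
                              p_benefit: float) -> float:
    """Product of the three probabilities used in the quoted estimate."""
    return p_cancer * p_detect * p_benefit


P_CANCER = 0.10   # from the quote: unscreenable cancer over 20 years
P_DETECT = 0.78   # from the quote: weighted detection probability

# 0.40 and 0.50 are the quoted range; 0.10-0.30 are hypothetical lower values.
for p_benefit in (0.10, 0.20, 0.30, 0.40, 0.50):
    result = chance_of_altered_outcome(P_CANCER, P_DETECT, p_benefit)
    print(f"benefit {p_benefit:.0%}: {result:.1%}")
```

At 40-50% you get the quoted 3.1-3.9%. If the true benefit were, say, 20%, the same arithmetic gives about 1.6%, so the headline figure scales one-for-one with the number that has no citation.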