Now this is starting to get interesting…
An artificial-intelligence system trained on data from 400,000 people in the United Kingdom can estimate the likelihood that a person will develop cancer and a host of other diseases over the course of 20 years.
Scientists have developed a new artificial intelligence tool that can predict your personal risk of more than 1,000 diseases, and forecast changes in health a decade in advance.
The generative AI tool was custom-built by experts from the European Molecular Biology Laboratory (EMBL), the German Cancer Research Centre and the University of Copenhagen, using algorithmic concepts similar to those used in large language models (LLMs).
It is one of the most comprehensive demonstrations to date of how generative AI can model human disease progression at scale. The model was trained on UK data and then validated on data from an entirely separate healthcare system, in Denmark.
Details of the breakthrough were published in the journal Nature.
Full story here:
Nature News:
Which diseases will you have in 20 years? This AI makes accurate predictions
A modified large language model called Delphi-2M analyses a person’s medical records and lifestyle to provide risk estimates for more than 1,000 diseases.
A new artificial intelligence (AI) tool can forecast a person’s risk of developing more than 1,000 diseases, in some cases providing a prediction decades in advance [1].
The model, called Delphi-2M, uses health records and lifestyle factors to estimate the likelihood a person will develop diseases such as cancer, skin diseases and immune conditions up to 20 years ahead of time. Although Delphi-2M was trained only on one data set from the United Kingdom, its multi-disease modelling could one day help clinicians to identify high-risk people, allowing for the early roll-out of preventive measures. The model is described in a study published today in Nature.
The tool’s ability to model multiple diseases in one go is “astonishing”, says Stefan Feuerriegel, a computer scientist at the Ludwig Maximilian University of Munich in Germany, who has developed AI models for medical applications. “It can generate entire future health trajectories,” he says.
Read the full article: Which diseases will you have in 20 years? This AI makes accurate predictions (Nature)
Eric Topol’s take on this:
Dawn of A New Era of Primary Prevention in Medicine
Recent groundbreaking reports highlight our newfound potential to prevent diseases
Primary Prevention means a disease or condition is averted. The term was coined and introduced by Leavell and Clark in the late 1940s. Now, about 75 years later, we’ve yet to achieve any substantive primary prevention with the notable exception of vaccinations that prevent infectious diseases. Of the 3 major age-related diseases that I focused on in SUPER AGERS—cardiovascular, cancer, and neurodegenerative—we have not prevented the latter two. Remember screening for cancer (such as mammography, colonoscopy, PSA, or total body MRI as some have advocated without adequate data) is a secondary prevention, with the objective of finding cancer at an early stage. Nothing meaningful has yet been shown to prevent neurodegenerative diseases. While there has been some preemption of cardiovascular disease with the use of lipid panels and cholesterol lowering drugs, heart attacks (and heart disease) and strokes remain the number 1 and 3 medical causes of death in the United States, respectively. And heart disease deaths are ticking up, adjusted for age.
Instead, our healthcare now is centered on treating these diseases, which has limited success for many cancers and even less, thus far no disease-modifying impact, for neurodegenerative diseases (Alzheimer’s and Parkinson’s). No less, there’s the profound economic benefit of primary prevention for reducing the cost of such treatments, such as tailored oncology drugs or support of people with dementia in long-term care facilities. I hope this brief review will convince you that primary prevention is a great and largely unfulfilled need, that in light of recent advances it should be given the highest priority. The key problem is that we haven’t had the ability to do it. Until now.
This Ground Truths post is about the new and exciting opportunity for achieving primary prevention. We’ll start with a groundbreaking study just published in Nature by Moritz Gerstung and his colleagues.
Journal Article (open access):
Learning the natural history of human disease with generative transformers (Nature)
Decision-making in healthcare relies on understanding patients’ past and current health states to predict and, ultimately, change their future course [1,2,3]. Artificial intelligence (AI) methods promise to aid this task by learning patterns of disease progression from large corpora of health records [4,5]. However, their potential has not been fully investigated at scale. Here we modify the GPT [6] (generative pretrained transformer) architecture to model the progression and competing nature of human diseases. We train this model, Delphi-2M, on data from 0.4 million UK Biobank participants and validate it using external data from 1.9 million Danish individuals with no change in parameters. Delphi-2M predicts the rates of more than 1,000 diseases, conditional on each individual’s past disease history, with accuracy comparable to that of existing single-disease models. Delphi-2M’s generative nature also enables sampling of synthetic future health trajectories, providing meaningful estimates of potential disease burden for up to 20 years, and enabling the training of AI models that have never seen actual data. Explainable AI methods [7] provide insights into Delphi-2M’s predictions, revealing clusters of co-morbidities within and across disease chapters and their time-dependent consequences on future health, but also highlight biases learnt from training data. In summary, transformer-based models appear to be well suited for predictive and generative health-related tasks, are applicable to population-scale datasets and provide insights into temporal dependencies between disease events, potentially improving the understanding of personalized health risks and informing precision medicine approaches.
https://www.nature.com/articles/s41586-025-09529-3
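To make the abstract's idea of "sampling synthetic future health trajectories" from predicted disease rates a bit more concrete, here is a minimal toy sketch in Python. It is not the Delphi code and not the authors' method. It assumes that each possible next event has a rate per year, that waiting times are exponential, and that the earliest of the competing events is the one that happens. The function names (`toy_rates`, `sample_next_event`, `sample_trajectory`), the three events and all the numbers are made up for illustration; in a real system a trained model would supply the rates.

```python
import random


def toy_rates(diagnosed, age):
    """Hypothetical stand-in for a trained model: event rates per year.

    The numbers are invented for illustration, not taken from the paper.
    """
    base = {"hypertension": 0.020, "type 2 diabetes": 0.010, "death": 0.005}
    # Assumption: every rate rises by 3% of its base per year of age past 40.
    age_factor = 1.0 + 0.03 * max(age - 40.0, 0.0)
    # Assumption: a diagnosis is recorded at most once.
    rates = {name: r * age_factor for name, r in base.items() if name not in diagnosed}
    # Assumption: a past hypertension diagnosis doubles the diabetes rate.
    if "hypertension" in diagnosed and "type 2 diabetes" in rates:
        rates["type 2 diabetes"] *= 2.0
    return rates


def sample_next_event(rates, rng):
    """Draw one exponential waiting time per event and keep the earliest."""
    draws = [(rng.expovariate(rate), name) for name, rate in rates.items() if rate > 0]
    if not draws:
        return None, float("inf")
    wait, name = min(draws)
    return name, wait


def sample_trajectory(start_age, horizon_years, seed=0):
    """Sample one synthetic trajectory as a list of (event, age) pairs.

    Simplification: rates are held constant between events and only
    recomputed after each sampled event.
    """
    rng = random.Random(seed)
    age = float(start_age)
    end_age = float(start_age + horizon_years)
    diagnosed, trajectory = set(), []
    while True:
        name, wait = sample_next_event(toy_rates(diagnosed, age), rng)
        if name is None or age + wait > end_age:
            break  # nothing else happens within the horizon
        age += wait
        diagnosed.add(name)
        trajectory.append((name, age))
        if name == "death":
            break
    return trajectory


if __name__ == "__main__":
    for name, age in sample_trajectory(start_age=50, horizon_years=20, seed=1):
        print(f"{age:5.1f}  {name}")
```

Running `sample_trajectory` many times with different seeds and counting how often each event appears would give a rough estimate of disease burden over the horizon, which is the kind of use the abstract describes for the real model.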
Related:
GitHub repository for Delphi:
This repository contains the code for Delphi, the modified GPT-2 model used in the paper “Learning the natural history of human disease with generative transformers”, along with the training code and analysis notebooks.
The implementation is based on Andrej Karpathy’s nanoGPT.