Oculomics and AI: The eye as a biomarker for health span (paper Jan-Feb 26)

https://www.sciencedirect.com/science/article/pii/S2162098926000083

ChatGPT:

What the paper is (one-paragraph summary)

This is a narrative review on “oculomics” (using the eye—especially retina/lens/ocular fluids—as a window into systemic aging) and how AI can turn ocular imaging + ocular “liquid biopsy” (tears/aqueous/vitreous proteomics/metabolomics) into biomarkers of health span and risk for age-related diseases (CVD, diabetes, dementia, neurodegeneration). It surveys ocular structures (anterior + posterior segment), imaging modalities (CFP, OCT, OCT-A, functional tests like pupillometry/eye-tracking), molecular layers (NAD⁺ metabolism, taurine, cytokines, proteomics), then summarizes current AI models (e.g., “retinal age gap,” LensAge, RetiPhenoAge, vessel caliber DLS) and closes with translational barriers: confounding eye disease, generalisability, dataset bias, explainability, device standardisation, privacy/security, and clinical deployment/regulation.


Key points / takeaways (structured)

1) The eye as a systemic + brain-aging readout

  • The authors frame the retina/optic nerve as an extension of the CNS, so retinal neurovascular changes can correlate with systemic vascular aging and neurodegenerative processes.

  • They summarize commonly cited imaging biomarkers:

    • CFP: vessel caliber, tortuosity, optic disc features.
    • OCT: RNFL/GCL thinning; choroidal thickness.
    • OCT-A: vessel density changes and FAZ enlargement as microvascular integrity/perfusion proxies.
    • Functional: pupillary light reflex and eye movement tracking for neurodegeneration.

2) “Ocular metabolomics/proteomics” as an aging layer

  • They highlight NAD⁺ decline/dyshomeostasis and enzymes like NAMPT / NMNAT1 as retinal vulnerability nodes; discuss NMN/NAD⁺ intermediates as potential interventions (mostly preclinical).
  • They discuss taurine dynamics (declines with age; hypotaurine accumulation) and inflammatory cytokine profiling (tears/aqueous/vitreous) as candidate biomarkers.
  • They emphasize proteomics across cornea/lens/retina/vitreous/tears and the idea of ocular proteomic aging clocks.

3) AI models and the “retinal age gap” paradigm

  • The review explains typical evaluation metrics (MAE, R², AUC, HRs) and the “retinal age gap” idea: predicted ocular age minus chronological age as a proxy for accelerated aging/systemic vulnerability.
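
The age-gap arithmetic and the headline metrics are simple to state precisely. A minimal sketch (illustrative code, not from the paper; function names and inputs are mine, assuming numpy):

```python
import numpy as np

def retinal_age_gap(predicted_age, chronological_age):
    """Age gap = model-predicted ocular age minus chronological age.
    A positive gap is interpreted as 'accelerated aging'."""
    return np.asarray(predicted_age, float) - np.asarray(chronological_age, float)

def mae(predicted_age, chronological_age):
    """Mean absolute error of the age prediction, in years."""
    return float(np.mean(np.abs(retinal_age_gap(predicted_age, chronological_age))))

def r_squared(predicted_age, chronological_age):
    """Coefficient of determination of predicted vs chronological age."""
    y = np.asarray(chronological_age, float)
    yhat = np.asarray(predicted_age, float)
    return float(1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2))
```

AUC and HRs then come from relating the gap (not the raw prediction) to outcomes such as mortality, usually with age and sex as covariates.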

  • They compile examples:

    • LensAge (lens photos; consumer/smartphone feasibility angle).
    • RetiPhenoAge (fundus images approximating a blood-based composite aging measure).
    • CNN age prediction from fundus photos linked to mortality.
    • Automated vessel caliber measurement (SIVA-DLS) for vascular risk stratification.
    • Liquid-biopsy proteomics + AI (TEMPO concept) as a way to map proteins to cell origins and build clocks.

4) Translation challenges they explicitly flag

  • Confounding from common eye diseases (e.g., pachychoroid; refractive error/axial length affecting structural measures) and training on “healthy eyes” limiting real-world generalisation.
  • “Black box” models and the push toward explainable AI.
  • Dataset ethnicity/representation bias, need for subgroup performance reporting and governance frameworks (they mention STANDING Together recommendations).
  • Device standardisation, calibration, regulatory clearance, workflow integration, privacy/security.

What’s novel here (relative to the typical oculomics reviews)

  1. Health span framing (not just disease detection).
    Many oculomics papers focus on predicting specific diseases (diabetes/CVD/AD). This review explicitly re-anchors ocular biomarkers in health span / longevity / biological age and discusses a “health span gap” motivation and terminology harmonisation (they even include “joy span”).

  2. Multimodal bridge: imaging + metabolomics/proteomics + AI in one synthesis.
    The paper is not only “retinal photos + deep learning”; it puts omics (NAD⁺, taurine, cytokines, proteomic clocks, liquid biopsy) alongside imaging and highlights multimodal AI as the next step (while acknowledging dataset scarcity).

  3. Explicit translational framework for AI evaluation.
    The review repeatedly emphasizes what a clinician/regulator cares about—ground-truth labels, external validation, generalisability, explainability, deployment—and summarizes representative models in a structured table format.

  4. Sex-specific / women’s wellness angle.
    A dedicated section argues for sex-specific risk prediction where eye biomarkers may be particularly informative (CVD risk underestimation in women; pregnancy hypertensive disorders/preeclampsia prediction signals; dementia burden). This is less common in generic oculomics overviews.


Critique (what’s strong, what’s weak, what I’d watch out for)

Strengths

  • Comprehensive modality coverage: CFP/OCT/OCT-A + functional tests + redox fluorescence + fluids/omics gives a broad “map” of the field.
  • Practical translational concerns are not an afterthought (confounding ocular disease, dataset bias, device standardisation, ethics/privacy).
  • The emphasis on external validation and multi-ethnic cohorts in examples is helpful, because this is where many AI biomarker claims fail.

Limitations / gaps

  1. It’s a narrative review, not a quantitative synthesis.
    There’s no meta-analysis, no formal risk-of-bias scoring, and no systematic comparison of which ocular biomarkers have the best incremental value over standard clinical predictors. The “Methods” section describes databases/terms, but it’s not a PRISMA-style systematic review with reproducible inclusion/exclusion and quantitative pooling.

  2. Biological-age ground truth is still slippery.
    Many models use chronological age or proxies like PhenoAge as labels, then interpret the residual (“age gap”) as “accelerated aging.” That interpretation can be confounded by:

    • camera/device differences, media opacity (cataract), refractive error, ocular disease,
    • lifestyle/SES correlates embedded in datasets,
    • comorbidity that is not “aging rate” per se.
    The paper acknowledges confounding, but doesn’t fully unpack how much “age gap” might be non-aging signal.
  3. Clinical actionability is under-specified.
    The review is optimistic about screening and prevention, but it doesn’t provide much on:

    • what interventions are triggered by an ocular age-gap finding,
    • net benefit vs harms (false positives, anxiety, over-testing),
    • health economics and workflow burden,
    • regulatory pathways and post-market monitoring in practice.
  4. Omics sections mix maturity levels.
    NAD⁺/taurine/cytokines/proteomics are discussed as “biomarkers,” but the evidentiary base varies widely (animal vs human; cross-sectional vs longitudinal; ocular-local vs systemic). The review could do more to separate:

    • correlational biomarkers vs validated predictors vs causal mediators,
    • and to state which are realistically measurable at scale (tears vs aqueous humor sampling, etc.).
  5. Device + protocol heterogeneity deserves even more weight.
    For imaging biomarkers, differences in scan protocols, segmentation algorithms, OCT vendor idiosyncrasies, and image quality pipelines can dominate signal. The paper notes standardisation/calibration needs, but doesn’t propose concrete harmonisation strategies or benchmark tasks/datasets.

“If I were reviewing this for a journal…”

  • I’d ask the authors to include a clear evidence hierarchy table (e.g., for each biomarker/model: label type, cohort design, external validation, prospective outcomes, incremental AUC/HR over standard risk scores, subgroup calibration).
  • I’d also want more emphasis on longitudinal change (does retinal age-gap track within-person aging trajectories, or is it mostly cross-sectional ranking?).

If you want, I can also extract (a) the paper’s Table(s) into a clean structured list (models, modalities, cohorts, outcomes) and (b) a “so what?” section: what would be required for oculomics to become a credible healthspan clock rather than a risk-correlated image score.

Here’s a clean extraction of the paper’s tables into structured lists, focusing on (i) ocular biomarkers and (ii) AI models (oculomics + longevity).


Table 2 — Ocular biomarkers used for aging / neurodegeneration assessment

Biological & imaging biomarkers

  • Retinal vessel tortuosity (CFP): ↑ with microvascular aging; confounded by BP/diabetes and image quality
  • RNFL thickness (OCT): ↓ in neurodegenerative disease (e.g., AD); confounded by glaucoma, axial length, signal strength
  • Choroidal thickness (OCT/EDI-OCT): ↓ with aging and AD/MCI; confounded by axial length/refractive error/pachychoroid
  • Central corneal thickness (pachymetry/AS-OCT): changes with aging; confounded by corneal disease, IOP, prior surgery
  • Corneal stiffness/biomechanics (CorVis ST / ORA): ↑ with aging; confounded by IOP, corneal hydration
  • Corneal endothelial cell density (specular microscopy): ↓ with aging; confounded by prior surgery/endothelial pathology
  • Lens thickness/opacity (slit lamp/lens photo/A-scan): ↑ with aging and cataract changes; confounded by cataract subtype/surgery
  • Pupillary light reflex (PLR) (pupillometry): ↓ amplitude, ↑ latency in AD/PD; confounded by meds, ambient light, ocular disease
  • Eye movements (eye tracking): ↓ accuracy, ↓ velocity in AD/PD; confounded by fatigue, visual acuity, neuro comorbidity

Metabolomics / proteomics biomarkers

  • Retinal amyloid-β / tau deposition (retinal/hyperspectral imaging): ↑ in AD; confounded by artefacts/age/vascular disease
  • NAD⁺ and related metabolites (retina/optic nerve tissue assays): ↓ in vulnerable retinal ganglion cells / glaucoma contexts; confounded by systemic metabolism/diet
  • Taurine (plasma/ocular metabolomics): ↓ with aging; confounded by renal function/nutrition
  • Lens crystallin protein alterations (lens proteomics): structural modification with aging; confounded by oxidative stress/cataract type
  • Corneal wound healing capacity (clinical/experimental): ↓ with aging; confounded by diabetes/medications
  • Ocular proteomic aging clocks (aqueous humor proteomics + AI): ↑ biological age gap in accelerated aging/neurodegeneration; confounded by inflammation/ocular disease
  • Inflammatory cytokines (aqueous humor / plasma proteomics): ↑ in AD/PD; confounded by systemic inflammation/infection

Source: Table 2 in the uploaded PDF.


Table 3 — AI models in oculomics and longevity (structured)

  • LensAge (Deep Learning–Based Lens Age Estimator)
    • Input modality: lens imaging (slit-lamp / OCT)
    • External validation cohorts: SEED, Beijing Eye Study, Handan Eye Study
    • Demographics / ethnicity: predominantly Chinese; Southeast Asian
    • Key outcomes: biological age estimation; age-related health risk
    • Clinical significance (as described): R² > 0.80; MAE ~4.25–4.82 y; positioned as scalable (incl. smartphone imaging) for screening/home monitoring
    • Key study: Li et al., Nature Communications 2023
  • RetiPhenoAge
    • Input modality: retinal color fundus photos (CFP)
    • External validation cohorts: SEED, AREDS
    • Demographics / ethnicity: multi-ethnic (UK, Korean cohorts)
    • Key outcomes: biological aging; morbidity; mortality risk
    • Clinical significance (as described): non-invasive, scalable population-screening proxy for systemic health outcomes
    • Key study: Nusinovici et al., Lancet Healthy Longevity 2024
  • Retinal Age Gap (Xception CNN)
    • Input modality: fundus photos (UK Biobank)
    • External validation cohorts: internal (UKB)
    • Demographics / ethnicity: predominantly UK European
    • Key outcomes: retinal age gap (predicted age − chronological age), linked to all-cause mortality
    • Clinical significance (as described): MAE 3.55 y; R² = 0.81; +1 y gap → HR ~1.02 for mortality; peri-vascular regions highlighted
    • Key study: Zhu et al., British Journal of Ophthalmology 2023
  • Extended Retinal Age Gap → Metabolic Health
    • Input modality: fundus photos + glycemic data (UK Biobank)
    • External validation cohorts: UK Biobank
    • Demographics / ethnicity: predominantly European
    • Key outcomes: retinal age gap correlated with prediabetes/diabetes; HbA1c; glucose
    • Clinical significance (as described): bridges oculomics-derived aging signal with systemic metabolic health/diabetes risk
    • Key study: Chen et al., Diabetes Research and Clinical Practice 2023
  • SIVA-DLS (Singapore I Vessel Analyzer Deep Learning System)
    • Input modality: CFP (automated retinal vessel calibre)
    • External validation cohorts: SEED, UK Biobank, Beijing Eye Study (multi-country)
    • Demographics / ethnicity: multi-ethnic
    • Key outcomes: cardiovascular risk; cognitive decline; dementia
    • Clinical significance (as described): “validated vascular biomarker” for systemic disease and neurodegeneration prediction
    • Key studies: Cheung et al., Nature Biomedical Engineering 2021; Brain Communications 2022
  • Retinal Aging Biomarker for Cognitive Decline & Dementia (CNN)
    • Input modality: CFP + retinal image features
    • External validation cohorts: population-based cohorts
    • Demographics / ethnicity: predominantly Asian
    • Key outcomes: brain aging; cognitive decline; dementia risk; women’s ocular aging models
    • Clinical significance (as described): positioned as dementia-/cognition-specific risk prediction; framed for wellness forecasting
    • Key study: Sim et al., Alzheimer’s & Dementia 2025
  • Reti-CVD (Retinal DL for CVD risk)
    • Input modality: retinal fundus images
    • External validation cohorts: UK Biobank + external cohorts
    • Demographics / ethnicity: multi-ethnic
    • Key outcomes: cardiovascular risk stratification; systemic disease prediction
    • Clinical significance (as described): population-level prediction of CVD risk factors/events using retinal imaging
    • Key studies: Rim et al., Lancet Digital Health 2021; Tseng et al., BMC Medicine 2023
  • TEMPO (Tracing Expression of Multiple Protein Origins)
    • Input modality: liquid biopsy proteomics + single-cell RNA mapping
    • External validation cohorts: independent ocular disease cohorts
    • Demographics / ethnicity: mixed ocular disease populations
    • Key outcomes: aging clocks; “disease acceleration”; protein-level aging shifts
    • Clinical significance (as described): claimed to detect subtle proteomic aging processes and link ocular proteomics to systemic longevity
    • Key study: Wolf et al., Cell 2023

Source: Table 3 in the uploaded PDF.


(b) “So what?” — What would be required for oculomics to be a credible healthspan clock (not just a risk-correlated image score)

Below is a practical checklist, framed the way a biomarker program / regulator / clinician would think about it.


1) Define the target: what is the clock supposed to measure?

A credible healthspan clock must be explicit about its label:

  • Chronological age (easy, common, but not the goal)
  • Outcome risk (e.g., 10-yr CVD events, dementia incidence)
  • Physiologic state (frailty, multimorbidity, functional capacity)
  • Rate of change (within-person slope over time)

Most retinal “age gap” models are trained on chronological age, then interpreted as accelerated aging. That can work, but only if you prove the residual tracks healthspan and not just confounding.

Minimum requirement: pre-register the primary target (e.g., “predict 5-yr incident frailty independent of age/sex/SES”), and treat age-prediction performance as secondary.


2) Show that the signal is not dominated by confounders

Ocular measures are highly sensitive to things that aren’t “systemic aging rate”:

  • Ocular disease (glaucoma, AMD, diabetic retinopathy, cataract)
  • Axial length/refractive error (changes OCT thickness metrics)
  • Image quality + device/vendor (camera/OCT model, segmentation)
  • Medication effects (pupillometry; vascular tone)
  • Demographics/SES (embedded correlates in training data)

The paper flags confounding and generalisation concerns repeatedly.

Minimum requirement: demonstrate robustness with:

  • stratified performance across common ocular conditions,
  • adjustment for axial length / refractive error where relevant,
  • device harmonisation tests (train on vendor A, test on vendor B),
  • calibration curves by subgroup (sex/ethnicity/age bands).
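
Subgroup-stratified error reporting is the cheapest of these checks to operationalize. A minimal sketch (illustrative; function name and inputs are mine, assuming numpy):

```python
import numpy as np

def stratified_mae(predicted_age, chronological_age, groups):
    """MAE of an age model reported separately per subgroup
    (e.g., sex, ethnicity band, ocular-disease status, device vendor),
    so a good pooled MAE cannot hide a bad subgroup."""
    pred = np.asarray(predicted_age, float)
    chrono = np.asarray(chronological_age, float)
    groups = np.asarray(groups)
    return {g: float(np.mean(np.abs(pred[groups == g] - chrono[groups == g])))
            for g in np.unique(groups)}
```

The same stratification applied to calibration curves (not just MAE) is what reveals whether the model is systematically over- or under-aging a subgroup.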

3) Prove incremental value over standard clinical predictors

A biomarker isn’t useful if it just rediscovers what BP, HbA1c, lipids, smoking, BMI already tell you.

Minimum requirement (for each key outcome):

  • Compare baseline model (traditional risk factors) vs baseline + oculomics

  • Report:

    • ΔAUC / C-index
    • Net reclassification improvement (NRI)
    • Calibration (esp. in external cohorts)
    • Decision-curve / net benefit (clinical utility)

This is where many “cool AI” biomarkers stall in translation.


4) External validation that is truly external

Not “random split UK Biobank.” You need:

  • different geography,
  • different capture devices,
  • different clinical workflow,
  • different prevalence (case mix).

The review emphasizes external validation and dataset bias issues.

Minimum requirement: at least one large prospective cohort and one “real-world clinic” deployment dataset.


5) Longitudinal tracking: does it move within person in the right direction?

A healthspan clock must show meaningful within-person dynamics.

Minimum requirement:

  • test–retest reliability (same day / short interval),
  • annual drift consistent with expected aging,
  • sensitivity to major health events (e.g., MI, diabetes onset),
  • and ideally responsiveness to interventions (see next).

6) Intervention responsiveness: does the clock change when healthspan changes?

For a clock to be actionable, it should respond to:

  • smoking cessation,
  • BP/diabetes control,
  • exercise/weight loss programs,
  • maybe specific therapeutics (depending on the claim).

The paper is optimistic about prevention, but doesn’t supply strong intervention-grade evidence (this is a field-wide gap, not just this paper).

Minimum requirement: at least one preregistered study where an intervention changes clinical endpoints and oculomics shifts concordantly.


7) Mechanistic plausibility and interpretability

Clinicians will ask “what is it seeing?” Even if you don’t fully know, you need credible explanation layers:

  • saliency maps are not enough;
  • link features to known biology (microvascular rarefaction, neurodegenerative thinning, inflammation signatures in tears/aqueous, etc.).

The review discusses explainability needs and the retina-as-CNS rationale.

Minimum requirement: interpretable feature sets or hybrid models (explicit vessel metrics + DL embeddings), plus biological correlations.


8) Standardisation: devices, protocols, QC, harmonisation

This is often the hidden killer for imaging biomarkers.

Minimum requirement:

  • a defined acquisition protocol,
  • automated quality scoring and exclusion rules,
  • cross-vendor normalisation strategy,
  • locked model + versioning, with drift monitoring.
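
Drift monitoring for a locked model can start as simply as comparing the live score distribution against the validation-time reference. A minimal sketch of the population stability index (illustrative; the PSI binning choice and the ~0.2 alert threshold are common conventions, not from the paper; assuming numpy):

```python
import numpy as np

def population_stability_index(reference_scores, live_scores, bins=10):
    """PSI between deployment-time model scores and the locked model's
    reference distribution; values above ~0.2 are a common trigger to
    investigate device, population, or pipeline drift."""
    reference = np.asarray(reference_scores, float)
    live = np.asarray(live_scores, float)
    # interior bin edges from the reference distribution's quantiles
    cuts = np.quantile(reference, np.linspace(0, 1, bins + 1))[1:-1]
    ref_frac = np.bincount(np.searchsorted(cuts, reference, side="right"),
                           minlength=bins) / len(reference)
    live_frac = np.bincount(np.searchsorted(cuts, live, side="right"),
                            minlength=bins) / len(live)
    # floor empty bins so the log term stays finite
    ref_frac = np.clip(ref_frac, 1e-6, None)
    live_frac = np.clip(live_frac, 1e-6, None)
    return float(np.sum((live_frac - ref_frac) * np.log(live_frac / ref_frac)))
```

Running this per device vendor and per site, not just pooled, is what catches the cross-vendor failures discussed above.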

The paper notes standardisation and privacy/security as deployment barriers.


9) Clinical workflow and harm analysis

You need a clear “if X then Y” clinical pathway.

Minimum requirement:

  • specify who gets tested (population screening vs high-risk groups),
  • what threshold triggers action,
  • downstream testing burden,
  • false-positive/false-negative implications,
  • health economic model (cost per QALY, etc.).

10) Governance: bias, fairness, and regulatory readiness

The review explicitly calls out bias/representation and governance frameworks.

Minimum requirement:

  • subgroup performance reporting,
  • fairness constraints or post-hoc correction if needed,
  • privacy/security plan,
  • regulatory strategy (as a medical device / decision-support tool).

Practical “credibility bar” (a compact checklist)

If you want one line:

A credible oculomics healthspan clock must show external prospective validation, incremental predictive value, longitudinal within-person tracking, robustness to ocular/device confounding, and intervention responsiveness, with calibrated, bias-audited deployment.