https://www.zyphra.com/post/zuna
Can ZUNA from Zyphra distinguish seizure EEG from healthy EEG with zero training? That is the simple question I started with this weekend.
I have been obsessed with what the 380M parameter masked diffusion model can do since it dropped last week. And since they open sourced the weights, my brain just got too excited.
Here are the results…
Yes it can… with 0.89 AUC. (For imbalanced datasets, AUC is the number to look at in place of accuracy.)
To put that into perspective, that is a huge number for a dataset so skewed that it has about 10k healthy EEG epochs versus only 15 seizure epochs. (I am using a small subset of the CHB-MIT EEG dataset: subjects 1 and 2.)
Training models on such an imbalanced dataset is extremely hard, so achieving strong classification performance with zero training is extremely important.
The state-of-the-art DL models achieve 0.98 AUC. These models are specifically trained on the CHB-MIT dataset.
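To see why accuracy is misleading here, a quick back-of-the-envelope check helps. This is only a sketch using the approximate epoch counts above (10k is rounded), with a made-up "classifier" that labels everything as healthy:

```python
# Approximate counts from the post: ~10,000 healthy epochs vs 15 seizure epochs.
n_healthy = 10_000
n_seizure = 15

# A trivial baseline that predicts "healthy" for every single epoch.
correct = n_healthy  # it gets all healthy epochs right and every seizure wrong
accuracy = correct / (n_healthy + n_seizure)
sensitivity = 0 / n_seizure  # fraction of seizure epochs it actually catches

print(f"accuracy:    {accuracy:.4f}")
print(f"sensitivity: {sensitivity:.4f}")
```

The baseline scores above 99% accuracy while detecting zero seizures, which is why a ranking metric like AUC is the fairer yardstick on this data.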
What did I actually do here?
Since ZUNA reconstructs input data using a masked diffusion autoencoder, it should learn the underlying manifold of healthy brain activity captured by EEG. So if I feed it epileptic EEG, it should fail to reconstruct it accurately, resulting in a higher reconstruction error than for healthy EEG.
I asked the question: what if this reconstruction error contains sufficient information to classify seizures?
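Here is a minimal sketch of that idea: score each epoch by its reconstruction error, then measure how well the score ranks seizure above healthy using AUC. Everything here is illustrative. `toy_reconstruct` is a hypothetical stand-in (it just clips values to a "normal" range) and is not ZUNA's API, the epochs are made-up numbers, and mean squared error is an assumed choice of error metric.

```python
from typing import Callable, List, Sequence

Epoch = Sequence[float]

def reconstruction_error(epoch: Epoch, reconstructed: Epoch) -> float:
    """Mean squared error between an epoch and its reconstruction."""
    return sum((x - y) ** 2 for x, y in zip(epoch, reconstructed)) / len(epoch)

def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random seizure epoch (label 1)
    scores higher than a random healthy epoch (label 0); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def toy_reconstruct(epoch: Epoch) -> List[float]:
    # Hypothetical stand-in for the model: it can only reproduce
    # amplitudes inside [-1, 1], so out-of-range samples are lost.
    return [max(-1.0, min(1.0, x)) for x in epoch]

def score_epochs(epochs: Sequence[Epoch],
                 reconstruct: Callable[[Epoch], Epoch]) -> List[float]:
    """Anomaly score per epoch = reconstruction error under the model."""
    return [reconstruction_error(e, reconstruct(e)) for e in epochs]

# Made-up toy data: 3 "healthy" epochs and 2 "seizure" epochs.
healthy = [[0.5, -0.3, 0.8], [0.1, 0.9, -0.7], [1.2, 0.0, -0.4]]
seizure = [[3.0, -2.5, 1.0], [0.9, 1.1, -1.0]]

epochs = healthy + seizure
labels = [0] * len(healthy) + [1] * len(seizure)
scores = score_epochs(epochs, toy_reconstruct)
auc = roc_auc(scores, labels)
print(f"AUC: {auc:.3f}")
```

No threshold is needed to compute AUC, which is what makes it a natural fit for a zero-training setup: the raw reconstruction error is the classifier score.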
Next step:
What if I build a lightweight classification head that uses the reconstruction error to detect seizure vs healthy EEG?
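One possible shape for such a head, purely as a sketch: a class-weighted logistic regression on a single scalar feature (the per-epoch reconstruction error). The feature choice, the inverse-frequency class weights, the hyperparameters, and the toy numbers are all my assumptions for illustration; nothing here has been run on real EEG.

```python
import math
from typing import Sequence, Tuple

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def train_head(errors: Sequence[float], labels: Sequence[int],
               lr: float = 0.1, steps: int = 2000) -> Tuple[float, float]:
    """Fit p(seizure | error) = sigmoid(w * error + b) by gradient descent
    on a class-weighted log loss, so the rare seizure class is not ignored."""
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    # Inverse-frequency weights: each class contributes equally overall.
    w_pos, w_neg = n / (2 * n_pos), n / (2 * n_neg)
    w, b = 0.0, 0.0
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(errors, labels):
            class_weight = w_pos if y == 1 else w_neg
            diff = class_weight * (sigmoid(w * x + b) - y)
            grad_w += diff * x
            grad_b += diff
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def predict_proba(error: float, w: float, b: float) -> float:
    return sigmoid(w * error + b)

# Made-up reconstruction errors: 7 healthy epochs, 2 seizure epochs.
errors = [0.01, 0.02, 0.015, 0.03, 0.02, 0.01, 0.025, 0.9, 1.2]
labels = [0, 0, 0, 0, 0, 0, 0, 1, 1]

w, b = train_head(errors, labels)
p_low = predict_proba(0.02, w, b)   # error typical of a healthy epoch
p_high = predict_proba(1.0, w, b)   # error typical of a seizure epoch
print(f"p(seizure | error=0.02) = {p_low:.3f}")
print(f"p(seizure | error=1.00) = {p_high:.3f}")
```

With only 15 seizure epochs, any real version of this would need careful cross-validation (ideally across subjects) to avoid overfitting the head.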
Follow along for updates on where this experiment heads next.
Open to more ideas, feedback, and access to good GPUs 🥲