[URGENT][Cybersecurity thread] ""soon-to-be-released AI models could enable a world-shaking cyberattack this year" [secure Your Healthcare Data]

AlexKChen · April 6, 2026, 1:58pm

There are lots of AI threads here, and there needs to be one devoted for cyberattacks b/c they may affect the safety of all your health data.

[they can also affect this forum and ALL discourse forums]. It’s important to be forward-seeking with ALL platforms you use and ask how secure they are from cyberattacks (Google/OpenAI/Claude/Microsoft/Apple/Wordpress/Substack/Twitter/etc)

[the moderator/sysadmin of your forum may get compromised at some point]

Also, this is urgent, and agents are susceptible

The internet is about to become a minefield for AI agents, and the success rate for attackers is 86%. Hidden prompt injections in HTML successfully commandeer agents in 86% of scenarios. Not in a lab. Not with custom exploits. Just instructions hidden in a webpage that the agent reads and the human never sees. And memory poisoning? It takes 0.1% contaminated data to permanently corrupt an agent’s knowledge base with 80%+ success rates. That means 1 bad document out of 1,000 rewrites everything the agent believes. DeepMind identifies six attack categories, each targeting a different layer of the agent stack: perception, reasoning, memory, action, multi-agent coordination, and the human supervisor. The co-author said every single category has documented proof-of-concept attacks. These aren’t theoretical. The scariest part is the systemic trap. DeepMind draws a direct line to the 2010 Flash Crash, where one automated sell order triggered a feedback loop that erased nearly $1 trillion in 45 minutes. Now imagine thousands of AI trading agents parsing the same fabricated financial report simultaneously. OpenAI admitted in December 2025 that prompt injection will probably never be completely solved. And yet every major lab is racing to ship agents with access to email, banking, and code execution. The entire agentic AI thesis assumes the information environment is neutral. This paper proves it can be weaponized at every layer, from the HTML the agent reads to the human who rubber-stamps its output. We’re building autonomous systems that trust the internet. The internet has never been trustworthy.

It’s been said by some that this year may mark the “death of the open internet”

“Improve your cogsec” [tirzepatide helps by reducing your input flows/noise…]

[also, trump, by invading iran, just increased the incentive for Iranian agents to launch cyberattacks]

AlexKChen · April 6, 2026, 2:02pm

Put your most important accounts behind passkeys or the strongest phishing-resistant MFA they support, starting with your email, password manager, Apple/Google/Microsoft account, and anything financial. Microsoft and Google describe passkeys as phishing-resistant, and CISA/FTC keep stressing MFA because stolen passwords alone should not be enough to get in.

Use a password manager and unique random passwords everywhere else. FTC explicitly recommends a password manager and notes that even strong passwords are vulnerable without a second factor.

Turn on automatic updates for your OS, browser, apps, phone, and router. Replace any end-of-life router instead of pretending it still has a future. NCSC says AI will shorten the time from vulnerability disclosure to exploitation, and the FBI has warned that obsolete routers are being compromised and used as criminal proxy infrastructure.

Lock down your home network. Use WPA3 Personal if available, otherwise WPA2 Personal, change the router admin password, and put sketchy IoT junk on a guest or separate network. FTC recommends WPA3/WPA2 for home Wi-Fi, and U.S. government home-network guidance recommends segmentation between primary, guest, and IoT networks.

Be stingy with AI-agent permissions. Do not let an agent freely read your inbox, browse random sites, click links, download files, or send data without confirmation unless you truly need that setup. OpenAI’s own guidance says dangerous actions and transmission of sensitive data should not happen silently, and NCSC says the right mindset is reducing impact even when manipulation succeeds.

Stop logging into important sites through search ads or surprise links. Bookmark payroll, bank, insurance, school, and government portals yourself. The FBI has specifically warned about criminals using search ads to impersonate legitimate employee self-service sites and steal credentials and money.

Assume urgent voice calls can be faked. Set a family codeword, hang up, and call back on a known number. FBI and FTC both warn that AI voice cloning makes emergency scams and impersonation scams much more believable.

Back up your files now, before the universe auditions you for a ransomware subplot. FTC advises regular backups, and that advice only gets more important as attacks get faster and more automated.

AlexKChen · April 6, 2026, 3:45pm

Altman isn’t wrong about the coming arms race in cybersecurity.

There’s always been a tit for tat dynamics between attackers, virus writers, zeroday researchers and the companies ability to respond.

Tit for tat game theory only works when one side doesn’t have asymmetric dominance capable of a speed and level of severity that makes the first strike fatal.

During the Cold War the concept of a 100% effective first strike weapon terrified war planners as the likely hood such a weapon would be deployed rose in relation to its projected effectiveness.

The same is happening in cyberwarfare and criminal attacks as one side is equipped with god like super attack tools and too many defences are based on (now/recently) outdated security infrastructure, easily defeated and hopeless outmatched.

Like the AI created icebreakers (hacking tools) from Neuromancer.

[the side that models strange loops better may have an advantage]

AlexKChen · April 7, 2026, 6:05pm

apparently websites are ‘hacking’ your agents by secretly adding an invisible “trap” within images words that are undetectable by humans:

website detects your ai is browsing then creates a visually identical page that contains poisoned prompts.

agents then perform illicit financial transactions or steal personal data.

hidden commands are buried within image pixels. agents “read it” and execute the malicious trap.

some attack vectors exceed 80% success rate (e.g. memory poisoning)

since every ai model is trained on 90% of the same data this means every model could be at risk.

this can be used to manipulate humans who created the agent. fake data → making human perform a dangerous action without even knowing

ai models are being used in government, military, science and other crucial sectors, we can’t afford the risk

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6372438

AlexKChen · April 7, 2026, 6:26pm

https://www.nytimes.com/2026/04/07/technology/anthropic-claims-its-new-ai-model-mythos-is-a-cybersecurity-reckoning.html

AlexKChen · April 7, 2026, 6:47pm

AlexKChen · April 7, 2026, 9:04pm

RELEVANT for these forums:

Discourse is open-source, it’s used by a lot of communities including some pretty important ones (Rust, Mozilla, many company support forums), and it has a real attack surface — it’s a Rails app with user auth, file uploads, admin panels, plugin system, email integration, websockets. All stuff that’s historically vuln-rich.

But the calculus is:

Discourse doesn’t clear the “critical infrastructure” bar Glasswing is drawing. The named partners are operating systems, browsers, cloud infrastructure, network hardware, financial systems. The unnamed 40+ are probably things like: major Linux distros, OpenSSL, Apache, nginx, PostgreSQL, maybe Kubernetes, Docker, systemd — the stuff where a single vuln can cascade across millions of systems simultaneously. The system card frames Glasswing around software where these vulnerabilities have survived decades of human review and millions of automated security tests.

Discourse is important community software but a Discourse zero-day compromises… individual forum instances. It’s not the same blast radius as a Linux kernel vuln or a browser RCE chain. You’re looking at data exfiltration from specific communities, maybe lateral movement if the forum server is poorly isolated, but not “complete control over millions of machines.”

The more interesting question your question implies:

Where’s the line? WordPress at 40% of the web is arguably critical infrastructure. Discourse at… maybe tens of thousands of installations isn’t. But what about:

phpBB? Older, huge install base, terrible security history

Nginx? Already probably in the 40+

cPanel/WHM? Runs a shocking percentage of web hosting

Let’s Encrypt / certbot? Compromise there would be catastrophic

npm / PyPI package registries? The xz-utils incident proved supply chain is the real attack surface

The Glasswing partner list reveals Anthropic’s threat model: they’re thinking about foundational infrastructure, not application-layer software. Which makes sense given finite resources and the IPO narrative. But the application layer is where most actual breaches happen in practice.

For Discourse specifically — Sam Saffron and the Discourse team do reasonably good security work, they have a bug bounty, they ship updates fast. If I were worried about forum software security in a post-Mythos world, I’d be much more worried about the phpBB and MyBB instances that haven’t been updated since 2019 and are sitting on shared hosting with root passwords of “admin123.” Mythos-class capability in attacker hands doesn’t need zero-days for those targets. It just needs to automate what pentesters already do manually, at scale.

The thing Discourse should probably do proactively: run their codebase through whatever AI-assisted security tooling becomes publicly available in the next 6-12 months as these capabilities trickle down from frontier models into commercial products. CrowdStrike and Palo Alto Networks are Glasswing partners — their products will presumably incorporate Mythos-derived techniques. That’s the realistic path for Discourse-tier projects: not direct Glasswing access, but benefiting from the downstream tooling that Glasswing partners build.

https://claude.ai/share/8ffef859-75e2-404d-a179-e0a73d939243

AlexKChen · April 8, 2026, 4:06am

AlexKChen:

compromised

feels like feb 2020 when we didn’t know if we should be wiping down our groceries (no) or if covid was airborne (yes). millions or billions of superhuman hacking-capable agents about to be launched at every piece of the internet at once. what’s good digital hygiene right now

Quote

Nick

@nickcammarata

·

3h

if every database is hacked this month and all my texts and dms come out i didn’t mean any of it. i was steering mythos. i was thinking far ahead. i knew exactly what words to say and they might seem weird but they were all necessary for making things go well

9:27 PM · Apr 7, 2026

·

36.4K
Views

Relevant

View quotes

Nick

@nickcammarata

·

2h

i think deleting old emails and messages and keeping them on some harddrive instead is maybe reasonable? having my ancient emails feels… mildly helpful, downside its hard for me to even estimate, possibly really large. idk whats in 25 thousand emails

Nick

@nickcammarata

·

2h

i’m not sure whether gmail’s databases or our personal gmails getting hacked is a bigger worry, for instance. multi factor auth and stuff only helps with the latter

===

Nick

@nickcammarata

if every database is hacked this month and all my texts and dms come out i didn’t mean any of it. i was steering mythos. i was thinking far ahead. i knew exactly what words to say and they might seem weird but they were all necessary for making things go well

8:52 PM · Apr 7, 2026

·

Relevant

·

erica my sixteen year old crush if you ever find my journal, those weren’t cringe teenage fantasies. that was prompt engineering 15 years early. i was saving the world

Nick

@nickcammarata

·

3h

i have absolutely nothing to hide and unrelatedly will be installing eight-factor authentication on every internet platform i have ever used tonight

@mrgunn

·

1h

I know you’re trying to pretend this is a joke, but, really, just superhuman levels of cope. How can a guy simultaneously be so enlightened and so compartmentalized at the same time?

Nick

@nickcammarata

·

1h

huh i’ve worked full time on ai safety for a ~decade now

AlexKChen · April 8, 2026, 4:10am

Depends on what scopes you granted when you authorized them, and most people have no idea what they approved because they clicked “Allow” without reading.

When you do the Google OAuth flow — “Sign in with Google” or “App X wants access to your Google account” — there’s a permissions screen that lists what the app is requesting. Those scopes range from:

Minimal: “See your basic profile info (name, email)” — basically harmless, just authentication

Moderate: “See and download your contacts,” “View your calendar events” — not great but not catastrophic

Extensive: “Read, compose, send, and permanently delete all your email” — yes, this is a real scope that apps can request, and yes, people approve it

The Gmail API has granular scopes including gmail.readonly (read all your email), gmail.modify (read, write, and modify but not delete), gmail.compose (send email as you), and the full mail.google.com scope which is basically complete access to everything.

What’s probably sitting in your authorized apps right now:

Old email clients you tried once in 2018 and forgot about — many request full mail access

Productivity tools (Notion, Trello, Asana integrations) — often request calendar and sometimes mail

“Sign in with Google” on random websites — usually just basic profile, but not always

CRM or newsletter tools — sometimes request contact and mail access

Browser extensions — some request terrifyingly broad permissions

AI tools and assistants — many of the newer ones request mail access for “help manage your inbox” features

Go look right now: https://myaccount.google.com/permissions

You will probably be surprised by how many apps are listed and how broad some of the permissions are. Anything that says “Has access to Gmail” or “Can read your email” or similar — if you’re not actively using it right now, revoke it immediately.

The scary scenario: a small app developer authorized years ago gets compromised, their OAuth client secret gets stolen, and the attacker uses the existing authorized tokens to read your email. You never get a notification because the access was already authorized. No password needed. No 2FA bypass needed. The token is already valid.

This is actually one of the more realistic Mythos-era attack vectors — not because Mythos breaks Google’s infrastructure, but because Mythos automates the process of: identify small app developers with Google OAuth integrations → compromise their infrastructure → use their client credentials to access all their users’ authorized Google data. It’s the Discord bot ecosystem problem applied to the entire OAuth ecosystem.

AlexKChen · April 8, 2026, 4:41am

Eliezer Yudkowsky

@allTheYud

In conclusion: This is perhaps a good time to try making an extra backup of all your online data (eg, via Google Takeout) onto an airgapped offline hard drive, just in case Project Glasswing fails to prevent the First Great AI Security Meltdown.

AlexKChen · April 8, 2026, 4:46am

critter

@BecomingCritter

·

5h

How should people protect themselves in a world with Mythos-tier agents? Where do you put your money and how do you protect your information?

@tenobrus

Tenobrus

@tenobrus

·

5h

i really really don’t know. having local access to critical info on hard drives you control is good in case cloud infra is taken down. probably crypto is a significantly more dangerous place to have money than banks, 0days in everything means wallet hacks and key leaks.

critter

@BecomingCritter

·

5h

diversifying my portfolio by hiding half of it in my mattress

critter

@BecomingCritter

having a password manager seems like a big vulnerability

7:55 PM · Apr 7, 2026

·

6,892
Views

Relevant

View quotes

Replying to @BecomingCritter and @tenobrus

scoopdiddyoop

@scoopdiddy1

·

4h

I think not having a password manager is a bigger vulnerability. Paper might be best (as it already has been), but pass has pretty stupid simple security guarantees reliant on GPG, and Mythos patches will likely float to GPG anyway (and if the crypto breaks, shiiiiiiit)

critter

@BecomingCritter

·

4h

in my mind the single biggest vulnerability is probably my phone, followed by my password manager

Ryan McWhorter

@rmcwhorter99

·

1h

ah shit

AlexKChen · April 8, 2026, 4:48am

Siméon

@Simeon_Cps

·

9h

Carlini, one of the world best AI security researchers: “I’ve found more bugs in the last few weeks with Mythos than in the rest of my entire life combined”

===

Matthew Berman

@MatthewBerman

Soon, every piece of software in the world will have their vulnerabilities exposed. And then shortly after, no software will have vulnerabilities.

Quote

Theo - t3.gg

@theo

·

7h

I would highly, highly recommend you make sure your phone, computer, browser, and important apps are all updated and on the latest versions.

7:10 PM · Apr 7, 2026

·

19.7K
Views

AlexKChen · April 8, 2026, 2:57pm

https://x.com/g_leech_/status/2041805684400849095

AlexKChen · April 8, 2026, 6:19pm

Imagine waking up tomorrow to learn that every photo you ever took was… gone. Forever. Every video, gone Every email, gone Every document, gone Every memory, gone Backup your data! WHY: Claude Mythos could be COVID, but for software Anthropic now has zero-days - powerful exploits - into all major OSes and browsers And if this model - or another upcoming model release - leaks and falls into the wrong hands, it could be catastrophic.