AI Wire · Thursday, May 14, 2026

Critical cyber vulnerabilities & active threat campaigns

Patch Tuesday landed alongside a brutal vuln cycle. Exim's "Dead.Letter" (CVE-2026-45185) lets a single plaintext byte after a TLS close_notify mid-BDAT corrupt GnuTLS heap metadata in versions 4.97–4.99.2; XBOW calls it one of the highest-caliber bugs they've ever seen (@thehackersnews). An 18-year-old NGINX flaw dubbed "NGINX Rift" (CVE-2026-42945) gives unauthenticated attackers RCE/DoS on both Plus and OSS builds (@thehackersnews), and "Fragnesia" (CVE-2026-46300) is the third Linux LPE in two weeks, with a public PoC corrupting kernel page cache via XFRM ESP-in-TCP (@thehackersnews).

Microsoft shipped 138 fixes, 30 Critical, spanning DNS, Netlogon, Azure, Dynamics 365 and Hyper-V (@thehackersnews). Notably, 16 of those Windows bugs — including 4 critical RCEs in the TCP/IP kernel and IKEv2 VPN — were surfaced by Microsoft's new MDASH AI swarm of 100+ debating agents (@thehackersnews). Active-threat side: China-linked FamousSparrow hit an Azerbaijani O&G firm via reused Exchange entry points to drop Deed RAT and TernDoor between Dec 2025 and Feb 2026 (@thehackersnews); GemStuffer abused 150+ RubyGems to exfiltrate UK council ModernGov scrapes (@thehackersnews); and ConsentFix v3 automates Microsoft account hijacks via ClickFix + OAuth consent phishing, bypassing MFA/passkeys (@thehackersnews). On the defender side, Google rolled out opt-in tamper-proof "Intrusion Logging" in Android 16 for journalists and activists (@thehackersnews), while Mandiant clocks mean-time-to-exploit at −7 days against Verizon's 32-day median edge remediation (@thehackersnews).
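A back-of-envelope reading of those last two numbers, as a minimal sketch. It assumes both figures are measured from the same reference point (taken here to be patch availability) and that a mean from one report can be set against a median from another, neither of which the source confirms. The function and parameter names are illustrative only.

```python
def exposure_window_days(time_to_exploit: float, time_to_remediate: float) -> float:
    """Days between first exploitation and remediation.

    Both inputs are days relative to the same reference date (assumed here to
    be patch availability); a negative time_to_exploit means exploitation
    came before that date.
    """
    return time_to_remediate - time_to_exploit


# Figures quoted above: -7 days (Mandiant) and 32 days (Verizon).
window = exposure_window_days(time_to_exploit=-7, time_to_remediate=32)
print(window)  # 39
```

Under those assumptions, a typical edge device would sit exploitable for roughly 39 days, which is the gap the two reports are pointing at.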

Frontier AI cyber capabilities — Mythos, AISI, METR

The UK AI Security Institute and XBOW independently validated Claude Mythos Preview as a step-change in autonomous cyber capability, with Mythos becoming the first model to solve both AISI cyber ranges end-to-end, including the previously unsolved "Cooling Tower" (@bcherny). AISI puts cyber-task doubling time at ~4.5 months, with token budgets, not capability ceilings, as the limiter for Mythos and GPT-5.5 (@emollick, @garymarcus); open-source coverage also points to OpenAI's parallel "Daybreak" frontier-AI-for-defenders effort (last30days, openai.com). Ethan Mollick notes the trajectory aligns with METR's task-horizon results, though Gary Marcus pushes back that METR's chart should plot the 50/80/90/100% accuracy thresholds directly rather than behind tabs (@garymarcus).
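To put the ~4.5-month doubling time in perspective, here is a minimal sketch of what it implies if the trend were a clean exponential. That is an assumption for illustration, not something AISI's figure guarantees, and the function name is hypothetical.

```python
def growth_factor(months: float, doubling_months: float = 4.5) -> float:
    """Multiplier after `months`, assuming steady doubling every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


# Print the implied multiplier at a few horizons.
for horizon in (4.5, 9, 12, 24):
    print(f"{horizon:>4} months -> x{growth_factor(horizon):.1f}")
```

On that assumption, one year works out to roughly a 6.3× increase and two years to roughly 40×.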

The policy debate is heating up in parallel. Roon (@tszzl) flagged OpenAI endorsing Illinois' SB315 and KOSA, casting it as the first frontier-AI audit requirement and a pivot toward state-level consistency. Boris Cherny says Anthropic is rushing Mythos to defenders via Glasswing (@bcherny), but Marcus argues that the fact that Anthropic "has no plans to roll out Mythos broadly" amounts to IPO-flavored gatekeeping (@garymarcus), and Mollick wonders how Anthropic exits the government-approval lane once Google and OpenAI ship equivalents (@emollick).

AI coding tool war: Claude Code pricing shake-up vs Codex enterprise blitz

Anthropic bumped Claude Code weekly limits by 50% through July 13 across Pro/Max/Team/Enterprise (@bcherny, @claudedevs), and from June 15 will split paid plans into interactive usage (subscription limits, unchanged) and programmatic usage, the latter drawing on a new $20–$200 credit metered at API rates and covering the Agent SDK, claude -p, GitHub Actions, and third-party apps (@jeremyphoward, @claudedevs). Jeremy Howard and Theo argue this quietly cuts effective usage 25× for anyone routing through T3 Code, Conductor, Zed, Jean or CI scripts, "disguising this as 'free credits'" (@jeremyphoward). Polymarket has Claude 5 release odds down 14.2% on the month and Mythos down 15.5% (last30days, polymarket.com), suggesting traders are skeptical of near-term shipping. Sam Altman countered with two free months of Codex for enterprises that switch in the next 30 days, pitching Codex as "the best AI coding product" (@sama, @openai).

Novel LLM training methods & new research papers

Nous Research dropped two pretraining recipes: Lighthouse Attention, a removable subquadratic wrapper around SDPA that compresses/decompresses Q/K/V hierarchically and reverts to vanilla attention near the end of training (@rasbt), and Token Superposition Training, claiming 2–3× wall-clock speedup at matched FLOPs by averaging contiguous token-bag embeddings for the first third of training (@jeremyphoward). Jonas Geiping makes the case that ChatGPT-shaped message exchanges bottleneck even smart agents, pitching multi-stream LLMs that can read while writing (@jeremyphoward). On arXiv: SenseNova-U1's NEO-unify multimodal architecture, RubricEM's rubric-guided meta-RL, NVIDIA AnyFlow (any-step text-to-video on Hugging Face), Apple's on-policy distillation analysis, CausalCine, and EgoMemReason (@_akhaliq). Marcus relays LeCun on world models being a prerequisite for reliable agents and Noam Brown's "intelligence is a function of inference compute", with the caveat that humans run on 20 watts (@garymarcus).
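For readers wondering what "averaging contiguous token-bag embeddings" might look like mechanically, here is a toy sketch in plain Python. It is not Nous Research's implementation: the bag size, the handling of a short final bag, the exact schedule boundary, and both function names are assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def average_bags(embeddings: List[Vector], bag_size: int) -> List[Vector]:
    """Average each contiguous run of `bag_size` token embeddings into one vector.

    Assumption: a final bag shorter than `bag_size` is averaged over the
    tokens it actually contains.
    """
    if bag_size < 1:
        raise ValueError("bag_size must be >= 1")
    bags: List[Vector] = []
    for start in range(0, len(embeddings), bag_size):
        bag = embeddings[start:start + bag_size]
        dim = len(bag[0])
        bags.append([sum(vec[i] for vec in bag) / len(bag) for i in range(dim)])
    return bags


def uses_superposition(step: int, total_steps: int) -> bool:
    """True during the first third of training (boundary handling is an assumption)."""
    return 3 * step < total_steps


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(average_bags(tokens, bag_size=2))  # [[2.0, 3.0], [5.0, 6.0]]
```

The point of the sketch is only that the model sees a shorter sequence early on (here three tokens become two vectors), which is where a wall-clock saving could plausibly come from.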

Open-source & local AI momentum

With Trump meeting Xi, Clement Delangue called on US AI to publicly back open international weights — DeepSeek, Qwen, Kimi, GLM (@clementdelangue). GLM 5.1 is leading the Artificial Analysis intelligence index over closed peers per Merve Noyan (@aidotengineer), Qwen3-35B-A3B now runs 24/7 on a laptop via llama.cpp + Unsloth 4-bit quants (@huggingface), and llama-eval is shaping up as a community eval standard (@huggingface). Resemble released DramaBox, a voice model with verifiable signatures (@huggingface); Poolside is running a London hackathon May 29–30 with NVIDIA/Prime Intellect/HF around Laguna XS.2 (@clementdelangue).

Consumer AI products, agentic OS & societal effects

Android is baking MCP into the OS via a new @AppFunction annotation for on-device cross-app agent chaining (@_philschmid). Meta launched Muse Spark Voice, live-camera AI, and Incognito Chat across WhatsApp and the Meta AI app, with Alexandr Wang pressing WhatsApp head Will Cathcart on how private it actually is (@alexandr_wang). A Figure-style humanoid team ran an autonomous 8-hour shift at human productivity on Helix-02 per Mollick, who argues we're past the waitbutwhy "you are here" elbow (@emollick). The counter-signal from Gary Marcus: AFP profiles AI-induced psychosis support groups, HBR documents "AI brain fry" in 1,500 workers, a Nature paper shows state-controlled media disproportionately shapes LLM outputs, and the FT reports Amazon staff manufacturing token usage to look AI-productive (@garymarcus).

The Bottom Line

It was a brutally heavy day for security — three Linux LPEs in two weeks, an 18-year NGINX RCE, a 138-bug Patch Tuesday partly discovered by AI agents — landing exactly as AISI/XBOW certify Mythos as a real cyber step-change and OpenAI pushes a defender-side counterpart. Underneath the headlines, the AI coding economy is repricing in public (Anthropic's interactive/programmatic split vs OpenAI's 2-month Codex giveaway) while open-weights and pretraining-efficiency research keep eroding the closed-model moat.

Dispatch № 22 · Filed Thursday at dawn from Pensive — a second-brain publication.
Set in Bevan, Old Standard TT, Cormorant Garamond & Courier Prime.
