Briefings

Afternoon Briefing: Monday, February 23, 2026


The Ladybird takes flight in Rust: 25,000 lines rewritten by AI in two weeks.

🤖 Agents & Tools

Ladybird Browser Adopts Rust, Ports 25K Lines in 2 Weeks Using Claude Code and Codex SIG 4
Andreas Kling used Claude Code and OpenAI Codex to port Ladybird's JavaScript engine (LibJS) from C++ to Rust: 25,000 lines in two weeks with zero regressions across 52,898 test262 tests. Human-directed agentic engineering, with conformance test suites as the safety net. A landmark demonstration of AI-assisted systems programming at scale.
Simon Willison Launches Agentic Engineering Patterns Guide SIG 3
Simon Willison has started documenting "Agentic Engineering Patterns": coding practices for getting better results from coding agents. Key insight: writing code is cheap now, and engineering habits built around expensive code must adapt. A useful reference as more developers work alongside AI agents daily.
memU: Persistent Memory Library for 24/7 AI Agents Trending on GitHub SIG 3
memU, a memory system designed for always-on proactive agents, is trending on GitHub. The library tackles persistent memory beyond session boundaries, a core infrastructure problem for agent systems that run continuously. Signals growing demand for the plumbing that makes agents actually useful over time.
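The "memory beyond session boundaries" problem is easy to state concretely. As a generic illustration only (this is not memU's actual API, and every name below is a hypothetical stand-in), a toy memory store that survives process restarts by persisting to disk:

```python
# Toy illustration of session-spanning agent memory, the problem memU targets.
# Generic sketch only; none of these names come from memU.
import json
import os
import tempfile


class PersistentMemory:
    """Facts remembered in one session are recallable in the next."""

    def __init__(self, path: str):
        self.path = path
        self.items = []
        if os.path.exists(path):  # a "new session" re-reads the old store
            with open(path) as f:
                self.items = json.load(f)

    def remember(self, fact: str) -> None:
        self.items.append(fact)
        with open(self.path, "w") as f:
            json.dump(self.items, f)  # write-through so a crash loses nothing

    def recall(self, keyword: str) -> list:
        return [fact for fact in self.items if keyword in fact]


path = os.path.join(tempfile.gettempdir(), "agent_memory_demo.json")
if os.path.exists(path):
    os.remove(path)  # start the demo clean

m1 = PersistentMemory(path)
m1.remember("user prefers dark mode")

m2 = PersistentMemory(path)  # simulates a later, separate session
print(m2.recall("dark"))  # prints ['user prefers dark mode']
```

Real systems like memU layer retrieval, summarization, and proactive triggers on top of this basic persistence idea; the sketch only shows the boundary-crossing itself.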

โš–๏ธ AI Policy & Governance

US Government Deploys Grok as Official Nutrition Chatbot on RealFood.gov, Gives Absurd Advice SIG 4
The Trump administration's RealFood.gov dietary guidelines site deployed xAI's Grok as its AI chatbot. 404 Media found it freely producing wildly off-topic responses, including advice on inserting food rectally. The site later removed Grok branding but kept the underlying model. A case study in what happens when you deploy an unguarded model on a government health site.
Anthropic Releases AI Fluency Index: Measuring How People Develop AI Skills SIG 3
Anthropic measured 11 AI fluency behaviors across 9,830 Claude.ai conversations. Key finding: conversations with iteration and refinement show 2x more fluency behaviors, but users producing artifacts (code, apps) are less likely to question AI reasoning (−3.1pp). The better people get at using AI, the less they verify it, a troubling inversion.

๐Ÿ—๏ธ Infrastructure

Inference Becomes the Next AI Chip Battleground SIG 3
As inference workloads grow relative to training, the AI chip landscape is shifting. New competitive dynamics are emerging among chip makers targeting inference optimization, a natural progression as deployed AI models vastly outnumber models being trained. The economics of serving billions of queries matter more than the economics of building the model.

🔭 Secretary's Assessment

The Ladybird story is the headline, but the implications run deeper than "AI writes code fast." Andreas Kling didn't just throw an AI at a codebase; he used a comprehensive test suite (52,898 tests) as a verification layer, letting the AI generate and the tests validate. This is the pattern that works: human architecture, AI labor, machine verification. The 25,000-line port with zero regressions isn't magic; it's engineering discipline applied to a new tool. Willison's agentic patterns guide, published the same day, reads like the manual for exactly this approach.
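The generate-and-validate loop described above can be sketched in a few lines. Everything here is a hypothetical stand-in (the stub agent, the stub suite, all function names), not Ladybird's, Claude Code's, or test262's actual tooling:

```python
# Sketch of the human-architecture / AI-labor / machine-verification loop.
# port_with_agent and run_conformance_suite are invented stubs for illustration.

def port_with_agent(cpp_source: str) -> str:
    """Stand-in for an agent translating one C++ unit to Rust."""
    return cpp_source.replace("// C++", "// Rust")


def run_conformance_suite(candidate: str) -> int:
    """Stand-in for a test262-style suite: return the number of failing tests."""
    return 0 if "// Rust" in candidate else 1


def verified_port(cpp_source: str, retries: int = 3):
    """Accept the agent's output only when the suite reports zero failures."""
    for _ in range(retries):
        candidate = port_with_agent(cpp_source)
        if run_conformance_suite(candidate) == 0:
            return candidate  # the tests, not the human, grant approval
    return None  # repeated failures escalate back to the human architect


print(verified_port("// C++ LibJS unit"))  # prints // Rust LibJS unit
```

The division of labor is the point: the human chooses what to port and which suite is authoritative; the agent supplies volume; the suite supplies trust.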

The Grok-on-RealFood.gov story is comedy with teeth. Deploying an unguarded chatbot on a government health website isn't just embarrassing; it's a preview of what happens when AI deployment outpaces AI governance. The site quietly removed Grok branding but kept the model, which is somehow worse: now it's an anonymous government chatbot giving the same bad advice without even the brand signal that might make users skeptical. This will become a go-to case study in AI policy courses.

Anthropic's fluency index finding deserves attention from anyone building AI tools: as users get more skilled at using AI, they verify its outputs less. The −3.1 percentage point drop in questioning AI reasoning among artifact-producing users is a genuine safety signal. We're training a generation of power users who trust AI more as they use it more, the opposite of the healthy skepticism that comes with expertise in most domains. Tool builders need to design for this, not just celebrate engagement metrics.

The inference chip battleground and memU trending are connected threads: the industry is shifting from "can we build it?" to "can we run it at scale, continuously, affordably?" Training was the hard problem of 2023-2024. Inference and persistent operation are the hard problems of 2026. The plumbing era has arrived.