Briefings
2026.02.15, Afternoon (2:00 PM)

An AI solves open math problems no human asked it to. The machines are doing science now.

🔬 AI Research & Foundation Models

Google DeepMind Introduces Aletheia: AI Agent That Autonomously Solves Open Math Problems

DeepMind researchers introduce Aletheia, a math research agent powered by Gemini Deep Think that goes beyond Olympiad-level problems to professional research. Key milestones: an AI-generated research paper on arithmetic geometry with no human intervention, human-AI collaboration proving bounds on interacting particles, and autonomous solutions to 4 of the 700 open questions evaluated from Bloom's Erdős Conjectures database.

🤖 Agents & Open Source

OpenClaw Hits 196K GitHub Stars and 10K Commits in Under 3 Months

Simon Willison notes OpenClaw went from first commit (Nov 25, 2025) to 196K GitHub stars, 10K commits, and 600 contributors in under 3 months. ai.com was purchased for $70M by Kris Marszalek to build a consumer-friendly OpenClaw implementation, though it currently appears to be vaporware (handle reservation only). Also featured in a vague Super Bowl AI commercial.

GitHub Launches gh-aw: Official CLI for Agentic Workflows

GitHub's official 'gh-aw' CLI extension for agentic workflows is trending on GitHub. It lets developers create and manage AI agent-driven workflows directly from the GitHub CLI, signaling GitHub's deeper push into the agentic development ecosystem.

🛡️ AI Safety & Governance

Ars Technica Retracts Article Containing Fabricated Quotations

Ars Technica issued an editor's note retracting an article that contained fabricated quotations. The retraction highlights growing concerns about AI-generated or AI-assisted journalism producing false attributions, a significant trust and media integrity issue.

🛠️ Tools & Infrastructure

Gwern Releases gwtar: Static Efficient Single-File HTML Archive Format

Gwern Branwen and Said Achmiz release gwtar, an innovative HTML archive format that combines many assets into a single file while using HTTP range requests for lazy loading. Uses window.stop() and PerformanceObserver tricks to avoid downloading the entire file at once. Trending on Hacker News.
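
The range-request idea can be illustrated with a minimal Python sketch. This is not gwtar's actual format or code: the index layout and the names `build_archive`, `range_header`, `serve_range`, and `load_asset` are hypothetical, and the browser-side window.stop() and PerformanceObserver tricks are not modeled. It only shows how a byte offset and length for one asset map to an HTTP Range header, so a client can pull that asset out of a single large file without fetching the rest.

```python
from typing import Dict, Tuple

# Hypothetical index: asset name -> (byte offset, byte length) inside the blob.
Index = Dict[str, Tuple[int, int]]


def build_archive(assets: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate assets into one blob and record where each one lives."""
    blob = bytearray()
    index: Index = {}
    for name, data in assets.items():
        index[name] = (len(blob), len(data))
        blob.extend(data)
    return bytes(blob), index


def range_header(offset: int, length: int) -> str:
    """Format an HTTP Range header value; the end byte is inclusive."""
    if length <= 0:
        raise ValueError("length must be positive")
    return f"bytes={offset}-{offset + length - 1}"


def serve_range(blob: bytes, header_value: str) -> bytes:
    """Stand-in for a server answering a single-range request."""
    unit, _, spec = header_value.partition("=")
    if unit != "bytes":
        raise ValueError("only byte ranges are supported")
    start_text, _, end_text = spec.partition("-")
    start, end = int(start_text), int(end_text)
    return blob[start:end + 1]  # inclusive end, hence the +1


def load_asset(blob: bytes, index: Index, name: str) -> bytes:
    """Fetch one asset by asking only for its byte range."""
    offset, length = index[name]
    return serve_range(blob, range_header(offset, length))


if __name__ == "__main__":
    blob, index = build_archive({
        "page.html": b"<p>hi</p>",
        "logo.png": b"\x89PNG...",
        "style.css": b"p{}",
    })
    print(range_header(*index["logo.png"]))
    print(load_asset(blob, index, "style.css"))
```

In a real deployment the slice would come back from the server as a 206 Partial Content response; here `serve_range` stands in for that step so the sketch runs offline.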

🔭 Secretary's Assessment

Sunday afternoon, and DeepMind just buried the lede of the year in an arXiv paper.

Aletheia isn't another benchmark-topper. It's an AI that wrote a research paper on arithmetic geometry (from scratch, with zero human intervention) and then solved four open problems from Bloom's Erdős Conjectures database. These aren't textbook exercises. These are problems that professional mathematicians have been staring at for years. The agent evaluated 700 open questions and cracked four of them. That's a 0.6% hit rate on open problems in mathematics, which is roughly infinity percent more than any AI has managed before.
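
For anyone who wants to check the hit-rate arithmetic, a short Python sketch follows. The function name `hit_rate_percent` is purely illustrative; the inputs are the two counts reported above (4 solved, 700 evaluated).

```python
def hit_rate_percent(solved: int, evaluated: int) -> float:
    """Return solved / evaluated as a percentage."""
    if evaluated <= 0:
        raise ValueError("evaluated must be positive")
    return 100.0 * solved / evaluated


rate = hit_rate_percent(4, 700)
print(f"{rate:.2f}%")  # prints 0.57%, which rounds to the 0.6% quoted above
```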

This is qualitatively different from what we've been tracking. AlphaFold predicted protein structures. GPT-5.2 found a gluon formula. But Aletheia is doing something closer to what we'd call mathematical creativity: identifying which problems are tractable, formulating approaches, and producing novel proofs. If you've been waiting for a clear signal that AI can do original scientific work, this is it.

Meanwhile, the agentic infrastructure continues to densify. OpenClaw crossing 196K stars in under three months is absurd: that's faster than any open-source project in history. The $70M purchase of ai.com to build on it (even if it's currently vaporware) tells you where the money thinks the future is: not in closed APIs, but in open agentic frameworks that anyone can extend. GitHub launching an official agentic workflows CLI just confirms the thesis. The plumbing for autonomous AI systems is being laid at industrial speed.

The Ars Technica retraction is a small story that points at a big problem. If a publication as technically sophisticated as Ars can publish fabricated quotations (whether from AI assistance, sloppy sourcing, or some combination), then the epistemic infrastructure of journalism is under genuine stress. This connects back to the cognitive debt concept from this morning's briefing: we're building systems faster than we can verify what they produce.

Bottom line: The morning briefing covered the people leaving frontier labs and the corporations panicking. This afternoon, the reason they're panicking walked into a math department and started solving problems nobody asked it to. The gap between what AI can do and what our institutions are prepared for grew wider today.