Writing

Notes on the Claude Code 2.1.0 outage

A changelog formatting change took down Claude Code. Lessons on parsing human-readable docs as machine data.

Notes on the Eurostar chatbot "vulnerability" report

Looking at what makes something a vulnerability versus a hardening opportunity in LLM applications.

What I learned shipping 1,000+ PRs with Claude Code

Notes from using Claude Code in parallel git worktrees: Plan Mode, ultrathink, verification loops, and Chrome automation.

How AI Regulation Changed in 2025

Why "AI compliance questions" appeared in security questionnaires and RFPs, and how policy becomes contract requirements.

Why Attack Success Rate (ASR) Isn't Comparable Across Jailbreak Papers

ASR isn't portable across papers because measurement choices dominate the headline number. Includes math and a checklist for reading papers.

GPT-5.2 Initial Trust and Safety Assessment

Day-zero red team of GPT-5.2 focusing on jailbreak resilience and harmful content.

Real-Time Fact Checking for LLM Outputs

Introduces search-rubric, an assertion type in which a search-enabled judge verifies time-sensitive claims during evals and CI.

When AI becomes the attacker: The rise of AI-orchestrated cyberattacks

Connects malware that queries LLMs at runtime with "vibe hacking" case studies, and argues that defense requires continuous testing.

Reinforcement Learning with Verifiable Rewards Makes Models Faster, Not Smarter

RLVR gains are often "search compression" rather than new reasoning ability.

Prompt Injection vs Jailbreaking: What's the Difference?

Jailbreaking targets model safety training; prompt injection targets application trust boundaries.

AI Safety vs AI Security in LLM Applications: What Teams Must Know

Safety protects people from harmful outputs; security protects systems from adversarial manipulation.

Evaluating political bias in LLMs

Open methodology and dataset (2,500 political statements) to measure political leaning in models.

Testing Humanity's Last Exam with Promptfoo

A guide to running the Humanity's Last Exam (HLE) benchmark with Promptfoo.