• You're Not Above the Loop. You're Building It.

    Human-above-the-loop was the right framing earlier in 2026. It is already incomplete. The job was never to approve the agent's work or even to direct it. The job is to design the mechanism that does the directing.

  • The Human Was Always the Next Ceiling.

    Background agents removed the laptop ceiling. The human approval gate is the next one. Machines move at machine speed. Humans sleep. You cannot govern asynchronous systems with biological rhythms. Here is what to do about it.

  • The Session Was Always the Ceiling.

    Your laptop was never the bottleneck. Your session was. The next wave of agents does not run on your machine. It runs in the cloud, fires on events, and finishes the work while you sleep. Here is what the teams who built it first learned.

  • Route the Intelligence, Not Just the Context.

    You built the harness. Now you're calling a frontier model for everything. The same model that writes a novel handles a spell check. The problem isn't the model. It's that you never asked whether the task needed it.

  • Building the Control Layer: How Agent Harnesses Make AI Reliable

    The model is not the system. The harness that wraps it, manages its memory, and enforces its boundaries is the system. Here is what that looks like when you actually build it.

  • The Hard Part Isn't the AI.

    Three posts explored the pieces. Context-aware redaction. Hierarchical navigation. Recursive language models. This post shows what happens when you assemble them into a production pipeline that processes documents end to end.

  • Code Got Cheap. Judgment Did Not.

    Code used to be scarce. Now agents generate thousands of lines in seconds for the cost of a few API tokens. The cost of writing code collapsed. What remains scarce is the judgment to direct it.

  • Three People. Ten Agents. Zero Sprints.

    A twelve-person sprint team shipped one feature in two weeks. Three people with ten agents shipped the same feature by Wednesday. The difference is not productivity. It is physics.

  • The Agents Work. The Organization Does Not.

    80% of enterprise AI initiatives fail. Not because the models are weak. Because the organization was never redesigned to run them. Here is what the research shows about managing an agentic workforce, and why the window to get it right is shorter than you think.

  • The Bottleneck Moved. Most Teams Have Not.

    Adding more agents makes systems worse. Flat teams fail. The bottleneck has shifted from writing code to knowing what to build. Here is what the research actually shows, and what it means for how you build.

  • MIT Gave the Model a Python Interpreter. The Results Are Hard to Ignore.

    MIT's Recursive Language Models reframe long-context reasoning. Instead of forcing a model to read everything, the model writes code to interrogate the corpus. The benchmark results are strong and the architecture is sound, but deploying this safely requires controls the paper does not specify.

  • Beyond the Million-Token Window: Why Context Capacity Isn't Context Intelligence

    RAG defined system design in 2025. In 2026, million-token context windows are shifting the paradigm, but scale doesn’t equal reasoning. It amplifies failure modes. Here’s a framework for using large contexts effectively.

  • Procedure Over Intelligence: Building Reliable AI Systems

    Intelligence without systematic workflows is just noise. Learn how Agent Skills encode organizational expertise to make AI agents reliable, reproducible, and trustworthy at scale.

  • Context Matters When Redacting Health Records for AI Analysis

    Standard PII redaction tools destroy clinical utility. Learn how context-aware recognition preserves healthcare provider names while protecting patient privacy.

  • Welcome

    A place to build, experiment, and think through problems with data and AI systems.