Karim Bhalwani Blog

Jun 26, 2026 You're Not Above the Loop. You're Building It.
Human-above-the-loop was the right framing earlier in 2026. It is already incomplete. The job was never to approve the agent's work or even to direct it. The job is to design the mechanism that does the directing.
Jun 14, 2026 The Human Was Always the Next Ceiling.
Background agents removed the laptop ceiling. The human approval gate is the next one. Machines move at machine speed. Humans sleep. You cannot govern asynchronous systems with biological rhythms. Here is what to do about it.
May 29, 2026 The Session Was Always the Ceiling.
Your laptop was never the bottleneck. Your session was. The next wave of agents does not run on your machine. It runs in the cloud, fires on events, and finishes the work while you sleep. Here is what the teams who built it first learned.
May 10, 2026 Route the Intelligence, Not Just the Context.
You built the harness. Now you're calling a frontier model for everything. The same model that writes a novel handles a spell check. The problem isn't the model. It's that you never asked whether the task needed it.
Apr 26, 2026 Building the Control Layer: How Agent Harnesses Make AI Reliable
The model is not the system. The harness that wraps it, manages its memory, and enforces its boundaries is the system. Here is what that looks like when you actually build it.
Apr 12, 2026 The Hard Part Isn't the AI.
Three posts explored the pieces. Context-aware redaction. Hierarchical navigation. Recursive language models. This post shows what happens when you assemble them into a production pipeline that processes documents end to end.
Apr 3, 2026 Code Got Cheap. Judgment Did Not.
Code used to be scarce. Now agents generate thousands of lines in seconds for the cost of a few API tokens. The cost of writing code collapsed. What remains scarce is the judgment to direct it.
Mar 19, 2026 Three People. Ten Agents. Zero Sprints.
A twelve-person sprint team shipped one feature in two weeks. Three people with ten agents shipped the same feature by Wednesday. The difference is not productivity. It is physics.
Mar 13, 2026 The Agents Work. The Organization Does Not.
80% of enterprise AI initiatives fail. Not because the models are weak. Because the organization was never redesigned to run them. Here is what the research shows about managing an agentic workforce, and why the window to get it right is shorter than you think.
Mar 1, 2026 The Bottleneck Moved. Most Teams Have Not.
Adding more agents makes systems worse. Flat teams fail. The bottleneck has shifted from writing code to knowing what to build. Here is what the research actually shows, and what it means for how you build.
Feb 22, 2026 MIT Gave the Model a Python Interpreter. The Results Are Hard to Ignore.
MIT's Recursive Language Models reframe long-context reasoning. Instead of forcing a model to read everything, the model writes code to interrogate the corpus. The benchmark results are strong and the architecture is sound, but deploying this safely requires controls the paper does not specify.
Feb 7, 2026 Beyond the Million-Token Window: Why Context Capacity Isn't Context Intelligence
RAG defined system design in 2025. In 2026, million-token context windows are shifting the paradigm, but scale doesn’t equal reasoning. It amplifies failure modes. Here’s a framework for using large contexts effectively.
Jan 26, 2026 Procedure Over Intelligence: Building Reliable AI Systems
Intelligence without systematic workflows is just noise. Learn how Agent Skills encode organizational expertise to make AI agents reliable, reproducible, and trustworthy at scale.
Jan 17, 2026 Context Matters When Redacting Health Records for AI Analysis
Standard PII redaction tools destroy clinical utility. Learn how context-aware recognition preserves healthcare provider names while protecting patient privacy.
Jan 8, 2026 Welcome
A place to build, experiment, and think through problems with data and AI systems.