The Harness Stops Being Generic: Model-Specific Profiles and Runtime-Authored Workflows
Two converging shifts — per-model harness profiles and agents that write their own orchestration at runtime — are breaking the assumption that an agent harness is a stable, model-agnostic abstraction.
Most Read
Requirements as Code: Git-Native Business Documents for Agentic Workflows
Exploring the idea of putting business requirements, architecture diagrams, and domain models in Git — and how this could enable agentic pipelines from requirement change to deployed code.
February 26, 2026
What's Happening
Analysis
The Agent Becomes the Optimization Unit
Multi-agent systems are being instrumented and tuned at the agent level: credit assignment, failure-mode decomposition, and policy learning all treat the individual agent as the thing you measure, replace, or evolve.
May 30, 2026
The Sandbox Becomes a Runtime Primitive
Isolated code execution environments are emerging as a distinct layer of the agent stack, separable from both the harness and the model — with implications for security, portability, and cost.
May 23, 2026
Context as a Deployable Artifact: The Third Layer of the Agent Stack
Agent context files are being pulled out of repos and into versioned, governed runtime stores — creating a third deployment surface alongside harness code and model weights.
May 16, 2026
Alignment Is Splitting Into Two Layers: Midtraining and Runtime
Recent work from Anthropic, OpenAI, and Mozilla suggests alignment is no longer a single fine-tuning step — it's becoming a layered system spanning training stages and execution infrastructure.
May 9, 2026
The Trace Becomes the Primary Artifact of Agent Engineering
Across evals, debugging, failure attribution, and self-improvement, the execution trace is consolidating as the central object practitioners build around — with consequences for tooling, storage, and team workflow.
May 2, 2026
Each Role Owns a Contract: A Team Operating Model for Agentic Delivery
AI is making code faster to write but not teams faster to deliver. The bottleneck moves to handoffs — sign-offs, requirement changes, validation. This article explores a model where each role owns a versioned contract, and asks honestly where it helps and where it just relocates the problem.
April 29, 2026
From the Archive
The Harness Is Now a Managed Surface — and a Managed Liability
Claude Code's quality regression, Gemini's Enterprise Agent Platform, and Anthropic's memory stores all point to the same shift: the harness is moving from something you build to something you consume — with consequences for debugging, eval reporting, and vendor lock-in.
April 25, 2026
Why Memory Ownership Is Becoming a Harness Decision
As harnesses absorb session management, context compaction, and persistent memory, the choice of harness is increasingly a choice about who owns your agent's memory.
April 18, 2026
The Self-Improving Harness: When Agent Infrastructure Learns to Optimize Itself
Agent harnesses are evolving from static scaffolding into self-modifying systems that mine their own failures, generate evals, and hill-climb their own performance — reshaping what it means to build and maintain agents in production.
April 11, 2026