What's the simplest way to understand the difference between an AI agent's 'harness' and its 'scaffolding'?

The 'scaffolding' is the behavior-defining layer, including the system prompt, available tools, and instructions the model follows. The 'harness' is the active execution layer—the loop that calls the model, runs the tools based on the model's output, and decides when to stop. Essentially, scaffolding provides the rules and resources, while the harness is the engine that drives the agent through its task.

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

Distinguishing the Engine from the Blueprint

AI practitioners Sergio Paniego and Aritra Roy Gosthipaty have published a detailed glossary aimed at standardizing the vocabulary used in AI agent development. The guide directly addresses the widespread confusion over terms like 'harness' and 'scaffold,' which are often used interchangeably, creating ambiguity for both new and experienced developers. As the complexity of agentic systems grows, this attempt to establish a clear, shared mental model is a practical step toward improving collaboration and accelerating engineering cycles across the industry.

The Core Technical Separation

The central argument of the glossary is the functional separation between the components that surround a core language model. The authors propose a clear distinction between the scaffolding, which defines an agent's behavior, and the harness, which executes its actions. This framework helps teams reason about and debug different layers of the agent stack independently. The authors note that products like Claude Code and Hermes Agent are often misunderstood as just models, when in fact their unique performance comes from the sophisticated harnesses and scaffolding they employ.

Model: The base LLM that generates text and expresses intent (e.g., Claude, GPT, Qwen).
Scaffolding: The behavior-defining layer, including system prompts, tool descriptions, context management, and output parsing rules.
Harness: The execution layer responsible for calling the model, handling tool execution, managing errors, and deciding when a task is complete.

Impact on the AI Ecosystem

By providing a practical lexicon, this guide directly impacts how developers build, evaluate, and discuss agentic systems. A consistent vocabulary allows for more precise comparisons between different agent frameworks and simplifies the process of integrating new models or tools. For the broader ecosystem, this clarity reduces the barrier to entry and enables more structured innovation by allowing engineers to focus on specific components—whether it's refining the model's policy, engineering a more robust harness, or designing more effective scaffolding—rather than treating the agent as an inscrutable black box.

The distinction between a model, its scaffolding, and its harness is not merely semantic; it provides a crucial engineering framework for isolating and improving distinct components of an agent's performance, from core reasoning to execution logic.

>> Verify Original Transmission at Hugging Face