Hermes does not invent a new loop
At a high level, the Hermes loop is still the standard agent pattern:
- build messages
- call the model
- inspect tool calls
- execute tools
- append tool results
- repeat until there is a final answer
What makes Hermes worth studying is not novelty. It is the way the runtime surrounds this pattern with infrastructure.
A turn has preflight and postflight
Before the first model call of a user turn, Hermes already does real work:
- binds interrupt state to the active thread
- updates iteration budget state
- notifies memory providers via
on_turn_start - fetches memory recall once for the turn
- prepares a stable system prompt
This is a signal that Hermes thinks of a turn as a runtime transaction, not just a prompt submission.
turn preflightPrompt assembly is runtime logic
System prompt construction is layered rather than monolithic. Hermes can combine:
- identity
- tool-aware behavior guidance
- built-in memory snapshot
- external memory provider prompt fragments
- skills prompt
- context files
- platform hints
That makes the system prompt a controlled runtime artifact, not a static string.
_build_system_promptTool calls are filtered before execution
Hermes does not trust model output blindly. After a model response, it validates:
- malformed JSON
- tool call truncation
- duplicate tool calls
- excessive delegation calls
If needed, Hermes can inject recovery messages so the model gets a chance to correct itself instead of just crashing the turn.
This is a production mindset: the model is not a reliable compiler.
Why the budget abstraction matters
The loop is governed by more than a raw iteration count. Hermes uses IterationBudget to represent bounded work:
- each loop pass consumes budget
- some cheap tool-only iterations can be refunded
- a grace call can exist when useful
That sounds like a small implementation detail, but it changes how the runtime is reasoned about: as a bounded state machine rather than an uncontrolled recursive process.
Interruptibility is part of quality
Hermes keeps the loop interruptible because once agents can call tools, turns can become expensive, slow, and occasionally wrong.
Important runtime qualities here include:
- interrupt-aware execution
- stale/hung call detection
- clear loop exit reasons
A mature agent runtime must support recovery, not just happy-path completion.
The hidden half of the turn
After the final answer is produced, Hermes may still:
- sync memory providers
- queue background memory prefetch
- trigger memory or skill review nudges
- persist session state
This is significant. In Hermes, “turn complete” means both response delivery and invisible maintenance.
The right architectural takeaway
The best summary of this chapter is:
Hermes turns a familiar LLM-tool loop into a managed runtime by adding resource control, validation, stable prompt layers, and post-turn maintenance.