Hermes Academy
Chapter 02

读懂 `AIAgent` 的主循环,理解工具调用、预算控制、缓存与中断是如何组合的。

Focus run_agent.py、tool loop、budget、interrupt

Hermes does not invent a new loop

At a high level, the Hermes loop is still the standard agent pattern:

  1. build messages
  2. call the model
  3. inspect tool calls
  4. execute tools
  5. append tool results
  6. repeat until there is a final answer

What makes Hermes worth studying is not novelty. It is the way the runtime surrounds this pattern with infrastructure.

hermes-agent
run_agent.py
Treat `AIAgent` as the runtime kernel. The file is large, but the interesting question is how it protects the standard loop.

A turn has preflight and postflight

Before the first model call of a user turn, Hermes already does real work:

  • binds interrupt state to the active thread
  • updates iteration budget state
  • notifies memory providers via on_turn_start
  • fetches memory recall once for the turn
  • prepares a stable system prompt

This is a signal that Hermes thinks of a turn as a runtime transaction, not just a prompt submission.

turn preflight

Prompt assembly is runtime logic

System prompt construction is layered rather than monolithic. Hermes can combine:

  • identity
  • tool-aware behavior guidance
  • built-in memory snapshot
  • external memory provider prompt fragments
  • skills prompt
  • context files
  • platform hints

That makes the system prompt a controlled runtime artifact, not a static string.

_build_system_prompt

Tool calls are filtered before execution

Hermes does not trust model output blindly. After a model response, it validates:

  • malformed JSON
  • tool call truncation
  • duplicate tool calls
  • excessive delegation calls

If needed, Hermes can inject recovery messages so the model gets a chance to correct itself instead of just crashing the turn.

This is a production mindset: the model is not a reliable compiler.

Why the budget abstraction matters

The loop is governed by more than a raw iteration count. Hermes uses IterationBudget to represent bounded work:

  • each loop pass consumes budget
  • some cheap tool-only iterations can be refunded
  • a grace call can exist when useful

That sounds like a small implementation detail, but it changes how the runtime is reasoned about: as a bounded state machine rather than an uncontrolled recursive process.

Interruptibility is part of quality

Hermes keeps the loop interruptible because once agents can call tools, turns can become expensive, slow, and occasionally wrong.

Important runtime qualities here include:

  • interrupt-aware execution
  • stale/hung call detection
  • clear loop exit reasons

A mature agent runtime must support recovery, not just happy-path completion.

The hidden half of the turn

After the final answer is produced, Hermes may still:

  • sync memory providers
  • queue background memory prefetch
  • trigger memory or skill review nudges
  • persist session state

This is significant. In Hermes, “turn complete” means both response delivery and invisible maintenance.

The right architectural takeaway

The best summary of this chapter is:

Hermes turns a familiar LLM-tool loop into a managed runtime by adding resource control, validation, stable prompt layers, and post-turn maintenance.