Hermes Academy
Chapter 04

This chapter takes the session ledger, built-in memory, provider recall, session search, and context compression apart and looks at each one on its own.

Focus: hermes_state.py, memory_tool.py, memory_manager.py

The most important correction

If you say “Hermes memory” as if it names a single subsystem, you are already losing precision.

Hermes splits memory-like behavior into at least these layers:

  • session ledger
  • explicit built-in memory
  • external provider recall
  • session search
  • context compression
  • procedural memory through skills

Session ledger

hermes_state.py stores sessions and messages in SQLite with WAL and FTS5.

That layer answers:

  • what happened
  • what tool results existed
  • what prior sessions can be searched
  • how compressed sessions relate to earlier ones

It is better thought of as durable conversation accounting than semantic memory.
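A minimal sketch of such a ledger, using SQLite with WAL and FTS5 as the text describes. The table and column names here (sessions, messages, messages_fts, parent_session_id) are illustrative assumptions, not the real hermes_state.py schema:

```python
import sqlite3

# Hypothetical schema; only the WAL + FTS5 mechanics match the text.
conn = sqlite3.connect("hermes_state.db")
conn.execute("PRAGMA journal_mode=WAL")  # readers don't block the writer

conn.executescript("""
CREATE TABLE IF NOT EXISTS sessions (
    id INTEGER PRIMARY KEY,
    started_at TEXT,
    parent_session_id INTEGER  -- links a compressed session to its source
);
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY,
    session_id INTEGER REFERENCES sessions(id),
    role TEXT,     -- user / assistant / tool
    content TEXT
);
CREATE VIRTUAL TABLE IF NOT EXISTS messages_fts
    USING fts5(content, content='messages', content_rowid='id');
""")

# Record a message and keep the FTS index in sync.
sid = conn.execute(
    "INSERT INTO sessions (started_at) VALUES (datetime('now'))"
).lastrowid
mid = conn.execute(
    "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
    (sid, "user", "please check the deploy logs"),
).lastrowid
conn.execute(
    "INSERT INTO messages_fts (rowid, content) "
    "SELECT id, content FROM messages WHERE id = ?",
    (mid,),
)
conn.commit()

# Search prior sessions by keyword.
rows = conn.execute(
    "SELECT m.session_id, m.content FROM messages m "
    "WHERE m.id IN (SELECT rowid FROM messages_fts WHERE messages_fts MATCH ?)",
    ("deploy",),
).fetchall()
```

The accounting framing shows up in the schema: rows answer who said what in which session, and a parent pointer lets a compressed session name the session it summarizes.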

hermes-agent › hermes_state.py: the durable ledger of sessions and messages. Treat it as the truth source for what happened.

Built-in explicit memory

Hermes also has small, explicit files:

  • MEMORY.md
  • USER.md

These are not giant stores. They are bounded, curated facts intended to stay close to the prompt.

The clever part is the frozen snapshot pattern:

  • load from disk at session start
  • build a prompt snapshot
  • allow writes during the session
  • keep the current prompt snapshot unchanged

That preserves stable prompt behavior and good cache economics.
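The pattern above can be sketched in a few lines. The class and method names here are hypothetical; only the behavior (read once at session start, write to disk freely, never refresh the in-prompt copy) is from the text:

```python
from pathlib import Path

class MemorySnapshot:
    """Frozen-snapshot pattern: disk is mutable, the prompt copy is not."""

    def __init__(self, memory_path="MEMORY.md", user_path="USER.md"):
        self.memory_path = Path(memory_path)
        self.user_path = Path(user_path)
        # Read once at session start; this frozen copy goes into the prompt.
        self.prompt_snapshot = self._read_all()

    def _read_all(self):
        parts = []
        for p in (self.memory_path, self.user_path):
            if p.exists():
                parts.append(p.read_text())
        return "\n\n".join(parts)

    def append_fact(self, fact):
        # Writes land on disk during the session...
        with self.memory_path.open("a") as f:
            f.write(f"\n- {fact}")
        # ...but prompt_snapshot is deliberately NOT refreshed, so the
        # prompt prefix stays byte-stable and cache-friendly.

    def prompt_block(self):
        return self.prompt_snapshot
```

Because the prompt prefix never changes mid-session, provider-side prompt caching keeps hitting, and the model's behavior does not drift as facts are appended. New facts take effect at the next session start.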

Provider recall

The provider layer adds optional long-term recall systems like Honcho or Supermemory.

MemoryManager deliberately keeps:

  • the built-in provider always present
  • at most one external provider active

That one-external-provider rule prevents a lot of mess:

  • duplicated schemas
  • conflicting recall semantics
  • multiple backends trying to own truth at once
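The invariant is small enough to sketch directly. These class and method names are assumptions, not the real memory_manager.py API; only the "builtin always present, at most one external" rule comes from the text:

```python
class BuiltinProvider:
    """Stand-in for Hermes's built-in memory layer."""
    name = "builtin"

class MemoryManager:
    def __init__(self):
        self.builtin = BuiltinProvider()   # always present
        self.external = None               # at most one (e.g. Honcho OR Supermemory)

    def set_external(self, provider):
        # Refuse a second external backend rather than letting two
        # recall systems fight over truth.
        if self.external is not None:
            raise ValueError(
                f"external provider {self.external.name!r} already active; "
                "deactivate it before enabling another"
            )
        self.external = provider

    def providers(self):
        return [self.builtin] + ([self.external] if self.external else [])
```

Raising instead of silently swapping providers is the design choice that keeps recall semantics predictable: switching backends has to be an explicit, two-step act.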

When the agent wants old context, it does not dump entire transcripts into the prompt.

Instead Hermes:

  1. searches via FTS
  2. groups matches by session
  3. extracts useful transcript windows
  4. summarizes with a cheaper model

That gives the main model a focused recap rather than a transcript flood.
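The four steps above can be sketched as one pipeline. The search, window-loading, and summarization functions are stubbed parameters here, since the real implementations live in hermes_state.py and the provider layer:

```python
from collections import defaultdict

def recall(fts_search, load_window, summarize, query, window=3):
    """Hypothetical recall pipeline: search -> group -> window -> recap."""
    # 1. search via FTS: yields (session_id, message_id) hits
    hits = fts_search(query)

    # 2. group matches by session
    by_session = defaultdict(list)
    for session_id, message_id in hits:
        by_session[session_id].append(message_id)

    # 3. extract a transcript window around each hit
    windows = {
        sid: [load_window(sid, mid, window) for mid in mids]
        for sid, mids in by_session.items()
    }

    # 4. summarize with a cheaper model into a focused recap
    return summarize(query, windows)
```

The shape matters more than the stubs: only small transcript windows ever reach the summarizer, and only the summarizer's recap ever reaches the main model.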

Context compression

Compression is memory-adjacent but conceptually separate. It deals with current working context, not just long-term recall.

Hermes can:

  • prune old tool outputs
  • protect the head of the conversation
  • protect the recent tail
  • summarize the middle as a handoff document

The summary is explicitly framed as reference, not as fresh instructions. Provenance matters.
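A minimal sketch of head/tail-protected compression. The thresholds and the handoff framing string are illustrative assumptions, not Hermes defaults:

```python
def compress(messages, summarize, keep_head=2, keep_tail=4):
    """Protect the head and tail; replace the middle with a handoff summary."""
    if len(messages) <= keep_head + keep_tail:
        return messages  # nothing worth compressing

    head = messages[:keep_head]           # protect the opening context
    tail = messages[-keep_tail:]          # protect the recent turns
    middle = messages[keep_head:-keep_tail]

    handoff = {
        "role": "user",
        # Provenance framing: reference material, not fresh instructions.
        "content": "[Handoff summary of earlier turns, for reference only]\n"
                   + summarize(middle),
    }
    return head + [handoff] + tail
```

The bracketed framing line is the provenance marker: the model is told it is reading a recap of the past, so the summary cannot masquerade as a new user request.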

The deeper lesson

Hermes is strong here because it refuses to pretend that:

  • durable history
  • user profile
  • semantic recall
  • context shrinking
  • procedural know-how

are all the same thing.

That layered view is one of the clearest signs that Hermes was built by people who have already felt the pain of overloading the word “memory.”