Tideline

Recall & ranking

Recall is hybrid: BM25 keyword search is the workhorse, and a model-free vector lane recovers the semantically-close phrasings BM25 misses. Trust and decay fold into a single ranking score, and an origin filter runs before anything is scored.

Two lanes#

  • Lexical (BM25) — the primary scorer (Okapi BM25, k1 = 1.2, b = 0.75, Unicode-aware tokens, English stopwords stripped). BM25 hits keep their order and are never reordered by the vector lane.
  • Vector recovery — facts the keyword pass missed but whose embedding is close enough (cosine ≥ a noise floor of 0.3) are appended below every lexical hit, each successive recovery damped by 10%. The vector lane surfaces misses; it never outranks a real keyword match.

The model-free embedder (HRR)#

The default embedder is an HRR (Holographic Reduced Representation) vectorizer — and it needs no model, no API, and no download. It builds a 256-dimension unit vector from token unigrams and character trigrams, each feature mapped to a deterministic phase via SHA-256 and bundled by circular superposition.

  • Deterministic, offline, zero-dependency — the same text always embeds the same way.
  • Captures morphology ("deploy" ≈ "deploys") but not learned synonymy ("car" ≉ "automobile").
  • Pluggable: set BRIGADE_MEMORY_EMBEDDER to openai-256 or a local model to upgrade the vector lane to true synonymy. The hybrid plumbing is unchanged — only vector quality improves.

Trust & decay in the score#

Both lanes are multiplied by a trust factor, so external content can't dominate recall. A perfect keyword match from a fetched document still surfaces, but a weaker owner-written fact can outrank it.

SourceTrust
owner / user instruction (and legacy)1.0
extraction0.9
dream / compaction0.85
channel message0.8
tool output0.65
retrieved document0.6

Decay enters as a gentle modulator — 0.5 + 0.5 × effectiveScore — so a fully decayed fact keeps at most half its relevance weight rather than vanishing. The full decay math is on Decay & lifecycle.

Origin filtering, fail-closed#

Before scoring, candidates are filtered to the current origin — owner sessions see owner facts; a channel peer sees only facts from the same channel, conversation, and session. Auto-recall (the facts injected before each turn) fails closed: for an unidentified peer with no channel route, it injects nothing, so operator memory can never leak into an untrusted chat. Auto-recall is also passive — it never reinforces decay.

The context block#

context(query, { maxChars }) assembles a budgeted recall block ready to drop into a prompt: top-ranked facts as bullets, up to maxChars (default 1200), with each fact threat-scanned and </> escaped. Auto-recall caps at 5 facts.

example context block
- [preference] I keep a strict vegetarian diet.- [project] Deploys go out on Tuesdays, never Fridays.

Graph-walk recall is v2

v1 recall is flat: BM25 plus the vector lane. The typed link graph is stored and the read primitives exist, but multi-hop traversal during recall (following edges to pull in related context) is deliberately a v2 step — the edges accrue now so the traversal has something to walk later.

In the code

Hybrid fusion is src/agents/memory/hybrid.ts, the embedder is embedder.ts, BM25 is scoring.ts, and origin/auto-recall live in records.ts + auto-recall.ts.