Lodestar — QMI Lab

Most agent stacks today are built from three layers: a runtime that picks tools and calls them, a memory store that holds recalled material, and an observability platform that captures traces. Each of these layers is excellent at its own job. But the stack still rarely records the question that matters most when something goes wrong:

What did the agent believe, why did it believe it, and what evidence supported that belief at the moment it acted?

Lodestar is built for that missing record. It sits alongside the runtime, memory, and observability stack, and its job is to capture and audit the agent's epistemic state: the structured record of what the agent took to be true, why, and how confidently.

The project is the first applied output from QMI Lab. It operationalizes one specific lever of the lab's broader thesis — that the central problem in machine intelligence is the conversion of information into knowledge — and brings that thesis into contact with a class of working systems, autonomous agents, where the conversion is currently too often invisible.

The problem

When an agent does something unexpected, the post-mortem question is usually some version of "what was it thinking?" Standard agent infrastructure can answer adjacent questions — what did it do, what was in its context, what did it retrieve — but not the central one.

The reason is structural. A tool-call trace records actions. A memory dump records retrievable material. A prompt log records what was sent to the model. The transition from "the model said this" to "the agent now acts as if this is true" often happens implicitly, somewhere in the overlap between context and weights, and is not written down as a structured artifact. It is visible only through its consequences in subsequent actions.

This creates three concrete problems for anyone running agents in production:

Post-mortems are unreliable. When an agent makes a costly mistake, the trail of evidence often does not include the belief that drove the mistake. Reviewers reconstruct the belief by inference, often poorly.
Memory poisoning lacks a clear audit boundary. A class of attacks plants attacker-controlled text in external documents the agent reads. The text contains claims that the agent may later extract and treat as factual. Without a record distinguishing observations the agent witnessed from documents it read, the attack may leave no trace apart from its downstream effects.
Trust does not transfer cleanly. An agent that performs well on a benchmark may still be deciding for opaque reasons. The reviewer who would extend trust to its decisions cannot, because the basis for those decisions is not legible.

These are not simply gaps in observability dashboards. They are gaps in what the agent stack records at all.

What Lodestar does

Lodestar treats agent cognition as a chain of typed steps:

Observation → Claim → Evidence → Belief → Decision → Action → Outcome → Revision

Each step is a first-class data structure with its own schema. When an agent invokes a tool, the result is captured as an Observation. From Observations the agent extracts Claims, each of which is supported by an Evidence Set of varying quality: a deterministic tool result is stronger than a model judgment, which is stronger than a snippet quoted from a third-party document.

Claims become Beliefs only when the evidence is sufficient and the governance layer permits the promotion. Beliefs inform Decisions. Decisions yield Actions. Actions produce Outcomes. When Outcomes contradict prior Claims, Revisions flow back through the chain.
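The first links of the chain might be typed roughly as follows. This is a minimal sketch: the type names, field names, and numeric ranking are assumptions made for illustration, not Lodestar's published schema. Only the ordering of evidence strength (deterministic tool result, then model judgment, then third-party document snippet) comes from the description above.

```typescript
// Hypothetical types for illustration only; not the published Lodestar schema.
type EvidenceKind = "tool_result" | "model_judgment" | "document_snippet";

interface Observation {
  id: string;
  tool: string;     // the tool whose result was captured
  payload: unknown; // the raw tool result
}

interface Evidence {
  kind: EvidenceKind;
  observationId: string; // the Observation this evidence was drawn from
}

interface Claim {
  id: string;
  statement: string;
  evidence: Evidence[]; // the claim's Evidence Set
}

// Assumed numeric ranking encoding the ordering described in the text:
// deterministic tool result > model judgment > third-party document snippet.
const EVIDENCE_STRENGTH: Record<EvidenceKind, number> = {
  tool_result: 3,
  model_judgment: 2,
  document_snippet: 1,
};

// Returns the strongest kind of evidence backing a claim,
// or null if the claim has no evidence at all.
function strongestEvidence(claim: Claim): EvidenceKind | null {
  let best: EvidenceKind | null = null;
  for (const e of claim.evidence) {
    if (best === null || EVIDENCE_STRENGTH[e.kind] > EVIDENCE_STRENGTH[best]) {
      best = e.kind;
    }
  }
  return best;
}

const observations: Observation[] = [
  { id: "o1", tool: "read_file", payload: "CI is green (from a wiki page)" },
  { id: "o2", tool: "run_tests", payload: { passed: 42, failed: 0 } },
];

const claim: Claim = {
  id: "c1",
  statement: "The test suite passes",
  evidence: [
    { kind: "document_snippet", observationId: observations[0].id },
    { kind: "tool_result", observationId: observations[1].id },
  ],
};

console.log(strongestEvidence(claim)); // prints "tool_result"
```

Keeping the ranking in one table means a later promotion rule can compare evidence kinds without hard-coding the order in several places.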

This is not just a logging schema. It is a runtime invariant: an agent running under Lodestar cannot promote a model-generated claim directly to a settled belief without passing through an explicit gate. The gate enforces, among other things, that evidence sourced from external documents — the agent's reading material — cannot silently become ground truth.
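The gate described above can be pictured as the single function through which every Claim must pass before it becomes a Belief. The sketch below is hypothetical: the function name, the result shape, and the specific rule (at least one deterministic tool result is required) are assumptions chosen to illustrate the invariant, not Lodestar's actual policy.

```typescript
// Hypothetical promotion gate; names and rules are illustrative assumptions.
type Source = "tool_result" | "model_judgment" | "external_document";

interface Claim {
  statement: string;
  sources: Source[]; // where each piece of supporting evidence came from
}

type GateResult =
  | { promoted: true; belief: { statement: string; basis: Source[] } }
  | { promoted: false; reason: string };

// The only path from Claim to Belief in this sketch. Assumed rule: model
// judgments and external documents alone never suffice; at least one
// deterministic tool result must back the claim.
function promoteToBelief(claim: Claim): GateResult {
  if (claim.sources.length === 0) {
    return { promoted: false, reason: "no evidence" };
  }
  if (claim.sources.every((s) => s === "external_document")) {
    return { promoted: false, reason: "external documents are reading material, not ground truth" };
  }
  if (!claim.sources.some((s) => s === "tool_result")) {
    return { promoted: false, reason: "no deterministic evidence" };
  }
  return {
    promoted: true,
    belief: { statement: claim.statement, basis: [...claim.sources] },
  };
}

// A claim lifted from a document the agent merely read is refused.
const planted: Claim = {
  statement: "The deploy key may be shared with contractors",
  sources: ["external_document"],
};
console.log(promoteToBelief(planted).promoted); // prints false

// The same kind of claim, corroborated by a tool result, is promoted.
const corroborated: Claim = {
  statement: "The config file sets retries to 3",
  sources: ["external_document", "tool_result"],
};
console.log(promoteToBelief(corroborated).promoted); // prints true
```

Because the refusal carries a reason, the rejected promotion can itself be recorded, which is what makes the gate auditable rather than silent.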

The framework exposes its capabilities through four developer entry points:

Guard — wraps an existing agent loop and routes every tool call through the Action Kernel, every observation through the Cognitive Core, and every belief transition through the Memory Firewall.
Trace — consumes the resulting event log and produces a trust report. The package is @qmilab/lodestar-trace; the user-facing CLI command is lodestar report. The output is a human-readable account of what the agent observed, claimed, believed, decided, and did, with every transition justified by its supporting evidence.
Memory Firewall — provides governance over four orthogonal lifecycle axes — truth, retrieval, security, and freshness — for any piece of remembered information, with adapter contracts for existing memory stores such as mem0, Letta, and Zep.
Harness — the test side: probes, sentinels, and calibrators. Probes exercise the governance invariants under adversarial conditions; sentinels watch the live event stream and can gate dependent actions; the calibrator measures how well the agent's prior claims about outcomes matched the outcomes that actually occurred.
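To make "four orthogonal lifecycle axes" concrete, here is a minimal sketch of a remembered item that carries one independent status per axis. The axis names come from the description of the Memory Firewall above; the states on each axis, the type names, and the blockingAxes helper are assumptions for illustration and do not reflect the actual adapter contracts.

```typescript
// Hypothetical axis states; the source names the four axes but not their values.
interface MemoryState {
  truth: "unverified" | "verified" | "refuted";
  retrieval: "allowed" | "blocked";
  security: "clean" | "quarantined";
  freshness: "fresh" | "stale";
}

type Axis = keyof MemoryState;

interface MemoryItem {
  id: string;
  text: string;
  state: MemoryState;
}

// Returns the axes that currently prevent an item from being used as
// trusted context. An empty array means no axis objects.
function blockingAxes(item: MemoryItem): Axis[] {
  const s = item.state;
  const blocked: Axis[] = [];
  if (s.truth !== "verified") blocked.push("truth");
  if (s.retrieval !== "allowed") blocked.push("retrieval");
  if (s.security !== "clean") blocked.push("security");
  if (s.freshness !== "fresh") blocked.push("freshness");
  return blocked;
}

// Orthogonality in practice: this item is verified, retrievable, and fresh,
// yet it is still held back because it is quarantined.
const item: MemoryItem = {
  id: "m1",
  text: "Build command is `bun test`",
  state: {
    truth: "verified",
    retrieval: "allowed",
    security: "quarantined",
    freshness: "fresh",
  },
};

console.log(blockingAxes(item)); // only the security axis blocks this item
```

Tracking each axis separately, rather than collapsing them into a single "trusted" flag, lets a report say which concern held an item back.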

The reference implementation is in TypeScript, runs on Bun, and persists the event log as append-only NDJSON with monotonic sequence numbers and payload hashes. Integration with non-TypeScript runtimes — Python agents, Claude Code, Cursor, Aider, and others — happens through a Model Context Protocol proxy mode that wraps the underlying tool surface.
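An append-only NDJSON log with monotonic sequence numbers and payload hashes can be sketched in a few lines. The event shape, field names, and helper functions below are assumptions for illustration, not Lodestar's actual log format. The log is held in a string to keep the example self-contained, where a real log would append to a file. The sketch also assumes payloads are plain JSON-serializable objects and uses JSON.stringify as the serialization, which is not a canonical form.

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape; field names are illustrative assumptions.
interface LogEvent {
  seq: number;         // monotonic sequence number, starting at 1
  type: string;
  payload: unknown;
  payloadHash: string; // SHA-256 of the serialized payload, hex-encoded
}

function hashPayload(payload: unknown): string {
  return createHash("sha256").update(JSON.stringify(payload)).digest("hex");
}

function splitLines(log: string): string[] {
  return log === "" ? [] : log.trimEnd().split("\n");
}

// Appends one event as a single JSON line and returns the new log text.
// Existing lines are never rewritten.
function appendEvent(log: string, type: string, payload: unknown): string {
  const seq = splitLines(log).length + 1;
  const event: LogEvent = { seq, type, payload, payloadHash: hashPayload(payload) };
  return log + JSON.stringify(event) + "\n";
}

// Checks that sequence numbers run 1, 2, 3, ... without gaps and that
// every payload still matches its recorded hash.
function verifyLog(log: string): boolean {
  return splitLines(log).every((line, i) => {
    const e = JSON.parse(line) as LogEvent;
    return e.seq === i + 1 && e.payloadHash === hashPayload(e.payload);
  });
}

let log = "";
log = appendEvent(log, "observation", { tool: "read_file", path: "README.md" });
log = appendEvent(log, "claim", { statement: "The project runs on Bun" });

console.log(verifyLog(log)); // prints true

// Editing a payload after the fact breaks its hash.
console.log(verifyLog(log.replace("README.md", "other.md"))); // prints false
```

Sequence numbers expose deleted or reordered lines, and payload hashes expose edited ones. Neither protects against an attacker who can rewrite both payload and hash, which would need signing or hash chaining on top.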

Status

The project is at v0.2.0, published to npm as twenty-two packages, with the v0.2 architecture locked. The architectural spine went through five rounds of structured adversarial review before implementation began; the implementation has since caught up to it:

The full schema layer for the epistemic chain is present.
The append-only event log and two-phase action execution are present.
The Memory Firewall with its four lifecycle axes is present, over both in-memory and Postgres backends.
The Cognitive Core primitives are present: claim extraction, belief adoption, and world-model update.
The Policy Kernel enforces the trust ladder, a three-valued allow / deny / hold gate, and a signed-approval lifecycle — with sentinels that watch the live event stream and a calibrator that scores how well prior claims matched outcomes.
Forty-seven governance invariants are encoded as automated probes that all pass under strict TypeScript — memory-poisoning resistance, epistemic-chain integrity, external-document retrieval gating, quarantine isolation, sensitivity-ceiling enforcement, the auto-observation evidence-quality gate, and forty-one more spanning the Policy Kernel, the MCP proxy, and the native egress adapters.
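The three-valued gate mentioned in the list above differs from a boolean permission check in that "hold" is a distinct outcome: the action is neither run nor rejected until something else happens. A minimal sketch follows. The ladder rungs, the request fields, and the decision rule are assumptions for illustration, and the approved flag is a stand-in for a verified signed approval, whose real lifecycle is not modeled here.

```typescript
type Verdict = "allow" | "deny" | "hold";

// Hypothetical trust ladder rungs, lowest to highest.
const LADDER = ["untrusted", "observed", "verified"] as const;
type TrustLevel = (typeof LADDER)[number];

interface ActionRequest {
  name: string;
  beliefTrust: TrustLevel;   // rung of the belief that motivates the action
  requiredTrust: TrustLevel; // minimum rung the policy demands for it
  approvable: boolean;       // whether an approval may unblock the action
  approved?: boolean;        // stand-in for a verified signed approval
}

// Assumed rule: enough trust allows outright; too little trust either
// denies (not approvable) or holds until an approval arrives.
function decide(req: ActionRequest): Verdict {
  const have = LADDER.indexOf(req.beliefTrust);
  const need = LADDER.indexOf(req.requiredTrust);
  if (have >= need) return "allow";
  if (!req.approvable) return "deny";
  return req.approved === true ? "allow" : "hold";
}

const push: ActionRequest = {
  name: "git push",
  beliefTrust: "observed",
  requiredTrust: "verified",
  approvable: true,
};

console.log(decide(push));                          // prints "hold"
console.log(decide({ ...push, approved: true }));   // prints "allow"
console.log(decide({ ...push, approvable: false })); // prints "deny"
```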

Lodestar wraps an existing agent in two ways: guard.wrap() for a TypeScript loop it can see inside, and a Model Context Protocol proxy for runtimes it cannot see inside, such as Claude Code, Cursor, Aider, and Python agents. The first end-to-end demonstration, a coding agent governed from tool call to trust report, ships in the repository as examples/telenotes-governed-dev/, including a hostile-document variant that the firewall holds without ever promoting the poison into trusted context.

The full roadmap, including the post-v1 harness infrastructure and the trust marketplace, lives in the project repository and the documentation site.

Research arc

Lodestar is both a shipping software project and an active research program. The project is the substrate for several lines of inquiry the lab is pursuing:

Epistemic governance as infrastructure. A position paper making the case that the agent stack is missing a fourth layer, and arguing for its structural separation from runtime, memory, and observability. Target: 2026.
A threat taxonomy for memory poisoning. A structured analysis of the attack class where adversarial text plants itself in agent-readable documents and propagates into agent beliefs. The taxonomy distinguishes the attack surface, the propagation mechanism, and the defense locus for each variant. Target: 2026.
Calibration in governed agents. Empirical work measuring how well an agent's prior claims about outcomes match the outcomes that actually occurred, across a corpus of governed sessions. Target: 2027 and beyond, once a sufficient corpus has accumulated.

The research outputs and the software outputs feed each other. The probes that ship in the harness are operationalizations of invariants the threat model identified; the events the calibrator measures are the data the empirical work depends on.
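One standard way to score how well stated confidence matched outcomes is the Brier score, sketched below. Using it here is an assumption: the source does not say which metric the calibrator computes, and the record shape and function name are hypothetical.

```typescript
// Hypothetical record pairing a prior claim with its observed outcome.
interface ScoredClaim {
  confidence: number; // the agent's stated probability that the claim holds, in [0, 1]
  heldTrue: boolean;  // whether the recorded outcome bore the claim out
}

// Brier score: the mean squared gap between stated confidence and what
// actually happened. 0 is perfect; always answering 0.5 scores 0.25.
function brierScore(claims: ScoredClaim[]): number {
  if (claims.length === 0) {
    throw new Error("no claims to score");
  }
  let total = 0;
  for (const c of claims) {
    const outcome = c.heldTrue ? 1 : 0;
    const gap = c.confidence - outcome;
    total += gap * gap;
  }
  return total / claims.length;
}

const session: ScoredClaim[] = [
  { confidence: 0.9, heldTrue: true },  // confident and right
  { confidence: 0.8, heldTrue: false }, // confident and wrong: large penalty
  { confidence: 0.6, heldTrue: true },  // hesitant and right
];

console.log(brierScore(session).toFixed(2)); // prints "0.27"
```

A lower score means the agent's confidence tracked reality more closely. A single confident miss dominates the total, which is the property that makes the score useful for spotting overconfidence.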

Engaging with the project

The repository is open at github.com/qmilab/lodestar under the Apache 2.0 license. The framework can be cloned, installed with Bun, and run against the included examples: examples/telenotes-governed-dev/ demonstrates the full pipeline end-to-end, and examples/doc-insight/ focuses on the auto-observation gate in isolation.

The documentation covers getting started, the core concepts, and a reader's walkthrough of a full governed session from tool call to trust report.

A series of essays on the design and the adversarial review process behind it is being published at nandan.me/writing.

Issues, design questions, and threat-model contributions are welcome through the repository. Substantial architectural changes should come with concrete failure cases, threat-model evidence, or implementation experience.

For research collaborations, lab affiliations, or commercial inquiries related to hosted and enterprise versions of the framework, contact the lab through qmilab.com.