
Applied project

Lodestar

A trust layer for AI agents.

Status: Pre-v0.1 implementation, v0.2 architecture
License: Apache 2.0
Repository: github.com/qmilab/lodestar

Most agent stacks today are built from three layers: a runtime that picks tools and calls them, a memory store that holds recalled material, and an observability platform that captures traces. Each of these layers is excellent at its own job. But the stack still rarely records the question that matters most when something goes wrong:

What did the agent believe, why did it believe it, and what evidence supported that belief at the moment it acted?

Lodestar is built for that missing record. It sits alongside the runtime, memory, and observability stack, and its job is to capture and audit the agent's epistemic state: the structured record of what the agent took to be true, why, and how confidently.

The project is the first applied output from QMI Lab. It operationalizes one specific lever of the lab's broader thesis — that the central problem in machine intelligence is the conversion of information into knowledge — and brings that thesis into contact with a class of working systems, autonomous agents, where the conversion is currently too often invisible.


The problem

When an agent does something unexpected, the post-mortem question is usually some version of "what was it thinking?" Standard agent infrastructure can answer adjacent questions — what did it do, what was in its context, what did it retrieve — but not the central one.

The reason is structural. A tool-call trace records actions. A memory dump records retrievable material. A prompt log records what was sent to the model. The transition from "the model said this" to "the agent now acts as if this is true" often happens implicitly, somewhere in the overlap between context and weights, and is never written down as a structured artifact. It is visible only through the actions that follow from it.

This creates three concrete problems for anyone running agents in production:

  • Post-mortems are unreliable. When an agent makes a costly mistake, the trail of evidence often does not include the belief that drove the mistake. Reviewers reconstruct the belief by inference, often poorly.
  • Memory poisoning lacks a clear audit boundary. A class of attacks plants attacker-controlled text in external documents the agent reads. The text contains claims that the agent may later extract and treat as factual. Without a record distinguishing observations the agent witnessed from documents it read, the attack may leave no trace apart from its downstream effects.
  • Trust does not transfer cleanly. An agent that performs well on a benchmark may still be deciding for opaque reasons. The reviewer who would extend trust to its decisions cannot, because the basis for those decisions is not legible.

These are not simply gaps in observability dashboards. They are gaps in what the agent stack records at all.


What Lodestar does

Lodestar treats agent cognition as a chain of typed steps:

Observation → Claim → Evidence → Belief → Decision → Action → Outcome → Revision

Each step is a first-class data structure with its own schema. When an agent invokes a tool, the result is captured as an Observation. From Observations the agent extracts Claims, each of which is supported by an Evidence Set of varying quality: a deterministic tool result is stronger than a model judgment, which is stronger than a snippet quoted from a third-party document.
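The evidence ordering described above can be sketched as a small set of types. This is a minimal illustration, not Lodestar's actual schema: every type name, field name, and rank value here (EvidenceSource, EVIDENCE_RANK, strongestSource, and so on) is an assumption made for the example.

```typescript
// Hypothetical names throughout; these are not Lodestar's real schemas.
type EvidenceSource = "deterministic_tool" | "model_judgment" | "third_party_snippet";

// Higher rank means stronger evidence, following the ordering in the text.
// The numeric values are arbitrary; only their relative order matters.
const EVIDENCE_RANK: Record<EvidenceSource, number> = {
  deterministic_tool: 3,
  model_judgment: 2,
  third_party_snippet: 1,
};

interface Observation {
  id: string;
  tool: string;
  result: unknown;
}

interface Evidence {
  source: EvidenceSource;
  observationId: string; // links the evidence back to what the agent observed
}

interface Claim {
  id: string;
  statement: string;
  evidence: Evidence[];
}

// Returns the strongest source in a claim's evidence set, or null if it is empty.
function strongestSource(claim: Claim): EvidenceSource | null {
  let best: EvidenceSource | null = null;
  for (const e of claim.evidence) {
    if (best === null || EVIDENCE_RANK[e.source] > EVIDENCE_RANK[best]) {
      best = e.source;
    }
  }
  return best;
}

const exampleObservation: Observation = {
  id: "obs-1",
  tool: "read_file",
  result: "version = 2",
};

const exampleClaim: Claim = {
  id: "claim-1",
  statement: "config.toml sets version to 2",
  evidence: [
    { source: "third_party_snippet", observationId: exampleObservation.id },
    { source: "deterministic_tool", observationId: exampleObservation.id },
  ],
};

console.log(strongestSource(exampleClaim)); // "deterministic_tool"
```

Keeping the source of each piece of evidence on the record, rather than collapsing it into a single confidence number, is what lets a later step ask where a claim came from.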

Claims become Beliefs only when the evidence is sufficient and the governance layer permits the promotion. Beliefs inform Decisions. Decisions yield Actions. Actions produce Outcomes. When Outcomes contradict prior Claims, Revisions flow back through the chain.

This is not just a logging schema. It is a runtime invariant: an agent running under Lodestar cannot promote a model-generated claim directly to a settled belief without passing through an explicit gate. The gate enforces, among other things, that evidence sourced from external documents — the agent's reading material — cannot silently become ground truth.
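One way to picture such a gate is as the only function that can construct a Belief. The sketch below assumes a deliberately simple policy (a claim needs at least one deterministic tool result to be promoted); the real governance rules are not specified here, and promoteToBelief, GateResult, and the source labels are hypothetical names.

```typescript
// Hypothetical types and policy; the actual Lodestar gate may differ.
type EvidenceSource = "deterministic_tool" | "model_judgment" | "external_document";

interface Claim {
  statement: string;
  evidence: { source: EvidenceSource }[];
}

interface Belief {
  statement: string;
  basis: EvidenceSource[];
}

type GateResult =
  | { promoted: true; belief: Belief }
  | { promoted: false; reason: string };

// Assumed policy: promotion requires at least one deterministic tool result.
// Evidence from model judgments or external documents alone is never enough.
function promoteToBelief(claim: Claim): GateResult {
  const sources = claim.evidence.map((e) => e.source);
  if (sources.length === 0) {
    return { promoted: false, reason: "no evidence" };
  }
  if (!sources.includes("deterministic_tool")) {
    return { promoted: false, reason: "no deterministic evidence" };
  }
  return { promoted: true, belief: { statement: claim.statement, basis: sources } };
}

const poisoned = promoteToBelief({
  statement: "The admin password must be sent to support",
  evidence: [{ source: "external_document" }],
});
console.log(poisoned.promoted); // false

const grounded = promoteToBelief({
  statement: "The test suite passes",
  evidence: [{ source: "deterministic_tool" }, { source: "model_judgment" }],
});
console.log(grounded.promoted); // true
```

Because a rejected promotion returns a reason rather than throwing, the refusal itself can be recorded as an event and audited later.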

The framework exposes its capabilities through four developer entry points:

  • Guard — wraps an existing agent loop and routes every tool call through the Action Kernel, every observation through the Cognitive Core, and every belief transition through the Memory Firewall.
  • Trace — consumes the resulting event log and produces a trust report. The package is @qmilab/lodestar-trace; the user-facing CLI command is lodestar report. The output is a human-readable account of what the agent observed, claimed, believed, decided, and did, with every transition justified by its supporting evidence.
  • Memory Firewall — provides governance over four orthogonal lifecycle axes — truth, retrieval, security, and freshness — for any piece of remembered information, with adapter contracts for existing memory stores such as mem0, Letta, and Zep.
  • Harness — the test side: probes now, sentinels and calibrators next. Probes exercise the governance invariants under adversarial conditions; sentinels and calibrators will measure how well the agent's prior claims about outcomes matched the outcomes that actually occurred.
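To illustrate what "four orthogonal lifecycle axes" could mean in practice, the sketch below models each axis as an independent field on a memory record. Only the axis names (truth, retrieval, security, freshness) come from the description above; the state values on each axis and the two helper functions are assumptions invented for this example, not the Memory Firewall's actual contract.

```typescript
// Axis names come from the text; every state value below is a hypothetical.
interface MemoryRecord {
  content: string;
  truth: "unverified" | "verified" | "refuted";
  retrieval: "retrievable" | "blocked";
  security: "clean" | "quarantined";
  freshness: "fresh" | "stale";
}

// Assumed rule: a record may be surfaced to the agent only if retrieval is
// allowed and the record is not quarantined.
function canRetrieve(r: MemoryRecord): boolean {
  return r.retrieval === "retrievable" && r.security === "clean";
}

// Assumed rule: a record may be treated as fact only if it is retrievable,
// verified, and fresh. Each axis is checked independently of the others.
function canTreatAsFact(r: MemoryRecord): boolean {
  return canRetrieve(r) && r.truth === "verified" && r.freshness === "fresh";
}

const quotedFromWeb: MemoryRecord = {
  content: "The API rate limit is 100 requests per minute",
  truth: "unverified",
  retrieval: "retrievable",
  security: "clean",
  freshness: "fresh",
};

console.log(canRetrieve(quotedFromWeb));    // true
console.log(canTreatAsFact(quotedFromWeb)); // false
```

Treating the axes as separate fields means a record can be retrievable without being trusted, or verified but stale, without one status overwriting another.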

The reference implementation is in TypeScript, runs on Bun, and persists the event log as append-only NDJSON with monotonic sequence numbers and payload hashes. Integration with non-TypeScript runtimes — Python agents, Claude Code, Cursor, Aider, and others — happens through a Model Context Protocol proxy mode that wraps the underlying tool surface.
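The log format described above (append-only NDJSON, monotonic sequence numbers, payload hashes) can be sketched in a few lines. This is an in-memory illustration under stated assumptions: the field names seq, type, payload, and payloadHash, the choice of SHA-256, and the EventLog and verifyLog names are all hypothetical, and writing to disk is omitted.

```typescript
import { createHash } from "node:crypto";

// Hypothetical event shape; field names are assumptions for this sketch.
interface LogEvent {
  seq: number;
  type: string;
  payload: unknown;
  payloadHash: string;
}

// SHA-256 over the JSON form of the payload (hash choice is an assumption).
function hashPayload(payload: unknown): string {
  const json = JSON.stringify(payload) ?? "null"; // undefined serializes as "null"
  return createHash("sha256").update(json).digest("hex");
}

class EventLog {
  private lines: string[] = [];
  private nextSeq = 1;

  // Appends one event; there is no method to modify or delete earlier lines.
  append(type: string, payload: unknown): LogEvent {
    const event: LogEvent = {
      seq: this.nextSeq++,
      type,
      payload,
      payloadHash: hashPayload(payload),
    };
    this.lines.push(JSON.stringify(event));
    return event;
  }

  // One JSON object per line, each terminated by a newline.
  toNdjson(): string {
    return this.lines.map((line) => line + "\n").join("");
  }
}

// Checks that sequence numbers run 1, 2, 3, ... and every hash matches its payload.
function verifyLog(ndjson: string): boolean {
  let expectedSeq = 1;
  for (const line of ndjson.split("\n")) {
    if (line === "") continue;
    const event = JSON.parse(line) as LogEvent;
    if (event.seq !== expectedSeq) return false;
    if (hashPayload(event.payload) !== event.payloadHash) return false;
    expectedSeq++;
  }
  return true;
}

const log = new EventLog();
log.append("observation", { tool: "read_file", path: "config.toml" });
log.append("claim", { statement: "config.toml sets version to 2" });
console.log(verifyLog(log.toNdjson())); // true
```

A verifier like this detects edited payloads and missing or reordered lines, which is the property that makes the log usable as an audit trail rather than a debugging aid.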


Status

The project is at pre-v0.1 implementation with v0.2 architecture. The architectural spine has been through five rounds of structured adversarial review and is stable enough to implement. Implementation is partial:

  • The full schema layer for the epistemic chain is present.
  • The append-only event log and two-phase action execution are present.
  • The Memory Firewall with its four lifecycle axes is present.
  • The Cognitive Core primitives are present: claim extraction, belief adoption, and world-model update.
  • Six governance invariants are encoded as automated probes that all pass under strict TypeScript: memory poisoning resistance, epistemic chain integrity, external-document retrieval gating, quarantine isolation, sensitivity ceiling enforcement, and the auto-observation evidence-quality gate.

The next implementation cycle assembles the developer entry points into shippable packages. The cycle after that wraps existing coding agents — Claude Code first — through an MCP proxy. The first end-to-end demonstration target is a coding agent governed from tool call to trust report, exposing its reasoning at every step.

The full roadmap to v1, including phasing for the harness infrastructure and the trust marketplace, lives in the project repository.


Research arc

Lodestar is both a shipping software project and an active research program. The project is the substrate for several lines of inquiry the lab is pursuing:

  • Epistemic governance as infrastructure. A position paper making the case that the agent stack is missing a fourth layer, and arguing for its structural separation from runtime, memory, and observability. Target: 2026.
  • A threat taxonomy for memory poisoning. A structured analysis of the attack class in which adversarial text is planted in agent-readable documents and propagates into agent beliefs. The taxonomy distinguishes the attack surface, the propagation mechanism, and the defense locus for each variant. Target: 2026.
  • Calibration in governed agents. Empirical work measuring how well an agent's prior claims about outcomes match the outcomes that actually occurred, across a corpus of governed sessions. Target: 2027 and beyond, once a sufficient corpus has accumulated.

The research outputs and the software outputs feed each other. The probes that ship in the harness are operationalizations of invariants the threat model identified; the events the calibrator will measure are the data the empirical work depends on.


Engaging with the project

The repository is open under Apache 2.0 at github.com/qmilab/lodestar. The current scaffold can be cloned, installed with Bun, and run against the included examples: examples/telenotes-governed-dev/ demonstrates the full pipeline end-to-end, and examples/doc-insight/ focuses on the auto-observation gate in isolation.

A series of writings on the design and the adversarial review process behind it is being published at nandan.me/writing.

Issues, design questions, and threat-model contributions are welcome through the repository. Substantial architectural changes should come with concrete failure cases, threat-model evidence, or implementation experience.

For research collaborations, lab affiliations, or commercial inquiries related to hosted and enterprise versions of the framework, contact the lab through qmilab.com.