Why Kvendra
AI assistants are good at writing code and bad at remembering. They lose context between sessions, they rediscover the same architecture every conversation, and the moment you swap one assistant for another your governance evaporates. Worse: secrets exfiltrate through LLM context windows the moment a tool call includes an access token in plain text.
Kvendra is the layer that solves both problems at once. A typed knowledge graph holds the structured truth of your project — components, interfaces, requirements, releases, decisions — diff-able, versioned, queryable. A local broker mediates every sensitive primitive (git, AWS, shell, npm, pip, HTTP) so the LLM only ever sees declarative requests, not raw tokens. The same Kvendra is consumed by Claude Code, Cursor, Continue — your governance model survives the choice of assistant.
Architecture
Kvendra is three independent pieces joined by
an optional fourth. They can be used together or one at a
time. The KB engine ships in two modes:
Platform self-hosted (AGPL-3.0, single-tenant,
you operate it) and Enterprise SaaS
(closed source, multi-tenant, managed). A declarative tier
flag in your project's CLAUDE.md decides which
one your skills talk to. The wire contract is identical, so
you can swap Platform for Enterprise without touching code.
Three asymmetric edges describe how the pieces coordinate. First, Skills always go through the CLI broker for sensitive primitives — that edge is mandatory by design so tokens never enter LLM context. Second, Skills talk to exactly one KB engine, the one the tier flag picked. Third, the CLI talks directly to Enterprise endpoints (vault backup, profile sync, shared workspaces) for Pro and above — without going through Skills.
Components in depth
CLI — Apache-2.0
A Rust binary published on crates.io as kvendra. Current
stable: 0.4.0, cross-platform (macOS x64/aarch64,
Linux x64, Windows x64). Owns a
zero-knowledge vault at
~/.kvendra/ (config + audit log + per-profile
secrets + HMAC-signed allowlists). Owns a
local stdio MCP broker that exposes 7
primitives (shell, git, github, aws, http, npm, pip). The
master password never leaves the CLI process and never enters
environment variables. Code-signing on Mac/Win/Linux is
deferred to v0.3.0+.
Platform — AGPL-3.0
Single-tenant KB engine. Postgres 16 plus pgvector for
storage and 1024-dim vector search. BYOK embeddings
via any OpenAI-compatible /v1/embeddings
endpoint — Ollama (mxbai-embed-large local), OpenAI,
Together, or Kvendra Cloud. Ships as 4 docker containers (db
+ platform + ollama + backup) in the reference stack.
Container images are signed with Sigstore/cosign keyless and
shipped with an SPDX SBOM so security audits can verify the
supply chain.
Enterprise — closed
Multi-tenant SaaS on AWS. Lambda-native compute. Aurora
Serverless v2 with pgvector,
schema-per-tenant isolation. RDS Proxy for
connection pooling. KMS CMK for at-rest encryption. Cognito
user pool plus dual-auth (Cognito JWT or
kvd_live_* API key). Fourteen knowledge-base (KB) tools live at
api.kvendra.cloud/v1/kb/*. Tiers: Pro, Team (in
flight), Enterprise (in flight).
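To make the dual-auth idea concrete, here is a minimal sketch of how a request handler might tell the two credential shapes apart before verifying them. Only the kvd_live_ prefix comes from the text above; the function name, the return labels, and the shape-only JWT check are assumptions for illustration, and nothing here verifies a signature.

```python
import base64
import json

API_KEY_PREFIX = "kvd_live_"  # prefix taken from the text above


def classify_credential(credential: str) -> str:
    """Return 'api_key', 'jwt', or 'unknown' from the credential's shape only.

    Hypothetical helper: it routes a credential to the right verifier,
    it does not verify anything itself.
    """
    if credential.startswith(API_KEY_PREFIX) and len(credential) > len(API_KEY_PREFIX):
        return "api_key"

    # A compact JWT is three non-empty base64url segments joined by dots.
    parts = credential.split(".")
    if len(parts) == 3 and all(parts):
        padded = parts[0] + "=" * (-len(parts[0]) % 4)  # restore stripped padding
        try:
            header = json.loads(base64.urlsafe_b64decode(padded))
        except ValueError:  # covers bad base64, bad UTF-8 and bad JSON
            return "unknown"
        if isinstance(header, dict) and "alg" in header:
            return "jwt"
    return "unknown"
```

A real handler would then pass a 'jwt' result to a Cognito token verifier and an 'api_key' result to a key lookup; both of those steps are outside this sketch.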
Skills
Two families ship today. The first is
kvendra-skills v1 (18 skills, Apache-2.0),
the Claude Code plugin built by the Kvendra team. The second
is kvendra-skills-community (4 skills,
Apache-2.0), shipped via M3 for clients with open-weights
≤14B models. Both expose the same MCP tool surface; they
differ in tool-call robustness (see
PAT-KVD-819856 for the lessons we learned).
Security model
Kvendra inverts the usual AI tooling threat model. The LLM is not trusted — it is just text — so anything sensitive lives outside its context.
Zero-knowledge vault
Master password never enters environment variables, never
appears in process args, never sits in LLM context. The vault
layout at ~/.kvendra/ is: config.toml (with an
HMAC sidecar), audit.db (SQLite WAL), sentinel.blob,
recovery_codes.json (Argon2id-hashed),
secrets/<profile_id>.blob,
allowlists/<profile_id>.yaml. All files are mode 0600.
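The 0600 rule is easy to check from outside the CLI. The sketch below walks a vault directory and reports any regular file whose permission bits differ from 0600. It is an independent illustration for POSIX systems, not part of the kvendra CLI, and the function name is hypothetical.

```python
import stat
from pathlib import Path


def files_not_0600(vault_dir: Path) -> list[Path]:
    """Return regular files under vault_dir whose mode is not exactly 0600."""
    offenders = []
    for path in sorted(vault_dir.rglob("*")):
        # Skip directories and symlinks; only regular files are checked.
        if path.is_file() and not path.is_symlink():
            mode = stat.S_IMODE(path.stat().st_mode)  # keep permission bits only
            if mode != 0o600:
                offenders.append(path)
    return offenders
```

Calling files_not_0600(Path.home() / ".kvendra") should return an empty list if every file follows the layout described above.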
Broker as mediator
Every sensitive primitive — git push, aws s3 sync, npm
publish, curl with bearer token — goes through the broker.
The LLM sends a declarative request like
aws.s3.cp src=./dist dst=s3://my-bucket profile=aws.prod.
The broker looks up the profile's HMAC-signed allowlist,
validates the call against it, fetches the credential,
executes, and returns the result. The LLM never touches a
token.
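The flow above can be sketched in a few lines. The request string is the one from the text; everything else is an assumption made for illustration: the BrokerRequest type, the key=value grammar, and an allowlist shaped as a mapping from action to required parameter prefixes. The real broker's wire format and allowlist schema are not described here and may differ.

```python
import shlex
from dataclasses import dataclass


@dataclass(frozen=True)
class BrokerRequest:
    action: str             # e.g. "aws.s3.cp"
    params: dict[str, str]  # e.g. {"src": "./dist", "dst": "s3://my-bucket"}


def parse_request(line: str) -> BrokerRequest:
    """Parse 'action key=value ...' into a BrokerRequest (assumed grammar)."""
    tokens = shlex.split(line)
    if not tokens:
        raise ValueError("empty request")
    action, *pairs = tokens
    params = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        params[key] = value
    return BrokerRequest(action, params)


def under_prefix(value: str, prefix: str) -> bool:
    """True if value is the prefix itself or a path below it.

    Comparing on a '/' boundary keeps 's3://my-bucket-evil' from
    matching an allowlist entry for 's3://my-bucket/'.
    """
    base = prefix.rstrip("/")
    return value == base or value.startswith(base + "/")


def is_allowed(request: BrokerRequest, allowlist: dict[str, dict[str, str]]) -> bool:
    """Deny by default; allow only listed actions whose params match every rule."""
    rules = allowlist.get(request.action)
    if rules is None:
        return False
    return all(under_prefix(request.params.get(key, ""), prefix)
               for key, prefix in rules.items())
```

Only after a check like is_allowed passes would a broker fetch the credential and execute, so the credential never needs to appear in the request or in the result returned to the LLM.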
Signed allowlists
Each profile has a YAML allowlist (e.g. "only s3 cp into
s3://my-bucket/, never another bucket"). The
YAML is signed with an HMAC keyed by the master password, so
any out-of-band edit invalidates the signature and the broker
refuses to use it until you re-sign with
kvendra secret set-allowlist.
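The signing idea can be illustrated with the standard library. This is a sketch under stated assumptions, not the CLI's actual scheme: the text says only that the HMAC is keyed by the master password, so the PBKDF2 key derivation, the salt, the iteration count, and the SHA-256 choice below are all placeholders.

```python
import hashlib
import hmac


def derive_key(master_password: str, salt: bytes) -> bytes:
    """Stretch the password into a 32-byte key (PBKDF2 is an assumed stand-in)."""
    return hashlib.pbkdf2_hmac("sha256", master_password.encode("utf-8"), salt, 200_000)


def sign_allowlist(yaml_bytes: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the exact bytes of the allowlist file."""
    return hmac.new(key, yaml_bytes, hashlib.sha256).hexdigest()


def verify_allowlist(yaml_bytes: bytes, key: bytes, signature: str) -> bool:
    """Constant-time comparison; any edited byte makes this return False."""
    return hmac.compare_digest(sign_allowlist(yaml_bytes, key), signature)
```

Because the tag covers the raw bytes, even a whitespace change made outside the CLI fails verification, which matches the re-sign requirement described above.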
Audit log
Every primitive call writes a row to audit.db.
Rows are chained with an HMAC chain
(kvendra/audit-hmac/v1) so deletion or rewriting
is detectable. Read-only inspection works via direct SQLite
read; full verification
(kvendra audit --verify) needs the master
password.
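A hash chain of this kind can be sketched as follows. The label kvendra/audit-hmac/v1 comes from the text; how it is mixed into each tag, the all-zero genesis value, the row layout, and the use of HMAC-SHA256 are assumptions for illustration and may not match the real audit.db format.

```python
import hashlib
import hmac

DOMAIN = b"kvendra/audit-hmac/v1"  # label from the text; its exact use is assumed
GENESIS = b"\x00" * 32             # assumed starting value for the first row


def chain_tag(key: bytes, prev_tag: bytes, payload: bytes) -> bytes:
    """Tag a row over the previous row's tag plus this row's payload."""
    return hmac.new(key, DOMAIN + b"\x00" + prev_tag + payload, hashlib.sha256).digest()


def append_row(log: list[tuple[bytes, bytes]], key: bytes, payload: bytes) -> None:
    """Append (payload, tag), linking the tag to the current last row."""
    prev = log[-1][1] if log else GENESIS
    log.append((payload, chain_tag(key, prev, payload)))


def verify_chain(log: list[tuple[bytes, bytes]], key: bytes) -> bool:
    """Recompute every tag in order; a rewritten or removed row breaks the chain."""
    prev = GENESIS
    for payload, tag in log:
        if not hmac.compare_digest(chain_tag(key, prev, payload), tag):
            return False
        prev = tag
    return True
```

Verification needs the key, which is consistent with the full check requiring the master password while plain reads do not. In this simplified form, dropping rows from the end of the log is not detected; catching that would need the latest tag stored somewhere else.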
Capabilities & limitations
Today's surface, written down so it cannot be confused with marketing.
What Kvendra does today
- Typed graph entities (projects, components, interfaces, requirements, releases, decisions, runbooks, glossaries, more).
- Transactions with explicit activate or cancel.
- Semantic vector search over 1024-dim L2-normalised embeddings.
- BYOK embeddings via any OpenAI-compatible endpoint.
- 7 broker primitives — shell, git, github, aws, http, npm, pip.
- 18-skill Claude Code orchestrator plugin (Apache-2.0).
- Multi-tenant SaaS with schema-per-tenant + RDS Proxy + KMS.
- Chained audit log (HMAC chain).
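The L2 normalisation mentioned in the search item matters because, for unit-length vectors, cosine similarity is just a dot product. The sketch below shows that ranking step in plain Python on tiny vectors. It is an illustration only: Kvendra stores vectors in pgvector, and the function names and in-memory corpus here are hypothetical.

```python
import math


def l2_normalise(vector: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vector]


def top_k(query: list[float], corpus: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Rank pre-normalised corpus vectors by dot product with the normalised query."""
    q = l2_normalise(query)
    scored = [(sum(a * b for a, b in zip(q, vec)), key) for key, vec in corpus.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(key, score) for score, key in scored[:k]]
```

With real embeddings the vectors would have 1024 dimensions and the ranking would run inside the database rather than in application code.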
What Kvendra is not
- Not a chat memory cache — it stores structure, not transcripts.
- Not a vector database product — vectors are a search index, not the data model.
- Not a CI/CD platform — Kvendra coordinates work; CI still runs in GitHub Actions / Jenkins / etc.
- Not an agent runtime — Claude Code, Cursor run the agent loop.
What is deferred
- Code-signed binaries on macOS / Windows / Linux — v0.3.0+.
- Production Helm chart — M3 deliverable.
Decisions
The canonical architectural decisions, each in two or three lines. Full text lives in the KB.
- Open-core split. CLI Apache-2.0, Platform AGPL-3.0, Enterprise closed. The wire contract is identical on both KB engines so users can self-host or move to hosted without code changes. (ADR-KVD-1823F7)
- Domain split. kvendra.com is the marketing site and docs portal. kvendra.cloud hosts the app, the dashboard, and billing. Clean separation between selling and serving. (ADR-KVD-69C683)
- BYOK embeddings. Platform consumes any OpenAI-compatible /v1/embeddings endpoint. No vendor lock-in. (ADR-KVD-PLATFORM-562CE8)
- Claude Code as upstream agent runtime. Skills ship as overlays that ride Claude Code's native MCP + plugin manager surface, instead of forking an agent loop. (ADR-KVD-SKILLS-552A8F)
- Postgres schema-per-tenant. Multi-tenant isolation via one Postgres schema per workspace, lazy-provisioned on first call. (ADR-KVD-ENTERPRISE-49A9EF)
- KMS CMK with encryption context. One customer-managed key per environment, with the tenant_id bound into the KMS encryption context for cross-tenant defence in depth. (ADR-KVD-ENTERPRISE-D8F840)
- Token-vending broker mode. Pro+ CLI receives short-lived scoped tokens from Enterprise for backup and sync, never long-lived credentials. (ADR-KVD-ENTERPRISE-8FB944)
Design principles
Seven principles guide how we ship the self-hosted community
track. Lifted verbatim from ROAD-KVD-716183.
- Strict OSS coherence. Apache, AGPL, MIT. Freeware (e.g. LM Studio) is excluded — license must be true OSS to be on the reference path.
- LLM pluggable. Ollama is the documented starting point. Any provider speaking the OpenAI-compatible contract works.
- Embeddings via OpenAI-compatible API. Configure via EMBEDDINGS_BASE_URL, EMBEDDINGS_MODEL, and EMBEDDINGS_API_KEY.
- Zero-knowledge threat model preserved. The CLI stays on the host, never inside a docker container with the master password injected as an env var.
- Trust model verifiable. Sigstore/cosign keyless signatures plus SPDX SBOM plus sha256 manifests on every released artefact.
- Build-from-source first-class. Every component ships full source and reproducible build instructions — audit-friendly for regulated industries.
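As an illustration of the embeddings principle, the sketch below builds an OpenAI-compatible embeddings request from the three variables named in the list. The variable names are from the text; the function name is hypothetical, and it is an assumption that EMBEDDINGS_BASE_URL holds the server root without the /v1 path and that the API key may be empty for a local provider.

```python
import json
import os
import urllib.request
from collections.abc import Mapping


def build_embeddings_request(texts: list[str],
                             env: Mapping[str, str] = os.environ) -> urllib.request.Request:
    """Build (but do not send) a POST to an OpenAI-compatible /v1/embeddings endpoint."""
    base_url = env["EMBEDDINGS_BASE_URL"].rstrip("/")  # assumed to exclude "/v1"
    model = env["EMBEDDINGS_MODEL"]
    api_key = env.get("EMBEDDINGS_API_KEY", "")        # assumed optional

    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"

    body = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/embeddings", data=body, headers=headers, method="POST"
    )
```

Passing the result to urllib.request.urlopen sends it; an OpenAI-compatible server answers with a JSON object whose data list carries one embedding per input. Swapping providers then means changing the three variables, not the code.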
Getting started
Two ways to run Kvendra. Pick the one that matches how you ship.
≤3 min to your first MCP call.
Sign up, copy the MCP URL into Claude Code, you're running. Managed KB, vault backup across machines, embeddings included. $15/mo · free tier, no card required.
- Zero local infra — skip docker, env vars, embeddings setup
- Profile sync across your devices
- Same MCP wire as self-host — no migration cost later
Your hardware. Your data.
Platform AGPL · CLI Apache · Skills Apache. Run the whole stack on docker compose with three embedding routes. The /platform install page is the source of truth — quickstart, decision matrix, env reference and gotchas.
- CLI vault: cargo install kvendra
- Platform: docker run kvendra/kvendra-platform
- Skills: /plugin install kvendra-skills