Architecture Overview
What is keep?
keep is a reflective memory system. It gives agents a comprehensive tool for persistent indexing, tagging, entity relationship management, summarization, semantic and timeline analysis, and powerful contextual recall. It's designed as an agent skill for Claude Code, OpenClaw, LangChain/LangGraph, and other agentic environments, enabling agents to remember information across sessions over time.
Published by Hugh Pyle, "inguz ᛜ outcomes", under the MIT license. Contributions are welcome; code is conversation, "right speech" is encouraged.
Core Concept
Every stored note has:
- ID: URI or custom identifier
- Summary: Human-readable text (stored, searchable)
- Embedding: Vector representation (for semantic search)
- Tags: Key-value metadata (for filtering)
- Timestamps: Created/updated/accessed (auto-managed)
- Version History: Previous versions archived automatically on update
- Parts: Optional structural decomposition (from analyze)
The original document content is not stored — only the summary and embedding.
Architecture
keep is layered. Surface clients (CLI, MCP, LangChain, Claude Desktop bundle)
are thin wrappers that talk to a long-running daemon over HTTP. The daemon
hosts a Keeper, which composes provider, store, action, and flow modules.
Background work runs out-of-band on the daemon's queues.
┌────────────────────────────────────────────────────────────────────────┐
│ Surface clients │
│ ┌──────────┐ ┌─────────┐ ┌──────────────┐ ┌────────────────────┐ │
│ │ cli_app │ │ mcp.py │ │ langchain/ │ │ mcpb.py (Claude │ │
│ │ (typer) │ │ (stdio) │ │ adapters │ │ Desktop bundle) │ │
│ └────┬─────┘ └────┬────┘ └──────┬───────┘ └──────┬─────────────┘ │
└───────┼─────────────┼──────────────┼─────────────────┼─────────────────┘
│ │ │ │
│ HTTP (loopback, token-auth, host-header guarded)
▼ ▼ ▼ ▼
┌────────────────────────────────────────────────────────────────────────┐
│ Daemon (daemon.py / daemon_server.py / daemon_client.py) │
│ Routes: /v1/notes, /v1/notes/{id}, /v1/notes/{id}/tags, │
│ /v1/notes/{id}/context, /v1/search, /v1/flow, │
│ /v1/analyze, /v1/ready, /v1/health, /v1/admin/* │
└──────────────────────────────┬─────────────────────────────────────────┘
│
▼
┌────────────────────────────────────────────────────────────────────────┐
│ Keeper (api.py) │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ api.py:Keeper = ProviderLifecycleMixin │ │
│ │ + BackgroundProcessingMixin │ │
│ │ + SearchAugmentationMixin │ │
│ │ + ContextResolutionMixin │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ Implements high-level put/find/get/tag/move/delete/revert/analyze. │
│ Many user-visible operations are dispatched through actions/. │
│ Stable execution boundary: run_flow() over named state docs. │
└────────────┬───────────────────┬────────────────────┬──────────────────┘
│ │ │
▼ ▼ ▼
┌──────────────────┐ ┌─────────────────┐ ┌───────────────────────────┐
│ Providers │ │ Storage │ │ Background work │
│ (providers/) │ │ backends │ │ │
│ embedding / │ │ DocumentStore │ │ pending_summaries.py │
│ summarization / │ │ (SQLite) │ │ work_queue.py / │
│ document / │ │ ChromaStore │ │ work_processor.py │
│ media / OCR / │ │ (vectors) │ │ task_client.py │
│ analyzer │ │ PendingQueue │ │ (hosted delegation) │
│ │ │ → backend.py │ │ planner_stats.py │
└──────────────────┘ └─────────────────┘ └───────────────────────────┘
Layers
1. Surface clients
cli_app.py — Typer command app
- Most commands are HTTP calls to the daemon via daemon_client.http_request
- A small set of commands stay local: setup, daemon control, MCP server, data
  import/export
- Auto-spawns the daemon on first use
The CLI is intentionally thin for ordinary note operations (put, get,
find, tag, flow execution): it resolves shell concerns, sends daemon HTTP
requests, and renders responses. Commands that need direct process control or
local filesystem traversal remain CLI-owned for now: setup/config discovery,
daemon lifecycle, MCP stdio startup, bulk directory ingestion, and data
import/export. Those commands may construct a local Keeper or use local graph
helpers, but that is an explicit exception to the daemon-backed command path.
mcp.py — MCP stdio server
- KeepFastMCP subclass of FastMCP
- Three tools: keep_flow, keep_prompt, keep_help
- Dynamic prompt exposure: prompt docs tagged with mcp_prompt become native
  MCP prompts (protocol-level list_prompts / get_prompt)
- MCP resources: keep://now (current note) and keep://{id} (any note by ID)
- Thin HTTP layer: every operation delegates to the daemon via _post / _get.
  No local Keeper, no models, no database.
- Structured output: keep_prompt returns CallToolResult with structuredContent
langchain/ — Framework adapters
- KeepStore (LangGraph BaseStore adapter)
- KeepNotesToolkit (LangChain tools)
- KeepNotesRetriever (BaseRetriever with now-context)
- KeepNotesMiddleware (LCEL runnable for auto-injection)
- See the LangChain section below.
mcpb.py — Claude Desktop bundle
- Builds the .mcpb archive consumed by Claude Desktop's MCP loader
2. Daemon layer
daemon.py — Daemon entry point
- Minimal keepd --store PATH or python -m keep.daemon --store PATH runner
- Constructs a Keeper with defer_startup_maintenance=True and runs the
  pending-work daemon loop
daemon_server.py — HTTP query server
- DaemonServer exposes the daemon HTTP API with the routes shown above
- Auth: bearer token in Authorization header (random per daemon, persisted
  in ~/.keep/.processor.token)
- Local mode defaults to loopback bind + strict loopback Host allowlist
- Remote mode is explicit (--bind / KEEP_DAEMON_BIND_HOST, optional
  --advertised-url / KEEP_DAEMON_ADVERTISED_URL) and uses a mode-aware
  Host allowlist derived from the bind host and advertised URL
- Non-loopback binds require explicit trusted-proxy acknowledgment
  (--trusted-proxy / KEEP_DAEMON_TRUSTED_PROXY=1); keep does not provide
  in-process TLS for the daemon HTTP server
- Wildcard remote binds (0.0.0.0 / ::) require advertised_url so the
  Host check remains active
- GET /v1/ready and GET /v1/health publish capability and network
  descriptors so remote clients can negotiate support explicitly
- Request handlers refuse new work while shutdown is in progress
- OpenTelemetry trace context is propagated from CLI/MCP into daemon spans
- Daemon error payloads include a request_id; CLI and remote clients include
  it in surfaced errors so an operator can correlate a failure with daemon logs
- Set KEEP_TRACE=1 on the daemon to emit timing-tree traces. Storage spans
  cover SQLite document-store operations and ChromaDB embedding/metadata
  queries with low-cardinality attributes only; note content and raw SQL are
  intentionally omitted from trace attributes.
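The bearer-token and Host-allowlist checks listed above can be sketched as a single predicate. This is an illustration under stated assumptions, not the daemon's code: `is_request_allowed`, `_hostname`, and their parameters are hypothetical names, and the way the port is stripped from the Host header is a guess at one reasonable rule.

```python
import hmac


def _hostname(host: str) -> str:
    """Strip an optional ':port' suffix from a Host header value."""
    if host.startswith("["):  # bracketed IPv6 literal, e.g. "[::1]:8080"
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    return host.split(":", 1)[0]


def is_request_allowed(headers: dict[str, str], token: str, allowed_hosts: set[str]) -> bool:
    """Return True when the Host header is allowlisted and the bearer token matches."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    # Host check: reject anything not on the allowlist before looking at the token.
    if _hostname(normalized.get("host", "")).lower() not in allowed_hosts:
        return False

    # Token check: compare_digest runs in constant time for equal-length inputs,
    # so it does not reveal how much of the token matched.
    expected = f"Bearer {token}".encode()
    supplied = normalized.get("authorization", "").encode()
    return hmac.compare_digest(supplied, expected)
```

In local mode the allowlist in this sketch would hold only loopback names; in remote mode it would also hold the host from the advertised URL.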
daemon_client.py — Daemon discovery and HTTP
- get_port(): locate or auto-spawn the daemon for a store
- http_request(): shared HTTP plumbing used by both CLI and MCP
The CLI and MCP layers each have their own retry-on-disconnect logic so they can gracefully follow a daemon that has restarted on a new port.
3. Keeper (core API)
api.py — Main facade
- Keeper is composed from four mixins for organizational reasons:
  - ProviderLifecycleMixin (_provider_lifecycle.py): lazy init with
    double-checked locking, GPU release helpers
  - BackgroundProcessingMixin (_background_processing.py): task dispatch,
    processing pipeline, process spawning
  - SearchAugmentationMixin (_search_augmentation.py): deep-follow, recency
    decay, RRF fusion
  - ContextResolutionMixin (_context_resolution.py): display-context
    assembly, prompt rendering, meta-doc resolution, similar-for-display
- Coordinates providers and stores
- Implements query operations with recency decay
- Content-based embedding dedup (skips re-embedding when content matches an
  existing document)
actions/ — Action implementations
A package of focused modules implementing user-visible operations behind
Keeper/flows:
analyze find_supernodes ocr resolve_meta
auto_tag generate put resolve_stubs
delete get resolve_duplicates stats
describe list_parts resolve_edges summarize
extract_links list_versions traverse tag
find move ...
Most are dispatched from state-doc flows; some are still called directly
from Keeper methods during the migration to flows.
protocol.py — Abstract interfaces
- FlowHostProtocol: minimal backend-neutral boundary (run_flow + close);
  this is the stable semantic boundary shared by local and hosted stores
- KeeperProtocol (extends FlowHostProtocol): richer object API used
  during migration to the flow boundary
- VectorStoreProtocol, DocumentStoreProtocol, PendingQueueProtocol:
  storage backend contracts
- Enables pluggable backends (local SQLite/ChromaDB or remote
  PostgreSQL/pgvector)
flow_client.py — Shared flow-backed wrappers
- Convenience helpers (get, put, find, tag, delete, move, now)
  over FlowHostProtocol.run_flow
- Used by both Keeper and RemoteKeeper: one semantic path for local and
  hosted
- Parameter normalization and response coercion only; no semantic behavior
remote.py — Remote client
- HTTP client implementing FlowHostProtocol
- Public methods delegate through flow_client wrappers, same as local
  Keeper
- Connects to the hosted REST API (keepmem)
4. Flow runtime (state docs)
The Keeper exposes run_flow(state, params, ...) as its stable execution
boundary. A "state" is a named YAML state-doc that declares rules,
predicates, and actions. The runtime evaluates them and dispatches actions
from actions/ against the Keeper.
state_doc.py — Loader, compiler, evaluator
- Loads .state/* documents from the keep store
- Compiles CEL predicates
- Defines AsyncActionEncountered so a foreground flow can hand off to the
  background queue mid-evaluation
state_doc_runtime.py — Synchronous runtime
- Evaluates state docs with inline action execution
- Used for the read/query path: query resolution, context assembly, deep find
- Enforces a per-call tick budget
system_docs.py / builtin_state_docs.py — System doc inventory
- Bundled .state/*, .tag/*, .prompt/*, .now, etc., installed into the
  store on first use
- IDs are stable, e.g. _system:now → .now, _tag:act → .tag/act
flow_env.py — Local flow execution environment
- LocalFlowEnvironment glues the runtime, action runner, and Keeper
  together for local execution
Flows that must complete before returning to the caller (find, get-context, deep-find) run synchronously in this runtime. Write-side flows can suspend and continue on the background work queue.
5. Background work
pending_summaries.py — Pending task queue
- SQLite-backed (pending_summaries.db)
- Deferred processing: summarize, embed, reindex, ocr, analyze
summarize,embed,reindex,ocr,analyze - Atomic dequeue with PID claims; stale-claim recovery for crashed processors
- Exponential backoff on failure (30s → 1h); dead-letter for exhausted retries
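The retry schedule above (30s growing to a 1h ceiling) can be sketched as a capped exponential. The source only gives the two endpoints, so the doubling factor, the function name `retry_delay_seconds`, and its parameters are assumptions made for illustration.

```python
def retry_delay_seconds(attempt: int, base: float = 30.0, cap: float = 3600.0) -> float:
    """Delay before retry number `attempt` (0-based): base, 2x base, 4x base, ... up to cap."""
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Bound the exponent so very large attempt counts cannot produce huge integers;
    # the result is capped anyway. The factor of 2 is an assumption.
    return min(cap, base * (2 ** min(attempt, 32)))
```

Under these assumptions the delay reaches the one-hour cap on the eighth attempt; a task that keeps failing past its retry limit would then move to the dead-letter state described above.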
work_queue.py / work_processor.py — Direct work queue
- Backed by the continuation.db SQLite file
- Enqueue/claim/complete/fail semantics for write-side flow continuations
  that can't complete synchronously
- Reuses the legacy continue_work table schema (hence the file name)
processors.py — Content processing helpers
- Content hashing, text normalization, processing-pipeline glue used by both
the synchronous and pending paths
task_client.py / task_workflows.py — Hosted task delegation
- When config.remote is set, expensive processing can be delegated to the
  hosted backend rather than run locally
- Initialized from Keeper.__init__ when remote task delegation is configured
planner_stats.py — Flow discriminator priors
- Precomputed statistics for flow planning
- Bootstrap rebuild is enqueued from Keeper.__init__ when stats are missing
recovery.py — DB recovery
- Detects and handles malformed SQLite databases (used by document_store.py)
6. Storage backends
document_store.py — Document persistence (local)
- DocumentStore wraps SQLite (documents.db)
- Stores summaries, tags, timestamps, content hashes
- Version history: archives previous versions on update
- Parts table: structural decomposition from analyze
- Schema versioning + migrations (current SCHEMA_VERSION = 14)
- FTS index for keyword fallback search
store.py — Vector persistence (local)
- ChromaStore wraps ChromaDB
- Handles vector storage, similarity search, metadata queries
- Versioned embeddings: {id}@v{N} for history
- Part embeddings: {id}@p{N} for structural decomposition
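The {id}@v{N} and {id}@p{N} naming scheme can be illustrated with a pair of builders and a parser. The helper names below are hypothetical, and the parsing rule (a trailing `@v` or `@p` followed by digits) is an assumption drawn only from the two patterns shown above.

```python
import re
from typing import Optional

# Trailing "@v<digits>" marks a version vector, "@p<digits>" a part vector.
_VECTOR_ID = re.compile(r"^(?P<doc_id>.+)@(?P<kind>[vp])(?P<n>\d+)$")


def version_vector_id(doc_id: str, version: int) -> str:
    """Build the vector-store ID for an archived version."""
    return f"{doc_id}@v{version}"


def part_vector_id(doc_id: str, part: int) -> str:
    """Build the vector-store ID for a structural part."""
    return f"{doc_id}@p{part}"


def split_vector_id(vector_id: str) -> tuple[str, str, Optional[int]]:
    """Return (doc_id, kind, number); kind is 'current', 'version', or 'part'."""
    match = _VECTOR_ID.match(vector_id)
    if match is None:
        return vector_id, "current", None
    kind = "version" if match.group("kind") == "v" else "part"
    return match.group("doc_id"), kind, int(match.group("n"))
```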
backend.py — Pluggable storage factory
- Creates store backends based on configuration
- External backends register via the keep.backends entry point
- Returns StoreBundle (doc store, vector store, pending queue, work queue,
  is_local flag)
paths.py / config.py — Paths and config
- config.py detects available providers (platform, API keys, Ollama),
  persists choices in keep.toml, and auto-creates on first use
- paths.py resolves the store/config directories, honoring KEEP_CONFIG
  and KEEP_STORE_PATH
7. Providers
All providers register through providers/base.py:ProviderRegistry. The
registry is populated lazily on first use so optional dependencies don't
break startup.
Embedding Providers
Generate vector representations for semantic search.
- gemini: API-based, Google (GEMINI_API_KEY or GOOGLE_CLOUD_PROJECT
  for Vertex AI)
- voyage: API-based, Anthropic's recommended partner (VOYAGE_API_KEY)
- openai: API-based, high quality (OPENAI_API_KEY)
- openrouter: API-based routing layer over multiple model providers
  (OPENROUTER_API_KEY)
- mistral: API-based (MISTRAL_API_KEY)
- ollama: Local server, auto-detected, any model (OLLAMA_HOST)
- sentence-transformers: Local, CPU/GPU, no API key
- mlx: Apple Silicon optimized, local, no API key
Dimension is determined by the model and must be consistent across indexing
and queries. Embeddings are cached through providers/embedding_cache.py
(embedding_cache.db).
openai also supports base_url for local or self-hosted OpenAI-compatible
servers such as llama.cpp llama-server, vLLM, LM Studio, or LocalAI. That is
distinct from the openrouter provider, which has its own model naming and
headers even though both use the OpenAI SDK underneath.
Summarization Providers
Generate human-readable summaries from content.
- anthropic: LLM-based (ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN)
- openai: LLM-based, high quality (OPENAI_API_KEY)
- openrouter: LLM-based routing layer (OPENROUTER_API_KEY)
- gemini: LLM-based, Google (GEMINI_API_KEY or GOOGLE_CLOUD_PROJECT)
- mistral: LLM-based (MISTRAL_API_KEY)
- ollama: LLM-based, local server, auto-detected (OLLAMA_HOST)
- mlx: LLM-based, local, Apple Silicon
- truncate: Simple text truncation (fallback)
- first_paragraph: First-paragraph extraction (non-LLM)
- passthrough: Store content as-is (with length limit)
Contextual Summarization. When documents have user tags (domain, topic, project, etc.), the summarizer receives context from related items. This produces summaries that highlight relevance to the tagged context rather than generic descriptions.
How it works:
- When processing pending summaries, the system checks for user tags
- Finds similar items that share any of those tags (OR-union)
- Boosts scores for items sharing multiple tags (+20% per additional match)
- Top 5 related summaries are passed as context to the LLM
- The summary reflects what's relevant to that context
Example: indexing a medieval text with domain=practice produces a summary
highlighting its relevance to contemplative practice, not just "a 13th-century
guide for anchoresses."
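The selection of context items described above (OR-union on shared tags, +20% per additional match, top 5) can be sketched as follows. The candidate shape (`id`, `score`, `tags`), the function name `rank_related`, and the choice to apply the boost multiplicatively are assumptions; the source states only the percentages and the cut-off.

```python
def rank_related(user_tags: dict[str, str], candidates: list[dict], top_k: int = 5) -> list[dict]:
    """Pick the context items handed to the summarizer.

    Each candidate is a dict with 'id', 'score' (similarity), and 'tags'.
    """
    ranked = []
    for cand in candidates:
        # Count user tags whose key and value both match this candidate.
        shared = sum(1 for key, value in user_tags.items() if cand["tags"].get(key) == value)
        if shared == 0:
            continue  # OR-union: at least one shared tag is required
        # First match qualifies the item; each additional match adds 20%.
        boost = 1.0 + 0.2 * (shared - 1)
        ranked.append({**cand, "score": cand["score"] * boost})
    ranked.sort(key=lambda c: c["score"], reverse=True)
    return ranked[:top_k]
```

With this rule an item sharing two tags at similarity 0.50 outranks an item sharing one tag at 0.55, which is the intended effect of the boost.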
Tag changes trigger re-summarization. When user tags are added, removed, or changed on an existing document, it's re-queued for contextual summarization even if content is unchanged. The existing summary is preserved until the new one is ready.
Non-LLM providers (truncate, first_paragraph, passthrough) ignore
context.
Document Providers
Fetch content from URIs with content regularization.
- composite: Handles file://, https:// (default)
- Extensible for s3://, gs://, etc.
Content regularization:
- PDF: text extracted via pypdf; scanned pages (no extractable text)
  flagged for background OCR
- HTML: text extracted via BeautifulSoup (scripts/styles removed)
- DOCX/PPTX: text + tables/slides extracted via python-docx /
  python-pptx; auto-tags: author, title
- Audio (MP3, FLAC, OGG, WAV, AIFF, M4A, WMA): metadata via tinytag;
  auto-tags: artist, album, genre, year
- Images (JPEG, PNG, TIFF, WEBP): EXIF metadata via Pillow; auto-tags:
  dimensions, camera, date; flagged for background OCR
- Other formats: treated as plain text
Provider-extracted tags merge with user tags (user wins on collision). Content regularization ensures that both embedding and summarization receive clean text.
Content Extractor / OCR Providers
Extract text from scanned PDFs and images via optical character recognition.
- mistral: Cloud OCR via mistral-ocr-latest; high quality, images and
  PDFs (MISTRAL_API_KEY)
- ollama: Uses glm-ocr model (auto-pulled on first use)
- mlx: Apple Silicon; uses mlx-vlm vision models
OCR runs in the background via the pending queue (keep daemon), not
during put(). The flow:
- During put(), content regularization detects scanned PDF pages (no
  extractable text) or image files
- A placeholder is stored immediately so the item is indexed right away
- The pages/image are enqueued for background OCR processing
- keep daemon picks up the OCR task, renders pages to images, runs OCR,
  cleans and scores the text
- The full OCR text replaces the placeholder and the item is re-embedded
Design points:
- Auto-detected: Ollama (with glm-ocr) > MLX > None. No configuration needed.
- Security: Pillow decompression bomb guard (250MP limit), PDF page cap
  (1000), temp directory cleanup
- OCR text is cleaned (whitespace normalized) and confidence-scored
- Graceful degradation: no OCR provider = metadata-only indexing
Media Description Providers (optional)
Generate text descriptions from media files, enriching metadata-only content.
- mlx: Apple Silicon; vision (mlx-vlm) + audio transcription
  (mlx-whisper)
- ollama: Local server; vision models only (llava, moondream,
  bakllava)
Media description runs in Keeper.put() between fetch and upsert.
Descriptions are appended to the metadata content before embedding/
summarization, making media files semantically searchable by their visual or
audio content.
Design points:
- Only triggered for non-text content types (image/*, audio/*)
- Lazy sub-provider loading: MLX composite only loads VLM for the first
  image, whisper for the first audio
- GPU-locked via LockedMediaDescriber (same file-lock pattern as
  summarization, see model_lock.py)
- Graceful degradation: errors never block indexing
- Optional dependency: pip install keep-skill[media] for MLX models
Analyzer Providers
Decompose content into structural parts with their own summaries, tags, and
embeddings (analyzers.py + providers/base.py:AnalyzerProvider).
- SlidingWindowAnalyzer (default): token-budgeted sliding windows with
  XML-style target marking, suited to small local models
- SinglePassAnalyzer: single-pass JSON decomposition for large-context
  models
Parts are produced by analyze() and stored as their own rows in
document_parts, with vectors at {id}@p{N} in the vector store.
Other provider modules
- hermes/: Hermes provider package (alternative inference backend)
- providers/embedding_cache.py: CachingEmbeddingProvider wrapper used by
  ProviderLifecycleMixin
- providers/url_validation.py: shared URL validation for HTTP-based
  providers
Storage Layout
store_path/ # default: ~/.keep
├── keep.toml # Provider and store configuration
├── documents.db # SQLite: summaries, tags, versions, parts, FTS
├── chroma/ # ChromaDB persistence (vectors + metadata)
├── pending_summaries.db # Pending queue (summarize/embed/ocr/reindex/analyze)
├── continuation.db # Direct work queue + flow continuations
├── embedding_cache.db # SQLite cache for embeddings
├── planner_stats.db # Flow planner priors
├── .processor.pid # Daemon PID file
├── .processor.token # Daemon HTTP auth token
├── .processor.port # Daemon HTTP port
├── .processor.version # Code version the daemon was started under
└── keep-ops.log[.N] # Persistent operations log (rotating)
documents.db contains the documents, document_versions, and
document_parts tables (plus FTS shadow tables). The Chroma directory uses
ChromaDB's own on-disk format (sqlite + parquet segment files); keep does
not impose its own structure on it.
Data Flow
Indexing: put(uri=…) or put(content=…)
URI or content
│
▼
┌─────────────────┐
│ Fetch / use │ ← DocumentProvider (for URIs only)
│ input │
└────────┬────────┘
│ raw bytes
▼
┌─────────────────┐
│ Content │ ← Extract text from HTML/PDF/DOCX/PPTX
│ regularization │ (scripts/styles removed; scanned pages flagged)
└────────┬────────┘
│ clean text (+ OCR page list if scanned)
▼
┌─────────────────┐
│ Media │ ← Optional: vision description (images)
│ enrichment │ or transcription (audio) appended
└────────┬────────┘
│ enriched text
▼
┌──────────────────────────────────────────────┐
│ DocumentStore.upsert + placeholder summary │
│ - tags, timestamps, content hash │
│ - previous version archived if updated │
└─────────────┬────────────────────────────────┘
│
├─► PendingQueue.enqueue("summarize")
├─► PendingQueue.enqueue("embed")
└─► PendingQueue.enqueue("ocr") (if scanned PDF or image)
│
▼
┌──────────────────────┐
│ Background processor │
│ (pending_summaries / │
│ work_processor) │
└──────────┬───────────┘
│
┌──────────────────┼──────────────────┐
│ │ │
▼ ▼ ▼
summarize() embed() OCR
│ │ │
▼ ▼ ▼
DocumentStore. VectorStore. DocumentStore
update_summary upsert / re-summarize +
upsert_version re-embed
Versioning on update
- DocumentStore archives the current version before updating
- VectorStore adds a versioned embedding ({id}@v{N}) if content changed
- Same content (hash match) skips duplicate embedding
Embedding dedup
- Before computing an embedding, the Keeper checks if another document has
  the same content hash
- If a donor exists with a compatible embedding, it copies that vector
  instead of re-embedding
- Safety: dimension check prevents cross-model contamination
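The dedup steps above can be sketched with an in-memory index keyed by content hash. This is a toy model, not keep's implementation: SHA-256, the `index` dictionary, and the function names are assumptions, and the real donor lookup goes through the stores.

```python
import hashlib
from typing import Callable, Optional


def content_hash(text: str) -> str:
    """Hash used as the dedup key (SHA-256 is an assumption, not confirmed by the source)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def find_donor_vector(text: str, index: dict[str, list[float]], expected_dim: int) -> Optional[list[float]]:
    """Return a copy of an existing vector for identical content, if its dimension is compatible."""
    donor = index.get(content_hash(text))
    # Dimension check: a vector from a different embedding model must not be reused.
    if donor is None or len(donor) != expected_dim:
        return None
    return list(donor)


def embed_with_dedup(
    text: str,
    index: dict[str, list[float]],
    expected_dim: int,
    embed: Callable[[str], list[float]],
) -> list[float]:
    """Reuse a donor vector when possible; otherwise call the provider and record the result."""
    vector = find_donor_vector(text, index, expected_dim)
    if vector is None:
        vector = embed(text)
        index[content_hash(text)] = vector
    return vector
```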
Retrieval: find(query)
query text
│
▼
embed() ← EmbeddingProvider
│
│ query vector
▼
┌───────────────────┐
│ VectorStore │
│ query_embedding() │ ← cosine similarity search
└─────────┬─────────┘
│
▼ results with distance scores
┌──────────────┐
│ Apply decay │ ← Recency weighting (ACT-R style)
│ score × 0.5^(days/half_life)
└──────┬───────┘
│
▼
┌──────────────┐
│ Date filter │ ← Optional --since / --until
└──────┬───────┘
│
▼
┌────────────────────────────┐
│ Augmentation │ ← deep follow, RRF fusion,
│ (SearchAugmentationMixin) │ tag boosts (when applicable)
└──────┬─────────────────────┘
│
▼
list[Item] (sorted by effective score)
find is also reachable via the flow runtime (find / find-deep state
docs), which is the path used by MCP and the LangChain retriever.
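The decay step in the retrieval diagram, score × 0.5^(days/half_life), translates directly into a small function. The function and parameter names are hypothetical, and treating a non-positive half-life as "decay disabled" is an assumption added for safety rather than documented behavior.

```python
def decayed_score(score: float, age_days: float, half_life_days: float) -> float:
    """Apply recency weighting: the score halves every `half_life_days`."""
    if half_life_days <= 0:
        # Assumption: a non-positive half-life means decay is switched off.
        return score
    # Clamp negative ages (clock skew, future timestamps) to zero so scores never grow.
    return score * 0.5 ** (max(age_days, 0.0) / half_life_days)
```

For a 30-day half-life, a match scoring 0.8 drops to 0.4 after 30 days and to 0.2 after 60, so a slightly weaker but fresher note can outrank it.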
Analyze: analyze(id)
content
│
▼
┌──────────────────────┐
│ AnalyzerProvider │ ← SlidingWindowAnalyzer (default) or
│ analyze(chunks, …) │ SinglePassAnalyzer
└──────────┬───────────┘
│ list[{summary, tags}]
▼
┌──────────────────────┐
│ Keeper.analyze │ ← Wraps into PartInfo, persists, embeds
└──────────┬───────────┘
│
├─► DocumentStore.upsert_part (rows in document_parts)
└─► VectorStore.upsert_part ({id}@p{N})
Delete / Revert
delete(id) is a flat removal:
delete(id)
│
▼
DocumentStore.delete + VectorStore.delete
(versions removed by default; pass delete_versions=False to keep history)
revert(id) is a separate operation that restores the previous version, or
falls back to delete(id) when there is no history:
revert(id)
│
▼
max_version(id)
│
├── 0 versions → delete(id)
│
└── N versions → restore previous
│
├─ get archived embedding from VectorStore (id@vN)
├─ DocumentStore.restore_latest_version()
│ (promote latest version row to current, delete version row)
├─ VectorStore.upsert restored embedding as current
├─ VectorStore.delete versioned entry (id@vN)
└─ delete stale parts (parts of the discarded version)
delete_version(id, offset) removes a specific archived version by public
selector (1=previous, -1=oldest archived, etc.).
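The revert decision above (no history means delete, otherwise promote the latest archived version) can be modeled with two dictionaries. This is a toy sketch: it ignores embeddings and parts, and the data layout and return values are assumptions made only to show the branching.

```python
def revert(doc_id: str, current: dict[str, str], versions: dict[str, list[str]]) -> str:
    """Toy model of revert.

    `current` maps ID -> current content summary.
    `versions` maps ID -> archived summaries, oldest first (so the last entry is the latest).
    """
    history = versions.get(doc_id, [])
    if not history:
        # 0 versions: revert degrades to a plain delete.
        current.pop(doc_id, None)
        return "deleted"
    # N versions: promote the latest archived row to current and remove that row.
    current[doc_id] = history.pop()
    return "restored"
```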
Key Design Decisions
1. Schema as Data
- System configuration is stored as documents in the store (.now,
  .tag/*, .prompt/*, .state/*, .meta/*)
- Bundled system docs are installed on first use by system_docs.py; they
  are then editable like any other note
- Flow definitions (.state/*) are loaded from this same store at runtime,
  so behavior is data-driven
2. Daemon-mediated state
- All non-trivial state lives in the daemon process. Surface clients are
  stateless and short-lived.
- This keeps model loading, GPU locks, embedding caches, and pending-work
  state in one place
- Exception: the Hermes integration (keep/hermes/) constructs an
  in-process Keeper directly inside the Hermes runtime. Reads (search,
  get, prompt rendering) and the synchronous part of writes go through
  this in-process Keeper for latency reasons; background work (embeddings,
  summaries, analysis) is still handled by an auto-started daemon. See
  HERMES-INTEGRATION.md.
3. Lazy Provider Loading
- Providers are registered at first use, not import time
- Avoids crashes when optional dependencies are missing
- ProviderLifecycleMixin handles double-checked locking for thread safety
  and supports GPU-memory release
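Lazy initialization with double-checked locking can be sketched as a small wrapper. `LazyProvider` and its methods are hypothetical names for illustration; the real logic lives in ProviderLifecycleMixin and may differ in detail.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyProvider(Generic[T]):
    """Create an expensive provider on first use, at most once across threads."""

    def __init__(self, factory: Callable[[], T]) -> None:
        self._factory = factory
        self._lock = threading.Lock()
        self._instance: Optional[T] = None

    def get(self) -> T:
        instance = self._instance
        if instance is None:  # first check, without the lock (fast path)
            with self._lock:
                instance = self._instance
                if instance is None:  # second check, under the lock
                    instance = self._factory()
                    self._instance = instance
        return instance

    def release(self) -> None:
        """Drop the instance so its resources (e.g. GPU memory) can be reclaimed."""
        with self._lock:
            self._instance = None
```

Because the factory runs only inside `get()`, a missing optional dependency surfaces when that provider is first requested rather than at import time.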
4. Separation of Concerns
- Storage backends are provider-agnostic (only know about vectors / metadata)
- Providers are storage-agnostic (only know about text → vectors)
- Protocols define the boundary; implementations are pluggable
5. No Original Content Storage
- Reduces storage size
- Forces meaningful summarization
- URIs can be re-fetched if needed
6. Immutable Items
- Item is a frozen dataclass
- Updates via put() return a new Item
- Prevents accidental mutation bugs
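The frozen-dataclass pattern looks like this in miniature. The fields shown are illustrative only; the real Item has its own field set, and `with_summary` is a hypothetical helper standing in for what an update returns.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Item:
    """Illustrative stand-in for keep's Item; field names here are assumptions."""

    id: str
    summary: str
    tags: tuple[tuple[str, str], ...] = ()


def with_summary(item: Item, summary: str) -> Item:
    """Return a new Item with a different summary; the original is left untouched."""
    return replace(item, summary=summary)


original = Item(id="doc:1", summary="old summary")
updated = with_summary(original, "new summary")

try:
    original.summary = "mutated"  # type: ignore[misc]
except FrozenInstanceError:
    # Assignment on a frozen dataclass raises instead of silently mutating.
    pass
```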
7. System Tag Protection
- Tags prefixed with _ are system-managed
- Source tags are filtered before storage
- A separate INTERNAL_TAGS set (in types.py) hides tags that exist for
  efficient queries but aren't user-facing
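One way to read the two rules above is as a pair of filters: one applied to caller-supplied tags before storage, one applied when tags are shown. Both function names are hypothetical, and `EXAMPLE_INTERNAL_TAGS` is a placeholder because the contents of the real INTERNAL_TAGS set are not listed here.

```python
# Placeholder only: the real INTERNAL_TAGS set is defined in types.py.
EXAMPLE_INTERNAL_TAGS = frozenset({"_example_internal"})


def filter_source_tags(source_tags: dict[str, str]) -> dict[str, str]:
    """Drop caller-supplied tags that use the reserved '_' prefix."""
    return {key: value for key, value in source_tags.items() if not key.startswith("_")}


def visible_tags(stored_tags: dict[str, str], internal: frozenset = EXAMPLE_INTERNAL_TAGS) -> dict[str, str]:
    """Hide tags that exist for query efficiency but are not user-facing."""
    return {key: value for key, value in stored_tags.items() if key not in internal}
```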
8. Document Versioning
- All documents retain history automatically on update
- Previous versions archived in the SQLite document_versions table
- Content-addressed IDs for text updates enable versioning via tag changes
- Embeddings stored for all versions (enables temporal search)
- No auto-pruning: history is preserved indefinitely
9. Version-Based Addressing
- Versions addressed by offset from current: 0=current, 1=previous,
  2=two-ago
- CLI uses @V{N} syntax for shell composition: keep get "doc:1@V{1}"
- Display format (v0, v1, v2) matches retrieval offset (-V 0, -V 1,
  -V 2)
- Offset computation assumes list_versions() returns newest-first ordering
- Security: literal ID lookup runs before @V{N} parsing to prevent
  confusion attacks
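The addressing rules above can be sketched in two steps: resolve a reference to an (id, offset) pair with the literal lookup first, then pick the version by offset from a newest-first list. The function names and the `known_ids` set are hypothetical; only the `@V{N}` syntax and the offset convention come from the text.

```python
import re

# Matches "<id>@V{<N>}" with a literal brace-wrapped offset.
_VERSION_SUFFIX = re.compile(r"^(?P<id>.+)@V\{(?P<offset>\d+)\}$")


def resolve_reference(ref: str, known_ids: set[str]) -> tuple[str, int]:
    """Return (id, offset). A literal ID match wins before any suffix parsing."""
    if ref in known_ids:
        # An ID that happens to end in "@V{N}" must not be misread as a version request.
        return ref, 0
    match = _VERSION_SUFFIX.match(ref)
    if match:
        return match.group("id"), int(match.group("offset"))
    return ref, 0


def pick_version(current: str, archived_newest_first: list[str], offset: int) -> str:
    """Offset 0 is current, 1 is previous, 2 is two-ago (list must be newest-first)."""
    if offset == 0:
        return current
    return archived_newest_first[offset - 1]
```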
10. Flow as the stable boundary
- The hosted/local boundary is run_flow(state, params), not a fixed
  object API
- Public helpers like put/find/get invoke named state docs, so the
  same flow definitions run locally and in the hosted backend
- Async actions inside a synchronous flow throw AsyncActionEncountered,
  which the runtime catches to hand off to the background work queue
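The hand-off in the last point can be illustrated with a stripped-down runner. Everything here is a stand-in: the exception class only borrows the name from state_doc.py, and `run_sync_flow`, the handler table, and the queue are hypothetical simplifications of the real runtime.

```python
from collections import deque
from typing import Callable


class AsyncActionEncountered(Exception):
    """Stand-in for the exception of the same name defined in state_doc.py."""

    def __init__(self, action: str) -> None:
        super().__init__(action)
        self.action = action


def run_sync_flow(actions: list[str], handlers: dict[str, Callable[[], None]], queue: deque) -> dict:
    """Run actions inline; on an async action, enqueue the remainder and return early."""
    completed: list[str] = []
    for position, name in enumerate(actions):
        try:
            handlers[name]()
        except AsyncActionEncountered:
            # Hand the async action and everything after it to background work.
            queue.extend(actions[position:])
            return {"status": "suspended", "completed": completed}
        completed.append(name)
    return {"status": "done", "completed": completed}
```

The caller gets a prompt response for the synchronous part, while the queued remainder is picked up by the background processor.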
LangChain / LangGraph Integration
The keep.langchain module provides framework adapters on top of the API
layer:
┌─────────────────────────────────────────────────────────────┐
│ LangChain Layer (keep/langchain/) │
│ - KeepStore LangGraph BaseStore adapter │
│ - KeepNotesToolkit LangChain tools │
│ - KeepNotesRetriever BaseRetriever with now-context │
│ - KeepNotesMiddleware LCEL runnable for auto-injection │
└──────────────────┬──────────────────────────────────────────┘
│ uses Keeper API
▼
┌─────────────────────────────────────────────────────────────┐
│ Keeper (api.py) → daemon → store │
└─────────────────────────────────────────────────────────────┘
KeepStore maps LangGraph's namespace/key model to keep's tag system via
configurable namespace_keys. Namespace components become regular keep
tags, visible to CLI and all query methods. Tag filtering is a **pre-filter
on the vector search**, making tags suitable for data isolation (per-user,
per-project). See LANGCHAIN-INTEGRATION.md.
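The namespace-to-tag mapping can be pictured as a positional zip. This is a sketch of the idea only: `namespace_to_tags` is a hypothetical helper, and rejecting a namespace longer than `namespace_keys` is an assumption, since the actual KeepStore behavior for that case is not described here.

```python
def namespace_to_tags(namespace: tuple[str, ...], namespace_keys: tuple[str, ...]) -> dict[str, str]:
    """Map LangGraph namespace components onto tag keys by position."""
    if len(namespace) > len(namespace_keys):
        # Assumption: extra components have no tag key to land on, so fail loudly.
        raise ValueError("namespace has more components than namespace_keys")
    return dict(zip(namespace_keys, namespace))
```

With `namespace_keys=("user", "project")`, the namespace `("alice", "notes")` becomes the tags `user=alice` and `project=notes`, which then act as the pre-filter for every search in that namespace.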
Extension Points
New Embedding or Summarization Provider
- Implement the provider protocol (EmbeddingProvider or
  SummarizationProvider) from providers/base.py
- Register it in the provider registry (typically by importing your module
  so its register_* calls run)
- Reference the provider by name in keep.toml
New Analyzer
- Implement AnalyzerProvider.analyze() in providers/base.py
- Register through the provider registry
- Selected by name in the store config
New Store Backend
- Implement the protocols in protocol.py:
  VectorStoreProtocol, DocumentStoreProtocol, PendingQueueProtocol
- Local: ChromaDB + SQLite (built-in)
- Remote: PostgreSQL + pgvector (the keepmem package, registered via
  keep.backends entry point)
- Register new backends via the keep.backends entry point in
  pyproject.toml
New Flow / State Doc
- Author a .state/* document with rules and actions
- Add any new actions to actions/ and wire them into the action runner
- Invoke via run_flow("your-state", params=…) from clients or other flows
Framework Integration
- Implement adapters on top of the daemon HTTP API or the local Keeper
- Current: LangChain/LangGraph (keep/langchain/)
- Pattern: map framework concepts to keep tags + search