keep put
Add or update a document in the store.
Usage
Four input modes, auto-detected:
keep put "my note" # Text mode (inline content)
keep put file:///path/to/doc.pdf # URI mode (fetch and index)
keep put https://example.com/page # URI mode (web content)
keep put /path/to/folder/ # Directory mode (index all files)
keep put - # Stdin mode (explicit)
echo "piped content" | keep put # Stdin mode (detected)

Directory mode indexes all regular files in the folder (non-recursive by default).
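The auto-detection above can be pictured as a small classifier. This is a rough sketch inferred only from the usage examples, not keep's actual logic; `detect_input_mode` and its rule order are hypothetical.

```python
import os


def detect_input_mode(arg, stdin_is_tty=True):
    """Classify a `keep put` argument into one of the four input modes.

    Hypothetical helper: the rules are guessed from the usage examples.
    """
    if arg is None:
        # No argument: treat piped input as stdin mode.
        return "stdin" if not stdin_is_tty else "error"
    if arg == "-":
        return "stdin"      # explicit stdin
    if "://" in arg:
        return "uri"        # file://, https://, ...
    if os.path.isdir(arg):
        return "directory"  # an existing folder on disk
    return "text"           # anything else is inline content
```

The order of the checks matters in this sketch: a scheme wins over a path, and only arguments that match nothing else are treated as inline text.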
Options
| Option | Description |
|---|---|
| -t, --tag KEY=VALUE | Tag as key=value (repeatable) |
| -i, --id ID | Custom document ID (auto-generated for text/stdin) |
| --summary TEXT | User-provided summary (skips auto-summarization) |
| -r, --recurse | Recurse into subdirectories (directory mode) |
| -x, --exclude PATTERN | Glob pattern to exclude (repeatable, directory mode) |
| --watch | Set up a daemon watch that re-indexes automatically on file changes |
| --unwatch | Remove an existing watch |
| --interval DURATION | Polling interval for watches (ISO 8601 duration, e.g. PT5M) |
| -f, --force | Force re-index even if content is unchanged |
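For readers unfamiliar with the ISO 8601 duration syntax accepted by --interval, here is a small sketch that converts values such as PT5M into a `timedelta`. It covers only the day/hour/minute/second subset and is not keep's parser; `parse_duration` is a hypothetical name.

```python
import re
from datetime import timedelta

# Subset of ISO 8601 durations: PnDTnHnMnS (no years, months, or weeks).
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Parse e.g. 'PT5M' or 'P1DT12H' into a timedelta."""
    match = _DURATION.match(text)
    parts = {} if match is None else {
        name: int(value)
        for name, value in match.groupdict().items()
        if value is not None
    }
    if not parts:
        # No match at all, or a bare 'P' / 'PT' with no components.
        raise ValueError(f"not a supported ISO 8601 duration: {text!r}")
    return timedelta(**parts)
```

Under this reading, PT5M is a five-minute polling interval and PT30S is thirty seconds.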
Directory mode
Index a folder of files. By default non-recursive; use -r to include subdirectories:
keep put ./docs/ # All files in docs/ (flat)
keep put ./docs/ -r # All files in docs/ (recursive)
keep put ./src/ -r -x "*.pyc" -x "__pycache__" # Recursive with excludes

Excludes use glob patterns matched against the relative path from the directory root. Hidden files and symlinks are always skipped.
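The walk rules above can be approximated with the standard library. The sketch below assumes `fnmatch`-style globbing against the slash-separated relative path and prunes excluded directories as it descends. keep's real matcher may differ, for example in how a bare name like `__pycache__` is treated below the top level. `iter_indexable` is a hypothetical name.

```python
import fnmatch
import os


def is_excluded(rel_path, patterns):
    """True if the relative path matches any exclude glob."""
    rel_path = rel_path.replace(os.sep, "/")
    return any(fnmatch.fnmatch(rel_path, pattern) for pattern in patterns)


def iter_indexable(root, excludes=(), recurse=False):
    """Yield relative paths of regular files that would be indexed."""
    for dirpath, dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        prefix = "" if rel_dir == "." else rel_dir + os.sep
        if recurse:
            # Prune hidden, symlinked, and excluded directories in place.
            dirnames[:] = [
                d for d in dirnames
                if not d.startswith(".")
                and not os.path.islink(os.path.join(dirpath, d))
                and not is_excluded(prefix + d, excludes)
            ]
        else:
            dirnames[:] = []  # flat mode: do not descend
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            if name.startswith(".") or os.path.islink(full):
                continue  # hidden files and symlinks are skipped
            if os.path.isfile(full) and not is_excluded(prefix + name, excludes):
                yield (prefix + name).replace(os.sep, "/")
```

Because `fnmatch` lets `*` match path separators, a pattern like `*.pyc` also excludes files in subdirectories in this sketch.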
Watching for changes
The --watch flag sets up a daemon-driven watch that re-indexes files automatically when they change:
keep put ./notes/ -r --watch # Index + watch for changes
keep put ./notes/ -r --watch -x "*.log" # With exclude patterns
keep put https://example.com/doc --watch # Watch a URL for changes

Watches persist across sessions: the daemon polls for changes in the background. Use keep pending to see active watches. Excludes are captured at watch-creation time.
Global ignore patterns
The .ignore system doc contains glob patterns that are automatically excluded from all directory walks and watches — in addition to .gitignore and per-watch --exclude patterns.
keep get .ignore # View current patterns
keep edit .ignore # Edit in $EDITOR

Updating .ignore retroactively purges matching file:// items from the store and cancels their pending work. Ships with sensible defaults for build artifacts, lock files, bytecode, and binaries.
Text mode and content-addressed IDs
Text mode uses content-addressed IDs for automatic versioning:
keep put "my note" # Creates %a1b2c3d4e5f6
keep put "my note" -t done # Same ID, new version (tag change)
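keep put "different note" # Different ID (new document)

Identical content always maps to the same ID, so putting the same text again with different tags creates a new version of the existing document rather than a new document.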
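To illustrate the idea of a content-addressed ID, the sketch below derives an ID from a hash of the text. The choice of SHA-256, the 12-character prefix, and the name `content_id` are assumptions made for illustration; the source only shows that IDs look like `%a1b2c3d4e5f6`.

```python
import hashlib


def content_id(text, length=12):
    """Derive a stable ID from the content itself.

    Assumption: SHA-256 truncated to `length` hex characters, with a
    '%' prefix. keep's actual hashing scheme is not documented here.
    """
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return "%" + digest[:length]
```

Since the ID depends only on the text, tags play no part in it, which is what lets a tag change land as a new version under the same ID.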
Smart summary behavior
- Short content (under max_summary_length, default 1000 chars): stored verbatim as its own summary
- Long content: truncated placeholder stored immediately, real summary generated in background by keep pending
- --summary provided: used as-is, skips auto-summarization
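The three cases above amount to a short decision function. This is a sketch of the rule as described, not keep's code. The placeholder is assumed to be a simple prefix of the content, and `initial_summary` is a hypothetical name.

```python
def initial_summary(content, max_summary_length=1000, user_summary=None):
    """Return (summary, needs_background_summary) for new content."""
    if user_summary is not None:
        # --summary provided: use as-is, skip auto-summarization.
        return user_summary, False
    if len(content) < max_summary_length:
        # Short content is its own summary.
        return content, False
    # Long content: store a truncated placeholder now (assumed to be a
    # plain prefix) and queue the real summary for background work.
    return content[:max_summary_length], True
```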
The LLM prompt used for summarization is configurable. Create a .prompt/summarize/* document whose match rules target specific tags, and its ## Prompt section replaces the default summarization prompt for matching documents. See PROMPTS.md for details.
Update behavior
When updating an existing document (same ID):
- Summary: replaced with new summary
- Tags: merged — existing tags preserved, new tags override on key collision
- Version: previous version archived automatically
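The tag-merge rule can be expressed as a dictionary update. A minimal sketch, with `merge_tags` as a hypothetical name:

```python
def merge_tags(existing, new):
    """Merge tags on update: keep existing keys, let new values win."""
    merged = dict(existing)  # copy so the stored tags are not mutated
    merged.update(new)       # new tags override on key collision
    return merged
```

For example, merging `{"status": "done"}` into `{"topic": "auth", "status": "open"}` keeps `topic` and replaces `status`.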
Contextual summarization
When you provide tags during indexing, the summarizer uses context from related items to produce more relevant summaries.
- System finds similar items sharing your tags
- Items with more matching tags rank higher (+20% score boost per tag)
- Top related summaries are passed as context to the LLM
- Summary highlights relevance to that context
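One way to read the ranking step above is shown below. The sketch assumes the +20% boost is additive per matching tag (`score * (1 + 0.2 * matches)`) and that an item "shares" a tag when both key and value match. The source specifies neither detail, and `rank_related` and the item layout are hypothetical.

```python
def rank_related(candidates, query_tags, boost_per_tag=0.20, top_k=3):
    """Rank similar items that share tags with the document being indexed.

    Each candidate is a dict with 'id', 'score' (similarity), and 'tags'.
    """
    def matches(item):
        return sum(1 for k, v in query_tags.items() if item["tags"].get(k) == v)

    def boosted(item):
        # Assumed additive boost: +20% of the base score per matching tag.
        return item["score"] * (1 + boost_per_tag * matches(item))

    sharing = [item for item in candidates if matches(item) > 0]
    return sorted(sharing, key=boosted, reverse=True)[:top_k]
```

With this reading, an item with a slightly lower similarity score can outrank another if it shares more tags with the document being indexed.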
Tag changes trigger re-summarization:
keep put doc.pdf # Generic summary
keep put doc.pdf -t topic=auth # Re-queued for contextual summary

Git changelog
When a directory is a git repository, put -r queues the commit history for background indexing:
keep put ./myproject/ -r
# 42 indexed, 0 errors from myproject/
# git: changelog ingest queued

Each commit becomes a searchable item (ID: git://repo#sha) with the commit message as its summary. Files get a git_commit edge tag linking to their last commit. Git tags and releases are indexed as separate items (ID: git://repo@tag).
Incremental: On re-scan (or via a watch), only new commits since the last ingest are processed. A git_watermark tag on the directory tracks the last ingested SHA.
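The watermark mechanism can be sketched as "everything after the last ingested SHA". The code below is an illustration only: it assumes the history is available as a list of SHAs ordered oldest first, and that an unknown watermark (for example after a history rewrite) triggers a full re-ingest. Neither assumption is confirmed by the source, and the function names are hypothetical.

```python
def commits_to_ingest(history, watermark=None):
    """Return the SHAs that still need ingesting.

    history: commit SHAs ordered oldest first.
    watermark: last ingested SHA (the git_watermark tag), or None.
    """
    if watermark is None or watermark not in history:
        # First scan, or watermark no longer in history: ingest all.
        return list(history)
    return history[history.index(watermark) + 1:]


def commit_item_id(repo, sha):
    """Build an item ID in the git://repo#sha form described above."""
    return f"git://{repo}#{sha}"
```

After a run, the new watermark would be the last SHA in the returned list, so a re-scan with no new commits does nothing.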
Querying git history:
keep find "why was the auth flow changed" # Finds commit messages by meaning
keep find "auth" --deep # File results + linked commit context
keep list 'git://myproject#*' # All indexed commits
keep list 'git://myproject@*' # All indexed tags/releases
keep get 'git://myproject@v1.0' # A specific release

Supported formats
| Format | Extensions | Content extracted | Auto-tags |
|---|---|---|---|
| Text | .md, .txt, .py, .js, .json, .yaml, ... | Full text | — |
| PDF | .pdf | Text from all pages; scanned pages OCR'd in background† | — |
| HTML | .html, .htm | Text (scripts/styles removed) | — |
| DOCX | .docx | Paragraphs + tables | author, title |
| PPTX | .pptx | Slides + notes | author, title |
| Audio | .mp3, .flac, .ogg, .wav, .aiff, .m4a, .wma | Structured metadata (+ transcription\*) | artist, album, genre, year, title |
| Images | .jpg, .png, .tiff, .webp | EXIF metadata + OCR text† (+ description\*) | dimensions, camera, date |
\* When a media description provider is configured ([media] in keep.toml), images get vision-model descriptions and audio files get speech-to-text transcription, appended to the extracted metadata. See QUICKSTART.md for setup.
† OCR (optical character recognition): Scanned PDF pages (pages with no extractable text) and all image files are automatically queued for background OCR when an OCR provider is available. Keep auto-detects Ollama (using glm-ocr, pulled automatically on first use) or MLX (mlx-vlm on Apple Silicon). A placeholder is stored immediately so the item is indexed right away; the full OCR text replaces it once background processing completes via keep pending. No configuration needed — if Ollama is running, OCR just works.
Auto-extracted tags merge with user-provided tags. User tags win on collision:
keep put file:///path/to/song.mp3 # Auto-tags: artist, album, genre, year
keep put file:///path/to/song.mp3 -t genre="Nu Jazz" # Overrides auto-extracted genre
keep put file:///path/to/photo.jpg -t topic=vacation # Adds topic alongside auto camera/date

Indexing documents
Index important documents encountered during work:
keep put "https://docs.example.com/auth" -t topic=auth -t project=myapp
keep put "file:///path/to/design.pdf" -t type=reference -t topic=architecture

See Also
- TAGGING.md — Tag system, merge order, speech acts
- VERSIONING.md — How versioning works
- KEEP-GET.md — Retrieve indexed documents
- META-TAGS.md — Contextual queries (.meta/*)
- PROMPTS.md — Prompts for summarization, analysis, and agent workflows
- REFERENCE.md — Quick reference index