keep analyze

Decompose a note or string into meaningful parts.

Usage

keep analyze ID                       # Analyze using configured provider
keep analyze ID -t topic -t type      # With guidance tags

What it does

analyze uses an LLM to decompose content into meaningful sections, each with its own summary, tags, and embedding. This enables targeted search: find matches specific sections, not just whole documents.

Two modes, auto-detected:

Parts are the structural counterpart to versions:

Options

Option              Description
-t, --tag KEY       Guidance tag keys (repeatable). Fetches .tag/KEY descriptions to guide decomposition
--foreground, --fg  Run in foreground and wait for results (default: background)
--force             Re-analyze even if parts are already current

The global -s/--store option (available on every keep subcommand) overrides the store directory.

Background processing

By default, analyze runs in the background, serialized with other ML work (summarization, embedding). Use --fg to wait for results:

keep analyze doc:1                    # Returns immediately, runs in background
keep analyze doc:1 --fg               # Waits for completion

Background tasks are processed by the same queue as keep daemon summaries.

Part addressing

Append @P{N} to any ID to access a specific part:

keep get "doc:1@P{1}"           # Part 1
keep get "doc:1@P{3}"           # Part 3
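
As an illustration of the addressing scheme, here is a minimal sketch of splitting a part-qualified ID into its base ID and part number. The `split_part_id` helper and its regular expression are hypothetical and are not part of keep's API; they only mirror the `@P{N}` suffix shown above.

```python
import re
from typing import Optional, Tuple

# Hypothetical helper, not part of keep: matches a trailing "@P{N}" suffix.
_PART_SUFFIX = re.compile(r"^(?P<base>.+)@P\{(?P<num>\d+)\}$")


def split_part_id(item_id: str) -> Tuple[str, Optional[int]]:
    """Return (base_id, part_number); part_number is None for plain IDs."""
    match = _PART_SUFFIX.match(item_id)
    if match is None:
        return item_id, None
    return match.group("base"), int(match.group("num"))


print(split_part_id("doc:1@P{3}"))
print(split_part_id("doc:1"))
```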

Parts include prev/next navigation:

---
id: doc:1@P{2}
tags:
  topic: "analysis"
prev:
  - @P{1}
next:
  - @P{3}
---
Detailed analysis of the main argument...

Parts in get output

When a document has parts, keep get shows a parts manifest:

---
id: doc:1
similar:
  - doc:2 (0.85) 2026-01-14 Related document...
parts:
  - @P{1} Introduction and overview of the topic
  - @P{2} Detailed analysis of the main argument
  - @P{3} Conclusions and future directions
prev:
  - @V{1} 2026-01-13 Previous summary...
---
Document summary here...

Parts in search results

Parts have their own embeddings and appear naturally in find results:

keep find "main argument"
# doc:1@P{2}  2026-01-14 Detailed analysis of the main argument...

Smart skip

Analysis is expensive (LLM call per document). To avoid redundant work, analyze tracks a content hash at the time of analysis. If the document hasn't changed since the last analysis, the call is skipped:

keep analyze doc:1                    # Analyzes, stores _analyzed_hash
keep analyze doc:1                    # Skipped — parts are current
keep put doc:1 "updated content"      # Content changes
keep analyze doc:1                    # Re-analyzes (content changed)

Analysis is queued automatically by the after-write flow whenever a note is stored or updated, so a daily cron that re-runs keep put /path/to/docs/ will only re-analyze files whose content actually changed.

Use --force to override the skip:

keep analyze doc:1 --force            # Re-analyze regardless
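
The skip logic can be pictured with a small in-memory sketch. Only the `_analyzed_hash` name comes from the text above; the helper functions, the dict-based note, and the choice of SHA-256 are assumptions made for illustration, and keep's actual implementation may differ.

```python
import hashlib


def content_hash(content: str) -> str:
    # Assumption: SHA-256 over UTF-8 text; the real hash function may differ.
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def should_analyze(note: dict, force: bool = False) -> bool:
    """Return True when analysis should run for this note."""
    if force:
        return True  # --force bypasses the hash comparison
    return note.get("_analyzed_hash") != content_hash(note["content"])


def mark_analyzed(note: dict) -> None:
    """Record the hash of the content that was just analyzed."""
    note["_analyzed_hash"] = content_hash(note["content"])


note = {"content": "original content"}
first = should_analyze(note)       # no hash recorded yet
mark_analyzed(note)
second = should_analyze(note)      # content unchanged since analysis
note["content"] = "updated content"
third = should_analyze(note)       # content changed
print(first, second, third)
```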

Part tags

Parts do not inherit tags from their parent document. Each part carries only:

This keeps parts as clean sub-notes rather than clones of the parent's tag graph, and prevents drift when the parent is re-tagged without re-analysis. Document-level relationships like references, cites, and informs stay on the parent, where they semantically belong.

find compensates for this automatically: a tag-filtered query like find("X", tags={"project": "alpha"}) still returns parts of matching parents via a _base_id join, so nothing is lost from the caller's perspective.

If the analyzer itself decides a particular part warrants a topic or type tag (via classifier output or guidance tags), that tag lives on the part. Everything else comes from the parent at read time, not write time.
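
A rough sketch of the read-time join described in this section, using plain dictionaries instead of keep's store. Only the `_base_id` field name comes from the text; the record layout, the `filter_parts` helper, and the rule that a part's own tag takes precedence over the parent's are assumptions for illustration.

```python
# Hypothetical in-memory records; only "_base_id" is named in the text.
notes = {
    "doc:1": {"tags": {"project": "alpha"}},
    "doc:2": {"tags": {"project": "beta"}},
}
parts = [
    {"id": "doc:1@P{1}", "_base_id": "doc:1", "tags": {"topic": "analysis"}},
    {"id": "doc:2@P{1}", "_base_id": "doc:2", "tags": {}},
]


def matches(tags: dict, wanted: dict) -> bool:
    """True when every wanted key/value pair is present in tags."""
    return all(tags.get(key) == value for key, value in wanted.items())


def filter_parts(parts: list, notes: dict, wanted: dict) -> list:
    """Return IDs of parts whose effective tags satisfy the filter."""
    result = []
    for part in parts:
        parent_tags = notes[part["_base_id"]]["tags"]
        # Join at read time: parent tags are looked up, never copied onto the part.
        # Assumption: a tag set on the part itself wins over the parent's value.
        effective = {**parent_tags, **part["tags"]}
        if matches(effective, wanted):
            result.append(part["id"])
    return result


print(filter_parts(parts, notes, {"project": "alpha"}))
```

Because the parent's tags are resolved per query, re-tagging doc:1 changes which parts match without touching the stored parts.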

Part immutability

Parts are machine-generated analysis results, not human observations. They are treated as derived data — immutable except for tag corrections.

Allowed:

Blocked:

If a part's summary is wrong, re-analyze (with --force or better guidance tags). The right fix is a better prompt, not manual editing.

Re-analysis

Behavior depends on the source type and on whether prior analysis is recorded:

keep analyze doc:1                    # Creates parts (or appends if incremental)
keep analyze doc:1 -t topic --force   # Full re-analyze — replaces all parts

Sliding-window semantics for vstrings

For inline notes that accumulate many versions (e.g. now), analyze acts as a sliding window over the most recent versions, not an exhaustive log processor. When the analyze cursor falls more than ~100 versions behind the latest (a "backlog", typically caused by a daemon outage), the intermediate versions cannot be reconstructed from the analyze window. In that case analyze logs a WARNING, rebases to the newest window with a full pass, and advances the cursor to the latest version. The gap is dropped intentionally: for trajectory analysis, older versions are historical noise.
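
The cursor behavior can be sketched as a small decision function. Everything here is hypothetical: the `plan_analysis` name, the mode labels, the exact threshold of 100 (the text only says "~100"), and the assumption that the rebased window has the same size as the threshold.

```python
import logging

logger = logging.getLogger("analyze_sketch")

# Assumption: the text gives "~100 versions"; the exact limit may differ.
BACKLOG_LIMIT = 100


def plan_analysis(cursor: int, latest: int, limit: int = BACKLOG_LIMIT):
    """Return (mode, first_version, new_cursor) for the next analyze pass."""
    behind = latest - cursor
    if behind <= 0:
        return ("skip", None, cursor)  # nothing new since the last pass
    if behind > limit:
        # Backlog: the gap is dropped and analysis restarts on the newest window.
        logger.warning("analyze backlog of %d versions; rebasing", behind)
        return ("full", latest - limit + 1, latest)
    return ("incremental", cursor + 1, latest)


print(plan_analysis(10, 15))
print(plan_analysis(10, 500))
```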

Guidance tags

Tag keys passed with -t fetch the corresponding .tag/KEY system documents (e.g., .tag/topic, .tag/type). These descriptions tell the LLM what each tag means and what values are appropriate, producing better decomposition and more consistent tagging — even with smaller models.

keep analyze doc:1 -t topic -t type   # Guided by tag descriptions
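
One way to picture how guidance tags feed the prompt is the sketch below. The `build_guidance` helper, the dictionary standing in for the store, and the sample descriptions are all hypothetical; only the `.tag/KEY` naming comes from the text.

```python
def build_guidance(tag_keys: list, documents: dict) -> str:
    """Collect .tag/KEY descriptions for the requested keys, skipping missing ones."""
    sections = []
    for key in tag_keys:
        description = documents.get(f".tag/{key}")
        if description:
            sections.append(f"Tag '{key}': {description}")
    return "\n".join(sections)


# Hypothetical stand-in for the system documents in the store.
system_docs = {
    ".tag/topic": "Subject area of the content.",
    ".tag/type": "Kind of content, such as a decision or a reference.",
}

guidance = build_guidance(["topic", "type", "missing"], system_docs)
print(guidance)
```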

Python API

from keep import Keeper  # import path assumed; adjust to your installation

kp = Keeper()

# Analyze (skips if parts are current)
parts = kp.analyze("doc:1")
parts = kp.analyze("doc:1", tags=["topic", "type"])
parts = kp.analyze("doc:1", force=True)  # Override skip

# Enqueue for background processing (returns False if skipped)
enqueued = kp.enqueue_analyze("doc:1")
enqueued = kp.enqueue_analyze("doc:1", force=True)

# Access parts
part = kp.get_part("doc:1", 1)        # Returns Item
parts = kp.list_parts("doc:1")        # Returns list[PartInfo]

# Edit tags on a part (the only allowed mutation)
kp.tag_part("doc:1", 1, tags={"topic": "oauth2"})  # Update tag
kp.tag_part("doc:1", 1, tags={"topic": ""})         # Remove tag
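
The update and remove semantics of the two tag_part calls above can be modeled with a short sketch. The `apply_tag_edits` function is a hypothetical stand-in operating on plain dictionaries, not keep's implementation; it assumes an empty string value means "remove this key", as the comments above indicate.

```python
def apply_tag_edits(current: dict, edits: dict) -> dict:
    """Return a new tag dict: non-empty values are set, empty strings remove the key."""
    updated = dict(current)  # leave the caller's dict untouched
    for key, value in edits.items():
        if value == "":
            updated.pop(key, None)  # removing an absent key is a no-op
        else:
            updated[key] = value
    return updated


tags = {"topic": "analysis"}
tags = apply_tag_edits(tags, {"topic": "oauth2"})   # update
print(tags)
tags = apply_tag_edits(tags, {"topic": ""})         # remove
print(tags)
```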

Custom analysis prompts

The LLM prompt used for analysis is configurable. Create a .prompt/analyze/* document whose match rules target specific tags, and its ## Prompt section replaces the default system prompt for matching documents. See PROMPTS.md for details.

See Also