keep data — Export, Import, and Sync

Back up, restore, and continuously mirror keep stores.

Export

keep data export backup.json                           # JSON to file (default)
keep data export backup.json --include-system          # Include system docs (.tag/*, .meta/*, .now, etc.)
keep data export -                                     # JSON to stdout (for piping)
keep data export ~/vault --format md                   # Markdown: one .md per note
keep data export ~/vault --format md --include-parts   # ...plus analysis parts as sidecars
keep data export ~/vault --format md --include-versions # ...plus archived versions as sidecars
keep data export ~/vault --sync                        # Markdown + register continuous mirror
keep data export ~/vault --sync --stop                 # Stop mirroring (keeps files)
keep data export --list                                # List active sync directories

Exports all user documents, versions, and parts as JSON. System documents (dot-prefix ids like .tag/*, .meta/*, .now) are excluded by default — pass --include-system to include them. Embeddings are excluded (they are model-dependent and regenerated on import).
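
The dot-prefix rule can also be applied to an already exported JSON file, for example to check whether system documents are present. This is a minimal sketch that assumes the layout shown under Export Format; `is_system_id` and `split_documents` are illustrative helper names, not part of keep.

```python
def is_system_id(doc_id: str) -> bool:
    """Treat dot-prefixed ids (.tag/*, .meta/*, .now) as system documents."""
    return doc_id.startswith(".")


def split_documents(export: dict) -> tuple[list, list]:
    """Partition an export's documents into (user, system) lists."""
    user, system = [], []
    for doc in export.get("documents", []):
        (system if is_system_id(doc["id"]) else user).append(doc)
    return user, system


# Hypothetical sample data shaped like the "documents" array of an export.
sample = {"documents": [{"id": "auth-notes"}, {"id": ".tag/act/commitment"}, {"id": ".now"}]}
user_docs, system_docs = split_documents(sample)
print([d["id"] for d in user_docs])    # ['auth-notes']
print([d["id"] for d in system_docs])  # ['.tag/act/commitment', '.now']
```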

Markdown mode (--format md)

Markdown mode writes a directory with one .md file per note. The directory is created if it doesn't exist; for a one-shot export it must be empty, but --sync allows writing into an existing directory.

One-shot markdown export uses the configured authoritative store. If remote_store is configured, keep data export ~/vault --format md exports from that remote store through the remote export and note-bundle APIs. Continuous --sync also works with a remote authoritative store, but the daemon still owns the local mirror root and writes files only on this machine.

Each file has flat YAML frontmatter followed by the note summary as the body. The frontmatter is one flat map — no nested tags: block — with three kinds of keys:

Filenames mirror the id's path structure for easy browsing, using the wget -m convention:

Note id                           Path in export dir
auth-notes                        auth-notes.md
notes/2024/jan-meeting            notes/2024/jan-meeting.md
.tag/act/commitment               .tag/act/commitment.md
file:///Users/x/README.md         file/Users/x/README.md.md
https://example.com/docs/guide    https/example.com/docs/guide.md
thread:abc-123@host.com#frag      thread/abc-123@host.com%23frag.md
mailto:foo@bar.com                mailto/foo@bar.com.md

Any RFC 3986 URI scheme becomes a top-level directory. Inside each component, filesystem-unsafe characters (:, #, ?, \, *, <, >, |, non-ASCII) are percent-encoded; @, +, =, ,, (, ), space stay literal. .md is always appended to the last component, even for ids that already end in .md, so the suffix is unambiguous. Components that exceed the filesystem's per-component limit are truncated with a short SHA256 suffix; the full id is always in _id.
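
The mapping rules above can be approximated in a few lines. This is a sketch under stated assumptions, not keep's actual implementation: `note_id_to_path` and `encode_component` are hypothetical names, only the characters listed above are percent-encoded (anything else, including a literal `%`, is left as-is), and neither truncation of over-long components nor collision handling is modeled.

```python
import re

# Characters the text lists as percent-encoded inside a path component.
UNSAFE = set(':#?\\*<>|')
# RFC 3986 scheme: a letter, then letters, digits, "+", "-", or ".".
SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):(.*)$", re.DOTALL)


def encode_component(component: str) -> str:
    """Percent-encode listed unsafe characters and all non-ASCII characters."""
    out = []
    for ch in component:
        if ch in UNSAFE or ord(ch) > 127:
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


def note_id_to_path(note_id: str) -> str:
    """Map a note id to a relative path, with the scheme as top-level directory."""
    match = SCHEME_RE.match(note_id)
    if match:
        scheme, rest = match.groups()
        parts = [scheme] + rest.lstrip("/").split("/")
    else:
        parts = note_id.split("/")
    parts = [encode_component(p) for p in parts if p]
    if not parts:
        raise ValueError("empty note id")
    parts[-1] += ".md"  # always appended, even if the id already ends in .md
    return "/".join(parts)


print(note_id_to_path("file:///Users/x/README.md"))     # file/Users/x/README.md.md
print(note_id_to_path("thread:abc-123@host.com#frag"))  # thread/abc-123@host.com%23frag.md
```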

Two notes whose paths would collide case-insensitively (e.g. state-actions.md and STATE-ACTIONS.md on macOS APFS) are auto-disambiguated: the second one gets an 8-hex-char hash suffix on its stem. The frontmatter _id is unchanged — disambiguation is purely an on-disk detail.
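
One way such disambiguation could work is sketched below. The text only says that the second note gets an 8-hex-char hash suffix on its stem; hashing the note id with SHA-256 and joining the suffix with a hyphen are assumptions made for illustration, and `disambiguate` is a hypothetical helper.

```python
import hashlib
from pathlib import PurePosixPath


def disambiguate(ids_and_paths):
    """Return (note_id, path) pairs with case-insensitive collisions resolved."""
    seen = set()
    result = []
    for note_id, path in ids_and_paths:
        if path.casefold() in seen:
            p = PurePosixPath(path)
            # Assumption: suffix is the first 8 hex chars of SHA-256 over the id.
            digest = hashlib.sha256(note_id.encode("utf-8")).hexdigest()[:8]
            path = str(p.with_name(f"{p.stem}-{digest}{p.suffix}"))
        seen.add(path.casefold())
        result.append((note_id, path))
    return result


resolved = disambiguate([
    ("state-actions", "state-actions.md"),
    ("STATE-ACTIONS", "STATE-ACTIONS.md"),
])
for note_id, path in resolved:
    print(note_id, "->", path)
```

The first note keeps its plain path; only later arrivals that collide are renamed, which matches the "second one gets the suffix" behavior described above.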

Example output

---
_id: auth-notes
_content_hash: abc123
_content_hash_full: def456
_prev_version: "[[auth-notes/@V{1}]]"
_next_part: "[[auth-notes/@P{1}]]"
topic: auth
project: security
_source: inline
_created: '2026-01-15T10:30:00'
_updated: '2026-02-01T14:22:00'
_accessed: '2026-02-19T09:00:00'
_analyzed_hash: abc123
said:
- "[[conv1|First conversation about auth]]"
- "[[conv2|Follow-up on OAuth design]]"
---

Authentication patterns for OAuth2...
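
Because the frontmatter is a flat map, a script can read it without a full YAML parser. The sketch below handles only the shape shown in the example (scalar `key: value` lines and `key:` followed by `- item` lines) and does not process YAML escapes; `parse_flat_frontmatter` is a hypothetical helper, and a real YAML library would be more robust.

```python
def _unquote(value: str) -> str:
    """Strip one pair of matching single or double quotes, if present."""
    value = value.strip()
    if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
        return value[1:-1]
    return value


def parse_flat_frontmatter(text: str) -> tuple[dict, str]:
    """Split an exported note into (frontmatter dict, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict = {}
    current_list = None
    for index, line in enumerate(lines[1:], start=1):
        stripped = line.strip()
        if stripped == "---":  # closing delimiter: the rest is the body
            return meta, "\n".join(lines[index + 1:]).strip()
        if stripped.startswith("- "):  # item of the most recent list key
            if current_list is not None:
                current_list.append(_unquote(stripped[2:]))
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue
        if value.strip():
            meta[key.strip()] = _unquote(value)
            current_list = None
        else:
            current_list = meta.setdefault(key.strip(), [])
    return {}, text  # no closing delimiter found


note = """---
_id: auth-notes
topic: auth
_created: '2026-01-15T10:30:00'
said:
- "[[conv1|First conversation about auth]]"
- "[[conv2|Follow-up on OAuth design]]"
---

Authentication patterns for OAuth2...
"""
meta, body = parse_flat_frontmatter(note)
print(meta["_id"], meta["_created"])  # auth-notes 2026-01-15T10:30:00
print(meta["said"])
```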

Chain navigation

Parent notes link to their sidecars via _prev_version and _next_part frontmatter keys. Sidecars link back to their parent (or to sibling sidecars) via the same system:

All chain-navigation values are [[vault-local-ref]] wikilinks that resolve in tools like Obsidian.

Inverse edges

When a note is the target of edge-tag relationships (e.g. speaker: Deborah on a conversation note creates an edge to Deborah), the inverse predicates appear in the target's frontmatter as multi-value lists:

said:
- "[[conv1|First conversation]]"
- "[[conv2|Second conversation]]"
recipient_of:
- "[[thread:abc@mail.com|Re: Meeting notes]]"

Values use canonical [[target|label]] labeled-ref syntax when the source note has a resolvable display name; otherwise they're plain ids. Forward edge tags (like speaker: Deborah on the source note) pass through as ordinary tag values — no special treatment.
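
A consumer of the exported frontmatter may want to split these values back into target and label. The following sketch shows one way to do that with a regular expression; `parse_ref` is a hypothetical helper, and plain ids are returned unchanged with no label.

```python
import re
from typing import Optional

# [[target]] or [[target|label]]; the target may not contain "]" or "|".
REF_RE = re.compile(r"^\[\[([^\]|]+)(?:\|([^\]]*))?\]\]$")


def parse_ref(value: str) -> tuple[str, Optional[str]]:
    """Return (target, label); label is None for unlabeled refs and plain ids."""
    match = REF_RE.match(value.strip())
    if not match:
        return value, None
    return match.group(1), match.group(2)


print(parse_ref("[[conv1|First conversation]]"))  # ('conv1', 'First conversation')
print(parse_ref("[[auth-notes/@V{1}]]"))          # ('auth-notes/@V{1}', None)
print(parse_ref("conv1"))                         # ('conv1', None)
```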

Edge-tag value rewriting

Edge-tag values in the frontmatter are rewritten from canonical keep ids to the exported vault-local path namespace, so [[wikilinks]] in the frontmatter resolve correctly when the vault is opened in Obsidian or similar tools.

Parts and versions sidecars

When --include-parts or --include-versions is passed, notes that have analysis parts or archived versions get a sidecar directory alongside the parent file:

rust-tutorial.md             ← the current note (parent file)
rust-tutorial/               ← sidecar dir (only created if parts/versions exist)
  @P{1}.md                   ← analysis part 1
  @P{2}.md                   ← analysis part 2
  @V{1}.md                   ← previous version (1 step back from current)
  @V{2}.md                   ← 2 steps back
  @V{3}.md                   ← 3 steps back

Notes with no parts or versions get no sidecar dir — plain-note.md stays a single flat file even when both flags are on.
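
To walk this layout from a script, the sidecar directory can be derived from the parent file name and its entries matched against the `@P{N}.md` / `@V{N}.md` pattern. This is a sketch based only on the layout shown above; `list_sidecars` is a hypothetical helper, and the demo builds a throwaway directory rather than reading a real export.

```python
import re
import tempfile
from pathlib import Path

SIDECAR_RE = re.compile(r"^@([PV])\{(\d+)\}\.md$")


def list_sidecars(note_file: Path) -> dict:
    """Return sidecar paths for a note, sorted numerically by part/version number."""
    sidecar_dir = note_file.with_suffix("")  # rust-tutorial.md -> rust-tutorial/
    found = {"P": [], "V": []}
    if sidecar_dir.is_dir():
        for child in sidecar_dir.iterdir():
            match = SIDECAR_RE.match(child.name)
            if match:
                found[match.group(1)].append((int(match.group(2)), child))
    by_number = lambda pair: pair[0]
    return {
        "parts": [p for _, p in sorted(found["P"], key=by_number)],
        "versions": [p for _, p in sorted(found["V"], key=by_number)],
    }


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "rust-tutorial.md").write_text("current note\n")
    (root / "plain-note.md").write_text("no sidecars\n")
    (root / "rust-tutorial").mkdir()
    for name in ("@P{1}.md", "@P{2}.md", "@V{1}.md", "@V{10}.md", "@V{2}.md"):
        (root / "rust-tutorial" / name).write_text("sidecar\n")
    sidecars = list_sidecars(root / "rust-tutorial.md")
    part_names = [p.name for p in sidecars["parts"]]
    version_names = [p.name for p in sidecars["versions"]]
    plain = list_sidecars(root / "plain-note.md")

print(part_names)     # ['@P{1}.md', '@P{2}.md']
print(version_names)  # ['@V{1}.md', '@V{2}.md', '@V{10}.md']
```

Sorting on the parsed number rather than the file name keeps `@V{10}.md` after `@V{2}.md`.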

Using with Obsidian

The exported directory can be opened directly as an Obsidian vault. All [[wikilink]] values in the frontmatter (chain navigation, inverse edges, forward edge tags) resolve to the exported files. The Obsidian graph view renders the full relationship structure.

Continuous sync (--sync)

keep data export ~/vault --sync                        # Export + register mirror
keep data export ~/vault --sync --include-parts        # ...with parts sidecars
keep data export ~/vault --sync --stop                 # Stop mirroring
keep data export --list                                # List active mirrors

--sync performs an immediate one-shot markdown export with progress, then registers the directory as a daemon-owned continuous mirror.

Check sync status:

keep daemon                    # Shows "Markdown mirrors active: N" when mirrors are registered

Markdown mode vs JSON mode

Feature              JSON (--format json)      Markdown (--format md)
Output               Single file               Directory of .md files
Round-trip import    Yes (keep data import)    Yes (keep data import PATH --format md)
Human browsable      No                        Yes (grep, Obsidian, etc.)
Continuous sync      No                        Yes (--sync)
Parts/versions       Always included           Opt-in (--include-parts, --include-versions)
Embeddings           Excluded                  Excluded

Import

keep data import backup.json                 # Merge: skip existing IDs
keep data import backup.json --mode replace  # Replace: clear store first (prompts for confirmation)
keep data import -                           # Read from stdin
keep data import ~/vault --format md         # Recursive markdown import
keep data import ~/vault                     # Auto-detect dir/.md as markdown

Markdown import walks .md files recursively, honors _id and _source_uri, imports top-level scalar tags and scalar lists, and skips exporter-owned metadata such as _content_hash, _version_offset, and chain navigation frontmatter. Keep-export sidecars (@P{N}.md, @V{N}.md) are restored as parts and archived versions when present.

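Import Modes

merge (default)    Skip documents whose id already exists in the store.
replace            Clear the store first (prompts for confirmation), then import.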

After Import

Imported documents, versions, and parts are queued for re-embedding. Run:

keep daemon     # Process embeddings in background

Until embeddings are processed, imported documents are retrievable by ID (keep get) and visible in keep list, but won't appear in semantic search (keep find).

Export Format

{
  "format": "keep-export",
  "version": 3,
  "exported_at": "2026-02-19T12:00:00",
  "store_info": {
    "document_count": 42,
    "version_count": 120,
    "part_count": 15,
    "collection": "default"
  },
  "documents": [
    {
      "id": "auth-notes",
      "summary": "Authentication patterns for OAuth2...",
      "tags": {"topic": "auth", "_source": "inline"},
      "content_hash": "abc123",
      "created_at": "2026-01-15T10:30:00",
      "updated_at": "2026-02-01T14:22:00",
      "accessed_at": "2026-02-19T09:00:00",
      "versions": [
        {"version": 1, "summary": "...", "tags": {}, "content_hash": "...", "created_at": "..."}
      ],
      "parts": [
        {"part_num": 1, "summary": "...", "tags": {}, "created_at": "..."}
      ]
    }
  ]
}

What's included: document summaries, tags (including system tags like _source), timestamps, version history, structural parts.

What's excluded: embeddings (model-dependent, regenerated on import), store configuration (target uses its own).
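
Before importing a file from an unknown source, it can be useful to check the header and count what it contains. The sketch below relies only on the fields shown in the example above; `summarize_export` is a hypothetical helper, not part of keep.

```python
import json


def summarize_export(export: dict) -> dict:
    """Validate the format marker and count documents, versions, and parts."""
    if export.get("format") != "keep-export":
        raise ValueError("not a keep-export document")
    docs = export.get("documents", [])
    return {
        "version": export.get("version"),
        "documents": len(docs),
        "versions": sum(len(d.get("versions", [])) for d in docs),
        "parts": sum(len(d.get("parts", [])) for d in docs),
    }


# Hypothetical miniature export, shaped like the example above.
raw = """
{
  "format": "keep-export",
  "version": 3,
  "documents": [
    {"id": "auth-notes", "versions": [{"version": 1}], "parts": [{"part_num": 1}]},
    {"id": "plain-note", "versions": [], "parts": []}
  ]
}
"""
summary = summarize_export(json.loads(raw))
print(summary)  # {'version': 3, 'documents': 2, 'versions': 1, 'parts': 1}
```

The counts can then be compared against `store_info` in the header as a quick consistency check.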

Use Cases

Python API

from keep import Keeper

kp = Keeper()

# Streaming export — yields header, then one dict per document.
# Each document dict is self-contained: versions and parts are
# included inline (not yielded separately).
for i, chunk in enumerate(kp.export_iter()):
    if i == 0:
        header = chunk   # {"format", "version", "exported_at", "store_info"}
    else:
        doc = chunk      # {"id", "summary", "tags", ..., "versions": [...], "parts": [...]}

# Convenience: collect everything into a single dict (loads all into memory)
data = kp.export_data()                          # All documents
data = kp.export_data(include_system=False)       # Skip system docs

# Import
stats = kp.import_data(data, mode="merge")        # Skip existing
stats = kp.import_data(data, mode="replace")      # Clear first
stats = kp.import_markdown("~/vault", mode="merge")  # Recursive markdown import
# stats = {"imported": 10, "skipped": 2, "versions": 5, "parts": 3, "queued": 10}
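
The relationship between the streaming and the collected form can be sketched without a live store. The helper below reassembles a chunk stream into a single dict, assuming the collected shape is the header plus a `documents` list as shown under Export Format; `collect_export` is a hypothetical name, and the chunks here are stand-ins for what `kp.export_iter()` yields.

```python
from typing import Iterable


def collect_export(chunks: Iterable[dict]) -> dict:
    """Combine a header chunk and the following document chunks into one dict."""
    iterator = iter(chunks)
    header = next(iterator, None)
    if header is None:
        raise ValueError("empty export stream")
    export = dict(header)                 # copy so the caller's header is untouched
    export["documents"] = list(iterator)  # loads every document into memory
    return export


# Stand-in for kp.export_iter(): header first, then one dict per document.
fake_chunks = [
    {"format": "keep-export", "version": 3},
    {"id": "auth-notes", "versions": [], "parts": []},
    {"id": "plain-note", "versions": [], "parts": []},
]
data = collect_export(fake_chunks)
print(data["format"], [d["id"] for d in data["documents"]])
# keep-export ['auth-notes', 'plain-note']
```

For large stores, iterating over the chunks and handling each document as it arrives avoids holding the whole export in memory.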