
SochDB for Coding Agents

SochDB is an agent-native database. Instead of stitching together a vector store, a relational database, and a hand-rolled prompt packer, an agent gets one embedded-or-served engine that does vector search (HNSW), SQL, key-value, graph and temporal edges, semantic cache, and token-budgeted agent memory — all with ACID transactions.

This guide shows two ways to put SochDB in front of a coding agent or LLM:

  1. The sochdb-mcp server — a Model Context Protocol server that exposes SochDB's tools and resources to clients like Claude Desktop, Cursor, and Goose over stdio.
  2. The SochDB Agent Skill — a portable instruction bundle that teaches an agent how to install, connect, store, and retrieve with SochDB through the SDK.

Versions

This page targets core engine 2.0.3 (the sochdb-mcp binary). The language SDKs version independently: Python 0.5.9, Node.js 0.5.3, Go 0.4.5.

License

The sochdb-mcp server is part of the core engine and is AGPL-3.0-or-later (commercial licensing available). The language SDKs (Python, Node.js, Go) are Apache-2.0. The Agent Skill bundle shipped with these docs is Apache-2.0.


What the MCP server is

sochdb-mcp is a small, framework-free Model Context Protocol server. It speaks JSON-RPC with Content-Length framing over stdio and makes direct embedded calls into SochDB — there is no separate database process or network hop. When the server starts it opens an on-disk SochDB at the path you give it (default ./sochdb_data) and serves tools and resources from that store.
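
As a rough illustration of what Content-Length framing looks like on the wire, the sketch below wraps and unwraps a JSON-RPC payload. It assumes the common LSP-style layout (a `Content-Length: N` header, a blank line, then N bytes of UTF-8 JSON); the exact header layout sochdb-mcp expects is not spelled out here, and `frame_message` / `parse_message` are hypothetical helper names.

```python
import json


def frame_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC payload and prefix it with a Content-Length header.

    Assumption: LSP-style framing, i.e. header, CRLF CRLF, then the body.
    """
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def parse_message(data: bytes) -> dict:
    """Read one framed message back into a Python dict."""
    header, _, rest = data.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    # The length counts bytes, not characters, so slice before decoding.
    return json.loads(rest[:length].decode("utf-8"))


request = {"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {}}
framed = frame_message(request)
```

Counting bytes rather than characters matters as soon as a payload contains non-ASCII text.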

  • Protocol version: 2024-11-05.
  • Transport: stdio. All logs go to stderr; stdout is reserved for the protocol stream.
  • Capabilities advertised on initialize: tools and resources. There are no MCP prompts.

Tool names use underscores, not dots

The bundled mcp.json manifest and the server's README.md document an older, dot-named catalog (sochdb.query, memory.search_episodes). Those are stale. The tools the server actually exposes use underscores (sochdb_query, memory_search_episodes, ...). The lists below are the real, shipped names taken from the server source.
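
If a client or prompt still carries the stale dot-named identifiers, a small shim can rewrite them before the call goes out. This sketch assumes every stale name differs from the shipped one only in its separator, which is confirmed above for just the two examples shown; `normalize_tool_name` is a hypothetical helper, not part of SochDB.

```python
def normalize_tool_name(name: str) -> str:
    """Map a stale dot-named tool (sochdb.query) to the underscore form.

    Assumption: the only difference between old and new names is '.' vs '_'.
    """
    return name.replace(".", "_")


stale = ["sochdb.query", "memory.search_episodes"]
current = [normalize_tool_name(n) for n in stale]
```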


Build and run the server

Build the standalone binary from the SochDB workspace:

cargo build --release --package sochdb-mcp
# -> target/release/sochdb-mcp

Run it against a database directory (it is created if it does not exist):

sochdb-mcp --db ./agent_db

The only flag is --db <path> (default ./sochdb_data). For verbose logging to stderr, set RUST_LOG:

RUST_LOG=sochdb_mcp=debug sochdb-mcp --db ./agent_db

You can smoke-test the protocol by piping a single JSON-RPC request to stdin:

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' | sochdb-mcp --db ./test_data

Embedded, single-writer

Because the server opens the database in-process, point each MCP client at its own --db directory unless you intentionally want them to share one. The store is a local file/directory, so the same single-writer rules apply as for any embedded SochDB connection.


Wire it into your client

Every config below points the client at the compiled sochdb-mcp binary and a data directory. Use absolute paths — MCP clients do not resolve relative paths against your shell's working directory.

For Claude Desktop, add a sochdb entry to claude_desktop_config.json (under ~/Library/Application Support/Claude/ on macOS):

{
  "mcpServers": {
    "sochdb": {
      "command": "/absolute/path/to/target/release/sochdb-mcp",
      "args": ["--db", "/absolute/path/to/agent_db"],
      "env": { "RUST_LOG": "info" }
    }
  }
}

Restart Claude Desktop. The server's tools (sochdb_*, memory_*, logs_*) and its sochdb:// resources then appear in the client.

Sample configs hardcode one machine's paths

The sample files shipped in the sochdb-mcp/ directory (mcp-claude-desktop.json, mcp-cursor.json, mcp-goose.json) hardcode an absolute path from the author's machine. Copy the shape, then replace command and the --db argument with paths on your system.
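
One way to avoid hand-editing paths is to generate the entry. The sketch below builds the same shape as the Claude Desktop config above and resolves both paths to absolute form; `claude_desktop_entry` is a hypothetical helper, and the relative paths passed in are placeholders that get resolved against the directory the script runs in.

```python
import json
import os


def claude_desktop_entry(binary_path: str, db_path: str, log_level: str = "info") -> dict:
    """Build an mcpServers entry with absolute paths for the binary and data dir."""
    return {
        "mcpServers": {
            "sochdb": {
                "command": os.path.abspath(binary_path),
                "args": ["--db", os.path.abspath(db_path)],
                "env": {"RUST_LOG": log_level},
            }
        }
    }


# Placeholder locations; run this from the SochDB workspace root or pass real paths.
config = claude_desktop_entry("target/release/sochdb-mcp", "agent_db")
print(json.dumps(config, indent=2))
```

Merge the printed object into the existing config file rather than overwriting it, since other MCP servers may already be registered there.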


Tools the server exposes

All tools default to compact TOON output and can return JSON when you pass format: "json".

Core database tools

Tool | Purpose | Key args
sochdb_context_query | Fetch AI-optimized context with token budgeting; sections packed in priority order | sections[], token_budget (default 4096), format, truncation
sochdb_query | Run a SochQL query through a hardened, prefix-bounded parser | query, format, limit (default 100)
sochdb_get | Get the value at a path | path
sochdb_put | Set the value at a path (commits and double-fsyncs) | path, value
sochdb_delete | Delete the value at a path | path
sochdb_list_tables | List tables with semantic metadata | include_metadata (default true)
sochdb_describe | Detailed schema for one table | table

Query tool is multi-tenant hardened

sochdb_query parses queries through a hardened parser that enforces prefix-bounded scans and table-name validation. This keeps a tool call from walking the whole database or reaching another tenant's keys.
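
For orientation, here is how tools/call requests for a few of the core tools could be assembled on the client side. The tool and argument names come from the table above; the path layout (`agents/demo/style`), the use of a plain string as the value, and the `tool_call` helper are illustrative assumptions.

```python
import itertools

_request_ids = itertools.count(1)  # JSON-RPC ids only need to be unique per session


def tool_call(name: str, **arguments) -> dict:
    """Wrap a tool name and its arguments in a JSON-RPC tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical path and value, chosen only to show the argument shape.
put_req = tool_call("sochdb_put", path="agents/demo/style", value="4-space indent")
get_req = tool_call("sochdb_get", path="agents/demo/style")
list_req = tool_call("sochdb_list_tables", include_metadata=True)
```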

Memory tools

These operate over SochDB's episode / entity / event schema — the same data the Agent Memory SDK writes.

Tool | Purpose | Key args
memory_search_episodes | Similarity search over episodes | query, k (default 5), episode_type, entity_id
memory_get_episode_timeline | Event timeline for one episode | episode_id, max_events (default 50), role, include_metrics
memory_search_entities | Search entities | query, k (default 10), kind
memory_get_entity_facts | Facts about an entity | entity_id, include_episodes (default true), max_episodes (default 5)
memory_build_context | One-shot, token-budgeted context packing for a goal | goal, token_budget (default 4096), session_id, episode_id, entity_ids[], include_schema

Log tools

Tool | Purpose | Key args
logs_tail | Last N rows of a log table, most recent first | table, limit (default 20), where, columns[]
logs_timeline | Events in a time range for an entity | entity_id, from_ts, to_ts (microseconds since epoch), limit (default 100), table (default events)
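
Because from_ts and to_ts are in microseconds since the epoch, it is easy to be off by a factor of a thousand or a million when starting from seconds or milliseconds. The sketch below converts timezone-aware datetimes using integer arithmetic; `to_micros`, `last_hour_window`, and the entity id `user-42` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(dt: datetime) -> int:
    """Convert a timezone-aware datetime to integer microseconds since the epoch."""
    if dt.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    # timedelta // timedelta is exact integer division, so no float rounding occurs.
    return (dt - EPOCH) // timedelta(microseconds=1)


def last_hour_window(now: datetime) -> dict:
    """Return from_ts/to_ts arguments covering the hour before `now`."""
    return {"from_ts": to_micros(now - timedelta(hours=1)), "to_ts": to_micros(now)}


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
timeline_args = {"entity_id": "user-42", **last_hour_window(now), "limit": 100}
```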

Agentic search tools

SochDB's agentic search exposes three line-oriented tools backed by an in-memory corpus with a trigram index, giving sublinear, line-anchored hits — handy when an agent is grepping over ingested documents.
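
To show why a trigram index gives fast, line-anchored hits, here is a toy version of the general technique. It is not SochDB's implementation: it handles literal substrings only, keeps everything in Python dicts, and the `TrigramIndex` class and its method names are made up for this sketch. The `scope` and `limit` parameters mirror the sochdb_grep arguments.

```python
from collections import defaultdict


def trigrams(text: str) -> set:
    """All overlapping 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self):
        self.lines = {}                   # (doc_id, line_no) -> line text
        self.postings = defaultdict(set)  # trigram -> {(doc_id, line_no), ...}

    def add_document(self, doc_id: str, text: str) -> None:
        for line_no, line in enumerate(text.splitlines(), start=1):
            key = (doc_id, line_no)
            self.lines[key] = line
            for tri in trigrams(line):
                self.postings[tri].add(key)

    def grep(self, pattern: str, scope: str = "", limit: int = 50) -> list:
        tris = trigrams(pattern)
        if tris:
            # Only lines containing every trigram of the pattern can match.
            candidates = set.intersection(*(self.postings.get(t, set()) for t in tris))
        else:
            candidates = set(self.lines)  # pattern too short to index: full scan
        hits = sorted(
            key for key in candidates
            if key[0].startswith(scope) and pattern in self.lines[key]
        )
        return hits[:limit]


index = TrigramIndex()
index.add_document("src/a.py", "import os\nprint('hello world')\n")
index.add_document("docs/b.md", "hello there\n")
```

Each hit is a (doc_id, line_no) pair, which is the kind of anchor a follow-up peek or expand call needs.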

Tool | Purpose | Key args
sochdb_grep | Indexed grep with line-anchored hits | pattern, scope (doc-id prefix filter), limit (default 50)
sochdb_peek | Read a line range from a document | doc_id, start_line, end_line
sochdb_expand | Expand plus/minus N lines around a hit | doc_id, line, window (default 5)

Semantic vector search is gated and not yet wired

A semantic-search Cargo feature plus the env var SOCHDB_SEMANTIC_SEARCH=1 will load an embedding model (AllMiniLML6V2). Even when enabled, the actual vector search path is currently disabled in code (embeddings are generated but the vector backend is not wired in yet), so memory_search_* falls back to a keyword / text scan. Treat memory search as lexical for now.


Resources and views

The server also exposes SochDB content as MCP resources, so a client can read schema and pre-shaped views without issuing a tool call.

  • resources/list enumerates every table as sochdb://tables/<name>, annotated with semantic metadata: role, primaryKey, clusterKey, tsColumn, backedByVectorIndex, and embeddingDimension.
  • resources/read serves two URI shapes:
    • sochdb://tables/<name> — a table's rows and schema.
    • sochdb://views/<name> — a predefined view.

The predefined views are:

View URI | What it shows
sochdb://views/conversation_view | Conversation history
sochdb://views/tool_calls_view | Tool-call records
sochdb://views/error_view | Errors
sochdb://views/episode_summary_view | Episode summaries
sochdb://views/entity_directory_view | Entity directory

Resource output is text/x-toon by default, or application/json when the client requests format=json. Well-known memory tables (episodes, events, entities) carry richer metadata — for example, episodes and entities advertise backedByVectorIndex=true with embeddingDimension=1536.
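
A client-side helper for the two URI shapes might look like the sketch below. The view names are the five listed above and the resources/read request follows the standard MCP shape (a `uri` parameter); the helper names are hypothetical, and how a client asks for format=json is left out because it is not specified here.

```python
PREDEFINED_VIEWS = {
    "conversation_view",
    "tool_calls_view",
    "error_view",
    "episode_summary_view",
    "entity_directory_view",
}


def table_uri(name: str) -> str:
    return f"sochdb://tables/{name}"


def view_uri(name: str) -> str:
    if name not in PREDEFINED_VIEWS:
        raise ValueError(f"unknown view: {name}")
    return f"sochdb://views/{name}"


def parse_resource_uri(uri: str) -> tuple:
    """Split a sochdb:// URI into (kind, name), where kind is 'tables' or 'views'."""
    prefix = "sochdb://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a sochdb resource URI: {uri}")
    kind, _, name = uri[len(prefix):].partition("/")
    if kind not in ("tables", "views") or not name:
        raise ValueError(f"unsupported resource URI: {uri}")
    return kind, name


def read_request(request_id: int, uri: str) -> dict:
    """Build a JSON-RPC resources/read request for one resource."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": uri},
    }


req = read_request(3, view_uri("error_view"))
```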


A typical agent loop over MCP

Once the server is registered, an agent works with SochDB entirely through tool calls. A common pattern:

  1. Discover the schema: sochdb_list_tables (or read sochdb://tables/<name> resources).
  2. Store facts: sochdb_put for KV state, or write episodes via the Agent Memory SDK so memory_search_episodes can find them later.
  3. Retrieve under a budget: sochdb_context_query or memory_build_context to pack relevant context into the prompt within a token_budget.
  4. Search code/docs: sochdb_grep to find hits, then sochdb_peek / sochdb_expand to pull just the surrounding lines.

A minimal memory_build_context request looks like this:

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "memory_build_context",
    "arguments": {
      "goal": "What does the user prefer for code style?",
      "token_budget": 2048,
      "include_schema": false
    }
  }
}
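
To build intuition for what a token_budget does, the sketch below packs sections greedily in priority order on the client side. It is only an illustration of the idea, not the server's algorithm: tokens are approximated by whitespace-separated words, a section that does not fit is skipped rather than truncated, and `pack_sections` is a hypothetical name.

```python
def pack_sections(sections, token_budget):
    """Greedily pack (priority, name, text) sections; lower number = higher priority.

    Returns the names that fit and the approximate tokens used.
    """
    packed, used = [], 0
    for _priority, name, text in sorted(sections, key=lambda s: s[0]):
        cost = len(text.split())  # crude stand-in for a real tokenizer
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try lower priorities
        packed.append(name)
        used += cost
    return packed, used


sections = [
    (2, "history", "a b c d e"),
    (1, "prefs", "x y"),
    (3, "schema", "p q"),
]
result = pack_sections(sections, token_budget=5)
```

With a budget of 5, the high-priority `prefs` section goes in first, `history` no longer fits, and `schema` fills the remainder.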

The SochDB Agent Skill

Not every agent runs MCP. For agents that call the SDK directly (or for clients that load skills rather than tools), SochDB ships an Agent Skill — a single instruction file that teaches an agent the full install → connect → store → retrieve workflow with correct, current API names.

The bundle lives with these docs at /agent-skill/SKILL.md. It covers:

  • When to use SochDB: agent memory, vector/semantic search, RAG storage, or one local store for structured data plus embeddings plus chat history.
  • Install: pip install sochdb, npm install @sochdb/sochdb, go get github.com/sochdb/sochdb-go, cargo add sochdb.
  • Connect: embedded (a filesystem path) vs. server (gRPC host:port, default 50051).
  • Core operations: KV plus transactions, the KV-backed SQL subset, vector search, and the headline agent memory API.
  • MCP wiring: the same sochdb-mcp setup and underscore tool names described above.
  • Guardrails: use exact API names; scan_prefix needs a >= 2-byte prefix; and an honest list of features that are not fully shipped.

To use it, drop SKILL.md into your agent's skills/instructions directory (for example, an Anthropic Skill bundle, a Cursor rules file, or a system-prompt include). The file is self-contained and version-stamped.
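
One of the guardrails listed above, the 2-byte minimum for scan_prefix, can be enforced before the SDK is ever called. The sketch below only validates the prefix; it does not call scan_prefix itself, since its exact signature is not shown here, and `checked_prefix` is a hypothetical helper.

```python
MIN_PREFIX_BYTES = 2  # from the guardrail: scan_prefix needs a >= 2-byte prefix


def checked_prefix(prefix) -> bytes:
    """Return the prefix as bytes, or raise if it is shorter than two bytes."""
    raw = prefix.encode("utf-8") if isinstance(prefix, str) else bytes(prefix)
    if len(raw) < MIN_PREFIX_BYTES:
        raise ValueError(
            f"prefix {prefix!r} is {len(raw)} byte(s); need at least {MIN_PREFIX_BYTES}"
        )
    return raw


ok = checked_prefix("users/")
```

The check counts encoded bytes, so a single non-ASCII character such as "é" (two bytes in UTF-8) passes while a single ASCII character does not.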

Keep the skill and these docs in sync

The skill encodes the same non-negotiable facts as this site — versions, the split license, underscore MCP tool names, and the "do not over-claim" guardrails. If you fork it, update both together so an agent never learns a stale API name.


Honest limits to teach your agent

So your agent does not promise features that are not shipped, keep these straight:

  • MCP tool names use underscores (sochdb_query, not sochdb.query). The dot-named catalog in mcp.json / README is stale.
  • MultiShardHnswIndex is a Python-only convenience (a threaded scatter-gather wrapper in the native package), not a core-engine or server feature.
  • Semantic vector search in the MCP server is not yet wired — memory search falls back to lexical scan (see the caution above).
  • At-rest encryption (AES-256-GCM-SIV) exists as a library API but is not wired to a server CLI flag yet.
  • The Postgres-wire endpoint has no auth (cleartext, simple-query, loopback only) — never expose it.

Next steps