
SochDB v0.5.0 Release Notes

Release Date: February 15, 2026
Focus: Lock-free concurrency, MVCC consolidation, HNSW hot-path allocation removal, real compression, and the foundations for the Knowledge Fabric layer.

Engine-focused release

v0.5.0 is a core engine release. The bulk of the work lands in the Rust workspace crates (sochdb-core, sochdb-storage, sochdb-index). Most changes are internal performance and correctness improvements that are transparent to SDK callers: your existing Python, Node.js, Go, and Rust code keeps working unchanged.

The current shipping versions are: Core engine 2.0.3 · Python SDK 0.5.9 · Node.js SDK 0.5.3 · Go SDK 0.4.5. The notes below summarize the engine changelog as published for the 0.5.0 engine line.


Overview

This release is about removing locks from the read/write hot paths, consolidating duplicated MVCC machinery into a single reusable version-chain abstraction, eliminating per-search heap allocations in HNSW, and replacing placeholder compression with real LZ4 and Zstd. It also introduces two new building blocks, the content-addressed Knowledge Object data model and the sochdb-fusion crate, that lay groundwork for fused vector + graph + temporal query execution.

Highlights:

  • Lock-free epoch GC: the reader registry and version-chain store no longer take global write locks.
  • Consolidated MVCC version chain: one generic BinarySearchChain<E> backs both durable storage and concurrent MVCC.
  • Zero-DashMap HNSW hot path: searches resolve the entry point and neighbor nodes without DashMap lookups, with reused scratch buffers.
  • Lock-free fat-node version chain: 8 version pointers per cache line, CAS-serialized slot reservation.
  • SSI bloom filter: a 256-bit read-set bloom filter fast-rejects non-conflicting commits.
  • Real LZ4 / Zstd compression: placeholders replaced with lz4_flex and zstd.
  • Columnar zero-allocation row access: read a single value or a row view without materializing a per-row HashMap.
  • Knowledge Object data model: content-addressed objects with BLAKE3 OIDs.
  • sochdb-fusion crate: new workspace member for fused query execution.

Changed

Lock-Free Epoch GC (sochdb-core)

Epoch-based garbage collection no longer serializes readers behind a global lock.

  • Lock-free ReaderRegistry: the previous RwLock<HashMap<u64, u64>> is replaced by a fixed-size array of 256 cache-line-aligned AtomicU64 slots. register() uses a CAS, unregister() is a single atomic store, and min_active_epoch() is a relaxed scan. The registry is now fully lock-free, with no lock contention between readers.
  • DashMap-backed EpochGC: the version-chain store moves from RwLock<HashMap<K, VersionChain<T>>> to a DashMap, so a GC cycle no longer takes a global write lock.
  • Strict less-than epoch visibility: version_at(epoch) now uses epoch < N (previously <=), matching MVCC snapshot semantics.
  • O(1) GC truncation: VersionChain::gc() uses truncate(kept) instead of repeated pop_back() calls.
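
To make the slot-array idea concrete, here is a minimal sketch of a lock-free reader registry using only the Rust standard library. It is not the sochdb-core implementation: the EMPTY sentinel, the Option return types, and the exact memory orderings are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Sentinel meaning "slot is free" (assumption: real epochs never equal u64::MAX).
const EMPTY: u64 = u64::MAX;
const SLOTS: usize = 256;

/// One slot per 64-byte cache line so neighbouring readers do not false-share.
#[repr(align(64))]
struct Slot(AtomicU64);

pub struct ReaderRegistry {
    slots: [Slot; SLOTS],
}

impl ReaderRegistry {
    pub fn new() -> Self {
        Self { slots: std::array::from_fn(|_| Slot(AtomicU64::new(EMPTY))) }
    }

    /// Claims the first free slot with a CAS. Returns the slot index,
    /// or None when all 256 slots are occupied.
    pub fn register(&self, epoch: u64) -> Option<usize> {
        for (i, slot) in self.slots.iter().enumerate() {
            if slot
                .0
                .compare_exchange(EMPTY, epoch, Ordering::AcqRel, Ordering::Relaxed)
                .is_ok()
            {
                return Some(i);
            }
        }
        None
    }

    /// Releases a slot with a single atomic store.
    pub fn unregister(&self, idx: usize) {
        self.slots[idx].0.store(EMPTY, Ordering::Release);
    }

    /// Relaxed scan over all slots; None means no reader is active.
    pub fn min_active_epoch(&self) -> Option<u64> {
        self.slots
            .iter()
            .map(|s| s.0.load(Ordering::Relaxed))
            .filter(|&e| e != EMPTY)
            .min()
    }
}
```

A reader would call register() before touching versioned data and unregister() with the returned index when done; the GC can then reclaim anything older than min_active_epoch().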

Consolidated MVCC Version Chain (sochdb-core, sochdb-storage)

The duplicated version-chain logic that previously lived in both durable_storage and mvcc_concurrent is unified behind one generic abstraction.

  • BinarySearchChain<E>: a new generic binary-search version chain in sochdb-core::version_chain captures the O(log V) partition_point lookup that was duplicated in durable_storage::VersionChain and mvcc_concurrent::VersionChain. Both modules now wrap BinarySearchChain<E> and delegate core operations to it.
  • ChainEntry trait: an abstraction over version entry types (commit_ts, txn_id, set_commit_ts), implemented by durable_storage::Version and mvcc_concurrent::VersionEntry.
  • MvccVersionChain / MvccVersionChainMut traits: a unified read/write interface for any version-chain implementation.
  • MvccStore trait: a unified store interface (mvcc_get, mvcc_put, mvcc_commit_key, mvcc_abort_key, mvcc_gc), implemented by MvccMemTable.
  • Compile-time concurrency markers: ExternalLock, InternalRwLock, and LockFreeAtomic marker types let callers select a version-chain strategy generically at compile time.
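
The partition_point lookup can be sketched as follows. This is an illustration under assumptions, not the sochdb-core code: the trait is reduced to commit_ts only, the visible_at method name and the Version struct are hypothetical, and the strict less-than comparison is borrowed from the epoch visibility rule described earlier.

```rust
/// Reduced stand-in for the ChainEntry trait (only commit_ts is modelled here).
pub trait ChainEntry {
    fn commit_ts(&self) -> u64;
}

/// Version chain kept sorted by ascending commit timestamp.
pub struct BinarySearchChain<E> {
    entries: Vec<E>,
}

impl<E: ChainEntry> BinarySearchChain<E> {
    pub fn new() -> Self {
        Self { entries: Vec::new() }
    }

    /// Appends a version; callers must push in non-decreasing commit_ts order.
    pub fn push(&mut self, entry: E) {
        debug_assert!(self
            .entries
            .last()
            .map_or(true, |last| last.commit_ts() <= entry.commit_ts()));
        self.entries.push(entry);
    }

    /// O(log V): newest entry whose commit_ts is strictly below snapshot_ts.
    pub fn visible_at(&self, snapshot_ts: u64) -> Option<&E> {
        // Index of the first entry that is NOT visible to this snapshot.
        let idx = self.entries.partition_point(|e| e.commit_ts() < snapshot_ts);
        idx.checked_sub(1).map(|i| &self.entries[i])
    }
}

/// Hypothetical entry type used only for the example.
pub struct Version {
    pub commit_ts: u64,
    pub value: &'static str,
}

impl ChainEntry for Version {
    fn commit_ts(&self) -> u64 {
        self.commit_ts
    }
}
```

Because the chain is generic over the entry type, a storage-side and an in-memory entry type can share the same lookup by each implementing the trait.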

HNSW Search: Zero-DashMap Hot Path (sochdb-index)

Vector searches no longer touch a DashMap on every step, and no longer allocate per query.

  • AtomicU32 entry point: a new entry_point_dense field removes the DashMap lookup that previously resolved the entry point on every search. It is set during insert and read with an Acquire load.
  • Local id→dense cache: each query builds a per-query HashMap<u128, u32> cache populated from internal_nodes neighbor traversal; every subsequent curr node resolves from the local cache instead of the DashMap.
  • Scratch buffer reuse: fast_candidates and fast_results_heap move into a thread-local ScratchBuffers, eliminating the per-search heap allocations that backed the FastCandidate heaps.
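
The scratch-buffer pattern can be illustrated with a thread-local heap that is cleared and reused across calls. This is a simplified sketch, not the sochdb-index code: top_k and scratch_capacity are hypothetical helpers, distances are plain integers to keep the ordering trivial, and only a result heap is modelled.

```rust
use std::cell::RefCell;
use std::collections::BinaryHeap;

/// Per-thread buffers that keep their capacity between searches.
#[derive(Default)]
struct ScratchBuffers {
    /// Max-heap of (distance, id); the worst kept result sits on top.
    results: BinaryHeap<(u32, u32)>,
}

thread_local! {
    static SCRATCH: RefCell<ScratchBuffers> = RefCell::new(ScratchBuffers::default());
}

/// Writes the k nearest (distance, id) pairs into `out`, sorted ascending.
/// After warm-up the heap does not reallocate for same-sized searches.
pub fn top_k(candidates: &[(u32, u32)], k: usize, out: &mut Vec<(u32, u32)>) {
    SCRATCH.with(|cell| {
        let mut scratch = cell.borrow_mut();
        scratch.results.clear(); // drops elements, keeps the allocation
        for &c in candidates {
            scratch.results.push(c);
            if scratch.results.len() > k {
                scratch.results.pop(); // evict the current worst
            }
        }
        out.clear();
        out.extend(scratch.results.drain());
        out.sort_unstable();
    });
}

/// Exposes the heap capacity so reuse can be observed.
pub fn scratch_capacity() -> usize {
    SCRATCH.with(|cell| cell.borrow().results.capacity())
}
```

The design point is that clear() and drain() leave the backing allocation in place, so repeated searches on the same thread pay for the heap once.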

HNSW defaults are unchanged

This release changes how the hot path executes, not the index parameters. Core HnswConfig defaults remain M=32, ef_construction=256, ef_search=500, metric Cosine, precision F32. The dimension-aware behavior in the engine is the brute-force flat-scan threshold (<=128D → 10000 vectors, <=384D → 4000, otherwise 1000); there is no dimension-dependent ef_search split.
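
The flat-scan thresholds quoted above amount to a small lookup. The sketch below restates them as code; the function names are hypothetical, and whether the engine compares the vector count with <= or < is an assumption.

```rust
/// Maximum collection size for which a brute-force scan is preferred,
/// using the thresholds listed in the release notes.
pub fn flat_scan_threshold(dim: usize) -> usize {
    if dim <= 128 {
        10_000
    } else if dim <= 384 {
        4_000
    } else {
        1_000
    }
}

/// Assumption: the boundary is inclusive (count <= threshold).
pub fn use_flat_scan(dim: usize, n_vectors: usize) -> bool {
    n_vectors <= flat_scan_threshold(dim)
}
```

Under these assumptions, a 768-dimensional collection with 500 vectors would be scanned directly, while one with 5000 vectors would go through the HNSW graph.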

Lock-Free Fat-Node Version Chain (sochdb-storage)

A new fat-node layout reduces pointer chasing during version traversal.

  • FatNode struct: groups 8 version pointers per node, packed into a 64-byte cache line, reducing pointer chases from O(v) to O(v/8). A CAS on an AtomicU8 count serializes slot reservation within a node.
  • LockFreeVersionChain: a fat-node linked list; try_push appends within the current node and only allocates a new node when the current one is full.
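
Slot reservation by CAS on a small counter might look like the sketch below. It is an illustration only: versions are represented as non-zero u64 handles instead of pointers, the node-to-node link is omitted, and the exact memory layout of the real FatNode is not reproduced.

```rust
use std::sync::atomic::{AtomicU64, AtomicU8, Ordering};

const SLOTS_PER_NODE: usize = 8;

/// Eight 8-byte slots span 64 bytes. In this sketch the counter lives in the
/// same struct, so the struct as a whole is larger than one cache line.
#[repr(align(64))]
pub struct FatNode {
    slots: [AtomicU64; SLOTS_PER_NODE],
    count: AtomicU8,
}

impl FatNode {
    pub fn new() -> Self {
        Self {
            slots: std::array::from_fn(|_| AtomicU64::new(0)),
            count: AtomicU8::new(0),
        }
    }

    /// Reserves the next slot with a CAS on `count`, then publishes the value.
    /// Returns Err(version) when the node is full so the caller can allocate
    /// a new node. Assumption: version handles are never 0.
    pub fn try_push(&self, version: u64) -> Result<usize, u64> {
        let mut n = self.count.load(Ordering::Acquire);
        loop {
            if n as usize >= SLOTS_PER_NODE {
                return Err(version);
            }
            match self
                .count
                .compare_exchange_weak(n, n + 1, Ordering::AcqRel, Ordering::Acquire)
            {
                Ok(_) => {
                    self.slots[n as usize].store(version, Ordering::Release);
                    return Ok(n as usize);
                }
                Err(actual) => n = actual, // another writer won; retry
            }
        }
    }

    /// Reads slot i; 0 means reserved but not yet published.
    pub fn get(&self, i: usize) -> Option<u64> {
        if i >= self.count.load(Ordering::Acquire) as usize {
            return None;
        }
        match self.slots[i].load(Ordering::Acquire) {
            0 => None,
            v => Some(v),
        }
    }
}
```

Only the writer that wins the CAS for a given count value owns that slot, which is what serializes reservation without a lock.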

SSI Conflict Detection: Stack-Allocated Keys + Bloom Filter (sochdb-storage)

Serializable Snapshot Isolation commit checks are faster and allocate less.

  • InlineKey (SmallVec<[u8; 32]>): read and write sets use stack-allocated keys; keys of 32 bytes or fewer avoid heap allocation. record_read / record_write now accept &[u8] instead of Vec<u8>.
  • 256-bit Bloom filter: each SsiTransaction carries a 4×u64 Bloom filter over its read set. On commit(), non-conflicting write sets are fast-rejected before the full read set is scanned.
  • DashMap-backed SsiManager: key_writers moves from RwLock<HashMap> to a DashMap for concurrent shard-level access.
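
A 256-bit Bloom filter stored as four u64 words can be sketched as below. The hashing scheme (two bit positions derived from the standard library's DefaultHasher) and the type and function names are assumptions for illustration; the notes do not say how the engine derives its bit positions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// 256 bits = 4 x u64, cheap to copy and to keep inline in a transaction.
#[derive(Default, Clone, Copy)]
pub struct ReadSetBloom {
    bits: [u64; 4],
}

impl ReadSetBloom {
    /// Two bit positions in 0..256 taken from one 64-bit hash (assumed scheme).
    fn positions(key: &[u8]) -> [usize; 2] {
        let mut h = DefaultHasher::new();
        key.hash(&mut h);
        let x = h.finish();
        [(x & 0xFF) as usize, ((x >> 8) & 0xFF) as usize]
    }

    pub fn insert(&mut self, key: &[u8]) {
        for p in Self::positions(key) {
            self.bits[p / 64] |= 1u64 << (p % 64);
        }
    }

    /// false means "definitely not in the read set"; true means "maybe".
    pub fn may_contain(&self, key: &[u8]) -> bool {
        Self::positions(key)
            .iter()
            .all(|&p| self.bits[p / 64] & (1u64 << (p % 64)) != 0)
    }
}

/// Fast pre-check: if no written key can be in the read set, the full
/// read-set scan can be skipped.
pub fn may_conflict(read_bloom: &ReadSetBloom, write_set: &[&[u8]]) -> bool {
    write_set.iter().any(|k| read_bloom.may_contain(k))
}
```

A false result is definitive, so the commit path can skip the exact check; a true result still requires scanning the real read set because Bloom filters admit false positives.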

Real LZ4 / Zstd Compression (sochdb-storage)

Compression is no longer a placeholder.

  • LZ4 block compression: implemented with lz4_flex::compress_prepend_size() / decompress_size_prepended(). The wire format is [original_len: u32 LE][payload], with an uncompressed fallback sentinel (len=0).
  • Zstd compression: implemented with zstd::encode_all() / zstd::decode_all() at a configurable compression level.
  • Faster dedup hashing: switched from DefaultHasher (SipHash) to twox_hash::xxh3::hash64(), roughly 5× faster in this non-adversarial context.
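
The block framing described for LZ4 can be sketched without any compression library. The code below only models the header: a little-endian u32 length followed by the payload, with 0 read as the "stored uncompressed" sentinel. The Block enum and both functions are hypothetical names, and how the engine treats a genuinely empty input is not covered by the notes.

```rust
/// Decoded view of a framed block (illustrative type, not a SochDB API).
#[derive(Debug, PartialEq)]
pub enum Block<'a> {
    /// Header length 0: payload bytes are stored as-is.
    Stored(&'a [u8]),
    /// Header length > 0: payload is compressed and expands to original_len bytes.
    Compressed { original_len: u32, payload: &'a [u8] },
}

/// Frames data using the uncompressed fallback: [0u32 LE][raw bytes].
pub fn frame_stored(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(4 + data.len());
    out.extend_from_slice(&0u32.to_le_bytes());
    out.extend_from_slice(data);
    out
}

/// Splits a framed buffer into header and payload; None if the header is truncated.
pub fn parse_block(buf: &[u8]) -> Option<Block<'_>> {
    if buf.len() < 4 {
        return None;
    }
    let original_len = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let payload = &buf[4..];
    Some(if original_len == 0 {
        Block::Stored(payload)
    } else {
        Block::Compressed { original_len, payload }
    })
}
```

A decoder would hand the Compressed payload to the LZ4 decompressor with original_len as the expected output size, and return the Stored payload unchanged.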

LSCS Temperature Tracker: Lock-Free Threshold (sochdb-storage)

  • AtomicU64 hot threshold: hot_threshold is stored as an AtomicU64 bit pattern (f64::to_bits() / from_bits()) for lock-free reads, and set_hot_threshold() is now a real implementation (previously a no-op).
  • Selective hot-column merge: compact_selective() reads and merges hot column stripes into L1, while cold columns get zero-I/O ColumnStripeRef references.
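
Storing an f64 in an AtomicU64 through its bit pattern is a standard technique, sketched below. The struct name, the is_hot helper, and the Relaxed ordering are assumptions; only hot_threshold and set_hot_threshold are named in the notes.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Holds an f64 threshold as raw bits so it can be read without a lock.
pub struct TemperatureTracker {
    hot_threshold_bits: AtomicU64,
}

impl TemperatureTracker {
    pub fn new(hot_threshold: f64) -> Self {
        Self { hot_threshold_bits: AtomicU64::new(hot_threshold.to_bits()) }
    }

    /// Lock-free read: load the bits and reinterpret them as f64.
    pub fn hot_threshold(&self) -> f64 {
        f64::from_bits(self.hot_threshold_bits.load(Ordering::Relaxed))
    }

    /// Lock-free write: the whole 64-bit pattern is replaced atomically.
    pub fn set_hot_threshold(&self, value: f64) {
        self.hot_threshold_bits.store(value.to_bits(), Ordering::Relaxed);
    }

    /// Hypothetical helper: a column is hot at or above the threshold.
    pub fn is_hot(&self, temperature: f64) -> bool {
        temperature >= self.hot_threshold()
    }
}
```

Because to_bits() and from_bits() round-trip exactly, readers always observe either the old or the new threshold, never a torn value.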

Feature Flag Hygiene (workspace-wide)

  • sochdb-vector: simd-kernels is consolidated into simd. A deprecated alias is kept for backward compatibility and will be removed in v0.6.
  • sochdb-index: async-trait is now optional behind the llm-embeddings feature; a new async feature gates the tokio opt-in.
  • sochdb-core: serde_json moves from optional (analytics-only) to an always-on dependency; blake3 is added as a workspace dependency.
  • deny.toml: sochdb-index is added to the license-check exclusion list.
  • Workspace: the new sochdb-fusion crate is added to the workspace members.

Added

Columnar Zero-Allocation Row Access (sochdb-core, sochdb-storage)

You can now read columnar query results one value or one row at a time without building a per-row HashMap.

  • TypedColumn::value_at(idx): extracts a single SochValue from a columnar array without materializing a row map.
  • ColumnarQueryResult::row_view(idx): returns a ColumnarRowView that resolves column values on demand from the underlying arrays: O(1) column lookup plus O(1) array read, with zero allocation per row.
  • ColumnarRowView::get(column): named-column access without HashMap overhead.
  • ColumnarQueryResult::into_query_result(): a backward-compatible materialization back to the row-oriented QueryResult.
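
The row-view idea is that a row is just a borrowed (result, index) pair, and values are read from the column arrays on demand. The sketch below is a simplified model, not the SochDB types: ValueRef stands in for SochValue, only two column types exist, and column names are resolved by a linear scan rather than the O(1) lookup the notes describe.

```rust
/// Borrowed value, so reading a cell allocates nothing (stand-in for SochValue).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ValueRef<'a> {
    Int(i64),
    Text(&'a str),
}

pub enum TypedColumn {
    Int(Vec<i64>),
    Text(Vec<String>),
}

impl TypedColumn {
    /// Reads one cell directly from the column array.
    pub fn value_at(&self, idx: usize) -> Option<ValueRef<'_>> {
        match self {
            TypedColumn::Int(v) => v.get(idx).map(|&x| ValueRef::Int(x)),
            TypedColumn::Text(v) => v.get(idx).map(|s| ValueRef::Text(s.as_str())),
        }
    }
}

pub struct ColumnarQueryResult {
    names: Vec<String>,
    columns: Vec<TypedColumn>,
}

/// A row is only a reference plus an index; nothing is copied.
pub struct ColumnarRowView<'a> {
    result: &'a ColumnarQueryResult,
    idx: usize,
}

impl ColumnarQueryResult {
    pub fn new(cols: Vec<(&str, TypedColumn)>) -> Self {
        let (names, columns): (Vec<String>, Vec<TypedColumn>) =
            cols.into_iter().map(|(n, c)| (n.to_string(), c)).unzip();
        Self { names, columns }
    }

    pub fn row_view(&self, idx: usize) -> ColumnarRowView<'_> {
        ColumnarRowView { result: self, idx }
    }
}

impl<'a> ColumnarRowView<'a> {
    /// Named-column access; linear scan over names in this sketch.
    pub fn get(&self, column: &str) -> Option<ValueRef<'a>> {
        let result: &'a ColumnarQueryResult = self.result;
        let col = result.names.iter().position(|n| n.as_str() == column)?;
        result.columns[col].value_at(self.idx)
    }
}
```

Iterating rows then costs one small struct per row instead of a HashMap per row, which is the saving the new API is aiming for.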

DurableStorage Fast-Path APIs (sochdb-storage)

Lightweight read paths that skip full MVCC bookkeeping for read-only access.

  • begin_read_only_fast() / abort_read_only_fast(): read transactions that bypass the active_txns DashMap and full MVCC bookkeeping.
  • read_latest(key): a single-key read at the current timestamp with no transaction overhead.
  • scan_latest(prefix): a prefix scan at the current timestamp with no transaction overhead.

Knowledge Object Data Model (sochdb-core): New Module

  • knowledge_object.rs: a content-addressed KnowledgeObject with a BLAKE3 object ID (OID), embedded edges, multi-space embeddings, bitemporal coordinates, and provenance chains. This is the foundation for the planned Knowledge Fabric layer.

Library foundation, not a wired-up feature yet

The Knowledge Object module ships as a sochdb-core building block. It is not yet exposed as a first-class API in the language SDKs or the server in this release; treat it as a foundation for future work rather than an end-user feature.

sochdb-fusion Crate: New

  • A new workspace member for fused query execution across vector, graph, and temporal predicates.

New crate, foundation stage

sochdb-fusion is added to the workspace in this release as the home for fused execution. It is early-stage scaffolding for upcoming cross-modal query work, not yet a stable public API.

sochdb-bench Crate: New

  • A Criterion micro-benchmark suite covering HNSW, storage, and MVCC, with optimization results documented in the engine's OPTIMIZATIONS.md.

Fixed

  • Compression divide-by-zero: CompressionEngine now guards with compressed.len() > 0 before computing the compression ratio.
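
The guard can be pictured as follows. This is a sketch under assumptions: the function name is hypothetical, the ratio is taken as original size over compressed size, and returning None for an empty output is one possible way to handle the guarded case.

```rust
/// Returns original/compressed size, or None when the compressed buffer is
/// empty and the ratio is undefined.
pub fn compression_ratio(original: &[u8], compressed: &[u8]) -> Option<f64> {
    if compressed.len() > 0 {
        Some(original.len() as f64 / compressed.len() as f64)
    } else {
        None
    }
}
```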

Version Bump: 0.4.9 → 0.5.0

The engine line moved from 0.4.9 to 0.5.0 across all 13 workspace crates, plus sochdb-kernel, sochdb-plugin-logging, and sochdb-python (both its Cargo.toml and pyproject.toml). Docker tags, docs, and READMEs were updated to match.


Upgrading

For SDK users, no code changes are required by this release; the changes are internal to the engine. Make sure you are on a current toolchain and current SDK builds:

pip install --upgrade sochdb

# General-purpose embedded usage (pure-Python SDK, v0.5.9).
# Database/Namespace/Collection come from this SDK package.
from sochdb import Database

db = Database.open("./my_db")
db.put(b"agent/session/1", b"hello")
print(db.get(b"agent/session/1"))
db.close()

Two Python packages, both imported as sochdb

There are two importable sochdb packages. The pure-Python ctypes SDK (v0.5.9) above is the broad embedded + server SDK (Database, Namespace, Collection, Queue, AgentMemory, temporal-graph, semantic-cache, StudioClient) β€” prefer it for general use. The separate PyO3 native engine (v2.0.3) exposes index primitives such as HnswIndex, BM25Index, RRFFusion, ThreeLaneHybridIndex, MultiShardHnswIndex, TableDatabase, the build_index* helpers, and recommended_hnsw_params. MultiShardHnswIndex is a threaded scatter-gather wrapper that exists only in the native Python package; it is not a core-engine or server type.

Deprecation in the next release

The sochdb-vector simd-kernels feature flag is deprecated in 0.5.0 in favor of simd. The alias still works in 0.5.x but will be removed in v0.6; switch any explicit simd-kernels usage to simd now.


License

SochDB is split-licensed by component:

  • The core engine (the Rust workspace, the published sochdb crate, the server, and the MCP server) is AGPL-3.0-or-later, with commercial licensing available.
  • The language SDKs (Python, Node.js, Go) are Apache-2.0.

Resources