SochDB SQL Surface & Compatibility
This document defines SochDB's SQL dialect support and the canonical pipeline for SQL execution.
Architecture Overviewβ
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β SQL Query Lifecycle β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β β
β βββββββββββββββ βββββββββββββββ βββββββββββββββ β
β β SQL Text β --> β Lexer β --> β Parser β β
β β (any dialect)β β (tokenize) β β (parse) β β
β βββββββββββββββ βββββββββββββββ βββββββββββββββ β
β β β
β v β
β βββββββββββββββββββββββββββββββββββββββββ β
β β Canonical AST β β
β β (dialect-normalized representation) β β
β βββββββββββββββββββββββββββββββββββββββββ β
β β β
β βββββββββββββββββββββββββββββββΌββββββββββββββββββββββ β
β β β β β
β v v v β
β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ βββββββββ β
β β Validator β β Planner β βExecutorβ β
β β (semantic checks) β β (optimize + plan) β β (run) β β
β βββββββββββββββββββββββββββ βββββββββββββββββββββββββββ βββββββββ β
β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Core SQL Support (Guaranteed)β
Data Manipulation Language (DML)β
| Statement | Support | Example |
|---|---|---|
SELECT | β Full | SELECT id, name FROM users WHERE age > 21 |
INSERT | β Full | INSERT INTO users (id, name) VALUES (1, 'Alice') |
UPDATE | β Full | UPDATE users SET name = 'Bob' WHERE id = 1 |
DELETE | β Full | DELETE FROM users WHERE id = 1 |
Data Definition Language (DDL)β
| Statement | Support | Example |
|---|---|---|
CREATE TABLE | β Full | CREATE TABLE users (id INT PRIMARY KEY, name TEXT) |
DROP TABLE | β Full | DROP TABLE users |
ALTER TABLE | π Partial | ALTER TABLE users ADD COLUMN email TEXT |
CREATE INDEX | β Full | CREATE INDEX idx_name ON users (name) |
DROP INDEX | β Full | DROP INDEX idx_name |
Idempotent DDLβ
| Statement | Support | Behavior |
|---|---|---|
CREATE TABLE IF NOT EXISTS | β Full | No-op if table exists |
DROP TABLE IF EXISTS | β Full | No-op if table doesn't exist |
CREATE INDEX IF NOT EXISTS | β Full | No-op if index exists |
DROP INDEX IF EXISTS | β Full | No-op if index doesn't exist |
Transactionsβ
| Statement | Support | Notes |
|---|---|---|
BEGIN | β Full | Start transaction |
COMMIT | β Full | Commit transaction |
ROLLBACK | β Full | Rollback transaction |
SAVEPOINT | π Partial | Named savepoints |
Dialect Compatibility (Conflict/Upsert Family)β
SochDB normalizes dialect-specific INSERT variants to a canonical AST representation.
PostgreSQL Styleβ
-- Do nothing on conflict
INSERT INTO users (id, name) VALUES (1, 'Alice')
ON CONFLICT DO NOTHING;
-- Update on conflict (specific columns)
INSERT INTO users (id, name) VALUES (1, 'Alice')
ON CONFLICT (id) DO UPDATE SET name = 'Bob';
MySQL Styleβ
-- Ignore on duplicate (equivalent to ON CONFLICT DO NOTHING)
INSERT IGNORE INTO users (id, name) VALUES (1, 'Alice');
-- Update on duplicate key
INSERT INTO users (id, name) VALUES (1, 'Alice')
ON DUPLICATE KEY UPDATE name = 'Bob';
SQLite Styleβ
-- Ignore on conflict
INSERT OR IGNORE INTO users (id, name) VALUES (1, 'Alice');
-- Replace on conflict (delete + insert)
INSERT OR REPLACE INTO users (id, name) VALUES (1, 'Alice');
-- Abort on conflict (default behavior)
INSERT OR ABORT INTO users (id, name) VALUES (1, 'Alice');
-- Fail on conflict (fail but continue batch)
INSERT OR FAIL INTO users (id, name) VALUES (1, 'Alice');
Internal Representationβ
All dialect forms normalize to:
InsertStmt {
on_conflict: Some(OnConflict {
target: Option<ConflictTarget>, // (id) or ON CONSTRAINT name
action: ConflictAction, // DoNothing, DoUpdate(...), DoReplace, etc.
})
}
Parameterized Queriesβ
SochDB supports two placeholder styles:
Positional Placeholders ($1, $2, ...)β
SELECT * FROM users WHERE id = $1 AND name = $2
Question Mark Placeholders (?)β
SELECT * FROM users WHERE id = ? AND name = ?
Question marks are automatically indexed (1, 2, 3...) during lexing.
Parameter Bindingβ
let result = executor.execute_with_params(
"SELECT * FROM users WHERE id = $1",
&[SochValue::Int(42)]
)?;
Query Featuresβ
SELECT Clausesβ
DISTINCT/ALLFROMwith table aliasesWHEREwith complex predicatesGROUP BYwithHAVINGORDER BYwithASC/DESC/NULLS FIRST/NULLS LASTLIMITandOFFSET- Set operations:
UNION,INTERSECT,EXCEPT
Expressionsβ
- Arithmetic:
+,-,*,/,% - Comparison:
=,!=,<>,<,<=,>,>= - Logical:
AND,OR,NOT IS NULL,IS NOT NULLIN (...),NOT IN (...)BETWEEN ... AND ...LIKEwith wildcardsCASE WHEN ... THEN ... ELSE ... ENDCAST(expr AS type)- Function calls:
COUNT(),SUM(),AVG(), etc.
SochDB Extensionsβ
VECTOR(dimensions)data typeEMBEDDING(dimensions)data typeVECTOR_SEARCH(column, query, k, metric)functionCONTEXT_WINDOWfor LLM context management
Explicit Limitationsβ
| Feature | Status | Notes |
|---|---|---|
| Multi-table JOINs | π Partial | Two-table INNER JOIN only |
| LEFT/RIGHT/FULL JOIN | π Planned | Not yet implemented |
| Correlated subqueries | β | Out of scope |
| Window functions | π Planned | Future enhancement |
| CTEs (WITH clause) | π Planned | Future enhancement |
| Stored procedures | β | Out of scope |
| Triggers | β | Out of scope |
Complexity Analysisβ
| Operation | Time Complexity | Notes |
|---|---|---|
| Lexing | O(n) | n = input length |
| Parsing | O(n) | n = token count |
| AST rewriting | O( | AST |
| SELECT (no index) | O(N) | N = table rows |
| SELECT (with index) | O(log N + K) | K = result rows |
| INSERT | O(log N) | B-Tree index update |
| INSERT (conflict check) | O(log N) | Uniqueness check |
| UPDATE (no index) | O(N) | Full scan |
| UPDATE (with index) | O(log N + K) | K = affected rows |
Client Usageβ
Using the AST-Based Query Executorβ
The recommended way to execute SQL is via the AST-based query executor:
use sochdb::connection::SochConnection;
use sochdb::ast_query::QueryResult;
let conn = SochConnection::open("./data")?;
// Execute SQL using AST-based parser (recommended)
match conn.query_ast("SELECT * FROM users WHERE active = true")? {
QueryResult::Select(rows) => {
for row in rows {
println!("{:?}", row);
}
}
_ => {}
}
// Execute with parameters
let result = conn.query_ast_params(
"INSERT INTO users (id, name) VALUES ($1, $2)",
&[SochValue::Int(1), SochValue::Text("Alice".to_string())]
)?;
// Execute non-query SQL (INSERT, UPDATE, DELETE)
let rows_affected = conn.execute_ast("DELETE FROM users WHERE id = 1")?;
Dialect Supportβ
The AST-based executor automatically normalizes dialect-specific syntax:
// All of these normalize to the same canonical AST:
// MySQL style
conn.execute_ast("INSERT IGNORE INTO users VALUES (1, 'Alice')")?;
// PostgreSQL style
conn.execute_ast("INSERT INTO users VALUES (1, 'Alice') ON CONFLICT DO NOTHING")?;
// SQLite style
conn.execute_ast("INSERT OR IGNORE INTO users VALUES (1, 'Alice')")?;
Filesβ
sochdb-query/src/sql/compatibility.rs- Feature matrix and dialect detectionsochdb-query/src/sql/token.rs- Token types including dialect keywordssochdb-query/src/sql/lexer.rs- Tokenizer with placeholder indexingsochdb-query/src/sql/ast.rs- Canonical AST definitionssochdb-query/src/sql/parser.rs- Recursive descent parsersochdb-query/src/sql/bridge.rs- Unified execution pipelinesochdb-query/src/sql/error.rs- Error typessochdb-query/src/sql/mod.rs- Module exportssochdb/src/ast_query.rs- AST-based client query executor