Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

DML

Data Manipulation Language: reads, writes, updates, deletes, merges. SQE adds Iceberg time-travel clauses on SELECT, an Iceberg-aware MERGE INTO, and a SET WRITE_BRANCH shortcut for routing writes to a named branch.

Source: sqe-sql/src/time_travel.rs, crates/sqe-coordinator/src/{query_handler, write_handler}.rs.

SELECT

FormOriginNotesTrinoSnowflakeSpark SQLDuckDB
SELECT cols FROM t [WHERE ...] [GROUP BY ...] [HAVING ...] [ORDER BY ...] [LIMIT N]datafusion-builtinStandard SQL.yesyesyesyes
WITH cte AS (...) SELECT ...datafusion-builtinCTE. Multiple CTEs allowed.yesyesyesyes
WITH RECURSIVE cte AS (...) SELECT ...datafusion-builtinRecursive CTE.yesyesnoyes
SELECT DISTINCT cols FROM tdatafusion-builtinDistinct rows.yesyesyesyes
SELECT * EXCLUDE (col, ...) FROM tdatafusion-builtinExclude columns from *.yes--yes
SELECT * REPLACE (expr AS col) FROM tdatafusion-builtinSubstitute one or more columns.yes--yes
SELECT cols FROM t1 [INNER|LEFT|RIGHT|FULL] JOIN t2 ON ...datafusion-builtinAll join types including SEMI / ANTI.yesyesyesyes
SELECT cols FROM t1 USING (col1, col2)datafusion-builtinEquality on shared column names.yesyesyesyes
SELECT cols FROM t1, LATERAL (SELECT ... FROM t2 WHERE ...)datafusion-builtinCorrelated subquery in FROM.yesyespartialyes
SELECT cols FROM t TABLESAMPLE BERNOULLI (5)datafusion-builtinRandom sampling.yesyesyesyes

Time travel (Iceberg-specific)

FormOriginNotes
SELECT ... FROM t FOR VERSION AS OF snapshot_idsqe-sql/time_travel.rsRead a specific snapshot (snapshot id, branch name, or tag name).
SELECT ... FROM t FOR SYSTEM_TIME AS OF timestampdatafusion-builtin (sqlparser native)Read snapshot active at the given timestamp.
SELECT ... FROM t FOR INCREMENTAL BETWEEN SNAPSHOT s1 AND SNAPSHOT s2sqe-sql/time_travel.rsSQE-specific. Returns rows added between two snapshots; useful for CDC-style processing.
-- By snapshot id
SELECT * FROM events FOR VERSION AS OF 8472810294;

-- By branch
SELECT * FROM events FOR VERSION AS OF 'dev_2026_05';

-- By tag
SELECT * FROM events FOR VERSION AS OF 'release_2026_q2';

-- By timestamp
SELECT * FROM events FOR SYSTEM_TIME AS OF TIMESTAMP '2026-04-01 00:00:00';

-- Incremental between two snapshots
SELECT * FROM events
FOR INCREMENTAL BETWEEN SNAPSHOT 1234 AND SNAPSHOT 5678;

INSERT

FormOriginNotes
INSERT INTO t (cols) VALUES (...), (...)sqlparser-rs + sqe-coordinatorMulti-row literal insert.
INSERT INTO t SELECT ... FROM ssqlparser-rs + sqe-coordinatorInsert from query.
INSERT INTO t (col1, col2) SELECT ... FROM ssqlparser-rs + sqe-coordinatorSubset of columns; others get DEFAULT or NULL.
INSERT OVERWRITE t SELECT ... FROM ssqlparser-rs + sqe-coordinatorReplace partition or table data. Targets the partitions implied by the SELECT.
INSERT INTO events VALUES
    (1, 'click', TIMESTAMP '2026-05-08 09:00:00'),
    (2, 'view',  TIMESTAMP '2026-05-08 09:01:00');

INSERT INTO events (id, event_type, occurred_at)
SELECT id, kind, ts FROM staging.raw_events;

UPDATE

FormOriginNotesTrinoSnowflakeSpark SQLDuckDB
UPDATE t SET col = expr [WHERE pred]sqlparser-rs + sqe-coordinatorCoW or MoR by table property.yesyesyesyes
UPDATE t SET col1 = e1, col2 = e2 [WHERE pred]sqlparser-rs + sqe-coordinatorMulti-column set.yesyesyesyes
UPDATE t SET col = expr FROM other o WHERE t.k = o.ksqlparser-rs + sqe-coordinatorUpdate from another table.partialyesyesyes
UPDATE orders
SET status = 'shipped', shipped_at = now()
WHERE tracking_id IS NOT NULL;

DELETE

FormOriginNotesTrinoSnowflakeSpark SQLDuckDB
DELETE FROM t [WHERE pred]sqlparser-rs + sqe-coordinatorCoW (default) or MoR (write.delete.mode = 'merge-on-read').yesyesyesyes
DELETE FROM t USING other o WHERE t.k = o.ksqlparser-rs + sqe-coordinatorDelete by join.partialyesyesyes
TRUNCATE TABLE tsqe-sql/classifier.rsRewrites to DELETE FROM t. Same MoR / CoW behaviour.yesyesyesyes
DELETE FROM events WHERE event_type = 'spam';
TRUNCATE TABLE staging.tmp;

MERGE

FormOriginNotes
MERGE INTO t USING s ON cond WHEN MATCHED THEN UPDATE SET ... WHEN NOT MATCHED THEN INSERT (...) VALUES (...)sqlparser-rs + sqe-coordinatorCoW or MoR. Multiple WHEN MATCHED branches with extra predicates allowed.
MERGE INTO t USING s ON cond WHEN MATCHED THEN DELETEsqlparser-rs + sqe-coordinatorDelete matched rows.
MERGE INTO t USING s ON cond WHEN MATCHED AND pred THEN UPDATE SET ... WHEN MATCHED THEN DELETEsqlparser-rs + sqe-coordinatorConditional MATCHED branches.
MERGE INTO orders t
USING staging.order_updates s
ON t.id = s.id
WHEN MATCHED AND s.status = 'cancelled' THEN DELETE
WHEN MATCHED THEN UPDATE SET status = s.status, updated_at = s.updated_at
WHEN NOT MATCHED THEN INSERT (id, status, created_at) VALUES (s.id, s.status, s.created_at);

Copy-on-Write vs Merge-on-Read

Three table properties control write semantics:

PropertyDefaultEffect
write.delete.mode'copy-on-write'Switch to 'merge-on-read' to write position / equality delete files instead of rewriting whole data files.
write.update.mode'copy-on-write'Same options. MoR UPDATE writes both deletes and inserts.
write.merge.mode'copy-on-write'Same options.

Set per-table via ALTER TABLE:

ALTER TABLE orders SET TBLPROPERTIES (
    'write.delete.mode' = 'merge-on-read',
    'write.update.mode' = 'merge-on-read',
    'write.merge.mode'  = 'merge-on-read'
);

When to choose:

  • CoW: small tables, infrequent writes, predictable read latency. Default.
  • MoR: large tables, frequent small deletes, willing to trade read amplification for write speed. Compact periodically with system.rewrite_data_files.

COPY TO

FormOriginNotesTrinoSnowflakeSpark SQLDuckDB
COPY (SELECT ...) TO 'path' (FORMAT csv|json|parquet)datafusion-builtinWrite query result to a file.-yes-yes
COPY t TO 'path' (FORMAT parquet)datafusion-builtinWrite whole table.-yes-yes
COPY (SELECT ...) TO 'path' (FORMAT parquet, PARTITION_BY 'col1, col2')datafusion-builtinHive-style partitioned output.-yes-yes
COPY (SELECT * FROM events WHERE occurred_at >= DATE '2026-05-01')
    TO '/tmp/may_events.parquet'
    (FORMAT parquet);

COPY events TO 's3://export/events.csv'
    (FORMAT csv, COMPRESSION gzip, HEADER true);

Branch routing (SQE-specific)

Iceberg branches let you isolate writes from the production snapshot. SQE exposes branch routing as a session variable.

StatementNotes
SET WRITE_BRANCH = 'name'Subsequent INSERT / UPDATE / DELETE / MERGE writes go to the named branch.
SET WRITE_BRANCH = DEFAULTReset to the default (main) branch.
SET WRITE_BRANCH = NULLSame as DEFAULT.
ALTER TABLE events CREATE BRANCH staging;
SET WRITE_BRANCH = 'staging';

INSERT INTO events SELECT * FROM new_data;
-- ^ writes to the 'staging' branch only

SELECT count(*) FROM events FOR VERSION AS OF 'staging';
-- ^ reads from staging branch

SET WRITE_BRANCH = DEFAULT;

Session control

StatementOriginNotes
USE catalog.schemasqe-sql/classifier.rsSwitch active catalog and schema. Subsequent unqualified table references use this scope.
USE schemasqe-sql/classifier.rsSwitch schema only.
SET <variable> = <value>sqlparser-rs (DataFusion)DataFusion session config. See EXPLAIN ANALYZE documentation for valid keys.
BEGIN / COMMIT / ROLLBACKsqe-sql/classifier.rsNo-op stubs for JDBC compatibility. SQE does not implement multi-statement transactions; each commit is single-statement.

Comparison summary

OperationSQETrino + IcebergSpark + IcebergDuckDB
SELECT time travelyes (FOR VERSION AS OF, branch / tag / id)yesyespartial
Incremental SELECTFOR INCREMENTAL BETWEEN (SQE-specific)partial via Iceberg incremental APIsyes-
INSERT INTOyesyesyesyes
UPDATE (CoW + MoR)yesyesyesyes
DELETE (CoW + MoR)yesyesyesyes
MERGE INTO (CoW + MoR)yesyesyes-
TRUNCATE TABLEyes (rewrites to DELETE)yesyesyes
COPY TOyes--yes
Branch-routed writesSET WRITE_BRANCH (SQE)partialyes-
Multi-statement transactionsno (no-op stubs)nonoyes