Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

SQL Reference

A function-by-function and statement-by-statement reference for everything SQE can parse and execute. Every entry lists where the implementation lives so you can jump from “what does this do” to “where do I read the code”.

The reference focuses on what ships in the running engine, not the SQL standard in the abstract. If a function is not listed here, it is not registered in our SessionContext.

How to read these tables

Every page uses the same column shape. The first three columns describe the function in SQE; the four right columns describe how the same idea looks in other engines.

ColumnMeaning
FunctionThe name SQE accepts in SQL. Case-insensitive on the surface, lower-case canonical name.
OriginWhere the implementation comes from. See origins below.
NotesOne-line summary, return type, gotchas, link to source line.
TrinoThe Trino-equivalent function name, or - if Trino has none.
SnowflakeThe Snowflake-equivalent function name, or - if Snowflake has none.
Spark SQLThe Spark-equivalent function name, or - if Spark has none.
DuckDBThe DuckDB-equivalent function name, or - if DuckDB has none.

Origins

Every function in SQE has exactly one origin. Eight values appear:

Origin tagWhat it meansWhere it lives
datafusion-builtinShipped automatically with SessionContext::new(). No SQE registration.Upstream datafusion-functions-* crates.
datafusion-functions-jsonDataFusion JSON helper crate, registered explicitly.datafusion_functions_json::register_all() in session_context.rs:361.
sqe-trino-functionsOur Trino-compatibility crate. Adds Trino names for things DataFusion calls differently.sqe-trino-functions/src/trino_functions.rs and trino_functions_ext.rs.
sqe-trino-functions (ext)Extended Trino aliases. Same crate, separate registration call.register_extended_trino_functions() in the same crate.
sqe-policySecurity crate. Currently exposes one UDF (sha256) used by column masks.sqe-policy/src/sha256_udf.rs.
sqe-catalogIceberg catalog and TVF crate. Provides read_* and table_* table functions.sqe-catalog/src/.
sqe-sqlParser extension. Statements pre-parsed before DataFusion sees them.sqe-sql/src/.
sqe-coordinatorStatement router. Handles statements that need catalog calls or auth before execution.crates/sqe-coordinator/src/query_handler.rs, catalog_ops.rs.

The two registration entry points are crates/sqe-coordinator/src/session_context.rs (cluster mode) and crates/sqe-cli/src/embedded.rs (single-binary mode). Both register the same UDFs / UDTFs in the same order, so a function works the same way in both personas.

Pages

Scalar functions

  • Conditional and null-handling: if, iff, case, coalesce, nullif, greatest, least, nvl, nvl2, typeof, try.
  • String: concat, substring, trim, lower, upper, regex, normalisation, split, format, hash digests.
  • Math: trig, rounding, logs, exponents, sign, modular, base conversion.
  • Date and time: timestamp construction, extraction, formatting, parsing, arithmetic, time-zone handling.
  • Array, map, struct: the 40+ functions from datafusion-functions-nested plus Trino aggregate constructors (map_agg, histogram).
  • JSON: two layered surfaces: Trino-named (json_extract, json_parse) and the datafusion-functions-json json_get_* family.
  • Encoding, hashing, URL: base64, hex, md5, sha224..512, url_extract_*, url_encode, url_decode.

Aggregate and window

  • Aggregate functions: count, sum, avg, statistical, regression, array_agg, string_agg / listagg, histogram, map_agg, approximation.
  • Window functions: row_number, rank, lag, lead, first_value, frame syntax (ROWS BETWEEN, RANGE BETWEEN, GROUPS BETWEEN).

Table-valued functions

  • Table-valued functions: file format (read_parquet, read_csv, read_json, read_delta), Iceberg metadata (table_snapshots, table_history, table_files, table_partitions, table_manifests, table_refs), generators (generate_series, unnest).

Statements

  • DDL: CREATE, ALTER, DROP for tables, schemas, views; partition evolution; branches and tags; column defaults.
  • DML: SELECT, INSERT, UPDATE, DELETE, MERGE, COPY TO, TRUNCATE, time travel (FOR VERSION AS OF, FOR SYSTEM_TIME AS OF, FOR INCREMENTAL BETWEEN), SET WRITE_BRANCH.
  • CALL procedures: system.rewrite_data_files, expire_snapshots, remove_orphan_files, rewrite_manifests, suggest_bloom_filter_columns.
  • GRANT and REVOKE: SQE-specific security extensions. GRANT MASKED WITH, GRANT ROWS WHERE, SHOW GRANTS, SHOW EFFECTIVE GRANTS, CHECK ACCESS.
  • SHOW and EXPLAIN: metadata queries and plan inspection. SHOW CATALOGS, SHOW STATS, EXPLAIN FULL.
  • Operators: arithmetic, string, comparison, null tests, casting (::), set membership.

Embedded CLI

  • Dot-commands: .help, .tables, .schema, .describe, .summarize, .timer, .read, .format. Embedded CLI only.

What is intentionally not in SQE

Some functions appear in the dialect comparison columns as missing. The reasoning:

  • PIVOT, UNPIVOT, QUALIFY, ASOF JOIN, FROM-first syntax: DataFusion’s parser does not accept them. Tracked upstream.
  • Lambda expressions, list comprehensions: DataFusion has no AST node for closures.
  • Oracle / Snowflake DECODE: name collides with DataFusion’s binary decode(input, encoding) helper. CASE WHEN covers the use case.
  • IIF (T-SQL): covered by if (Trino) and iff (Snowflake), both registered.
  • postgres_table_scanner, mysql_table_scanner, sqlite_scanner: out of scope. SQE is Iceberg-first; if you need a non-Iceberg engine, query it where it lives.
  • spatial, vss, fts, excel: niche. Use a tool built for the job (PostGIS, a vector DB, an FTS engine).

The full DuckDB-comparison audit lives at duckdb-comparision.md. The Trino-comparison audit lives at trino-compatibility.md.