DataFusion’s datafusion-functions-nested crate ships ~40 array and map helpers. SQE adds Trino-named aggregate constructors (map_agg, histogram, multimap_agg).
Function Origin Notes Trino Snowflake Spark SQL DuckDB
[1, 2, 3] (literal)datafusion-builtinElement type from common supertype. yes yes yes yes
make_array(a, b, ...)datafusion-builtinFunction form of literal. - yes yes yes
array(...)datafusion-builtinAlias for make_array. yes yes yes yes
range(start, stop)datafusion-builtinHalf-open integer array. - - - yes
range(start, stop, step)datafusion-builtinWith step. - - - yes
array_repeat(elem, n)datafusion-builtinArray of n copies. yes - - yes
Function Origin Notes Trino Snowflake Spark SQL DuckDB
array_length(a) / cardinality(a)datafusion-builtinNumber of elements. yes yes yes yes
array_dims(a)datafusion-builtinArray of per-dimension sizes (for nested arrays). - - - yes
array_ndims(a)datafusion-builtinNesting depth. - - - yes
array_position(a, elem)datafusion-builtin1-based offset of first match; 0 if missing. yes yes yes yes
array_positions(a, elem)datafusion-builtinArray of all matching offsets. - - - yes
array_contains(a, elem) / array_has(a, elem)datafusion-builtinBoolean membership. yes yes yes yes
array_has_all(a, sub)datafusion-builtinAll of sub are in a. - - - yes
array_has_any(a, sub)datafusion-builtinAny of sub is in a. yes - - yes
Function Origin Notes Trino Snowflake Spark SQL DuckDB
array_append(a, elem)datafusion-builtinAdd to end. yes yes yes yes
array_prepend(elem, a)datafusion-builtinAdd to start. yes - yes yes
array_concat(a1, a2, ...)datafusion-builtinVariadic concat. yes yes yes yes
array_remove(a, elem)datafusion-builtinRemove first occurrence. - - - yes
array_remove_all(a, elem)datafusion-builtinRemove all occurrences. yes - - yes
array_replace(a, from, to)datafusion-builtinReplace first match. yes - yes yes
array_replace_all(a, from, to)datafusion-builtinReplace all matches. - - - yes
array_reverse(a)datafusion-builtinReverse order. yes yes yes yes
array_sort(a)datafusion-builtinAscending sort. NULLs last. yes yes yes yes
array_distinct(a)datafusion-builtinDeduplicate. Preserves first occurrence. yes yes yes yes
array_slice(a, start, end)datafusion-builtin1-based, inclusive. Negative indexes count from end. yes yes yes yes
array_pop_front(a) / array_pop_back(a)datafusion-builtinRemove first / last. yes - - yes
array_resize(a, n [, fill])datafusion-builtinTruncate or pad to length n. - - - yes
array_flatten(a) / flatten(a)datafusion-builtinOne level of flattening. yes yes yes yes
Function Origin Notes Trino Snowflake Spark SQL DuckDB
array_intersect(a, b)datafusion-builtinCommon elements (set-style). yes yes yes yes
array_union(a, b)datafusion-builtinDistinct combination. yes yes yes yes
array_except(a, b)datafusion-builtinIn a but not in b. yes - yes yes
Function Origin Notes Trino Snowflake Spark SQL DuckDB
array_min(a)datafusion-builtinMinimum element. yes yes yes yes
array_max(a)datafusion-builtinMaximum. yes yes yes yes
array_sum(a)datafusion-builtinSum of numeric elements. yes - - yes
array_mean(a)datafusion-builtinAverage. - - - -
array_any_value(a)datafusion-builtinFirst non-NULL element. - - - -
Function Origin Notes Trino Snowflake Spark SQL DuckDB
unnest(a)datafusion-builtinOne row per element. Used in FROM. yes yes yes (explode) yes
unnest(a) WITH ORDINALITYdatafusion-builtinAdds 1-based offset column. yes - - -
-- One row per (order, item) pair
SELECT order_id, item
FROM orders, UNNEST(items) AS t(item);
-- Numbered
SELECT order_id, item, idx
FROM orders, UNNEST(items) WITH ORDINALITY AS t(item, idx);
Function Origin Notes Trino Snowflake Spark SQL DuckDB
map(keys_array, values_array)datafusion-builtinBuild a map from two parallel arrays. yes - yes (map_from_arrays) yes
map_keys(m)datafusion-builtinArray of keys. yes yes yes yes
map_values(m)datafusion-builtinArray of values. yes yes yes yes
map_extract(m, key)datafusion-builtinLookup. NULL if missing. Also accessible via m[key]. yes (element_at) yes (get) yes (element_at) yes (element_at)
cardinality(m)datafusion-builtinNumber of keys. yes yes yes yes
m['key']datafusion-builtinSubscript syntax for map lookup. yes yes yes yes
See Aggregate functions for array_agg, map_agg, histogram, multimap_agg, map_union. The names differ slightly across engines:
SQE Trino Snowflake Spark SQL DuckDB
array_agg(x)array_aggarray_aggcollect_listarray_agg / list
map_agg(k, v)map_aggobject_aggmap_from_arraysmap
histogram(x)histogram- - histogram
multimap_agg(k, v)multimap_agg- - -
Construct Origin Notes Trino Snowflake Spark SQL DuckDB
struct(a, b, ...)datafusion-builtinAnonymous record. - yes (object_construct) yes yes
named_struct('a', x, 'b', y)datafusion-builtinNamed-field record. yes (row(...)) yes (object_construct) yes yes
s.fielddatafusion-builtinField access. yes yes yes yes
(a, b, ...) (row literal)datafusion-builtinAnonymous tuple. yes - yes yes
SELECT named_struct('host', host, 'port', port) AS endpoint
FROM servers;
SELECT endpoint.host, endpoint.port FROM ...;
-- Find products with both 'sale' and 'new' tags
SELECT * FROM products
WHERE array_has_all(tags, ARRAY['sale', 'new']);
-- Find products with any of the listed tags
SELECT * FROM products
WHERE array_has_any(tags, ARRAY['sale', 'clearance']);
SELECT k, v
FROM events, UNNEST(map_keys(histogram(event_type)), map_values(histogram(event_type))) AS t(k, v)
ORDER BY v DESC
LIMIT 10;
SELECT
user_id,
map_agg(setting_key, setting_value) AS preferences
FROM user_settings
GROUP BY user_id;
map_agg errors on duplicate keys. For multimap-style behaviour use multimap_agg.
SELECT order_id, tag
FROM orders, UNNEST(tags) AS t(tag)
WHERE order_id > 100 AND tag LIKE 'priority_%';
DataFusion’s parser does not support lambda syntax (x -> x + 1). Trino, Spark, DuckDB do. The audit rows in features.md note this. Workarounds:
Pre-compute via a CTE plus unnest.
Use map_filter / transform from datafusion-functions-nested once parser support lands upstream.
zip(a, b) (parallel-iterate two arrays). Use unnest against an indexed pair instead.
reduce(a, init, lambda, finish) . Aggregate within a CTE instead.
Snowflake flatten table function (with PATH and OUTER options). Use UNNEST directly.
These are tracked but blocked on lambda support in DataFusion.