feat: add array_prepend support #4716
Merged
array_prepend is a RuntimeReplaceable that the analyzer rewrites to array_insert(arr, 1, elem) before Comet serde runs, so it already executes natively through the existing ArrayInsert path. Add a SQL file test to lock in coverage, mark array_prepend as supported in the expressions reference, and record the cross-version audit.
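For readers unfamiliar with the rewrite, here is a minimal plain-Python sketch of the equivalence described above. It is not Spark or Comet code: the functions `array_insert` and `array_prepend` below are illustrative stand-ins, `None` stands in for SQL NULL, and only in-range positive positions are modelled.

```python
from typing import Any, List, Optional


def array_insert(arr: Optional[List[Any]], pos: int, elem: Any) -> Optional[List[Any]]:
    """Illustrative model of a 1-based insert; not the real ArrayInsert."""
    if arr is None:
        return None  # NULL array in, NULL out
    if pos < 1 or pos > len(arr) + 1:
        # Zero, negative, and out-of-range positions are outside this sketch.
        raise ValueError("position not modelled in this sketch")
    return arr[: pos - 1] + [elem] + arr[pos - 1:]


def array_prepend(arr: Optional[List[Any]], elem: Any) -> Optional[List[Any]]:
    """Prepend modelled as an insert at position 1, mirroring the rewrite."""
    return array_insert(arr, 1, elem)


print(array_prepend([2, 3], 1))   # element lands at the front
print(array_prepend(None, 1))     # NULL array stays NULL
print(array_prepend([1], None))   # NULL element is still prepended
```

The three calls correspond to cases the new SQL file test is said to cover: a regular prepend, a NULL array, and a NULL element.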
peterxcli
reviewed
Jun 25, 2026
Member
I notice this SQL would fail on Comet with an error like "INVALID_INDEX_OF_ZERO". The root cause is that Comet's array_insert execution doesn't follow Spark's short-circuit evaluation.
```sql
statement
CREATE TABLE test_array_prepend(
  arr ARRAY<INT>,
  idx INT
) USING parquet

statement
INSERT INTO test_array_prepend VALUES
  (NULL, 0),
  (array(1), 1)

query
SELECT array_prepend(arr, element_at(array(9), idx))
FROM test_array_prepend
```
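To make the suspected failure mode concrete, here is a rough plain-Python model of the two evaluation orders. It is an assumption-laden sketch, not Spark or Comet code: `prepend_lazy` assumes the element expression is skipped when the array is NULL (the short-circuit behaviour described above), `prepend_eager` assumes every argument is evaluated for every row, and all names are hypothetical.

```python
from typing import Any, Callable, List, Optional


class InvalidIndexOfZero(Exception):
    """Stand-in for the INVALID_INDEX_OF_ZERO error."""


def element_at(arr: List[Any], idx: int) -> Any:
    # 1-based lookup; only positive indices are modelled here.
    if idx == 0:
        raise InvalidIndexOfZero("INVALID_INDEX_OF_ZERO")
    return arr[idx - 1]


def prepend_lazy(arr: Optional[List[Any]], elem: Callable[[], Any]) -> Optional[List[Any]]:
    # Assumed short-circuit: a NULL array returns NULL without evaluating elem.
    if arr is None:
        return None
    return [elem()] + arr


def prepend_eager(arr: Optional[List[Any]], elem: Callable[[], Any]) -> Optional[List[Any]]:
    # Assumed eager evaluation: elem is computed even when arr is NULL.
    value = elem()
    if arr is None:
        return None
    return [value] + arr


# Same two rows as the INSERT above: (NULL, 0) and (array(1), 1).
rows = [(None, 0), ([1], 1)]

lazy_results = [prepend_lazy(arr, lambda i=idx: element_at([9], i)) for arr, idx in rows]
print(lazy_results)

try:
    [prepend_eager(arr, lambda i=idx: element_at([9], i)) for arr, idx in rows]
except InvalidIndexOfZero as exc:
    print("eager evaluation failed:", exc)
```

In this model the lazy version never touches `element_at` for the `(NULL, 0)` row, while the eager version raises on it, which matches the symptom reported here.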
Which issue does this PR close?
No dedicated issue.
array_prepend was previously listed as planned (🔜) in the expression reference.
Rationale for this change
array_prepend is a RuntimeReplaceable expression in Spark. In every version that defines it (3.5.8, 4.0.1, 4.1.1) the analyzer rewrites it to ArrayInsert(arr, Literal(1), elem) before Comet's serde runs, and that replacement is identical across all three versions. Comet already implements ArrayInsert as Compatible, so array_prepend already executes natively end to end. No new Scala serde, protobuf, or Rust code is required. This PR locks that in with a regression test and corrects the documented support status.
What changes are included in this PR?
- Add spark/src/test/resources/sql-tests/expressions/array/array_prepend.sql. It is guarded with MinSparkVersion: 3.5 because array_prepend was added in Spark 3.5.0 and does not exist in 3.4. Coverage includes column/literal/mixed arguments, NULL array yielding NULL, NULL element prepended, empty array, the int/string/boolean/double (including NaN)/long/multibyte-UTF8 element types, and the three cases Spark's own DataFrameFunctionsSuite exercises: type coercion (array_prepend(array(1, 2), 1.23D) -> [1.23, 1.0, 2.0]), nested-array elements, and binary elements.
- Mark array_prepend as supported (✅) in docs/source/user-guide/latest/expressions.md.
- Record the cross-version audit in docs/source/contributor-guide/expression-audits/array_funcs.md.
The implement-comet-expression skill was used to scaffold this work, including the audit-comet-expression cross-version comparison against Spark 3.4.3, 3.5.8, 4.0.1, and 4.1.1.
How are these changes tested?
New SQL file test array_prepend.sql, run via ./mvnw test -Dsuites="org.apache.comet.CometSqlFileTestSuite array_prepend" -Dtest=none. The framework runs each query under both Spark and Comet and asserts the results match and that the expression runs natively.