ORC-2177: Fix array conversion with empty first batch #2638

Open
cxzl25 wants to merge 1 commit into apache:main from cxzl25:ORC-2177

cxzl25 (Contributor) commented Jun 10, 2026

What changes were proposed in this pull request?

Add a batchSize > 0 guard in ConvertTreeReader.convertVector() before accessing index 0.

Why are the changes needed?

When reading an ORC file with schema evolution converting array<int> to array<string>, an ArrayIndexOutOfBoundsException is thrown if the first batch processed by the element reader has childCount=0 (i.e., all rows in the batch contain empty arrays).

java.lang.ArrayIndexOutOfBoundsException: Index 0 out of bounds for length 0
	at org.apache.orc.impl.ConvertTreeReaderFactory$StringGroupFromAnyIntegerTreeReader.setConvertVectorElement(ConvertTreeReaderFactory.java:1094)
	at org.apache.orc.impl.ConvertTreeReaderFactory$ConvertTreeReader.convertVector(ConvertTreeReaderFactory.java:305)
	at org.apache.orc.impl.ConvertTreeReaderFactory$StringGroupFromAnyIntegerTreeReader.nextVector(ConvertTreeReaderFactory.java:1115)
	at org.apache.orc.impl.TreeReaderFactory$ListTreeReader.nextVector(TreeReaderFactory.java:2892)
	at org.apache.orc.impl.reader.tree.StructBatchReader.readBatchColumn(StructBatchReader.java:66)
	at org.apache.orc.impl.reader.tree.StructBatchReader.nextBatchForLevel(StructBatchReader.java:101)
	at org.apache.orc.impl.reader.tree.StructBatchReader.nextBatch(StructBatchReader.java:78)
	at org.apache.orc.impl.RecordReaderImpl.nextBatch(RecordReaderImpl.java:1444)
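
The failure mode and the proposed guard can be illustrated with a minimal, self-contained sketch. This is not ORC's code: `EmptyBatchGuardSketch`, `SketchLongVector`, `convertUnguarded`, and `convertGuarded` are hypothetical names, and the sketch assumes the source vector was allocated with the batch size (so its backing array has length 0) and is flagged as repeating, which makes the conversion read slot 0.

```java
/** Illustrative only: a simplified stand-in for the conversion step, not ORC's implementation. */
final class EmptyBatchGuardSketch {

  /** Hypothetical stand-in for a long column vector. */
  static final class SketchLongVector {
    final long[] vector;
    boolean isRepeating;

    SketchLongVector(int size) {
      this.vector = new long[size];
    }
  }

  /** Without a guard, the repeating branch reads slot 0 even when the batch is empty. */
  static String[] convertUnguarded(SketchLongVector from, int batchSize) {
    String[] result = new String[batchSize];
    if (from.isRepeating) {
      // Throws ArrayIndexOutOfBoundsException when the backing array has length 0.
      result[0] = Long.toString(from.vector[0]);
    } else {
      for (int i = 0; i < batchSize; i++) {
        result[i] = Long.toString(from.vector[i]);
      }
    }
    return result;
  }

  /** With a batchSize > 0 guard, an empty batch skips the access to slot 0. */
  static String[] convertGuarded(SketchLongVector from, int batchSize) {
    String[] result = new String[batchSize];
    if (from.isRepeating) {
      if (batchSize > 0) {
        result[0] = Long.toString(from.vector[0]);
      }
    } else {
      for (int i = 0; i < batchSize; i++) {
        result[i] = Long.toString(from.vector[i]);
      }
    }
    return result;
  }
}
```

In this sketch, an empty repeating vector passed to `convertGuarded` yields an empty result, while `convertUnguarded` fails in the same way as the trace above. Non-empty batches behave identically in both versions.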

How was this patch tested?

Added testIntArrayToStringArrayFirstBatchAllEmpty to TestConvertTreeReaderFactory.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
