Skip to content

Support out of order grouped aggregate accumulation#8379

Open
gatesn wants to merge 7 commits into
developfrom
ngates/grouped-aggregate
Open

Support out of order grouped aggregate accumulation#8379
gatesn wants to merge 7 commits into
developfrom
ngates/grouped-aggregate

Conversation

@gatesn

@gatesn gatesn commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Our original API for grouped aggregate functions only supported groups that we pre-sorted into complete lists. That means large groups would need to be globally sorted before we could start accumulating their internal state. This is not how most engines define their aggregate APIs.

This PR allows us to support grouped accumulation in the more traditional way by using group IDs.

Signed-off-by: "Nicholas Gates" <nick@nickgates.com>
@gatesn gatesn requested a review from onursatici June 11, 2026 21:41
@gatesn gatesn added the changelog/break A breaking API change label Jun 11, 2026
@gatesn gatesn changed the title Support dense grouped aggregate accumulation Support out of order grouped aggregate accumulation Jun 11, 2026
Signed-off-by: "Nicholas Gates" <nick@nickgates.com>
@gatesn gatesn marked this pull request as ready for review June 11, 2026 21:43
@gatesn gatesn requested a review from a team June 11, 2026 21:43
@gatesn gatesn enabled auto-merge (squash) June 11, 2026 21:43
@codspeed-hq

codspeed-hq Bot commented Jun 11, 2026

Copy link
Copy Markdown

Merging this PR will degrade performance by 18.04%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 6 improved benchmarks
❌ 93 regressed benchmarks
✅ 1438 untouched benchmarks
⏩ 10 skipped benchmarks1

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

Mode Benchmark BASE HEAD Efficiency
Simulation sum_i32_nullable_all_valid 73.1 µs 105 µs -30.39%
Simulation compare[48] 213 µs 300.6 µs -29.15%
Simulation compare[50] 227.7 µs 319.2 µs -28.66%
Simulation compare[49] 228.2 µs 317.8 µs -28.21%
Simulation compare[44] 207.5 µs 287.9 µs -27.94%
Simulation compare[46] 218.5 µs 302.6 µs -27.78%
Simulation compare[47] 223.5 µs 309.3 µs -27.74%
Simulation compare[40] 190.7 µs 263.7 µs -27.71%
Simulation compare[44] 212.2 µs 292.4 µs -27.43%
Simulation compare[43] 209.2 µs 287.8 µs -27.29%
Simulation compare[45] 218.9 µs 301.1 µs -27.29%
Simulation compare[42] 204.6 µs 281.3 µs -27.28%
Simulation compare[40] 195.6 µs 268.4 µs -27.11%
Simulation compare[41] 204.5 µs 279.5 µs -26.82%
Simulation compare[43] 214.2 µs 292.6 µs -26.8%
Simulation compare[42] 209.4 µs 286 µs -26.77%
Simulation compare[41] 209.3 µs 284.1 µs -26.31%
Simulation compare[39] 199.9 µs 271 µs -26.25%
Simulation compare[38] 195.5 µs 264.8 µs -26.16%
Simulation sum_f64_all_valid 38.1 µs 51.4 µs -25.91%
... ... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing ngates/grouped-aggregate (2ef64b2) with develop (d0013ff)

Open in CodSpeed

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

gatesn added 2 commits June 11, 2026 18:04
Signed-off-by: "Nicholas Gates" <nick@nickgates.com>
I, Nicholas Gates <nick@nickgates.com>, hereby add my Signed-off-by to this commit: 6bdcd32

I, Nicholas Gates <nick@nickgates.com>, hereby add my Signed-off-by to this commit: 9bd157f

I, Nicholas Gates <nick@nickgates.com>, hereby add my Signed-off-by to this commit: 50701b2

Signed-off-by: Nicholas Gates <nick@nickgates.com>
@gatesn gatesn requested a review from joseph-isaacs June 12, 2026 02:49
@joseph-isaacs

Copy link
Copy Markdown
Contributor

We wait that vortex was a linear scan engine. This breaks it correct?

@onursatici onursatici left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer this API, not only that it supports out of order groups but also it can keep the memory footprint lower by not having us to materialise each group before calling aggregate.

I think you are conflicting with #8314
Sum and count kernels you added make sense to me but we went with having a encoding agnostic kernel support on the registry on the merged PR instead of having the kernel in the aggregate function's vtable

@gatesn

gatesn commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

We wait that vortex was a linear scan engine. This breaks it correct?

No, the idea is that I can take an array, compute the group indices (i.e. using a scalar function), then update the aggregate state per group. This still doesn't require a shuffle / sort. But it does mean aggregate state may grow large. So in the future this is where we would want the ability to spill partials / reconstruct later.

gatesn added 2 commits June 12, 2026 14:23
Signed-off-by: Nicholas Gates <nick@nickgates.com>
Signed-off-by: Nicholas Gates <nick@nickgates.com>
num_groups: usize,
}

fn dense_grouped_input(values: ArrayRef, group_sizes: &[usize]) -> DenseGroupedInput {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - this is basically the constructor for DenseGroupedInput

Signed-off-by: Nicholas Gates <nick@nickgates.com>
fn grouped_count(
elements: &ArrayRef,
group_ranges: &GroupRanges,
pub(super) fn try_accumulate_grouped(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit - this seems short enough to inline into the vtable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changelog/break A breaking API change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants