Decimal arithmetic dtypes and widening rescale cast#8343
Conversation
Binary::return_dtype derives Hive-style result types for decimal + - * / (mirroring arrow-arith's execution-time rules, with precision saturation at the physical width), instead of erroring at plan time. The decimal cast kernel gains widening rescale so expression coercion can align decimal operand types. Needed by vortex-engine for TPC-H Q1/Q6 decimal expressions. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> Signed-off-by: Nicholas Gates <nick@nickgates.com>
Merging this PR will degrade performance by 11.63%
|
| Mode | Benchmark | BASE |
HEAD |
Efficiency | |
|---|---|---|---|---|---|
| ❌ | Simulation | chunked_bool_canonical_into[(1000, 10)] |
20.7 µs | 35.4 µs | -41.66% |
| ❌ | Simulation | chunked_varbinview_canonical_into[(1000, 10)] |
161.9 µs | 198 µs | -18.24% |
| ❌ | Simulation | chunked_varbinview_into_canonical[(1000, 10)] |
177.1 µs | 213.2 µs | -16.89% |
| ❌ | Simulation | decompress_rd[f64, (100000, 0.01)] |
845.9 µs | 980.8 µs | -13.75% |
| ❌ | Simulation | decompress_rd[f64, (100000, 0.1)] |
845.9 µs | 980.7 µs | -13.75% |
| ⚡ | Simulation | decompress_rd[f64, (100000, 0.0)] |
1,024.6 µs | 845 µs | +21.25% |
| ⚡ | Simulation | decompress_rd[f32, (100000, 0.0)] |
586.8 µs | 498.6 µs | +17.68% |
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing ngates/decimal-arithmetic-dtypes (9257307) with develop (31167ca)
Footnotes
-
10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩
joseph-isaacs
left a comment
There was a problem hiding this comment.
This looks very slow?
Signed-off-by: "Nicholas Gates" <nick@nickgates.com>
|
|
||
| let from_values_type = array.values_type(); | ||
| if from_values_type == DecimalType::I256 { | ||
| vortex_bail!("rescaling i256 decimals is not supported"); | ||
| } | ||
|
|
||
| let to_values_type = DecimalType::smallest_decimal_value_type(&to); | ||
| if to_values_type == DecimalType::I256 { | ||
| vortex_bail!("rescaling into i256 decimals is not supported"); | ||
| } |
There was a problem hiding this comment.
Why? given appropriate precision/scale it should be possible.
|
|
||
| // Downcasting precision is not yet supported | ||
| if to_decimal_dtype.precision() < from_decimal_dtype.precision() { | ||
| // The target must retain at least the source's integer digits. |
There was a problem hiding this comment.
Its just a TODO right? Worth opening a ticket or something
| /// at the physical width's maximum: vortex lowers precisions | ||
| /// `<= i128::MAX_PRECISION` to Arrow `Decimal128` and wider decimals to | ||
| /// `Decimal256`. | ||
| fn decimal_arithmetic_dtype( |
There was a problem hiding this comment.
Why not use the narrower types?
|
I think this seems fine, but its worth creating issues for all the remaining gaps. The only thing we have right now is this almost year old issue. |
|
|
||
| /// Output decimal type of an arithmetic `operator` over two decimal operands. | ||
| /// | ||
| /// Mirrors the Hive-style rules `arrow-arith` applies at execution time |
There was a problem hiding this comment.
If we have widening system, I think its worth spelling it out in the docs or including a permalink or something to a reference
|
I think its worth expanding the PR description to what's actually here and what isn't so we can map out the follow up work. |
Binary::return_dtype derives Hive-style result types for decimal + - * / (mirroring arrow-arith's execution-time rules, with precision saturation at the physical width), instead of erroring at plan time. The decimal cast kernel gains widening rescale so expression coercion can align decimal operand types.