Two-phase cross-partition aggregation.
§Architecture
Partition 0: COUNT(*) GROUP BY symbol → partial(AAPL: 500)
Partition 1: COUNT(*) GROUP BY symbol → partial(AAPL: 300) → Ring 2 merge → final(AAPL: 1000)
Partition 2: COUNT(*) GROUP BY symbol → partial(AAPL: 200)

Phase 1 (Partial): Each partition computes local aggregates and produces
PartialAggregate entries, one per group key per partition.
Phase 2 (Merge): A coordinator collects partials from all partitions,
merges them via MergeAggregator, and produces final results.
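
The two phases can be illustrated for COUNT with a minimal sketch. It uses only the standard library and does not reflect the module's actual types: the functions partial_counts and merge_counts are hypothetical names, and a plain HashMap stands in for PartialAggregate and MergeAggregator.

```rust
use std::collections::HashMap;

/// Phase 1 (hypothetical helper): count rows per group key inside one partition.
fn partial_counts(rows: &[&str]) -> HashMap<String, i64> {
    let mut counts = HashMap::new();
    for symbol in rows {
        *counts.entry(symbol.to_string()).or_insert(0) += 1;
    }
    counts
}

/// Phase 2 (hypothetical helper): merge per-partition partials by summing per key.
fn merge_counts(partials: &[HashMap<String, i64>]) -> HashMap<String, i64> {
    let mut merged = HashMap::new();
    for partial in partials {
        for (key, count) in partial {
            *merged.entry(key.clone()).or_insert(0) += count;
        }
    }
    merged
}

fn main() {
    // Each slice plays the role of the rows held by one partition.
    let p0 = partial_counts(&["AAPL", "AAPL", "MSFT"]);
    let p1 = partial_counts(&["AAPL", "MSFT", "MSFT"]);
    let p2 = partial_counts(&["AAPL"]);

    let merged = merge_counts(&[p0, p1, p2]);
    println!("AAPL = {}", merged["AAPL"]);
    println!("MSFT = {}", merged["MSFT"]);
}
```

Only the small per-key partials cross partition boundaries; the raw rows never leave the partition that owns them.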
§Supported Functions
| Function | Partial State | Merge Logic |
|---|---|---|
| COUNT | count: i64 | sum of counts |
| SUM | sum: f64 | sum of sums |
| AVG | sum: f64, count: i64 | total_sum / total_count |
| MIN | min: Option<f64> | min of mins |
| MAX | max: Option<f64> | max of maxes |
| APPROX_DISTINCT | HLL sketch bytes | HLL union |
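
The merge rules in the table can be sketched as an enum with one variant per function. This is an illustration only, not the module's PartialState: the Partial type, its merge and finalize methods, and the merge_opt helper are hypothetical, and APPROX_DISTINCT is left out to keep the example short.

```rust
/// Hypothetical partial state, one variant per (non-HLL) row of the table.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Partial {
    Count(i64),
    Sum(f64),
    Avg { sum: f64, count: i64 },
    Min(Option<f64>),
    Max(Option<f64>),
}

/// Combine two optional values, keeping whichever side is present.
fn merge_opt(a: Option<f64>, b: Option<f64>, pick: fn(f64, f64) -> f64) -> Option<f64> {
    match (a, b) {
        (Some(x), Some(y)) => Some(pick(x, y)),
        (x, None) => x,
        (None, y) => y,
    }
}

impl Partial {
    /// Merge two partials of the same kind; mismatched kinds yield None.
    fn merge(self, other: Partial) -> Option<Partial> {
        Some(match (self, other) {
            (Partial::Count(a), Partial::Count(b)) => Partial::Count(a + b),
            (Partial::Sum(a), Partial::Sum(b)) => Partial::Sum(a + b),
            (Partial::Avg { sum: s1, count: c1 }, Partial::Avg { sum: s2, count: c2 }) => {
                Partial::Avg { sum: s1 + s2, count: c1 + c2 }
            }
            (Partial::Min(a), Partial::Min(b)) => Partial::Min(merge_opt(a, b, f64::min)),
            (Partial::Max(a), Partial::Max(b)) => Partial::Max(merge_opt(a, b, f64::max)),
            _ => return None,
        })
    }

    /// Turn a fully merged partial into its final value; None for empty groups.
    fn finalize(&self) -> Option<f64> {
        match *self {
            Partial::Count(c) => Some(c as f64),
            Partial::Sum(s) => Some(s),
            Partial::Avg { sum, count } => {
                if count == 0 { None } else { Some(sum / count as f64) }
            }
            Partial::Min(m) | Partial::Max(m) => m,
        }
    }
}

fn main() {
    // Partition A saw 4 rows summing to 10.0, partition B saw 6 rows summing to 20.0.
    let a = Partial::Avg { sum: 10.0, count: 4 };
    let b = Partial::Avg { sum: 20.0, count: 6 };
    let merged = a.merge(b).expect("same kind");
    println!("{:?}", merged.finalize()); // prints Some(3.0)
}
```

AVG carries both sum and count because averaging the per-partition averages (2.5 and about 3.33 here) would weight partitions equally regardless of row count and give the wrong result.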
§Arrow IPC
Partial results can be shipped between nodes as Arrow IPC-encoded
RecordBatches via encode_batch_to_ipc / decode_batch_from_ipc.
Structs§
- HllSketch: Minimal HyperLogLog sketch for approximate distinct counting.
- MergeAggregator: Combines partial aggregates from multiple partitions into final results.
- PartialAggregate: A partial aggregate entry for one group from one partition.
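
The union step that makes a HyperLogLog sketch mergeable across partitions can be sketched as below. This is not the module's HllSketch: the Sketch type, the precision P = 8, the one-byte-per-register layout, and the use of the standard library's DefaultHasher are all assumptions made for illustration, and cardinality estimation is omitted.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const P: u32 = 8; // assumed number of index bits
const M: usize = 1 << P; // 256 registers

/// Hypothetical minimal sketch: one byte per register.
#[derive(Debug, Clone, PartialEq)]
struct Sketch {
    registers: Vec<u8>,
}

impl Sketch {
    fn new() -> Self {
        Sketch { registers: vec![0; M] }
    }

    /// Record one item: the top P hash bits choose a register, and the
    /// register keeps the largest rank (leading zeros + 1) seen so far.
    fn insert<T: Hash>(&mut self, item: &T) {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let hash = hasher.finish();
        let index = (hash >> (64 - P)) as usize;
        let rest = hash << P; // remaining bits, left-aligned
        let rank = (rest.leading_zeros().min(64 - P) + 1) as u8;
        if rank > self.registers[index] {
            self.registers[index] = rank;
        }
    }

    /// Union: register-wise maximum of the two sketches.
    fn union(&mut self, other: &Sketch) {
        for (mine, theirs) in self.registers.iter_mut().zip(&other.registers) {
            *mine = (*mine).max(*theirs);
        }
    }
}

fn main() {
    let (mut a, mut b, mut all) = (Sketch::new(), Sketch::new(), Sketch::new());
    for i in 0u32..500 {
        a.insert(&i);
        all.insert(&i);
    }
    for i in 300u32..900 {
        b.insert(&i);
        all.insert(&i);
    }
    a.union(&b);
    // The merged sketch is identical to one built from all items in a single pass.
    assert_eq!(a, all);
    println!("union matches single-pass sketch");
}
```

Because each register holds a running maximum, the union is insensitive to overlap between partitions and to the order in which partials arrive.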
Enums§
- PartialState: Intermediate partial state for a single aggregate function.
- TwoPhaseError: Errors from two-phase aggregation.
- TwoPhaseKind: Aggregate function kind that supports two-phase execution.
Functions§
- can_use_two_phase: Check if all functions in a query support two-phase execution.
- collect_partials: Collect and deserialize partials for a group key from a CrossPartitionAggregateStore.
- decode_batch_from_ipc: Decode a RecordBatch from Arrow IPC file format bytes.
- deserialize_partials: Deserialize partial aggregates from JSON bytes.
- encode_batch_to_ipc: Encode a RecordBatch to Arrow IPC file format bytes.
- publish_partials: Publish partial aggregates to a CrossPartitionAggregateStore.
- serialize_partials: Serialize partial aggregates to JSON bytes.
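
The kind of check that can_use_two_phase describes might look roughly like the sketch below. The real signature is not shown on this page, so everything here is hypothetical: the Kind enum, from_name, all_support_two_phase, and the case-insensitive name matching are assumptions. Only the set of accepted names comes from the Supported Functions table.

```rust
/// Hypothetical stand-in for an aggregate kind that supports two-phase execution.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    Count,
    Sum,
    Avg,
    Min,
    Max,
    ApproxDistinct,
}

impl Kind {
    /// Map a function name to a kind; names outside the table yield None.
    /// Case-insensitive matching is an assumption.
    fn from_name(name: &str) -> Option<Kind> {
        match name.to_ascii_uppercase().as_str() {
            "COUNT" => Some(Kind::Count),
            "SUM" => Some(Kind::Sum),
            "AVG" => Some(Kind::Avg),
            "MIN" => Some(Kind::Min),
            "MAX" => Some(Kind::Max),
            "APPROX_DISTINCT" => Some(Kind::ApproxDistinct),
            _ => None,
        }
    }
}

/// True only if every aggregate in the query maps to a supported kind.
/// Note that an empty list is vacuously accepted by this sketch.
fn all_support_two_phase(names: &[&str]) -> bool {
    names.iter().all(|name| Kind::from_name(name).is_some())
}

fn main() {
    println!("{}", all_support_two_phase(&["COUNT", "avg", "MAX"])); // prints true
    println!("{}", all_support_two_phase(&["COUNT", "MEDIAN"])); // prints false
}
```

A single unsupported function, such as MEDIAN in this example, forces the whole query off the two-phase path, since its result cannot be rebuilt from the partial states listed in the table.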