Skip to main content

Module two_phase

Module two_phase 

Source
Expand description

Two-phase cross-partition aggregation. Two-phase cross-partition aggregation.

§Architecture

Partition 0: COUNT(*) GROUP BY symbol → partial(AAPL: 500)
Partition 1: COUNT(*) GROUP BY symbol → partial(AAPL: 300)  → Ring 2 merge → final(AAPL: 800)
Partition 2: COUNT(*) GROUP BY symbol → partial(AAPL: 200)

Phase 1 (Partial): Each partition computes local aggregates and produces PartialAggregate entries — one per group key per partition.

Phase 2 (Merge): A coordinator collects partials from all partitions, merges them via MergeAggregator, and produces final results.

§Supported Functions

FunctionPartial StateMerge Logic
COUNTcount: i64sum of counts
SUMsum: f64sum of sums
AVGsum: f64, count: i64total_sum / total_count
MINmin: Option<f64>min of mins
MAXmax: Option<f64>max of maxes
APPROX_DISTINCTHLL sketch bytesHLL union

§Arrow IPC

Partial results can be shipped between nodes as Arrow IPC-encoded RecordBatches via encode_batch_to_ipc / decode_batch_from_ipc.

Structs§

HllSketch
Minimal HyperLogLog sketch for approximate distinct counting.
MergeAggregator
Combines partial aggregates from multiple partitions into final results.
PartialAggregate
A partial aggregate entry for one group from one partition.

Enums§

PartialState
Intermediate partial state for a single aggregate function.
TwoPhaseError
Errors from two-phase aggregation.
TwoPhaseKind
Aggregate function kind that supports two-phase execution.

Functions§

can_use_two_phase
Check if all functions in a query support two-phase execution.
collect_partials
Collect and deserialize partials for a group key from a CrossPartitionAggregateStore.
decode_batch_from_ipc
Decode a RecordBatch from Arrow IPC file format bytes.
deserialize_partials
Deserialize partial aggregates from JSON bytes.
encode_batch_to_ipc
Encode a RecordBatch to Arrow IPC file format bytes.
publish_partials
Publish partial aggregates to a CrossPartitionAggregateStore.
serialize_partials
Serialize partial aggregates to JSON bytes.