Skip to main content

Module watermark_sort

Module watermark_sort 

Source
Expand description

§Watermark-Bounded Sort Operator

Buffers events between watermark boundaries, emits sorted batch when watermark advances. Useful for producing ordered output (e.g., sorted Parquet files) from out-of-order streams.

§Memory Bounds

Only holds events in the (last_watermark, current_watermark) range, which is bounded by max_out_of_orderness. A max_buffer_size safety limit prevents unbounded growth.

§How It Works

  1. Events arrive out of order (within bounded disorder)
  2. Events are buffered until watermark advances
  3. On watermark advance, all events with timestamp <= new watermark are sorted and emitted as a batch
  4. Late events (timestamp <= last emitted watermark) are dropped

Structs§

WatermarkBoundedSortOperator
Watermark-bounded sort operator.