Module checkpoint_batcher

Checkpoint batching for S3 cost optimization.

Accumulates state blobs from multiple partitions and operators into a single compressed object before flushing it to object storage. Targets object sizes of 4-32 MB to minimize PUT request costs ($0.005 per 1,000 PUTs on S3).
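To make the cost argument concrete, here is a rough sketch of the PUT arithmetic. The function name `put_cost_usd` is hypothetical and not part of this module. It assumes the $0.005 per 1,000 PUTs price quoted above and ignores storage, GET, and transfer charges.

```rust
/// Price quoted above: $0.005 per 1,000 PUT requests.
const USD_PER_1000_PUTS: f64 = 0.005;

/// Estimated PUT cost for uploading `total_bytes` as objects of
/// `object_bytes` each (hypothetical helper, for illustration only).
fn put_cost_usd(total_bytes: u64, object_bytes: u64) -> f64 {
    assert!(object_bytes > 0, "object size must be non-zero");
    // Ceiling division: a partially filled last object still costs one PUT.
    let puts = (total_bytes + object_bytes - 1) / object_bytes;
    puts as f64 / 1000.0 * USD_PER_1000_PUTS
}
```

For 1 TiB of checkpoint state, writing 64 KiB blobs individually takes 16,777,216 PUTs (about $83.89), while 16 MiB batches take 65,536 PUTs (about $0.33).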

§Batch Format

[magic: 4 bytes "LCB1"]
[version: u32 LE = 1]
[entry_count: u32 LE]
[uncompressed_size: u32 LE]
[crc32c: u32 LE]           ← over compressed_body
[compressed_body: LZ4 block]
  → decoded body is a sequence of frames:
    [key_len: u32 LE][key: UTF-8][data_len: u32 LE][data: bytes]
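The layout above can be illustrated with a minimal, std-only sketch that parses the 20-byte header and walks the frames of an already-decompressed body. This is not the module's `decode_batch`: the names `BatchHeader`, `parse_header`, `encode_frames`, and `decode_frames` are hypothetical. The LZ4 decompression and CRC32C check are deliberately left out because they need external crates.

```rust
/// Header fields in the order listed in the batch format (hypothetical type).
#[derive(Debug, PartialEq)]
struct BatchHeader {
    version: u32,
    entry_count: u32,
    uncompressed_size: u32,
    crc32c: u32,
}

const MAGIC: &[u8; 4] = b"LCB1";
const HEADER_LEN: usize = 20; // 4-byte magic + four u32 fields

/// Read a little-endian u32 at `at`, or None if the buffer is too short.
fn read_u32_le(buf: &[u8], at: usize) -> Option<u32> {
    let bytes = buf.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]))
}

/// Split a payload into its header and the still-compressed body.
/// Does NOT verify the CRC32C or decompress anything.
fn parse_header(payload: &[u8]) -> Option<(BatchHeader, &[u8])> {
    if payload.get(..4)? != &MAGIC[..] {
        return None;
    }
    let header = BatchHeader {
        version: read_u32_le(payload, 4)?,
        entry_count: read_u32_le(payload, 8)?,
        uncompressed_size: read_u32_le(payload, 12)?,
        crc32c: read_u32_le(payload, 16)?,
    };
    Some((header, payload.get(HEADER_LEN..)?))
}

/// Serialize entries as [key_len][key][data_len][data] frames.
/// Assumes every key and blob is shorter than 4 GiB.
fn encode_frames(entries: &[(&str, &[u8])]) -> Vec<u8> {
    let mut body = Vec::new();
    for (key, data) in entries {
        body.extend_from_slice(&(key.len() as u32).to_le_bytes());
        body.extend_from_slice(key.as_bytes());
        body.extend_from_slice(&(data.len() as u32).to_le_bytes());
        body.extend_from_slice(data);
    }
    body
}

/// Walk `entry_count` frames of a decompressed body. Returns None on
/// truncation, invalid UTF-8 in a key, or trailing bytes after the last frame.
fn decode_frames(body: &[u8], entry_count: u32) -> Option<Vec<(String, Vec<u8>)>> {
    let mut out = Vec::new();
    let mut pos = 0usize;
    for _ in 0..entry_count {
        let key_len = read_u32_le(body, pos)? as usize;
        pos += 4;
        let key = body.get(pos..pos.checked_add(key_len)?)?;
        pos += key_len;
        let data_len = read_u32_le(body, pos)? as usize;
        pos += 4;
        let data = body.get(pos..pos.checked_add(data_len)?)?;
        pos += data_len;
        out.push((String::from_utf8(key.to_vec()).ok()?, data.to_vec()));
    }
    if pos == body.len() { Some(out) } else { None }
}
```

A full decoder would first verify `crc32c` against the compressed body, then LZ4-decompress it into a buffer of `uncompressed_size` bytes, and only then walk the frames as shown.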

Structs§

BatchMetrics
Metrics for checkpoint batching operations.
BatchMetricsSnapshot
Immutable snapshot of BatchMetrics.
CheckpointBatcher
Accumulates state blobs and flushes them as a single compressed object.

Functions§

decode_batch
Decode a batch payload into (key, data) pairs.