Skip to main content

CheckpointStore

Trait CheckpointStore 

Source
pub trait CheckpointStore: Send + Sync {
Show 13 methods // Required methods fn save( &self, manifest: &CheckpointManifest, ) -> Result<(), CheckpointStoreError>; fn load_latest( &self, ) -> Result<Option<CheckpointManifest>, CheckpointStoreError>; fn load_by_id( &self, id: u64, ) -> Result<Option<CheckpointManifest>, CheckpointStoreError>; fn list(&self) -> Result<Vec<(u64, u64)>, CheckpointStoreError>; fn prune(&self, keep_count: usize) -> Result<usize, CheckpointStoreError>; fn save_state_data( &self, id: u64, data: &[u8], ) -> Result<(), CheckpointStoreError>; fn load_state_data( &self, id: u64, ) -> Result<Option<Vec<u8>>, CheckpointStoreError>; // Provided methods fn list_ids(&self) -> Result<Vec<u64>, CheckpointStoreError> { ... } fn update_manifest( &self, manifest: &CheckpointManifest, ) -> Result<(), CheckpointStoreError> { ... } fn validate_checkpoint( &self, id: u64, ) -> Result<ValidationResult, CheckpointStoreError> { ... } fn recover_latest_validated( &self, ) -> Result<RecoveryReport, CheckpointStoreError> { ... } fn cleanup_orphans(&self) -> Result<usize, CheckpointStoreError> { ... } fn save_with_state( &self, manifest: &CheckpointManifest, state_data: Option<&[u8]>, ) -> Result<(), CheckpointStoreError> { ... }
}
Expand description

Trait for checkpoint persistence backends.

Implementations must guarantee atomic manifest writes (readers never see a partial manifest). The latest.txt pointer is updated only after the manifest is fully written and synced.

Required Methods§

Source

fn save( &self, manifest: &CheckpointManifest, ) -> Result<(), CheckpointStoreError>

Persists a checkpoint manifest atomically.

The implementation writes to a temporary file and renames on success to prevent partial writes from being visible.

§Errors

Returns CheckpointStoreError on I/O or serialization failure.

Source

fn load_latest( &self, ) -> Result<Option<CheckpointManifest>, CheckpointStoreError>

Loads the most recent checkpoint manifest.

Returns Ok(None) if no checkpoint exists yet.

§Errors

Returns CheckpointStoreError on I/O or deserialization failure.

Source

fn load_by_id( &self, id: u64, ) -> Result<Option<CheckpointManifest>, CheckpointStoreError>

Loads a specific checkpoint manifest by ID.

§Errors

Returns CheckpointStoreError::NotFound if the checkpoint does not exist.

Source

fn list(&self) -> Result<Vec<(u64, u64)>, CheckpointStoreError>

Lists all available checkpoints as (checkpoint_id, epoch) pairs.

Results are sorted by checkpoint ID ascending.

§Errors

Returns CheckpointStoreError on I/O failure.

Source

fn prune(&self, keep_count: usize) -> Result<usize, CheckpointStoreError>

Prunes old checkpoints, keeping at most keep_count recent ones.

Returns the number of checkpoints removed.

§Errors

Returns CheckpointStoreError on I/O failure.

Source

fn save_state_data( &self, id: u64, data: &[u8], ) -> Result<(), CheckpointStoreError>

Writes large operator state data to a sidecar file for a given checkpoint.

§Errors

Returns CheckpointStoreError on I/O failure.

Source

fn load_state_data( &self, id: u64, ) -> Result<Option<Vec<u8>>, CheckpointStoreError>

Loads large operator state data from a sidecar file.

Returns Ok(None) if no sidecar file exists.

§Errors

Returns CheckpointStoreError on I/O failure.

Provided Methods§

Source

fn list_ids(&self) -> Result<Vec<u64>, CheckpointStoreError>

Lists all checkpoint IDs without loading manifests.

This is used by crash recovery to enumerate candidates including those with corrupt manifests. Results are sorted ascending.

§Errors

Returns CheckpointStoreError on I/O failure.

Source

fn update_manifest( &self, manifest: &CheckpointManifest, ) -> Result<(), CheckpointStoreError>

Overwrites an existing checkpoint manifest.

Used by Step 6b of the checkpoint protocol to update sink commit statuses after the initial save(). Unlike save(), this does NOT use conditional PUT — it unconditionally overwrites the manifest.

The default implementation delegates to save(), which is correct for backends where save() is idempotent (e.g., filesystem with write-to-temp + rename).

§Errors

Returns CheckpointStoreError on I/O or serialization failure.

Source

fn validate_checkpoint( &self, id: u64, ) -> Result<ValidationResult, CheckpointStoreError>

Validate a specific checkpoint’s integrity.

Checks that the manifest is parseable and, if a state_checksum is present, verifies the sidecar data matches.

§Errors

Returns CheckpointStoreError on I/O failure.

Source

fn recover_latest_validated( &self, ) -> Result<RecoveryReport, CheckpointStoreError>

Walk backward from latest to find the first valid checkpoint.

Returns a RecoveryReport describing the walk. If no valid checkpoint is found, chosen_id is None (fresh start).

§Errors

Returns CheckpointStoreError on I/O failure.

Source

fn cleanup_orphans(&self) -> Result<usize, CheckpointStoreError>

Delete orphaned state files that have no matching manifest.

Returns the number of orphans cleaned up.

§Errors

Returns CheckpointStoreError on I/O failure.

Source

fn save_with_state( &self, manifest: &CheckpointManifest, state_data: Option<&[u8]>, ) -> Result<(), CheckpointStoreError>

Atomically saves a checkpoint manifest with optional sidecar state data.

When state_data is provided, the sidecar (state.bin) is written and fsynced before the manifest. This ensures that if the sidecar write fails, the manifest is never persisted and latest.txt still points to the previous valid checkpoint.

Orphaned state.bin files (written but no manifest) are harmless and cleaned up by prune().

§Errors

Returns CheckpointStoreError on I/O or serialization failure.

Implementors§