pub trait StreamSource:
Send
+ Sync
+ Debug {
// Required methods
fn schema(&self) -> SchemaRef;
fn stream(
&self,
projection: Option<Vec<usize>>,
filters: Vec<Expr>,
) -> Result<SendableRecordBatchStream, DataFusionError>;
// Provided methods
fn supports_filters(&self, filters: &[Expr]) -> Vec<bool> { ... }
fn output_ordering(&self) -> Option<Vec<SortColumn>> { ... }
}Expand description
A source of streaming data for DataFusion queries.
This trait enables integration between LaminarDB’s push-based event
processing and DataFusion’s pull-based query execution model.
Implementations must be thread-safe as DataFusion may access them
from multiple threads during query planning and execution.
§Filter Pushdown
Sources can optionally support filter pushdown by implementing
supports_filters. When filters are pushed down, the source should
apply them before yielding batches to reduce data transfer.
§Projection Pushdown
Sources should respect the projection parameter in stream() to
only read columns that are needed by the query, improving performance.
Required Methods§
Sourcefn schema(&self) -> SchemaRef
fn schema(&self) -> SchemaRef
Returns the schema of records produced by this source.
The schema must be consistent across all calls and must match
the schema of RecordBatch instances yielded by stream().
Sourcefn stream(
&self,
projection: Option<Vec<usize>>,
filters: Vec<Expr>,
) -> Result<SendableRecordBatchStream, DataFusionError>
fn stream( &self, projection: Option<Vec<usize>>, filters: Vec<Expr>, ) -> Result<SendableRecordBatchStream, DataFusionError>
Creates a stream of RecordBatch instances.
§Arguments
projection- Optional column indices to project. IfNone, all columns are returned. Indices refer to the source schema.filters- Filter expressions that can be applied at the source. The source may partially or fully apply these filters.
§Returns
A stream that yields RecordBatch instances asynchronously.
§Errors
Returns DataFusionError if the stream cannot be created.
Provided Methods§
Sourcefn supports_filters(&self, filters: &[Expr]) -> Vec<bool>
fn supports_filters(&self, filters: &[Expr]) -> Vec<bool>
Returns which filters this source can apply.
For each filter in filters, returns true if the source will
apply that filter, false otherwise. DataFusion uses this to
know which filters still need to be applied after the scan.
The default implementation returns all false, indicating no
filter pushdown support.
§Arguments
filters- The filters being considered for pushdown.
§Returns
A vector of booleans, one per filter, indicating support.
Sourcefn output_ordering(&self) -> Option<Vec<SortColumn>>
fn output_ordering(&self) -> Option<Vec<SortColumn>>
Returns the output ordering of this source, if any.
When a source is pre-sorted (e.g., by event time from an ordered
Kafka partition), declaring the ordering allows DataFusion to
elide SortExec from the physical plan for ORDER BY queries that
match the declared ordering.
The default implementation returns None, indicating the source
has no guaranteed output ordering.