Much as we allow reading a single parquet file to produce multiple `EngineData`s in an iterator, we should also allow expression evaluation on a single `EngineData` to produce a lazy iterator over multiple `EngineData`s. This requires an API change to the evaluator's return type.
That's almost the same type as returned by `read_[json/parquet]_files`, which return `Box<dyn Iterator<Item = DeltaResult<Box<dyn EngineData>>> + Send>`, aliased to `FileDataReadResultIterator`. So we should also factor out the `Send` requirement and give the type a better name.
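A minimal sketch of what this could look like, using simplified stand-ins for the crate's types: the `EngineDataResultIterator` alias name, the `Rows` batch type, the `Chunker` evaluator, and the `DeltaResult` stand-in are all hypothetical illustrations, not the actual API.

```rust
type DeltaResult<T> = Result<T, String>; // stand-in for the crate's error type

trait EngineData {
    fn num_rows(&self) -> usize;
}

// Current shape: one batch in, exactly one batch out.
trait ExpressionEvaluator {
    fn evaluate(&self, batch: &dyn EngineData) -> DeltaResult<Box<dyn EngineData>>;
}

// Proposed shape: one batch in, a lazy iterator of batches out.
// `EngineDataResultIterator` is a hypothetical name for the factored-out
// alias (FileDataReadResultIterator without the `Send` bound).
type EngineDataResultIterator<'a> =
    Box<dyn Iterator<Item = DeltaResult<Box<dyn EngineData>>> + 'a>;

trait LazyExpressionEvaluator {
    fn evaluate<'a>(&self, batch: &'a dyn EngineData) -> EngineDataResultIterator<'a>;
}

// Toy batch: just a row count.
struct Rows(usize);
impl EngineData for Rows {
    fn num_rows(&self) -> usize {
        self.0
    }
}

// Toy evaluator that splits its input into fixed-size output chunks,
// emitted lazily rather than materialized up front.
struct Chunker {
    chunk_rows: usize,
}

impl LazyExpressionEvaluator for Chunker {
    fn evaluate<'a>(&self, batch: &'a dyn EngineData) -> EngineDataResultIterator<'a> {
        let chunk = self.chunk_rows;
        let total = batch.num_rows();
        Box::new((0..total).step_by(chunk).map(move |start| {
            Ok(Box::new(Rows(chunk.min(total - start))) as Box<dyn EngineData>)
        }))
    }
}

fn main() {
    let eval = Chunker { chunk_rows: 4 };
    let input = Rows(10);
    let sizes: Vec<usize> = eval
        .evaluate(&input)
        .map(|r| r.unwrap().num_rows())
        .collect();
    assert_eq!(sizes, vec![4, 4, 2]); // 10 rows emitted lazily as 4+4+2
    println!("ok");
}
```

Because the output batches come from a lazy iterator, an evaluator whose expression expands its input never has to hold all of the expanded data in memory at once.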
The reason is that there are cases where expressions can expand input data significantly, which could cause OOMs, mess up block sizing, etc. As a concrete example, consider a hypothetical future table feature that supports non-materialized generated columns and/or default values. Given enough such columns, a single block read from parquet could easily expand by several factors.