Table.to_arrow_batch_iterator
Table.to_arrow_batch_iterator(max_results=None, *, variables=None, progress=True) → pyarrow.RecordBatch iterator
Returns an iterator that yields the table's contents as a sequence of PyArrow RecordBatches. This allows for streaming workflows in which only a small portion of the table is held in memory at a time.
Parameters:
max_results
: int, default None
The maximum number of rows to return. If not specified, all rows in the table will be read.
variables
: list<str>, default None
A list of variable names to read, improving performance when not all variables are needed. If unspecified, all variables will be represented in the returned rows. Variable names are case-insensitive, though the names in the results will reflect the variable's true casing. The order of the columns returned will correspond to the order of names in this list.
progress
: bool, default True
Whether to show a progress bar.
Yields:
pyarrow.RecordBatch
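Examples:
The following is a minimal sketch of the streaming pattern described above, not an official example. The helper `count_rows` is hypothetical and introduced only for illustration; it relies solely on the `num_rows` attribute that `pyarrow.RecordBatch` exposes. The user, dataset, table, and variable names in the commented usage are placeholders, and that usage assumes an authenticated `redivis` client.

```python
from typing import Any, Iterable


def count_rows(batches: Iterable[Any]) -> int:
    """Sum row counts batch by batch (hypothetical helper).

    Each item is expected to expose `num_rows`, as pyarrow.RecordBatch does.
    Only the current batch is referenced at any time, so memory use stays
    bounded by the batch size rather than the table size.
    """
    total = 0
    for batch in batches:
        total += batch.num_rows
    return total


# Assumed usage against a real table (all names below are placeholders):
#
#   import redivis
#
#   table = redivis.user("some_user").dataset("some_dataset").table("some_table")
#   batches = table.to_arrow_batch_iterator(
#       variables=["id", "value"],  # read only the variables that are needed
#       progress=False,             # suppress the progress bar
#   )
#   print(count_rows(batches))
```

Because the iterator is consumed lazily, any per-batch work (filtering, aggregating, writing to disk) can replace the row count in the loop body without loading the full table.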