Query$to_arrow_batch_reader
Query$to_arrow_batch_reader(max_results=NULL, variables=NULL) → RecordBatchReader
Returns a reader that mimics the Arrow RecordBatchStreamReader, which can be consumed to process rows in a streaming fashion, batch by batch. By streaming, you avoid loading more than a small chunk of the result set into memory at a time, so this approach scales to result sets larger than available memory.
The batch reader exposes a $read_next_batch()
method, which should be called repeatedly to fetch the next Arrow RecordBatch. When $read_next_batch() returns NULL
, there is no more data left to read.
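The read loop described above can be sketched as follows; `query` is assumed to be an existing Query object, and `as.data.frame()` is the arrow package's conversion of a RecordBatch to a base R data frame:

```r
# Minimal sketch of the streaming read loop (assumes `query` exists)
reader <- query$to_arrow_batch_reader()
repeat {
  batch <- reader$read_next_batch()
  if (is.null(batch)) break          # NULL signals the end of the stream
  chunk <- as.data.frame(batch)      # process one RecordBatch at a time
  message("processed ", nrow(chunk), " rows")
}
reader$close()                       # release the underlying connection
```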
Parameters:
max_results
: int, default NULL
The maximum number of records to read. If not specified, the entire query result set is processed.
variables
: character vector
The specific variables to return, e.g., variables = c("name", "date")
. If not specified, all variables in the query results will be returned.
Returns:
class RecordBatchReader()
Methods:
$read_next_batch()
→ Arrow RecordBatch | NULL
Reads the next batch of records. A NULL
return value indicates that there is nothing left to read.
$close()
→ void
Closes the underlying connection and releases all resources.
Notes:
The returned instance largely mirrors the behavior of the Arrow RecordBatchStreamReader, though only the $read_next_batch()
method is implemented.
If you read inside a function, call on.exit(batch_reader$close())
immediately after creating the reader, so that resources are released even if the function exits prematurely.
Examples
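A fuller sketch combining the parameters and notes above; the function name `count_rows` and the variable names are illustrative, and `batch$num_rows` is the arrow package's row-count field on a RecordBatch:

```r
# Count rows in a query result without materializing it in memory.
# Assumes `query` is an existing Query object.
count_rows <- function(query) {
  reader <- query$to_arrow_batch_reader(
    max_results = 1000,               # read at most 1000 records
    variables   = c("name", "date")   # only return these variables
  )
  on.exit(reader$close())             # release resources even on early exit
  total <- 0L
  repeat {
    batch <- reader$read_next_batch()
    if (is.null(batch)) break
    total <- total + batch$num_rows
  }
  total
}
```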