Query$to_arrow_dataset
Query$to_arrow_dataset(max_results=NULL, variables=NULL, max_parallelization=parallelly::availableCores()) → Arrow Dataset
Returns an Arrow Dataset representing the results of a query on Redivis. Arrow Datasets are backed by files on disk rather than held in memory, so you can work with query results without loading them all into memory. The files backing the dataset are stored in your operating system's temp directory.
Since the underlying files for arrow datasets are stored on the filesystem, you should remove them once you're done to prevent excess disk utilization. The following command will remove the temp files associated with the arrow dataset:
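As a minimal sketch of that cleanup, assuming the returned object is a standard file-backed Arrow Dataset that exposes its file paths through `$files` (as datasets opened with `arrow::open_dataset()` do), the backing files could be removed as shown here. The locally written Parquet file is only a hypothetical stand-in for real query results, and this is not necessarily the exact command Redivis recommends:

```r
library(arrow)

# Hypothetical stand-in for the object returned by Query$to_arrow_dataset():
# a small file-backed Arrow Dataset written to the temp directory.
tmp_dir <- tempfile("arrow_dataset_")
dir.create(tmp_dir)
write_parquet(
  data.frame(id = 1:3, name = c("a", "b", "c")),
  file.path(tmp_dir, "part-0.parquet")
)
ds <- open_dataset(tmp_dir)

# ... work with `ds` here ...

# Capture the paths of the backing files, then delete them from disk.
backing_files <- ds$files
unlink(backing_files)

# TRUE if any backing file is still present after cleanup.
files_remain <- any(file.exists(backing_files))
```

Once the files are removed the dataset object can no longer be read, so run the cleanup only after you have finished with it.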
Parameters:
max_results : int, default NULL
The max number of records to load into the arrow dataset. If not specified, the entire query results will be loaded.
variables : list(str) | character vector, default NULL
The specific variables to return, e.g., variables = c("name", "date"). If not specified, all variables in the query results will be returned.
max_parallelization : int, default parallelly::availableCores()
The maximum parallelization when loading the query. Uses the future::multicore strategy when supported, falling back to future::multisession if not.
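To illustrate what the multicore/multisession fallback means, here is a rough sketch of choosing between the two strategies with the `future` and `parallelly` packages. It is not Redivis's internal implementation; the worker cap of 2 is a hypothetical value chosen for the example:

```r
library(future)

# Cap the worker count for this illustration (hypothetical value of 2).
workers <- min(2L, as.integer(parallelly::availableCores()))

# Prefer forked workers where the platform supports them; otherwise
# fall back to separate background R sessions.
if (parallelly::supportsMulticore()) {
  old_plan <- plan(multicore, workers = workers)
} else {
  old_plan <- plan(multisession, workers = workers)
}

# ... futures created here would run on up to `workers` workers ...

# Restore whichever plan was active before.
plan(old_plan)
```

Forked (multicore) processing is generally unavailable on Windows and inside RStudio, which is where the multisession fallback applies.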
Returns:
Arrow Dataset