Overview

The redivis Python module provides an interface to construct representations of Redivis entities and to create, modify, read, and delete them.
Resources are generally constructed by chaining together multiple constructor methods, reflecting the hierarchical nature of entities in Redivis. For example, to list all variables on a table (which belongs to a dataset in an organization), we would write:
import redivis

# Chain constructors from organization down to table, then list its variables
variables = (
    redivis.organization("Demo")
    .dataset("CMS 2014 Medicare Data")
    .table("Home health agencies")
    .list_variables()
)
This package requires Python >= 3.6 and lists pandas >= 1.0.0 and requests >= 2.0.0 as its dependencies.

Getting resource properties

Most instances have a properties attribute, containing a dict of the API resource representation of that instance. This attribute is fully populated with the get representation after calls to get / create / update, and is partially populated with the list representation after calls that list resources (e.g., dataset.list_tables()). Otherwise, this attribute is None.
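Because properties may be None until one of those calls has populated it, code that reads a property may want to guard against that case. The following is a minimal sketch that uses a stand-in object rather than a real Redivis resource; the helper name read_property and the "name" key are hypothetical and chosen only for illustration.

```python
from types import SimpleNamespace


def read_property(resource, key, default=None):
    """Return resource.properties[key], or default if properties is unpopulated."""
    properties = getattr(resource, "properties", None)
    if properties is None:
        # No get / create / update / list call has populated the attribute yet.
        return default
    return properties.get(key, default)


# Stand-ins for a resource before and after its properties are populated.
# The "name" key is illustrative only, not a documented field.
unpopulated = SimpleNamespace(properties=None)
populated = SimpleNamespace(properties={"name": "Home health agencies"})

print(read_property(unpopulated, "name"))  # None
print(read_property(populated, "name"))    # Home health agencies
```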

Reading data

When reading data from a table, query, or upload, you can return the results as either a list of rows or a pandas DataFrame (via the .list_rows() and .to_dataframe() methods, respectively).
When calling list_rows, a Python list of named tuples is returned, allowing you to reference values in each row by either the variable name or offset. For example:
rows = (
    redivis.query("""
        SELECT 1 + 1 AS some_number, 'foo' AS some_string
        UNION ALL
        SELECT 4, 'bar'
    """).list_rows()
)
print(rows) # [Row(some_number='2', some_string='foo'), Row(some_number='4', some_string='bar')]
rows[0].some_number # 2 (named tuple fields are read as attributes)
rows[0][1] # "foo"
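To make the named-tuple access patterns concrete without needing a Redivis connection, here is a small sketch built on the standard library's collections.namedtuple. The Row type and its values below are stand-ins written for this example; the actual row objects come from the library.

```python
from collections import namedtuple

# Stand-in for the rows returned by list_rows(); the values are invented.
Row = namedtuple("Row", ["some_number", "some_string"])
rows = [Row(2, "foo"), Row(4, "bar")]

first = rows[0]
by_name = first.some_number   # access by variable name
by_offset = first[1]          # access by position
as_dict = first._asdict()     # convert a row to a mapping keyed by variable name

# Named tuples also unpack like ordinary tuples.
for some_number, some_string in rows:
    print(some_number, some_string)
```

Converting with _asdict() can be convenient when a row needs to be serialized or looked up by a variable name held in a string.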
When calling to_dataframe, a pandas DataFrame is returned. Column types are set automatically according to the variable type in Redivis, based on the following mapping:
  • integer: int64
  • float: float64
  • date: datetime (the time component is always 00:00:00)
  • dateTime: datetime
  • time: timedelta
  • boolean: boolean (nullable boolean datatype)
  • string: string
  • geography: geopandas.GeoSeries | string (see below)
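The date and time entries in this mapping are perhaps the least intuitive. The sketch below illustrates the same idea with Python's standard datetime module rather than pandas: a date carried as a datetime at midnight, and a time of day carried as a duration since midnight. The sample values are invented, and this is an illustration of the representation, not of the library's conversion code.

```python
from datetime import date, datetime, time, timedelta

# A date carried as a datetime: the time component is 00:00:00.
d = date(2014, 1, 31)
as_datetime = datetime.combine(d, time.min)

# A time of day carried as a timedelta: the offset from midnight.
t = time(13, 30, 15)
as_timedelta = timedelta(hours=t.hour, minutes=t.minute, seconds=t.second)

# Recover the original date and time of day from those representations.
recovered_date = as_datetime.date()
recovered_time = (datetime.min + as_timedelta).time()

print(as_datetime)     # 2014-01-31 00:00:00
print(as_timedelta)    # 13:30:15
print(recovered_time)  # 13:30:15
```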

Geographic variables

If your data contains a variable of type geography, the to_dataframe() method returns a GeoPandas dataframe by default, which extends the base pandas dataframe with GIS functionality. A GeoPandas dataframe contains a single column that is specified as its geospatial index. By default, this is the first geography variable encountered, though you can explicitly set it to another variable: table.to_dataframe(geography_variable="variable_name").
If you have geography variables but would prefer not to use GeoPandas, specify geography_variable=None in the arguments. In this case, all geography variables are stored as strings using the WKT encoding. This is also how geography variables are encoded when calling the .list_rows() method.
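When geography values arrive as WKT strings, they can be parsed by hand for simple cases. The sketch below handles only 2D POINT values written with plain decimal numbers, using a regular expression; the function name parse_wkt_point is hypothetical. For general WKT (lines, polygons, scientific notation), a full parser such as shapely.wkt.loads is the more appropriate tool.

```python
import re

# Matches a 2D WKT point such as "POINT(-122.17 37.43)" (x first, then y).
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def parse_wkt_point(wkt):
    """Return (x, y) as floats for a 2D WKT POINT, or None if the text is not one."""
    if wkt is None:
        return None
    match = _POINT_RE.match(wkt)
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


print(parse_wkt_point("POINT(-122.17 37.43)"))  # (-122.17, 37.43)
print(parse_wkt_point("LINESTRING(0 0, 1 1)"))  # None
```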

Environment variables

The following environment variables may be set to modify the behavior of the redivis-python client.

REDIVIS_DEFAULT_PROJECT

If set, tables referenced via redivis.table() and unqualified table names in redivis.query() will be assumed to be within the default project.
Takes the form user_name.project_name. All notebooks on Redivis automatically set the default project to that notebook's project.

REDIVIS_DEFAULT_DATASET

If set, tables referenced via redivis.table() and unqualified table names in redivis.query() will be assumed to be within the default dataset.
Takes the form owner_name.dataset_name. If both a default dataset and a default project are set, the default project supersedes the dataset.
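The precedence between the two variables can be sketched as a small lookup. This is an illustration of the rule described above, not the library's actual resolution code; the function name default_scope and the placeholder values are assumptions made for the example.

```python
import os


def default_scope(env=None):
    """Return ("project" or "dataset", reference), or None if neither variable is set."""
    env = os.environ if env is None else env
    project = env.get("REDIVIS_DEFAULT_PROJECT")
    dataset = env.get("REDIVIS_DEFAULT_DATASET")
    if project:
        # A default project takes precedence over a default dataset.
        return ("project", project)
    if dataset:
        return ("dataset", dataset)
    return None


# Placeholder references, passed as a dict instead of the real environment.
both = {
    "REDIVIS_DEFAULT_PROJECT": "some_user.some_project",
    "REDIVIS_DEFAULT_DATASET": "some_owner.some_dataset",
}
print(default_scope(both))  # ('project', 'some_user.some_project')
print(default_scope({"REDIVIS_DEFAULT_DATASET": "some_owner.some_dataset"}))
print(default_scope({}))    # None
```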

REDIVIS_API_TOKEN

If using this library in an external environment, you'll need to set this environment variable to your API token in order to authenticate.
Important: this token acts as a password, and should never be inlined in your code, committed to source control, or otherwise published.
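One way to keep the token out of source code is to read it from the environment at startup and fail early with a clear message if it is missing. The sketch below assumes nothing about the library beyond the variable name; the helper require_api_token is hypothetical, and the token value shown is a fake placeholder passed in a dict purely to demonstrate both paths.

```python
import os


def require_api_token(env=None):
    """Return the API token from the environment, or raise if it is not set."""
    env = os.environ if env is None else env
    token = env.get("REDIVIS_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REDIVIS_API_TOKEN is not set; export it in your shell or "
            "secrets manager rather than writing it into code."
        )
    return token


# Demonstration with stand-in environments (never hard-code a real token).
try:
    require_api_token({})
except RuntimeError as error:
    print("Missing token:", error)

token = require_api_token({"REDIVIS_API_TOKEN": "placeholder-not-a-real-token"})
print("Token loaded, length:", len(token))
```

In real use the function would be called with no arguments so that it reads os.environ, and the token itself should not be printed or logged.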