redipy.bigquery

Overview

The redipy.bigquery package is a thin wrapper around the google-cloud-bigquery Python client, allowing you to use its functionality to query tables stored on Redivis. All authentication is managed via your Redivis API credentials.

Please note that the only supported methods are those that involve querying tables. Interfaces for listing BigQuery resources, referencing BigQuery datasets, or creating, modifying, or deleting BigQuery resources are not supported.

Usage

Installation

pipenv install -e "git+https://github.com/redivis/redipy.git#egg=redivis-bigquery&subdirectory=bigquery"
REDIVIS_API_TOKEN=<your-api-token> pipenv run python

Authentication

The REDIVIS_API_TOKEN environment variable must be set to your Redivis API token, and the token must have data.data scope (or, if referencing sample tables, data.sample).

REDIVIS_API_TOKEN=your_api_token python [your_script.py]
# or, within your script
import os
os.environ["REDIVIS_API_TOKEN"] = "your_api_token"
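
It can help to fail early with a clear message when the token is missing. The following is a minimal sketch, not part of redivis-bigquery: the helper name get_redivis_token is hypothetical, and only the REDIVIS_API_TOKEN variable name comes from this document.

```python
import os


def get_redivis_token(environ=None):
    """Return the Redivis API token from the environment, or raise if it is unset.

    Hypothetical helper for illustration; not part of redivis-bigquery.
    """
    if environ is None:
        environ = os.environ
    # Treat a blank or whitespace-only value the same as a missing one.
    token = environ.get("REDIVIS_API_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "REDIVIS_API_TOKEN is not set; export it or assign it to "
            "os.environ before creating a client."
        )
    return token
```

Calling get_redivis_token() at the top of a script, before bigquery.Client(), turns a missing token into an immediate, readable error.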

Simple queries

from redivis import bigquery
client = bigquery.Client()
# Perform a query.
# Table at https://redivis.com/projects/1008/tables/9443
QUERY = ('SELECT * FROM `ianmathews91.medicare_public_example.high_cost_in_providers_in_CA_output` LIMIT 10')
query_job = client.query(QUERY) # API request
for row in query_job:
    print(row)
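
If plain dictionaries are easier to work with than row objects, the rows can be converted as they are read. This is a sketch that assumes the rows returned through redivis-bigquery behave like google-cloud-bigquery Row objects, which expose items(); the helper name rows_to_dicts is hypothetical.

```python
def rows_to_dicts(rows):
    """Convert an iterable of row-like objects into a list of plain dicts.

    Assumes each row exposes items() yielding (column_name, value) pairs,
    as google-cloud-bigquery Row objects do. Hypothetical helper.
    """
    return [dict(row.items()) for row in rows]
```

For example, records = rows_to_dicts(client.query(QUERY)) would collect the query results into a list, under the assumption stated above.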

Working with data frames

from redivis import bigquery
client = bigquery.Client()
# Perform a query.
# Table at https://redivis.com/StanfordPHS/datasets/1411/tables
QUERY = ('SELECT * FROM `stanfordphs.commuting_zone:v1_0.life_expectancy_trends` LIMIT 10')
df = client.query(QUERY).to_dataframe() # API request
print(df)

Referencing tables

All tables referenced in SQL query strings should follow Redivis entity reference rules.
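
Both example queries above wrap the table reference in backticks. As a sketch of keeping that quoting in one place, the helper below builds a SELECT statement from a reference string. The name build_select is hypothetical, and the function only adds backticks and a LIMIT clause; it does not validate a reference against the Redivis entity reference rules.

```python
def build_select(table_ref, limit=10):
    """Build a SELECT * query for a backtick-quoted table reference.

    Hypothetical helper; it does not check the reference against the
    Redivis entity reference rules.
    """
    if not table_ref or "`" in table_ref:
        raise ValueError("table_ref must be non-empty and contain no backticks")
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 0:
        raise ValueError("limit must be a non-negative integer")
    return f"SELECT * FROM `{table_ref}` LIMIT {limit}"


# Table reference taken from the data frame example in this document.
QUERY = build_select("stanfordphs.commuting_zone:v1_0.life_expectancy_trends")
```

The resulting string can be passed to client.query(QUERY) in the same way as the hand-written queries.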

Further reference

Consult the google-cloud-bigquery documentation for further information. Please note the following changes from the google-cloud-bigquery library:

  • redivis-bigquery is read-only, so methods that create, modify, or delete BigQuery resources are not supported

  • You do not need to provide a project_id to any calls; it will be ignored

  • As long as the REDIVIS_API_TOKEN environment variable has been set, no additional authentication is required