Redivis API
User documentation | redivis.com
On this page
  • Overview
  • General structure
  • Escaping names
  • Reference Ids
  • Locating the reference id
  • Examples


Referencing resources


Last updated 4 months ago


Overview

In many parts of this API, we will reference tables on Redivis, as well as their related dataset or workflow. The API uses a consistent structure to uniquely identify tables, as specified below.

In the simplest case, tables are referenced by their name, alongside the name of the table's workflow or dataset, as well as the name of the dataset or workflow owner.

Additionally, names may be escaped to handle whitespace and non-standard characters. You can also provide an optional reference id to ensure that your references don't break when a table, dataset, or workflow gets renamed.

General structure

All tables on Redivis belong to either a dataset or a workflow. All datasets belong to either a user or organization, and all workflows belong to a user.

A table reference reflects this hierarchy, taking the following form:

ownerName.workflowIdentifier|datasetIdentifier.tableIdentifier

The ownerName represents the user or organization that owns the dataset / workflow.

The workflowIdentifier consists of the workflow name, followed by an optional referenceId prefaced by a colon. The name of the workflow may be escaped.

workflowIdentifier = workflowName[:workflowReferenceId]

The datasetIdentifier consists of the dataset name, followed by an optional referenceId prefaced by a colon. The name of the dataset may be escaped. Additionally, the dataset identifier may contain a sample flag (:sample) as well as a version identifier, which identifies a particular version of the dataset: a specific version (of the form v1_0), the current version (current), or the next (unreleased) version (next). If no version is specified, the current version will be used by default.

datasetIdentifier = datasetName[:datasetReferenceId][:sample][:versionIdentifier]

versionIdentifier = v1_0|current|next

Make sure to specify a versionIdentifier to avoid errors or inconsistent results when new versions get released, unless you explicitly want to always reference the latest version of the dataset.

The tableIdentifier consists of the table name, followed by an optional referenceId prefaced by a colon. The name of the table may be escaped.

tableIdentifier = tableName[:tableReferenceId]
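Putting the grammar together, a full reference string can be assembled from its parts. A minimal sketch in Python (build_table_reference is an illustrative helper, not part of any Redivis client library):

```python
def build_table_reference(owner, dataset, table, dataset_ref_id=None,
                          sample=False, version=None, table_ref_id=None):
    """Assemble ownerName.datasetIdentifier.tableIdentifier per the grammar above."""
    dataset_identifier = dataset
    if dataset_ref_id:
        dataset_identifier += f":{dataset_ref_id}"  # optional referenceId
    if sample:
        dataset_identifier += ":sample"             # optional sample flag
    if version:
        dataset_identifier += f":{version}"         # v1_0 | current | next
    table_identifier = table
    if table_ref_id:
        table_identifier += f":{table_ref_id}"
    return f"{owner}.{dataset_identifier}.{table_identifier}"

print(build_table_reference("demo", "ghcn_daily_weather_data",
                            "daily_observations", version="v1_0"))
# demo.ghcn_daily_weather_data:v1_0.daily_observations
```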

Escaping names

The ownerName will never need to be escaped, as user and organization names can only contain word characters ([A-Za-z0-9_]). These names are case-insensitive.

Dataset, workflow, and table names can contain a wide array of characters. To facilitate programmatic references, these names can be escaped with the following rules:

  1. All characters other than letters, numbers, and underscores in names and version tags are replaced by an underscore (_) character.

  2. Multiple underscore characters are collapsed into one.

  3. Leading and trailing underscores are removed.

  4. All names are case-insensitive.

For example:

  • Census dataset: 1940-1980 -> census_dataset_1940_1980

  • ~~Leading and trailing characters. -> leading_and_trailing_characters

Uniqueness is enforced for all escaped names within the relevant scope. For example, all tables in a workflow and all datasets in an organization will have a unique escaped name.

If a name contains colons (:), periods (.), or backticks (`), they must be escaped.
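The four escaping rules above can be sketched as a small Python function (escape_name is a hypothetical helper for illustration, not part of any Redivis library):

```python
import re

def escape_name(name: str) -> str:
    """Escape a Redivis dataset/workflow/table name per the documented rules."""
    # 1. Replace all non-alphanumeric, non-underscore characters with "_"
    escaped = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # 2. Collapse runs of underscores into a single underscore
    escaped = re.sub(r"_+", "_", escaped)
    # 3. Strip leading and trailing underscores
    escaped = escaped.strip("_")
    # 4. Names are case-insensitive; normalize to lowercase
    return escaped.lower()

print(escape_name("Census dataset: 1940-1980"))
# census_dataset_1940_1980
```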

Reference Ids

While it is convenient to reference tables, datasets, and workflows by their name, this presents a challenge if these resources get renamed over time. To avoid code breakage due to renames, each resource has a 4-character (lowercase, alphanumeric) referenceId associated with it. This identifier is always unique within the relevant scope: a table's referenceId is unique across all tables in its dataset / workflow, and the referenceId of a dataset or workflow is unique across all datasets and workflows for that user.

If you're writing code within Redivis (table queries, transforms, notebooks), the referenceId will generally be pre-populated for you. This section is mainly relevant if you're using the Redivis API from an external environment (e.g., a Jupyter notebook running on your computer).

For tables that belong to a dataset, the referenceId will be consistent across all versions of the dataset, as well as for sample tables. This allows for you to easily change the version / sample without needing to update any reference ids.

Locating the reference id

In the URL bar

One of the easiest ways to find the referenceId is to look at your URL bar. For example, if we navigate to the daily observations table in the GHCN dataset, our URL will be: https://redivis.com/datasets/7br5-41440fjzk/tables/6fff-2djw7v7mw. Here you can see that the dataset's full identifier is 7br5-41440fjzk. The referenceId is the first part of this identifier: 7br5. Similarly, the referenceId for the table is 6fff.
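Since the referenceId is the segment before the hyphen in each URL path component, it can also be pulled out programmatically. A rough sketch, assuming URLs of the shape shown on this page (reference_ids_from_url is illustrative, not an official helper):

```python
from urllib.parse import urlparse

def reference_ids_from_url(url):
    """Map each resource kind in the URL path (datasets, tables, ...) to its referenceId."""
    segments = urlparse(url).path.strip("/").split("/")
    # The path alternates kind/identifier; the referenceId precedes the first hyphen.
    return {kind: identifier.split("-")[0]
            for kind, identifier in zip(segments[::2], segments[1::2])}

ids = reference_ids_from_url(
    "https://redivis.com/datasets/7br5-41440fjzk/tables/6fff-2djw7v7mw")
print(ids)
# {'datasets': '7br5', 'tables': '6fff'}
```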

In the table export modal

You can also find this information by navigating to the export modal for a table and clicking on the tab for programmatic export information. Make sure the "exact references" box is checked, and you'll see the referenceId included in the template.

On the dataset page

You can also find dataset-specific information by clicking on the API Information link on the dataset overview page:

Via the API

Finally, the referenceId is returned as a property on dataset and table resources via the API. Tables also have a qualifiedReference property, which represents the full reference to the table (owner.dataset|workflow.table), including all reference ids and version specifiers.
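Going the other direction, a qualifiedReference string can be split back into its parts. A sketch, assuming names are already escaped (so literal dots and colons never appear inside them); parse_qualified_reference is illustrative, not a library function:

```python
def parse_qualified_reference(ref):
    """Split owner.containerIdentifier.tableIdentifier into named parts."""
    owner, container, table = ref.split(".")

    def split_identifier(segment):
        # Modifiers are the colon-separated referenceId / sample / version parts
        name, *modifiers = segment.split(":")
        return {"name": name, "modifiers": modifiers}

    return {
        "owner": owner,
        "container": split_identifier(container),  # dataset or workflow
        "table": split_identifier(table),
    }

print(parse_qualified_reference(
    "demo.ghcn_daily_weather_data:7br5:v1_1.daily_observations:6fff"))
```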

Examples

We can reference the "Daily observations" table from the GHCN Climatology dataset as:

SELECT * FROM 
`demo.ghcn_daily_weather_data.daily_observations` 
LIMIT 100
import redivis

user = redivis.user("demo")
dataset = user.dataset("ghcn_daily_weather_data")
table = dataset.table("daily_observations")

df = table.to_dataframe(max_results=100)
library(redivis)

user <- redivis$user("demo")
dataset <- user$dataset("ghcn_daily_weather_data")
table <- dataset$table("daily_observations")

data <- table$to_tibble(max_results=100)
import * as redivis from 'redivis'

const table = redivis.user('demo')
    .dataset('ghcn_daily_weather_data')
    .table('daily_observations')

const rows = table.list_rows({ maxResults: 100 })
https://redivis.com/api/v1
    /tables/demo.ghcn_daily_weather_data.daily_observations

By default this uses the current version of the dataset. If we want to work with version 1.0:

SELECT * FROM 
`demo.ghcn_daily_weather_data:v1_0.daily_observations` 
LIMIT 100
import redivis

user = redivis.user("demo")
dataset = user.dataset("ghcn_daily_weather_data", version="1.0")
# Alternatively, dataset = user.dataset("ghcn_daily_weather_data:v1_0")
table = dataset.table("daily_observations")

df = table.to_dataframe(max_results=100)
library(redivis)

user <- redivis$user("demo")
dataset <- user$dataset("ghcn_daily_weather_data", version="1.0")
# Alternatively, dataset <- user$dataset("ghcn_daily_weather_data:v1_0")
table <- dataset$table("daily_observations")

data <- table$to_tibble(max_results=100)
import * as redivis from 'redivis'

const table = redivis.user('demo')
    .dataset('ghcn_daily_weather_data', { version: '1.0' })
    // Alternatively, dataset("ghcn_daily_weather_data:v1_0")
    .table('daily_observations')

const rows = table.list_rows({ maxResults: 100 })
https://redivis.com/api/v1
    /tables/demo.ghcn_daily_weather_data:v1_0.daily_observations

If we want to work with the 1% sample:

SELECT * FROM 
`demo.ghcn_daily_weather_data:sample.daily_observations`
LIMIT 100
import redivis

user = redivis.user("demo")
dataset = user.dataset("ghcn_daily_weather_data", sample=True)
# Alternatively, dataset = user.dataset("ghcn_daily_weather_data:sample")
table = dataset.table("daily_observations")

df = table.to_dataframe(max_results=100)
library(redivis)

user <- redivis$user("demo")
dataset <- user$dataset("ghcn_daily_weather_data", sample=TRUE)
# Alternatively, dataset <- user$dataset("ghcn_daily_weather_data:sample")
table <- dataset$table("daily_observations")

data <- table$to_tibble(max_results=100)
import * as redivis from 'redivis'

const table = redivis.user('demo')
    .dataset('ghcn_daily_weather_data', { sample: true })
    // Alternatively, dataset("ghcn_daily_weather_data:sample")
    .table('daily_observations')

const rows = table.list_rows({ maxResults: 100 })
https://redivis.com/api/v1
    /tables/demo.ghcn_daily_weather_data:sample.daily_observations
Finally, we can provide referenceIds to prevent things from breaking if a dataset or table is renamed. Make sure also to specify a specific dataset version to avoid changes when a new version is released:

SELECT * FROM 
`demo.ghcn_daily_weather_data:7br5:v1_1.daily_observations:6fff` 
LIMIT 100
import redivis

user = redivis.user("demo")
dataset = user.dataset("ghcn_daily_weather_data:7br5:v1_1")
table = dataset.table("daily_observations:6fff")

df = table.to_dataframe(max_results=100)
library(redivis)

user <- redivis$user("demo")
dataset <- user$dataset("ghcn_daily_weather_data:7br5:v1_1")
table <- dataset$table("daily_observations:6fff")

data <- table$to_tibble(max_results=100)
import * as redivis from 'redivis'

const table = redivis.user('demo')
    .dataset('ghcn_daily_weather_data:7br5:v1_1')
    .table('daily_observations:6fff')

const rows = table.list_rows({ maxResults: 100 })
https://redivis.com/api/v1
    /tables/demo.ghcn_daily_weather_data:7br5:v1_1.daily_observations:6fff
Referencing tables in a workflow is quite similar, though workflows don't have versions or samples:

SELECT * FROM 
imathews.demo_workflow:t066.annual_precipitation:vdwn
LIMIT 100
import redivis

user = redivis.user("imathews")
workflow = user.workflow("demo_workflow:t066")
table = workflow.table("annual_precipitation:vdwn")

df = table.to_dataframe(max_results=100)
library(redivis)

user <- redivis$user("imathews")
workflow <- user$workflow("demo_workflow:t066")
table <- workflow$table("annual_precipitation:vdwn")

data <- table$to_tibble(max_results=100)
import * as redivis from 'redivis'

const table = redivis.user('imathews')
    .workflow('demo_workflow:t066')
    .table('annual_precipitation:vdwn')

const rows = table.list_rows({ maxResults: 100 })
https://redivis.com/api/v1
    /tables/imathews.demo_workflow:t066.annual_precipitation:vdwn
