Redivis API

  1. REST API
  2. Uploads

post


Overview

The upload resource is used for adding data content to a table. You may only create uploads on tables that belong to unreleased versions. You may create one or more uploads per table, and their records will be "stacked" based on common variable names across the uploads.

There are several mechanisms for uploading content through this endpoint. In general, if your file is larger than 100MB, or if you are on an unreliable internet connection, it is recommended to use resumable uploads. If you are using a Redivis client library to perform your uploads, this is taken care of for you.

There's a lot of complexity in uploading data! It's highly recommended to use one of the client libraries to transfer data to Redivis. These libraries take care of the complexities automatically and optimize the upload mechanism based on the size of your file.

Simple uploads

Simple uploads should be used for smaller files that follow standard conventions. Provide the upload metadata through query parameters, and the file's content in the request body. The total payload cannot exceed 100MB.
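
As a rough sketch using Python's requests library (the API host, token, table reference, and file name below are placeholders, not values specified by this documentation), a simple upload might look like:

import requests

API_TOKEN = "YOUR_API_TOKEN"                     # placeholder token with data.edit scope
table_ref = "user_name.dataset_name.table_name"  # illustrative table reference

# Upload metadata is passed via query parameters; the file's bytes form the request body.
with open("people.csv", "rb") as f:
    response = requests.post(
        f"https://redivis.com/api/v1/tables/{table_ref}/uploads",
        params={"name": "people.csv"},
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        data=f,
    )

response.raise_for_status()
print(response.json())  # JSON representation of the created upload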

Multipart uploads

If a file is still small, but you need to provide additional metadata, you may send a multipart request body. The first part must be JSON-encoded and contain the upload's metadata, and the second part should contain the file's content.

If sending a multipart request, the appropriate Content-Type: multipart/form-data header must be set. The total payload of multipart uploads cannot exceed 100MB.
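
As an illustrative sketch, a multipart request can be composed with Python's requests library as shown below. The part field names are assumptions for illustration; requests sets the multipart/form-data Content-Type and boundary automatically.

import json
import requests

API_TOKEN = "YOUR_API_TOKEN"                     # placeholder
table_ref = "user_name.dataset_name.table_name"  # illustrative table reference

metadata = {"name": "people.csv", "type": "delimited", "hasHeaderRow": True}

with open("people.csv", "rb") as f:
    response = requests.post(
        f"https://redivis.com/api/v1/tables/{table_ref}/uploads",
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        files=[
            # First part: JSON-encoded upload metadata (field name is illustrative).
            ("metadata", (None, json.dumps(metadata), "application/json")),
            # Second part: the file's content.
            ("data", ("people.csv", f, "application/octet-stream")),
        ],
    )

response.raise_for_status()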

Resumable uploads

For larger files or less reliable network connections, it is recommended to use resumable uploads for better fault tolerance. To perform a resumable upload, first create a temporary resumable upload (via the table.createTempUploads endpoint) and upload your file's content to the provided URL. After the upload is completed, call this endpoint with the tempUploadId provided in the request body, setting the header Content-Type: application/json .
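
For illustration, assuming the file's bytes have already been uploaded to the URL returned by table.createTempUploads and its id is available, the final call to this endpoint might look like the following (all values are placeholders):

import requests

API_TOKEN = "YOUR_API_TOKEN"                     # placeholder
table_ref = "user_name.dataset_name.table_name"  # illustrative table reference
temp_upload_id = "TEMP_UPLOAD_ID"                # id returned by table.createTempUploads

response = requests.post(
    f"https://redivis.com/api/v1/tables/{table_ref}/uploads",
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    json={"name": "people.csv", "tempUploadId": temp_upload_id},
)
response.raise_for_status()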

Streaming uploads

If you have a small number of rows being inserted at relatively high frequency, you should first create an empty upload via this endpoint (providing type="stream" in the request body), and then utilize the upload.insertRows endpoint to stream individual rows into the upload.
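
A sketch of creating an empty streaming upload (values are placeholders); individual rows are then appended through the separate upload.insertRows endpoint, documented on its own page:

import requests

API_TOKEN = "YOUR_API_TOKEN"                     # placeholder
table_ref = "user_name.dataset_name.table_name"  # illustrative table reference

response = requests.post(
    f"https://redivis.com/api/v1/tables/{table_ref}/uploads",
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        "name": "events_stream",
        "type": "stream",
        # Optional initial schema, validated on subsequent insertRows calls.
        "schema": [{"name": "event_id", "type": "integer"}],
    },
)
response.raise_for_status()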

Transfer uploads

If the file currently resides in another location, such as an s3 bucket or url, or as a table or file on Redivis, you can specify a transferSpecification in the request body. Note that you must first enable the relevant data source in your workspace -> settings before you can perform a transfer upload from that source.
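
A sketch of a transfer upload from an S3 bucket (the bucket path, identity, and other values are placeholders, and the relevant data source must already be enabled in your workspace settings):

import requests

API_TOKEN = "YOUR_API_TOKEN"                     # placeholder
table_ref = "user_name.dataset_name.table_name"  # illustrative table reference

response = requests.post(
    f"https://redivis.com/api/v1/tables/{table_ref}/uploads",
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    json={
        "name": "people.csv",
        "transferSpecification": {
            "sourceType": "s3",
            "sourcePath": "my-bucket/path/to/people.csv",  # bucket-name/object-path
            "identity": "you@example.com",                  # email tied to the source credentials
        },
    },
)
response.raise_for_status()
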
HTTP Request

POST /api/v1/tables/:tableReference/uploads

This endpoint extends the general API structure.

Path parameters

tableReference

A qualified reference to the table. See referencing resources for more information.

Query parameters

name

Required only if no name is provided in the request body. The name of the file being uploaded; overrides the name in the request body if both are provided.

The file type will be auto-determined from the file extension in this name. If you'd like to set the type manually, provide it via the request body.

Request body

If you are performing a simple upload, provide the upload's data in the request body.

If you are performing a multipart upload, the first part must contain the JSON-encoded parameters specified below, and the second part must contain the upload's data.

If you are performing a resumable upload, provide the tempUploadId and the upload's parameters as a JSON-encoded body as specified below.

If you are performing a streaming upload, provide the upload's parameters as a JSON-encoded body as specified below. The upload type must be set to stream.

name (string)

Required. The name of the upload.

tempUploadId (string)

Required if using temp resumable uploads. The id returned by the table.createTempUploads endpoint.

transferSpecification (object)

Required if using an external transfer. The configuration for the external source where the file resides.

transferSpecification.sourceType (string)

The source where the file currently resides. Must be one of: url, gcs, s3, bigQuery, redivis.

transferSpecification.sourcePath (string)

The path to the file in your source. For example, a url for url uploads, bucket-name/object-path for GCS and s3 sources, and a qualified table reference for BigQuery and Redivis sources.

transferSpecification.identity (string)

Should be omitted if the sourceType is url or redivis. For other sources, the identity should be the email address associated with the credentials for the particular sourceType. Note that you must first enable the relevant data source in your workspace -> settings for this identity.

type (string)

Optional. The type of the file. If not provided, it will be auto-determined from the file extension in the upload name; an error will occur if no valid type can be determined from the upload name. Allowed types:

  • delimited

  • avro

  • jsonl

  • parquet

  • orc

  • xls

  • xlsx

  • dta

  • sas7bdat

  • sav

  • geojson

  • geojsonl

  • json

  • shp

  • shp.zip

  • kml

  • stream - for uploads that will have subsequent calls to upload.insertRows

metadata (object)

Optional. Metadata on the variables in the file. This parameter is a dict of variable names mapping to the metadata for that variable, e.g.: { <variable_name>: { label: str, description: str, valueLabels: { value: str, ... } }, ... }

schema (object)

Optional. Only relevant for uploads of type stream. Defines an initial schema that will be validated on subsequent calls to insertRows. Takes the form: [{ "name": "var_name", "type": "integer" }, ...]

skipBadRecords (bool)

Optional. Default false. If set to true, the upload will succeed even if some records are invalid / un-parseable. After the upload is completed, the number of skipped records will be available through the skippedBadRecords property on the upload resource.

hasHeaderRow (bool)

Optional. Whether the file has a header present as the first row. Only applicable for delimited, xls, and xlsx types. Defaults to true.

delimiter (string)

Optional. Only applicable for delimited types. The character used as the field separator in the delimited file. If unspecified, the delimiter will be auto-determined based on analysis of the first 10MB of the file.

quoteCharacter (string)

Optional. Only applicable for delimited types. The character used to escape fields that contain the delimiter (most often " for compliant delimited files). If unspecified, Redivis will attempt to auto-infer the quote character by scanning the first 10MB of the file.

escapeCharacter (string)

Optional. Only applicable for delimited types. The character that precedes any occurrence of the quote character when it should be treated as its literal value, rather than the start or end of a quoted sequence. Typically the escape character matches the quote character, though it is sometimes a backslash (\). If unspecified, Redivis will attempt to auto-infer the escape character by scanning the first 10MB of the file.

hasQuotedNewlines (bool)

Optional. Applicable for all types other than stream, parquet, avro, and orc. If true, specifies that line breaks exist within quoted fields of the delimited file. If unset, Redivis will scan the first 10MB of the file and attempt to determine whether the file contains quoted newlines, though this may fail if quoted newlines are first encountered further into the file.

Setting this value to true won't cause issues with files that don't have newline characters in fields, though it will substantially increase import processing time and may lead to inaccurate error messages if set unnecessarily.
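
To tie several of these parameters together, a hypothetical request body for a resumable upload of a delimited file might look like the following (all values are placeholders):

# Hypothetical request body for a resumable upload of a delimited file.
body = {
    "name": "people.csv",
    "tempUploadId": "TEMP_UPLOAD_ID",   # returned by table.createTempUploads
    "type": "delimited",
    "hasHeaderRow": True,
    "delimiter": ",",
    "skipBadRecords": False,
    "metadata": {
        "age": {
            "label": "Age at enrollment",
            "description": "Participant age, in years",
            "valueLabels": {"-1": "Unknown"},
        }
    },
}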

Authorization

Edit access to the dataset is required. Your access token must have the following scope:

  • data.edit

Learn more about authorization.

Response body

If successful, returns a JSON representation of an upload resource.
