post

Overview

The upload resource is used for adding data content to a table. You may only create uploads on tables that belong to unreleased versions. You may create one or more uploads per table, and their records will be "stacked" based on common variable names across the uploads.

There are several mechanisms for uploading content through this endpoint. In general, if your file is larger than 100MB, or if you are on an unreliable internet connection, resumable uploads are recommended. If you use a Redivis client library to perform your uploads, this is taken care of for you.

There's a lot of complexity in uploading data! It's highly recommended to use one of the client libraries to transfer data to Redivis. These libraries take care of the complexities automatically and optimize the upload mechanism based on the size of your file.

Simple uploads

Simple uploads should be used for smaller files that follow standard conventions. Provide the upload metadata through query parameters, and the file's content in the request body. The total payload cannot exceed 100MB.
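As a minimal sketch of a simple upload, the request below carries the name as a query parameter and the raw file content as the body. The base URL, the bearer-token Authorization header, the application/octet-stream content type, and the helper name build_simple_upload are assumptions for illustration and are not specified on this page.

```python
import urllib.parse
import urllib.request

# Assumptions: base URL and bearer-token auth are illustrative only.
API_BASE = "https://redivis.com/api/v1"
# The 100MB limit, taking 1MB = 2**20 bytes (assumption; the page does not define the unit).
MAX_SIMPLE_UPLOAD_BYTES = 100 * 1024 * 1024


def build_simple_upload(table_reference: str, name: str, content: bytes, token: str) -> urllib.request.Request:
    """Build (but do not send) a simple upload request."""
    if len(content) > MAX_SIMPLE_UPLOAD_BYTES:
        raise ValueError("payload exceeds 100MB; use a resumable upload instead")
    # Upload metadata goes in the query string, the file content in the body.
    query = urllib.parse.urlencode({"name": name})
    path = urllib.parse.quote(table_reference, safe="")
    url = f"{API_BASE}/tables/{path}/uploads?{query}"
    return urllib.request.Request(
        url,
        data=content,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            # Assumption: a generic binary content type for the raw file body.
            "Content-Type": "application/octet-stream",
        },
    )
```

The returned request can be sent with urllib.request.urlopen(request) once a real table reference and access token are supplied.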

Multipart uploads

If a file is still small, but you need to provide additional metadata, you may send a multipart request body. The first part must be JSON-encoded and contain the upload's metadata, and the second part should contain the file's content.

If sending a multipart request, the appropriate Content-Type: multipart/form-data header must be set. The total payload of multipart uploads cannot exceed 100MB.
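The two-part body described above could be assembled by hand as in the following sketch. The part names "metadata" and "file" and the helper name build_multipart_body are hypothetical; this page only specifies the order of the parts (JSON metadata first, file content second) and the multipart/form-data content type.

```python
import json
import uuid


def build_multipart_body(metadata: dict, file_name: str, content: bytes) -> tuple[bytes, str]:
    """Return (body, content_type) for a two-part multipart/form-data request.

    Assumption: the part names "metadata" and "file" are illustrative only.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        # First part: JSON-encoded upload metadata.
        delimiter,
        b'Content-Disposition: form-data; name="metadata"',
        b"Content-Type: application/json",
        b"",
        json.dumps(metadata).encode("utf-8"),
        # Second part: the file's content.
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"'.encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        # Closing delimiter.
        delimiter + b"--",
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"
```

The second element of the returned tuple is the value to send in the Content-Type header, so that the server can locate the boundary between the two parts.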

Resumable uploads

For larger files or less reliable network connections, it is recommended to use resumable uploads for better fault tolerance. To perform a resumable upload, first create a temporary resumable upload and upload your file's content to the provided URL. After the upload is completed, call this endpoint with the resumableUploadId provided in the request body, setting the header Content-Type: application/json.
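The final step of a resumable upload might look like the sketch below, which only builds the JSON request to this endpoint after the file content has already been transferred. It assumes the id is sent under the tempUploadId property listed in the request body parameters; the base URL, the bearer-token header, and the helper name build_resumable_finalize are likewise assumptions.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL is illustrative only.
API_BASE = "https://redivis.com/api/v1"


def build_resumable_finalize(table_reference: str, name: str, temp_upload_id: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the request that turns a finished temporary upload into an upload."""
    # Assumption: the id is passed as "tempUploadId", per the request body table.
    body = {"name": name, "tempUploadId": temp_upload_id}
    path = urllib.parse.quote(table_reference, safe="")
    return urllib.request.Request(
        f"{API_BASE}/tables/{path}/uploads",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
            "Content-Type": "application/json",
        },
    )
```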

Streaming uploads

If you have a small number of rows being inserted at relatively high frequency, you should first create an empty upload via this endpoint (providing type="stream" in the request body), and then utilize the upload.insertRows endpoint to stream individual rows into the upload.
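A request body for creating an empty streaming upload could be built as in this sketch. The helper name stream_upload_body and the example variable names are hypothetical; the "integer" type is the one shown in the schema description on this page.

```python
import json
from typing import Optional


def stream_upload_body(name: str, schema: Optional[list] = None) -> str:
    """Return the JSON body for creating an empty upload of type "stream"."""
    body = {"name": name, "type": "stream"}
    if schema is not None:
        # Optional initial schema, validated on subsequent insertRows calls.
        body["schema"] = schema
    return json.dumps(body)


# Hypothetical variable names, for illustration only.
example_body = stream_upload_body(
    "sensor_readings",
    schema=[{"name": "station_id", "type": "integer"}, {"name": "reading", "type": "integer"}],
)
```

Rows are then added through the separate upload.insertRows endpoint, which is not sketched here.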

Transfer uploads

If the file currently resides in another location, such as an s3 bucket or url, or as a table or file on Redivis, you can specify a transferSpecification in the request body. Note that you must first enable the relevant data source in your workspace -> settings before you can perform a transfer upload from that source.
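A transfer upload body could be assembled as in the following sketch, which applies the rules for transferSpecification given in the request body parameters: sourceType must be one of the listed values, and identity is left out for url and redivis sources. The helper name transfer_upload_body and the example bucket and email are hypothetical.

```python
from typing import Optional

ALLOWED_SOURCE_TYPES = {"url", "gcs", "s3", "bigQuery", "redivis"}
# Sources for which the identity should be omitted.
NO_IDENTITY_SOURCES = {"url", "redivis"}


def transfer_upload_body(name: str, source_type: str, source_path: str, identity: Optional[str] = None) -> dict:
    """Return a request body dict containing a transferSpecification."""
    if source_type not in ALLOWED_SOURCE_TYPES:
        raise ValueError(f"unsupported sourceType: {source_type!r}")
    spec = {"sourceType": source_type, "sourcePath": source_path}
    if source_type in NO_IDENTITY_SOURCES:
        if identity is not None:
            raise ValueError(f"identity should be omitted for sourceType {source_type!r}")
    elif identity is not None:
        # Email address associated with the credentials for this source.
        spec["identity"] = identity
    return {"name": name, "transferSpecification": spec}


# Hypothetical bucket, object path, and identity, for illustration only.
example = transfer_upload_body("data.csv", "s3", "my-bucket/exports/data.csv", identity="user@example.com")
```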

HTTP Request

POST /api/v1/tables/:tableReference/uploads

This endpoint extends the general API structure

Path parameters

Parameter

tableReference

A qualified reference to the table. See referencing resources for more information.

Query parameters

Parameter

name

Required only if no name is provided in the request body. The name of the file being uploaded. Overrides the name in the request body if both are provided.

The file type will be auto-determined based on the ending of this name. If you'd like to manually set the type, provide it via the request body.

Request body

If you are performing a simple upload, provide the upload's data in the request body.

If you are performing a multipart upload, the first part must contain the JSON-encoded parameters specified below, and the second part must contain the upload's data.

If you are performing a resumable upload, provide the resumableUploadId and the upload's parameters as a JSON-encoded body as specified below.

If you are performing a streaming upload, provide the upload's parameters as a JSON-encoded body as specified below. The upload type must be set to stream.

Property name
Type
Description

name

string

Required. The name of the upload.

tempUploadId

string

Required if using temp resumable uploads. The id returned by the table.createTempUploads endpoint.

transferSpecification

object

Required if using an external transfer. The configuration for the external source where the file resides.

transferSpecification.sourceType

string

The source where the file currently resides. Must be one of: url, gcs, s3, bigQuery, redivis.

transferSpecification.sourcePath

string

The path to the file in your source. For example, a url for url uploads, bucket-name/object-path for GCS and s3, and a qualified table specification for BigQuery and Redivis sources.

transferSpecification.identity

string

Should be omitted if the sourceType is url or redivis. For other sources, the identity should be the email address associated with the credentials for the particular sourceType. Note that you must first enable the relevant data source in your workspace -> settings for this identity.

type

string

Optional. The type of the file. If not provided, the type will be auto-determined based on the file ending provided via the upload name. An error will occur if no valid type can be determined from the upload name. Allowed types:

  • delimited

  • stream - for uploads that will have subsequent calls to upload.insertRows

  • avro

  • ndjson

  • parquet

  • orc

  • xls

  • xlsx

  • dta

  • sas7bdat

  • sav

  • geojson

  • geojsonl

  • json

  • shp

  • shp.zip

  • kml

metadata

object

Optional. Metadata for the variables in the file. This parameter is a dict mapping each variable name to the metadata for that variable, e.g. { <variable_name>: { label: str, description: str, valueLabels: { value: str, ... } }, ... }

schema

object

Optional. Only relevant for uploads of type stream. Defines an initial schema that will be validated on subsequent calls to insertRows. Takes the form: [{ "name": "var_name", "type": "integer"}, ...]

skipBadRecords

bool

Optional. Default false. If set to true, the upload will succeed even if some records are invalid / un-parseable. After the upload is completed, the number of skipped records will be available through the skippedBadRecords property on the upload resource.

hasHeaderRow

bool

Optional. Whether or not the file has a header present as the first row. Only applicable for delimited, xls, and xlsx types. Defaults to true.

delimiter

string

Optional. Only applicable for delimited types. The character to use as a field separator in the delimited file. If unspecified, the delimiter will be auto-determined based on analysis of the first 10MB of the file.

quoteCharacter

string

Optional. Only applicable for delimited type. The character used to escape fields that contain the delimiter (most often " for compliant delimited files).

If unspecified, Redivis will attempt to auto-infer the quote character by scanning the first 10MB of the file.

escapeCharacter

string

Optional. Only applicable for delimited type. The character that precedes any occurrences of the quote character when it should be treated as its literal value, rather than the start or end of a quote sequence (typically, the escape character will match the quote character, but sometimes is represented as a backward slash \).

If unspecified, Redivis will attempt to auto-infer the escape character by scanning the first 10MB of the file.

hasQuotedNewlines

bool

Optional. Applicable for all types other than stream, parquet, avro, and orc.

If true, specifies that line breaks exist within a quoted field of the delimited file. If unset, Redivis will scan the first 10MB of the file and attempt to determine whether the file contains quoted newlines, though this may fail if quoted newlines are first encountered further into the file.

Setting this value to true won't cause issues with files that don't have newline characters in fields, though it will substantially increase import processing time and may lead to inaccurate error messages if set unnecessarily.
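To illustrate how the parsing options for delimited files fit together, the sketch below builds upload parameters that include only the options set explicitly, leaving the rest to be auto-inferred as described above. The helper name delimited_upload_params is hypothetical; the property names are those listed in the request body parameters.

```python
# Parsing-related properties from the request body parameters.
DELIMITED_OPTION_KEYS = (
    "hasHeaderRow",
    "delimiter",
    "quoteCharacter",
    "escapeCharacter",
    "hasQuotedNewlines",
    "skipBadRecords",
)


def delimited_upload_params(name: str, **options) -> dict:
    """Return upload parameters for a delimited file, omitting any unset options."""
    unknown = set(options) - set(DELIMITED_OPTION_KEYS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    params = {"name": name, "type": "delimited"}
    # Options left as None are dropped so that auto-inference still applies to them.
    params.update({key: value for key, value in options.items() if value is not None})
    return params


# A tab-separated file whose quoted fields may contain line breaks.
example_params = delimited_upload_params("data.tsv", delimiter="\t", hasQuotedNewlines=True, quoteCharacter=None)
```

These parameters would be JSON-encoded and sent as the first part of a multipart upload, or as the JSON body of a resumable or transfer upload.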

Authorization

Edit access to the dataset is required. Your access token must have the following scope:

  • data.edit

Learn more about authorization.

Response body

If successful, returns a JSON representation of an upload resource.
