Table concepts
Overview
Tables are the "container" for all data on Redivis. They are made of rows and columns, where each row represents an individual entry or observation and each column represents a Variable.
Tables belong to either a Dataset or a Workflow. In datasets, tables are created by uploading data. In workflows, tables are created as the resulting output of a transform or notebook.
When exploring a table, you will be able to switch between several tabs.
Tables can be used for analysis within a Workflow. Alternatively, they can be exported for analysis in other environments.
Table characteristics
Name
The table's name. If in a dataset, must be unique across all tables for that version of the dataset. If in a workflow, must be unique across all tables currently in the workflow.
Description
Optional. A free-form description of the table's contents. May not exceed 5000 characters.
Bibliography
This table's citation, and any recorded related identifiers.
Variable count
Total number of variables in the table.
Row count
Total number of rows, or records, in the table.
Size
Total size of the table, in bytes.
Entity
Optional. The concept that one record in this table represents. For example, the table's entity might represent a unique patient, or a specific hospitalization, or a prescription.
Temporal range
Optional. The range of time that this table covers. This can either be set manually, or calculated from the min/max of a particular variable.
If calculated from a variable, that variable must have type date, dateTime, or integer. If the variable is an integer, its values will be assumed to represent a year and must be in the range [0, 9999].
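The rules above can be sketched in a few lines of Python. This is only an illustration of the stated constraints, not how Redivis computes the range internally; the function name temporal_range and the use of Python's date and datetime types to stand in for the date and dateTime variable types are assumptions made for the example.

```python
from datetime import date, datetime


def temporal_range(values):
    """Return (min, max) of the non-null values, or None if there are none.

    Hypothetical helper: mirrors the documented rules, not Redivis internals.
    """
    present = [v for v in values if v is not None]
    if not present:
        return None

    # Integer variables are interpreted as years and must fall in [0, 9999].
    if all(isinstance(v, int) and not isinstance(v, bool) for v in present):
        for year in present:
            if not 0 <= year <= 9999:
                raise ValueError(f"year {year} is outside the range [0, 9999]")
        return (min(present), max(present))

    # Accept a variable that is uniformly dateTime or uniformly date.
    all_datetimes = all(isinstance(v, datetime) for v in present)
    all_dates = all(
        isinstance(v, date) and not isinstance(v, datetime) for v in present
    )
    if all_datetimes or all_dates:
        return (min(present), max(present))

    raise TypeError("variable must have type date, dateTime, or integer")
```

For example, temporal_range([1999, 2005, None, 2001]) yields the pair (1999, 2005), while a value such as 10000 is rejected because it cannot be read as a year.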
Sample
If this table is sampled, you will see a marker for whether you are looking at the full dataset or the 1% sample. To interact with sampled tables, add the dataset to a workflow.
Table types
All data on Redivis is stored within a table, including geospatial data and unstructured files:
Tabular
Tabular data is, unsurprisingly, stored within a table on Redivis, with its rows and columns mapped to the table's rows and variables. Table contents can be viewed in the cells viewer, queried, downloaded, and read into various programming interfaces as a tabular data frame.
Geospatial
Geospatial data is also stored within tables on Redivis. Each row maps to a geospatial feature, with various feature metadata encoded as variables, alongside a geometry variable that encodes the actual feature (which could be a point, line, polygon, multi-polygon, etc.).
When viewing the cells, you can preview a given geographic feature by hovering (or clicking) on the value in the cells view. Geospatial tables can also be queried (taking advantage of various geography methods), exported, or read as a geospatial data frame in various programming interfaces.
Files
When data files aren't inherently tabular or geospatial (e.g., a collection of 1 million images), they can still be uploaded to a dataset. In this case, these individual data files are also represented within a table, often referred to as a "file index table". Each record in the table represents a single file, with a globally unique file_id variable, as well as other variables containing metadata about the file.
When viewing the cells, you can preview a given file by hovering (or clicking) on the file_id value. The metadata in file index tables can also be queried, potentially allowing you to join on file names, or extract a subset of files based on certain characteristics. Finally, these files can be read and downloaded via various programming interfaces, or exported via the interface.
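To make the idea of querying a file index table concrete, here is a minimal sketch in plain Python that treats the table as a list of records. Only file_id comes from the description above; the file_name and size variables, the sample rows, the labels table, and both helper functions are hypothetical and exist only for illustration.

```python
# Hypothetical file index records: one row per file.
file_index = [
    {"file_id": "f-001", "file_name": "scan_001.png", "size": 2048},
    {"file_id": "f-002", "file_name": "scan_002.png", "size": 512},
    {"file_id": "f-003", "file_name": "notes.txt", "size": 128},
]

# Hypothetical second table keyed by file name.
labels = [
    {"file_name": "scan_001.png", "label": "normal"},
    {"file_name": "scan_002.png", "label": "abnormal"},
]


def join_on_file_name(index_rows, other_rows):
    """Inner-join two lists of records on their file_name value."""
    by_name = {row["file_name"]: row for row in other_rows}
    return [
        {**row, **by_name[row["file_name"]]}
        for row in index_rows
        if row["file_name"] in by_name
    ]


def select_file_ids(index_rows, predicate):
    """Return the file_id of every record for which predicate(row) is true."""
    return [row["file_id"] for row in index_rows if predicate(row)]


labeled_files = join_on_file_name(file_index, labels)
large_pngs = select_file_ids(
    file_index,
    lambda row: row["file_name"].endswith(".png") and row["size"] >= 1024,
)
```

The resulting list of file_id values is what you would then use to read or download only the files of interest.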
Access
Access to a particular table will always be governed by the owner(s) associated with the table's contents.
For tables within a dataset, your access to a table will be the same as access to that dataset. You must have metadata access in order to view variable names and summary statistics, and data access in order to view cells, run queries, and export data.
For tables within a workflow, you must have both view access to the workflow and corresponding access to all datasets whose data is present in that table. For example, if a particular workflow combines content from two datasets into a new output table, you'll need access to both datasets to view the table.
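The access rules above can be summarized as a small decision function. The sketch below is a simplified model, not Redivis's actual access system: the level names, the assumption that data access implies metadata access, the action names, and the function can_perform are all introduced here for illustration.

```python
# Assumed ordering: a higher rank includes everything below it.
ACCESS_RANK = {"none": 0, "metadata": 1, "data": 2}

# Minimum access level each action requires, per the rules described above.
REQUIRED_LEVEL = {
    "view_variable_names": "metadata",
    "view_summary_statistics": "metadata",
    "view_cells": "data",
    "run_query": "data",
    "export": "data",
}


def can_perform(action, dataset_access_levels, workflow_view_access=True):
    """Check whether an action on a table is allowed.

    dataset_access_levels holds the user's access level for every dataset
    whose data is present in the table. A dataset table has exactly one
    entry; a workflow table has one entry per contributing dataset.
    """
    needed = ACCESS_RANK[REQUIRED_LEVEL[action]]
    has_dataset_access = all(
        ACCESS_RANK[level] >= needed for level in dataset_access_levels
    )
    return workflow_view_access and has_dataset_access
```

For a workflow table built from two datasets, can_perform("view_cells", ["data", "metadata"]) is False, because data access to only one of the two source datasets is not enough.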
Bibliography
All tables automatically encode information about their lineage on Redivis. For example, if a table is created within a dataset, then transformed in a workflow, which is then forked into another workflow and joined with a new dataset, all of the information about the source datasets and workflows that created the table will be present in its bibliography.
This allows you to authoritatively cite any table on Redivis, making sure to credit everyone whose work contributed to a final output!
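One way to picture lineage is as a graph in which each table, workflow, and dataset points to the sources it was derived from. The sketch below walks such a graph to collect every upstream source. It is a conceptual illustration only: the node names, the parents mapping, and the function collect_sources are hypothetical, and nothing here reflects how Redivis stores or computes a bibliography.

```python
def collect_sources(node, parents):
    """Return every upstream source of node, sorted by name.

    parents maps a node name to the list of nodes it was derived from.
    """
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return sorted(seen)


# The example from the text: a dataset table is transformed in one workflow,
# which is forked into a second workflow and joined with a new dataset.
parents = {
    "output_table": ["workflow_2"],
    "workflow_2": ["workflow_1", "dataset_b"],
    "workflow_1": ["dataset_a"],
}

sources = collect_sources("output_table", parents)
```

Here sources contains both datasets and both workflows, which is the full set of contributors a citation for the output table would need to credit.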
