Updating a dataset

Once a dataset is released, you can return to it to make changes at any time. Changes to datasets are tracked in Redivis as versions. Anyone with access can view and work with any version of the dataset.

Making changes

How you work with versions depends on what you are changing:

  • Edits to the data itself will need to be released as a new version before anyone can see them.

  • Edits to the dataset information, table information, or variable metadata can be made on the current version and will be live as soon as they are saved. Alternatively, if you would like to preserve the existing dataset as it is, you can create a new version and make the updates there instead. No one will be able to see these updates until the new version is released.

  • Edits to the dataset name and access configuration (and, for administrators only, the published status) will always affect all versions.

Creating the next version

All data within a dataset is encapsulated in discrete, immutable versions. Every part of the dataset except for the name, access configuration, and published status is versioned. All tables in a dataset are versioned together.

The first version of the dataset is created automatically for you. Once it is released, you can create a new version at any time by clicking the "Create next version" button in the top right.

Once you've created a new version you can make changes to any part of the dataset.

Replace vs append

By default, each subsequent version builds on the previous version of the dataset.

However, when uploading data you may choose to completely overwrite the existing content instead. When you open a table, you will see an "Upload data" button on the right side of the table. Here you will need to choose whether the data you upload should be appended to the current table, or should erase the table's existing contents and replace them from scratch. Once you've made a decision, you can begin uploading your new data.
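The difference between the two choices can be sketched with a small toy model. This is only an illustration of the semantics described above, not Redivis code: the function apply_upload, the mode strings, and the sample rows are all hypothetical names invented for this example.

```python
from typing import Dict, List

Row = Dict[str, object]


def apply_upload(previous_rows: List[Row], uploaded_rows: List[Row], mode: str) -> List[Row]:
    """Toy model (hypothetical, not a Redivis API): contents of a table in the next version."""
    if mode == "append":
        # The next version builds on the previous one: old rows are kept, new rows are added.
        return list(previous_rows) + list(uploaded_rows)
    if mode == "replace":
        # The next version starts from scratch: only the uploaded rows remain.
        return list(uploaded_rows)
    raise ValueError(f"unknown mode: {mode!r}")


# Rows in the table as of the previous (released) version.
v1_rows = [{"id": 1}, {"id": 2}]
# Rows in the new upload.
upload = [{"id": 3}]

appended = apply_upload(v1_rows, upload, "append")
replaced = apply_upload(v1_rows, upload, "replace")
```

In this sketch, appended holds all three rows while replaced holds only the uploaded row. In both cases v1_rows is left untouched, mirroring the fact that a released version is immutable.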

There is a maximum of 4000 versions allowed per dataset. This limit may be lower if you regularly replace (rather than append) data, depending on how the replaced data overlaps with existing data. Even in such cases, it is reasonable to expect that the maximum version count will be at least 1000.

Storage and archival

Redivis computes row-level diffs for each version, efficiently storing the complete version history in one master table. This allows you to regularly release new versions and maintain a robust version history without ballooning storage costs.
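To make the idea of row-level diffs concrete, here is a minimal sketch of one way a full version history could be kept in a single table. It is an assumption for illustration only: this page does not describe Redivis's actual storage format, and the names StoredRow, VersionedTable, added_in, and removed_in are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class StoredRow:
    values: Tuple
    added_in: int                     # first version that contains this row
    removed_in: Optional[int] = None  # first version that no longer contains it (None = still present)


class VersionedTable:
    """Toy 'master table' (hypothetical): each distinct row is stored once with a version range."""

    def __init__(self) -> None:
        self.rows: List[StoredRow] = []
        self.latest = 0

    def release(self, new_rows: List[Tuple]) -> int:
        """Record a new version whose complete contents are new_rows."""
        self.latest += 1
        remaining = list(new_rows)
        for stored in self.rows:
            if stored.removed_in is not None:
                continue  # row was already dropped in an earlier version
            if stored.values in remaining:
                remaining.remove(stored.values)  # unchanged row: nothing new to store
            else:
                stored.removed_in = self.latest  # row was deleted in this version
        for values in remaining:
            self.rows.append(StoredRow(values, added_in=self.latest))
        return self.latest

    def rows_at(self, version: int) -> List[Tuple]:
        """Reconstruct the table contents as of a given version."""
        return [
            r.values
            for r in self.rows
            if r.added_in <= version and (r.removed_in is None or version < r.removed_in)
        ]


table = VersionedTable()
table.release([("a", 1), ("b", 2)])            # version 1
table.release([("a", 1), ("b", 2), ("c", 3)])  # version 2 adds one row
table.release([("a", 1), ("c", 3)])            # version 3 drops one row
```

In this sketch, three versions containing seven rows in total are held as only three stored rows, and any version can still be reconstructed with rows_at. Rows that are unchanged between versions add no storage, which is the property that keeps a long version history inexpensive.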

After some time, the tables associated with historic versions will automatically be archived. Archived tables can still be used and queried just like any other table; the sole limitation is that the cell preview is not available on archived tables.