Dataset nodes

A dataset node is a copy of a dataset in a project. A project can contain only one copy of a given dataset at a time. Dataset nodes do not count toward your storage quota.

Add a dataset to the project

You can add any dataset to a project, but you won't be able to see data that you don't have access to.

To work with additional datasets within a project, click + Dataset in the minimap.

In the modal, you can search datasets on the Redivis site and click on a dataset to add it to your project.

Samples

Some large datasets have 1% samples, which are useful for quickly and inexpensively testing query strategies before running transforms against the full dataset.

If a 1% sample is available for a dataset, it will be added to your project by default instead of the full dataset. Samples are indicated by the dark circle icon at the top left of a dataset in the minimap.

You can see which variable a dataset is sampled on in the menu bar of the dataset. If an administrator samples on the same variable across datasets, the same group of values will be used (so joining two datasets with 1% samples will still result in a 1% sample).
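To see why sampling on a shared variable keeps joins consistent, the sketch below uses a deterministic hash of the sampled value to decide membership. This is only an illustration and not a description of how Redivis builds its samples: the `in_sample` helper, the `patient_id` variable, and both tables are hypothetical.

```python
import hashlib


def in_sample(value, percent=1):
    """Deterministically decide whether a value belongs to the sample.

    Hypothetical helper: the same value always gets the same answer,
    no matter which table it appears in.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % 100 < percent


# Two hypothetical tables that share a patient_id variable.
patients = [{"patient_id": i, "age": 20 + i % 60} for i in range(10_000)]
visits = [{"patient_id": i, "visit_count": i % 7} for i in range(10_000)]

# Sample each table independently, but on the same variable.
patients_sample = [row for row in patients if in_sample(row["patient_id"])]
visits_sample = [row for row in visits if in_sample(row["patient_id"])]

# Join the two samples on patient_id.
visits_by_id = {row["patient_id"]: row for row in visits_sample}
joined = [
    {**row, **visits_by_id[row["patient_id"]]}
    for row in patients_sample
    if row["patient_id"] in visits_by_id
]

# Every sampled patient finds a match, so no rows are lost in the join.
print(len(joined) == len(patients_sample))
```

Because membership depends only on the value of the sampled variable, both samples contain the same identifiers and the join keeps all of them. Had each table been sampled at random and independently, only about 1% of the already-sampled rows would be expected to match.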

To switch to the full sample, use the menu in the top right of the dataset view. Your downstream transforms and tables will become stale, since an upstream change has been made. You can run these nodes individually to update their contents, or use the run all functionality in the top bar or project menu.
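One way to picture which nodes become stale is to treat the project as a directed graph and walk everything downstream of the changed dataset. The sketch below is a conceptual model only, not Redivis code: the `downstream_nodes` function and the node names are hypothetical.

```python
from collections import deque


def downstream_nodes(edges, changed):
    """Return every node reachable from `changed`, in breadth-first order.

    `edges` maps a node name to the list of nodes that read from it.
    """
    stale = []
    seen = {changed}
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                stale.append(child)
                queue.append(child)
    return stale


# Hypothetical project: two datasets feeding a chain of transforms and tables.
project = {
    "dataset": ["transform_a"],
    "transform_a": ["table_a"],
    "table_a": ["transform_b"],
    "other_dataset": ["transform_b"],
    "transform_b": ["table_b"],
}

stale = downstream_nodes(project, "dataset")
print(stale)  # ['transform_a', 'table_a', 'transform_b', 'table_b']
```

In this model, re-running the stale nodes in the order returned brings each one up to date before the nodes that depend on it, which is roughly what a run-all action would need to do.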

Versions

When a new version of a dataset is released by an administrator, the corresponding dataset node on your project minimap will become purple. To upgrade the dataset's version, click the version menu in the top right of the dataset.

You can view the data of the new version (and any previous versions) in the modal, along with the edits used to create it, release notes, and release details.

If you want to use a different version in this project, click the Use this version button in the top right of any version you aren't currently using.

Your downstream transforms and tables will become stale, since an upstream change has been made. You can run these nodes individually to update their contents, or use the run all functionality in the top bar or project menu.