Releasing a version

Overview

The Release tab of the data editor allows you to lock the next version of the dataset and release it to everyone who has access. Once all required fields describing this release are filled in, you can click the Release button to see an overview and release it.

Note that releasing a version is final and the data can't be edited once it has been released. However, you can return to edit the description, temporal range, release notes, and variable metadata.

It is highly recommended to follow the procedures outlined in Auditing data before releasing your version.

Fill out the description, sample, and temporal range information to finalize a release.

Version description

This required field is for a short description of the release. This will appear in the left bar when users are browsing the version history of this dataset. This information will be visible to anyone with "Overview" access to the dataset.

1% Sample

This required field specifies whether and how to generate a 1% sample for this version. These samples are used in projects to limit querying costs. This section allows for several options:

No sample

For smaller datasets, it likely won't make sense to create a 1% sample, and you can select No sample.

Random sample

Randomly chooses records to be in the sample table (each record has a 1% chance of being sampled). Note that this is non-deterministic; the 1% sample of two identical tables won't be the same. Your table must have at least 1,000 records in order to be sampled.
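As a rough illustration of the behavior described above, the following Python sketch gives each record an independent 1% chance of selection. The function name `random_sample` and its parameters are hypothetical and are not part of the product; the sketch only mirrors the stated rules (a 1% chance per record and a 1,000-record minimum).

```python
import random

MIN_RECORDS = 1_000   # minimum table size stated above
SAMPLE_RATE = 0.01    # each record has a 1% chance of selection


def random_sample(records, rate=SAMPLE_RATE, rng=None):
    """Return the records that each independently pass a `rate` coin flip."""
    if len(records) < MIN_RECORDS:
        raise ValueError(f"need at least {MIN_RECORDS} records to sample")
    rng = rng or random.Random()  # unseeded, so results differ between runs
    return [record for record in records if rng.random() < rate]


if __name__ == "__main__":
    table = list(range(100_000))
    first = random_sample(table)
    second = random_sample(table)
    # Sizes hover around 1,000, and the two samples will generally differ.
    print(len(first), len(second), first == second)
```

Because every record is an independent draw, the sample size is only approximately 1% of the table, and sampling the same table twice will usually select different records.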

Sample on a variable

You can select any variable in the dataset with at least 158,000 unique values to sample on. Every value for this variable will have a 1% chance of being in the output set; importantly, this sampling is deterministic. This guarantees that the same values that fall in the 1% sample for one dataset will also occur in the 1% sample for another dataset.

For example, given two datasets that contain a variable patient_id, sampling on this variable guarantees that the same patient_id values will be included in both datasets' samples. If a patient_id is in the sample, all records with that patient_id from the dataset will also be in the sample. This ensures that joins on patient_id across 1% samples will reflect a consistent 1% sample of the data.
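One common way to get this kind of deterministic behavior is to hash each value and keep the values that land in one bucket out of 100. The sketch below takes that approach; the choice of SHA-256, the bucket rule, and the names `in_sample` and `sample_on` are assumptions for illustration, since the actual sampling function used by the platform is not described here.

```python
import hashlib


def in_sample(value, buckets=100):
    """True when the value's string form hashes into bucket 0 of `buckets`.

    Assumption: SHA-256 stands in for whatever hash the platform uses.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return int(digest, 16) % buckets == 0


def sample_on(rows, variable):
    """Keep every row whose `variable` value falls in the sample."""
    return [row for row in rows if in_sample(row[variable])]


if __name__ == "__main__":
    # Two hypothetical datasets sharing the same 5,000 patient_id values.
    visits = [{"patient_id": i % 5_000, "visit": i} for i in range(20_000)]
    labs = [{"patient_id": i, "lab": "a1c"} for i in range(5_000)]

    visit_ids = {r["patient_id"] for r in sample_on(visits, "patient_id")}
    lab_ids = {r["patient_id"] for r in sample_on(labs, "patient_id")}
    # The same patients are selected in both samples.
    print(visit_ids == lab_ids)
```

Since the decision depends only on the value itself, every record sharing a sampled patient_id is kept, and a join between the two samples loses no matching records.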

Note that the sample will be computed on the string representation of the variable. For example, if the value '1234' falls in the 1% sample, then we are guaranteed that the integer value 1234 will also fall within the sample. However, if this value is stored as a float (1234.0), it is unlikely to also fall in the sample, as the string representation of this float is '1234.0', which for the purposes of sampling is entirely different from the string '1234'.
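The effect of the string representation can be seen in a small Python sketch. Python's `str()` and SHA-256 are used here as stand-ins; the platform's own string conversion and hash function are assumptions, and the helper name `digest_of` is hypothetical.

```python
import hashlib


def digest_of(value):
    """Hash the string form of a value (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(str(value).encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # The integer and the string share one string form, so they hash alike.
    print(str(1234) == "1234", digest_of(1234) == digest_of("1234"))
    # The float has a different string form, so its hash is unrelated.
    print(str(1234.0), digest_of(1234.0) == digest_of("1234"))
```

If the same identifier is stored as an integer in one table and as a float in another, converting both to a common type before sampling keeps their samples aligned.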

Temporal range

This required field represents the time range covered by the records in this dataset. This will be displayed on the Overview tab of the dataset page. The temporal range can be an integer (year), a date, or a dateTime.

If there is a variable in the dataset which represents the time range, select that variable from the dropdown menu and the range will be automatically calculated.
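Conceptually, the automatic calculation amounts to taking the earliest and latest values of the chosen variable. The sketch below shows that idea in Python; the function name `temporal_range` is hypothetical, and skipping empty values is an assumption rather than documented behavior.

```python
from datetime import date


def temporal_range(values):
    """Return (earliest, latest) over the non-empty values, or None if none exist."""
    present = [v for v in values if v is not None]  # assumption: empty values are skipped
    if not present:
        return None
    return min(present), max(present)


if __name__ == "__main__":
    admitted = [date(2020, 1, 5), None, date(2018, 3, 1), date(2019, 7, 9)]
    print(temporal_range(admitted))
    # Integer years work the same way.
    print(temporal_range([2001, 1999, 2010]))
```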

If there is no representative variable, you can enter the time range manually by selecting Set range manually.

If this dataset does not cover any range of time, you can select No temporal range for this version.

Release notes

This optional field is for you to provide further information about this release. You can describe any changes you made or include more details on the state of the data in this version.

The release notes support markdown, as well as embedded images and other files.

Release

When the table is done and you've completed all required fields, you can click the Release button on the right side of this tab. This will bring up your final check before releasing your data.

Upon confirmation, the release process will begin. This may take some time for larger datasets. You can leave the page at any time; when the release process is completed, the new version will be available for authorized users to work with.