# Dataset

## *class* <mark style="color:purple;">Dataset</mark>

Datasets on Redivis are the entities where data is stored. A dataset is made up of tables, non-tabular files, and various metadata. Datasets can be owned by a user or an organization, and are version controlled.

## Constructors

<table data-header-hidden><thead><tr><th width="424">Method</th><th>Description</th></tr></thead><tbody><tr><td><a href="organization/organizationusddataset"><strong><code>Organization$dataset</code></strong></a>(dataset_reference)</td><td>Construct a new dataset instance that references a dataset owned by an organization.</td></tr><tr><td><a href="organization/organizationusdlist_datasets"><strong><code>Organization$list_datasets</code></strong></a>([max_results])</td><td>Returns a list of Datasets owned by an organization.</td></tr><tr><td><a href="user/userusddataset"><strong><code>User$dataset</code></strong></a>(dataset_reference)</td><td>Construct a new dataset instance that references a dataset owned by a user.</td></tr><tr><td><a href="user/userusdlist_datasets"><strong><code>User$list_datasets</code></strong></a>([max_results])</td><td>Returns a list of Datasets owned by a user.</td></tr></tbody></table>

## Examples

{% tabs %}
{% tab title="Basics" %}

```r
dataset <- redivis$organization("Demo")$dataset("US Fires")

# Will raise an error if the dataset doesn't exist
# Can first call dataset$exists() to check for existence
dataset$get()

print(dataset$properties) # A named list of dataset properties
```

{% endtab %}

{% tab title="Create" %}

```r
dataset <- redivis$user("my_username")$dataset("My dataset")

dataset$create(public_access_level="overview")

print(dataset$properties)
```

{% endtab %}

{% tab title="New version" %}

```r
dataset <- redivis$user("my_username")$dataset("My dataset")

dataset <- dataset$create_next_version()

# Coming soon: uploading data via R, see Python library

dataset$release()
```

{% endtab %}

{% tab title="Query" %}

```r
dataset <- redivis$organization("Demo")$dataset("CMS 2014 Medicare Data")

# The home_health_agencies table is assumed to be within the dataset,
#   since it isn't otherwise qualified
query <- dataset$query("
    SELECT * FROM home_health_agencies
    WHERE state = 'CA'
")

print(query$to_tibble())
```

{% endtab %}
{% endtabs %}
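
The Basics example notes that `get()` raises an error when the dataset doesn't exist. A minimal sketch of guarding that call with `exists()` might look like the following. The helper name `get_if_exists` is hypothetical and not part of the library, and the sketch assumes `exists()` returns a single `TRUE`/`FALSE` value.

```r
# Hypothetical helper (not part of redivis-r): fetch a dataset only if it exists.
# Assumes `dataset` exposes exists() and get() as documented on this page,
# and that exists() returns a single TRUE/FALSE value.
get_if_exists <- function(dataset) {
  if (!isTRUE(dataset$exists())) {
    return(NULL)  # Nothing to fetch; skip get() so no error is raised
  }
  dataset$get()   # Populates the properties on the instance
  dataset
}
```

For instance, `get_if_exists(redivis$organization("Demo")$dataset("US Fires"))` would return the populated dataset, or `NULL` if no such dataset exists.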

## Fields

| Name                      | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`organization`**        | A reference to the [Organization](https://docs.redivis.com/api/client-libraries/redivis-r/reference/organization) instance that constructed this dataset. Will be `NULL` if the dataset belongs to a user. |
| **`properties`**          | <p>A named list containing the <a href="../../../resource-definitions/dataset">API resource representation of the dataset</a>. This will be fully populated after calling <a href="dataset/datasetusdget">get()</a>, <a href="dataset/datasetusdcreate_next_version">create\_next\_version()</a>, or <a href="dataset/datasetusdrelease">release()</a>; otherwise it will be <code>NULL</code>. <br><br>It will also be partially populated for datasets returned via the <a href="organization/organizationusdlist_datasets">Organization$list\_datasets</a> and <a href="user/userusdlist_datasets">User$list\_datasets</a> methods.</p> |
| **`qualified_reference`** | The [fully qualified reference](https://docs.redivis.com/api/referencing-resources) for the dataset, which can be used in SQL queries or the REST API. E.g., `demo.ghcn_daily_weather_data:v1_1:7br5`                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| **`scoped_reference`**    | The canonical reference for the dataset, without the username qualifier. E.g., `ghcn_daily_weather_data:v1_1:7br5` |
| **`user`**                | A reference to the [User](https://docs.redivis.com/api/client-libraries/redivis-r/reference/user) instance that constructed this dataset. Will be `NULL` if the dataset belongs to an organization. |
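
As a rough illustration of reading these fields, the sketch below collects the two references and reports which kind of owner constructed the instance. The helper name `describe_reference` is hypothetical. It assumes the fields are readable with `$` and that whichever of `organization` or `user` is unset evaluates to `NULL` in R.

```r
# Hypothetical helper (not part of redivis-r): summarize how a dataset
# instance is referenced. Assumes the fields in the table above are readable
# with `$`, and that the unset one of `organization` / `user` is NULL.
describe_reference <- function(dataset) {
  owner_type <- if (!is.null(dataset$organization)) "organization" else "user"
  list(
    owner_type = owner_type,
    qualified_reference = dataset$qualified_reference,
    scoped_reference = dataset$scoped_reference
  )
}
```

Calling `describe_reference(redivis$organization("Demo")$dataset("US Fires"))` should then report an `owner_type` of `"organization"` alongside both reference strings.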

## Methods

<table data-header-hidden><thead><tr><th width="450"></th><th></th></tr></thead><tbody><tr><td><a href="dataset/datasetusdconnect_dbi"><strong><code>Dataset$connect_dbi</code></strong></a>()</td><td>Create a DBI connection scoped to the dataset.</td></tr><tr><td><a href="dataset/datasetusdcreate"><strong><code>Dataset$create</code></strong></a>([public_access_level, ...])</td><td>Create a new dataset.</td></tr><tr><td><a href="dataset/datasetusdcreate_next_version"><strong><code>Dataset$create_next_version</code></strong></a>([if_not_exists])</td><td>Create a "next" (unreleased) version on the dataset. Data can only be uploaded to unreleased versions.</td></tr><tr><td><a href="dataset/datasetusddelete"><strong><code>Dataset$delete</code></strong></a>()</td><td>Delete the dataset.</td></tr><tr><td><a href="dataset/datasetusdexists"><strong><code>Dataset$exists</code></strong></a>()</td><td>Check whether the dataset exists.</td></tr><tr><td><a href="dataset/datasetusdget"><strong><code>Dataset$get</code></strong></a>()</td><td>Get the dataset, populating the <code>properties</code> on the current instance.</td></tr><tr><td><a href="dataset/datasetusdlist_tables"><strong><code>Dataset$list_tables</code></strong></a>([max_results])</td><td>List all tables in the dataset.</td></tr><tr><td><a href="dataset/datasetusdlist_versions"><strong><code>Dataset$list_versions</code></strong></a>([max_results])</td><td>List all versions for the dataset.</td></tr><tr><td><a href="dataset/datasetusdnext_version"><strong><code>Dataset$next_version</code></strong></a>()</td><td>Return a reference to the dataset at the version subsequent to the currently referenced version.</td></tr><tr><td><a href="dataset/datasetusdprevious_version"><strong><code>Dataset$previous_version</code></strong></a>()</td><td>Return a reference to the dataset at the version prior to the currently referenced version.</td></tr><tr><td><a 
href="dataset/datasetusdquery"><strong><code>Dataset$query</code></strong></a>(query_string)</td><td>Create a query scoped to the dataset.</td></tr><tr><td><a href="dataset/datasetusdrelease"><strong><code>Dataset$release</code></strong></a>()</td><td>Release the <code>next</code> version of the dataset.</td></tr><tr><td><a href="dataset/datasetusdtable"><strong><code>Dataset$table</code></strong></a>(table_reference)</td><td>Create a reference to a specific table within the dataset.</td></tr><tr><td><a href="dataset/datasetusdunrelease"><strong><code>Dataset$unrelease</code></strong></a>()</td><td>Unrelease the <code>current</code> version of the dataset.</td></tr><tr><td><a href="dataset/datasetusdupdate"><strong><code>Dataset$update</code></strong></a>([name, public_access_level, ...])</td><td>Update certain attributes on the dataset.</td></tr><tr><td><a href="dataset/datasetusdupdate_variables"><strong><code>Dataset$update_variables</code></strong></a>(variables)</td><td>Bulk update variable metadata across tables in the dataset.</td></tr><tr><td><a href="dataset/datasetusdversion"><strong><code>Dataset$version</code></strong></a>([tag])</td><td>Create a reference to a version instance at a particular tag.</td></tr></tbody></table>
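
To show how the listing methods might be combined, here is a small sketch that counts the tables and versions of a dataset. The helper name `count_contents` is hypothetical. The sketch assumes the optional argument is named `max_results`, as in the signatures above, and that both methods return an R list.

```r
# Hypothetical helper (not part of redivis-r): count the tables and versions
# of a dataset. Assumes list_tables() and list_versions() accept a
# `max_results` argument and each return an R list.
count_contents <- function(dataset, max_results = 100) {
  tables <- dataset$list_tables(max_results = max_results)
  versions <- dataset$list_versions(max_results = max_results)
  c(tables = length(tables), versions = length(versions))
}
```

For example, `count_contents(redivis$organization("Demo")$dataset("US Fires"))` would return a named vector with one count per kind. Because `max_results` caps both listings, the counts are lower bounds for datasets with more than 100 tables or versions.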
