# Dataset

## *class* <mark style="color:purple;">Dataset</mark>

Datasets on Redivis are the entities where data is stored. A dataset is made up of tables, non-tabular files, and various metadata. Datasets can be owned by a user or an organization, and are version controlled.

## Constructors

<table data-header-hidden><thead><tr><th width="424">Method</th><th>Description</th></tr></thead><tbody><tr><td><a href="/pages/TUqISflwiQlyxeznqIwj"><strong><code>Organization$dataset</code></strong></a>(dataset_reference)</td><td>Construct a new dataset instance that references a dataset owned by an organization.</td></tr><tr><td><a href="/pages/5KwBABJcRrGGcVBV22hC"><strong><code>Organization$list_datasets</code></strong></a>([max_results])</td><td>Returns a list of Datasets owned by an organization.</td></tr><tr><td><a href="/pages/T7SZUdyepC6YyUHvYFUR"><strong><code>User$dataset</code></strong></a>(dataset_reference)</td><td>Construct a new dataset instance that references a dataset owned by a user.</td></tr><tr><td><a href="/pages/ptdNcXlOskxzx5r0YXaO"><strong><code>User$list_datasets</code></strong></a>([max_results])</td><td>Returns a list of Datasets owned by a user.</td></tr></tbody></table>

## Examples

{% tabs %}
{% tab title="Basics" %}

```r
dataset <- redivis$organization("Demo")$dataset("US Fires")

# Will raise an error if the dataset doesn't exist
# Can first call dataset$exists() to check for existence
dataset$get()

print(dataset$properties) # A named list of dataset properties
```

{% endtab %}

{% tab title="Create" %}

```r
dataset <- redivis$user("my_username")$dataset("My dataset")

dataset$create(public_access_level="overview")

print(dataset$properties)
```

{% endtab %}

{% tab title="New version" %}

```r
dataset <- redivis$user("my_username")$dataset("My dataset")

dataset <- dataset$create_next_version()

# Coming soon: uploading data via R, see Python library

dataset$release()
```

{% endtab %}

{% tab title="Query" %}

```r
dataset <- redivis$organization("Demo")$dataset("CMS 2014 Medicare Data")

# The home_health_agencies table is assumed to be within the dataset,
#   since it isn't otherwise qualified
query <- dataset$query("
    SELECT * FROM home_health_agencies
    WHERE state = 'CA'
")

print(query$to_tibble())
```

{% endtab %}
{% endtabs %}
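
The "Basics" example notes that `get()` raises an error when the dataset does not exist. A minimal sketch of the check-then-fetch pattern follows, using only `exists()`, `get()`, and `properties` from this page. `get_if_exists` is a hypothetical helper name, not part of the library, and the commented usage assumes `library(redivis)` has been loaded and credentials are configured.

```r
# Hypothetical helper (not part of redivis): fetch a dataset only if it exists.
# `dataset` is any object exposing exists() and get(), such as a Dataset reference.
get_if_exists <- function(dataset) {
  if (!isTRUE(dataset$exists())) {
    return(NULL)  # Nothing to fetch; the caller decides how to handle this
  }
  dataset$get()   # Populates dataset$properties on the instance
  dataset
}

# Example usage (requires the redivis package and valid credentials):
# dataset <- get_if_exists(redivis$organization("Demo")$dataset("US Fires"))
# if (is.null(dataset)) message("Dataset not found") else print(dataset$properties)
```

Returning `NULL` rather than raising keeps the existence check and the fetch in one place, at the cost of one extra request per call.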

## Fields

| Name                      | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`organization`**        | A reference to the [Organization](/api/client-libraries/redivis-r/reference/organization.md) instance that constructed this dataset. Will be `None` if the dataset belongs to a user.                                                                                                                                                                                                                                                                                                                                                                                                                             |
| **`properties`**          | <p>A named list containing the <a href="/pages/-LztdIp8_7QkyaSUOODw">API resource representation of the dataset</a>. This will be fully populated after calling <a href="/pages/T6ri1UEhsQjgE5eA4dzc">get()</a>, <a href="/pages/l4FKBaAfBKIwkK0g6MnE">create\_next\_version()</a>, and <a href="/pages/zgycI8uDb98e5Z6WJ3z0">release()</a>, otherwise will be <code>None</code>. <br><br>This will also be partially populated for datasets returned via the <a href="/pages/5KwBABJcRrGGcVBV22hC">Organization$list\_datasets</a> and <a href="/pages/ptdNcXlOskxzx5r0YXaO">User$list\_datasets</a> methods</p> |
| **`qualified_reference`** | The [fully qualified reference](/api/referencing-resources.md) for the dataset, which can be used in SQL queries or the REST API. E.g., `demo.ghcn_daily_weather_data:v1_1:7br5`                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| **`scoped_reference`**    | The canonical reference for the dataset, without the username qualifier. E.g.,: `ghcn_daily_weather_data:v1_1:7br5`                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| **`user`**                | A reference to the [User](/api/client-libraries/redivis-r/reference/user.md) instance that constructed this dataset. Will be `None` if the dataset belongs to an organization.                                                                                                                                                                                                                                                                                                                                                                                                                                    |
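
To illustrate how the two reference fields relate, the sketch below gathers them into a named character vector. `describe_references` is a hypothetical helper, not part of the library, and it assumes both fields are available as character strings on the dataset reference (the commented usage calls `get()` first to be safe).

```r
# Hypothetical helper (not part of redivis): gather a dataset's reference fields.
describe_references <- function(dataset) {
  c(
    qualified = dataset$qualified_reference,  # e.g. "demo.ghcn_daily_weather_data:v1_1:7br5"
    scoped    = dataset$scoped_reference      # e.g. "ghcn_daily_weather_data:v1_1:7br5"
  )
}

# Example usage (requires the redivis package and valid credentials):
# dataset <- redivis$organization("Demo")$dataset("US Fires")
# dataset$get()
# print(describe_references(dataset))
```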

## Methods

<table data-header-hidden><thead><tr><th width="450"></th><th></th></tr></thead><tbody><tr><td><a href="/pages/xgmn8svdhKqxYMKyA2fp"><strong><code>Dataset$connect_dbi</code></strong></a>()</td><td>Create a DBI connection scoped to the dataset.</td></tr><tr><td><a href="/pages/qNZje0MAOKHhJ1b6SK69"><strong><code>Dataset$create</code></strong></a>([public_access_level, ...])</td><td>Create a new dataset.</td></tr><tr><td><a href="/pages/l4FKBaAfBKIwkK0g6MnE"><strong><code>Dataset$create_next_version</code></strong></a>([if_not_exists])</td><td>Create a "next" (unreleased) version on the dataset. Data can only be uploaded to unreleased versions.</td></tr><tr><td><a href="/pages/Zu4GeLNvZZPIPsr6j4KL"><strong><code>Dataset$delete</code></strong></a>()</td><td>Delete the dataset.</td></tr><tr><td><a href="/pages/sJ8nuLAQGlE9sN1qxvyh"><strong><code>Dataset$exists</code></strong></a>()</td><td>Check whether the dataset exists.</td></tr><tr><td><a href="/pages/T6ri1UEhsQjgE5eA4dzc"><strong><code>Dataset$get</code></strong></a>()</td><td>Get the dataset, populating the <code>properties</code> on the current instance.</td></tr><tr><td><a href="/pages/hu1J1yKNvLdfWsBsNq2x"><strong><code>Dataset$list_tables</code></strong></a>([max_results])</td><td>List all tables in the dataset.</td></tr><tr><td><a href="/pages/ljJjYTJE5IpGXNSp42yP"><strong><code>Dataset$list_versions</code></strong></a>([max_results])</td><td>List all versions for the dataset.</td></tr><tr><td><a href="/pages/4xMYxvPXd09YDK0ybEv6"><strong><code>Dataset$next_version</code></strong></a>()</td><td>Return a reference to the dataset at the version subsequent to the currently referenced version.</td></tr><tr><td><a href="/pages/NsXqXdGVFnDHCF4ph5yy"><strong><code>Dataset$previous_version</code></strong></a>()</td><td>Return a reference to the dataset at the version prior to the currently referenced version.</td></tr><tr><td><a 
href="/pages/M9V0iAi2XqdG42gC2hd5"><strong><code>Dataset$query</code></strong></a>(query_string)</td><td>Create a query scoped to the dataset.</td></tr><tr><td><a href="/pages/zgycI8uDb98e5Z6WJ3z0"><strong><code>Dataset$release</code></strong></a>()</td><td>Release the <code>next</code> version of the dataset.</td></tr><tr><td><a href="/pages/IKVZOb4qlHQhYdGNSnnJ"><strong><code>Dataset$table</code></strong></a>(table_reference)</td><td>Create a reference to a specific table within the dataset.</td></tr><tr><td><a href="/pages/GpA5SrxJGbas1xCpqeyf"><strong><code>Dataset$unrelease</code></strong></a>()</td><td>Unrelease the <code>current</code> version of the dataset.</td></tr><tr><td><a href="/pages/YmoW4HdNOuq18vLTH0ql"><strong><code>Dataset$update</code></strong></a>([name, public_access_level, ...])</td><td>Update certain attributes on the dataset.</td></tr><tr><td><a href="/pages/ebhRnl3XmN9o3zRmayYF"><strong><code>Dataset$update_variables</code></strong></a>(variables)</td><td>Bulk update variable metadata across tables in a dataset.</td></tr><tr><td><a href="/pages/u0zj8MS3VfZiDawAnxc6"><strong><code>Dataset$version</code></strong></a>([tag])</td><td>Create a reference to a <a href="/pages/caYLih4mldc0Fk30YHLN">version instance</a> at a particular tag.</td></tr></tbody></table>
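
As a rough sketch of how the listing methods above might be combined, the helper below counts the tables and versions of a dataset. `summarize_contents` is a hypothetical name, not part of the library. The sketch assumes that `list_tables()` and `list_versions()` each return an R list and accept `max_results` as a named argument, which this page suggests but does not spell out.

```r
# Hypothetical helper (not part of redivis): count a dataset's tables and versions.
# Assumes list_tables() and list_versions() return R lists.
summarize_contents <- function(dataset, max_results = 100) {
  tables   <- dataset$list_tables(max_results = max_results)
  versions <- dataset$list_versions(max_results = max_results)
  list(n_tables = length(tables), n_versions = length(versions))
}

# Example usage (requires the redivis package and valid credentials):
# dataset <- redivis$organization("Demo")$dataset("US Fires")
# str(summarize_contents(dataset, max_results = 50))
```

Because `max_results` caps each listing, the counts are lower bounds whenever the dataset holds more items than the cap.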

