# Dataset

## *class* <mark style="color:purple;">Dataset</mark>

Datasets are the entities on Redivis where data is stored. Each dataset is made up of tables, non-tabular files, and associated metadata. A dataset is owned by either a user or an organization, and is version controlled.

## Constructors

<table data-header-hidden><thead><tr><th width="424">Method</th><th>Description</th></tr></thead><tbody><tr><td><a href="/pages/7azWEvDBRuGUjqHmKvHl"><strong><code>Organization.dataset</code></strong></a>(dataset_reference)</td><td>Construct a new dataset instance that references a dataset owned by an organization.</td></tr><tr><td><a href="/pages/Vljer0vchX1XkdgWFFaK"><strong><code>Organization.list_datasets</code></strong></a>([max_results])</td><td>Returns a list of Datasets owned by an organization.</td></tr><tr><td><a href="/pages/b2nlrr6n4SmM5pKC2wuj"><strong><code>User.dataset</code></strong></a>(dataset_reference)</td><td>Construct a new dataset instance that references a dataset owned by a user.</td></tr><tr><td><a href="/pages/Aw5yLtNCZIe1r7wisnUb"><strong><code>User.list_datasets</code></strong></a>([max_results])</td><td>Returns a list of Datasets owned by a user.</td></tr></tbody></table>

## Examples

{% tabs %}
{% tab title="Basics" %}

```python
import redivis

dataset = redivis.organization("Demo").dataset("US Fires")

# Raises an error if the dataset doesn't exist;
# call dataset.exists() first to check for existence
dataset.get()

print(dataset.properties)
```

{% endtab %}

{% tab title="Create" %}

```python
import redivis

dataset = redivis.user("my_username").dataset("My dataset")

dataset.create(public_access_level="overview")

print(dataset.properties)
```

{% endtab %}

{% tab title="New version" %}

```python
import redivis

dataset = redivis.user("my_username").dataset("My dataset")

dataset = dataset.create_next_version()

# We can upload new data to existing tables once we have a "next" version
with open("data.csv", "rb") as file:
    dataset.table("My table").upload("data.csv").upload_file(file)

dataset.release()
```

{% endtab %}

{% tab title="Query" %}

```python
import redivis

dataset = redivis.organization("Demo").dataset("CMS 2014 Medicare Data")

# The home_health_agencies table is assumed to be within the dataset,
#   since it isn't otherwise qualified
query = dataset.query("""
    SELECT * FROM home_health_agencies
    WHERE state = 'CA'
""")

print(query.to_dataframe())
```

{% endtab %}
{% endtabs %}

## Attributes

| **`organization`**        | A reference to the [Organization](/api/client-libraries/redivis-python/reference/organization.md) instance that constructed this dataset. Will be `None` if the dataset belongs to a user.                                                                                                                                                                                                                                                                                                                                                                                                                  |
| ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **`properties`**          | <p>A dict containing the <a href="/pages/-LztdIp8_7QkyaSUOODw">API resource representation of the dataset</a>. This is fully populated after calling <a href="/pages/tcvIrAQzBE3q9jGql2Nv">get()</a>, <a href="/pages/u7jme5xustry05C8i2to">create\_next\_version()</a>, or <a href="/pages/5OprI1LA9O92QGoYi20j">release()</a>; otherwise it is <code>None</code>. <br><br>It is also partially populated for datasets returned by the <a href="/pages/Vljer0vchX1XkdgWFFaK">Organization.list\_datasets</a> and <a href="/pages/Aw5yLtNCZIe1r7wisnUb">User.list\_datasets</a> methods.</p> |
| **`qualified_reference`** | The [fully qualified reference](/api/referencing-resources.md) for the dataset, which can be used in SQL queries or the REST API. E.g., `demo.ghcn_daily_weather_data:v1_1:7br5`                                                                                                                                                                                                                                                                                                                                                                                                                            |
| **`scoped_reference`**    | The canonical reference for the dataset, without the username qualifier. E.g., `ghcn_daily_weather_data:v1_1:7br5` |
| **`user`**                | A reference to the [User](/api/client-libraries/redivis-python/reference/user.md) instance that constructed this dataset. Will be `None` if the dataset belongs to an organization.                                                                                                                                                                                                                                                                                                                                                                                                                         |
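
The relationship between `qualified_reference` and `scoped_reference` can be illustrated with plain string handling. This is a sketch based only on the two example values in the table above: `split_qualified_reference` is a hypothetical helper, not a library function, and it assumes the owner name is everything before the first `.`.

```python
def split_qualified_reference(qualified_reference):
    """Split 'owner.rest' into (owner, scoped_reference).

    Assumption: the owner qualifier is the text before the first '.',
    as in the documented examples. This is not an official parser.
    """
    owner, separator, scoped_reference = qualified_reference.partition(".")
    if not separator or not owner or not scoped_reference:
        raise ValueError(f"not a qualified reference: {qualified_reference!r}")
    return owner, scoped_reference


owner, scoped = split_qualified_reference("demo.ghcn_daily_weather_data:v1_1:7br5")
print(owner)
print(scoped)
```

In practice, reading `dataset.qualified_reference` and `dataset.scoped_reference` directly is preferable to parsing strings; the helper only makes the difference between the two forms explicit.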

## Methods

<table data-header-hidden><thead><tr><th width="450"></th><th></th></tr></thead><tbody><tr><td><a href="/pages/lpWUuUiKM86vRyciDWGy"><strong><code>Dataset.add_labels</code></strong></a>(labels)</td><td>Add labels to a dataset.</td></tr><tr><td><a href="/pages/PriVDRMKz1jzvmxBbO5t"><strong><code>Dataset.create</code></strong></a>([*, public_access_level, ...])</td><td>Create a new dataset.</td></tr><tr><td><a href="#create_next_version-if_not_exists-false"><strong><code>Dataset.create_next_version</code></strong></a>([*, if_not_exists])</td><td>Create a "next" (unreleased) version on the dataset. Data can only be uploaded to unreleased versions.</td></tr><tr><td><a href="/pages/ZZO2BGB0PuHuwVbyHSKi"><strong><code>Dataset.delete</code></strong></a>()</td><td>Delete the dataset.</td></tr><tr><td><a href="/pages/WdKccb69eJ8A5pO2XGHQ"><strong><code>Dataset.exists</code></strong></a>()</td><td>Check whether the dataset exists.</td></tr><tr><td><a href="/pages/tcvIrAQzBE3q9jGql2Nv"><strong><code>Dataset.get</code></strong></a>()</td><td>Get the dataset, populating the <code>properties</code> attribute on the current instance.</td></tr><tr><td><a href="/pages/l1YDxfyABzZz6iTnBhfK"><strong><code>Dataset.list_tables</code></strong></a>([max_results])</td><td>List all tables in the dataset.</td></tr><tr><td><a href="/pages/sbo9deQXFzEyeAUTXa8a"><strong><code>Dataset.list_versions</code></strong></a>([max_results])</td><td>List all versions for the dataset.</td></tr><tr><td><a href="/pages/rT1Nu5E16xXLMv4HpRaz"><strong><code>Dataset.next_version</code></strong></a>()</td><td>Return a reference to the dataset at the version subsequent to the currently referenced version.</td></tr><tr><td><a href="/pages/rV42YZL1YQbFNScPawvG"><strong><code>Dataset.previous_version</code></strong></a>()</td><td>Return a reference to the dataset at the version prior to the currently referenced version.</td></tr><tr><td><a 
href="/pages/ezf6V3pNDLBUf513M2v2"><strong><code>Dataset.query</code></strong></a>(query_string)</td><td>Create a query scoped to the dataset.</td></tr><tr><td><a href="/pages/5OprI1LA9O92QGoYi20j"><strong><code>Dataset.release</code></strong></a>()</td><td>Release the <code>next</code> version of the dataset.</td></tr><tr><td><a href="/pages/urJkhH84oooJVq7DYW8J"><strong><code>Dataset.remove_labels</code></strong></a>()</td><td>Remove labels from a dataset.</td></tr><tr><td><a href="/pages/wsJ9Y23Momb16hbZ9daR"><strong><code>Dataset.table</code></strong></a>(table_reference)</td><td>Create a reference to a specific table within the dataset.</td></tr><tr><td><a href="/pages/OocjU87SDLlrn7KrBLkt"><strong><code>Dataset.unrelease</code></strong></a>()</td><td>Unrelease the current version of the dataset, moving it back to an unreleased, "next" version.</td></tr><tr><td><a href="/pages/wsJ9Y23Momb16hbZ9daR"><strong><code>Dataset.update</code></strong></a>([*, name, public_access_level, ...])</td><td>Update certain attributes on the dataset.</td></tr><tr><td><a href="/pages/kjhQdX984qLdcD4nfLt9"><strong><code>Dataset.update_variables</code></strong></a>(variables)</td><td>Batch update variable metadata across tables in a dataset.</td></tr><tr><td><a href="/pages/aNkMQOdLbQLCsCqTLj7X"><strong><code>Dataset.version</code></strong></a>([tag])</td><td>Create a reference to a <a href="/pages/3x00zsPZG4PEpnTNYbDj">version instance</a> at a particular tag.</td></tr></tbody></table>


