# Reference

The `redivis` Python module provides an interface to construct representations of Redivis entities and to create, modify, read, and delete them.

Resources are generally constructed by chaining together multiple constructor methods, reflecting the hierarchical nature of entities in Redivis. For example, to list all variables on a table (which belongs to a dataset in an organization), we would write:

```python
import redivis

variables = (
    redivis.organization("Demo")      # Returns an instance of an Organization
    .dataset("CMS 2014 Medicare Data") # Returns an instance of a Dataset
    .table("Home health agencies")     # Returns an instance of a Table
    .list_variables()                  # Returns a list of Variable instances
)
```

## Interfaces

| [**`redivis`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/redivis)           | The redivis namespace. Provides constructor methods for most of the other classes.                                                                                                     |
| --------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [**`Dataset`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/dataset)           | Class representing a Redivis dataset. Provides constructor methods for Tables and Queries scoped to a given dataset, as well as methods for creating, deleting, and updating datasets. |
| [**`File`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/file)                 | Class representing a non-tabular file on Redivis.                                                                                                                                      |
| [**`Organization`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/organization) | Class representing a Redivis organization. Provides constructor methods for Datasets and Workflows scoped to a given organization.                                                     |
| [**`Workflow`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/workflow)         | Class representing a Redivis workflow. Provides constructor methods for Tables and Queries scoped to a given workflow.                                                                 |
| [**`Query`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/query)               | Class representing a running SQL query that references tables on Redivis.                                                                                                              |
| [**`Upload`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/upload)             | Class representing a tabular data upload on a Table.                                                                                                                                   |
| [**`Table`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/table)               | Class representing a Redivis table. Numerous methods available for reading data from the table, as well as uploading data and metadata.                                                |
| [**`User`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/user)                 | Class representing a Redivis user. Provides constructor methods for Datasets and Workflows scoped to a given user.                                                                     |
| [**`Variable`**](https://docs.redivis.com/api/client-libraries/redivis-python/reference/variable)         | Class representing a specific variable within a Table.                                                                                                                                 |

## Environment variables

The following [environment variables](https://www.twilio.com/blog/environment-variables-python) may be set to modify the behavior of the `redivis-python` client.

#### REDIVIS\_API\_TOKEN

If using this library in an external environment, you'll need to set this environment variable to your [API token](https://docs.redivis.com/api/rest-api/authorization) in order to authenticate. This is not relevant for code executed in Redivis notebooks.

{% hint style="warning" %}
**Important:** this token acts as a password, and should never be inlined in your code, committed to source control, or otherwise published.
{% endhint %}
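
As a minimal sketch of keeping the token out of source code, a script can check for the environment variable before doing any work. The helper name `require_api_token` is hypothetical and is not part of the `redivis` library; it only illustrates a pre-flight check on `REDIVIS_API_TOKEN`.

```python
import os


def require_api_token(env=os.environ):
    """Return the token from the environment, or fail with a helpful message.

    Hypothetical helper for illustration; not part of the redivis library.
    """
    token = env.get("REDIVIS_API_TOKEN")
    if not token:
        raise RuntimeError(
            "REDIVIS_API_TOKEN is not set. Export it in your shell or load it "
            "from a secrets manager rather than hard-coding it."
        )
    return token
```

Calling `require_api_token()` at the top of a script surfaces a missing token early, without the token value ever appearing in the code itself.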

#### REDIVIS\_DEFAULT\_WORKFLOW

If set, tables referenced via `redivis.table()` and unqualified table names in `redivis.query()` will be assumed to be within the default workflow. In Redivis notebooks, this environment variable is always set to the current workflow.

Takes the form `user_name.workflow_identifier`. [Learn more about referencing resources >](https://docs.redivis.com/api/referencing-resources)

#### REDIVIS\_DEFAULT\_DATASET

If set, tables referenced via `redivis.table()` and unqualified table names in `redivis.query()` will be assumed to be within the default dataset.

Takes the form `owner_name.dataset_identifier`. If both a default dataset and workflow are set, the default workflow will supersede the dataset. [Learn more about referencing resources >](https://docs.redivis.com/api/referencing-resources)
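
The precedence rule described here can be sketched in plain Python. This is an illustration of the documented behavior only, not the library's actual implementation; `resolve_default_scope` and the example values are hypothetical.

```python
import os


def resolve_default_scope(env=os.environ):
    """Illustrate which default applies to unqualified table references.

    Hypothetical helper: a default workflow, when set, supersedes a
    default dataset. Returns a (kind, reference) tuple, or None.
    """
    workflow = env.get("REDIVIS_DEFAULT_WORKFLOW")
    dataset = env.get("REDIVIS_DEFAULT_DATASET")
    if workflow:
        return ("workflow", workflow)
    if dataset:
        return ("dataset", dataset)
    return None


# Both set: the workflow wins (example values are made up)
scope = resolve_default_scope({
    "REDIVIS_DEFAULT_WORKFLOW": "my_user.my_workflow",
    "REDIVIS_DEFAULT_DATASET": "my_org.my_dataset",
})
print(scope)  # ('workflow', 'my_user.my_workflow')
```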

#### REDIVIS\_TMPDIR

If set, this directory will be used to temporarily store data for disk-backed data objects (e.g., see [Table.to\_arrow\_dataset](https://docs.redivis.com/api/client-libraries/redivis-python/reference/table/table.to_arrow_dataset)). Otherwise, the default OS temp directory will be used.
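
The fallback described above can be sketched as follows. This mirrors the documented behavior using only the standard library and is not taken from the client's source; `resolve_tmpdir` is a hypothetical name.

```python
import os
import tempfile


def resolve_tmpdir(env=os.environ):
    """Return REDIVIS_TMPDIR if set, else the OS default temp directory.

    Hypothetical helper for illustration; not part of the redivis library.
    """
    return env.get("REDIVIS_TMPDIR") or tempfile.gettempdir()
```

Pointing `REDIVIS_TMPDIR` at a volume with ample free space may help when disk-backed objects are larger than the default temp partition.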

## Pandas datatype conversions

When reading data from a table, query, or upload, you have the option to return the results as a [pandas DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html). By default, this dataframe will use the [pyarrow dtype backend](https://pandas.pydata.org/docs/user_guide/pyarrow.html), whose types map directly to how the data is stored in Redivis tables. Pyarrow is recommended when possible, as it avoids many of the challenges with nullable data inherent in the numpy datatypes (and can easily be converted as needed).

However, to mimic historic pandas behavior, you can instead specify `dtype_backend="numpy"` or `dtype_backend="numpy_nullable"`. The latter uses the [experimental nullable datatypes](https://pandas.pydata.org/docs/user_guide/integer_na.html) in pandas, which allow for nullable booleans and integers.

When loading a Redivis table into pandas using these dtypes, the following conversions will apply:

| Redivis type                | Pandas type                                                                                                     |
| --------------------------- | --------------------------------------------------------------------------------------------------------------- |
| `Float`                     | `float64`                                                                                                       |
| `DateTime`                  | `pd.Timestamp` (`np.datetime64[ns]`)                                                                            |
| `Date`                      | `pd.Timestamp` (`np.datetime64[ns]`)                                                                            |
| `Time`                      | `object` (with `datetime.time` objects)                                                                         |
| `Geography`                 | <p>If a pandas.DataFrame: <code>str</code><br>If a geopandas.GeoDataFrame: <code>geopandas.GeoSeries</code></p> |
| **dtype\_backend="numpy"** |                                                                                                                 |
| `Boolean`                   | `bool`                                                                                                          |
| `Boolean` *with nulls*      | `object` (with values `True`, `False`, `None`)                                                                  |
| `Integer`                   | `int64`                                                                                                         |
| `Integer` *with nulls*      | `float64`                                                                                                       |
| `String`                    | `str`                                                                                                           |
| **dtype\_backend="numpy\_nullable"** |                                                                                                       |
| `Boolean`                   | `pd.BooleanDtype()`                                                                                             |
| `Integer`                   | `pd.Int64Dtype()`                                                                                               |
| `String`                    | `pd.StringDtype()`                                                                                              |

{% hint style="info" %}
For `Date` variables, if you prefer to work with `datetime.date` objects rather than NumPy's `datetime64` dtype, provide the argument `date_as_object=True`.
{% endhint %}
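
The difference between the numpy and nullable rows in the table above can be reproduced with pandas alone. The sketch below builds small Series by hand rather than reading from Redivis, so it only illustrates the pandas dtypes involved, not the client's loading code.

```python
import pandas as pd

# Integers with nulls: numpy-backed inference falls back to float64,
# because a numpy int64 array cannot represent a missing value.
ints_numpy = pd.Series([1, 2, None])
print(ints_numpy.dtype)  # float64

# The nullable extension dtype keeps integers as integers.
ints_nullable = pd.Series([1, 2, None], dtype="Int64")
print(ints_nullable.dtype)  # Int64

# Booleans with nulls: numpy-backed inference yields an object column.
bools_numpy = pd.Series([True, False, None])
print(bools_numpy.dtype)  # object

# The nullable boolean dtype preserves True / False / <NA>.
bools_nullable = pd.Series([True, False, None], dtype="boolean")
print(bools_nullable.dtype)  # boolean
```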
