# Query

## *class* <mark style="color:purple;">Query</mark>

Used to execute a SQL query against table(s) in Redivis, using the [Redivis SQL query syntax](https://docs.redivis.com/reference/projects/transform-nodes/raw-sql), and read out the results.

## Constructors

<table data-header-hidden><thead><tr><th width="312">Method</th><th>Description</th></tr></thead><tbody><tr><td><a href="redivis/redivis.query"><strong><code>redivis.query</code></strong></a>(query_string)</td><td>Execute a SQL query within the <a href="..#environment-variables">current default scope</a> (either a dataset or workflow). <br><br>In a Redivis notebook, the default scope will always be the notebook's workflow, and the notebook's source table can be referenced via the <code>_source_</code> identifier. <br><br>If no default scope is specified, all tables in the query must be fully qualified. Consult the <a href="../../../referencing-resources">referencing resources</a> documentation to learn more. </td></tr><tr><td><a href="dataset/dataset.query"><strong><code>Dataset.query</code></strong></a>(query_string)</td><td>Execute a SQL query scoped to a specific <a href="dataset">dataset</a>. Tables referenced by the query do not need to be fully qualified, since the table lookup is already scoped to the dataset.<br><br>Consult the <a href="../../../referencing-resources">referencing resources</a> documentation to learn more. </td></tr><tr><td><a href="workflow/workflow.query"><strong><code>Workflow.query</code></strong></a>(query_string)</td><td>Execute a SQL query scoped to a specific workflow. Tables referenced by the query do not need to be fully qualified, since the table lookup is already scoped to the dataset.<br><br>Consult the <a href="../../../referencing-resources">referencing resources</a> documentation to learn more. </td></tr></tbody></table>

## Examples

{% tabs %}
{% tab title="Basics" %}

```python
# Execute any SQL query and read the results
query = redivis.query("SELECT 1 + 1 AS two, 'foo' AS bar")
query.to_pandas_dataframe()
# 	two	bar
# 0	2	foo

# The query can reference any table on Redivis 
query = redivis.query("""
    SELECT * 
    FROM demo.iris_species.iris 
    WHERE SepalLengthCm > 5
""")
query.to_pandas_dataframe()
# 	Id	SepalLengthCm	SepalWidthCm	PetalLengthCm	PetalWidthCm	Species
# 0	33	5.2	        4.1	        1.5	        0.1	        Iris-setosa
# ...

# Other methods to read data:
# query.to_arrow_batch_iterator()
# query.to_arrow_dataset()
# query.to_arrow_dataset()
# query.to_geopandas_dataframe()
# query.to_dask_dataframe()
# query.to_polars_lazyframe()
```

{% endtab %}

{% tab title="Scoped queries" %}

```python
# To simplify table references, execute a query scoped to a dataset or workflow
dataset = redivis.organization("Demo").dataset("CMS 2014 Medicare Data")
query = dataset.query("""
    SELECT 
        hospice_providers.name, 
        inpatient_charges.drg_definition
    -- The tables inpatient_chargers, hospice_providers are assumed to be 
    -- within the scoped dataset
    FROM inpatient_charges
    INNER JOIN hospice_providers 
        ON hospice_providers.provider_id = inpatient_charges.provider_id
""")

# In a notebook, all queries are scoped to the current workflow.
# Additionally, the notebooks source table can simply be referenced as _source_
query = redivis.query("SELECT * FROM _source_ LIMIT 10")
```

{% endtab %}
{% endtabs %}

## Attributes

| **`properties`** | A dict containing the [API resource representation of the query](https://docs.redivis.com/api/resource-definitions/query). This will always be populated after the query has been created, and can be refreshed by calling `query.get()` |
| ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |

## Methods

<table data-header-hidden><thead><tr><th width="441"></th><th></th></tr></thead><tbody><tr><td><a href="query/query.file"><strong><code>Query.file</code></strong></a>(name, *, [...])</td><td>Reference a <a href="file">File</a> within query results that contain a file_id variable.</td></tr><tr><td><a href="query/query.get"><strong><code>Query.get</code></strong></a>()</td><td>Fetch query metadata. Once called, the <code>properties</code> attribute on the query will be fully populated.</td></tr><tr><td><a href="query/query.list_files"><strong><code>Query.list_files</code></strong></a>([max_results, *, ...])</td><td>Return a list of <a href="file">File</a> instances for query results containing a file_id variable.</td></tr><tr><td><a href="query/query.list_variables"><strong><code>Query.list_variables</code></strong></a>([max_results, *, ...])</td><td>Return a list of <a href="variable">Variable</a> instances associated with this query's output.</td></tr><tr><td><a href="query/query.to_"><strong><code>Query.to_*</code></strong></a>([max_results, *, ...])</td><td>Various methods to read query results. Mirrors the various <a href="table">Table.to_*</a> methods (e.g., <code>table.to_pandas_dataframe()</code></td></tr><tr><td><a href="query/query.variable"><strong><code>Query.variable</code></strong></a>(name)</td><td>Reference a <a href="variable">Variable</a> within the query's output.</td></tr></tbody></table>
