Links

Step: SQL query

Overview

Redivis supports direct SQL queries for advanced functionality and for users who prefer to work in a programmatic interface. You can add one or more SQL query steps to any transform.
This step accepts SQL code and will execute in the order it has been placed within the rest of the steps.

Query syntax

Referencing the source table

Redivis supports the BigQuery Standard SQL Syntax and features, with certain limitations. All queries must reference the SQL step's source, represented as _source_ in the query. If the SQL step is the first step in the transform, the source represents the transform's source table. Otherwise, the source represents the output of the previous step in the transform.
For example:
SELECT *
FROM _source_
ORDER BY mean_drg_cost
LIMIT 1000

Joining tables

You can also reference any other table in the current project from your SQL step, following the same reference rules as those in the Redivis API. After typing the backtick character (`), available tables will be presented as autocomplete options.
For example:
SELECT t0.id, t0.name, t1.latitude, t1.longitude
FROM _source_ AS t0
INNER JOIN `demo.ghcn_daily_weather_data:7br5:v1_1.stations:g2q3` AS t1
ON t0.id = t1.id
The query editor will automatically qualify referenced tables and value lists as you reference them, which ensures your query will continue to work if tables get renamed. In most cases, you can just reference the table by name, and the fully qualified reference will be inserted for you.

Referencing upstream variables

You can mix and match SQL steps with other step types within your transform. If you choose to do so, this will introduce some additional language semantics, since transforms allow variables with the same name to co-exist at any given step, delineated in the UI with the t1, t2, ... and v prefixes. The v prefix denotes any newly created variable, while the tN prefix denotes variables that have been joined from other tables.
In order to reference these variables, you must include the prefix in your SQL code before the variable name. The only exception to this rule is for variables prefixed by t0 — these should be referenced normally, allowing you to ignore these rules if you aren't mixing the SQL interface with other step types.
For example:
SELECT
-- These variables are from the transform's source table, denoted by `t0` in the UI
id, name,
-- These variables were joined upstream, denoted by `t1` in the UI
t1_latitude, t1_longitude,
-- This variable was created upstream, denoted by `v` in the UI
v_extracted_year
FROM _source_

Referencing value lists

You may also use value lists in your query — they can be referenced as a SQL parameter, prefixed with the @ symbol. Note that all value lists are explicitly typed, and you will need to cast the value list if there is a type mismatch.
For example:
SELECT provider_id
FROM _source_
WHERE provider_state IN UNNEST(@`states:8a2h`)
-- If a value list is in the incorrect type, use the following code to convert it
OR provider_fips IN UNNEST(
(
SELECT ARRAY_AGG(CAST(val AS STRING))
FROM UNNEST(@`fips_codes:134a`)val
)
)
ORDER BY mean_drg_cost
LIMIT 1000
-- @states = ['CA', 'TX', etc...]
-- @fips_codes = [6071, 2999, 3242]

Limitations

  • DDL / DML queries are not supported
  • All queries must return fields of a valid Redivis type. The following BigQuery types are not supported in the result set, though they may be used internally within the query:
    • TIMESTAMP
    • NUMERIC
    • BIGNUMERIC
    • BYTES
    • INTERVAL
    • JSON
    • ARRAY
    • STRUCT
  • BigQuery ML syntax is not supported
  • BigQuery scripting language is not supported
  • Statements may only reference tables that exist in your project
  • Unnamed variables in the result are not supported
  • Duplicate variable names in the result are not supported

Errors

If your query is invalid, you will see an invalid icon and the error message displayed above your code. Transforms can't be run while any step is invalid.
Last modified 6d ago