Redivis Documentation
API DocumentationRedivis Home
  • Introduction
  • Redivis for open science
    • FAIR data practices
    • Open access
    • Data repository characteristics
    • Data retention policy
    • Citations
  • Guides
    • Getting started
    • Discover & access data
      • Discover datasets
      • Apply to access restricted data
      • Create a study
    • Analyze data in a workflow
      • Reshape data in transforms
      • Work with data in notebooks
      • Running ML workloads
      • Example workflows
        • Analyzing large tabular data
        • Create an image classification model
        • Fine tuning a Large Language Model (LLM)
        • No-code visualization
        • Continuous enrollment
        • Select first/last encounter
    • Export & publish your work
      • Export to other environments
      • Build your own site with Observable
    • Create & manage datasets
      • Create and populate a dataset
      • Upload tabular data as tables
      • Upload unstructured data as files
      • Cleaning tabular data
    • Administer an organization
      • Configure access systems
      • Grant access to data
      • Generate a report
      • Example tasks
        • Emailing subsets of members
    • Video guides
  • Reference
    • Your account
      • Creating an account
      • Managing logins
      • Single Sign-On (SSO)
      • Workspace
      • Studies
      • Compute credits and billing
    • Datasets
      • Documentation
      • Tables
      • Variables
      • Files
      • Creating & editing datasets
      • Uploading data
        • Tabular data
        • Geospatial data
        • Unstructured data
        • Metadata
        • Data sources
        • Programmatic uploads
      • Version control
      • Sampling
      • Exporting data
        • Download
        • Programmatic
        • Google Data Studio
        • Google Cloud Storage
        • Google BigQuery
        • Embedding tables
    • Workflows
      • Workflow concepts
      • Documentation
      • Data sources
      • Tables
      • Transforms
        • Transform concepts
        • Step: Aggregate
        • Step: Create variables
        • Step: Filter
        • Step: Join
        • Step: Limit
        • Step: Stack
        • Step: Order
        • Step: Pivot
        • Step: Rename
        • Step: Retype
        • Step: SQL query
        • Variable selection
        • Value lists
        • Optimization and errors
        • Variable creation methods
          • Common elements
          • Aggregate
          • Case (if/else)
          • Date
          • DateTime
          • Geography
          • JSON
          • Math
          • Navigation
          • Numbering
          • Other
          • Statistical
          • String
          • Time
      • Notebooks
        • Notebook concepts
        • Compute resources
        • Python notebooks
        • R notebooks
        • Stata notebooks
        • SAS notebooks
        • Using the Jupyter interface
      • Access and privacy
    • Data access
      • Access levels
      • Configuring access
      • Requesting access
      • Approving access
      • Usage rules
      • Data access in workflows
    • Organizations
      • Administrator panel
      • Members
      • Studies
      • Workflows
      • Datasets
      • Permission groups
      • Requirements
      • Reports
      • Logs
      • Billing
      • Settings and branding
        • Account
        • Public profile
        • Membership
        • Export environments
        • Advanced: DOI configuration
        • Advanced: Stata & SAS setup
        • Advanced: Data storage locations
        • Advanced: Data egress configuration
    • Institutions
      • Administrator panel
      • Organizations
      • Members
      • Datasets
      • Reports
      • Settings and branding
    • Quotas and limits
    • Glossary
  • Additional Resources
    • Events and press
    • API documentation
    • Redivis Labs
    • Office hours
    • Contact us
    • More information
      • Product updates
      • Roadmap
      • System status
      • Security
      • Feature requests
      • Report a bug
Powered by GitBook
On this page
  • Overview
  • Query syntax
  • Referencing the source table
  • Joining tables
  • Referencing upstream variables
  • Referencing value lists
  • Limitations
  • Errors

Was this helpful?

Export as PDF
  1. Reference
  2. Workflows
  3. Transforms

Step: SQL query

Last updated 6 months ago

Was this helpful?

Overview

Redivis supports direct SQL queries for advanced functionality and for users who prefer to work in a programmatic interface. You can add one or more SQL query steps to any transform.

This step accepts SQL code and will execute in the order it has been placed within the rest of the steps.

Query syntax

Referencing the source table

For example:

SELECT * 
FROM _source_
ORDER BY mean_drg_cost
LIMIT 1000

Joining tables

For example:

SELECT t0.id, t0.name, t1.latitude, t1.longitude
FROM _source_ AS t0
INNER JOIN `demo.ghcn_daily_weather_data:7br5:v1_1.stations:g2q3` AS t1
    ON t0.id = t1.id

The query editor will automatically qualify referenced tables and value lists as you reference them, which ensures your query will continue to work if tables get renamed. In most cases, you can just reference the table by name, and the fully qualified reference will be inserted for you.

Referencing upstream variables

You can mix and match SQL steps with other step types within your transform. If you choose to do so, this will introduce some additional language semantics, since transforms allow variables with the same name to co-exist at any given step, delineated in the UI with the t1, t2, ... and v prefixes. The v prefix denotes any newly created variable, while the tN prefix denotes variables that have been joined from other tables.

In order to reference these variables, you must include the prefix in your SQL code before the variable name. The only exception to this rule is for variables prefixed by t0 — these should be referenced normally, allowing you to ignore these rules if you aren't mixing the SQL interface with other step types.

For example:

SELECT 
-- These variables are from the transform's source table, denoted by `t0` in the UI
    id, name, 
-- These variables were joined upstream, denoted by `t1` in the UI
    t1_latitude, t1_longitude, 
-- This variable was created upstream, denoted by `v` in the UI
    v_extracted_year
FROM _source_

Referencing value lists

For example:

SELECT provider_id 
FROM _source_
WHERE provider_state IN UNNEST(@`states:8a2h`) 
    -- If a value list is in the incorrect type, use the following code to convert it
    OR provider_fips IN UNNEST(
        (
            SELECT ARRAY_AGG(CAST(val AS STRING))
            FROM UNNEST(@`fips_codes:134a`)val
        )
    )
ORDER BY mean_drg_cost
LIMIT 1000

-- @states = ['CA', 'TX', etc...]
-- @fips_codes = [6071, 2999, 3242]

Limitations

  • DDL / DML queries are not supported

  • All queries must return fields of a valid Redivis type. The following BigQuery types are not supported in the result set, though they may be used internally within the query:

    • TIMESTAMP

    • NUMERIC

    • BIGNUMERIC

    • BYTES

    • INTERVAL

    • JSON

    • ARRAY

    • STRUCT

  • BigQuery ML syntax is not supported

  • BigQuery scripting language is not supported

  • Statements may only reference tables that exist in your workflow

  • Unnamed variables in the result are not supported

  • Duplicate variable names in the result are not supported

Errors

If your query is invalid, you will see an invalid icon and the error message displayed above your code. Transforms can't be run while any step is invalid.

Redivis supports the and features, with . All queries must reference the SQL step's source, represented as _source_ in the query. If the SQL step is the first step in the transform, the source represents the transform's source table. Otherwise, the source represents the output of the previous step in the transform.

You can also reference any other table in the current workflow from your SQL step, following the same reference rules as . After typing the backtick character (`), available tables will be presented as autocomplete options.

You may also use in your query — they can be referenced as a SQL parameter, prefixed with the @ symbol. Note that all value lists are explicitly typed, and you will need to cast the value list if there is a type mismatch.

those in the Redivis API
value lists
BigQuery Standard SQL Syntax
certain limitations