Redivis Documentation
API DocumentationRedivis Home
  • Introduction
  • Redivis for open science
    • FAIR data practices
    • Open access
    • Data repository characteristics
    • Data retention policy
    • Citations
  • Guides
    • Getting started
    • Discover & access data
      • Discover datasets
      • Apply to access restricted data
      • Create a study
    • Analyze data in a workflow
      • Reshape data in transforms
      • Work with data in notebooks
      • Running ML workloads
      • Example workflows
        • Analyzing large tabular data
        • Create an image classification model
        • Fine tuning a Large Language Model (LLM)
        • No-code visualization
        • Continuous enrollment
        • Select first/last encounter
    • Export & publish your work
      • Export to other environments
      • Build your own site with Observable
    • Create & manage datasets
      • Create and populate a dataset
      • Upload tabular data as tables
      • Upload unstructured data as files
      • Cleaning tabular data
    • Administer an organization
      • Configure access systems
      • Grant access to data
      • Generate a report
      • Example tasks
        • Emailing subsets of members
    • Video guides
  • Reference
    • Your account
      • Creating an account
      • Managing logins
      • Single Sign-On (SSO)
      • Workspace
      • Studies
      • Compute credits and billing
    • Datasets
      • Documentation
      • Tables
      • Variables
      • Files
      • Creating & editing datasets
      • Uploading data
        • Tabular data
        • Geospatial data
        • Unstructured data
        • Metadata
        • Data sources
        • Programmatic uploads
      • Version control
      • Sampling
      • Exporting data
        • Download
        • Programmatic
        • Google Data Studio
        • Google Cloud Storage
        • Google BigQuery
        • Embedding tables
    • Workflows
      • Workflow concepts
      • Documentation
      • Data sources
      • Tables
      • Transforms
        • Transform concepts
        • Step: Aggregate
        • Step: Create variables
        • Step: Filter
        • Step: Join
        • Step: Limit
        • Step: Stack
        • Step: Order
        • Step: Pivot
        • Step: Rename
        • Step: Retype
        • Step: SQL query
        • Variable selection
        • Value lists
        • Optimization and errors
        • Variable creation methods
          • Common elements
          • Aggregate
          • Case (if/else)
          • Date
          • DateTime
          • Geography
          • JSON
          • Math
          • Navigation
          • Numbering
          • Other
          • Statistical
          • String
          • Time
      • Notebooks
        • Notebook concepts
        • Compute resources
        • Python notebooks
        • R notebooks
        • Stata notebooks
        • SAS notebooks
        • Using the Jupyter interface
      • Access and privacy
    • Data access
      • Access levels
      • Configuring access
      • Requesting access
      • Approving access
      • Usage rules
      • Data access in workflows
    • Organizations
      • Administrator panel
      • Members
      • Studies
      • Workflows
      • Datasets
      • Permission groups
      • Requirements
      • Reports
      • Logs
      • Billing
      • Settings and branding
        • Account
        • Public profile
        • Membership
        • Export environments
        • Advanced: DOI configuration
        • Advanced: Stata & SAS setup
        • Advanced: Data storage locations
        • Advanced: Data egress configuration
    • Institutions
      • Administrator panel
      • Organizations
      • Members
      • Datasets
      • Workflows
      • Sub-institutions
      • Reports
      • Logs
      • Settings and branding
    • Quotas and limits
    • Glossary
  • Additional Resources
    • Events and press
    • API documentation
    • Redivis Labs
    • Office hours
    • Contact us
    • More information
      • Product updates
      • Roadmap
      • System status
      • Security
      • Feature requests
      • Report a bug
Powered by GitBook
On this page
  • Overview
  • Step structure
  • Input field definitions
  • Examples
  • Example 1: Basic order
  • Example 2: Ordering on multiple variables

Was this helpful?

Export as PDF
  1. Reference
  2. Workflows
  3. Transforms

Step: Order

Overview

The Order step sorts the table based on one or more variables.

Example starting data:

/*---------+--------*
 | student | score  |
 +---------+--------+
 | jane    | 83     |
 | neal    | 35     |
 | sam     | 74     |
 | pat     | 62     |
 *---------+--------*/

Example output data:

Order by score descending

/*---------+--------*
 | student | score  |
 +---------+--------+
 | jane    | 83     |
 | sam     | 74     |
 | pat     | 62     |
 | neal    | 35     |
 *---------+--------*/

Ordering large tables (>1GB) across variables with a large number of unique values requires substantial memory and isn't parallelizable. Order clauses in such cases may significantly slow your transform or cause it to fail.

Step structure

  • There will be at least one order block where you will define a variable and a sort order.

  • When multiple blocks exist, the variables will be ordered and then sub-ordered in sequence.

Input field definitions

Field
Definition

Order by

The variable containing the values that will be sorted.

Sort

A choice of how all values in the Order by variable will be sorted: ASC (nulls first) ASC (nulls last) DESC (nulls first) DESC (nulls last) Note that variables of the geography data type aren't sortable.

Examples

Example 1: Basic order

We can sort a table to quickly see the highest scores.

Starting data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83    | jane    | 2020-04-01 |
 | quiz    | 35    | pat     | 2020-04-01 |
 | quiz    | 89    | sam     | 2020-04-01 |
 | midterm | 74    | jane    | 2020-05-01 |
 | midterm | 62    | pat     | 2020-05-01 |
 | midterm | 100   | sam     | 2020-05-01 |
 | final   | 77    | jane    | 2020-06-01 |
 | final   | 59    | pat     | 2020-06-01 |
 | final   | 100   | sam     | 2020-06-01 |
 *---------+-------+---------+------------*/

Input fields

  • Order by: The score variable has the data we want to sort on, so we select it here

  • Sort: We want the data to go from smallest to largest values, so we choose ASC. There are no null values in this table, so we can choose either nulls first or nulls last and get the same result.

Output data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | midterm | 100   | sam     | 2020-05-01 |
 | final   | 100   | sam     | 2020-06-01 |
 | quiz    | 89    | sam     | 2020-04-01 |
 | quiz    | 83    | jane    | 2020-04-01 |
 | final   | 77    | jane    | 2020-06-01 |
 | midterm | 74    | jane    | 2020-05-01 |
 | midterm | 62    | pat     | 2020-05-01 |
 | final   | 59    | pat     | 2020-06-01 |
 | quiz    | 35    | pat     | 2020-04-01 |
 *---------+-------+---------+------------*/

Example 2: Ordering on multiple variables

Lets say instead we first wanted to sort first by year, then by the the lowest sales number.

Starting data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83    | jane    | 2020-04-01 |
 | quiz    | 35    | pat     | 2020-04-01 |
 | quiz    | 89    | sam     | 2020-04-01 |
 | midterm | 74    | jane    | 2020-05-01 |
 | midterm | 62    | pat     | 2020-05-01 |
 | midterm | 100   | sam     | 2020-05-01 |
 | final   | 77    | jane    | 2020-06-01 |
 | final   | 59    | pat     | 2020-06-01 |
 | final   | 100   | sam     | 2020-06-01 |
 *---------+-------+---------+------------*/

Input fields:

First block

  • Order by: The first variable we want the data sorted on is score so we choose it in the first block.

  • Sort: We want the earliest information first, so we know we want the information to be ascending. This variable has a null value so it matters whether we want nulls to appear first or last in the order. Since we want it last, we choose ASC (nulls last).

Second block

  • Order by: The second variable we want to sort on is date so we put it here.

  • Sort: Since we want the most recent (highest) values first, we want it to be descending. There are no null values in this variable so where we put the nulls does not matter. We choose DESC (nulls first).

Output data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | final   | 100   | sam     | 2020-06-01 |
 | midterm | 100   | sam     | 2020-05-01 |
 | quiz    | 89    | sam     | 2020-04-01 |
 | quiz    | 83    | jane    | 2020-04-01 |
 | final   | 77    | jane    | 2020-06-01 |
 | midterm | 74    | jane    | 2020-05-01 |
 | midterm | 62    | pat     | 2020-05-01 |
 | final   | 59    | pat     | 2020-06-01 |
 | quiz    | 35    | pat     | 2020-04-01 |
 *---------+-------+---------+------------*/

Last updated 6 months ago

Was this helpful?