Step: Order
Overview
The Order step sorts the table based on one or more variables.
Example starting data:
Example output data:
Order by: score, descending
Ordering large tables (>1GB) across variables with a large number of unique values requires substantial memory and isn't parallelizable. Order clauses in such cases may significantly slow your transform or cause it to fail.
Step structure
There will be at least one order block, where you define a variable and a sort order.
When multiple blocks exist, the table is sorted by the first block's variable, and each later block orders the rows that are tied on all of the blocks before it.
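The ordering and sub-ordering described above can be sketched in Python. This is only an illustration of the concept, not how the Order step is implemented; the rows and the column names (name, year, score) are hypothetical.

```python
# Hypothetical rows; the column names are illustrative only.
rows = [
    {"name": "a", "year": 2021, "score": 70},
    {"name": "b", "year": 2020, "score": 85},
    {"name": "c", "year": 2021, "score": 90},
    {"name": "d", "year": 2020, "score": 60},
]

# First block: year ascending. Second block: score ascending.
# Tuples compare element by element, so score only decides the order
# of rows that share the same year.
ordered = sorted(rows, key=lambda r: (r["year"], r["score"]))

names = [r["name"] for r in ordered]
print(names)  # ['d', 'b', 'a', 'c']
```

The second variable never overrides the first: both 2020 rows come before both 2021 rows, and score only arranges the rows inside each year.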
Input field definitions
Field | Definition |
---|---|
Order by | The variable containing the values that will be sorted. |
Sort | A choice of how all values in the Order by variable will be sorted: ascending (ASC) or descending (DESC), with nulls first or last. |
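To illustrate how the sort direction and the null placement are independent choices, here is a minimal Python sketch. The helper order_values and its parameters are hypothetical names made up for this example, and None stands in for a null value.

```python
def order_values(values, descending=False, nulls_first=False):
    """Sort values, placing None at the start or end regardless of direction."""
    present = sorted((v for v in values if v is not None), reverse=descending)
    nulls = [v for v in values if v is None]
    return nulls + present if nulls_first else present + nulls


scores = [83, None, 74, 96]

asc_nulls_last = order_values(scores)
desc_nulls_first = order_values(scores, descending=True, nulls_first=True)

print(asc_nulls_last)    # [74, 83, 96, None]
print(desc_nulls_first)  # [None, 96, 83, 74]
```

When a variable contains no nulls, the nulls_first choice has no effect on the result.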
Examples
Example 1: Basic order
We can sort a table to quickly see the highest scores.
Starting data:
Input fields:
Order by: The score variable has the data we want to sort on, so we select it here.
Sort: We want the data to go from smallest to largest values, so we choose ASC. There are no null values in this table, so we can choose either nulls first or nulls last and get the same result.
Output data:
Example 2: Ordering on multiple variables
Let's say we instead wanted to sort first by year, then by the lowest sales number.
Starting data:
Input fields:
First block
Order by: The first variable we want the data sorted on is score, so we choose it in the first block.
Sort: We want the earliest information first, so we want this variable in ascending order. This variable has a null value, so it matters whether nulls appear first or last in the order. Since we want them last, we choose ASC (nulls last).
Second block
Order by: The second variable we want to sort on is date, so we put it here.
Sort: Since we want the most recent (highest) values first, we want it to be descending. There are no null values in this variable, so where we put the nulls does not matter. We choose DESC (nulls first).
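A two-block configuration like this one, with an ascending (nulls last) block followed by a descending (nulls first) block, can be approximated in Python as a rough sketch. The helper apply_block, the rows, and the column names (id, year, sales) are all hypothetical; this shows the general idea of multi-key ordering with a stable sort and is not a description of the step's internals.

```python
def apply_block(rows, column, descending, nulls_first):
    """Order rows by one column; the sort is stable, so ties keep their prior order."""
    present = [r for r in rows if r[column] is not None]
    nulls = [r for r in rows if r[column] is None]
    present.sort(key=lambda r: r[column], reverse=descending)
    return nulls + present if nulls_first else present + nulls


# Hypothetical data with one null in the first ordering column.
rows = [
    {"id": 1, "year": 2021, "sales": 40},
    {"id": 2, "year": 2020, "sales": 15},
    {"id": 3, "year": None, "sales": 30},
    {"id": 4, "year": 2020, "sales": 55},
    {"id": 5, "year": 2021, "sales": 25},
]

# Apply the blocks from last to first: because each pass is stable,
# the order set by the second block survives among rows tied on the first.
ordered = apply_block(rows, "sales", descending=True, nulls_first=True)
ordered = apply_block(ordered, "year", descending=False, nulls_first=False)

ids = [r["id"] for r in ordered]
print(ids)  # [4, 2, 1, 5, 3]
```

The row with the null year lands at the end, and within each year the rows run from the highest to the lowest sales value.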
Output data: