Step: Stack
Overview
The Stack step will add or remove rows from another table that align with your existing variable structure.
Example starting data:
Example output data:
Union rows from Table B.
Step structure
Field descriptions
Stacked table
The table containing the rows you wish to add.
Stack type
How the rows will be added. (More information below.)
Variable alignment
How you will define the variable structure between the source and the Stacked table
.
All matching variables: Will automatically match variables and only include variables with a matching name, the rest will be dropped.
All variables: Will automatically match variables and keep all variables in the output. Values will be marked null
where there is no match for an existing variable
Manually match variables: Will require you to align variables between two tables.
Retype variables
Variables will be retyped in order to be aligned.
Only keep distinct rows
If any added rows are an exact duplicate of an existing row it will be dropped from the output.
Create variable for table name (Union only)
In order to record in the data where new rows came from you can opt to create a new variable with table information.
Stack types
The most common stack type is a Union, which can be particularly useful when combining data which has been broken up into multiple tables with the same structure.
Union
Appends all of the stacked table’s rows to the source table.
Except
Keep only those distinct rows in the source that match rows in the stacked table.
Intersect
Keep only those distinct rows that are in the source and stacked table.
Examples
Example 1: Basic union
Let's say I have data broken out into two tables with the same structure. I want to add all rows together as they are.
Starting data:
Input fields:
Stacked table: My data is in
table1
so we select that hereStack type: We want to add all rows so we select
Union
here.Variable alignment: Somehow our variable names shifted between tables so we can't automatically align tables. We choose
Manually match variables
here and then fill in the variable names underSource variable
andStacked table variable
.Where the variable names are different, the output will keep the name of the
Source variable
when it exists.
Output data:
Example 2: Bulk union and variable misalignment
To continue the previous example, let's say we have a third table of data, and it has additional information that was not present in the first two about state. We want to keep this information and also keep track of which rows came from which table.
Starting data:
Input fields:
We create one block for the table1 union, and a second block for the table2 union
Variable alignment: Since we want to keep all variables no matter if they have matches, we select
All variables
.Create variable for table name: We check this box since we want to keep track of which table each row came from.
Note that
Source table name
is automatically populated with the name of this transform's source table.
Output data:
Example 3: Except
Let's say we have a table with information about students and their test scores, and we have identified some students that have dropped the course and that we no longer need in the original table. We have gathered those students in table1
.
Starting data:
Input fields:
Output data:
The rows that were in both tables are now removed.
Example 4: Intersect
Let's say we have two tables full of students, one from who took our fall semester class, and one from the spring semester class. We want to only keep student rows for people who took both classes, so those who are present in both tables.
Input fields
Output data:
Only rows that were in both tables now remain.
Last updated