Step: Filter
Overview
The Filter step will select for rows that meet a certain set of conditions.
Example starting data:
Example output data:
Filter out rows with scores less than 70.
Step structure:
Basic state
A filter step will be made up of one or more filter blocks with a completed condition.
When there are multiple filter blocks in a step, conditions in all blocks must be met for rows to be kept.
If you have a more complex filter statement that is dependent on multiple nested conditions you can press the
+
button to expand the filter block.
Expanded state
When multiple conditions are needed in a block, you must specify how they relate to each other (
AND
vsOR
)Any nested conditions are resolved before higher level conditions.
Field descriptions
Examples
Example 1: Basic filter
Lets say we only want to reduce our table to only contain information about results from the final test.
Starting data:
Input fields:
Variable(s): The variable
test
is where thefinal
value we want to evaluate on is located.[Operator]: We want fields that exactly match, so we choose
=
.Value(s) or variable(s): We want to only keep rows where
final
is present, so we put that here.
Output data:
Example 2: Multiple conditions
Let's say we don't just want values from the final but only those from the final with a score above 60.
Starting data:
Input fields:
We input the data as in the above example, but since we now have two conditions, we have to decided how they relate to each other. In this case we want data that meets all the conditions so select
All conditions must be satisfied (AND)
.
Output data:
Example 3: Nested conditions
Let's say we want to keep all data from the final greater than 60, or any any scores above 85.
Starting data:
Input fields:
We want to keep all rows where scores are over 85 OR are from the final and over 60. So we set up the final and over 60 conditions under an
AND
umbrella, and nest that under theOR
umbrella alongside our condition about scores over 85.When executing, the nested conditions (scores from the final over 60) will be evaluated first to be true or false. Then the higher level condition (scores over 85 OR (scores about the final and over 60) will be evaluated. Any rows that meet this higher level condition will be kept in the output, and any that do not will be discarded.
Output data:
Example 4: Comparing variables and value lists
Let's say we only want to keep rows where scores are greater than the average score, and that are for our selected students. Right now our selected students are Jane and Pat but we know that might change in the future.
Starting data:
Input fields:
First step, we create a value list called
selected_students
with the valuesjane
andpat
on it. Then we can select it as compared tostudent
.With multiple inputs on either side of a condition, the condition is evaluated as an OR. So this condition will evaluate to true if the value in
student
equals any value on the list (jane
orpat
).While we could have input
jane
andpat
into the right side of this condition and gotten the same result, using a list makes it easy in the future to change our selected students centrally and update it everywhere the list is used.
For our second condition, we want to evaluate how two values in the same row compare to each other.
When executed, this condition will look at each row to check if the value in
score
is greater than the value inscore_average
. If so, then the condition is met.
Output data:
Reference: Comparison statements
A comparison always evaluates to either TRUE or FALSE.
Comparisons are made up of one or more comparison rows. A comparison row always evaluates to TRUE or FALSE; multiple rows can be nested together with a logical AND/OR.
A comparison row is made up of three components:
Left expression
The left expression can contain variable parameters from the source table, joined tables, as well as any newly created variables. All referenced new variables must be "upstream" from the current comparison, with the exception of joins, which may reference new variables that are constructed by any variables upstream of that join.
Depending on the operator selected, the left expression may contain multiple variables. In this case, each left expression will be evaluated against the right expression(s), and logically joined via an OR. If you want to take the logical AND of multiple variables, create a separate comparison row for each variable.
Multiple left hand values are only supported for =
and like
operators.
Where multiple variables can be entered in a comparison, you can use *
in the interface to select all matching results.
For example, typing DIAG* and pressing enter will add all variables beginning with "DIAG" to this field.
Operator
Redivis supports the following operators:
=, !=
Checks if any the value(s) in the left expression are (not) equal to any values in the right expression. NULL
s are treated as equivalent (NULL == NULL -> TRUE
and NULL != NULL -> FALSE
)
>, >=, <, <=,
Checks if the value in the left expression is less than, greater than, etc. the right expression. String
comparisons are lexicographically ordered, other data types are based on the numeric / temporal order of that value. Comparisons between NULL
values will always be false.
like / ! like
Checks if the string(s)
in the left hand expression matches specified pattern(s) in the right hand expression. The pattern may contain the following characters:
A percent sign "%" matches any number of characters
An underscore "_" matches a single character
You can escape "\", "_", or "%" using one backslash. For example, "\%"
Right expression
The right expression can contain any variables allowed in the left expression, as well as literal values and lists. The comparison row will evaluate to TRUE when the left hand expression matches any of the right hand expressions, except for the !=
and !like
comparators, where the comparison will evaluate to true if all values are not equal to / not like the left expression.
To match against a null datum (empty cell), you must specify the special literal value NULL
here.
Last updated