Step: Retype
Overview
The Retype step converts a variable of a given type to another type.
Example starting data
Example output data:
Retype score
from integer to float.
Step structure
There will be at least one retype block where you will define a variable and a new type.
Field definitions
Source variable
The variable that you want to retype. Note that you can see its current type by locating it in the variable selector at the bottom of the page, or hovering over the variable in this selection menu to see a tooltip with more information.
New type
The type that the Source variable
will be converted to. Can be any of the supported variable types on Redivis. More on conversion rules.
If invalid for type, set to null
Whether failing type conversions should be converted to a null
value. By default, failed type conversions will throw an error.
Note that this might significantly change the content of your data and we suggest using this option with full understanding of how it might affect your outcome and verifying the output results.
Specify custom format
Informs how the data will be read by the method. Only relevant for conversions of strings to date, time, and dateTime. More on format elements.
Examples
Example 1: Basic usage
We can convert sales data currently stored as an integer type into float to use in another formula elsewhere that only accept the float type.
Starting data:
Input fields:
Source variable: The variable we want to convert is
score
so we select it here.New type: We want this to be a float, and since the score variable is currently an integer it is compatible with conversion to the float type, so we can select
Float
here.If invalid for type, set to null: Since there are no incompatible values in this variable it doesn't matter what we choose. We leave it unchecked to validate that we understand our data and to confirm that this transform will execute without failing.
Output data:
Our output table looks as it should when we look at the cells in the output table, and we can confirm the new type by clicking on the variable in the output table to check the type.
Example 2: Handling invalid conversions
Lets say instead that in our initial data, the sales
variable was stored as a string. Converting this to a float would be a bit trickier since the data entry wasn't as clean.
Starting data:
Input fields:
Source variable: Same as above example.
New type: Same as above example.
If invalid for type, set to null: Since this data has
35%
as a value, this can't be converted to a float. If we leave this box unchecked our transform will fail. Checking it will set that value tonull
.
Output data:
Note that while this retype was successful, our data might not be what we want, and in this case removes information.
Example 3: Parsing dates
Let's say we want to convert our year variable which is currently a string to a Date variable type, but that the starting format does not cleanly translate.
Starting data:
Input fields:
Source variable: The variable we want to convert is
date
so we select it here.New type: We want this to be a date so we can select
Date
here.If invalid for type, set to null: Since there are no incompatible values in this variable it doesn't matter what we choose. We leave it unchecked to validate that we understand our data and to confirm that this transform will execute without failing.
Specify custom format: Since our data does not fit the standard format (%Y-%m-%d, e.g.
2020-10-01
) we need to specify what format it is in. We can use the reference table at the bottom of this page to specify our format: MM/DD/YYYY
Output data:
Reference: Type conversion
Note you can see more information on Redivis variable types here.
String
Integer Float Boolean Date DateTime Time Geography
To integer: A hex string can be cast to an integer. For example, 0x123
to 291
or -0x123
to -291
.
To float: Returns x
as a floating point value, interpreting it as having the same form as a valid floating point literal.
To boolean: Returns TRUE
if x
is "true"
and FALSE
if x
is "false"
. All other values of x are invalid and throw an error instead of casting to a boolean. A string is case-insensitive when converting to a boolean.
To date, dateTime, or time: Uses the canonical format by default (see information below)
Integer
String
Float Boolean
To float: Returns a close but potentially not exact floating point value.
To boolean: Returns FALSE
if x
is 0
, TRUE
otherwise.
Float
String Integer
To integer: Returns the closest integer value. Halfway cases such as 1.5 or -0.5 round away from zero.
Boolean
String Integer
To string: Returns true
if x is true, false
otherwise.
To integer: Returns 1
if x is true, 0
otherwise.
Date
String
DateTime
String Date Time
To date, dateTime, or time: Uses the canonical format by default (see information below)
Time
String Date DateTime
To date, dateTime, or time: Uses the canonical format by default (see information below)
Geography
String
Reference: Cannonical representation
When retyping between a String type variable and a Date, DateTime, or Time type variable it is presumed that the data will be in the format below.
Date
(Four digit year)- (1 or 2 digit month)- (1 or 2 digit date)
2023-01-01 2023-1-1
Time
(1 or 2 digit hour): (1 or 2 digit minute): (1 or 2 digit second). (Up to 6 fractional seconds)
01:01:01.123456 6:2:9 22:19:3
DateTime
(Date specification)
(space
or T
or t
)
(Time specification)
2023-01-01 01:01:01.123456 2023-1-1T6:2:9
If it is not in this canonical format you can click the Specify custom format field and use Format elements (below) to indicate otherwise.
Reference: Format elements
Since Date, DateTime, and Time variable types contain structured information you can use format strings to indicate how you want different pieces of date and time information translated to and from string format when retyping.
For example when converting a date to a string you can choose whether it will become JAN 1 2023 or 2023-01-01. When converting from a string to a DateTime you'll need to outline how the information is structured in your data so it can be read in correctly.
You can use these elements in the Specify custom format field.
YYYY
Four (or more) digit year
Input: 2023-01-01 Output: 2023
Input: 23-01-01 Output: 0023
Input: 20000-01-01 Output: 20000
YYY
Last three digit year
Input: 2023-01-01 Output: 023
Input: 23-01-01 Output: 023
YY
Two digit year
Input: 2023-01-01 Output: 23
Input: 2-01-30 Output: 02
Y
Last one digit of year
Input: 2023-01-01 Output: 3
MM
Two digit month
Input: 2023-01-01 Output: 23
MON
Three character month: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC
Input: 2023-01-01 Output: JAN
MONTH
Month name
Input: 2023-01-01 Output: JANUARY
DDD
Three digit day of the year
Input: 2023-01-01 Output: 001
DD
Two digit day of the month
Input: 2023-01-01 Output: 01
D
Day of the week (1-7) with Sunday being 1
Input: 2023-01-01 Output: 1
DAY
Day of the week. Spaces are padded on the right side to make the output size exactly 9.
Input: 2023-01-01 Output: SUNDAY
DY
Three character day: MON, TUE, WED, THU, FRI, SAT, SUN
Input: 2023-01-01 Output: SUN
HH
Two digit hour of the day (valid values from 00 to 12)
Input: 20:10:15 Output: 10
HH12
Hour of the day (valid values from 00 to 12)
Input: 20:10:15 Output: 10
HH24
Two digit hour (valid values from 00 to 24)
Input: 20:10:15 Output: 20
MI
Two digit minute
Input: 20:10:15 Output: 10
SS
Two digit second
Input: 20:10:15 Output: 15
SSSSS
Five digit second
Input: 20:10:15 Output: 15234
FFn
(Replace n
with a value from 1 to 9. For example, FF5
.)
Fractional part of the second, n
digits long. The fractional part of the second is rounded to fit the size of the output.
FF1
Input: 20:10:15 Output: 1
FF2
Input: 20:10:15
Output: 15
FF3
Input: 20:10:15
Output: 015
A.M.
or AM
P.M.
or PM
A.M. (or AM) if the time is less than 12, otherwise P.M. (or PM). The letter case of the output is determined by the first letter case of the format element.
AM
Input: 09:10:15 Output: AM
A.M.
Input: 20:10:15
Output: P.M.
PM
Input: 09:10:15
Output: AM
PM
Input: 20:10:15
Output: PM
TZH
Hour offset for a time zone. This includes the +/-
sign and 2-digit hour.
Input: 2008-12-25 05:30:00+00 Output: −08
TZM
Minute offset for a time zone. This includes only the 2-digit minute.
Input: 2008-12-25 05:30:00+00 Output: 00
A space
Input: Output:
-./,'l;:
Same character in the output
Input: -./,'l;: Output: -./,'l;:
"Text"
Output is the value within the double quotes. To preserve a double quote or backslash character, use the \"
or \\
escape sequence. Other escape sequences are not supported.
Input: "abc" Output: abc
Input: "a\"b\\c" Output: a"b\c
These format elements will only work in the Retype step (or CAST
method in SQL). Format elements for using other date parsing or formatting methods are detailed elsewhere and might be useful if your data is not coercible using the format elements described here.
Last updated