Step: Retype

Overview

The Retype step converts a variable of a given type to another type.

Example starting data

/*---------+--------*
 | student | score  |
 +---------+--------+
 | jane    | 83     |
 | neal    | 35     |
 | sam     | 74     |
 | pat     | 62     |
 *---------+--------*/

Example output data:

Retype score from integer to float.

/*---------+--------*
 | student | score  |
 +---------+--------+
 | jane    | 83.0   |
 | neal    | 35.0   |
 | sam     | 74.0   |
 | pat     | 62.0   |
 *---------+--------*/

Step structure

  • There will be at least one retype block where you will define a variable and a new type.

Field definitions

FieldDefinition

Source variable

The variable that you want to retype. Note that you can see its current type by locating it in the variable selector at the bottom of the page, or hovering over the variable in this selection menu to see a tooltip with more information.

New type

The type that the Source variable will be converted to. Can be any of the supported variable types on Redivis. More on conversion rules.

If invalid for type, set to null

Whether failing type conversions should be converted to a null value. By default, failed type conversions will throw an error. Note that this might significantly change the content of your data and we suggest using this option with full understanding of how it might affect your outcome and verifying the output results.

Specify custom format

Informs how the data will be read by the method. Only relevant for conversions of strings to date, time, and dateTime. More on format elements.

Examples

Example 1: Basic usage

We can convert sales data currently stored as an integer type into float to use in another formula elsewhere that only accept the float type.

Starting data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83    | jane    | 2020-04-01 |
 | quiz    | 35    | pat     | 2020-04-01 |
 | quiz    | 89    | sam     | 2020-04-01 |
 | midterm | 74    | jane    | 2020-05-01 |
 | midterm | 62    | pat     | 2020-05-01 |
 | midterm | 93    | sam     | 2020-05-01 |
 | final   | 77    | jane    | 2020-06-01 |
 | final   | 59    | pat     | 2020-06-01 |
 | final   | 92    | sam     | 2020-06-01 |
 *---------+-------+---------+------------*/

Input fields:

  • Source variable: The variable we want to convert is score so we select it here.

  • New type: We want this to be a float, and since the score variable is currently an integer it is compatible with conversion to the float type, so we can select Float here.

  • If invalid for type, set to null: Since there are no incompatible values in this variable it doesn't matter what we choose. We leave it unchecked to validate that we understand our data and to confirm that this transform will execute without failing.

Output data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83.0  | jane    | 2020-04-01 |
 | quiz    | 35.0  | pat     | 2020-04-01 |
 | quiz    | 89.0  | sam     | 2020-04-01 |
 | midterm | 74.0  | jane    | 2020-05-01 |
 | midterm | 62.0  | pat     | 2020-05-01 |
 | midterm | 93.0  | sam     | 2020-05-01 |
 | final   | 77.0  | jane    | 2020-06-01 |
 | final   | 59.0  | pat     | 2020-06-01 |
 | final   | 92.0  | sam     | 2020-06-01 |
 *---------+-------+---------+------------*/

Our output table looks as it should when we look at the cells in the output table, and we can confirm the new type by clicking on the variable in the output table to check the type.

Example 2: Handling invalid conversions

Lets say instead that in our initial data, the sales variable was stored as a string. Converting this to a float would be a bit trickier since the data entry wasn't as clean.

Starting data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83    | jane    | 2020-04-01 |
 | quiz    | 35%   | pat     | 2020-04-01 |
 | quiz    | 89    | sam     | 2020-04-01 |
 | midterm | 74    | jane    | 2020-05-01 |
 | midterm | 62    | pat     | 2020-05-01 |
 | midterm | 93    | sam     | 2020-05-01 |
 | final   | 77    | jane    | 2020-06-01 |
 | final   | 59    | pat     | 2020-06-01 |
 | final   | 92    | sam     | 2020-06-01 |
 *---------+-------+---------+------------*/

Input fields:

  • Source variable: Same as above example.

  • New type: Same as above example.

  • If invalid for type, set to null: Since this data has 35% as a value, this can't be converted to a float. If we leave this box unchecked our transform will fail. Checking it will set that value to null.

Output data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83.0  | jane    | 2020-04-01 |
 | quiz    | null  | pat     | 2020-04-01 |
 | quiz    | 89.0  | sam     | 2020-04-01 |
 | midterm | 74.0  | jane    | 2020-05-01 |
 | midterm | 62.0  | pat     | 2020-05-01 |
 | midterm | 93.0  | sam     | 2020-05-01 |
 | final   | 77.0  | jane    | 2020-06-01 |
 | final   | 59.0  | pat     | 2020-06-01 |
 | final   | 92.0  | sam     | 2020-06-01 |
 *---------+-------+---------+------------*/

Note that while this retype was successful, our data might not be what we want, and in this case removes information.

Example 3: Parsing dates

Let's say we want to convert our year variable which is currently a string to a Date variable type, but that the starting format does not cleanly translate.

Starting data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83    | jane    | 04/01/2020 |
 | quiz    | 35    | pat     | 04/01/2020 |
 | quiz    | 89    | sam     | 04/01/2020 |
 | midterm | 74    | jane    | 05/01/2020 |
 | midterm | 62    | pat     | 05/01/2020 |
 | midterm | 93    | sam     | 05/01/2020 |
 | final   | 77    | jane    | 06/01/2020 |
 | final   | 59    | pat     | 06/01/2020 |
 | final   | 92    | sam     | 06/01/2020 |
 *---------+-------+---------+------------*/

Input fields:

  • Source variable: The variable we want to convert is date so we select it here.

  • New type: We want this to be a date so we can select Date here.

  • If invalid for type, set to null: Since there are no incompatible values in this variable it doesn't matter what we choose. We leave it unchecked to validate that we understand our data and to confirm that this transform will execute without failing.

  • Specify custom format: Since our data does not fit the standard format (%Y-%m-%d, e.g. 2020-10-01) we need to specify what format it is in. We can use the reference table at the bottom of this page to specify our format: MM/DD/YYYY

Output data:

/*---------+-------+---------+------------*
 | test    | score | student | date       |
 +---------+-------+---------+------------+
 | quiz    | 83    | jane    | 2020-04-01 |
 | quiz    | 35    | pat     | 2020-04-01 |
 | quiz    | 89    | sam     | 2020-04-01 |
 | midterm | 74    | jane    | 2020-05-01 |
 | midterm | 62    | pat     | 2020-05-01 |
 | midterm | 93    | sam     | 2020-05-01 |
 | final   | 77    | jane    | 2020-06-01 |
 | final   | 59    | pat     | 2020-06-01 |
 | final   | 92    | sam     | 2020-06-01 |
 *---------+-------+---------+------------*/

Reference: Type conversion

Note you can see more information on Redivis variable types here.

Starting typePossible destinationsNotes

String

Integer Float Boolean Date DateTime Time Geography

To integer: A hex string can be cast to an integer. For example, 0x123 to 291 or -0x123 to -291. To float: Returns x as a floating point value, interpreting it as having the same form as a valid floating point literal. To boolean: Returns TRUE if x is "true" and FALSE if x is "false". All other values of x are invalid and throw an error instead of casting to a boolean. A string is case-insensitive when converting to a boolean. To date, dateTime, or time: Uses the canonical format by default (see information below)

Integer

String

Float Boolean

To float: Returns a close but potentially not exact floating point value.

To boolean: Returns FALSE if x is 0, TRUE otherwise.

Float

String Integer

To integer: Returns the closest integer value. Halfway cases such as 1.5 or -0.5 round away from zero.

Boolean

String Integer

To string: Returns true if x is true, false otherwise. To integer: Returns 1 if x is true, 0 otherwise.

Date

String

DateTime

String Date Time

To date, dateTime, or time: Uses the canonical format by default (see information below)

Time

String Date DateTime

To date, dateTime, or time: Uses the canonical format by default (see information below)

Geography

String

Reference: Cannonical representation

When retyping between a String type variable and a Date, DateTime, or Time type variable it is presumed that the data will be in the format below.

LayoutExample

Date

(Four digit year)- (1 or 2 digit month)- (1 or 2 digit date)

2023-01-01 2023-1-1

Time

(1 or 2 digit hour): (1 or 2 digit minute): (1 or 2 digit second). (Up to 6 fractional seconds)

01:01:01.123456 6:2:9 22:19:3

DateTime

(Date specification) (space or T or t) (Time specification)

2023-01-01 01:01:01.123456 2023-1-1T6:2:9

If it is not in this canonical format you can click the Specify custom format field and use Format elements (below) to indicate otherwise.

Reference: Format elements

Since Date, DateTime, and Time variable types contain structured information you can use format strings to indicate how you want different pieces of date and time information translated to and from string format when retyping.

For example when converting a date to a string you can choose whether it will become JAN 1 2023 or 2023-01-01. When converting from a string to a DateTime you'll need to outline how the information is structured in your data so it can be read in correctly.

You can use these elements in the Specify custom format field.

ElementReturn

YYYY

Four (or more) digit year

Input: 2023-01-01 Output: 2023

Input: 23-01-01 Output: 0023

Input: 20000-01-01 Output: 20000

YYY

Last three digit year

Input: 2023-01-01 Output: 023

Input: 23-01-01 Output: 023

YY

Two digit year

Input: 2023-01-01 Output: 23

Input: 2-01-30 Output: 02

Y

Last one digit of year

Input: 2023-01-01 Output: 3

MM

Two digit month

Input: 2023-01-01 Output: 23

MON

Three character month: JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC

Input: 2023-01-01 Output: JAN

MONTH

Month name

Input: 2023-01-01 Output: JANUARY

DDD

Three digit day of the year

Input: 2023-01-01 Output: 001

DD

Two digit day of the month

Input: 2023-01-01 Output: 01

D

Day of the week (1-7) with Sunday being 1

Input: 2023-01-01 Output: 1

DAY

Day of the week. Spaces are padded on the right side to make the output size exactly 9.

Input: 2023-01-01 Output: SUNDAY

DY

Three character day: MON, TUE, WED, THU, FRI, SAT, SUN

Input: 2023-01-01 Output: SUN

HH

Two digit hour of the day (valid values from 00 to 12)

Input: 20:10:15 Output: 10

HH12

Hour of the day (valid values from 00 to 12)

Input: 20:10:15 Output: 10

HH24

Two digit hour (valid values from 00 to 24)

Input: 20:10:15 Output: 20

MI

Two digit minute

Input: 20:10:15 Output: 10

SS

Two digit second

Input: 20:10:15 Output: 15

SSSSS

Five digit second

Input: 20:10:15 Output: 15234

FFn

(Replace n with a value from 1 to 9. For example, FF5.)

Fractional part of the second, n digits long. The fractional part of the second is rounded to fit the size of the output.

FF1

Input: 20:10:15 Output: 1

FF2 Input: 20:10:15 Output: 15

FF3 Input: 20:10:15 Output: 015

A.M. or AM P.M. or PM

A.M. (or AM) if the time is less than 12, otherwise P.M. (or PM). The letter case of the output is determined by the first letter case of the format element.

AM

Input: 09:10:15 Output: AM

A.M. Input: 20:10:15 Output: P.M.

PM Input: 09:10:15 Output: AM PM Input: 20:10:15 Output: PM

TZH

Hour offset for a time zone. This includes the +/- sign and 2-digit hour.

Input: 2008-12-25 05:30:00+00 Output: −08

TZM

Minute offset for a time zone. This includes only the 2-digit minute.

Input: 2008-12-25 05:30:00+00 Output: 00

A space

Input: Output:

-./,'l;:

Same character in the output

Input: -./,'l;: Output: -./,'l;:

"Text"

Output is the value within the double quotes. To preserve a double quote or backslash character, use the \" or \\ escape sequence. Other escape sequences are not supported.

Input: "abc" Output: abc

Input: "a\"b\\c" Output: a"b\c

These format elements will only work in the Retype step (or CAST method in SQL). Format elements for using other date parsing or formatting methods are detailed elsewhere and might be useful if your data is not coercible using the format elements described here.

Last updated