Standard Transformer Expressions
Standard Transformer provides several built-in functions for common ML use cases. These functions are accessible from within any expression context.
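Geospatial: Geohash, S2ID, HaversineDistance, HaversineDistanceWithUnit, PolarAngle, GeohashDistance, GeohashAllNeighbors, GeohashNeighborForDirection
JSON: JsonExtract
Statistics: CumulativeValue
Time: Now, DayOfWeek, IsWeekend, FormatTimestamp, ParseTimestamp, ParseDateTime
Series: Get, IsIn, StdDev, Mean, Median, Max, MaxStr, Min, MinStr, Quantile, Sum, Flatten, Unique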
Geospatial
Geohash
Geohash calculates the geohash of a latitude and longitude with the given precision.
Input
Latitude
Latitude of the object, in form of JSONPath, array, or variable.
Longitude
Longitude of the object, in form of JSONPath, array, or variable.
Precision
Character precision in integer.
Output
Geohash of location with the given precision.
Example
Input:
{
"latitude": 1.0,
"longitude": 2.0
}
Standard Transformer Config:
variables:
- name: geohash
expression: Geohash("$.latitude", "$.longitude", 12)
Output: `"s01mtw037ms0"`
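To illustrate what the precision argument controls, here is a minimal Python sketch of the standard geohash encoding (alternating longitude and latitude bisection, five bits per base-32 character). The function name `geohash_encode` is hypothetical and this is not the Standard Transformer's own implementation; it only shows how each extra character narrows the cell.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(latitude: float, longitude: float, precision: int) -> str:
    """Encode a point by repeatedly halving the longitude/latitude ranges.

    Hypothetical helper for illustration only.
    """
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, bit_count = 0, 0
    use_longitude = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, longitude) if use_longitude else (lat_range, latitude)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1  # point is in the upper half
            rng[0] = mid
        else:
            bits <<= 1  # point is in the lower half
            rng[1] = mid
        use_longitude = not use_longitude
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


print(geohash_encode(1.0, 2.0, 3))  # s01
```

A lower precision yields a prefix of the longer geohash, which is why the 3-character result matches the start of the 12-character output above.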
S2ID
S2ID calculates the S2 cell ID of a latitude and longitude at the given level.
Input
Latitude
Latitude of the object, in form of JSONPath, array, or variable.
Longitude
Longitude of the object, in form of JSONPath, array, or variable.
Level
S2ID level in integer.
Output
S2 cell ID of the location at the given level.
Example
Input:
{
"latitude": 1.0,
"longitude": 2.0
}
Standard Transformer Config:
variables:
- name: s2id
expression: S2ID("$.latitude", "$.longitude", 12)
Output: `"1154732743855177728"`
HaversineDistance
HaversineDistance calculates Haversine distance of two points (given by their latitude and longitude).
Input
Latitude 1
Latitude of the first point, in form of JSONPath, array, or variable.
Longitude 1
Longitude of the first point, in form of JSONPath, array, or variable.
Latitude 2
Latitude of the second point, in form of JSONPath, array, or variable.
Longitude 2
Longitude of the second point, in form of JSONPath, array, or variable.
Output
The haversine distance between the 2 points, in kilometers.
Example
Input:
{
"pickup": {
"latitude": 1.0,
"longitude": 2.0
},
"dropoff": {
"latitude": 1.2,
"longitude": 2.2
}
}
Standard Transformer Config:
variables:
- name: haversine_distance
expression: HaversineDistance("$.pickup.latitude", "$.pickup.longitude", "$.dropoff.latitude", "$.dropoff.longitude")
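For reference, the haversine formula itself can be sketched in a few lines of Python. This is an illustration rather than the Standard Transformer's code: the function name `haversine_distance_km` is hypothetical, and the Earth radius of 6371 km is an assumption, so results may differ slightly from the built-in function.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius; the built-in may use another value


def haversine_distance_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    # Haversine of the central angle between the two points.
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Pickup and dropoff coordinates from the example input above.
print(haversine_distance_km(1.0, 2.0, 1.2, 2.2))
```

Under these assumptions the pickup and dropoff points in the example are roughly 31 km apart.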
HaversineDistanceWithUnit
HaversineDistanceWithUnit calculates the Haversine distance between two points (given by their latitude and longitude) in the given distance unit.
Input
Latitude 1
Latitude of the first point, in form of JSONPath, array, or variable.
Longitude 1
Longitude of the first point, in form of JSONPath, array, or variable.
Latitude 2
Latitude of the second point, in form of JSONPath, array, or variable.
Longitude 2
Longitude of the second point, in form of JSONPath, array, or variable.
Distance Unit
Unit of distance measurement. Supported units: km and m.
Output
The haversine distance between the 2 points, in the given unit.
Example
Input:
{
"pickup": {
"latitude": 1.0,
"longitude": 2.0
},
"dropoff": {
"latitude": 1.2,
"longitude": 2.2
}
}
Standard Transformer Config:
variables:
- name: haversine_distance
expression: HaversineDistanceWithUnit("$.pickup.latitude", "$.pickup.longitude", "$.dropoff.latitude", "$.dropoff.longitude", "m")
PolarAngle
PolarAngle calculates the polar angle between two points (given by their latitude and longitude), in radians.
Input
Latitude 1
Latitude of the first point, in form of JSONPath, array, or variable.
Longitude 1
Longitude of the first point, in form of JSONPath, array, or variable.
Latitude 2
Latitude of the second point, in form of JSONPath, array, or variable.
Longitude 2
Longitude of the second point, in form of JSONPath, array, or variable.
Output
The polar angle between the 2 points, in radians.
Example
Input:
{
"pickup": {
"latitude": 1.0,
"longitude": 2.0
},
"dropoff": {
"latitude": 1.2,
"longitude": 2.2
}
}
Standard Transformer Config:
variables:
- name: polar_angle
expression: PolarAngle("$.pickup.latitude", "$.pickup.longitude", "$.dropoff.latitude", "$.dropoff.longitude")
GeohashDistance
GeohashDistance calculates the haversine distance between two geohashes. It converts each geohash into its center point (latitude, longitude) and calculates the haversine distance between those points.
Input
Geohash 1
First geohash, in form of JSONPath or array.
Geohash 2
Second geohash, in form of JSONPath or array.
Distance Unit
Unit of distance measurement. Supported units: km and m.
Output
Haversine distance between the center points of the two geohashes.
Example
Input:
{
"pickup_geohash": "qqgggnwxx",
"dropoff_geohash": "qqgggnweb"
}
Standard Transformer Config:
variables:
- name: geohash_distance
expression: GeohashDistance("$.pickup_geohash", "$.dropoff_geohash", "m")
GeohashAllNeighbors
GeohashAllNeighbors finds the neighbors of a geohash in all directions.
Input
Geohash
Geohash, in form of JSONPath or array.
Output
List of neighbors of the given geohash.
Example
Input:
{
"pickup_geohash": "qqgggnwxx",
"dropoff_geohash": "qqgggnweb"
}
Standard Transformer Config:
variables:
- name: geohash_all_neighbors
expression: GeohashAllNeighbors("$.pickup_geohash")
GeohashNeighborForDirection
GeohashNeighborForDirection finds the neighbor of a geohash in the given direction.
Input
Geohash
Geohash, in form of JSONPath or array.
Direction
Direction of the neighbor relative to the geohash. Accepted directions: north, northeast, northwest, south, southeast, southwest, west, east.
Output
Neighbor of the given geohash.
Example
Input:
{
"pickup_geohash": "qqgggnwxx",
"dropoff_geohash": "qqgggnweb"
}
Standard Transformer Config:
variables:
- name: geohash_neighbor
expression: GeohashNeighborForDirection("$.pickup_geohash", "north")
JSON
JsonExtract
Given a JSON string as a value, JsonExtract extracts a JSON value from within that string.
Input
Parent's JSONPath
Path to the JSON key whose value is the JSON string to extract from.
Nested's JSONPath
Path to the JSON key inside the JSON string found at Parent's JSONPath.
Output
JSON value within the JSON string pointed to by the first JSONPath argument.
Example
Input:
{
"details": "{\"merchant_id\": 9001}"
}
Standard Transformer Config:
variables:
- name: merchant_id
valueType: STRING
expression: JsonExtract("$.details", "$.merchant_id")
Output: `"9001"`
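The two-step lookup can be sketched in Python as follows. This is only an illustration: the helper `json_extract` is hypothetical and uses plain dictionary keys in place of JSONPath expressions, and the final `str()` call stands in for the `valueType: STRING` conversion.

```python
import json

# The request body from the example: "details" holds a JSON document as a string.
payload = {"details": "{\"merchant_id\": 9001}"}


def json_extract(document: dict, parent_key: str, nested_key: str):
    """Parse the JSON string stored under parent_key, then read nested_key from it."""
    nested = json.loads(document[parent_key])  # step 1: decode the embedded JSON string
    return nested[nested_key]                  # step 2: look up the nested key


merchant_id = str(json_extract(payload, "details", "merchant_id"))
print(merchant_id)  # 9001
```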
Statistics
CumulativeValue
CumulativeValue accumulates values: each output element is the sum of the input element at that index and all of its predecessors. E.g., [1, 2, 3] => [1, 1+2, 1+2+3] => [1, 3, 6].
Input
Values
Array of numbers.
Output
Array of cumulative values.
Example
Input:
{
"fares": [10000, 20000, 50000]
}
Standard Transformer Config:
variables:
- name: cumulative_fares
expression: CumulativeValue($.fares)
Output: `[10000, 30000, 80000]`
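The same running sum can be reproduced in Python with `itertools.accumulate`. The wrapper name `cumulative_value` is hypothetical and only mirrors the behaviour described above.

```python
from itertools import accumulate


def cumulative_value(values):
    """Return the running sum: element i is the sum of values[0..i]."""
    return list(accumulate(values))


print(cumulative_value([10000, 20000, 50000]))  # [10000, 30000, 80000]
```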
Time
Now
Now returns the current local timestamp.
Input
None
Output
Current local timestamp.
Example
Standard Transformer Config:
variables:
- name: currentTime
expression: Now()
DayOfWeek
DayOfWeek returns the number representing the day of the week for the given timestamp and timezone.
SUNDAY(0), MONDAY(1), TUESDAY(2), WEDNESDAY(3), THURSDAY(4), FRIDAY(5), SATURDAY(6).
Input
Timestamp
Unix timestamp value in integer or string format. It accepts JSONPath, arrays, or variable.
Timezone
Timezone name as a string, for example Asia/Jakarta. It accepts JSONPath, arrays, or variable.
Output
Day number.
Example
Input:
{
"timestamp": "1637605459"
}
Standard Transformer Config:
variables:
- name: day_of_week
expression: DayOfWeek("$.timestamp", "Asia/Jakarta")
Output: `2`
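The timezone matters because the same Unix timestamp can fall on different days in different zones. The Python sketch below reproduces the numbering above (Sunday is 0). The function name `day_of_week` is hypothetical, and Asia/Jakarta is modelled as a fixed UTC+7 offset to keep the example self-contained; `zoneinfo.ZoneInfo("Asia/Jakarta")` would be the name-based equivalent.

```python
from datetime import datetime, timedelta, timezone

# Asia/Jakarta is UTC+7 with no daylight saving time, so a fixed offset is
# used here instead of a timezone database lookup.
JAKARTA = timezone(timedelta(hours=7))


def day_of_week(timestamp, tz) -> int:
    """Day number with Sunday=0 ... Saturday=6 for a Unix timestamp in zone tz."""
    local = datetime.fromtimestamp(int(timestamp), tz)
    return local.isoweekday() % 7  # isoweekday(): Monday=1 ... Sunday=7


print(day_of_week("1637605459", JAKARTA))       # 2 (Tuesday in Jakarta)
print(day_of_week("1637605459", timezone.utc))  # 1 (still Monday in UTC)
```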
IsWeekend
IsWeekend returns 1 if the given timestamp falls on a weekend (Saturday or Sunday), otherwise 0.
Input
Timestamp
Unix timestamp value in integer or string format. It accepts JSONPath, arrays, or variable.
Timezone
Timezone name as a string, for example Asia/Jakarta. It accepts JSONPath, arrays, or variable.
Output
1 if weekend, 0 if not
Example
Input:
{
"timestamp": "1637445044",
"timezone": "Asia/Jakarta"
}
Standard Transformer Config:
variables:
- name: is_weekend
expression: IsWeekend("$.timestamp", "$.timezone")
Output: `1`
FormatTimestamp
FormatTimestamp converts a timestamp in the given timezone into a formatted date time string.
Input
Timestamp
Unix timestamp value in integer or string format. It accepts JSONPath, arrays, or variable.
Timezone
Timezone name as a string, for example Asia/Jakarta. It accepts JSONPath, arrays, or variable.
Format
Target date time format. It follows the Golang date time format (https://pkg.go.dev/time#pkg-constants).
Output
Date time.
Examples
Input:
{
"timestamp": "1637691859"
}
Standard Transformer Config:
variables:
- name: datetime
expression: FormatTimestamp("$.timestamp", "Asia/Jakarta", "2006-01-02")
Output: `"2021-11-24"`
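Golang layouts are written as a rendering of a fixed reference time (2006-01-02 15:04:05), not with `%`-style directives. The Python sketch below shows the idea for the two layouts used on this page. The `GO_TO_STRFTIME` table and the `format_timestamp` helper are hypothetical, cover only those two layouts, and model Asia/Jakarta as a fixed UTC+7 offset.

```python
from datetime import datetime, timedelta, timezone

JAKARTA = timezone(timedelta(hours=7))  # Asia/Jakarta as a fixed UTC+7 offset

# Hand-written mapping from Go reference-time layouts to strftime patterns.
GO_TO_STRFTIME = {
    "2006-01-02": "%Y-%m-%d",
    "2006-01-02 15:04:05": "%Y-%m-%d %H:%M:%S",
}


def format_timestamp(timestamp, tz, go_layout: str) -> str:
    """Format a Unix timestamp in zone tz using one of the layouts above."""
    local = datetime.fromtimestamp(int(timestamp), tz)
    return local.strftime(GO_TO_STRFTIME[go_layout])


print(format_timestamp("1637691859", JAKARTA, "2006-01-02"))           # 2021-11-24
print(format_timestamp("1637691859", JAKARTA, "2006-01-02 15:04:05"))  # 2021-11-24 01:24:19
```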
ParseTimestamp
ParseTimestamp converts a timestamp in integer or string format to a time value.
Input
Timestamp
Unix timestamp value in integer or string format. It accepts JSONPath, arrays, or variable.
Output
Parsed timestamp.
Examples
Input:
{
"timestamp": "1619541221"
}
Standard Transformer Config:
variables:
- name: parsed_timestamp
expression: ParseTimestamp("$.timestamp")
Output: `"2021-04-27 16:33:41 +0000 UTC"`
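As a cross-check of the example, the same conversion can be done in Python. The helper name `parse_timestamp` is hypothetical; it accepts the timestamp as either an integer or a string and interprets it in UTC, matching the `+0000 UTC` suffix in the output above.

```python
from datetime import datetime, timezone


def parse_timestamp(timestamp) -> datetime:
    """Convert a Unix timestamp (int or numeric string) to an aware UTC datetime."""
    return datetime.fromtimestamp(int(timestamp), timezone.utc)


parsed = parse_timestamp("1619541221")
print(parsed.isoformat())  # 2021-04-27T16:33:41+00:00
```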
ParseDateTime
ParseDateTime converts a datetime string in the specified format layout (e.g. RFC3339) into a time value.
Input
Date time
Date time value in string format. It accepts JSONPath, arrays, or variable.
Timezone
Timezone name as a string, for example Asia/Jakarta. It accepts JSONPath, arrays, or variable.
Format
Date time input format. It follows the Golang date time format (https://pkg.go.dev/time#pkg-constants).
Output
Parsed date time.
Examples
Input:
{
"datetime": "2021-11-30 15:00:00",
"location": "Asia/Jayapura"
}
Standard Transformer Config:
variables:
- name: parsed_datetime
expression: ParseDateTime("$.datetime", "$.location", "2006-01-02 15:04:05")
Output: `"2021-11-30 15:00:00 +0900 WIT"`
Series Expression
Series expressions are functions that can be invoked on a series (column) of a table.
Get
Get retrieves a row in the series at the given index.
Input
Index
Row position, starting from 0
Output
Single series row
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |

To retrieve the row at index 2 of the series avg_order_1_day:
Standard Transformer Config:
variables:
- name: total_order_1_day
expression: yourTableName.Col("avg_order_1_day").Get(2)
Output: 4000
IsIn
IsIn checks whether the value in each row is part of the given array. The result is a new series of boolean type.
Input
Comparator
Array of values
Output
New series of boolean type with the same dimension as the original series
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: bool_series
expression: yourTableName.Col("avg_order_1_day").IsIn([2000, 3000])
Output:
true
true
false
StdDev
StdDev calculates the standard deviation of the series values. The output is a single value.
Input
No Input
Output
Single value with float type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: std_dev
expression: yourTableName.Col("avg_cancellation_rate_30_day").StdDev()
Output: 0.0068475461947247
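The documented output can be reproduced in Python with the population standard deviation (dividing by n rather than n - 1). This is a cross-check of the example values only, not a statement about the internal implementation.

```python
import statistics

# Values of the avg_cancellation_rate_30_day series from the table above.
avg_cancellation_rate_30_day = [0.02, 0.005, 0.006]

# pstdev divides by n (population); statistics.stdev would divide by n - 1.
std_dev = statistics.pstdev(avg_cancellation_rate_30_day)
print(round(std_dev, 10))  # 0.0068475462
```

The sample standard deviation of the same values is about 0.0084, so it does not match the output shown above.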
Mean
Mean calculates the mean of the series values. The output is a single value.
Input
No Input
Output
Single value with float type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: mean
expression: yourTableName.Col("avg_order_1_day").Mean()
Output: 3000
Median
Median calculates the median of the series values. The output is a single value.
Input
No Input
Output
Single value with float type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: median
expression: yourTableName.Col("avg_order_1_day").Median()
Output: 3000
Max
Max finds the maximum of the series values. The output is a single value.
Input
No Input
Output
Single value with float type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: max
expression: yourTableName.Col("avg_order_1_day").Max()
Output: 4000
MaxStr
MaxStr finds the maximum of the series values. The output is a single value of string type.
Input
No Input
Output
Single value with string type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: max_str
expression: yourTableName.Col("avg_order_1_day").MaxStr()
Output: "4000"
Min
Min finds the minimum of the series values. The output is a single value of float type.
Input
No Input
Output
Single value with float type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: min
expression: yourTableName.Col("avg_order_1_day").Min()
Output: 2000
MinStr
MinStr finds the minimum of the series values. The output is a single value of string type.
Input
No Input
Output
Single value with string type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: min_str
expression: yourTableName.Col("avg_order_1_day").MinStr()
Output: "2000"
Quantile
Quantile returns the sample x such that x is greater than or equal to the fraction p of the samples.
Input
Fraction p, as a float
Output
Single value with float type
Examples
Suppose users have table yourTableName

| rank |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
| 10 |
Standard Transformer Config:
variables:
- name: quantile_0.9
expression: yourTableName.Col("rank").Quantile(0.9)
Output: 9
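One way to read the definition is as an empirical quantile: sort the samples and return the first one whose rank covers at least the fraction p of all samples. The Python sketch below follows that reading and reproduces the example. The function name `empirical_quantile` is hypothetical, and the interpolation rule of the actual implementation is an assumption here.

```python
def empirical_quantile(values, p: float):
    """Smallest sample x such that at least a fraction p of samples are <= x.

    Assumed reading of the definition above; no interpolation between samples.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("quantile of an empty series is undefined")
    n = len(ordered)
    for rank, x in enumerate(ordered, start=1):
        if rank / n >= p:  # this sample and those before it cover fraction p
            return x
    return ordered[-1]


print(empirical_quantile(range(1, 11), 0.9))  # 9
```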
Sum
Sum adds up all the values in the series. The output is a single value of float type.
Input
No Input
Output
Single value with float type
Examples
Suppose users have table yourTableName

| | avg_order_1_day | avg_cancellation_rate_30_day |
|---|---|---|
| 1 | 2000 | 0.02 |
| 2 | 3000 | 0.005 |
| 3 | 4000 | 0.006 |
Standard Transformer Config:
variables:
- name: sum
expression: yourTableName.Col("avg_order_1_day").Sum()
Output: 9000
Flatten
Flatten flattens all values in a series. It is suited to series of list type; for a non-list series the result is the same as the original series.
Input
No Input
Output
New series whose values are flattened
Examples
Suppose users have table yourTableName

| | nearby_restaurant_ids |
|---|---|
| 1 | [2, 3, 4] |
| 2 | [4, 5, 6] |
| 3 | [7, 8, 9] |
Standard Transformer Config:
variables:
- name: restaurant_ids
expression: yourTableName.Col("nearby_restaurant_ids").Flatten()
Output:
2
3
4
4
5
6
7
8
9
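The behaviour in this example, including the pass-through for non-list values, can be sketched in Python as follows. The helper name `flatten` is hypothetical, and flattening only one level of nesting is an assumption.

```python
def flatten(series):
    """Concatenate list-valued rows into one flat list; keep scalar rows as-is."""
    flat = []
    for value in series:
        if isinstance(value, list):
            flat.extend(value)  # unpack one level of nesting
        else:
            flat.append(value)  # non-list rows are unchanged
    return flat


print(flatten([[2, 3, 4], [4, 5, 6], [7, 8, 9]]))  # [2, 3, 4, 4, 5, 6, 7, 8, 9]
print(flatten([2000, 3000, 4000]))                 # [2000, 3000, 4000]
```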
Unique
Unique returns all values of a series without duplicates.
Input
No Input
Output
New series that has a unique value in each row
Examples
Suppose users have table yourTableName

| restaurant_id | rating |
|---|---|
| 1 | [2, 2, 4] |
| 2 | [4, 5, 4] |
| 1 | [2, 2, 4] |
Standard Transformer Config:
variables:
- name: unique_restaurant_id
expression: yourTableName.Col("restaurant_id").Unique()
Output:
1
2
variables:
- name: rating
expression: yourTableName.Col("rating").Unique()
Output:
[2, 2, 4]
[4, 5, 4]
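As the second output shows, duplicates are removed across rows, not inside a list-valued row: `[2, 2, 4]` keeps its repeated 2. A minimal Python sketch of that behaviour follows. The helper name `unique` is hypothetical, and keeping the first occurrence in original order is an assumption consistent with the outputs above.

```python
def unique(series):
    """Drop repeated rows, keeping the first occurrence of each value."""
    seen = set()
    result = []
    for value in series:
        # Lists are not hashable, so compare list-valued rows as tuples.
        key = tuple(value) if isinstance(value, list) else value
        if key not in seen:
            seen.add(key)
            result.append(value)
    return result


print(unique([1, 2, 1]))                           # [1, 2]
print(unique([[2, 2, 4], [4, 5, 4], [2, 2, 4]]))  # [[2, 2, 4], [4, 5, 4]]
```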