FeatureAttributes#
Overview#
The mapping of attributes for a single feature
- type:
object
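As a rough illustration, the mapping for one feature can be pictured as a plain Python dictionary keyed by the property names listed on this page. The feature name `heading`, the chosen values, and the outer `features` grouping are hypothetical. The value `continuous` for `type` is assumed from the feature kinds mentioned in the property descriptions.

```python
# Hypothetical attributes for a single cyclic feature.
# Keys are property names from this page; values are illustrative only.
heading_attributes = {
    "type": "continuous",   # "type" is the only property marked as required
    "cycle_length": 360,    # degrees wrap around at 360
    "decimal_places": 1,    # round output to one decimal place
}

# Grouping attribute mappings by feature name is an assumption about usage,
# not something this page specifies.
features = {"heading": heading_attributes}
```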
Properties#
auto_derive_on_train#
bounds#
- ref:
code_features#
A list of features needed to derive code
- type:
list
- values:
string
cycle_length#
Cyclic features are set by specifying a cycle_length value in the feature attributes. cycle_length requires a single value, which is the upper bound of the difference for the cycle range. For example, if cycle_length is 360, then the values 1 and 359 have a difference of 2. Cyclic features have no restrictions in the input dataset; however, they are output on a scale from 0 to cycle_length. To constrain the output to a different range, modify the min and max values of the bounds feature attribute.
Examples:
degrees: values should be 0-359, cycle_length = 360
days: values should be 0-6, cycle_length = 7
hours: values should be 0-23, cycle_length = 24
- type:
number
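The difference rule for cyclic features can be sketched in a few lines of Python. Both helper names are hypothetical, and wrapping output with a modulo is an assumption about how values could be mapped onto the 0 to cycle_length scale, not a description of the engine's internals.

```python
def cyclic_difference(a: float, b: float, cycle_length: float) -> float:
    """Shortest distance between a and b on a cycle of the given length."""
    diff = abs(a - b) % cycle_length
    # Going "the other way around" the cycle may be shorter.
    return min(diff, cycle_length - diff)


def wrap_to_cycle(value: float, cycle_length: float) -> float:
    """Map any input onto the scale from 0 up to cycle_length (assumed modulo)."""
    return value % cycle_length


print(cyclic_difference(1, 359, 360))  # 2, matching the example above
print(wrap_to_cycle(370, 360))         # 10
```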
data_type#
- ref:
date_time_format#
If specified, feature values should match the date format specified by this string. Only applicable to continuous features.
- type:
string
decimal_places#
The number of decimal places to round to. Defaults to no rounding. If significant_digits is also specified, the number will be rounded to the specified number of significant digits first, then rounded to the number of decimal places specified by this parameter.
- type:
number
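The order of the two rounding steps can be illustrated with a small sketch. The function name `round_feature_value` is hypothetical, and the sketch relies on Python's built-in formatting and `round`, whose tie-breaking may differ from the engine's. It assumes significant_digits is at least 1 and decimal_places is an integer.

```python
def round_feature_value(value, significant_digits=None, decimal_places=None):
    """Round to significant digits first, then to decimal places."""
    if significant_digits is not None:
        # Scientific notation with (n - 1) fractional digits keeps n significant digits.
        value = float(f"{value:.{significant_digits - 1}e}")
    if decimal_places is not None:
        value = round(value, decimal_places)
    return value


# 1234.5678 -> 1234.57 (6 significant digits) -> 1234.6 (1 decimal place)
print(round_feature_value(1234.5678, significant_digits=6, decimal_places=1))
```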
default_time_zone#
The default time zone for datetimes. If unspecified, uses ‘UTC’.
- type:
string
dependent_features#
A list of the names of other features that this feature depends on or that depend on this feature. Use it when multi-type value features depend tightly on the values of other multi-type value features.
- type:
list
- values:
string
derived_feature_code#
Code defining how the value for this feature could be derived if this feature is specified as a derived_context_feature or a derived_action_feature during react flows. For react_series, the data referenced is the accumulated series data (as a list of rows), and for non-series reacts, the data is the one single row. Each row comprises all the combined context and action features. Referencing data in these rows uses 0-based indexing, where the current row index is 0, the previous row’s is 1, etc. The specified code may do simple logic and numeric operations on feature values referenced via feature name and row offset.
Examples:
"(call value {feature "x" lag 1})": Use the value for feature ‘x’ from the previously processed row (offset of 1, one lag value).
"(- (call value {feature "y" lag 0}) (call value {feature "x" lag 1}))": Feature ‘y’ value from current (offset 0) row minus feature ‘x’ value from previous (offset 1) row.
- type:
string
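The row-offset semantics can be mimicked in Python to check what a derivation would produce. This is only a sketch of the indexing rule: the `value` helper and the sample rows are hypothetical, and storing the accumulated rows in processing order (current row last) is an assumption.

```python
def value(rows, feature, lag=0):
    """Return a feature value by row offset: lag 0 is the current row."""
    # Rows are assumed to be stored in processing order, so the current row is last.
    return rows[-1 - lag][feature]


series = [
    {"x": 10, "y": 4},  # previous row (lag 1)
    {"x": 12, "y": 7},  # current row (lag 0)
]

# Python analogue of: (- (call value {feature "y" lag 0}) (call value {feature "x" lag 1}))
derived = value(series, "y", 0) - value(series, "x", 1)
print(derived)  # 7 - 10 = -3
```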
fanout_on#
A list of feature names whose values can be used to determine other cases in the model that have the same duplicated values for this feature.
- type:
list
- values:
string
id_feature#
Set to true for nominal features containing nominal IDs, specifying that this feature should be used to compute case weights for ID-based privacy. For time series, this feature will be used as the ID for each time series generation.
- type:
boolean
locale#
The date time format locale. If unspecified, uses platform default locale.
- type:
string
max_row_lag#
The number of time steps traced back by the maximum lag feature created for this feature.
- type:
number
nominal_numbers#
Defaults to false, applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If true, will assume that all numbers will match only if identical; if false, it will compare similarity of values.
- type:
boolean
nominal_strings#
Defaults to true, applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If true, will assume that all strings will match only if identical; if false, uses string edit distance to compare similarity.
- type:
boolean
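To make the two comparison modes concrete, the sketch below contrasts exact matching with an edit distance. The page does not say which edit distance is used, so Levenshtein distance is assumed here, and both function names are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def string_difference(a: str, b: str, nominal_strings: bool = True) -> int:
    """Hypothetical comparison: exact match when nominal, edit distance otherwise."""
    if nominal_strings:
        return 0 if a == b else 1
    return edit_distance(a, b)


print(string_difference("kitten", "sitting"))                         # 1
print(string_difference("kitten", "sitting", nominal_strings=False))  # 3
```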
non_sensitive#
Flag a categorical nominal feature as non-sensitive. It is recommended that all nominal features be represented with either an int-id subtype or another available nominal subtype using the subtype attribute. However, if the nominal feature is non-sensitive, setting this parameter to true will bypass the subtype requirement. Only applicable to nominal features.
- type:
boolean
null_is_dependent#
Modify how dependent features with nulls are treated during a react, specifically when they use null as a context value. Only applicable to dependent features.
When false (default), the feature will be treated as a non-dependent context feature. When true for nominal types, null is treated as an individual dependent class value: only cases that also have nulls as this feature’s value will be considered. When true for continuous types, only the cases with the same dependent feature values as the cases that also have nulls as this feature’s value will be considered.
- type:
boolean
observational_error#
Specifies the observational mean absolute error for this feature. Use when the error value is already known.
- type:
number
original_format#
Original data formats used by clients. Automatically populated by clients to store client language specific context about features.
- type:
any
original_type#
Original data type details. Used by clients to determine how to serialize and deserialize feature data.
- ref:
parent#
The feature whose values this time-series feature’s values are derived from.
- type:
string
parent_type#
The type of time-series processing used by the parent feature.
- type:
string
- enum:
delta | rate | covariate
post_process#
Custom Amalgam code that is called on resulting values of this feature during react operations.
- type:
string
recursive_matching#
Only applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If unspecified, defaults to false for ‘json’ and ‘yaml’ features and true for ‘amalgam’ features. When true, operations will work recursively on feature values. When false, will operate on positional matches without considering recursion, which will yield better and faster results if the schema of the semistructured data is not recursive.
- type:
boolean
sample#
A sample of a value for the feature.
- type:
any
significant_digits#
Round to the specified number of significant digits. Defaults to no rounding.
- type:
number
subtype#
The type used in novel nominal substitution.
- type:
string
time_series#
- ref:
ts_order#
The order of rate/delta being described by this time-series feature.
- type:
number
- min:
0
ts_type#
The type of value being captured by this time-series feature.
- type:
string
- enum:
lag | delta | rate
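The three ts_type values can be pictured with a tiny series. This page does not define them, so the sketch assumes a lag is the previous value, a delta is the change from the previous value, and a rate is that change divided by the elapsed time. The variable names and sample data are hypothetical.

```python
values = [10.0, 13.0, 19.0]  # observed feature values, in time order
times = [0.0, 1.0, 3.0]      # matching timestamps

# The first row has no predecessor, so its derived values are left as None.
lag_1 = [None] + values[:-1]
delta_1 = [None] + [values[i] - values[i - 1] for i in range(1, len(values))]
rate_1 = [None] + [
    (values[i] - values[i - 1]) / (times[i] - times[i - 1])
    for i in range(1, len(values))
]

print(lag_1)    # [None, 10.0, 13.0]
print(delta_1)  # [None, 3.0, 6.0]
print(rate_1)   # [None, 3.0, 3.0]
```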
type#
- ref:
- required:
True
types_must_match#
Defaults to true, applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If true, only considers nodes common if their types match.
- type:
boolean
unique#
Flag the feature as having only unique values. Only applicable to nominal features.
- type:
boolean