FeatureAttributes#

Overview#

The mapping of attributes for a single feature

type:: object

Properties#

auto_derive_on_train#

ref:: FeatureAutoDeriveOnTrain

bounds#

ref:: FeatureBounds

code_features#

A list of features needed to derive code

type:: list
values:: string

cycle_length#

Cyclic features are set by specifying a cycle_length value in the feature attributes. cycle_length requires a single value, which is the upper bound of the difference for the cycle range. For example, if cycle_length is 360, then the values 1 and 359 have a difference of 2. Cyclic features have no restrictions in the input dataset; however, they are output on a scale from 0 to cycle_length. To constrain the output to a different range, modify the min and max values of the bounds feature attribute.

Examples:

degrees: values should be 0-359, cycle_length = 360
days: values should be 0-6, cycle_length = 7
hours: values should be 0-23, cycle_length = 24

type:: number
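
The wrap-around difference described for cycle_length can be sketched in Python as follows. This is only an illustration of the arithmetic; the function name `cyclic_difference` is hypothetical and is not part of any documented API.

```python
def cyclic_difference(a: float, b: float, cycle_length: float) -> float:
    """Shortest distance between a and b on a cycle of the given length."""
    if cycle_length <= 0:
        raise ValueError("cycle_length must be positive")
    # Distance going one way around the cycle.
    direct = abs(a - b) % cycle_length
    # The other way around may be shorter.
    return min(direct, cycle_length - direct)


print(cyclic_difference(1, 359, 360))  # degrees -> 2
print(cyclic_difference(0, 6, 7))      # days    -> 1
print(cyclic_difference(23, 1, 24))    # hours   -> 2
```

The first call reproduces the example above: with a cycle_length of 360, the values 1 and 359 are 2 apart.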

data_type#

ref:: FeatureDataType

date_time_format#

If specified, feature values should match the date format specified by this string. Only applicable to continuous features.

type:: string
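
As a rough illustration of what "matching the date format" means, the sketch below checks a value against a format string using Python's standard library. It assumes the format string uses strftime-style directives such as `%Y-%m-%d`; the directive syntax is not stated in this section, and the helper name `matches_format` is hypothetical.

```python
from datetime import datetime


def matches_format(value: str, date_time_format: str) -> bool:
    """Return True if value parses under the given strftime-style format."""
    try:
        datetime.strptime(value, date_time_format)
    except ValueError:
        # strptime raises ValueError when the value does not fit the format.
        return False
    return True


print(matches_format("2024-03-15", "%Y-%m-%d"))  # True
print(matches_format("15/03/2024", "%Y-%m-%d"))  # False
```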

decimal_places#

Decimal places to round to, default is no rounding. If significant_digits is also specified, the number will be rounded to the specified number of significant digits first, then rounded to the number of decimal places specified by this parameter.

type:: number
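
The order of operations when both significant_digits and decimal_places are set can be sketched as below. The helper `round_feature_value` is hypothetical, and the sketch relies on Python's own rounding of ties, which may differ from what the Engine does.

```python
def round_feature_value(value, significant_digits=None, decimal_places=None):
    """Round to significant digits first, then to decimal places."""
    if significant_digits is not None:
        # Scientific notation with (n - 1) fractional digits keeps n significant digits.
        value = float(f"{value:.{significant_digits - 1}e}")
    if decimal_places is not None:
        value = round(value, decimal_places)
    return value


# 123.4567 -> 123.46 (5 significant digits) -> 123.5 (1 decimal place)
print(round_feature_value(123.4567, significant_digits=5, decimal_places=1))
```

With neither attribute set, the value is returned unchanged, matching the "no rounding" default.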

default_time_zone#

The default time zone for datetimes. If unspecified, uses ‘UTC’.

type:: string

dependent_features#

A list of other feature names that this feature depends on, or that depend on this feature. Use this when multi-type value features tightly depend on the values of other multi-type value features.

type:: list
values:: string

derived_feature_code#

Code defining how the value for this feature could be derived if this feature is specified as a derived_context_feature or a derived_action_feature during react flows. For react_series, the data referenced is the accumulated series data (as a list of rows), and for non-series reacts, the data is the single row. Each row is composed of all the combined context and action features. Referencing data in these rows uses 0-based indexing, where the current row index is 0, the previous row’s is 1, etc. The specified code may do simple logic and numeric operations on feature values referenced via feature name and row offset.

Examples:

"(call value {feature "x" lag 1})": Use the value for feature ‘x’ from the previously processed row (offset of 1, one lag value).
"(- (call value {feature "y" lag 0}) (call value {feature "x" lag 1}))": Feature ‘y’ value from current (offset 0) row minus feature ‘x’ value from previous (offset 1) row.

type:: string
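
To make the row-offset convention concrete, here is a Python analogue of the second example above. It is not Amalgam and not how the Engine evaluates derived_feature_code; the `value` helper and the assumption that the accumulated rows are stored oldest-first are purely illustrative.

```python
def value(rows, feature, lag=0):
    """Look up a feature value by row offset: lag 0 is the current row,
    lag 1 is the previous row, and so on.

    Assumes rows are stored in chronological order (oldest first), so the
    current row is the last element of the list.
    """
    return rows[-1 - lag][feature]


# Accumulated series data; each row combines context and action features.
series = [
    {"x": 10, "y": 4},  # previous row (offset 1)
    {"x": 12, "y": 7},  # current row (offset 0)
]

# Analogue of: (- (call value {feature "y" lag 0}) (call value {feature "x" lag 1}))
derived = value(series, "y", lag=0) - value(series, "x", lag=1)
print(derived)  # 7 - 10 = -3
```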

fanout_on#

A list of feature names whose values can be used to determine other cases in the model that have the same duplicated values for this feature.

type:: list
values:: string

id_feature#

Set to true for nominal features containing nominal IDs, specifying that this feature should be used to compute case weights for ID-based privacy. For time series, this feature will be used as the ID for each time series generation.

type:: boolean

locale#

The date time format locale. If unspecified, uses platform default locale.

type:: string

max_row_lag#

The number of time steps traced back by the maximum lag feature created for this feature.

type:: number

nominal_numbers#

Defaults to false, applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If true, will assume that all numbers will match only if identical; if false, it will compare similarity of values.

type:: boolean

nominal_strings#

Defaults to true, applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If true, will assume that all strings will match only if identical; if false, it will use string edit distance to compare similarity.

type:: boolean

non_sensitive#

Flag a categorical nominal feature as non-sensitive. It is recommended that all nominal features be represented with either an int-id subtype or another available nominal subtype using the subtype attribute. However, if the nominal feature is non-sensitive, setting this parameter to true will bypass the subtype requirement. Only applicable to nominal features.

type:: boolean

null_is_dependent#

Modify how dependent features with nulls are treated during a react, specifically when they use null as a context value. Only applicable to dependent features.

When false (default), the feature will be treated as a non-dependent context feature. When true for nominal types, null is treated as an individual dependent class value, and only cases that also have nulls as this feature’s value will be considered. When true for continuous types, only the cases with the same dependent feature values as the cases that also have nulls as this feature’s value will be considered.

type:: boolean

observational_error#

Specifies the observational mean absolute error for this feature. Use when the error value is already known.

type:: number

original_format#

Original data formats used by clients. Automatically populated by clients to store client language specific context about features.

type:: any

original_type#

Original data type details. Used by clients to determine how to serialize and deserialize feature data.

ref:: FeatureOriginalType

parent#

The feature whose values this time-series feature’s values are derived from.

type:: string

parent_type#

The type of time-series processing used by the parent feature.

type:: string
enum:: delta | rate | covariate

post_process#

Custom Amalgam code that is called on resulting values of this feature during react operations.

type:: string

preserve_rare_values#

type:: object

Nested Properties:#

protected_values#

A collection of feature values that the Engine should preserve (the case-weight multipliers left unspecified).

type:: list
values:: any

protected_values_multipliers#

A collection of objects describing feature values to be preserved and the case-weight multipliers to use to preserve them.

type:: list
values:: {‘type’: ‘assoc’, ‘indices’: {‘value’: {‘type’: ‘any’, ‘required’: True, ‘description’: “The feature value having its signal preserved.”}, ‘multiplier’: {‘type’: ‘number’, ‘required’: True, ‘description’: ‘The value by which to scale up case weights of cases with this feature value.’}}, ‘additional_indices’: False}

unprotected_multiplier#

The weight multiplier to use on cases without a protected value.

type:: number
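
The nested properties above could be written as a feature attributes mapping along these lines. This is a sketch in Python dict form; the feature values and multipliers are made up for illustration, and whether protected_values and protected_values_multipliers may be combined in a single mapping is not stated here, so the two forms are shown separately.

```python
# Form 1: list the values to preserve and leave the multipliers unspecified.
preserve_by_value = {
    "preserve_rare_values": {
        "protected_values": ["rare_condition_a", "rare_condition_b"],
    }
}

# Form 2: give an explicit case-weight multiplier per protected value,
# plus a multiplier for cases without a protected value.
preserve_with_multipliers = {
    "preserve_rare_values": {
        "protected_values_multipliers": [
            {"value": "rare_condition_a", "multiplier": 5.0},
            {"value": "rare_condition_b", "multiplier": 2.5},
        ],
        "unprotected_multiplier": 1.0,
    }
}

# Each multiplier entry carries exactly the two required indices.
for entry in preserve_with_multipliers["preserve_rare_values"]["protected_values_multipliers"]:
    assert set(entry) == {"value", "multiplier"}
```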

recursive_matching#

Only applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If unspecified, defaults to false for ‘json’ and ‘yaml’ features and true for ‘amalgam’ features. When true, operations will work recursively on feature values. When false, will operate on positional matches without considering recursion, which will yield better and faster results if the schema of the semistructured data is not recursive.

type:: boolean

sample#

A sample of a value for the feature.

type:: any

shared_deviations#

A list of feature names that will share deviations with this feature. In analysis, the predictions computed for this feature and the features specified are combined to create deviations that are used for all of the involved features. If a time series feature, then child lag features will automatically share deviations. If ‘shared_deviations’ is specified as false, then automatically created lag features will not automatically share deviations.

type:: list | boolean
values:: string

significant_digits#

Round to the specified significant digits, default is no rounding.

type:: number

subtype#

The type used in novel nominal substitution.

type:: string

time_series#

ref:: FeatureTimeSeries

ts_order#

The order of rate/delta being described by this time-series feature.

type:: number
min:: 0

ts_type#

The type of value being captured by this time-series feature.

type:: string
enum:: lag | delta | rate

type#

ref:: FeatureType
required:: True

types_must_match#

Defaults to true, applicable to code features (when ‘data_type’ is one of json/yaml/amalgam). If true, only considers nodes common if their types match.

type:: boolean

unique#

Flag feature as only having unique values. Only applicable to nominal features.

type:: boolean

value_evaluation_code#

Custom Amalgam code that is executed within queries to return the value used for the feature when measuring distances. This code should be a singular (call_entity) or (call_on_entity) with the entity specified as .null.

type:: string