DayF core
1.2.1.2
DayF (Decision at your Fingertips) is a freeware AutoML development framework that lets developers work with machine learning models without prior AI knowledge, simply by supplying a CSV dataset and the objective column.
Class that manages normalizations on dataframes to improve accuracy.


Public Member Functions
| def | __init__ (self, e_c) |
| Constructor. | |
| def | define_normalizations (self, dataframe_metadata, an_objective, objective_column) |
| Method that specifies data_normalizations. | |
| def | define_ignored_columns (self, dataframe_metadata, objective_column) |
| Method that specifies ignored_columns. | |
| def | define_special_spark_naive_norm (self, dataframe_metadata) |
| Method that specifies special non-negative data_normalizations. | |
| def | define_minimal_norm (self, dataframe_metadata, an_objective, objective_column) |
| Method that specifies minimal data_normalizations. | |
| def | filter_standardize (self, normalizemd, model_id) |
| Method that filters stdmean operations on non-standardize algorithms. | |
| def | filter_drop_missing (self, normalizemd) |
| Method that filters drop_missing operations on non-standardize algorithms. | |
| def | filter_objective_base (self, normalizemd) |
| Method that filters filling_missing operations that depend on objective_column. | |
| def | normalizeDataFrame (self, df, normalizemd) |
| Main method that defines and manages normalization sets and applies the normalizations. | |
| def | ignored_columns (self, normalizemd) |
| Method that generates ignored_column_list for columns where missed > exclusion_missing_threshold. | |
| def | normalizeBase (self, dataframe) |
| Internal method that manages base normalizations. | |
| def | normalizeDropMissing (self, dataframe, col) |
| Internal method that drops NaN values from the dataset. | |
| def | normalizeWorkingRange (self, dataframe, minval=-1.0, maxval=1.0, minrange=-1.0, maxrange=1.0) |
| Internal method that manages working-range normalizations on a [closed, closed] interval. | |
| def | normalizeOffset (self, dataframe, offset=0) |
| Internal method that manages offset normalizations. | |
| def | normalizeAgregation (self, dataframe, br=0.25) |
| Internal method that manages bucket-ratio normalizations (head - tail). | |
| def | normalizeBinaryEncoding (self, dataframe) |
| Internal method that manages binary encodings. | |
| def | normalizeStdMean (self, dataframe, mean, std) |
| Internal method that manages mean and std normalizations. | |
| def | normalizeDiscretize (self, dataframe, buckets_number, fixed_size) |
| Internal method that manages bucketing for discretization. | |
| def | fixedMissingValues (self, dataframe, value=0.0) |
| Internal method that manages imputation of missing values with a fixed value. | |
| def | meanMissingValues (self, dataframe, col, objective_col, full=False) |
| Internal method that manages imputation of missing values with the mean value. | |
| def | progressiveMissingValues (self, dataframe, col, objective_col) |
| Internal method that manages progressive imputation of missing values. | |
| def | normalizeDateTime (self, dataframe, date_column=None) |
| Internal method that manages date_time conversions to a pattern. | |
The Normalizer class is defined at line 26 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.__init__ (self, e_c) |
| def | gdayf.normalizer.normalizer.Normalizer.define_ignored_columns (self, dataframe_metadata, objective_column) |
Method that specifies ignored_columns.
| dataframe_metadata | DFMetadata() |
| objective_column | string indicating the objective column |
Definition at line 89 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.define_minimal_norm (self, dataframe_metadata, an_objective, objective_column) |
Method that specifies minimal data_normalizations.
| dataframe_metadata | DFMetadata() |
| an_objective | ATypesMetadata |
| objective_column | string indicating the objective column |
Definition at line 147 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.define_normalizations (self, dataframe_metadata, an_objective, objective_column) |
Method that specifies data_normalizations.
| dataframe_metadata | DFMetadata() |
| an_objective | ATypesMetadata |
| objective_column | string indicating the objective column |
Definition at line 41 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.define_special_spark_naive_norm (self, dataframe_metadata) |
Method that specifies special non-negative data_normalizations.
| dataframe_metadata | DFMetadata() |
Definition at line 124 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.filter_drop_missing (self, normalizemd) |
Method that filters drop_missing operations on non-standardize algorithms.
| normalizemd | OrderedDict() compatible structure |
Definition at line 212 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.filter_objective_base (self, normalizemd) |
Method that filters filling_missing operations that depend on objective_column.
| normalizemd | OrderedDict() compatible structure |
Definition at line 229 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.filter_standardize (self, normalizemd, model_id) |
Method that filters stdmean operations on non-standardize algorithms.
| normalizemd | OrderedDict() compatible structure |
| model_id | model identification |
Definition at line 196 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.fixedMissingValues (self, dataframe, value=0.0) |
Internal method that manages imputation of missing values with a fixed value.
| self | object pointer |
| dataframe | single column dataframe |
| value | fill value (numeric, default 0.0) |
Definition at line 470 of file normalizer.py.
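As a minimal sketch of fixed-value imputation, assuming the single column dataframe behaves like a pandas Series. The helper name `fill_fixed` is hypothetical and is not part of DayF; the actual method may differ in details such as type handling.

```python
import pandas as pd


def fill_fixed(column: pd.Series, value: float = 0.0) -> pd.Series:
    """Hypothetical helper: replace every missing entry with a fixed value."""
    return column.fillna(value)


# None becomes NaN in a float Series and is then replaced by 0.0.
filled = fill_fixed(pd.Series([1.0, None, 3.0]), value=0.0)
```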

| def | gdayf.normalizer.normalizer.Normalizer.ignored_columns (self, normalizemd) |
Method that generates ignored_column_list for columns where missed > exclusion_missing_threshold.
| normalizemd | normalizations_set_metadata |
Definition at line 353 of file normalizer.py.
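The rule "missed > exclusion_missing_threshold" can be illustrated with a small sketch. It makes two assumptions that this page does not confirm: the threshold is a ratio of missing rows, and the check runs directly on a pandas DataFrame rather than on the normalizemd structure the real method receives. The function name is hypothetical.

```python
import pandas as pd


def columns_over_missing_threshold(df: pd.DataFrame,
                                   exclusion_missing_threshold: float) -> list:
    """Hypothetical helper: list columns whose missing ratio exceeds the threshold."""
    # isna() yields booleans; their mean per column is the fraction of missing rows.
    missing_ratio = df.isna().mean()
    return missing_ratio[missing_ratio > exclusion_missing_threshold].index.tolist()


df = pd.DataFrame({
    "a": [1.0, None, None, None],   # 75% missing
    "b": [1.0, 2.0, None, 4.0],     # 25% missing
    "c": [1.0, 2.0, 3.0, 4.0],      # nothing missing
})
ignored = columns_over_missing_threshold(df, exclusion_missing_threshold=0.5)
```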
| def | gdayf.normalizer.normalizer.Normalizer.meanMissingValues (self, dataframe, col, objective_col, full=False) |
Internal method that manages imputation of missing values with the mean value.
| self | object pointer |
| dataframe | full column dataframe |
| col | column name for imputation |
| objective_col | objective_column |
| full | True means full_dataframe.mean(), False means objective_col.value.mean() |
Definition at line 480 of file normalizer.py.
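The two modes of the `full` flag can be sketched as follows. This is an illustration under assumptions, not the DayF implementation: it reads `full=True` as "use the mean of the whole column" and `full=False` as "use the mean of the rows that share the same objective value". The helper name `mean_missing` is hypothetical.

```python
import pandas as pd


def mean_missing(df: pd.DataFrame, col: str, objective_col: str,
                 full: bool = False) -> pd.DataFrame:
    """Hypothetical helper: impute missing values in `col` with a mean."""
    result = df.copy()
    if full:
        # Mean over the whole column (NaN entries are skipped by pandas).
        result[col] = result[col].fillna(result[col].mean())
    else:
        # Mean over the rows that share the same objective value.
        group_means = result.groupby(objective_col)[col].transform("mean")
        result[col] = result[col].fillna(group_means)
    return result


df = pd.DataFrame({
    "x": [1.0, None, 3.0, 10.0, None],
    "y": ["a", "a", "a", "b", "b"],
})
by_class = mean_missing(df, "x", "y", full=False)
overall = mean_missing(df, "x", "y", full=True)
```

With `full=False` the gap in class "a" is filled with 2.0 and the gap in class "b" with 10.0; with `full=True` both gaps receive the overall mean of the observed values.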

| def | gdayf.normalizer.normalizer.Normalizer.normalizeAgregation (self, dataframe, br=0.25) |
Internal method that manages bucket-ratio normalizations (head - tail).
| self | object pointer |
| dataframe | single column dataframe |
| br | bucket ratio (default 0.25) |
Definition at line 416 of file normalizer.py.

| def | gdayf.normalizer.normalizer.Normalizer.normalizeBase (self, dataframe) |
Internal method that manages base normalizations.
| self | object pointer |
| dataframe | single column dataframe |
Definition at line 367 of file normalizer.py.

| def | gdayf.normalizer.normalizer.Normalizer.normalizeBinaryEncoding (self, dataframe) |
Internal method that manages binary encodings.
| self | object pointer |
| dataframe | single column dataframe |
Definition at line 432 of file normalizer.py.
| def | gdayf.normalizer.normalizer.Normalizer.normalizeDataFrame (self, df, normalizemd) |
Main method that defines and manages normalization sets and applies the normalizations.
| self | object pointer |
| df | dataframe |
| normalizemd | OrderedDict() compatible structure |
Definition at line 249 of file normalizer.py.

| def | gdayf.normalizer.normalizer.Normalizer.normalizeDateTime (self, dataframe, date_column=None) |
Internal method that manages date_time conversions to a pattern.
| self | object pointer |
| dataframe | full column dataframe to be expanded |
| date_column | name of the date column to be transformed |
Definition at line 539 of file normalizer.py.
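One plausible reading of "dataframe to be expanded" is that the date column is split into numeric components. The sketch below shows that idea only; the chosen components, the generated column names and the helper name `expand_datetime` are assumptions, and the pattern DayF actually produces is not documented on this page.

```python
import pandas as pd


def expand_datetime(df: pd.DataFrame, date_column: str) -> pd.DataFrame:
    """Hypothetical helper: replace a date column with numeric component columns."""
    result = df.copy()
    parsed = pd.to_datetime(result[date_column])
    # One new column per component, named <date_column>_<component>.
    for part in ("year", "month", "day", "hour"):
        result[f"{date_column}_{part}"] = getattr(parsed.dt, part)
    # The original column is dropped once its components are extracted.
    return result.drop(columns=[date_column])


df = pd.DataFrame({
    "ts": ["2020-01-31 08:00:00", "2021-12-01 17:30:00"],
    "v": [1, 2],
})
expanded = expand_datetime(df, "ts")
```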

| def | gdayf.normalizer.normalizer.Normalizer.normalizeDiscretize (self, dataframe, buckets_number, fixed_size) |
Internal method that manages bucketing for discretization.
| self | object pointer |
| dataframe | single column dataframe |
| buckets_number | int |
| fixed_size | Boolean (True = fixed size, False = fixed frequency) |
Definition at line 458 of file normalizer.py.
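The difference between fixed-size and fixed-frequency bucketing can be sketched with pandas, assuming "fixed size" means equal-width buckets and "fixed frequency" means buckets holding roughly the same number of rows. The helper name `discretize` is hypothetical and the real method may label or encode buckets differently.

```python
import pandas as pd


def discretize(column: pd.Series, buckets_number: int, fixed_size: bool) -> pd.Series:
    """Hypothetical helper: return an integer bucket index for every value."""
    if fixed_size:
        # Equal-width buckets over the observed value range.
        return pd.cut(column, bins=buckets_number, labels=False)
    # Quantile-based buckets with roughly the same number of rows each.
    return pd.qcut(column, q=buckets_number, labels=False, duplicates="drop")


values = pd.Series([1, 2, 3, 4, 5, 100])
equal_width = discretize(values, buckets_number=2, fixed_size=True)
equal_frequency = discretize(values, buckets_number=2, fixed_size=False)
```

On this skewed sample, equal-width bucketing isolates the outlier 100 in its own bucket, while equal-frequency bucketing splits the six rows three and three.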

| def | gdayf.normalizer.normalizer.Normalizer.normalizeDropMissing (self, dataframe, col) |
Internal method that drops NaN values from the dataset.
| self | object pointer |
| dataframe | single column dataframe |
| col | base column used as the reference for dropping NaN |
Definition at line 382 of file normalizer.py.
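Since `col` is described as the reference column for dropping NaN, the operation can be sketched with pandas as removing the rows where that column is missing. This is an assumption about the behaviour, and the helper name `drop_missing` is hypothetical.

```python
import pandas as pd


def drop_missing(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Hypothetical helper: drop the rows in which `col` is NaN."""
    return df.dropna(subset=[col])


df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [None, 2.0, 3.0]})
# Only the row where "a" is missing is removed; NaN in other columns is kept.
kept = drop_missing(df, "a")
```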

| def | gdayf.normalizer.normalizer.Normalizer.normalizeOffset (self, dataframe, offset=0) |
Internal method that manages offset normalizations.
| self | object pointer |
| dataframe | single column dataframe |
| offset | offset value (default 0) |
Definition at line 406 of file normalizer.py.

| def | gdayf.normalizer.normalizer.Normalizer.normalizeStdMean (self, dataframe, mean, std) |
Internal method that manages mean and std normalizations.
Default mean=0 std=1
| self | object pointer |
| dataframe | single column dataframe |
| mean | mean value used to center the data |
| std | standard deviation value used to scale the data |
Definition at line 441 of file normalizer.py.
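Mean and std normalization is commonly computed as (x - mean) / std. The sketch below shows that standard formula, assuming it is what the method applies; the helper name `std_mean` is hypothetical.

```python
import pandas as pd


def std_mean(column: pd.Series, mean: float, std: float) -> pd.Series:
    """Hypothetical helper: center on `mean` and scale by `std`."""
    if std == 0:
        raise ValueError("std must be non-zero")
    return (column.astype(float) - mean) / std


values = pd.Series([2.0, 4.0, 6.0])
# Using the column's own statistics yields a result with mean 0 and std 1.
standardized = std_mean(values, mean=values.mean(), std=values.std())
```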

| def | gdayf.normalizer.normalizer.Normalizer.normalizeWorkingRange (self, dataframe, minval=-1.0, maxval=1.0, minrange=-1.0, maxrange=1.0) |
Internal method that manages working-range normalizations on a [closed, closed] interval.
| self | object pointer |
| dataframe | single column dataframe |
| minval | minimum value (default -1.0) |
| maxval | maximum value (default 1.0) |
| minrange | minimum of the working range (default -1.0) |
| maxrange | maximum of the working range (default 1.0) |
Definition at line 391 of file normalizer.py.
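A working-range normalization is typically a linear rescaling between two closed intervals. The sketch below assumes that values in [minval, maxval] are mapped onto [minrange, maxrange]; this reading of the four parameters is inferred from their names, and the helper name `working_range` is hypothetical.

```python
import pandas as pd


def working_range(column: pd.Series, minval: float, maxval: float,
                  minrange: float = -1.0, maxrange: float = 1.0) -> pd.Series:
    """Hypothetical helper: map [minval, maxval] linearly onto [minrange, maxrange]."""
    if maxval <= minval:
        raise ValueError("maxval must be greater than minval")
    factor = (maxrange - minrange) / (maxval - minval)
    return (column.astype(float) - minval) * factor + minrange


# Values observed in [0, 10] are rescaled to the default range [-1, 1].
scaled = working_range(pd.Series([0, 5, 10]), minval=0, maxval=10)
```

Both endpoints are reached, which matches the [closed, closed] interval in the description: 0 maps to -1 and 10 maps to 1.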

| def | gdayf.normalizer.normalizer.Normalizer.progressiveMissingValues (self, dataframe, col, objective_col) |
Internal method that manages progressive imputation of missing values.
([right_not_nan] - [left_not_nan])/Cardinality(is_nan)
| self | object pointer |
| dataframe | full column dataframe |
| col | column name for imputation |
| objective_col | objective_column |
Definition at line 502 of file normalizer.py.
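The formula above suggests that a run of missing values is filled in constant steps between its nearest non-missing neighbours. As a related illustration only, the sketch below swaps in pandas' built-in linear interpolation. It is not the DayF implementation: pandas spreads the gap over one more step than the number of missing values, so its step size can differ from the documented formula, and the sketch ignores objective_col entirely. The helper name `progressive_fill` is hypothetical.

```python
import pandas as pd


def progressive_fill(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Hypothetical helper: fill gaps in `col` by linear interpolation."""
    result = df.copy()
    result[col] = result[col].interpolate(method="linear")
    return result


df = pd.DataFrame({"x": [1.0, None, None, 4.0]})
# The two missing values are placed evenly between 1.0 and 4.0.
filled = progressive_fill(df, "x")
```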
