DayF core  1.2.1.2
DayF (Decision at your Fingertips) is a freeware AutoML development framework that lets developers work with machine learning models without prior AI knowledge, taking only a CSV dataset and the objective column as input.
gdayf.normalizer.normalizer.Normalizer Class Reference

Class that manages normalizations on dataframes to improve accuracy.

Public Member Functions

def __init__ (self, e_c)
 Constructor.

def define_normalizations (self, dataframe_metadata, an_objective, objective_column)
 Method that specifies data_normalizations.

def define_ignored_columns (self, dataframe_metadata, objective_column)
 Method that specifies ignored_columns.

def define_special_spark_naive_norm (self, dataframe_metadata)
 Method that specifies minimal data_normalizations.

def define_minimal_norm (self, dataframe_metadata, an_objective, objective_column)
 Method that specifies special non-negative data_normalizations.

def filter_standardize (self, normalizemd, model_id)
 Method that filters stdmean operations on non-standardize algorithms.

def filter_drop_missing (self, normalizemd)
 Method that filters drop_missing operations on non-standardize algorithms.

def filter_objective_base (self, normalizemd)
 Method that filters filling_missing operations dependent on objective_column.

def normalizeDataFrame (self, df, normalizemd)
 Main method that defines and manages normalization sets and applies the normalizations.

def ignored_columns (self, normalizemd)
 Method that generates ignored_column_list for cases where missed > exclusion_missing_threshold.

def normalizeBase (self, dataframe)
 Internal method that manages dropping NaN values from the dataset.

def normalizeDropMissing (self, dataframe, col)
 Internal method that manages base normalizations.

def normalizeWorkingRange (self, dataframe, minval=-1.0, maxval=1.0, minrange=-1.0, maxrange=1.0)
 Internal method that manages working range normalizations on a [closed, closed] interval.

def normalizeOffset (self, dataframe, offset=0)
 Internal method that manages working range normalizations on a [closed, closed] interval.

def normalizeAgregation (self, dataframe, br=0.25)
 Internal method that manages bucket ratio normalizations (head - tail).

def normalizeBinaryEncoding (self, dataframe)
 Internal method that manages binary encodings.

def normalizeStdMean (self, dataframe, mean, std)
 Internal method that manages mean and std normalizations.

def normalizeDiscretize (self, dataframe, buckets_number, fixed_size)
 Internal method that manages bucketing for discretization.

def fixedMissingValues (self, dataframe, value=0.0)
 Internal method that imputes missing values with a fixed value.

def meanMissingValues (self, dataframe, col, objective_col, full=False)
 Internal method that imputes missing values with the mean value.

def progressiveMissingValues (self, dataframe, col, objective_col)
 Internal method that manages progressive imputation of missing values.

def normalizeDateTime (self, dataframe, date_column=None)
 Internal method that manages date_time conversions to pattern.
 

Detailed Description

Class that manages normalizations on dataframes to improve accuracy.

Definition at line 26 of file normalizer.py.

Constructor & Destructor Documentation

◆ __init__()

def gdayf.normalizer.normalizer.Normalizer.__init__ (   self,
  e_c 
)

Constructor.

Parameters
e_c: context pointer

Definition at line 30 of file normalizer.py.

Member Function Documentation

◆ define_ignored_columns()

def gdayf.normalizer.normalizer.Normalizer.define_ignored_columns (   self,
  dataframe_metadata,
  objective_column 
)

Method that specifies ignored_columns.

Parameters
dataframe_metadata: DFMetadata()
objective_column: string indicating the objective column
Returns
None if there is nothing to do, otherwise the Normalization_sets OrderedDict()

Definition at line 89 of file normalizer.py.

◆ define_minimal_norm()

def gdayf.normalizer.normalizer.Normalizer.define_minimal_norm (   self,
  dataframe_metadata,
  an_objective,
  objective_column 
)

Method that specifies special non-negative data_normalizations.

Parameters
dataframe_metadata: DFMetadata()
an_objective: ATypesMetadata
objective_column: string indicating the objective column
Returns
None if there is nothing to do, otherwise the Normalization_sets OrderedDict()

Definition at line 147 of file normalizer.py.

◆ define_normalizations()

def gdayf.normalizer.normalizer.Normalizer.define_normalizations (   self,
  dataframe_metadata,
  an_objective,
  objective_column 
)

Method that specifies data_normalizations.

Parameters
dataframe_metadata: DFMetadata()
an_objective: ATypesMetadata
objective_column: string indicating the objective column
Returns
None if there is nothing to do, otherwise the Normalization_sets OrderedDict()

Definition at line 41 of file normalizer.py.

◆ define_special_spark_naive_norm()

def gdayf.normalizer.normalizer.Normalizer.define_special_spark_naive_norm (   self,
  dataframe_metadata 
)

Method that specifies minimal data_normalizations.

Parameters
dataframe_metadata: DFMetadata()
Returns
[None] if there is nothing to do, otherwise the Normalization_sets OrderedDict()

Definition at line 124 of file normalizer.py.

◆ filter_drop_missing()

def gdayf.normalizer.normalizer.Normalizer.filter_drop_missing (   self,
  normalizemd 
)

Method that filters drop_missing operations on non-standardize algorithms.

Parameters
normalizemd: OrderedDict() compatible structure
Returns
normalizemd: OrderedDict() compatible structure

Definition at line 212 of file normalizer.py.

◆ filter_objective_base()

def gdayf.normalizer.normalizer.Normalizer.filter_objective_base (   self,
  normalizemd 
)

Method that filters filling_missing operations dependent on objective_column.

Parameters
normalizemd: OrderedDict() compatible structure
Returns
normalizemd: OrderedDict() compatible structure

Definition at line 229 of file normalizer.py.

◆ filter_standardize()

def gdayf.normalizer.normalizer.Normalizer.filter_standardize (   self,
  normalizemd,
  model_id 
)

Method that filters stdmean operations on non-standardize algorithms.

Parameters
normalizemd: OrderedDict() compatible structure
model_id: Model_identification
Returns
normalizemd: OrderedDict() compatible structure

Definition at line 196 of file normalizer.py.

◆ fixedMissingValues()

def gdayf.normalizer.normalizer.Normalizer.fixedMissingValues (   self,
  dataframe,
  value = 0.0 
)

Internal method that imputes missing values with a fixed value.

Parameters
self: object pointer
dataframe: single column dataframe
value: numeric fill value (default 0.0)
Returns
dataframe

Definition at line 470 of file normalizer.py.
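
The following is a minimal sketch of fixed-value imputation, assuming the single column is a pandas Series. The helper name fixed_missing_values is hypothetical and is not part of gdayf; the actual method body may differ.

```python
import pandas as pd


def fixed_missing_values(column: pd.Series, value: float = 0.0) -> pd.Series:
    """Hypothetical helper: replace every missing entry with a constant."""
    # fillna returns a new Series and leaves the input untouched
    return column.fillna(value)


col = pd.Series([1.0, None, 3.0])
filled = fixed_missing_values(col)
```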

◆ ignored_columns()

def gdayf.normalizer.normalizer.Normalizer.ignored_columns (   self,
  normalizemd 
)

Method that generates ignored_column_list for cases where missed > exclusion_missing_threshold.

Parameters
normalizemd: normalizations_set_metadata
Returns
ignored_list updated

Definition at line 353 of file normalizer.py.
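
The threshold rule can be illustrated with a short sketch. It assumes that the threshold is a ratio of missing values per column and that the check runs directly on a pandas DataFrame; the real method works on normalizemd metadata instead, and the function name and the 0.25 value below are hypothetical.

```python
import pandas as pd


def columns_over_missing_threshold(df: pd.DataFrame,
                                   exclusion_missing_threshold: float) -> list:
    """Hypothetical helper: list columns whose missing ratio exceeds the threshold."""
    # isna().mean() gives the fraction of missing values per column
    missing_ratio = df.isna().mean()
    return [c for c in df.columns if missing_ratio[c] > exclusion_missing_threshold]


df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [None, None, 3.0, 4.0]})
ignored = columns_over_missing_threshold(df, exclusion_missing_threshold=0.25)
```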

◆ meanMissingValues()

def gdayf.normalizer.normalizer.Normalizer.meanMissingValues (   self,
  dataframe,
  col,
  objective_col,
  full = False 
)

Internal method that imputes missing values with the mean value.

Parameters
self: object pointer
dataframe: full column dataframe
col: column name for imputation
objective_col: objective_column
full: True means full_dataframe.mean(), False means objective_col.value.mean()
Returns
dataframe

Definition at line 480 of file normalizer.py.
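
A minimal sketch of the two imputation modes, assuming pandas and assuming that full=False means "mean of col within each objective_col value". The helper name is hypothetical and the real implementation may differ.

```python
import pandas as pd


def mean_missing_values(df: pd.DataFrame, col: str, objective_col: str,
                        full: bool = False) -> pd.DataFrame:
    """Hypothetical helper: fill NaN in `col` with a global or per-class mean."""
    out = df.copy()
    if full:
        # one mean computed over the whole column
        fill = out[col].mean()
    else:
        # one mean per objective value, broadcast back to the row index
        fill = out.groupby(objective_col)[col].transform("mean")
    out[col] = out[col].fillna(fill)
    return out


df = pd.DataFrame({"x": [1.0, None, 3.0, None, 10.0],
                   "y": ["a", "a", "a", "b", "b"]})
by_class = mean_missing_values(df, "x", "y")
overall = mean_missing_values(df, "x", "y", full=True)
```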

◆ normalizeAgregation()

def gdayf.normalizer.normalizer.Normalizer.normalizeAgregation (   self,
  dataframe,
  br = 0.25 
)

Internal method that manages bucket ratio normalizations (head - tail).

Parameters
self: object pointer
dataframe: single column dataframe
br: bucket ratio
Returns
dataframe

Definition at line 416 of file normalizer.py.

◆ normalizeBase()

def gdayf.normalizer.normalizer.Normalizer.normalizeBase (   self,
  dataframe 
)

Internal method that manages dropping NaN values from the dataset.

Parameters
self: object pointer
dataframe: single column dataframe
Returns
dataframe

Definition at line 367 of file normalizer.py.

◆ normalizeBinaryEncoding()

def gdayf.normalizer.normalizer.Normalizer.normalizeBinaryEncoding (   self,
  dataframe 
)

Internal method that manages binary encodings.

Parameters
self: object pointer
dataframe: single column dataframe
Returns
dataframe

Definition at line 432 of file normalizer.py.

◆ normalizeDataFrame()

def gdayf.normalizer.normalizer.Normalizer.normalizeDataFrame (   self,
  df,
  normalizemd 
)

Main method that defines and manages normalization sets and applies the normalizations.

Parameters
self: object pointer
df: dataframe
normalizemd: OrderedDict() compatible structure
Returns
dataframe

Definition at line 249 of file normalizer.py.

◆ normalizeDateTime()

def gdayf.normalizer.normalizer.Normalizer.normalizeDateTime (   self,
  dataframe,
  date_column = None 
)

Internal method that manages date_time conversions to pattern.

Parameters
self: object pointer
dataframe: full column dataframe to be expanded
date_column: Date_Column name to be transformed
Returns
dataframe

Definition at line 539 of file normalizer.py.

◆ normalizeDiscretize()

def gdayf.normalizer.normalizer.Normalizer.normalizeDiscretize (   self,
  dataframe,
  buckets_number,
  fixed_size 
)

Internal method that manages bucketing for discretization.

Parameters
self: object pointer
dataframe: single column dataframe
buckets_number: Int
fixed_size: Boolean (True = fixed size, False = fixed frequency)
Returns
dataframe

Definition at line 458 of file normalizer.py.
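
The two bucketing modes can be sketched with pandas, under the assumption that "fixed size" means equal-width bins (pd.cut) and "fixed frequency" means equal-count bins (pd.qcut). The helper name is hypothetical and the library's own bucket labels may differ.

```python
import pandas as pd


def discretize(column: pd.Series, buckets_number: int, fixed_size: bool) -> pd.Series:
    """Hypothetical helper: return an integer bucket index for each value."""
    if fixed_size:
        # equal-width bins spanning the column's min..max range
        return pd.cut(column, bins=buckets_number, labels=False)
    # equal-frequency bins built from quantiles
    return pd.qcut(column, q=buckets_number, labels=False, duplicates="drop")


values = pd.Series([0, 1, 2, 3, 100])
equal_width = discretize(values, buckets_number=2, fixed_size=True)
equal_freq = discretize(values, buckets_number=2, fixed_size=False)
```

With a skewed column such as this one, equal-width bins put almost every value in the first bucket, while equal-frequency bins split the rows more evenly.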

◆ normalizeDropMissing()

def gdayf.normalizer.normalizer.Normalizer.normalizeDropMissing (   self,
  dataframe,
  col 
)

Internal method that manages base normalizations.

Parameters
self: object pointer
dataframe: single column dataframe
col: column base to reference drop NaN
Returns
dataframe

Definition at line 382 of file normalizer.py.
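
Based on the description of col, dropping rows by a reference column could look like the sketch below. It assumes a pandas DataFrame; the helper name is hypothetical and the actual method may do more than this.

```python
import pandas as pd


def drop_missing(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Hypothetical helper: drop rows where the reference column is NaN."""
    # NaN values in other columns do not cause a row to be dropped
    return df.dropna(subset=[col])


df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [None, 5.0, 6.0]})
kept = drop_missing(df, "a")
```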

◆ normalizeOffset()

def gdayf.normalizer.normalizer.Normalizer.normalizeOffset (   self,
  dataframe,
  offset = 0 
)

Internal method that manages working range normalizations on a [closed, closed] interval.

Parameters
self: object pointer
dataframe: single column dataframe
offset: offset value (default 0)
Returns
dataframe

Definition at line 406 of file normalizer.py.

◆ normalizeStdMean()

def gdayf.normalizer.normalizer.Normalizer.normalizeStdMean (   self,
  dataframe,
  mean,
  std 
)

Internal method that manages mean and std normalizations.

Default mean=0 std=1

Parameters
self: object pointer
dataframe: single column dataframe
mean: mean value to center
std: standard deviation value to be normalized
Returns
dataframe

Definition at line 441 of file normalizer.py.
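
A minimal sketch of mean/std standardization, assuming that mean and std are statistics of the column supplied by the caller and that the result is (x - mean) / std. The helper name and the zero-std guard are assumptions, not confirmed behaviour of the library.

```python
import pandas as pd


def std_mean(column: pd.Series, mean: float, std: float) -> pd.Series:
    """Hypothetical helper: center on `mean` and scale by `std`."""
    if std == 0:
        # assumption: a constant column is only centered, to avoid dividing by zero
        return column - mean
    return (column - mean) / std


col = pd.Series([2.0, 4.0, 6.0])
standardized = std_mean(col, mean=4.0, std=2.0)
```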

◆ normalizeWorkingRange()

def gdayf.normalizer.normalizer.Normalizer.normalizeWorkingRange (   self,
  dataframe,
  minval = -1.0,
  maxval = 1.0,
  minrange = -1.0,
  maxrange = 1.0 
)

Internal method that manages working range normalizations on a [closed, closed] interval.

Parameters
self: object pointer
dataframe: single column dataframe
minval: default -1.0
maxval: default 1.0
minrange: default -1.0
maxrange: default 1.0
Returns
dataframe

Definition at line 391 of file normalizer.py.
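
One plausible reading of the signature is a linear rescale from the observed interval [minval, maxval] to the target interval [minrange, maxrange]. The sketch below works under that assumption; the helper name and the handling of a zero-width input interval are hypothetical.

```python
import pandas as pd


def working_range(column: pd.Series, minval: float = -1.0, maxval: float = 1.0,
                  minrange: float = -1.0, maxrange: float = 1.0) -> pd.Series:
    """Hypothetical helper: map [minval, maxval] linearly onto [minrange, maxrange]."""
    if maxval == minval:
        # assumption: a zero-width input interval collapses to the lower bound
        return pd.Series(minrange, index=column.index, dtype=float)
    scale = (maxrange - minrange) / (maxval - minval)
    return (column - minval) * scale + minrange


col = pd.Series([0.0, 5.0, 10.0])
scaled = working_range(col, minval=0.0, maxval=10.0)
```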

◆ progressiveMissingValues()

def gdayf.normalizer.normalizer.Normalizer.progressiveMissingValues (   self,
  dataframe,
  col,
  objective_col 
)

Internal method that manages progressive imputation of missing values.

([right_not_nan] - [left_not_nan])/Cardinality(is_nan)

Parameters
self: object pointer
dataframe: full column dataframe
col: column name for imputation
objective_col: objective_column
Returns
dataframe

Definition at line 502 of file normalizer.py.
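
The step formula above can be sketched as follows. Only the formula comes from the documentation; how the step is applied (left neighbour plus a growing multiple of the step), the treatment of leading and trailing gaps, and the helper name are assumptions, and the sketch ignores objective_col.

```python
import numpy as np
import pandas as pd


def progressive_fill(column: pd.Series) -> pd.Series:
    """Hypothetical helper: fill each NaN run using the documented step formula."""
    values = column.to_numpy(dtype=float).copy()
    n = len(values)
    i = 0
    while i < n:
        if not np.isnan(values[i]):
            i += 1
            continue
        start = i
        while i < n and np.isnan(values[i]):
            i += 1
        # the NaN run is values[start:i]; it needs a known value on both sides
        if start == 0 or i == n:
            continue
        left, right = values[start - 1], values[i]
        step = (right - left) / (i - start)  # (right - left) / Cardinality(is_nan)
        for k in range(start, i):
            values[k] = left + step * (k - start + 1)
    return pd.Series(values, index=column.index)


filled = progressive_fill(pd.Series([10.0, None, None, 20.0]))
```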

The documentation for this class was generated from the following file: