StackedDatasetTrainer (GridGain 8.9.19)

java.lang.Object
- org.apache.ignite.ml.trainers.DatasetTrainer<StackedModel<IS,IA,O,AM>,L>
- - org.apache.ignite.ml.composition.stacking.StackedDatasetTrainer<IS,IA,O,AM,L>

Type Parameters:

IS - Type of submodel input.

IA - Type of aggregator input.

O - Type of aggregator output.

AM - Type of aggregator model.

L - Type of labels.

Direct Known Subclasses:

SimpleStackedDatasetTrainer
```
public class StackedDatasetTrainer<IS,IA,O,AM extends IgniteModel<IA,O>,L>
extends DatasetTrainer<StackedModel<IS,IA,O,AM>,L>
```
DatasetTrainer that encapsulates the stacking technique for model training. The model produced by this trainer consists of two layers. The first layer is a model IS -> IA: a "parallel" composition of several submodels, each of which is itself a model IS -> IA, whose outputs [IA] are merged into a single IA. The second layer is an aggregator model IA -> O. Training follows this layered structure:
```
 1. train models of first layer;
 2. train aggregator model on dataset augmented with outputs of first layer models converted to vectors.
 
```
During the second step, the aggregator can be trained either on the original features together with the converted outputs of the first-layer models, or on the converted outputs alone. This choice also affects inference. This class is the most general stacked trainer; StackedVectorDatasetTrainer is a shortcut version of it with some types and functions already specified.
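
The two-layer structure described above can be illustrated with a minimal, library-free sketch. It uses only java.util.function types rather than the Ignite ML API; the class name StackedInferenceSketch, the stack and demo methods, and the toy submodels are all hypothetical and are not part of this library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Hypothetical illustration of two-layer stacking; not the library's StackedModel. */
public class StackedInferenceSketch {
    /** Composes submodels (IS -> IA), a merger of their outputs and an aggregator (IA -> O). */
    public static <IS, IA, O> Function<IS, O> stack(
        List<Function<IS, IA>> submodels,
        BinaryOperator<IA> merger,
        Function<IA, O> aggregator) {
        return input -> {
            // First layer: apply every submodel to the same input and merge the outputs.
            IA merged = submodels.stream()
                .map(m -> m.apply(input))
                .reduce(merger)
                .orElseThrow(() -> new IllegalStateException("No submodels"));

            // Second layer: the aggregator maps the merged output to the final answer.
            return aggregator.apply(merged);
        };
    }

    /** Builds a toy stacked model and applies it to x. */
    public static double demo(double x) {
        // Two toy submodels, each producing a one-element array.
        List<Function<Double, double[]>> submodels = List.of(
            in -> new double[] {2 * in},
            in -> new double[] {in + 1});

        // Merger that concatenates two submodel outputs.
        BinaryOperator<double[]> concat = (a, b) -> {
            double[] res = Arrays.copyOf(a, a.length + b.length);
            System.arraycopy(b, 0, res, a.length, b.length);
            return res;
        };

        // Toy aggregator: sums the merged outputs.
        Function<double[], Double> sum = v -> Arrays.stream(v).sum();

        return stack(submodels, concat, sum).apply(x);
    }

    public static void main(String[] args) {
        System.out.println(demo(3.0)); // 2*3 + (3+1), prints 10.0
    }
}
```

In this sketch the merger is folded pairwise over the submodel outputs, which is one plausible reading of "merged into a single IA"; the actual merging order used by the trainer is not specified here.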

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.ignite.ml.trainers.DatasetTrainer
  DatasetTrainer.EmptyDatasetException

Field Summary
- Fields inherited from class org.apache.ignite.ml.trainers.DatasetTrainer
  envBuilder, environment

Constructor Summary

Constructors
Constructor and Description
`StackedDatasetTrainer()` Constructs an instance of this class.
`StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer, IgniteBinaryOperator<IA> aggregatingInputMerger, IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter)` Constructs an instance of this class.
`StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer, IgniteBinaryOperator<IA> aggregatingInputMerger, IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter, List<DatasetTrainer<IgniteModel<IS,IA>,L>> submodelsTrainers, IgniteFunction<Vector,IS> vector2SubmodelInputConverter, IgniteFunction<IA,Vector> submodelOutput2VectorConverter)` Constructs an instance of this class.

Method Summary

All Methods Instance Methods Concrete Methods
Modifier and Type	Method and Description
`<M1 extends IgniteModel<IS,IA>> StackedDatasetTrainer<IS,IA,O,AM,L>`	`addTrainer(DatasetTrainer<M1,L> trainer)` Adds submodel trainer along with converters needed on training and inference stages.
`<K,V> StackedModel<IS,IA,O,AM>`	`fitWithInitializedDeployingContext(DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor)` Trains model based on the specified data.
`boolean`	`isUpdateable(StackedModel<IS,IA,O,AM> mdl)` This method is never called: instead of building the update logic from `DatasetTrainer.isUpdateable(M)` and `DatasetTrainer.updateModel(M, org.apache.ignite.ml.dataset.DatasetBuilder<K, V>, org.apache.ignite.ml.preprocessing.Preprocessor<K, V>)`, this class explicitly overrides the update method.
`<K,V> StackedModel<IS,IA,O,AM>`	`update(StackedModel<IS,IA,O,AM> mdl, DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor)` Compares the state of the given model with the training parameters of this trainer; if they are compatible, updates the model with the new data and returns the updated model.
`protected <K,V> StackedModel<IS,IA,O,AM>`	`updateModel(StackedModel<IS,IA,O,AM> mdl, DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor)` This method is never called: instead of building the update logic from `DatasetTrainer.isUpdateable(IgniteModel)` and `DatasetTrainer.updateModel(IgniteModel, DatasetBuilder, Preprocessor)`, this class explicitly overrides the update method.
`StackedDatasetTrainer<IS,IA,O,AM,L>`	`withAggregatorInputMerger(IgniteBinaryOperator<IA> merger)` Specifies the binary operator used to merge submodel outputs into one.
`StackedDatasetTrainer<IS,IA,O,AM,L>`	`withAggregatorTrainer(DatasetTrainer<AM,L> aggregatorTrainer)` Specifies the aggregator trainer.
`StackedDatasetTrainer<IS,IA,O,AM,L>`	`withEnvironmentBuilder(LearningEnvironmentBuilder envBuilder)` Changes the learning environment.
`StackedDatasetTrainer<IS,IA,O,AM,L>`	`withOriginalFeaturesDropped()` Drop original features during training and inference.
`StackedDatasetTrainer<IS,IA,O,AM,L>`	`withOriginalFeaturesKept(IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter)` Keep original features during training and propagate submodels input to aggregator during inference using given function.
`StackedDatasetTrainer<IS,IA,O,AM,L>`	`withSubmodelOutput2VectorConverter(IgniteFunction<IA,Vector> submodelOutput2VectorConverter)` Set function used for conversion of submodel output to `Vector`.
`StackedDatasetTrainer<IS,IA,O,AM,L>`	`withVector2SubmodelInputConverter(IgniteFunction<Vector,IS> vector2SubmodelInputConverter)` Set function used for conversion of `Vector` to submodel input.

Methods inherited from class org.apache.ignite.ml.trainers.DatasetTrainer
fit, fit, fit, fit, fit, fit, getLastTrainedModelOrThrowEmptyDatasetException, identityTrainer, learningEnvironment, update, update, update, update, withConvertedLabels

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

- Constructor Detail
  - StackedDatasetTrainer
```
public StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer,
                             IgniteBinaryOperator<IA> aggregatingInputMerger,
                             IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter,
                             List<DatasetTrainer<IgniteModel<IS,IA>,L>> submodelsTrainers,
                             IgniteFunction<Vector,IS> vector2SubmodelInputConverter,
                             IgniteFunction<IA,Vector> submodelOutput2VectorConverter)
```
    Constructs an instance of this class.
    
    Parameters:
    
    aggregatorTrainer - Trainer of the model used to aggregate the results of submodels.
    
    aggregatingInputMerger - Binary operator used to merge the outputs of submodels into one output passed to the aggregator model.
    
    submodelInput2AggregatingInputConverter - Function used to convert the input of a submodel to the output type of a submodel; it is used only if the user chooses to keep the original features.
    
    submodelsTrainers - List of submodel trainers.
    
    vector2SubmodelInputConverter - Function used for conversion of Vector to submodel input.
    
    submodelOutput2VectorConverter - Function used for conversion of submodel output to Vector.
  - StackedDatasetTrainer
```
public StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer,
                             IgniteBinaryOperator<IA> aggregatingInputMerger,
                             IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter)
```
    Constructs an instance of this class.
    
    Parameters:
    
    aggregatorTrainer - Trainer of the model used to aggregate the results of submodels.
    
    aggregatingInputMerger - Binary operator used to merge the outputs of submodels into one output passed to the aggregator model.
    
    submodelInput2AggregatingInputConverter - Function used to convert the input of a submodel to the output type of a submodel; it is used only if the user chooses to keep the original features.
  - StackedDatasetTrainer
```
public StackedDatasetTrainer()
```
    Constructs an instance of this class.
- Method Detail
  - withOriginalFeaturesKept
```
public StackedDatasetTrainer<IS,IA,O,AM,L> withOriginalFeaturesKept(IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter)
```
    Keep original features during training and propagate submodel input to the aggregator during inference using the given function. Note that if this option is on, the aggregator is trained on vectors obtained by concatenating the features passed to the submodel trainers with the outputs of the submodels converted to vectors. This can, for example, influence the dimension of the aggregator model's input vector (if IS = Vector) or, more generally, some properties of IS that are not reflected by its type alone, so the converter should be written accordingly.
    
    Parameters:
    
    submodelInput2AggregatingInputConverter - Function used to propagate submodels input to aggregator.
    
    Returns:
    
    This object.
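    The effect on the aggregator input dimension can be seen in a small, library-free sketch. The class AggregatorFeaturesSketch and its method are hypothetical, plain arrays stand in for Vector, and the ordering (original features first, then submodel outputs) is an assumption made for illustration only.

```java
import java.util.Arrays;

/** Hypothetical helper showing the feature layout only; not part of the library. */
public class AggregatorFeaturesSketch {
    /**
     * Builds the feature vector used to train the aggregator.
     * Assumed layout when features are kept: original features, then submodel outputs.
     */
    public static double[] aggregatorFeatures(double[] original, double[] submodelOutputs, boolean keepOriginal) {
        if (!keepOriginal)
            return submodelOutputs.clone();

        // Copy the original features and append the converted submodel outputs.
        double[] res = Arrays.copyOf(original, original.length + submodelOutputs.length);
        System.arraycopy(submodelOutputs, 0, res, original.length, submodelOutputs.length);
        return res;
    }

    public static void main(String[] args) {
        double[] original = {1.0, 2.0, 3.0};
        double[] outputs = {0.5, 0.7};

        System.out.println(aggregatorFeatures(original, outputs, true).length);  // prints 5
        System.out.println(aggregatorFeatures(original, outputs, false).length); // prints 2
    }
}
```

    With three original features and two submodel outputs, the aggregator sees five inputs when features are kept and two when they are dropped, so a converter used at inference time has to produce inputs of the matching shape.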
  - withOriginalFeaturesDropped
```
public StackedDatasetTrainer<IS,IA,O,AM,L> withOriginalFeaturesDropped()
```
    Drop original features during training and inference.
    
    Returns:
    
    This object.
  - withSubmodelOutput2VectorConverter
```
public StackedDatasetTrainer<IS,IA,O,AM,L> withSubmodelOutput2VectorConverter(IgniteFunction<IA,Vector> submodelOutput2VectorConverter)
```
    Sets the function used to convert submodel output to Vector. This function is used while building the dataset for training the aggregator model; that dataset is augmented with the results of submodels converted to Vector.
    
    Parameters:
    
    submodelOutput2VectorConverter - Function used for conversion of submodel output to Vector.
    
    Returns:
    
    This object.
  - withVector2SubmodelInputConverter
```
public StackedDatasetTrainer<IS,IA,O,AM,L> withVector2SubmodelInputConverter(IgniteFunction<Vector,IS> vector2SubmodelInputConverter)
```
    Sets the function used to convert Vector to submodel input. This function is used while building the dataset for training the aggregator model; that dataset is augmented with the results of submodels applied to the Vectors in the original dataset.
    
    Parameters:
    
    vector2SubmodelInputConverter - Function used for conversion of Vector to submodel input.
    
    Returns:
    
    This object.
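    The two converters set by withSubmodelOutput2VectorConverter and withVector2SubmodelInputConverter bracket each submodel call: Vector -> IS -> IA -> Vector. The following library-free sketch shows that chain; ConverterPipelineSketch, Point and the toy submodel are hypothetical, and double[] stands in for Vector.

```java
import java.util.function.Function;

/** Hypothetical illustration of the converter chain; not part of the library. */
public class ConverterPipelineSketch {
    /** Hypothetical submodel input type (plays the role of IS). */
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Converts a dataset row to submodel input, applies the submodel, converts the output back. */
    public static <IS, IA> double[] submodelFeatures(
        double[] row,
        Function<double[], IS> vector2SubmodelInput,
        Function<IS, IA> submodel,
        Function<IA, double[]> submodelOutput2Vector) {
        IS input = vector2SubmodelInput.apply(row);
        IA output = submodel.apply(input);
        return submodelOutput2Vector.apply(output);
    }

    /** Runs the chain with a toy submodel that tests whether a point lies outside the unit circle. */
    public static double[] demo(double x, double y) {
        Function<double[], Point> toPoint = v -> new Point(v[0], v[1]);
        Function<Point, Boolean> outsideUnitCircle = p -> p.x * p.x + p.y * p.y > 1.0;
        Function<Boolean, double[]> toVector = b -> new double[] {b ? 1.0 : 0.0};

        return submodelFeatures(new double[] {x, y}, toPoint, outsideUnitCircle, toVector);
    }

    public static void main(String[] args) {
        System.out.println(demo(3.0, 4.0)[0]); // prints 1.0
        System.out.println(demo(0.1, 0.2)[0]); // prints 0.0
    }
}
```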
  - withAggregatorTrainer
```
public StackedDatasetTrainer<IS,IA,O,AM,L> withAggregatorTrainer(DatasetTrainer<AM,L> aggregatorTrainer)
```
    Specify aggregator trainer.
    
    Parameters:
    
    aggregatorTrainer - Aggregator trainer.
    
    Returns:
    
    This object.
  - withAggregatorInputMerger
```
public StackedDatasetTrainer<IS,IA,O,AM,L> withAggregatorInputMerger(IgniteBinaryOperator<IA> merger)
```
    Specifies the binary operator used to merge submodel outputs into one.
    
    Parameters:
    
    merger - Binary operator used to merge submodel outputs into one.
    
    Returns:
    
    This object.
  - addTrainer
```
public <M1 extends IgniteModel<IS,IA>> StackedDatasetTrainer<IS,IA,O,AM,L> addTrainer(DatasetTrainer<M1,L> trainer)
```
    Adds submodel trainer along with converters needed on training and inference stages.
    
    Parameters:
    
    trainer - Submodel trainer.
    
    Returns:
    
    This object.
  - fitWithInitializedDeployingContext
```
public <K,V> StackedModel<IS,IA,O,AM> fitWithInitializedDeployingContext(DatasetBuilder<K,V> datasetBuilder,
                                                                         Preprocessor<K,V> preprocessor)
```
    Trains model based on the specified data.
    
    Specified by:
    
    fitWithInitializedDeployingContext in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
    
    Type Parameters:
    
    K - Type of a key in upstream data.
    
    V - Type of a value in upstream data.
    
    Parameters:
    
    datasetBuilder - Dataset builder.
    
    preprocessor - Extractor of UpstreamEntry into LabeledVector.
    
    Returns:
    
    Model.
  - update
```
public <K,V> StackedModel<IS,IA,O,AM> update(StackedModel<IS,IA,O,AM> mdl,
                                             DatasetBuilder<K,V> datasetBuilder,
                                             Preprocessor<K,V> preprocessor)
```
    Compares the state of the given model with the training parameters of this trainer. If they are compatible, the trainer updates the model with the new data and returns the updated model; otherwise it trains a new model.
    
    Overrides:
    
    update in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
    
    Type Parameters:
    
    K - Type of a key in upstream data.
    
    V - Type of a value in upstream data.
    
    Parameters:
    
    mdl - Learned model.
    
    datasetBuilder - Dataset builder.
    
    preprocessor - Extractor of UpstreamEntry into LabeledVector.
    
    Returns:
    
    Updated model.
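    The contract described for update (update when the model is compatible, otherwise train a new one) can be sketched generically as follows. This is only an illustration of the contract, not this class's implementation, which overrides update directly; UpdateOrRetrainSketch and all of its parameter names are hypothetical.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

/** Hypothetical illustration of an update-or-retrain contract; not part of the library. */
public class UpdateOrRetrainSketch {
    /** Updates the model if it is present and compatible; otherwise trains a new one. */
    public static <M> M update(M mdl, Predicate<M> compatible, UnaryOperator<M> updater, Supplier<M> trainer) {
        if (mdl != null && compatible.test(mdl))
            return updater.apply(mdl);

        return trainer.get();
    }

    /** Toy run in which models are strings and "compatible" means the name starts with "v". */
    public static String demo(String mdl) {
        return update(mdl, m -> m.startsWith("v"), m -> m + "+update", () -> "v1");
    }

    public static void main(String[] args) {
        System.out.println(demo("v1"));     // prints v1+update
        System.out.println(demo("legacy")); // prints v1
        System.out.println(demo(null));     // prints v1
    }
}
```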
  - withEnvironmentBuilder
```
public StackedDatasetTrainer<IS,IA,O,AM,L> withEnvironmentBuilder(LearningEnvironmentBuilder envBuilder)
```
    Changes the learning environment.
    
    Overrides:
    
    withEnvironmentBuilder in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
    
    Parameters:
    
    envBuilder - Learning environment builder.
  - updateModel
```
protected <K,V> StackedModel<IS,IA,O,AM> updateModel(StackedModel<IS,IA,O,AM> mdl,
                                                     DatasetBuilder<K,V> datasetBuilder,
                                                     Preprocessor<K,V> preprocessor)
```
    This method is never called: instead of building the update logic from DatasetTrainer.isUpdateable(IgniteModel) and DatasetTrainer.updateModel(IgniteModel, DatasetBuilder, Preprocessor), this class explicitly overrides the update method.
    
    Specified by:
    
    updateModel in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
    
    Type Parameters:
    
    K - Type of a key in upstream data.
    
    V - Type of a value in upstream data.
    
    Parameters:
    
    mdl - Model.
    
    datasetBuilder - Dataset builder.
    
    preprocessor - Extractor of UpstreamEntry into LabeledVector.
    
    Returns:
    
    Updated model.
  - isUpdateable
```
public boolean isUpdateable(StackedModel<IS,IA,O,AM> mdl)
```
    This method is never called: instead of building the update logic from DatasetTrainer.isUpdateable(M) and DatasetTrainer.updateModel(M, org.apache.ignite.ml.dataset.DatasetBuilder<K, V>, org.apache.ignite.ml.preprocessing.Preprocessor<K, V>), this class explicitly overrides the update method.
    
    Specified by:
    
    isUpdateable in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
    
    Parameters:
    
    mdl - Model.
    
    Returns:
    
    True if the current training-critical parameters match the parameters from the last training.
