IS - Type of submodels input.
IA - Type of aggregator input.
O - Type of aggregator output.
L - Type of labels.
public class StackedDatasetTrainer<IS,IA,O,AM extends IgniteModel<IA,O>,L> extends DatasetTrainer<StackedModel<IS,IA,O,AM>,L>
DatasetTrainer encapsulating the stacking technique for model training.
The model produced by this trainer consists of two layers. The first layer is a model IS -> IA. This layer is a "parallel" composition of several "submodels", each of which is itself a model IS -> IA, with their outputs [IA] merged into a single IA. The second layer is an aggregator model IA -> O.
Training corresponds to this layered structure in the following way:
1. train the models of the first layer;
2. train the aggregator model on a dataset augmented with the outputs of the first-layer models converted to vectors.
During the second step we can choose whether to keep the original features along with the converted outputs of the first-layer models or to use only the converted outputs. This choice also affects inference.
This class is the most general stacked trainer; StackedVectorDatasetTrainer is a shortcut version of it with some types and functions specified.
Nested classes inherited from DatasetTrainer: DatasetTrainer.EmptyDatasetException
Fields inherited from DatasetTrainer: envBuilder, environment
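The layered structure described in the class overview can be illustrated with a minimal plain-Java sketch. It does not use the Ignite classes: StackedSketch and demo are hypothetical names, and the java.util.function interfaces stand in for the submodels, the merger, and the aggregator model as an assumption made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Plain-Java illustration of the two layers; this is NOT the Ignite StackedModel class. */
public class StackedSketch<IS, IA, O> {
    /** First layer: several submodels IS -> IA. */
    private final List<Function<IS, IA>> submodels;

    /** Merges the submodel outputs [IA] into a single IA. */
    private final BinaryOperator<IA> merger;

    /** Second layer: aggregator IA -> O. */
    private final Function<IA, O> aggregator;

    public StackedSketch(List<Function<IS, IA>> submodels, BinaryOperator<IA> merger,
        Function<IA, O> aggregator) {
        if (submodels.isEmpty())
            throw new IllegalArgumentException("At least one submodel is required");

        this.submodels = submodels;
        this.merger = merger;
        this.aggregator = aggregator;
    }

    /** Applies every submodel to the same input, merges the outputs, then aggregates. */
    public O predict(IS input) {
        IA merged = submodels.get(0).apply(input);

        for (int i = 1; i < submodels.size(); i++)
            merged = merger.apply(merged, submodels.get(i).apply(input));

        return aggregator.apply(merged);
    }

    /** Hypothetical demo: two toy submodels, list concatenation as merger, sum as aggregator. */
    public static double demo() {
        Function<Double, List<Double>> plusOne = x -> List.of(x + 1);
        Function<Double, List<Double>> twice = x -> List.of(x * 2);

        BinaryOperator<List<Double>> concat = (a, b) -> {
            List<Double> res = new ArrayList<>(a);
            res.addAll(b);
            return res;
        };

        Function<List<Double>, Double> sum = v -> v.stream().mapToDouble(Double::doubleValue).sum();

        StackedSketch<Double, List<Double>, Double> mdl =
            new StackedSketch<>(List.of(plusOne, twice), concat, sum);

        return mdl.predict(3.0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the merger is applied pairwise from left to right, which mirrors the role of a binary operator over IA; the order in which a real trainer merges outputs is not stated in this page.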
Constructor and Description |
---|
StackedDatasetTrainer(): Constructs an instance of this class. |
StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer, IgniteBinaryOperator<IA> aggregatingInputMerger, IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter): Constructs an instance of this class. |
StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer, IgniteBinaryOperator<IA> aggregatingInputMerger, IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter, List<DatasetTrainer<IgniteModel<IS,IA>,L>> submodelsTrainers, IgniteFunction<Vector,IS> vector2SubmodelInputConverter, IgniteFunction<IA,Vector> submodelOutput2VectorConverter): Creates an instance of this class. |
Modifier and Type | Method and Description |
---|---|
<M1 extends IgniteModel<IS,IA>> StackedDatasetTrainer<IS,IA,O,AM,L> | addTrainer(DatasetTrainer<M1,L> trainer): Adds a submodel trainer along with the converters needed at the training and inference stages. |
<K,V> StackedModel<IS,IA,O,AM> | fitWithInitializedDeployingContext(DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor): Trains a model based on the specified data. |
boolean | isUpdateable(StackedModel<IS,IA,O,AM> mdl): This method is never called: instead of constructing the update logic from DatasetTrainer.isUpdateable(M) and DatasetTrainer.updateModel(M, org.apache.ignite.ml.dataset.DatasetBuilder<K, V>, org.apache.ignite.ml.preprocessing.Preprocessor<K, V>), this class explicitly overrides the update method. |
<K,V> StackedModel<IS,IA,O,AM> | update(StackedModel<IS,IA,O,AM> mdl, DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor): Gets the state of the model passed in the arguments, compares it with the training parameters of the trainer and, if they are compatible, updates the model according to the new data and returns the new model. |
protected <K,V> StackedModel<IS,IA,O,AM> | updateModel(StackedModel<IS,IA,O,AM> mdl, DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor): This method is never called: instead of constructing the update logic from DatasetTrainer.isUpdateable(IgniteModel) and DatasetTrainer.updateModel(IgniteModel, DatasetBuilder, Preprocessor), this class explicitly overrides the update method. |
StackedDatasetTrainer<IS,IA,O,AM,L> | withAggregatorInputMerger(IgniteBinaryOperator<IA> merger): Specifies the binary operator used to merge submodel outputs into one. |
StackedDatasetTrainer<IS,IA,O,AM,L> | withAggregatorTrainer(DatasetTrainer<AM,L> aggregatorTrainer): Specifies the aggregator trainer. |
StackedDatasetTrainer<IS,IA,O,AM,L> | withEnvironmentBuilder(LearningEnvironmentBuilder envBuilder): Changes the learning environment. |
StackedDatasetTrainer<IS,IA,O,AM,L> | withOriginalFeaturesDropped(): Drops original features during training and inference. |
StackedDatasetTrainer<IS,IA,O,AM,L> | withOriginalFeaturesKept(IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter): Keeps original features during training and propagates the submodels' input to the aggregator during inference using the given function. |
StackedDatasetTrainer<IS,IA,O,AM,L> | withSubmodelOutput2VectorConverter(IgniteFunction<IA,Vector> submodelOutput2VectorConverter): Sets the function used for conversion of submodel output to Vector. |
StackedDatasetTrainer<IS,IA,O,AM,L> | withVector2SubmodelInputConverter(IgniteFunction<Vector,IS> vector2SubmodelInputConverter): Sets the function used for conversion of Vector to submodel input. |
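The roles of the two converters (Vector to submodel input, submodel output to Vector) in building the aggregator's training set can be sketched in plain Java. This is an illustration under stated assumptions, not Ignite code: AugmentSketch, augment, and demo are hypothetical names, double[] stands in for Vector, and the concatenation order of submodel outputs is assumed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Plain-Java illustration of dataset augmentation; this is NOT Ignite code. */
public class AugmentSketch {
    /**
     * Builds one row of the aggregator's training set from one original feature vector:
     * vector -> submodel input -> submodel outputs -> vectors, concatenated in submodel order.
     */
    public static <IS, IA> double[] augment(double[] features,
        Function<double[], IS> vector2SubmodelInput,
        List<Function<IS, IA>> submodels,
        Function<IA, double[]> submodelOutput2Vector) {
        List<Double> row = new ArrayList<>();

        // Convert the original vector to the type the submodels accept.
        IS in = vector2SubmodelInput.apply(features);

        for (Function<IS, IA> submodel : submodels) {
            // Convert each submodel output back to a vector and append it to the row.
            for (double v : submodelOutput2Vector.apply(submodel.apply(in)))
                row.add(v);
        }

        return row.stream().mapToDouble(Double::doubleValue).toArray();
    }

    /** Hypothetical demo: submodels compute the sum and the product of two features. */
    public static double[] demo() {
        List<Function<double[], Double>> submodels = List.of(
            v -> v[0] + v[1],
            v -> v[0] * v[1]);

        return augment(new double[] {2.0, 3.0}, v -> v, submodels, out -> new double[] {out});
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(demo()));
    }
}
```

Here the submodel input type is the vector itself, so the first converter is the identity; with a different IS the same shape applies with a non-trivial converter.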
Methods inherited from class DatasetTrainer: fit, fit, fit, fit, fit, fit, getLastTrainedModelOrThrowEmptyDatasetException, identityTrainer, learningEnvironment, update, update, update, update, withConvertedLabels
public StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer, IgniteBinaryOperator<IA> aggregatingInputMerger, IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter, List<DatasetTrainer<IgniteModel<IS,IA>,L>> submodelsTrainers, IgniteFunction<Vector,IS> vector2SubmodelInputConverter, IgniteFunction<IA,Vector> submodelOutput2VectorConverter)
Creates an instance of this class.
aggregatorTrainer - Trainer of the model used for aggregation of the results of submodels.
aggregatingInputMerger - Binary operator used to merge the outputs of submodels into one output passed to the aggregator model.
submodelInput2AggregatingInputConverter - Function used to convert submodel input (IS) to the submodel output type (IA) so that it can be merged into the aggregator input; this function is used if the user chooses to keep original features.
submodelsTrainers - List of submodel trainers.
public StackedDatasetTrainer(DatasetTrainer<AM,L> aggregatorTrainer, IgniteBinaryOperator<IA> aggregatingInputMerger, IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter)
Constructs an instance of this class.
aggregatorTrainer - Trainer of the model used for aggregation of the results of submodels.
aggregatingInputMerger - Binary operator used to merge the outputs of submodels into one output passed to the aggregator model.
submodelInput2AggregatingInputConverter - Function used to convert submodel input (IS) to the submodel output type (IA) so that it can be merged into the aggregator input; this function is used if the user chooses to keep original features.
public StackedDatasetTrainer()
Constructs an instance of this class.
public StackedDatasetTrainer<IS,IA,O,AM,L> withOriginalFeaturesKept(IgniteFunction<IS,IA> submodelInput2AggregatingInputConverter)
Keep original features during training and propagate submodels input to aggregator during inference using given function. ... IS = Vector), or, more generally, some IS parameters which are not reflected just by its type. So the converter should be written accordingly.
submodelInput2AggregatingInputConverter - Function used to propagate submodels input to aggregator.
public StackedDatasetTrainer<IS,IA,O,AM,L> withOriginalFeaturesDropped()
Drop original features during training and inference.
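The difference between keeping and dropping original features at inference time can be sketched in plain Java. This is an assumption-laden illustration rather than the Ignite implementation: KeepFeaturesSketch, firstLayer, and demo are hypothetical names, and passing null as the converter is used here only as a stand-in for the "dropped" mode.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Plain-Java illustration of keeping vs. dropping original features; NOT Ignite code. */
public class KeepFeaturesSketch {
    /**
     * Builds the first layer IS -> IA. If inputConverter is non-null (features kept), the converted
     * input is merged with the submodel outputs; if it is null (features dropped), only the
     * submodel outputs are merged.
     */
    public static <IS, IA> Function<IS, IA> firstLayer(List<Function<IS, IA>> submodels,
        BinaryOperator<IA> merger, Function<IS, IA> inputConverter) {
        return input -> {
            // Start from the propagated original input, if the caller chose to keep it.
            IA merged = inputConverter == null ? null : inputConverter.apply(input);

            for (Function<IS, IA> submodel : submodels) {
                IA out = submodel.apply(input);
                merged = merged == null ? out : merger.apply(merged, out);
            }

            return merged;
        };
    }

    /** Hypothetical demo: submodels compute the sum and the product of two features. */
    public static List<Double> demo(boolean keep) {
        Function<double[], List<Double>> sum = v -> List.of(v[0] + v[1]);
        Function<double[], List<Double>> product = v -> List.of(v[0] * v[1]);

        BinaryOperator<List<Double>> concat = (a, b) -> {
            List<Double> res = new ArrayList<>(a);
            res.addAll(b);
            return res;
        };

        // The converter has to know how many features the input carries,
        // which is not visible from the type double[] alone.
        Function<double[], List<Double>> converter = v -> List.of(v[0], v[1]);

        return firstLayer(List.of(sum, product), concat, keep ? converter : null)
            .apply(new double[] {2.0, 3.0});
    }

    public static void main(String[] args) {
        System.out.println(demo(true));
        System.out.println(demo(false));
    }
}
```

The converter in the demo is written for exactly two features, which illustrates why it may depend on properties of the input that its type does not express.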
public StackedDatasetTrainer<IS,IA,O,AM,L> withSubmodelOutput2VectorConverter(IgniteFunction<IA,Vector> submodelOutput2VectorConverter)
Set function used for conversion of submodel output to Vector. This function is used while building the dataset for training the aggregator model. This dataset is augmented with the results of submodels converted to Vector.
submodelOutput2VectorConverter - Function used for conversion of submodel output to Vector.
public StackedDatasetTrainer<IS,IA,O,AM,L> withVector2SubmodelInputConverter(IgniteFunction<Vector,IS> vector2SubmodelInputConverter)
Set function used for conversion of Vector to submodel input. This function is used while building the dataset for training the aggregator model. This dataset is augmented with the results of submodels applied to the Vectors in the original dataset.
vector2SubmodelInputConverter - Function used for conversion of Vector to submodel input.
public StackedDatasetTrainer<IS,IA,O,AM,L> withAggregatorTrainer(DatasetTrainer<AM,L> aggregatorTrainer)
Specify aggregator trainer.
aggregatorTrainer - Aggregator trainer.
public StackedDatasetTrainer<IS,IA,O,AM,L> withAggregatorInputMerger(IgniteBinaryOperator<IA> merger)
Specify binary operator used to merge submodels outputs to one.
merger - Binary operator used to merge submodels outputs to one.
public <M1 extends IgniteModel<IS,IA>> StackedDatasetTrainer<IS,IA,O,AM,L> addTrainer(DatasetTrainer<M1,L> trainer)
Adds submodel trainer along with converters needed on training and inference stages.
trainer - Submodel trainer.
public <K,V> StackedModel<IS,IA,O,AM> fitWithInitializedDeployingContext(DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor)
Trains model based on the specified data.
fitWithInitializedDeployingContext in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
K - Type of a key in upstream data.
V - Type of a value in upstream data.
datasetBuilder - Dataset builder.
preprocessor - Extractor of UpstreamEntry into LabeledVector.
public <K,V> StackedModel<IS,IA,O,AM> update(StackedModel<IS,IA,O,AM> mdl, DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor)
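Gets the state of the model passed in the arguments, compares it with the training parameters of the trainer and, if they are compatible, updates the model according to the new data and returns the new model.
update in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
K - Type of a key in upstream data.
V - Type of a value in upstream data.
mdl - Learned model.
datasetBuilder - Dataset builder.
preprocessor - Extractor of UpstreamEntry into LabeledVector.
public StackedDatasetTrainer<IS,IA,O,AM,L> withEnvironmentBuilder(LearningEnvironmentBuilder envBuilder)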
Changes learning Environment.
withEnvironmentBuilder in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
envBuilder - Learning environment builder.
protected <K,V> StackedModel<IS,IA,O,AM> updateModel(StackedModel<IS,IA,O,AM> mdl, DatasetBuilder<K,V> datasetBuilder, Preprocessor<K,V> preprocessor)
This method is never called: instead of constructing the update logic from DatasetTrainer.isUpdateable(IgniteModel) and DatasetTrainer.updateModel(IgniteModel, DatasetBuilder, Preprocessor), this class explicitly overrides the update method.
updateModel in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
K - Type of a key in upstream data.
V - Type of a value in upstream data.
mdl - Model.
datasetBuilder - Dataset builder.
preprocessor - Extractor of UpstreamEntry into LabeledVector.
public boolean isUpdateable(StackedModel<IS,IA,O,AM> mdl)
This method is never called: instead of constructing the update logic from DatasetTrainer.isUpdateable(M) and DatasetTrainer.updateModel(M, org.apache.ignite.ml.dataset.DatasetBuilder<K, V>, org.apache.ignite.ml.preprocessing.Preprocessor<K, V>), this class explicitly overrides the update method.
isUpdateable in class DatasetTrainer<StackedModel<IS,IA,O,AM extends IgniteModel<IA,O>>,L>
mdl - Model.
GridGain In-Memory Computing Platform : ver. 8.9.14 Release Date : November 5 2024