K - Type of a key in upstream data.
V - Type of a value in upstream data.
public final class StringEncoderPreprocessor<K,V> extends EncoderPreprocessor<K,V> implements DeployableObject
This preprocessor can transform multiple columns whose indices are registered during the training process. These indices can be defined via the .withEncodedFeature(featureIndex) call.
NOTE: it does not add new columns; it changes the data in place.
There is only one strategy for how StringEncoder handles unseen labels when you have fit a StringEncoder on one dataset and then use it to transform another: unseen labels are put in a special additional bucket, at the index equal to amountOfCategories.
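The behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the library and is not its implementation: the class `StringEncodingSketch`, the helper `encode`, and the sample categories are hypothetical names chosen for illustration, and a `List` of maps stands in for the `Map<String,Integer>[]` argument. The sketch assumes that handled columns hold strings and all other columns hold numbers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StringEncodingSketch {
    /**
     * Hypothetical helper: replaces each handled column with its integer code
     * and passes every other (numeric) column through unchanged.
     * The result has the same number of columns as the input row.
     */
    static double[] encode(Object[] row,
                           List<Map<String, Integer>> encodingValues,
                           Set<Integer> handledIndices) {
        double[] res = new double[row.length];

        for (int i = 0; i < row.length; i++) {
            if (handledIndices.contains(i)) {
                Map<String, Integer> codes = encodingValues.get(i);
                Integer code = codes.get(String.valueOf(row[i]));

                // Unseen label: use the extra bucket whose index equals
                // the number of known categories for this column.
                res[i] = code != null ? code : codes.size();
            }
            else
                res[i] = ((Number) row[i]).doubleValue();
        }

        return res;
    }

    public static void main(String[] args) {
        // Column 0 is categorical with three known labels; column 1 is numeric.
        List<Map<String, Integer>> encodingValues =
            List.of(Map.of("red", 0, "green", 1, "blue", 2));
        Set<Integer> handledIndices = Set.of(0);

        // Known label "green" maps to its code.
        System.out.println(Arrays.toString(
            encode(new Object[] {"green", 4.5}, encodingValues, handledIndices))); // [1.0, 4.5]

        // Unseen label "purple" falls into the extra bucket at index 3.
        System.out.println(Arrays.toString(
            encode(new Object[] {"purple", 1.0}, encodingValues, handledIndices))); // [3.0, 1.0]
    }
}
```

Note that the output row keeps the same width as the input row, which matches the in-place behavior stated in the note above.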
Modifier and Type | Field and Description |
---|---|
protected static long | serialVersionUID |
basePreprocessor, encodingValues, handledIndices, KEY_FOR_NULL_VALUES
Constructor and Description |
---|
StringEncoderPreprocessor(Map<String,Integer>[] encodingValues, Preprocessor<K,V> basePreprocessor, Set<Integer> handledIndices) Constructs a new instance of String Encoder preprocessor. |
Modifier and Type | Method and Description |
---|---|
LabeledVector | apply(K k, V v) Applies this preprocessor. |
List<Object> | getDependencies() Returns dependencies of this object that may be objects whose classes are defined on the client side and unknown to the server. |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
map
andThen
andThen
protected static final long serialVersionUID
public StringEncoderPreprocessor(Map<String,Integer>[] encodingValues, Preprocessor<K,V> basePreprocessor, Set<Integer> handledIndices)
basePreprocessor - Base preprocessor.
handledIndices - Handled indices.
public LabeledVector apply(K k, V v)
apply in interface BiFunction<K,V,LabeledVector>
k - Key.
v - Value.
public List<Object> getDependencies()
getDependencies in interface DeployableObject
GridGain In-Memory Computing Platform : ver. 8.9.14 Release Date : November 5 2024