public class DatasetFactory extends Object
Dataset construction is based on three major concepts: a partition upstream, context and data. A partition upstream is a data source that is assumed to be available at all times, regardless of node failures and rebalancing events. A partition context is the part of a partition that is maintained during the whole computation process and kept in reliable storage, so that, like the upstream, it stays available and consistent regardless of node failures and rebalancing events. Partition data is the part of a partition that is maintained during the computation process in unreliable local storage, such as heap, off-heap or GPU memory, on the node where the current computation is performed. Partition data can therefore be lost as a result of a node failure or rebalancing, but it can be restored from the upstream and the partition context.
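The relationship between the three concepts can be illustrated with a small standalone model. This is only a sketch under stated assumptions: it uses plain JDK types, and `PartitionRecoverySketch`, `loseData` and the other names are hypothetical and not part of the Ignite API. It shows local data being dropped and then rebuilt from the upstream and the context.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of one partition; none of these names belong to the Ignite API. */
final class PartitionRecoverySketch {
    /** Upstream: the source rows, assumed to be always available. */
    private final Map<Integer, double[]> upstream;

    /** Context: small state kept in reliable storage (here, just the row count). */
    private final int ctxRowCount;

    /** Data: a local working copy that may be lost at any time. */
    private List<double[]> data;

    PartitionRecoverySketch(Map<Integer, double[]> upstream) {
        this.upstream = upstream;
        this.ctxRowCount = upstream.size(); // built once from the upstream
        this.data = buildData();
    }

    /** Rebuilds partition data from the upstream and the context. */
    private List<double[]> buildData() {
        List<double[]> rows = new ArrayList<>(ctxRowCount);
        for (double[] row : upstream.values())
            rows.add(row.clone());
        return rows;
    }

    /** Simulates a node failure or rebalancing that drops the local data. */
    void loseData() {
        data = null;
    }

    /** Reports whether the local data is currently present. */
    boolean dataIsLoaded() {
        return data != null;
    }

    /** Runs a computation over the data, restoring it first if it was lost. */
    double sumOfFirstFeature() {
        if (data == null)
            data = buildData();
        double sum = 0.0;
        for (double[] row : data)
            sum += row[0];
        return sum;
    }

    public static void main(String[] args) {
        Map<Integer, double[]> upstream = new HashMap<>();
        upstream.put(1, new double[] {1.0, 10.0});
        upstream.put(2, new double[] {2.0, 20.0});

        PartitionRecoverySketch part = new PartitionRecoverySketch(upstream);
        double before = part.sumOfFirstFeature();
        part.loseData();
        double after = part.sumOfFirstFeature(); // data is rebuilt transparently
        System.out.println(before == after);
    }
}
```

In this model the computation never observes the loss, because the data is rebuilt on demand from the two reliable sources.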
A partition context and data are built on top of an upstream by using the specified builders, PartitionContextBuilder and PartitionDataBuilder respectively. To build a generic dataset, the following approach is used:
Dataset<C, D> dataset = DatasetFactory.create(
    ignite,
    cache,
    partitionContextBuilder,
    partitionDataBuilder
);
In addition to the generic building method create, this factory provides methods for creating specific dataset types, such as createSimpleDataset to create a SimpleDataset and createSimpleLabeledDataset to create a SimpleLabeledDataset.
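The role of the extractors taken by these specific methods can be sketched without Ignite on the classpath. The following is an illustration only: `java.util.function.BiFunction` stands in for IgniteBiFunction, `double[]` stands in for Vector, and `ExtractorSketch` with its `extract` method is a hypothetical helper rather than anything the factory exposes.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Hypothetical stand-in: BiFunction replaces IgniteBiFunction, double[] replaces Vector. */
final class ExtractorSketch {
    /** Applies an extractor to every upstream entry, producing one row per entry. */
    static <K, V> double[][] extract(Map<K, V> upstream, BiFunction<K, V, double[]> extractor) {
        double[][] rows = new double[upstream.size()][];
        int i = 0;
        for (Map.Entry<K, V> e : upstream.entrySet())
            rows[i++] = extractor.apply(e.getKey(), e.getValue());
        return rows;
    }

    public static void main(String[] args) {
        // Each upstream value holds {age, salary, approved}; this layout is made up for the example.
        Map<Integer, double[]> upstream = new LinkedHashMap<>();
        upstream.put(1, new double[] {30.0, 1000.0, 1.0});
        upstream.put(2, new double[] {40.0, 2000.0, 0.0});

        // A simple dataset needs only features; a labeled dataset needs labels as well.
        double[][] features = extract(upstream, (k, v) -> new double[] {v[0], v[1]});
        double[][] labels = extract(upstream, (k, v) -> new double[] {v[2]});

        System.out.println(features.length + " rows, " + features[0].length + " features, "
            + labels[0].length + " label per row");
    }
}
```

Both extractors receive the key and the value of each upstream entry, so either one may derive its output from the key as well.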
See Also: Dataset, PartitionContextBuilder, PartitionDataBuilder
Constructor and Description |
---|
DatasetFactory() |
Modifier and Type | Method and Description |
---|---|
static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> | create(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder) Creates a new instance of distributed dataset using the specified partCtxBuilder and partDataBuilder. |
static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> | create(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder) Creates a new instance of distributed dataset using the specified partCtxBuilder and partDataBuilder. |
static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> | create(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder) Creates a new instance of local dataset using the specified partCtxBuilder and partDataBuilder. |
static <K,V> SimpleDataset<EmptyContext> | createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) Creates a new instance of distributed SimpleDataset using the specified featureExtractor. |
static <K,V,C extends Serializable> SimpleDataset<C> | createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) Creates a new instance of distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. |
static <K,V> SimpleDataset<EmptyContext> | createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor) Creates a new instance of distributed SimpleDataset using the specified featureExtractor. |
static <K,V,C extends Serializable> SimpleDataset<C> | createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) Creates a new instance of distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. |
static <K,V> SimpleDataset<EmptyContext> | createSimpleDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor) Creates a new instance of local SimpleDataset using the specified featureExtractor. |
static <K,V,C extends Serializable> SimpleDataset<C> | createSimpleDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor) Creates a new instance of local SimpleDataset using the specified partCtxBuilder and featureExtractor. |
static <K,V> SimpleLabeledDataset<EmptyContext> | createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) Creates a new instance of distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. |
static <K,V,C extends Serializable> SimpleLabeledDataset<C> | createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) Creates a new instance of distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. |
static <K,V> SimpleLabeledDataset<EmptyContext> | createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) Creates a new instance of distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. |
static <K,V,C extends Serializable> SimpleLabeledDataset<C> | createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) Creates a new instance of distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. |
static <K,V> SimpleLabeledDataset<EmptyContext> | createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) Creates a new instance of local SimpleLabeledDataset using the specified featureExtractor and lbExtractor. |
static <K,V,C extends Serializable> SimpleLabeledDataset<C> | createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor) Creates a new instance of local SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. |
public static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> create(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder)
Creates a new instance of distributed dataset using the specified partCtxBuilder and partDataBuilder. This is the generic method that allows creating any Ignite Cache based dataset with any desired partition context and data.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
D - Type of partition data.
datasetBuilder - Dataset builder.
partCtxBuilder - Partition context builder.
partDataBuilder - Partition data builder.

public static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> create(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder)
Creates a new instance of distributed dataset using the specified partCtxBuilder and partDataBuilder. This is the generic method that allows creating any Ignite Cache based dataset with any desired partition context and data.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
D - Type of partition data.
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
partCtxBuilder - Partition context builder.
partDataBuilder - Partition data builder.

public static <K,V,C extends Serializable> SimpleDataset<C> createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. This method determines partition data to be SimpleDatasetData, but allows using any desired type of partition context.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
datasetBuilder - Dataset builder.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V,C extends Serializable> SimpleDataset<C> createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of distributed SimpleDataset using the specified partCtxBuilder and featureExtractor. This method determines partition data to be SimpleDatasetData, but allows using any desired type of partition context.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V,C extends Serializable> SimpleLabeledDataset<C> createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. This method determines partition data to be SimpleLabeledDatasetData, but allows using any desired type of partition context.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
datasetBuilder - Dataset builder.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V,C extends Serializable> SimpleLabeledDataset<C> createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of distributed SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. This method determines partition data to be SimpleLabeledDatasetData, but allows using any desired type of partition context.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V> SimpleDataset<EmptyContext> createSimpleDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of distributed SimpleDataset using the specified featureExtractor. This method determines partition context to be EmptyContext and partition data to be SimpleDatasetData.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
datasetBuilder - Dataset builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V> SimpleDataset<EmptyContext> createSimpleDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of distributed SimpleDataset using the specified featureExtractor. This method determines partition context to be EmptyContext and partition data to be SimpleDatasetData.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V> SimpleLabeledDataset<EmptyContext> createSimpleLabeledDataset(DatasetBuilder<K,V> datasetBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. This method determines partition context to be EmptyContext and partition data to be SimpleLabeledDatasetData.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
datasetBuilder - Dataset builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V> SimpleLabeledDataset<EmptyContext> createSimpleLabeledDataset(Ignite ignite, IgniteCache<K,V> upstreamCache, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of distributed SimpleLabeledDataset using the specified featureExtractor and lbExtractor. This method determines partition context to be EmptyContext and partition data to be SimpleLabeledDatasetData.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
ignite - Ignite instance.
upstreamCache - Ignite Cache with upstream data.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V,C extends Serializable,D extends AutoCloseable> Dataset<C,D> create(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, PartitionDataBuilder<K,V,C,D> partDataBuilder)
Creates a new instance of local dataset using the specified partCtxBuilder and partDataBuilder. This is the generic method that allows creating any local, Map based dataset with any desired partition context and data.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
D - Type of partition data.
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
partCtxBuilder - Partition context builder.
partDataBuilder - Partition data builder.

public static <K,V,C extends Serializable> SimpleDataset<C> createSimpleDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of local SimpleDataset using the specified partCtxBuilder and featureExtractor. This method determines partition data to be SimpleDatasetData, but allows using any desired type of partition context.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V,C extends Serializable> SimpleLabeledDataset<C> createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, PartitionContextBuilder<K,V,C> partCtxBuilder, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of local SimpleLabeledDataset using the specified partCtxBuilder, featureExtractor and lbExtractor. This method determines partition data to be SimpleLabeledDatasetData, but allows using any desired type of partition context.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
C - Type of a partition context.
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
partCtxBuilder - Partition context builder.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.

public static <K,V> SimpleDataset<EmptyContext> createSimpleDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor)
Creates a new instance of local SimpleDataset using the specified featureExtractor. This method determines partition context to be EmptyContext and partition data to be SimpleDatasetData.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
featureExtractor - Feature extractor used to extract features and build SimpleDatasetData.

public static <K,V> SimpleLabeledDataset<EmptyContext> createSimpleLabeledDataset(Map<K,V> upstreamMap, int partitions, IgniteBiFunction<K,V,Vector> featureExtractor, IgniteBiFunction<K,V,double[]> lbExtractor)
Creates a new instance of local SimpleLabeledDataset using the specified featureExtractor and lbExtractor. This method determines partition context to be EmptyContext and partition data to be SimpleLabeledDatasetData.
K - Type of a key in upstream data.
V - Type of a value in upstream data.
upstreamMap - Map with upstream data.
partitions - Number of partitions the upstream Map will be divided into.
featureExtractor - Feature extractor used to extract features and build SimpleLabeledDatasetData.
lbExtractor - Label extractor used to extract labels and build SimpleLabeledDatasetData.
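For the Map based methods above, the partitions argument controls how many pieces the upstream Map is divided into. The sketch below shows one possible way to do such a split, using round-robin assignment in iteration order. This is an assumption made for illustration, not a description of how Ignite divides the map internally, and `LocalPartitioningSketch` and `split` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; the real local dataset builder may divide the map differently. */
final class LocalPartitioningSketch {
    /** Splits the upstream map into the given number of partitions, round-robin by iteration order. */
    static <K, V> List<Map<K, V>> split(Map<K, V> upstreamMap, int partitions) {
        if (partitions <= 0)
            throw new IllegalArgumentException("partitions must be positive");

        List<Map<K, V>> parts = new ArrayList<>(partitions);
        for (int p = 0; p < partitions; p++)
            parts.add(new LinkedHashMap<K, V>());

        int i = 0;
        for (Map.Entry<K, V> e : upstreamMap.entrySet())
            parts.get(i++ % partitions).put(e.getKey(), e.getValue());

        return parts;
    }

    public static void main(String[] args) {
        Map<Integer, String> upstream = new LinkedHashMap<>();
        for (int k = 0; k < 5; k++)
            upstream.put(k, "row-" + k);

        // Each resulting map plays the role of one partition's upstream.
        for (Map<Integer, String> part : split(upstream, 2))
            System.out.println(part.keySet());
    }
}
```

With more partitions than entries, some partitions in this sketch are simply empty, which a real builder might treat differently.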
Ignite Database and Caching Platform : ver. 2.7.2 Release Date : February 6 2019