public class Dataset<Row extends DatasetRow> extends Object implements Serializable, Externalizable
Modifier and Type | Field and Description |
---|---|
protected int | colSize: Number of attributes in each vector. |
protected Row[] | data: Data to keep. |
protected boolean | isDistributed |
protected FeatureMetadata[] | meta: Metadata that identifies each feature. |
protected int | rowSize: Number of instances. |
Constructor and Description |
---|
Dataset(): Default constructor (required by Externalizable). |
Dataset(int rowSize, int colSize, String[] featureNames, boolean isDistributed): Creates a new Dataset initialized with an empty data structure. |
Dataset(Row[] data): Creates a new Dataset from the given data. |
Dataset(Row[] data, FeatureMetadata[] meta): Creates a new Dataset from the given data. |
Dataset(Row[] data, int colSize): Creates a new Dataset from the given data. |
Dataset(Row[] data, String[] featureNames, int colSize): Creates a new Dataset from the given data. |
Modifier and Type | Method and Description |
---|---|
int | colSize(): Gets the number of attributes. |
protected void | convertStringNamesToFeatureMetadata(String[] featureNames) |
DatasetRow[] | data() |
boolean | equals(Object o) |
Vector | features(int idx): Gets the features. |
protected void | generateFeatureNames() |
String | getFeatureName(int i): Returns the feature name for the column with the given index. |
Row | getRow(int idx): Retrieves the labeled vector at the given index. |
int | hashCode() |
boolean | isDistributed() |
FeatureMetadata[] | meta() |
void | readExternal(ObjectInput in) |
int | rowSize(): Gets the number of observations. |
void | setData(Row[] data) |
void | setDistributed(boolean distributed) |
void | setMeta(FeatureMetadata[] meta) |
void | writeExternal(ObjectOutput out) |
protected Row[] data
protected FeatureMetadata[] meta
protected int rowSize
protected int colSize
protected boolean isDistributed
public Dataset()
public Dataset(Row[] data, FeatureMetadata[] meta)
Parameters:
data - Given data. Should be initialized with at least one vector.
meta - Feature's metadata.
public Dataset(Row[] data, String[] featureNames, int colSize)
Parameters:
data - Given data. Should be initialized with at least one vector.
featureNames - Column names.
colSize - Number of observed attributes in each vector.
public Dataset(Row[] data, int colSize)
Parameters:
data - Should be initialized with at least one vector.
colSize - Number of observed attributes in each vector.
public Dataset(Row[] data)
Parameters:
data - Should be initialized with at least one vector.
public Dataset(int rowSize, int colSize, String[] featureNames, boolean isDistributed)
Parameters:
rowSize - Number of instances. Should be > 0.
colSize - Number of attributes. Should be > 0.
featureNames - Column names.
protected void convertStringNamesToFeatureMetadata(String[] featureNames)
protected void generateFeatureNames()
public String getFeatureName(int i)
Parameters:
i - The given index.
public DatasetRow[] data()
public void setData(Row[] data)
public FeatureMetadata[] meta()
public void setMeta(FeatureMetadata[] meta)
public int colSize()
public int rowSize()
public Row getRow(int idx)
Parameters:
idx - Index of observation.
public boolean isDistributed()
public void setDistributed(boolean distributed)
public Vector features(int idx)
Parameters:
idx - Index of observation.
public void writeExternal(ObjectOutput out) throws IOException
Specified by: writeExternal in interface Externalizable
Throws: IOException
public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException
Specified by: readExternal in interface Externalizable
Throws: IOException, ClassNotFoundException
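The Externalizable contract that Dataset follows (a public no-argument constructor plus the writeExternal and readExternal pair) can be illustrated with a small, self-contained sketch. TinyDataset below is a hypothetical stand-in, not the GridGain class. Which fields Dataset actually writes, and in what order, is not stated on this page, so the three fields and their order here are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

final class ExternalizableSketch {
    /** Hypothetical stand-in for illustration; NOT the GridGain Dataset class. */
    public static class TinyDataset implements Externalizable {
        private int rowSize;
        private int colSize;
        private boolean isDistributed;

        /** Public no-arg constructor: deserialization calls it before readExternal. */
        public TinyDataset() {
            // No-op.
        }

        public TinyDataset(int rowSize, int colSize, boolean isDistributed) {
            this.rowSize = rowSize;
            this.colSize = colSize;
            this.isDistributed = isDistributed;
        }

        public int rowSize() { return rowSize; }

        public int colSize() { return colSize; }

        public boolean isDistributed() { return isDistributed; }

        /** Writes the fields in a fixed order (assumed order, for illustration). */
        @Override public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(rowSize);
            out.writeInt(colSize);
            out.writeBoolean(isDistributed);
        }

        /** Reads the fields back in exactly the order they were written. */
        @Override public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            rowSize = in.readInt();
            colSize = in.readInt();
            isDistributed = in.readBoolean();
        }
    }

    /** Serializes the object to an in-memory buffer and reads a copy back. */
    public static TinyDataset roundTrip(TinyDataset src) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(src);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (TinyDataset) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        TinyDataset copy = roundTrip(new TinyDataset(2, 3, true));
        System.out.println(copy.rowSize() + " x " + copy.colSize() + ", distributed=" + copy.isDistributed());
    }
}
```

If the public no-argument constructor were removed from the stand-in, readObject would fail with java.io.InvalidClassException, which is why the Dataset() constructor is documented as required by Externalizable.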
GridGain In-Memory Computing Platform: ver. 8.9.15. Release Date: December 3, 2024.