public class IgfsGroupDataBlocksKeyMapper
extends org.apache.ignite.internal.processors.cache.GridCacheDefaultAffinityKeyMapper
IGFS class that provides the ability to group a file's data blocks together on one node.
All blocks within the same group are guaranteed to be cached together on the same node.
Group size parameter controls how many sequential blocks will be cached together on the same node.
For example, if the block size is 64 KB and the group size is 256, then each group will contain 64 KB * 256 = 16 MB. Larger group sizes reduce the number of splits required to run map-reduce tasks, but increase the imbalance in the amount of data stored on different nodes.
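The sizing arithmetic above can be checked with a small sketch. The class and method names (`GroupSizing`, `groupBytes`, `groupIndex`) are hypothetical and not part of the Ignite API. Mapping a block to a group by integer division is only a simplified model of "sequential blocks are grouped together"; it is an assumption, not a description of how this mapper computes its affinity key.

```java
public class GroupSizing {
    /** Bytes covered by one group: block size multiplied by the number of blocks per group. */
    static long groupBytes(long blockSizeBytes, int groupSize) {
        return blockSizeBytes * groupSize;
    }

    /** Simplified model (assumption): consecutive block indexes share one group index. */
    static long groupIndex(long blockIndex, int groupSize) {
        return blockIndex / groupSize;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024; // 64 KB block size
        int groupSize = 256;         // blocks per group

        long mb = groupBytes(blockSize, groupSize) / (1024 * 1024);
        System.out.println(mb + " MB per group");

        // Blocks 0..255 fall into the first group; block 256 starts the next one.
        System.out.println("block 255 -> group " + groupIndex(255, groupSize));
        System.out.println("block 256 -> group " + groupIndex(256, groupSize));
    }
}
```

Under this model, doubling the group size halves the number of groups per file, which is why larger groups spread data less evenly across nodes.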
Note that the getGroupSize() parameter must correlate with the Hadoop split size parameter, defined in Hadoop via the mapred.max.split.size property. Ideally, all blocks accessed within one split should map to one group, so that they are located on the same grid node.
For example, the default Hadoop split size is 64 MB and the default IGFS block size is 64 KB. This means that to make sure each split only goes through blocks on the same node (without hopping between nodes over the network), the getGroupSize() value has to be equal to 64 MB / 64 KB = 1024.
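As a rough sketch of that calculation, the group size can be derived from the split size and the block size. The helper `groupSizeFor` and the class `SplitAlignedGroupSize` are hypothetical names introduced for illustration. Rounding up when the split size is not an exact multiple of the block size is an assumption made here, not something the mapper itself does.

```java
public class SplitAlignedGroupSize {
    /**
     * Number of blocks needed to cover one split, rounded up.
     * Hypothetical helper; not part of the Ignite or Hadoop APIs.
     */
    static int groupSizeFor(long splitSizeBytes, long blockSizeBytes) {
        if (splitSizeBytes <= 0 || blockSizeBytes <= 0)
            throw new IllegalArgumentException("Sizes must be positive.");

        // Ceiling division: a partially filled last block still counts as a block.
        long blocks = (splitSizeBytes + blockSizeBytes - 1) / blockSizeBytes;

        return Math.toIntExact(blocks);
    }

    public static void main(String[] args) {
        long splitSize = 64L * 1024 * 1024; // 64 MB Hadoop split size
        long blockSize = 64L * 1024;        // 64 KB IGFS block size

        System.out.println("group size = " + groupSizeFor(splitSize, blockSize));
    }
}
```

The resulting value is what would be passed to setGroupSize(int) or to the groupSize property of the mapper.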
The IGFS data cache must be configured with this mapper. Here is an example of how it can be specified in XML configuration:
<bean id="cacheCfgBase" class="org.apache.ignite.cache.CacheConfiguration" abstract="true">
    ...
    <property name="affinityMapper">
        <bean class="org.apache.ignite.igfs.IgfsGroupDataBlocksKeyMapper">
            <!-- How many sequential blocks will be stored on the same node. -->
            <property name="groupSize" value="512"/>
        </bean>
    </property>
    ...
</bean>
Modifier and Type | Field and Description |
---|---|
static int | DFLT_GRP_SIZE: Default group size. |
Constructor and Description |
---|
IgfsGroupDataBlocksKeyMapper(): Default constructor. |
IgfsGroupDataBlocksKeyMapper(int grpSize): Constructs an affinity mapper that groups several data blocks under the same key. |
Modifier and Type | Method and Description |
---|---|
Object | affinityKey(Object key): If the key class has the AffinityKeyMapped annotation, the value of the annotated method or field is used as the affinity value instead of the key itself. |
int | getGroupSize(): Gets the group size. |
IgfsGroupDataBlocksKeyMapper | setGroupSize(int grpSize): Sets the group size. |
String | toString() |
public static final int DFLT_GRP_SIZE
public IgfsGroupDataBlocksKeyMapper()
public IgfsGroupDataBlocksKeyMapper(int grpSize)
Parameters:
grpSize - Size of the group in blocks.
public Object affinityKey(Object key)
If the key class has the AffinityKeyMapped annotation, the value of the annotated method or field is used as the affinity value instead of the key itself. If there is no annotation, the key is returned as is.
Specified by:
affinityKey in interface AffinityKeyMapper
Overrides:
affinityKey in class org.apache.ignite.internal.processors.cache.GridCacheDefaultAffinityKeyMapper
Parameters:
key - Key to get affinity key for.
public int getGroupSize()
Group size defines how many sequential file blocks will reside on the same node. This parameter must correlate with the Hadoop split size parameter, defined in Hadoop via the mapred.max.split.size property. Ideally, all blocks accessed within one split should map to one group, so that they are located on the same grid node. For example, the default Hadoop split size is 64 MB and the default IGFS block size is 64 KB. This means that to make sure each split only goes through blocks on the same node (without hopping between nodes over the network), the group size has to be equal to 64 MB / 64 KB = 1024.
Defaults to DFLT_GRP_SIZE.
public IgfsGroupDataBlocksKeyMapper setGroupSize(int grpSize)
Sets the group size. See getGroupSize() for more information.
Parameters:
grpSize - Group size.
Returns:
this for chaining.
Ignite Database and Caching Platform : ver. 2.7.2 Release Date : February 6 2019