Memory and JVM Tuning
This article provides best practices for memory tuning that are relevant for deployments with and without native persistence or external storage. Even though GridGain stores data and indexes off the Java heap, the Java heap is still used for objects generated by queries and operations executed by your applications. Thus, certain JVM and garbage collection (GC) optimizations should be considered.
Tune Swappiness Setting
An operating system starts swapping pages from RAM to disk when overall RAM usage hits a certain threshold. Swapping can impact GridGain cluster performance. You can adjust the operating system's settings to prevent this from happening. On Unix, the best option is to either decrease the vm.swappiness parameter to 10, or set it to 0 if native persistence is enabled:
sysctl -w vm.swappiness=0
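Note that sysctl -w changes the value only until the next reboot. To make the setting permanent, you can also add it to /etc/sysctl.conf (or a file under /etc/sysctl.d/) and reload the configuration, for example:
echo "vm.swappiness=0" >> /etc/sysctl.conf
sysctl -p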
An unsuitable value of this setting can prolong GC pauses as well. For instance, if your GC logs show low user time, high system time, long GC pause records, the cause might be Java heap pages being swapped in and out. To address this, use the swappiness settings above.
Share RAM with OS and Apps
An individual machine’s RAM is shared among the operating system, GridGain, and other applications. As a general recommendation, if a GridGain cluster is deployed in pure in-memory mode (native persistence is disabled), then you should not allocate more than 90% of RAM capacity to GridGain nodes.
On the other hand, if native persistence is used, then the OS requires extra RAM for its page cache in order to optimally sync up data to disk. If the page cache is not disabled, then you should not give more than 70% of the server’s RAM to GridGain.
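As a rough illustration, on a server with 64 GB of RAM these guidelines translate to about 57 GB for GridGain nodes in pure in-memory mode, or about 45 GB (Java heap plus off-heap data regions) when native persistence is enabled, leaving the remainder for the OS page cache and other applications.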
Refer to memory configuration for configuration examples.
In addition, because native persistence can cause high page cache utilization, the kswapd daemon, which reclaims pages for the page cache in the background, might not keep up. As a result, applications can fall back to direct page reclamation, which causes high latencies and long GC pauses. To work around the effects of page memory reclamation on Linux, add extra bytes between wmark_min and wmark_low with /proc/sys/vm/extra_free_kbytes:
sysctl -w vm.extra_free_kbytes=1240000
Refer to this resource for more insight into the relationship between page cache settings, high latencies, and long GC pauses.
Java Heap and GC Tuning
Even though GridGain and Ignite keep data in their own off-heap memory regions invisible to Java garbage collectors, the Java heap is still used for objects generated by your applications' workloads. For instance, whenever you run SQL queries against a GridGain cluster, the queries access data and indexes stored in off-heap memory, while the result sets of such queries are kept in the Java heap until your application reads them. Thus, depending on the throughput and the type of operations, the Java heap can still be utilized heavily, and this might require JVM and GC tuning for your workloads.
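As an illustration, here is a minimal sketch in Java (the cache name, table, and query are hypothetical) of where the heap comes into play: the query scans off-heap data and indexes, while the result set materialized by getAll() is an ordinary on-heap object that stays on the heap until your application is done with it.
import java.util.List;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.query.SqlFieldsQuery;

public class HeapUsageSketch {
    public static void main(String[] args) {
        // Start a node with the default configuration (for illustration only).
        try (Ignite ignite = Ignition.start()) {
            // The "Person" cache and its SQL schema are assumed to exist.
            // The query reads data and indexes from off-heap memory, but getAll()
            // materializes the entire result set as regular on-heap Java objects.
            List<List<?>> rows = ignite.cache("Person")
                    .query(new SqlFieldsQuery("SELECT name, age FROM Person WHERE age > ?").setArgs(30))
                    .getAll();

            // The rows list occupies Java heap until it becomes unreachable, so large
            // result sets translate directly into heap pressure and GC work.
            System.out.println("Fetched " + rows.size() + " rows into the Java heap");
        }
    }
}
For large result sets, consider iterating over the query cursor instead of calling getAll(), so that only a portion of the result set is held on the heap at any given time.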
We’ve included some common recommendations and best practices below. Feel free to start with them and make further adjustments as necessary, depending on the specifics of your applications.
Generic GC Settings
Below are example JVM configurations for applications that can utilize the Java heap on server nodes heavily, triggering either long or frequent short stop-the-world GC pauses.
For JDK 1.8+ deployments, use the G1 garbage collector. The settings below are a good starting point if a 10 GB heap is more than enough for your server nodes:
-server
-Xms10g
-Xmx10g
-XX:+AlwaysPreTouch
-XX:+UseG1GC
-XX:+ScavengeBeforeFullGC
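If you first need to confirm that GC pauses are the problem, it can also help to enable GC logging. The flags below use the JDK 8 names and a placeholder log path; on JDK 9 and later, the unified -Xlog:gc* option replaces them:
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/path/to/gc.log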
If G1 does not work for you, consider using the CMS collector and starting with the following settings. Note that a 10 GB heap is used as an example, and a smaller heap can be enough for your use case:
-server
-Xms10g
-Xmx10g
-XX:+AlwaysPreTouch
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSClassUnloadingEnabled
-XX:+CMSPermGenSweepingEnabled
-XX:+ScavengeBeforeFullGC
-XX:+CMSScavengeBeforeRemark
Advanced Memory Tuning
In Linux and Unix environments, an application can face long GC pauses or reduced performance because of I/O or memory starvation caused by kernel-specific settings. This section provides guidelines on how to modify kernel settings to overcome long GC pauses.
If GC logs show low user time, high system time, long GC pause records, then memory constraints are most likely triggering swapping or scanning of free memory space. In that case:
- Check and adjust the swappiness settings.
- Add -XX:+AlwaysPreTouch to the JVM settings on startup.
- Disable the NUMA zone-reclaim optimization:
  sysctl -w vm.zone_reclaim_mode=0
- Turn off Transparent Huge Pages if a RedHat distribution is used (a verification command follows this list):
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
  echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
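To confirm that Transparent Huge Pages are disabled, you can read the setting back; note that on non-RedHat kernels the same files live under /sys/kernel/mm/transparent_hugepage instead:
cat /sys/kernel/mm/redhat_transparent_hugepage/enabled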
Advanced I/O Tuning
If GC logs show low user time, low system time, long GC pause records, then GC threads might be spending too much time in kernel space, blocked by various I/O activities. For instance, this can be caused by journal commits, gzip, or log roll-over procedures.
As a solution, you can try changing the page flushing interval from the default 30 seconds to 5 seconds:
sysctl -w vm.dirty_writeback_centisecs=500
sysctl -w vm.dirty_expire_centisecs=500
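If you want to record the current values before changing them, both parameters can be read back with sysctl:
sysctl vm.dirty_writeback_centisecs
sysctl vm.dirty_expire_centisecs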
NUMA-Aware Memory Allocation
In high-performance environments, the time it takes for a processor to communicate with memory can have negative effects on performance. To further improve performance, you can enable NUMA-aware memory allocation in GridGain. When the option is enabled, the processor accesses the closest NUMA node, which increases performance.
Using NUMA-aware allocation leads to load being better spread across the threads.
Requirements
- You need to make sure libnuma 2.0.x is installed on your Linux machine. It is also recommended to install the numactl utility:
  sudo apt install numactl
  sudo yum install numactl
- Enable NUMA-aware allocation by setting the -XX:+UseNUMA JVM property.
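Once numactl is installed, you can inspect the machine's NUMA topology before choosing an allocation strategy:
numactl --hardware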
Using NUMA Allocation
Simple allocation strategy
The simple node allocation strategy is best used to attach the data region to a specific NUMA node.
- Allocation with default NUMA policy on all NUMA nodes:
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="name" value="Default_Region"/>
                ....
                <property name="memoryAllocator">
                    <bean class="org.apache.ignite.mem.NumaAllocator">
                        <constructor-arg>
                            <bean class="org.apache.ignite.mem.SimpleNumaAllocationStrategy"/>
                        </constructor-arg>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</property>
- Allocation on a specific NUMA node:
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="name" value="Default_Region"/>
                ....
                <property name="memoryAllocator">
                    <bean class="org.apache.ignite.mem.NumaAllocator">
                        <constructor-arg>
                            <bean class="org.apache.ignite.mem.SimpleNumaAllocationStrategy">
                                <constructor-arg name="node" value="0"/>
                            </bean>
                        </constructor-arg>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</property>
Interleaved allocation strategy
Interleaved allocation is best used to attach the data region to multiple NUMA nodes.
- Interleaved allocation on all NUMA nodes:
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="name" value="Default_Region"/>
                ....
                <property name="memoryAllocator">
                    <bean class="org.apache.ignite.mem.NumaAllocator">
                        <constructor-arg>
                            <bean class="org.apache.ignite.mem.InterleavedNumaAllocationStrategy"/>
                        </constructor-arg>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</property>
- Interleaved allocation on specified NUMA nodes:
<property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
        <property name="defaultDataRegionConfiguration">
            <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                <property name="name" value="Default_Region"/>
                ....
                <property name="memoryAllocator">
                    <bean class="org.apache.ignite.mem.NumaAllocator">
                        <constructor-arg>
                            <bean class="org.apache.ignite.mem.InterleavedNumaAllocationStrategy">
                                <constructor-arg name="nodes">
                                    <array>
                                        <value>0</value>
                                        <value>1</value>
                                    </array>
                                </constructor-arg>
                            </bean>
                        </constructor-arg>
                    </bean>
                </property>
            </bean>
        </property>
    </bean>
</property>
Local node allocation strategy
Allocates memory on the NUMA node local to the process; uses void* numa_alloc_local(size_t) under the hood.
<property name="dataStorageConfiguration">
<bean class="org.apache.ignite.configuration.DataStorageConfiguration">
<property name="defaultDataRegionConfiguration">
<bean class="org.apache.ignite.configuration.DataRegionConfiguration">
<property name="name" value="Default_Region"/>
....
<property name="memoryAllocator">
<bean class="org.apache.ignite.mem.NumaAllocator">
<constructor-arg>
<constructor-arg>
<bean class="org.apache.ignite.mem.LocalNumaAllocationStrategy"/>
</constructor-arg>
</constructor-arg>
</bean>
</property>
</bean>
</property>
</bean>
</property>