Tuning Apache® Ignite™ for Optimal Performance Webinar Recap

December 19, 2016

This webinar gave an overview of the performance problems and bottlenecks that can occur when using Apache® Ignite™, along with techniques for tuning it. It covered basic cache operations, data loading, affinity collocation, SQL query tuning, and JVM tuning. For additional information, including code examples, please see the complete recording of the webinar.

Apache Ignite is an in-memory computing platform. Its core component is an in-memory data grid: a distributed in-memory key-value store with built-in high availability, rich SQL support compliant with ANSI SQL-99, and strong consistency guarantees. Apache Ignite delivers real-time performance and massive scalability, but getting optimal performance out of an application built on it still takes care.

Tuning Apache Ignite Overview

If you encounter a performance issue, the most important first step is to isolate it and identify which component of your system is the bottleneck. Keep in mind that the bottleneck is likely to be caused by incorrect or inefficient code. Work through the layers in order: the application layer first, then the JVM layer, then the system layer, and finally the hardware layer. Always troubleshoot from top to bottom, and rule out each layer before moving to the next. Problems are most commonly found at the application or JVM layers, although the system and hardware layers become very important in high-load systems.

[Diagram: Tuning Apache® Ignite™]

Basic Cache Operations in Apache Ignite

The data grid is the most widely used component of Apache Ignite, and its performance is critical, so let’s start here. The basic cache operations are get, put, and remove. Their performance is measured in latency and throughput: you’ll want to minimize latency and maximize throughput. Apache Ignite can run each cache operation separately or in batches. Batched operations typically deliver higher throughput but also higher latency, so measuring performance and adjusting your code is critical to finding the best balance. If you’re loading the cache from a data stream, you also need to assess network performance.
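The trade-off between individual and batched operations can be illustrated with a minimal sketch. It does not use the Ignite API: `RoundTripCountingCache` is a hypothetical stand-in that only counts simulated network round trips, and the batch size of 100 is an arbitrary assumption. In Ignite itself, the batched counterparts of `put`, `get`, and `remove` on `IgniteCache` are `putAll`, `getAll`, and `removeAll`.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a remote cache; it only counts simulated network round trips. */
final class RoundTripCountingCache {
    private final Map<Integer, String> store = new HashMap<>();
    private int roundTrips;

    /** One key per request: each entry costs its own round trip. */
    void put(int key, String value) {
        roundTrips++;
        store.put(key, value);
    }

    /** Many keys per request: one round trip for the whole batch. */
    void putAll(Map<Integer, String> batch) {
        roundTrips++;
        store.putAll(batch);
    }

    int roundTrips() { return roundTrips; }

    int size() { return store.size(); }
}

final class BatchingSketch {
    /** Writes keys 0..total-1 one at a time and returns the round trips used. */
    static int writeIndividually(RoundTripCountingCache cache, int total) {
        for (int key = 0; key < total; key++) {
            cache.put(key, "value-" + key);
        }
        return cache.roundTrips();
    }

    /** Writes keys 0..total-1 in batches of batchSize and returns the round trips used. */
    static int writeBatched(RoundTripCountingCache cache, int total, int batchSize) {
        Map<Integer, String> batch = new LinkedHashMap<>();
        for (int key = 0; key < total; key++) {
            batch.put(key, "value-" + key);
            // An entry waits here until its batch fills; that wait is the added latency.
            if (batch.size() == batchSize) {
                cache.putAll(batch);
                batch.clear();
            }
        }
        if (!batch.isEmpty()) {
            cache.putAll(batch); // flush the final partial batch
        }
        return cache.roundTrips();
    }

    public static void main(String[] args) {
        int individual = writeIndividually(new RoundTripCountingCache(), 1_000);
        int batched = writeBatched(new RoundTripCountingCache(), 1_000, 100);
        System.out.println("individual round trips: " + individual);
        System.out.println("batched round trips:    " + batched);
    }
}
```

Fewer round trips is where the throughput gain comes from, while the time an entry spends waiting for its batch to fill is the latency cost. The right batch size depends on the workload, so it is worth measuring rather than guessing.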

Affinity Collocation in Apache Ignite

Affinity collocation is a very important topic in Ignite performance, and proper configuration can improve performance by orders of magnitude. When you create your data model for the data grid and define your application architecture, it’s critical to focus on affinity: the mapping between the cache keys stored in the data grid and the nodes where the keys are stored. The affinity function itself is a stateless hash function that Ignite uses to determine on which node a given key resides. Using the affinity key properly can provide tremendous performance improvements. For example, you can use the affinity key to run SQL queries or computations on the same node where the data is stored, which speeds up these operations because the data doesn’t have to be sent across the network. Minimizing network traffic between nodes is one of the best ways to improve performance in Apache Ignite.
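To make the idea concrete, here is a minimal sketch of a stateless key-to-node mapping and of how an affinity key collocates related entries. It is an illustration only, not Ignite's actual affinity function: `AffinitySketch`, `OrderKey`, the node names, the partition count of 1024, and the round-robin partition assignment are all assumptions made for the example. In Ignite, the affinity field of a key class is typically marked with the `@AffinityKeyMapped` annotation.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative affinity mapping: key -> partition -> node. Not Ignite's actual algorithm. */
final class AffinitySketch {
    private final int partitions;
    private final List<String> nodes;

    AffinitySketch(int partitions, List<String> nodes) {
        this.partitions = partitions;
        this.nodes = nodes;
    }

    /** Stateless: the result depends only on the affinity key and the partition count. */
    int partitionFor(Object affinityKey) {
        return Math.floorMod(affinityKey.hashCode(), partitions);
    }

    /** Simplified partition-to-node assignment (round robin over the node list). */
    String nodeFor(Object affinityKey) {
        return nodes.get(partitionFor(affinityKey) % nodes.size());
    }
}

/** Hypothetical cache key for an order; customerId acts as the affinity key. */
final class OrderKey {
    final long orderId;
    final long customerId;

    OrderKey(long orderId, long customerId) {
        this.orderId = orderId;
        this.customerId = customerId;
    }

    /** Only the customer ID takes part in node selection, so orders follow their customer. */
    Object affinityKey() {
        return customerId;
    }
}

final class AffinityDemo {
    public static void main(String[] args) {
        AffinitySketch affinity =
                new AffinitySketch(1024, Arrays.asList("node-A", "node-B", "node-C"));
        long customerId = 42L;
        OrderKey first = new OrderKey(1001L, customerId);
        OrderKey second = new OrderKey(1002L, customerId);

        // All three lines name the same node, because all three use the same affinity key.
        System.out.println("customer 42 -> " + affinity.nodeFor(customerId));
        System.out.println("order 1001  -> " + affinity.nodeFor(first.affinityKey()));
        System.out.println("order 1002  -> " + affinity.nodeFor(second.affinityKey()));
    }
}
```

Because every order of a customer maps to the node that holds the customer record, a join or computation over that customer's data can run locally on one node.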

Tuning SQL Queries in Apache Ignite

There are many ways to improve SQL query performance in Ignite, but the cornerstone is indexing. Whenever a query filters on a field, you’ll want to create an index on that field in order to avoid a full scan of the table. Ignite has full indexing support, and query optimization in Ignite is similar to query optimization in traditional relational databases: the main goal is to find a combination of indexes that yields the best performance, while remembering that too many indexes waste memory. Query optimization starts with running the EXPLAIN statement to get an execution plan, which tells you how the query is executed, the order of joins, and the indexes used. This information is critical when working to improve query speed. Other tools include the GridGain WebConsole and the H2 Debug Console.
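The cost difference between a full scan and an indexed lookup can be sketched in plain Java. This is a conceptual illustration rather than Ignite code: the `Person` row type, the `city` field, and the hash-map index are assumptions for the example, and Ignite's SQL indexes are structured differently. In Ignite, a field is commonly declared as indexed with `@QuerySqlField(index = true)`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical table row. */
final class Person {
    final int id;
    final String city;

    Person(int id, String city) {
        this.id = id;
        this.city = city;
    }
}

final class IndexSketch {
    /** Number of rows examined by the most recent lookup. */
    static int rowsExamined;

    /** Full scan: every row is examined, whatever the filter value. */
    static List<Person> scanByCity(List<Person> table, String city) {
        rowsExamined = 0;
        List<Person> result = new ArrayList<>();
        for (Person p : table) {
            rowsExamined++;
            if (p.city.equals(city)) {
                result.add(p);
            }
        }
        return result;
    }

    /** Builds a secondary index on city; it costs extra memory that grows with the table. */
    static Map<String, List<Person>> buildCityIndex(List<Person> table) {
        Map<String, List<Person>> index = new HashMap<>();
        for (Person p : table) {
            index.computeIfAbsent(p.city, c -> new ArrayList<>()).add(p);
        }
        return index;
    }

    /** Indexed lookup: only the matching rows are touched. */
    static List<Person> lookupByCity(Map<String, List<Person>> index, String city) {
        List<Person> result = index.getOrDefault(city, Collections.emptyList());
        rowsExamined = result.size();
        return result;
    }

    public static void main(String[] args) {
        List<Person> table = new ArrayList<>();
        String[] cities = {"Austin", "Berlin", "Cairo", "Delhi"};
        for (int id = 0; id < 10_000; id++) {
            table.add(new Person(id, cities[id % cities.length]));
        }
        scanByCity(table, "Berlin");
        System.out.println("full scan examined " + rowsExamined + " rows");
        lookupByCity(buildCityIndex(table), "Berlin");
        System.out.println("index lookup examined " + rowsExamined + " rows");
    }
}
```

The index answers the query by touching only the matching rows, but it is a second data structure that must be held in memory and kept up to date, which is why each additional index has a cost.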

JVM Tuning for Apache Ignite

Most of the issues encountered at the JVM level are related to memory and memory management in Java. The most important thing to know about JVM tuning is to avoid large heap sizes. If the allocated heap is very large (more than 16 GB), a full garbage collection is going to take a long time. Ignite provides an option to store data in off-heap memory and manage it directly, which keeps that data out of the garbage collector’s way and improves performance. The next piece of advice is to always use the G1 or CMS collector. Also, make sure to enable garbage collection logging for troubleshooting.
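Garbage collection logging is normally switched on with JVM options (for example `-Xloggc:<file>` together with `-XX:+PrintGCDetails` on Java 8, or `-Xlog:gc*` on Java 9 and later). As a complement, the sketch below reads the same kind of information from inside the process using the standard `java.lang.management` API. The class name `GcStatsSketch` and the output format are assumptions made for illustration. The sketch is not part of Ignite.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcStatsSketch {
    /** Returns one line per collector: name, collection count, accumulated collection time. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            lines.add(String.format("%s: collections=%d, totalTimeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    /** Maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMb());
        snapshot().forEach(System.out::println);
    }
}
```

Calling `snapshot()` periodically and comparing successive values gives a rough view of how often collections run and how much time they take, which helps confirm whether long pauses line up with an application slowdown.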

If you’d like to see a code example that brings together all of these techniques, make sure to view the recording of this webinar.