Persistent Storage

Overview

GridGain Persistence is designed to provide fast, responsive persistent storage. When persistent storage is in use, GridGain stores all data on disk and loads as much of it as fits into RAM for processing.

When persistence is enabled, GridGain stores each partition in a separate file on disk. In addition to data partitions, GridGain stores indexes and metadata.

Profile Configuration

Each GridGain storage engine can have several storage profiles. Each profile has the following properties:

| Property | Default | Description |
|---|---|---|
| engine | | The name of the storage engine. |
| size | 256 * 1024 * 1024 | Sets the space allocated to the storage profile, in bytes. |
| replacementMode | CLOCK | Sets the page replacement algorithm. |
| pageSize | 16384 | The size of pages in the storage, in bytes. |
| memoryAllocator.type | unsafe | Memory allocator configuration. Uses sun.misc.Unsafe to improve performance. Currently, no other options are available. |
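As a quick sanity check on the default values above, the following sketch works out how many pages fit into a profile of the default size. The helper name `pages_in_profile` is hypothetical and not part of any GridGain API. It also assumes the whole allocation is divided into pages with no overhead, which this page does not state.

```python
# Defaults taken from the profile properties listed above.
DEFAULT_SIZE_BYTES = 256 * 1024 * 1024   # size
DEFAULT_PAGE_SIZE_BYTES = 16384          # pageSize


def pages_in_profile(size_bytes: int = DEFAULT_SIZE_BYTES,
                     page_size_bytes: int = DEFAULT_PAGE_SIZE_BYTES) -> int:
    """Hypothetical helper: number of whole pages that fit into a profile.

    Assumption: the allocation is split evenly into pages, with no
    bookkeeping overhead.
    """
    if page_size_bytes <= 0:
        raise ValueError("page size must be positive")
    return size_bytes // page_size_bytes


if __name__ == "__main__":
    # Number of pages with the default size and pageSize.
    print(pages_in_profile())
```

Under these assumptions, the default profile holds 16384 pages of 16384 bytes each.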

Checkpointing

Checkpointing is the process of copying dirty pages from RAM to partition files on disk. A dirty page is a page that has been updated in RAM but has not yet been written to the respective partition file.

Once a checkpoint completes, all changes made before it are persisted to disk and remain available if the node crashes and is restarted.

Checkpointing is designed to ensure durability of data and recovery in case of a node failure.

Checkpointing also helps use disk space frugally by keeping pages on disk in their most up-to-date state.
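The idea of dirty pages and checkpoints can be illustrated with a toy model. This is only a sketch of the concept described above, not GridGain's implementation: the class `ToyPageStore` and its methods are hypothetical, pages are plain dictionary entries, and the model covers partition files only, ignoring any other recovery mechanism a real node may use.

```python
class ToyPageStore:
    """Toy model of checkpointing; not GridGain's actual implementation."""

    def __init__(self):
        self.ram = {}       # page id -> latest contents held in memory
        self.disk = {}      # page id -> contents in the partition file
        self.dirty = set()  # pages updated in RAM but not yet on disk

    def write(self, page_id, data):
        """Update a page in RAM and mark it dirty."""
        self.ram[page_id] = data
        self.dirty.add(page_id)

    def checkpoint(self):
        """Copy every dirty page to disk; return how many pages were written."""
        written = len(self.dirty)
        for page_id in self.dirty:
            self.disk[page_id] = self.ram[page_id]
        self.dirty.clear()
        return written

    def crash_and_restart(self):
        """Drop RAM contents; in this model only checkpointed pages survive."""
        self.ram = dict(self.disk)
        self.dirty.clear()


if __name__ == "__main__":
    store = ToyPageStore()
    store.write(1, "a")
    store.checkpoint()         # page 1 is now on disk
    store.write(1, "b")        # page 1 is dirty again
    store.write(2, "c")        # page 2 is dirty and was never checkpointed
    store.crash_and_restart()
    print(store.ram)           # only the checkpointed state remains
```

In this model, only the state captured by the last checkpoint is present after the restart, which is why checkpoint timing matters for durability.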

Configuration

The table below describes the checkpoint configuration properties:

| Property | Default | Description |
|---|---|---|
| checkpoint.checkpointDelayMillis | 200 | Delay before starting a checkpoint after receiving the command, in milliseconds. |
| checkpoint.checkpointThreads | 4 | Number of CPU threads dedicated to checkpointing. |
| checkpoint.compactionThreads | 4 | Number of CPU threads dedicated to data compaction. |
| checkpoint.frequency | 180000 | Checkpoint frequency in milliseconds. |
| checkpoint.frequencyDeviation | 40 | Allowed deviation in checkpoint frequency, in milliseconds. |
| checkpoint.logReadLockThresholdTimeout | 0 | Threshold for logging long read locks, in milliseconds. |
| checkpoint.readLockTimeout | 10000 | Timeout for checkpoint read lock acquisition, in milliseconds. |
| checkpoint.useAsyncFileIoFactory | true | Whether GridGain uses the asynchronous file I/O operations provider. |
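When only a few checkpoint properties are overridden, the remaining ones keep the defaults listed above. The sketch below shows that merge for a set of overrides. The helper `effective_checkpoint_config` is hypothetical and only mirrors the listed defaults; it does not read or validate a real GridGain configuration.

```python
# Defaults copied from the checkpoint configuration properties above
# (keys are shown without the "checkpoint." prefix).
CHECKPOINT_DEFAULTS = {
    "checkpointDelayMillis": 200,
    "checkpointThreads": 4,
    "compactionThreads": 4,
    "frequency": 180000,
    "frequencyDeviation": 40,
    "logReadLockThresholdTimeout": 0,
    "readLockTimeout": 10000,
    "useAsyncFileIoFactory": True,
}


def effective_checkpoint_config(overrides):
    """Hypothetical helper: listed defaults with explicit overrides applied."""
    unknown = set(overrides) - set(CHECKPOINT_DEFAULTS)
    if unknown:
        # Reject misspelled names instead of silently ignoring them.
        raise KeyError(f"unknown checkpoint properties: {sorted(unknown)}")
    return {**CHECKPOINT_DEFAULTS, **overrides}


if __name__ == "__main__":
    config = effective_checkpoint_config({"checkpointDelayMillis": 100})
    for name, value in config.items():
        print(f"checkpoint.{name} = {value}")
```

Rejecting unknown names is a design choice of this sketch: it makes a misspelled property an error rather than a silently ignored setting.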

Configuration Example

The example below shows a sample GridGain cluster configuration with persistence and checkpoints:

storages:
  engines:
    aipersist:
      checkpoint:
        checkpointDelayMillis: 100
  profiles:
    clock_aipersist:
      engine: aipersist
      replacementMode: CLOCK

You can then use the profile (in this case, clock_aipersist) in your distribution zone configuration.