Data Snapshots and Recovery

GridGain can create snapshots of the data stored across the cluster. You can later use a snapshot to recover the cluster to the state recorded in it.

Creating Full Snapshots

To create a full snapshot, use the cluster snapshot create CLI command. In the command, you can specify the list of fully qualified table names to create snapshots of, or specify the --all option to create a snapshot of all tables. For example:

cluster snapshot create --type=full --tables=PERSON --destination=relative-path-example

The command above creates a full snapshot of the PERSON table at the specified destination path (see Defining Snapshot Paths).
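
If you generate snapshot commands from a script, the options described above can be assembled programmatically. The following is a minimal sketch, not part of GridGain: build_snapshot_create_command is a hypothetical helper that only produces the command text, using the --type, --tables, --all, and --destination options shown in this section. Joining several table names with commas is an assumption; this page does not show the separator.

```python
def build_snapshot_create_command(snapshot_type, destination, tables=None, all_tables=False):
    """Assemble a `cluster snapshot create` command line (hypothetical helper)."""
    if snapshot_type not in ("full", "incremental"):
        raise ValueError("snapshot_type must be 'full' or 'incremental'")
    # Either name the tables explicitly or ask for all of them, never both.
    if bool(tables) == bool(all_tables):
        raise ValueError("specify exactly one of tables or all_tables")

    parts = ["cluster", "snapshot", "create", f"--type={snapshot_type}"]
    if all_tables:
        parts.append("--all")
    else:
        # Assumption: multiple table names are comma-separated.
        parts.append("--tables=" + ",".join(tables))
    parts.append(f"--destination={destination}")
    return " ".join(parts)


# Reproduces the example command from this section.
print(build_snapshot_create_command("full", "relative-path-example", tables=["PERSON"]))
```

The helper returns a string only; running the command is still done through the CLI.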

Creating Incremental Snapshots

An incremental snapshot contains only the changes made since the last full or incremental snapshot. The first snapshot in a chain must be a full snapshot, but all subsequent ones can be incremental. When you create an incremental snapshot, the latest valid snapshot is found for the tables you have specified, and the new snapshot is based on it.

Here is how you can create an incremental snapshot based on the full snapshot created above:

cluster snapshot create --type=incremental --tables=PERSON --destination=relative-path-example

You cannot add tables when creating an incremental snapshot. It must contain the same tables as the base snapshot it builds on.
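
The table rule can be checked before an incremental snapshot is requested. The sketch below is illustrative only: incremental_table_mismatch and can_create_incremental are hypothetical helpers, the table lists are assumed to be tracked by your own tooling, and CITY is a made-up table name. Names are compared exactly as given.

```python
def incremental_table_mismatch(base_tables, requested_tables):
    """Report tables that differ between the base snapshot and the new request."""
    base, requested = set(base_tables), set(requested_tables)
    return {
        "added": sorted(requested - base),    # not in the base snapshot
        "missing": sorted(base - requested),  # in the base snapshot but not requested
    }


def can_create_incremental(base_tables, requested_tables):
    """True when the requested tables are exactly the tables of the base snapshot."""
    mismatch = incremental_table_mismatch(base_tables, requested_tables)
    return not mismatch["added"] and not mismatch["missing"]


print(can_create_incremental(["PERSON"], ["PERSON"]))              # True
print(incremental_table_mismatch(["PERSON"], ["PERSON", "CITY"]))  # CITY was added
```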

Creating Snapshots in the Past

You can also create a snapshot of the cluster state at a specific point in the past, for example:

cluster snapshot create --type=full --timestamp=2024-09-10T10:53:00+01:00 --all --destination=relative-path-example

The timestamp must be specified in ISO 8601 format, as in the example above.
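
A value in the same shape as the example can be produced with Python's standard datetime module. This is a small sketch under one assumption: because the example includes a UTC offset, the hypothetical snapshot_timestamp helper insists on one. Whether timestamps without an offset are accepted is not stated on this page.

```python
from datetime import datetime, timedelta, timezone


def snapshot_timestamp(moment):
    """Format a timezone-aware datetime as an ISO 8601 string with a UTC offset."""
    if moment.tzinfo is None:
        # Assumption: always send an explicit offset, as in the documented example.
        raise ValueError("moment must be timezone-aware")
    # Drop microseconds so the result matches the shape of the example value.
    return moment.replace(microsecond=0).isoformat()


moment = datetime(2024, 9, 10, 10, 53, tzinfo=timezone(timedelta(hours=1)))
print(snapshot_timestamp(moment))  # 2024-09-10T10:53:00+01:00
```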

Restoring Snapshots

To restore snapshots, you can use the cluster snapshot restore command.

To make sure your snapshot is restored correctly, follow these guidelines:

  • Make sure that the cluster topology is the same as the one the snapshot was taken on.

  • Stop traffic to the cluster during restoration to avoid possible inconsistencies and failed operations.

When you are prepared to restore data to the cluster, run the restore command. For example:

cluster snapshot restore --id=8eb10b48-6885-4922-a1af-c28d8473ba28 --source=relative-path-example

The command above restores all tables in the snapshot with the specified ID, from the specified source path (see Defining Snapshot Paths for the latter). You can also choose to only restore specific tables stored in the snapshot, instead of all of them. In this case, specify the fully qualified table names of the tables to restore, for example:

cluster snapshot restore --id=8eb10b48-6885-4922-a1af-c28d8473ba28 --tables=PERSON
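
The two restore variants above differ only in their optional arguments. As a rough sketch, a hypothetical build_snapshot_restore_command helper could assemble either form. It uses only the --id, --source, and --tables options shown in this section, and the comma separator for several table names is an assumption.

```python
def build_snapshot_restore_command(snapshot_id, source=None, tables=None):
    """Assemble a `cluster snapshot restore` command line (hypothetical helper)."""
    parts = ["cluster", "snapshot", "restore", f"--id={snapshot_id}"]
    if source:
        parts.append(f"--source={source}")
    if tables:
        # Assumption: multiple table names are comma-separated.
        parts.append("--tables=" + ",".join(tables))
    return " ".join(parts)


snapshot_id = "8eb10b48-6885-4922-a1af-c28d8473ba28"
# Restore every table in the snapshot from a given source path.
print(build_snapshot_restore_command(snapshot_id, source="relative-path-example"))
# Restore only the PERSON table.
print(build_snapshot_restore_command(snapshot_id, tables=["PERSON"]))
```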

Defining Snapshot Paths

You define snapshot paths in the cluster configuration.

You then specify which path to use in each command: --destination when creating a snapshot, --source when restoring from a snapshot.

The destinations and sources can be defined as either LOCAL or REMOTE.

  • LOCAL snapshots are not shared between nodes. Every node saves the snapshot metadata and all partition files.

  • REMOTE snapshots save only one copy of snapshot metadata and partition files. The location where this copy is saved must be accessible by all nodes with the use of the same base URI.

With an absolute LOCAL path, a snapshot with ID 1 is created in the /absolute-path/node-1/snapshot-1 directory. With an absolute REMOTE path, the same snapshot is created in the /absolute-path/snapshot-1 directory. With a relative path, whether LOCAL or REMOTE, the same snapshot is created in the {GRIDGAIN_HOME}/relative-path/snapshot-1 directory.
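
The directory rules above can be written down as a small function. This is an illustrative sketch, not GridGain code: snapshot_directory is a hypothetical helper, the per-node directory is assumed to be the node name (as "node-1" in the example suggests), and {GRIDGAIN_HOME} is kept as a literal placeholder.

```python
import posixpath


def snapshot_directory(base_path, uri_type, snapshot_id, node_name=None,
                       gridgain_home="{GRIDGAIN_HOME}"):
    """Return the directory a snapshot would be written to, per the rules above."""
    if uri_type not in ("LOCAL", "REMOTE"):
        raise ValueError("uri_type must be 'LOCAL' or 'REMOTE'")
    snapshot_dir = f"snapshot-{snapshot_id}"

    # Relative paths resolve under GRIDGAIN_HOME for both LOCAL and REMOTE.
    if not posixpath.isabs(base_path):
        return posixpath.join(gridgain_home, base_path, snapshot_dir)

    # Absolute LOCAL paths get a per-node subdirectory (assumed to be the node name).
    if uri_type == "LOCAL":
        if node_name is None:
            raise ValueError("node_name is required for an absolute LOCAL path")
        return posixpath.join(base_path, node_name, snapshot_dir)

    # Absolute REMOTE paths hold a single shared copy.
    return posixpath.join(base_path, snapshot_dir)


print(snapshot_directory("/absolute-path", "LOCAL", 1, node_name="node-1"))
print(snapshot_directory("/absolute-path", "REMOTE", 1))
print(snapshot_directory("relative-path", "REMOTE", 1))
```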

The destination of an incremental snapshot must have the same URI and type as all of its parent snapshots. The path name can differ from that of the parents.

A snapshot restore operation must point to the same URI, with the same type (LOCAL or REMOTE), that was used when the snapshot was created.
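
If your tooling records where each snapshot was written, the URI and type requirement can be verified before an incremental create or a restore is issued. The sketch below assumes such a record exists: SnapshotLocation and same_location are hypothetical names, and the example URI is made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnapshotLocation:
    """Where a snapshot lives: its base URI and its type (LOCAL or REMOTE)."""
    uri: str
    uri_type: str


def same_location(recorded, requested):
    """True when both the URI and the type match the recorded location."""
    return recorded.uri == requested.uri and recorded.uri_type == requested.uri_type


created_at = SnapshotLocation(uri="/absolute-path", uri_type="REMOTE")  # hypothetical record
print(same_location(created_at, SnapshotLocation("/absolute-path", "REMOTE")))  # True
print(same_location(created_at, SnapshotLocation("/absolute-path", "LOCAL")))   # False
```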

Checking Snapshot Status

You can check snapshot status by using the cluster snapshot status command. By default, this command provides information about all snapshots in the cluster.

cluster snapshot status

You can narrow information down by providing the snapshot ID. If you do, you can also use the --all-nodes option to see information about the snapshot on each specific node in the cluster. For example:

cluster snapshot status --id=8eb10b48-6885-4922-a1af-c28d8473ba28 --all-nodes

The command above returns information about all operations with the snapshots per node.

The following information is provided:

| Column | Description |
|---|---|
| Operation ID | The ID of the operation. For create operations, this is the snapshot ID. For delete operations, this is the ID of the delete operation. |
| Start time | Time when the operation was started, in UNIX time. |
| Operation | The operation performed. CREATE for creating snapshots, DELETE for deleting snapshots. |
| Status | Current operation status. Possible values: STARTED, COMPLETED, FAILED. |
| Target Snapshot ID | The snapshot the operation was performed against. |
| Base Snapshot ID | For incremental snapshots, the ID of the snapshot this snapshot is based on. |
| Description | Operation description. |
| Timestamp | Point in time that corresponds to the system state the snapshot reflects. |
| URI | The base URI used by the snapshot operation. |
| URI Type | The path definition type: LOCAL or REMOTE. |
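
The Start time value is reported in UNIX time, which is hard to read at a glance. The sketch below converts such a value to an ISO 8601 string in UTC. This page does not say whether the value is in seconds or milliseconds, so the hypothetical start_time_to_utc helper takes the unit as a parameter, and the sample value is made up.

```python
from datetime import datetime, timezone


def start_time_to_utc(unix_time, unit="s"):
    """Convert a UNIX timestamp to an ISO 8601 string in UTC."""
    if unit == "ms":
        unix_time = unix_time / 1000  # milliseconds to seconds
    elif unit != "s":
        raise ValueError("unit must be 's' or 'ms'")
    return datetime.fromtimestamp(unix_time, tz=timezone.utc).isoformat()


print(start_time_to_utc(1725961980))  # 2024-09-10T09:53:00+00:00
```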

Deleting Snapshots

You can delete a snapshot by using the cluster snapshot delete command.

cluster snapshot delete --id [--url]

Here, --id is the snapshot's ID and --url (optional) is the cluster's URL.

For example:

cluster snapshot delete --id=8eb10b48-6885-4922-a1af-c28d8473ba28 --url=http://localhost:10300