GridGain Developers Hub

Creating Apache Ignite Cluster Backups

Former Head of Developer Relations at GridGain
Apache Ignite Committer and PMC Member

Before you go into production with GridGain, you need to decide on a cluster backup and recovery strategy. The disk media that persists your Ignite records will not last forever. A new version of an application might have a bug that corrupts the data. The data center that runs your Ignite-based solution might fail and become unreachable. These are just a few of the many problems that can corrupt data or make your primary cluster unavailable. You can’t eliminate all such events. However, if you back up Ignite regularly, you can use your backups to restore a cluster that has experienced a data-loss incident.

In this part of the tutorial, you use the Snapshot Management screen to create a cluster backup and, later, use the backup to resolve a data-corruption incident.

Pause the Application

GridGain enables you to create hot cluster snapshots while applications update the cluster records and, later, to use the snapshots and WALs to recover to any point in time. In this tutorial, however, you simulate a data-loss incident by deleting all of a table’s records and then observe that, after the cluster is restored, the number of records in the corrupted table is back to normal. For that comparison to be meaningful, the record count must not change in the meantime, so stop the tutorial application now:

docker-compose -f docker/ignite-streaming-app.yaml stop
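Before you continue, you may want to confirm that the application containers have actually stopped. The following is a minimal sketch that assumes the same compose file as the command above; `app_status` and the `COMPOSE` override are hypothetical names introduced here for illustration and are not part of the tutorial.

```shell
# COMPOSE is a hypothetical override, so that the binary can be swapped
# (for example, for a differently named Compose installation).
COMPOSE="${COMPOSE:-docker-compose}"

# Hypothetical helper: list the state of the services that are defined
# in the tutorial's compose file.
app_status() {
  $COMPOSE -f docker/ignite-streaming-app.yaml ps
}
```

Run `app_status` from the directory where you ran the stop command, and check that no application service is still listed as running.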

Create a Cluster Snapshot

After you pause the application, you navigate to the Snapshot Management screen and create a cluster snapshot:

  1. Click on the ADD SNAPSHOT button:

    Creating Snapshots with Nebula
  2. In the snapshot creation dialog, for Snapshot Type, select Full, and, for Compression Level, select Default compression:

    Creating Full Ignite Snapshot
  3. Notice that the snapshot that you created is added to the SNAPSHOT list:

    Ignite Snapshots List

Corrupt a Cluster Table

After you create a cluster snapshot, you open the SQL screen and simulate a data-loss incident by removing records from the Trade table:

  1. Find the number of trades in your cluster by executing the SELECT count(*) FROM Trade query. Your record count might differ from the number in the following screenshot, because the count depends on how long the application ran in your environment.

    Trades Count
  2. Use the DELETE FROM Trade query to remove all the trades.

  3. Confirm that the table is empty by running the SELECT count(*) FROM Trade query again.

Restore from the Snapshot

Now, you use the snapshot that you created to restore the Trade table:

  1. Open the Snapshot Management screen and restore the cluster from the snapshot:

    Restoring Apache Ignite Cluster from Snapshot
  2. After the restore procedure is complete, return to the SQL screen and execute the SELECT count(*) FROM Trade query. Confirm that all the records were recovered:

    Trades Count
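To make the confirmation explicit, you can compare the count that you noted before the DELETE with the count that the SQL screen reports after the restore. The following is a minimal sketch; `check_restore` is a hypothetical helper, and both numbers are typed in by hand from the SQL screen rather than read from the cluster.

```shell
# Hypothetical helper: compare the trade count noted before the DELETE ($1)
# with the count read from the SQL screen after the restore ($2).
check_restore() {
  if [ "$1" -eq "$2" ]; then
    echo "OK: all $2 records recovered"
  else
    echo "MISMATCH: expected $1 records, found $2"
    return 1
  fi
}
```

For example, if the first query returned 1200 trades and the query after the restore also returns 1200, `check_restore 1200 1200` reports that all the records were recovered. This check is valid only because the application was stopped before the snapshot was taken, so no new trades arrived in between.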

What’s Next

Congratulations! You’ve finished all the steps of the tutorial. Now, you can stop the demo: