Backup and Restore

A Datomic database can be backed up to a file system directory or to an S3 bucket path, and can be restored to any Datomic system. Backup and restore are useful for

  • disaster recovery
  • moving databases from one storage to another

Backups can be performed at any time against a live system, with no locking. Repeated backups to the same backup URI do not copy segments that are unchanged, which can make them significantly faster for databases that have changed little between backups.

Backup to a local file is available in both Free and Pro editions, while backup to S3 is only available in Datomic Pro.

URI syntax

Backup URIs have the following syntax:

Destination      Syntax
directory        file:/full/path/to/backup-directory
S3 (Pro only)    s3://bucket/prefix
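
The two backup URI forms above can be assembled from their parts with a pair of small shell helpers. This is only an illustrative sketch: the function names file_backup_uri and s3_backup_uri are hypothetical and not part of Datomic, and the only check performed is that the directory path is absolute, as the "full path" in the syntax suggests.

```shell
# Hypothetical helpers (not part of Datomic) that print a backup URI.

# file_backup_uri /absolute/path/to/backup-directory
file_backup_uri() {
  case "$1" in
    /*) printf 'file:%s\n' "$1" ;;
    *)  echo "backup directory must be an absolute path: $1" >&2
        return 1 ;;
  esac
}

# s3_backup_uri BUCKET PREFIX   (S3 backup is Pro only)
s3_backup_uri() {
  printf 's3://%s/%s\n' "$1" "$2"
}
```

For example, file_backup_uri /var/backups/my-db prints file:/var/backups/my-db, where the path is a made-up example.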

Database URI syntax is described in the JavaDoc for Peer.connect().

When backing up from DynamoDB (DDB), the required AWS permissions are the same as those a Peer would use, i.e. storage read-only. However, when restoring to DDB, the AWS permissions should be those used by the transactor, i.e. read/write credentials. See AWS Access Control for information on providing the backup/restore processes access to the necessary resources.

IAM role policies

Peers performing backup will need to be able to write to nested keys within the backup S3 bucket. Peers performing restore will need to read from the backup S3 bucket. The following role policy would grant both permissions.

{"Statement":
 [{"Effect":"Allow",
   "Action":["s3:*"],
   "Resource":
   ["arn:aws:s3:::{bucket-name}", "arn:aws:s3:::{bucket-name}/*"]}]}

Additionally, peers performing restore to a DDB table will need to be able to both read from and write to that table:

{"Statement":
 [{"Effect":"Allow",
   "Action":["dynamodb:*"],
   "Resource":"arn:aws:dynamodb:*:{account-id}:table/{table-name}"}]}

Backing Up

You can use the backup-db command to back up a database.

bin/datomic -Xmx4g -Xms4g backup-db from-db-uri to-backup-uri

You should give the backup process as much memory as you give the transactor process for a system.

Backup URIs are per database. You can back up the same database at different points in time to a single backup URI. You cannot back up different databases to a single backup URI.

Backups to S3 can be encrypted with server-side encryption (SSE). By default, backups are not stored encrypted. To enable encryption, pass the --encryption sse flag.

bin/datomic -Xmx4g -Xms4g backup-db --encryption sse from-db-uri to-backup-uri
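
The two backup invocations above can be wrapped in one shell function so that the heap size and the optional encryption flag are set in a single place. This is a minimal sketch rather than an official script: DATOMIC_BIN, BACKUP_HEAP, and backup_db are hypothetical names, and the URIs in the usage comment are made-up examples.

```shell
# Hypothetical wrapper around the backup-db command shown above.
DATOMIC_BIN="${DATOMIC_BIN:-bin/datomic}"   # path to the Datomic launcher
BACKUP_HEAP="${BACKUP_HEAP:-4g}"            # match the transactor's heap size

# backup_db FROM_DB_URI TO_BACKUP_URI [sse]
backup_db() {
  if [ "$#" -lt 2 ]; then
    echo "usage: backup_db from-db-uri to-backup-uri [sse]" >&2
    return 2
  fi
  from_db_uri="$1"
  to_backup_uri="$2"
  encryption="${3:-}"
  if [ -n "$encryption" ]; then
    # Encrypted backup (S3 only), using the --encryption flag from the text.
    "$DATOMIC_BIN" "-Xmx$BACKUP_HEAP" "-Xms$BACKUP_HEAP" backup-db \
      --encryption "$encryption" "$from_db_uri" "$to_backup_uri"
  else
    "$DATOMIC_BIN" "-Xmx$BACKUP_HEAP" "-Xms$BACKUP_HEAP" backup-db \
      "$from_db_uri" "$to_backup_uri"
  fi
}

# Example (made-up URIs):
#   backup_db datomic:dev://localhost:4334/my-db s3://my-bucket/my-db sse
```

Calling the function repeatedly with the same to-backup-uri keeps all point-in-time backups of that database under one URI.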

Differential Backup

Datomic version 0.9.5130 introduced differential backup. Differential backup takes advantage of the tree structure of Datomic by skipping consideration of child nodes when a parent node is already present from a previous backup.

If you back up a database repeatedly to the same URI, differential backup will substantially reduce backup times. This approach is strongly recommended, especially for large databases.

Listing Backups

You can use the list-backups command to list the approximate points in time (t) of different available backups.

bin/datomic list-backups backup-uri

Restoring

You can use the restore-db command to restore a database.

bin/datomic -Xmx4g -Xms4g restore-db from-backup-uri to-db-uri (t)

You should give the restore process as much memory as you give the transactor process for a system.

If you do not specify the optional t, the most recent backup will be restored. Note that you can only restore to a t that has been backed up. It is not possible to restore to an arbitrary t.
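
The optional t can be handled in a small wrapper that appends it only when one is supplied. As with any such helper, this is a sketch under assumptions: DATOMIC_BIN, RESTORE_HEAP, and restore_db are hypothetical names, and the URIs and the t value in the usage comments are made-up examples. A real t should be taken from the output of list-backups.

```shell
# Hypothetical wrapper around the restore-db command shown above.
DATOMIC_BIN="${DATOMIC_BIN:-bin/datomic}"   # path to the Datomic launcher
RESTORE_HEAP="${RESTORE_HEAP:-4g}"          # match the transactor's heap size

# restore_db FROM_BACKUP_URI TO_DB_URI [T]
restore_db() {
  if [ "$#" -lt 2 ]; then
    echo "usage: restore_db from-backup-uri to-db-uri [t]" >&2
    return 2
  fi
  from_backup_uri="$1"
  to_db_uri="$2"
  t="${3:-}"
  if [ -n "$t" ]; then
    # Restore a specific backed-up point in time.
    "$DATOMIC_BIN" "-Xmx$RESTORE_HEAP" "-Xms$RESTORE_HEAP" restore-db \
      "$from_backup_uri" "$to_db_uri" "$t"
  else
    # No t given: the most recent backup is restored.
    "$DATOMIC_BIN" "-Xmx$RESTORE_HEAP" "-Xms$RESTORE_HEAP" restore-db \
      "$from_backup_uri" "$to_db_uri"
  fi
}

# Examples (made-up URIs and t):
#   restore_db file:/var/backups/my-db datomic:dev://localhost:4334/my-db
#   restore_db file:/var/backups/my-db datomic:dev://localhost:4334/my-db 1234
```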

You can restore into a URI that already points to a different point-in-time for the same database. You cannot restore into a URI that points to a different database.

Restore can rename databases. However, you cannot restore a single database to two different URIs within the same storage.

You do not need to do anything to a storage (e.g. deleting old files or tables) before or after restoring. However, restoring to an "empty" storage will use the least possible storage space.

You must restart peers and transactors after a restore. The instructions below describe how to perform these steps, depending on which storage you are restoring to.

Restoring to :dev and :free storages

:dev and :free storages currently require a running transactor during restore, because storage resides inside the transactor process. You must start the transactor before running a restore, and then shut down and restart the transactor after the restore completes.

Restoring to all other storages

With all system processes down, run the restore. Then start the transactor and peers.

Backup and Restore Performance

Backup and restore performance can be configured through system properties. In particular, you can change the number of concurrent reads and writes on a local system using datomic.fileBackupConcurrency or on S3 using datomic.s3BackupConcurrency. The default for datomic.fileBackupConcurrency is 5, and datomic.s3BackupConcurrency defaults to 25. These are reasonable defaults and most applications will never need to change these settings.

Another configurable option provided for backup is backupPaceMsec. This setting can be used to reduce I/O pressure by slowing the pace of backups. This setting defaults to no pacing, allowing backup to go as fast as possible. If you need to slow the pace of backup down, setting this value to an integer will cause backup to pause that many milliseconds between backup operations.
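
JVM system properties are conventionally set with -Dname=value options, so the settings above could be expressed as in the sketch below. Two things here are assumptions rather than documented facts: that the pacing property carries the same datomic. prefix as the concurrency properties, and that the bin/datomic launcher forwards -D options to the JVM the way it forwards -Xmx and -Xms. The helper name backup_tuning_flags is hypothetical.

```shell
# Hypothetical helper that prints -D options for the backup tuning properties.
# backup_tuning_flags file|s3 CONCURRENCY [PACE_MSEC]
backup_tuning_flags() {
  case "$1" in
    file) prop=datomic.fileBackupConcurrency ;;
    s3)   prop=datomic.s3BackupConcurrency ;;
    *)    echo "target must be file or s3" >&2
          return 1 ;;
  esac
  flags="-D$prop=$2"
  if [ -n "${3:-}" ]; then
    # Assumption: the pacing property is named datomic.backupPaceMsec.
    flags="$flags -Ddatomic.backupPaceMsec=$3"
  fi
  printf '%s\n' "$flags"
}

# Possible usage, assuming the launcher passes -D options through to the JVM:
#   bin/datomic -Xmx4g -Xms4g $(backup_tuning_flags s3 10 5) \
#     backup-db from-db-uri to-backup-uri
```

Verify how your installation accepts system properties before relying on this form.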

Deleting Backups

If you need to reclaim backup space, you can delete the entire content hierarchy at a particular URI, using the filesystem or S3 tool of your choice. This will reclaim all space, and delete all the different point-in-time backups associated with that URI.
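
As one possible way to do this, the sketch below removes everything under a backup URI, using rm for a file: URI and the AWS CLI for an s3: URI. The function name delete_backup is hypothetical, the AWS CLI is just one of many S3 tools and is assumed to be installed and configured, and the operation is irreversible, so check the URI carefully before running it.

```shell
# Hypothetical helper: delete the entire content hierarchy at a backup URI.
# delete_backup BACKUP_URI
delete_backup() {
  uri="$1"
  case "$uri" in
    file:/*)
      dir="${uri#file:}"            # strip the scheme, leaving the path
      if [ "$dir" = "/" ]; then
        echo "refusing to delete /" >&2
        return 1
      fi
      rm -rf -- "$dir"
      ;;
    s3://*/*)
      # Removes every object under the bucket/prefix.
      aws s3 rm "$uri" --recursive
      ;;
    *)
      echo "unsupported backup URI: $uri" >&2
      return 1
      ;;
  esac
}

# Example (made-up path):
#   delete_backup file:/var/backups/my-db
```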

Deleting a single point in time within a backup URI is not supported.

Limitations

Backup and restore are not suitable for cloning of a database within a single storage. If you attempt to restore a database into a storage that already contains that database, but under a different name, the restore operation will fail.

Backup and restore operations require durable storage and will not work with the memory database.