Back up and restore meta service
This guide explains how to back up meta service data and restore it from a backup.
A meta snapshot is a backup of the meta service’s data at a specific point in time. Meta snapshots are persisted in S3-compatible storage.
Set backup parameters
Before creating the first meta snapshot, set the backup_storage_url and backup_storage_directory system parameters.
Avoid changing backup_storage_url and backup_storage_directory once snapshots exist. Doing so is not strictly forbidden, but all snapshots taken before the change are invalidated and can no longer be used for restoration.
To learn about how to configure system parameters, see How to configure system parameters.
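As a minimal sketch of this setup step, the script below only prints SQL statements for review; it does not connect to a cluster. It assumes system parameters are changed with a statement of the form ALTER SYSTEM SET name TO 'value', and the bucket and directory values are placeholders, so check the statement form against the linked guide before use.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration); replace with your own.
backup_storage_url="s3://backup_bucket"
backup_storage_directory="backup_data"

# Print the statements that set both backup parameters.
# Assumes the ALTER SYSTEM SET <name> TO '<value>' form; nothing is executed here.
backup_params_sql() {
  printf "ALTER SYSTEM SET backup_storage_url TO '%s';\n" "$1"
  printf "ALTER SYSTEM SET backup_storage_directory TO '%s';\n" "$2"
}

backup_params_sql "$backup_storage_url" "$backup_storage_directory"
```

Review the printed statements, then run them in a SQL session connected to the cluster before taking the first snapshot.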
Create a meta snapshot
The meta service creates a meta snapshot only when a user requests one. No automatic process in the RisingWave kernel creates meta snapshots on a schedule.
Here’s an example of how to create a new meta snapshot with risectl:
risectl is included in the pre-built RisingWave binary. Use the following command instead:
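A minimal sketch of both invocations. It assumes the subcommand is backup-meta under risectl meta and that the bundled copy is reached as risingwave ctl; neither name is spelled out in the text above, so check them against your installed version. With DRY_RUN left at 1, commands are printed rather than executed.

```shell
#!/bin/sh
# DRY_RUN=1 (default) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Standalone risectl (assumed subcommand name: backup-meta).
run risectl meta backup-meta

# Same request through the pre-built RisingWave binary
# (assumed entry point: risingwave ctl).
run risingwave ctl meta backup-meta
```

Set DRY_RUN=0 to execute the commands once the names have been confirmed.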
View existing meta snapshots
The following SQL command lists existing meta snapshots:
Example output:
Delete a meta snapshot
Here’s an example of how to delete a meta snapshot with risectl:
Restore from a meta snapshot
The restore procedure depends on the meta store backend. The two methods below cover a SQL database and etcd.
SQL database as meta store backend
If the cluster has been using a SQL database as the meta store backend, follow these steps to restore from a meta snapshot.
- Shut down the meta service.
This step is essential because the meta backup and recovery process does not replicate SST files. Multiple clusters must never run against the same set of SSTs at any time, as this can corrupt the SST files.
- Create a new meta store, i.e. a new SQL database instance. This new instance must have exactly the same tables defined as the original, but all tables must be empty. To achieve this, you can optionally use the schema migration tool to create the tables, then truncate any tables that the tool has populated.
- Restore the meta snapshot to the new meta store.
restore-meta reads snapshot data from backup storage and writes it to the meta store and hummock storage.
For example, for a cluster whose state store is hummock+s3://state_bucket with state stored under the directory state_data, and whose backup storage is s3://backup_bucket with backup directory backup_data, the parameters to risectl meta restore-meta should be:
  - --backup-storage-url s3://backup_bucket
  - --backup-storage-directory backup_data
  - --hummock-storage-url s3://state_bucket (note that the hummock+ prefix is stripped)
  - --hummock-storage-directory state_data
- Configure meta service to use the new meta store.
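The parameter mapping in the restore step can be sketched as a small script. It assumes the state store URL and its directory have already been read from the cluster settings into the hypothetical variables below. It only prints the four storage flags named above; the snapshot to restore and the meta store connection options still have to be added before running risectl meta restore-meta.

```shell
#!/bin/sh
# Values as they appear in the cluster settings (variable names are hypothetical).
state_store="hummock+s3://state_bucket"
data_directory="state_data"
backup_storage_url="s3://backup_bucket"
backup_storage_directory="backup_data"

# Strip a leading "hummock+" prefix; other values pass through unchanged.
hummock_url() {
  printf '%s\n' "${1#hummock+}"
}

# Print the storage-related flags for risectl meta restore-meta.
# Arguments: state store URL, state directory, backup URL, backup directory.
restore_storage_flags() {
  printf '%s\n' "--backup-storage-url $3"
  printf '%s\n' "--backup-storage-directory $4"
  printf '%s\n' "--hummock-storage-url $(hummock_url "$1")"
  printf '%s\n' "--hummock-storage-directory $2"
}

restore_storage_flags "$state_store" "$data_directory" \
  "$backup_storage_url" "$backup_storage_directory"
```

Printing the flags first makes it easy to confirm that the hummock+ prefix was removed before anything is written to the new meta store.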
etcd as meta store backend
If the cluster has been using etcd as the meta store backend, follow these steps to restore from a meta snapshot.
- Shut down the meta service.
This step is essential because the meta backup and recovery process does not replicate SST files. Multiple clusters must never run against the same set of SSTs at any time, as this can corrupt the SST files.
- Create a new meta store, i.e. a new and empty etcd instance.
- Restore the meta snapshot to the new meta store.
If etcd enables authentication, also specify the following:
restore-meta reads snapshot data from backup storage and writes it to the meta store and hummock storage.
For example, for a cluster whose state store is hummock+s3://state_bucket with state stored under the directory state_data, and whose backup storage is s3://backup_bucket with backup directory backup_data, the parameters to risectl meta restore-meta should be:
  - --backup-storage-url s3://backup_bucket
  - --backup-storage-directory backup_data
  - --hummock-storage-url s3://state_bucket (note that the hummock+ prefix is stripped)
  - --hummock-storage-directory state_data
- Configure meta service to use the new meta store.
Access historical data backed up by meta snapshot
Meta snapshots also support historical data access, also known as time travel queries.
Use the following steps to perform a time travel query.
- List all valid historical points in time (i.e., epochs) for a table. For example, to query the table with id 6:
Example output:
Choose an epoch to query. Valid epochs are within the range [safe_epoch, committed_epoch], e.g. [7039353459507200, 7039354678542336] or [7039354678542346, 7039622397886464].
- Set the session config QUERY_EPOCH to the chosen epoch. By default, it’s 0, which disables historical queries.
Then, batch queries in this session return data as of this epoch instead of the latest one.
- Disable historical queries by setting QUERY_EPOCH back to 0.
RisingWave supports historical data access only at points in time that are backed up by at least one meta snapshot.
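The range rule from the first step can be checked before setting QUERY_EPOCH. This is a minimal sketch, assuming the [safe_epoch, committed_epoch] pairs have been copied by hand from the listing; the function name and the candidate epoch are hypothetical, and the ranges are the sample values from above.

```shell
#!/bin/sh
# Succeeds when the epoch lies in the inclusive range [safe_epoch, committed_epoch].
# Arguments: epoch, safe_epoch, committed_epoch (all plain integers).
epoch_in_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Hypothetical epoch to validate against the two sample ranges.
candidate=7039354000000000

if epoch_in_range "$candidate" 7039353459507200 7039354678542336 ||
   epoch_in_range "$candidate" 7039354678542346 7039622397886464; then
  echo "epoch $candidate is covered by a meta snapshot"
else
  echo "epoch $candidate is not covered by any meta snapshot"
fi
```

Both bounds are treated as inclusive, matching the closed-interval notation used for the valid ranges.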