This repository has been archived by the owner on Feb 19, 2022. It is now read-only.
-
Notifications
You must be signed in to change notification settings - Fork 9
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
06ed2e5
commit 2948619
Showing
1 changed file
with
87 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,87 @@ | ||
#timescaledb-backup | ||
|
||
`timescaledb-backup` is a program for making dumping and restoring a | ||
[TimescaleDB](//github.com/timescale/timescaledb) database simpler, less error-prone, | ||
and more performant. In particular, the current use of vanilla PostgreSQL tools | ||
[`pg_dump`](//www.postgresql.org/docs/current/app-pgdump.html) and | ||
[`pg_restore`](//www.postgresql.org/docs/current/app-pgrestore.html) has several | ||
limitations when applied to TimescaleDB: | ||
1. The PostgreSQL backup/restore tools do not support backup/restore _across_ | ||
versions of extensions. So that if you take a backup from (say) TimescaleDB | ||
v1.7.1, you need to restore to a database version that is _also_ running | ||
TimescaleDB v1.7.1, and then manually upgrade TimescaleDB to a later version. | ||
1. The backup/restore tools do not _track_ which version of TimescaleDB is in the | ||
backup, so a developer needs to maintain additional external information to | ||
ensure the proper restore process. | ||
1. Users need to take manual steps to run pre- and post-restore hooks (database | ||
functions) in TimescaleDB to ensure correct behavior. Failure to execute these | ||
hooks can prevent restores from functioning correctly. | ||
1. The restore process cannot easily perform parallel restoration for greater | ||
speed/efficiency. | ||
|
||
Towards this end, `timescaledb-backup` overcomes many of these current | ||
limitations. It continues to use `pg_dump` and `pg_restore`, but properly wraps | ||
them to: | ||
1. Track automatically the version of TimescaleDB internally in the information | ||
dumped, to ensure that the proper version is always restored. | ||
1. Run all pre and post restore hooks at their proper times during a restore; and | ||
1. Enable parallel restore by properly sequencing catalog and data restores during | ||
the restore process. | ||
|
||
## Installing `timescaledb-backup` | ||
|
||
You can install by running `go get` | ||
|
||
```bash | ||
$ go get github.com/timescale/timescaledb-backup/ | ||
``` | ||
|
||
Or by downloading a binary at [our release page](//github.com/timescale/timescaledb-backup/releases) | ||
It will also be distributed with a number of our tools in `timescaledb-tools`; yum, apt, Homebrew,etc. | ||
|
||
## Using `timescaledb-backup` | ||
|
||
### Requirements | ||
- You will need binaries for `pg_dump` and `pg_restore` installed where you are running | ||
`timescaledb-backup` | ||
- The target database needs the `.so` file of the dumped version so that we can restore to the correct version. It will also need the `.so` of your target version. | ||
|
||
### Using `ts-dump` | ||
First create a dump using the `ts-dump` command, for those used to using `pg_dump`, the | ||
options are pared down significantly you will need to provide the following parameters: | ||
|
||
- `--db-URI` the database connection string in [Postgres URI](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING) format to connect to. The Postgres format is: `postgresql://[user[:password]@][host][:port][,...][/dbname][?param1=value1&...]` many of these parameters can be specified in enviornmental variables in the normal Postgres convention and passwords will be looked up in the usual ways as allowed by the `pgx` go library. | ||
- `--dump-dir` the dump directory where you would like the dump to be stored. The dump directory should not exist, it will be created as part of creating the dump, however the path to the directory should exist. | ||
|
||
Optional parameters: | ||
- `--jobs` Sets the number of jobs to run for the dump, by default it is set to 4 and will run in parallel mode, set to 0 to disable parallelism | ||
- `--verbose` Determines whether verbose output will be provided from `pg_dump`. Defaults to false. | ||
|
||
When a dump is created, the dump directory will be created, and a json file containing | ||
TimescaleDB version information as well as a subdirectory containing the dump will be | ||
created. The dump is always created in directory format inside the subdirectory. | ||
|
||
### Using `ts-restore` | ||
Once you have a backup you can run a `ts-restore` by specifying the same dump directory | ||
and a new database uri. Currently, the database you are restoring to must already exist. | ||
If TimescaleDB is installed it will be dropped and re-created at the proper | ||
version, we recommend restoring only to an empty database. You will need to provide the following parameters: | ||
|
||
- `--db-URI` the database connection string in [Postgres URI](https://www.postgresql.org/docs/current/libpq-connect.html#LIBPQ-CONNSTRING) format to connect to. The Postgres format is: `postgresql://[user[:password]@][host][:port][,...][/dbname][?param1=value1&...]` many of these parameters can be specified in enviornmental variables in the normal Postgres convention and passwords will be looked up in the usual ways as allowed by the `pgx` go library. | ||
- `--dump-dir` the dump directory where the output from `ts-dump` was stored. | ||
|
||
Optional parameters: | ||
- `--jobs` Sets the number of jobs to run for the restore, by default it is set to 4 and will run in parallel mode during the sections[^1] that are able to be parallelized. Set to 0 to disable parallelism. | ||
- `--verbose` Determines whether verbose output will be provided from `pg_restore`. Defaults to true. | ||
|
||
## Limitations of TimescaleDB Backup | ||
We currently do not support many of the options that `pg_dump` and `pg_restore` do, please | ||
submit issues or contributions for flags or options that you believe are important to | ||
include. | ||
|
||
|
||
|
||
[^1]: In order to support parallel restores given the way the TimescaleDB catalog works, | ||
we first perform a pre-data restore, then restore the data for the catalog, then in | ||
parallel perform the data section for everything else and the post-data section for | ||
everything. |