StorageTapper is a scalable realtime MySQL change data streaming, logical backup and logical replication service.
StorageTapper is deployed in production at Uber and used to produce snapshots and realtime change data for thousands of MySQL tables across multiple datacenters.
It is also used as a backup service to snapshot hundreds of terabytes of Schemaless data to HDFS and S3, with optional asymmetric encryption and compression.
It reads data from a source, transforms it according to the specified event format, and produces it to a destination.
Supported event sources:
- MySQL
- Schemaless
Supported event destinations:
- Kafka
- HDFS
- S3
- Local file
- MySQL (experimental)
- Postgres (experimental)
- Clickhouse (experimental)
Supported event formats:
- Avro
- JSON
- MsgPack
- SQL
StorageTapper keeps its job state in a MySQL database and automatically distributes jobs among a configured number of workers.
It is also aware of node roles and takes snapshots from slave nodes in order to reduce load on master nodes. It can optionally throttle reads further. Binlogs are streamed from master nodes for better SLAs.
The service is dynamically configurable through a RESTful API or the built-in UI.
cd storagetapper
make deb && dpkg -i ../storagetapper_1.0_amd64.deb
cd storagetapper
make && make install
/bin/bash scripts/install_deps.sh # install all dependencies: MySQL, Kafka, HDFS, S3, ...
make test # run all tests
GO111MODULE=on TEST_PARAM="-test.run=TestLocalBasic" /bin/bash scripts/run_tests.sh ./pipe # individual test
make test-env
$ make test
StorageTapper loads configuration from the following files and locations, in the given order:
/etc/storagetapper/base.yaml
/etc/storagetapper/production.yaml
$(HOME)/base.yaml
$(HOME)/production.yaml
$(STORAGETAPPER_CONFIG_DIR)/base.yaml
$(STORAGETAPPER_CONFIG_DIR)/production.yaml
Available options are described in the Options section.
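
To check which of these files exist on a given machine, the lookup order can be reproduced with a small Go sketch. The helper `configPaths` is hypothetical and not part of StorageTapper. It only builds the list of candidate paths and does not parse YAML, and skipping an unset directory variable is an assumption of this sketch, not documented behavior.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// configPaths returns the candidate configuration files in the order listed
// above. home and configDir stand in for $(HOME) and
// $(STORAGETAPPER_CONFIG_DIR). Skipping an empty directory is an assumption
// of this sketch.
func configPaths(home, configDir string) []string {
	dirs := []string{"/etc/storagetapper", home, configDir}
	files := []string{"base.yaml", "production.yaml"}

	var paths []string
	for _, dir := range dirs {
		if dir == "" {
			continue // variable not set, nothing to look up
		}
		for _, file := range files {
			paths = append(paths, filepath.Join(dir, file))
		}
	}
	return paths
}

func main() {
	paths := configPaths(os.Getenv("HOME"), os.Getenv("STORAGETAPPER_CONFIG_DIR"))
	for i, p := range paths {
		_, err := os.Stat(p)
		fmt.Printf("%d. %s (exists: %t)\n", i+1, p, err == nil)
	}
}
```

Running it prints each candidate path in lookup order together with whether the file is present, which can help when tracking down where a setting comes from.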
This software is licensed under the MIT License.