Upgrading from 1.x to 2.x

Amnon Heiman edited this page Aug 8, 2018 · 5 revisions

Upgrade procedure highlights - Use two monitoring stacks

The Scylla monitoring stack uses Prometheus as its metrics database. Because Prometheus 2.x is not backward compatible with 1.x data, the upgrade follows the Prometheus upgrade procedure (see below): to upgrade and keep the old metrics, run a second monitoring stack beside the old one.

Download the 2.0 version and place it in a different directory. You can run two instances of Prometheus, Grafana, and Alertmanager on the same machine, as long as they use different ports.

After you make sure that the second stack is running correctly, you can configure it to read historical metrics from the old one. Once the old metrics are past the retention period, you can take down the old stack and remove that configuration.

Upgrade to the latest 1.x version

Before starting the upgrade procedure, make sure you are running the latest 1.x version.

Install the new monitoring stack

Download the 2.x version from the release page.

Unzip it in a different directory. This is important because Prometheus 2.x is not backward compatible and cannot use the old data.

You can reuse the server definitions from the old monitoring stack. Start the new monitoring stack; if you are using Docker, make sure to pass -g, -p, and -m to specify different ports.

While the old system keeps running, you can bring the new one up and down as often as needed to verify that everything works.

Validation

Make sure the new monitoring stack is working before moving to 2.x as your main system. Check that the graphs return data and that the nodes are reachable.

Alerting Rules

Note that alerting rules moved to a YAML (.yml) file format. Make sure that all the rules you defined were carried over to the new file.
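For orientation, here is a minimal sketch of what a rule looks like in the Prometheus 2.x YAML rule format. The group name, alert name, threshold, and label values are illustrative assumptions, not the rules shipped with the monitoring stack; compare against your own 1.x rules when checking that nothing was lost.

```yaml
# Prometheus 2.x rule file: rules are grouped under named groups.
groups:
  - name: example.rules          # hypothetical group name
    rules:
      - alert: InstanceDown      # illustrative alert, not a shipped rule
        expr: up == 0            # fires when a scrape target is unreachable
        for: 30s                 # condition must hold this long before firing
        labels:
          severity: "error"      # assumed label value
        annotations:
          description: "{{ $labels.instance }} has been down for more than 30 seconds."
```

In Prometheus 1.x the same rule would have been written in the older `ALERT ... IF ... FOR ...` syntax, so each such statement should have a matching `alert:` entry in the new file.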

Moving to Prometheus 2.x

Monitoring stack version 2.0 upgrades Prometheus from 1.8 to 2.3. This upgrade is not backward compatible.

The Prometheus migration is covered in the Prometheus 2.0 migration guide.

Note that when using the Docker containers, the container permissions changed in addition to the data format. As a result, the permissions on the old data directory will no longer work.

If both systems are working side by side and you want to switch to the new system while keeping access to the old data, do the following:

Set the new system to read from the old

In the new system's Prometheus template YAML file, add:

remote_read:
  - url: "http://{ip}:9094/api/v1/read"

where {ip} is the IP address of the old system.
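To show where this fragment sits, here is a rough sketch of the top level of the new system's template. The address 192.0.2.10 and the scrape interval are placeholder assumptions; the point is only that remote_read is a top-level key, alongside global and scrape_configs, not nested inside them.

```yaml
global:
  scrape_interval: 15s                           # placeholder; keep your existing value

# Top-level key: tells the new Prometheus to fetch history from the old server.
remote_read:
  - url: "http://192.0.2.10:9094/api/v1/read"    # hypothetical IP of the old system

# scrape_configs and the rest of the template stay as they are.
```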

Set the old system to expose the web api

Add the command line flag -web.listen-address=:9094 to the old Prometheus server.

In the old system's Prometheus template YAML file, remove everything but the external_labels section. The old server then stops scraping targets and only serves its stored history.
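One possible shape of the trimmed file on the old system is sketched below. The label name and value are assumptions for illustration; keep whatever external labels your existing template already defines.

```yaml
# Old (1.x) Prometheus configuration after trimming:
# no scrape_configs, so nothing new is collected.
global:
  external_labels:
    monitor: "scylla-monitor"   # hypothetical label; keep your own
```

Keeping external_labels unchanged matters because the labels attached to the stored history should continue to match what the new system expects when it reads that data.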

Validate the upgrade

You should be able to see the graphs on the new stack, including their history. By default, the Prometheus retention period is 15 days, so once that period has passed it is safe to take down the old system and remove the remote_read section from the new Prometheus configuration.

Rollback

In the upgrade procedure, you set up a second monitoring stack. The old monitoring stack continues to work in parallel.

To roll back, add the Prometheus targets back to the old system's configuration and take down the new system.