This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Add example of tuning RocksDB on NNI #1610

Merged on Oct 23, 2019 (6 commits)
68 changes: 68 additions & 0 deletions docs/en_US/TrialExample/RocksdbExamples.md
@@ -0,0 +1,68 @@
# Tuning RocksDB on NNI

## Overview

[RocksDB](https://github.com/facebook/rocksdb) is a popular high-performance embedded key-value database used in production systems at various web-scale enterprises including Facebook, Yahoo!, and LinkedIn. It is a fork of [LevelDB](https://github.com/google/leveldb) by Facebook, optimized to exploit many central processing unit (CPU) cores and to make efficient use of fast storage, such as solid-state drives (SSD), for input/output (I/O) bound workloads.

The performance of RocksDB is highly contingent on its tuning. However, because of the complexity of its underlying technology and the large number of configurable parameters, a good configuration is sometimes hard to obtain. NNI can help to address this issue. NNI supports many kinds of tuning algorithms to search for the best configuration of RocksDB, and supports many kinds of environments such as a local machine, remote servers and the cloud. By following this example, you can search for the best configuration of RocksDB for a `fillrandom` benchmark with the SMAC and TPE tuners. Other tuners can be adopted in the same way. Please refer to [here](../Tuner/BuiltinTuner.md) for more information.

`code directory: examples/trials/systems/rocksdb-fillrandom`

## Goals

This example illustrates how to use NNI to search for the best configuration of RocksDB for the `fillrandom` benchmark supported by `db_bench`, the official benchmark tool provided by RocksDB itself. Therefore, before running this example, please make sure NNI is installed and [`db_bench`](https://github.com/facebook/rocksdb/wiki/Benchmarking-tools) is in your `PATH`. Please refer to [here](../Tutorial/QuickStart.md) for detailed information about installing and preparing the NNI environment, and [here](https://github.com/facebook/rocksdb/blob/master/INSTALL.md) for compiling RocksDB as well as `db_bench`.

## Search Space

For simplicity, this example tunes three parameters, `write_buffer_size`, `min_write_buffer_number_to_merge` and `level0_file_num_compaction_trigger`, for a workload that randomly writes 16M keys with a 20-byte key size and a 100-byte value size. Configurations are compared by write operations per second (OPS). The search space is specified by a `search_space.json` file as shown below. A detailed explanation of the search space format can be found [here](https://github.com/microsoft/nni/blob/master/docs/en_US/Tutorial/SearchSpaceSpec.md).

```json
{
  "write_buffer_size": {
    "_type": "quniform",
    "_value": [2097152, 16777216, 1048576]
  },
  "min_write_buffer_number_to_merge": {
    "_type": "quniform",
    "_value": [2, 16, 1]
  },
  "level0_file_num_compaction_trigger": {
    "_type": "quniform",
    "_value": [2, 16, 1]
  }
}
```

`code directory: examples/trials/systems/rocksdb-fillrandom/search_space.json`
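
To get a feel for the size of this search space, the sketch below enumerates the values each `quniform` entry can take. It assumes the semantics given in the NNI search space documentation, `clip(round(uniform(low, high) / q) * q, low, high)`; the helper name `quniform_grid` is hypothetical and is not part of NNI.

```python
# Sketch only: enumerates the discrete values a quniform entry can produce,
# assuming value = clip(round(x / q) * q, low, high) for x drawn from [low, high].
search_space = {
    "write_buffer_size": [2097152, 16777216, 1048576],
    "min_write_buffer_number_to_merge": [2, 16, 1],
    "level0_file_num_compaction_trigger": [2, 16, 1],
}


def quniform_grid(low, high, q):
    """Return the sorted distinct values reachable under the assumed quniform rule."""
    first = round(low / q)
    last = round(high / q)
    # clip each multiple of q back into [low, high]
    return sorted({min(max(k * q, low), high) for k in range(first, last + 1)})


grids = {name: quniform_grid(*spec) for name, spec in search_space.items()}

total = 1
for name, values in grids.items():
    print(f"{name}: {len(values)} values, from {values[0]} to {values[-1]}")
    total *= len(values)
print(f"distinct configurations: {total}")
```

Under that assumption each parameter takes 15 values (`write_buffer_size` moves from 2 MiB to 16 MiB in 1 MiB steps), so the space holds 15 × 15 × 15 = 3375 configurations.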

## Benchmark

The benchmark code should receive a configuration from the NNI manager and report the corresponding benchmark result back. The following NNI APIs are designed for this purpose. In this example, write operations per second (OPS) is used as the performance metric. Please refer to [here](Trials.md) for detailed information.

* Use `nni.get_next_parameter()` to get the next system configuration.
* Use `nni.report_final_result(metric)` to report the benchmark result.

`code directory: examples/trials/systems/rocksdb-fillrandom/main.py`
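
Between those two API calls, the trial has to turn the received configuration into `db_bench` flags and pull a single number out of the benchmark output. The sketch below illustrates both steps without calling NNI or `db_bench`. The helper names `build_args` and `parse_ops_per_sec` are hypothetical, and `SAMPLE_LINE` is an invented example of a summary line, so the real output format may differ between RocksDB versions.

```python
import re

# Invented sample of a db_bench summary line (an assumption, not captured output).
SAMPLE_LINE = "fillrandom   :       2.738 micros/op 365214 ops/sec;   40.4 MB/s"


def build_args(defaults, received):
    """Overlay tuner-proposed values on the defaults and render --key=value flags."""
    params = dict(defaults)
    # tuner values may arrive as floats (e.g. 4194304.0), so cast them to int
    params.update({k: int(v) for k, v in received.items()})
    return [f"--{k}={v}" for k, v in params.items()]


def parse_ops_per_sec(line):
    """Extract the number that precedes 'ops/sec' in a summary line."""
    match = re.search(r"([\d.]+)\s+ops/sec", line)
    if match is None:
        raise ValueError(f"no ops/sec figure found in: {line!r}")
    return float(match.group(1))


args = build_args(
    {"benchmarks": "fillrandom", "write_buffer_size": 67108864},
    {"write_buffer_size": 4194304.0},
)
print(args)
print(parse_ops_per_sec(SAMPLE_LINE))
```

Matching on the `ops/sec` unit rather than on a column position is a design choice of this sketch; it fails loudly with a `ValueError` when the expected figure is missing instead of reporting a wrong metric.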

## Config

An NNI experiment can be started with a config file. Usually, a config file for NNI includes experiment settings (`trialConcurrency`, `maxExecDuration`, `maxTrialNum`, `trial gpuNum`, etc.), platform settings (`trainingServicePlatform`, etc.), path settings (`searchSpacePath`, `trial codeDir`, etc.) and tuner settings (`tuner`, `tuner optimize_mode`, etc.). Please refer to [here](../Tutorial/QuickStart.md) for more information.

Here is an example config for tuning RocksDB with the SMAC algorithm:

`code directory: examples/trials/systems/rocksdb-fillrandom/config_smac.yml`

Here is an example config for tuning RocksDB with the TPE algorithm:

`code directory: examples/trials/systems/rocksdb-fillrandom/config_tpe.yml`
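
Both config files in this example set `trialConcurrency: 1`, `maxExecDuration: 12h` and `maxTrialNum: 256`. As a rough sanity check, the sketch below works out the average wall-clock time available per trial if both limits are to be used up together. The duration parser is a hypothetical helper that only handles a single unit suffix, and the result is an average, not a per-trial timeout enforced by NNI.

```python
# Values copied from config_smac.yml / config_tpe.yml in this example.
trial_concurrency = 1
max_exec_duration = "12h"
max_trial_num = 256

# Hypothetical helper: handles only "<integer><unit>" with s, m, h or d.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def duration_to_seconds(text):
    """Convert a string such as '12h' to seconds."""
    return int(text[:-1]) * UNIT_SECONDS[text[-1]]


total_seconds = duration_to_seconds(max_exec_duration)
avg_seconds_per_trial = total_seconds * trial_concurrency / max_trial_num
print(f"average budget per trial: {avg_seconds_per_trial} s")
```

With these settings the average comes to 168.75 seconds per trial. If a single `fillrandom` run takes longer than that on your hardware, the time limit is likely to be reached before all 256 trials have run.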

## Launch the experiment

To run this example, enter the example folder and start the experiment using one of the following commands:

```bash
# tuning RocksDB with SMAC tuner
nnictl create --config ./config_smac.yml
# tuning RocksDB with TPE tuner
nnictl create --config ./config_tpe.yml
```
1 change: 1 addition & 0 deletions docs/en_US/TrialExample/Trials.md
@@ -163,3 +163,4 @@ For more information, please refer to [HowToDebug](../Tutorial/HowToDebug.md)
* [How to tune Scikit-learn on NNI](SklearnExamples.md)
* [Automatic Model Architecture Search for Reading Comprehension.](SquadEvolutionExamples.md)
* [Tuning GBDT on NNI](GbdtExample.md)
* [Tuning RocksDB on NNI](RocksdbExamples.md)
68 changes: 68 additions & 0 deletions docs/zh_CN/TrialExample/RocksdbExamples.md
@@ -0,0 +1,68 @@
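# Tuning RocksDB with NNI

## Overview

[RocksDB](https://github.com/facebook/rocksdb) is a popular high-performance embedded key-value database, widely used in web-scale products at many companies such as Facebook, Yahoo!, and LinkedIn. Facebook built it on top of [LevelDB](https://github.com/google/leveldb), optimizing it for I/O-intensive applications by taking full advantage of multi-core CPUs and fast storage.

The performance of RocksDB depends heavily on how its runtime parameters are tuned. However, because the underlying technology is very complex and there are many parameters to adjust, suitable settings are sometimes hard to find. NNI can help database operations engineers solve this problem. NNI supports a variety of automatic tuning algorithms and can run workloads locally, on remote machines, and in the cloud. This example shows how to use the SMAC and TPE tuners in NNI to search for the RocksDB runtime parameters that give the best performance in a random-write experiment. Other tuners are used in much the same way. More details can be found [here](../Tuner/BuiltinTuner.md).

`Code directory: examples/trials/systems/rocksdb-fillrandom`

## Goals

This example shows how to use NNI to search for the runtime parameters with which RocksDB performs best in the `fillrandom` benchmark. `fillrandom` is one of the benchmarks supported by `db_bench`, the official benchmark tool provided by RocksDB, so before running this example please make sure that NNI is installed and that `db_bench` is in your `PATH`. For how to install and prepare the NNI environment, see [here](../Tuner/BuiltinTuner.md); for how to compile RocksDB and `db_bench`, see [here](https://github.com/facebook/rocksdb/blob/master/INSTALL.md).

## Search Space

For simplicity, this example tunes three runtime parameters, `write_buffer_size`, `min_write_buffer_number_to_merge` and `level0_file_num_compaction_trigger`, based on RocksDB's write operations per second (OPS) when randomly writing 16M key-value pairs with 20-byte keys and 100-byte values. The search space is specified by the file `search_space.json` shown below. For more explanation of search spaces, see [here](https://github.com/microsoft/nni/blob/master/docs/en_US/Tutorial/SearchSpaceSpec.md).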

```json
{
  "write_buffer_size": {
    "_type": "quniform",
    "_value": [2097152, 16777216, 1048576]
  },
  "min_write_buffer_number_to_merge": {
    "_type": "quniform",
    "_value": [2, 16, 1]
  },
  "level0_file_num_compaction_trigger": {
    "_type": "quniform",
    "_value": [2, 16, 1]
  }
}
```

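`Code directory: examples/trials/systems/rocksdb-fillrandom/search_space.json`

## Benchmark

The benchmark program needs to receive a set of runtime parameters from the NNI manager and, after running the benchmark, report the result back to the NNI manager. NNI provides the following two APIs for these tasks. For more about NNI trials, see [here](Trials.md).

* Use `nni.get_next_parameter()` to get the system runtime parameters to test from the NNI manager.
* Use `nni.report_final_result(metric)` to report the benchmark result to the NNI manager.

`Code directory: examples/trials/systems/rocksdb-fillrandom/main.py`

## NNI Config File

An NNI experiment can be started from a config file. Typically, an NNI config file includes experiment settings (`trialConcurrency`, `maxExecDuration`, `maxTrialNum`, `trial gpuNum`, etc.), platform settings (`trainingServicePlatform`, etc.), path settings (`searchSpacePath`, `trial codeDir`, etc.) and tuner settings (`tuner`, `tuner optimize_mode`, etc.). For more about NNI config files, see [here](../Tutorial/QuickStart.md).

Here is an example config file for tuning RocksDB with the SMAC algorithm:

`Code directory: examples/trials/systems/rocksdb-fillrandom/config_smac.yml`

Here is an example config file for tuning RocksDB with the TPE algorithm:

`Code directory: examples/trials/systems/rocksdb-fillrandom/config_tpe.yml`

## Launch the Experiment

The files above are the main contents of this example. Enter the example folder and start the experiment with the following commands: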

```bash
# tuning RocksDB with SMAC tuner
nnictl create --config ./config_smac.yml
# tuning RocksDB with TPE tuner
nnictl create --config ./config_tpe.yml
```
3 changes: 2 additions & 1 deletion docs/zh_CN/TrialExample/Trials.md
@@ -168,4 +168,5 @@ echo $? `date +%s%3N` >/home/user_name/nni/experiments/$experiment_id$/trials/$t
* [Finding the best optimizer for CIFAR 10 classification](Cifar10Examples.md)
* [How to tune Scikit-learn parameters on NNI](SklearnExamples.md)
* [Automatic model architecture search for reading comprehension.](SquadEvolutionExamples.md)
* [How to tune GBDT on NNI](GbdtExample.md)
* [How to tune GBDT on NNI](GbdtExample.md)
* [How to tune RocksDB on NNI](RocksdbExamples.md)
21 changes: 21 additions & 0 deletions examples/trials/systems/rocksdb-fillrandom/config_smac.yml
@@ -0,0 +1,21 @@
authorName: default
experimentName: auto_rocksdb_SMAC
trialConcurrency: 1
maxExecDuration: 12h
maxTrialNum: 256
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner
  #SMAC (SMAC should be installed through nnictl)
  builtinTunerName: SMAC
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: python3 main.py
  codeDir: .
  gpuNum: 0
21 changes: 21 additions & 0 deletions examples/trials/systems/rocksdb-fillrandom/config_tpe.yml
@@ -0,0 +1,21 @@
authorName: default
experimentName: auto_rocksdb_TPE
trialConcurrency: 1
maxExecDuration: 12h
maxTrialNum: 256
#choice: local, remote, pai
trainingServicePlatform: local
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner
  #SMAC (SMAC should be installed through nnictl)
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: python3 main.py
  codeDir: .
  gpuNum: 0
96 changes: 96 additions & 0 deletions examples/trials/systems/rocksdb-fillrandom/main.py
@@ -0,0 +1,96 @@
# Copyright (c) Microsoft Corporation
# All rights reserved.
#
# MIT License
#
# Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated
# documentation files (the "Software"), to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and
# to permit persons to whom the Software is furnished to do so, subject to the following conditions:
# The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING
# BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
# DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

import nni
import subprocess
import logging

LOG = logging.getLogger('rocksdb-fillrandom')


def run(**parameters):
    '''Run rocksdb benchmark and return throughput'''
    bench_type = parameters['benchmarks']
    # recover args
    args = [f"--{k}={v}" for k, v in parameters.items()]
    # subprocess communicate
    process = subprocess.Popen(['db_bench'] + args, stdout=subprocess.PIPE)
    out, err = process.communicate()
    # split into lines
    lines = out.decode("utf8").splitlines()

    match_lines = []
    for line in lines:
        # find the line with matched str
        if bench_type not in line:
            continue
        else:
            match_lines.append(line)
            break

    results = {}
    for line in match_lines:
        key, _, value = line.partition(":")
        key = key.strip()
        # keep the segment between the first two occurrences of "op", which is
        # expected to hold the ops/sec number ("... micros/op <N> ops/sec")
        value = value.split("op")[1]
        results[key] = float(value)

    return results[bench_type]


def generate_params(received_params):
    '''generate parameters based on received parameters'''
    params = {
        "benchmarks": "fillrandom",
        "threads": 1,
        "key_size": 20,
        "value_size": 100,
        "num": 13107200,
        "db": "/tmp/rockdb",
        "disable_wal": 1,
        "max_background_flushes": 1,
        "max_background_compactions": 4,
        "write_buffer_size": 67108864,
        "max_write_buffer_number": 16,
        "min_write_buffer_number_to_merge": 2,
        "level0_file_num_compaction_trigger": 2,
        "max_bytes_for_level_base": 268435456,
        "max_bytes_for_level_multiplier": 10,
        "target_file_size_base": 33554432,
        "target_file_size_multiplier": 1
    }

    # values proposed by the tuner override the defaults above
    for k, v in received_params.items():
        params[k] = int(v)

    return params


if __name__ == "__main__":
    try:
        # get parameters from tuner
        RECEIVED_PARAMS = nni.get_next_parameter()
        LOG.debug(RECEIVED_PARAMS)
        PARAMS = generate_params(RECEIVED_PARAMS)
        LOG.debug(PARAMS)
        # run benchmark
        throughput = run(**PARAMS)
        # report throughput to nni
        nni.report_final_result(throughput)
    except Exception as exception:
        LOG.exception(exception)
        raise
14 changes: 14 additions & 0 deletions examples/trials/systems/rocksdb-fillrandom/search_space.json
@@ -0,0 +1,14 @@
{
  "write_buffer_size": {
    "_type": "quniform",
    "_value": [2097152, 16777216, 1048576]
  },
  "min_write_buffer_number_to_merge": {
    "_type": "quniform",
    "_value": [2, 16, 1]
  },
  "level0_file_num_compaction_trigger": {
    "_type": "quniform",
    "_value": [2, 16, 1]
  }
}