Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adjust WAL purge default configurations #5107

Merged
merged 3 commits into from
Dec 11, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 11 additions & 11 deletions config/config.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,11 +13,11 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `mode` | String | `standalone` | The running mode of the datanode. It can be `standalone` or `distributed`. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. |
| `default_timezone` | String | Unset | The default timezone of the server. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum current queries allowed to be executed. Zero means unlimited. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
Expand Down Expand Up @@ -61,9 +61,9 @@
| `wal` | -- | -- | The WAL options. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
Expand Down Expand Up @@ -287,12 +287,12 @@
| `bind_addr` | String | `127.0.0.1:3002` | The bind address of metasrv. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost. |
| `store_addrs` | Array | -- | Store server address default to etcd store. |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `backend` | String | `EtcdStore` | The datastore for meta server. |
| `selector` | String | `round_robin` | Datanode selector type.<br/>- `round_robin` (default value)<br/>- `lease_based`<br/>- `load_based`<br/>For details, please see "https://docs.greptime.com/developer-guide/metasrv/selector". |
| `use_memory_store` | Bool | `false` | Store data in memory. |
| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `enable_region_failover` | Bool | `false` | Whether to enable region failover.<br/>This feature is only available on GreptimeDB running on cluster mode and<br/>- Using Remote WAL<br/>- Using shared storage (e.g., s3). |
| `backend` | String | `EtcdStore` | The datastore for meta server. |
| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. Enabled by default. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
Expand Down Expand Up @@ -357,14 +357,14 @@
| `node_id` | Integer | Unset | The datanode identifier and should be unique in the cluster. |
| `require_lease_before_startup` | Bool | `false` | Start services after regions have obtained leases.<br/>It will block the datanode start if it can't receive leases in the heartbeat from metasrv. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. |
| `init_regions_parallelism` | Integer | `16` | Parallelism of initializing regions. |
| `max_concurrent_queries` | Integer | `0` | The maximum current queries allowed to be executed. Zero means unlimited. |
| `rpc_addr` | String | Unset | Deprecated, use `grpc.addr` instead. |
| `rpc_hostname` | String | Unset | Deprecated, use `grpc.hostname` instead. |
| `rpc_runtime_size` | Integer | Unset | Deprecated, use `grpc.runtime_size` instead. |
| `rpc_max_recv_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_recv_message_size` instead. |
| `rpc_max_send_message_size` | String | Unset | Deprecated, use `grpc.rpc_max_send_message_size` instead. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
Expand Down Expand Up @@ -399,9 +399,9 @@
| `wal` | -- | -- | The WAL options. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
Expand Down
11 changes: 5 additions & 6 deletions config/datanode.example.toml
Original file line number Diff line number Diff line change
Expand Up @@ -13,9 +13,6 @@ require_lease_before_startup = false
## By default, it provides services after all regions have been initialized.
init_regions_in_background = false

## Enable telemetry to collect anonymous usage data.
enable_telemetry = true

## Parallelism of initializing regions.
init_regions_parallelism = 16

Expand All @@ -42,6 +39,8 @@ rpc_max_recv_message_size = "512MB"
## @toml2docs:none-default
rpc_max_send_message_size = "512MB"

## Enable telemetry to collect anonymous usage data. Enabled by default.
#+ enable_telemetry = true

## The HTTP server options.
[http]
Expand Down Expand Up @@ -143,15 +142,15 @@ dir = "/tmp/greptimedb/wal"

## The size of the WAL segment file.
## **It's only used when the provider is `raft_engine`**.
file_size = "256MB"
file_size = "128MB"

## The threshold of the WAL size to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
purge_threshold = "4GB"
purge_threshold = "1GB"

## The interval to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
purge_interval = "10m"
purge_interval = "1m"

## The read batch size.
## **It's only used when the provider is `raft_engine`**.
Expand Down
16 changes: 8 additions & 8 deletions config/metasrv.example.toml
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,12 @@ server_addr = "127.0.0.1:3002"
## Store server address default to etcd store.
store_addrs = ["127.0.0.1:2379"]

## If it's not empty, the metasrv will store all data with this key prefix.
store_key_prefix = ""

## The datastore for meta server.
backend = "EtcdStore"

## Datanode selector type.
## - `round_robin` (default value)
## - `lease_based`
Expand All @@ -20,20 +26,14 @@ selector = "round_robin"
## Store data in memory.
use_memory_store = false

## Whether to enable greptimedb telemetry.
enable_telemetry = true

## If it's not empty, the metasrv will store all data with this key prefix.
store_key_prefix = ""

## Whether to enable region failover.
## This feature is only available on GreptimeDB running on cluster mode and
## - Using Remote WAL
## - Using shared storage (e.g., s3).
enable_region_failover = false

## The datastore for meta server.
backend = "EtcdStore"
## Whether to enable greptimedb telemetry. Enabled by default.
#+ enable_telemetry = true

## The runtime options.
#+ [runtime]
Expand Down
12 changes: 6 additions & 6 deletions config/standalone.example.toml
Original file line number Diff line number Diff line change
@@ -1,9 +1,6 @@
## The running mode of the datanode. It can be `standalone` or `distributed`.
mode = "standalone"

## Enable telemetry to collect anonymous usage data.
enable_telemetry = true

## The default timezone of the server.
## @toml2docs:none-default
default_timezone = "UTC"
Expand All @@ -18,6 +15,9 @@ init_regions_parallelism = 16
## The maximum current queries allowed to be executed. Zero means unlimited.
max_concurrent_queries = 0

## Enable telemetry to collect anonymous usage data. Enabled by default.
#+ enable_telemetry = true

## The runtime options.
#+ [runtime]
## The number of threads to execute the runtime for global read operations.
Expand Down Expand Up @@ -147,15 +147,15 @@ dir = "/tmp/greptimedb/wal"

## The size of the WAL segment file.
## **It's only used when the provider is `raft_engine`**.
file_size = "256MB"
file_size = "128MB"

## The threshold of the WAL size to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
purge_threshold = "4GB"
purge_threshold = "1GB"

## The interval to trigger a flush.
## **It's only used when the provider is `raft_engine`**.
purge_interval = "10m"
purge_interval = "1m"

## The read batch size.
## **It's only used when the provider is `raft_engine`**.
Expand Down
6 changes: 3 additions & 3 deletions src/common/wal/src/config/raft_engine.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,9 +49,9 @@ impl Default for RaftEngineConfig {
fn default() -> Self {
Self {
dir: None,
file_size: ReadableSize::mb(256),
purge_threshold: ReadableSize::gb(4),
purge_interval: Duration::from_secs(600),
file_size: ReadableSize::mb(128),
purge_threshold: ReadableSize::gb(1),
purge_interval: Duration::from_secs(60),
read_batch_size: 128,
sync_write: false,
enable_log_recycle: true,
Expand Down
6 changes: 3 additions & 3 deletions tests-integration/tests/http.rs
Original file line number Diff line number Diff line change
Expand Up @@ -881,9 +881,9 @@ with_metric_engine = true

[wal]
provider = "raft_engine"
file_size = "256MiB"
purge_threshold = "4GiB"
purge_interval = "10m"
file_size = "128MiB"
purge_threshold = "1GiB"
purge_interval = "1m"
read_batch_size = 128
sync_write = false
enable_log_recycle = true
Expand Down
Loading