hashicorp · im2nguyen · Feb 27, 2023 · Feb 15, 2023 · Feb 15, 2023 · Feb 23, 2023
@@ -1586,15 +1586,95 @@ Valid time units are 'ns', 'us' (or 'µs'), 'ms', 's', 'm', 'h'."
 
 ## Raft Parameters
 
-- `raft_boltdb` ((#raft_boltdb)) This is a nested object that allows configuring
-  options for Raft's BoltDB based log store.
-
-  - `NoFreelistSync` ((#NoFreelistSync)) Setting this to `true` will disable
-    syncing the BoltDB freelist to disk within the raft.db file. Not syncing
-    the freelist to disk will reduce disk IO required for write operations
-    at the expense of potentially increasing start up time due to needing
-    to scan the db to discover where the free space resides within the file.
-
+- `raft_boltdb` ((#raft_boltdb)) **These fields are deprecated in Consul 1.15.0.
+  Use [`raft_logstore`](#raft_logstore) instead.** This is a nested
+  object that allows configuring options for Raft's BoltDB based log store.
+
+  - `NoFreelistSync` **This field is deprecated in Consul 1.15.0. Use the
+    [`raft_logstore.boltdb.no_freelist_sync`](#raft_logstore_boltdb_no_freelist_sync) field
+    instead.** Setting this to `true` will disable syncing the BoltDB freelist
+    to disk within the raft.db file. Not syncing the freelist to disk will
+    reduce disk IO required for write operations at the expense of potentially
+    increasing start up time due to needing to scan the db to discover where the
+    free space resides within the file.
+
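+  The migration away from the deprecated field can be sketched as follows.
+  This is a minimal illustration in HCL and assumes the field names are used
+  exactly as documented here. The two blocks are alternatives, not meant to
+  be set together.
+
+  ```hcl
+  # Deprecated in Consul 1.15.0 (older form)
+  raft_boltdb {
+    NoFreelistSync = true
+  }
+
+  # Replacement using the raft_logstore object
+  raft_logstore {
+    boltdb {
+      no_freelist_sync = true
+    }
+  }
+  ```
+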
+- `raft_logstore` ((#raft_logstore)) This is a nested object that allows
+  configuring options for Raft's LogStore component which is used to persist
+  logs and crucial Raft state on disk during writes. This was added in Consul
+  1.15.
+
+  - `backend` ((#raft_logstore_backend)) Specifies which storage
+    engine to use to persist logs. Valid options are `boltdb` or `wal`. Default
+    is `boltdb`. The `wal` option specifies an experimental backend that 
+    should be used with caution. Refer to 
+    [Experimental WAL LogStore backend](/consul/docs/agent/wal-logstore) 
+    for more information.
+
+  - `disable_log_cache` ((#raft_logstore_disable_log_cache)) This allows
+    disabling of the in-memory cache of recent logs. This exists mostly for
+    performance testing purposes. In theory the log cache prevents disk reads
+    for recent logs. In practice recent logs are still in OS page cache so tend
+    not to be slow to read using either backend. We recommend leaving it enabled
+    for now as we've not measured a significant improvement in any metric by
+    disabling.
+
+  - `verification` ((#raft_logstore_verification)) This is a nested object that
+    allows configuring online verification of the LogStore. Verification
+    provides additional assurances that LogStore backends are correctly storing
+    data. It imposes very low overhead on servers and is safe to run in
+    production, however it's mostly useful when evaluating a new backend
+    implementation.
+
+    Verification must be enabled on the leader to have any effect and can be
+    used with any backend. When enabled, the leader will periodically write a
+    special "checkpoint" log message including checksums of all log entries
+    written to Raft since the last checkpoint. Followers that have verification
+    enabled will run a background task for each checkpoint that reads all logs
+    directly from the LogStore and recomputes the checksum. A report is output
+    as an INFO level log for each checkpoint.
+
+    Checksum failure should never happen and indicate unrecoverable corruption
+    on that server. The only correct response is to stop the server, remove its
+    data directory, and restart so it can be caught back up with a correct
+    server again. Please report verification failures including details about
+    your hardware and workload via GitHub issues. Refer to 
+    [Experimental WAL LogStore backend](/consul/docs/agent/wal-logstore) 
+    for more information.
-    Checksum failure should never happen and indicate unrecoverable corruption
-    on that server. The only correct response is to stop the server, remove its
-    data directory, and restart so it can be caught back up with a correct
-    server again. Please report verification failures including details about
-    your hardware and workload via GitHub issues. Refer to 
-    [Experimental WAL LogStore backend](/consul/docs/agent/wal-logstore) 
-    for more information.
+    Checksum failure indicates unrecoverable corruption
+    on that server. The only corrective response is to stop the server, remove its
+    data directory, and then restart. These actions catch up the data directory with a correct
+    server again. Report verification failures including details about
+    your hardware and workload through GitHub issues. Refer to 
+    [Experimental WAL LogStore backend](/consul/docs/agent/wal-logstore) 
+    for more information.
-    Checksum failure should never happen and indicate unrecoverable corruption
-    on that server. The only correct response is to stop the server, remove its
-    data directory, and restart so it can be caught back up with a correct
-    server again. Please report verification failures including details about
-    your hardware and workload via GitHub issues. Refer to 
-    [Experimental WAL LogStore backend](/consul/docs/agent/wal-logstore) 
-    for more information.
+    Checksum failure indicates unrecoverable corruption
+    on that server. If this occurs, stop the server, remove the data directory, and restart so it can re-sync its state with the leader again. Report each verification failure including details about
+    your hardware and workload in a GitHub issue. Refer to 
+    [Experimental WAL LogStore backend](/consul/docs/agent/wal-logstore) 
+    for more information.
+
+    - `enabled` ((#raft_logstore_verification_enabled)) - Set to `true` to
+      allow this Consul server to write and verify log verification checkpoints
+      when it is elected leader.
+
+    - `interval` ((#raft_logstore_verification_interval)) - Specifies the time
+      interval between checkpoints. There is no default value. You must
+      configure the `interval` and set [`enabled`](#raft_logstore_verification_enabled)
+      to `true` for verification to take effect. We recommend using an interval
+      between `30s` and `5m`. The performance overhead is insignificant if the
+      interval is set to `5m` or less.
+
+  - `boltdb` ((#raft_logstore_boltdb)) - Object that configures options for 
+    Raft's `boltdb` backend. It has no effect if the `backend` is not `boltdb`.
+
+    - `no_freelist_sync` ((#raft_logstore_boltdb_no_freelist_sync)) - Set to
+      `true` to disable storing BoltDB's freelist to disk within the
+      `raft.db` file. Disabling freelist syncs reduces the disk IO required
+      for write operations, but can increase startup time
+      because Consul must scan the database to find free space
+      within the file.
+
+  - `wal` ((#raft_logstore_wal)) - Object that configures the `wal` backend.
+    Refer to [Experimental WAL LogStore backend](/consul/docs/agent/wal-logstore) 
+    for more information.
+
+    - `segment_size_mb` ((#raft_logstore_wal_segment_size_mb)) - Integer value 
+      that represents the target size in MB for each segment file before
+      rolling to a new segment. The default is `64` and is suitable for
+      most deployments. A smaller value may use less disk space because you 
+      can reclaim space by deleting old segments sooner, but a smaller segment 
+      may affect performance because safely rotating to a new file more
+      frequently could impact tail latencies. Larger values are unlikely
+      to improve performance significantly. We recommend using this 
+      configuration for performance testing purposes.
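+
+  For orientation, the nested `raft_logstore` fields described above could be
+  combined as in the following sketch. The values are illustrative
+  assumptions chosen from the ranges and defaults stated in this section, not
+  recommended production settings, and the `wal` backend remains experimental.
+
+  ```hcl
+  raft_logstore {
+    # Experimental backend; the default is "boltdb".
+    backend = "wal"
+
+    # Keep the in-memory log cache enabled, as recommended above.
+    disable_log_cache = false
+
+    verification {
+      # Both fields must be set for verification to run.
+      enabled  = true
+      interval = "60s" # within the suggested 30s to 5m range
+    }
+
+    wal {
+      # 64 is the documented default, shown here only for clarity.
+      segment_size_mb = 64
+    }
+  }
+  ```
+
+  The `boltdb` object is omitted here because it has no effect when `backend`
+  is not `boltdb`.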
 
 - `raft_protocol` ((#raft_protocol)) Equivalent to the [`-raft-protocol`
   command-line flag](/consul/docs/agent/config/cli-flags#_raft_protocol).