Previous change logs can be found at CHANGELOG-3.2.
v3.3.4 (2018-04-24)
See code changes and v3.3 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.3 upgrade guide.
- Add etcd_server_is_leader Prometheus metric.
- Fix etcd_debugging_server_lease_expired_total Prometheus metric.
- Fix race conditions in v2 server stat collecting.
- Fix TLS reload when certificate SAN field only includes IP addresses but no domain names.
  - In Go, the server calls (*tls.Config).GetCertificate for TLS reload if and only if the server's (*tls.Config).Certificates field is empty, or (*tls.ClientHelloInfo).ServerName is not empty with a valid SNI from the client. Previously, etcd always populated (*tls.Config).Certificates as non-empty on the initial client TLS handshake. Thus, the client was always expected to supply a matching SNI in order to pass TLS verification and to trigger (*tls.Config).GetCertificate to reload TLS assets.
  - However, a certificate whose SAN field includes only IP addresses and no domain names would request *tls.ClientHelloInfo with an empty ServerName field, thus failing to trigger the TLS reload on the initial TLS handshake; this becomes a problem when expired certificates need to be replaced online.
  - Now, (*tls.Config).Certificates is created empty on the initial TLS client handshake, first to trigger (*tls.Config).GetCertificate, and then to populate the rest of the certificates on every new TLS connection, even when client SNI is empty (e.g. cert only includes IPs).
- Add --initial-election-tick-advance flag to configure initial election tick fast-forward.
  - By default, --initial-election-tick-advance=true, and the local member fast-forwards election ticks to speed up the "initial" leader election trigger.
  - This benefits the case of larger election ticks. For instance, a cross-datacenter deployment may require a longer election timeout of 10 seconds. If true, the local node does not need to wait up to 10 seconds. Instead, it forwards its election ticks to 8 seconds, leaving only 2 seconds before leader election.
  - Major assumptions are that either the cluster has no active leader, so advancing ticks enables faster leader election, or the cluster already has an established leader, and the rejoining follower is likely to receive heartbeats from the leader after the tick advance and before the election timeout.
  - However, when the network from the leader to the rejoining follower is congested, and the follower does not receive a leader heartbeat within the remaining election ticks, a disruptive election has to happen, thus affecting cluster availability.
  - Now, this can be disabled by setting --initial-election-tick-advance=false.
  - Disabling this would slow down the initial bootstrap process for cross-datacenter deployments. Make tradeoffs by configuring --initial-election-tick-advance at the cost of slow initial bootstrap.
  - If single-node, it advances ticks regardless.
  - Address disruptive rejoining follower node.
- Add embed.Config.InitialElectionTickAdvance to enable/disable initial election tick fast-forward.
  - embed.NewConfig() would return *embed.Config with InitialElectionTickAdvance as true by default.
- Compile with Go 1.9.5.
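The TLS reload fix in this release hinges on when Go consults the GetCertificate callback. The following is a minimal sketch of that rule, modelled in Python for brevity. The function name and parameters are illustrative and not part of etcd or Go; the condition follows Go's documented behavior (the callback runs only if the client supplies SNI or the static Certificates list is empty).

```python
def calls_get_certificate(static_cert_count: int, client_sni: str) -> bool:
    """Model of the rule Go applies before calling (*tls.Config).GetCertificate.

    The callback is consulted only when the static Certificates list is
    empty or the client supplied a non-empty SNI (ServerName).
    """
    return static_cert_count == 0 or bool(client_sni)


# Before the fix: Certificates was pre-populated, and a client connecting
# by IP address sends no SNI, so the reload callback was never reached.
before = calls_get_certificate(static_cert_count=1, client_sni="")

# After the fix: Certificates starts empty, so the callback runs even
# when the client sends no SNI.
after = calls_get_certificate(static_cert_count=0, client_sni="")

print(before, after)
```

With a pre-populated list and no SNI the predicate is false, which is the IP-only SAN case described above; starting from an empty list makes it true regardless of SNI.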
v3.3.3 (2018-03-29)
See code changes and v3.3 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.3 upgrade guide.
- Adjust election timeout on server restart to reduce disruptive rejoining servers.
  - Previously, etcd fast-forwarded election ticks on server start, with only one tick left for leader election. This was to speed up the start phase, without having to wait until all election ticks elapse. Advancing election ticks is useful for cross-datacenter deployments with larger election timeouts. However, it affected cluster availability if the last tick elapsed before the leader contacted the restarted node.
  - Now, when etcd restarts, it adjusts election ticks with more than one tick left, giving the leader more time to prevent a disruptive restart.
- Adjust periodic compaction retention window.
  - e.g. --auto-compaction-mode=revision --auto-compaction-retention=1000 automatically Compact on "latest revision" - 1000 every 5 minutes (when latest revision is 30000, compact on revision 29000).
  - e.g. Previously, --auto-compaction-mode=periodic --auto-compaction-retention=72h automatically Compact with a 72-hour retention window every 7.2 hours. Now, Compact happens every 1 hour but still with a 72-hour retention window.
  - e.g. Previously, --auto-compaction-mode=periodic --auto-compaction-retention=30m automatically Compact with a 30-minute retention window every 3 minutes. Now, Compact happens every 30 minutes but still with a 30-minute retention window.
  - Periodic compactor keeps recording latest revisions for every compaction period when the given period is less than 1 hour, or for every 1 hour when the given compaction period is greater than 1 hour (e.g. 1 hour when --auto-compaction-mode=periodic --auto-compaction-retention=24h).
  - For every compaction period or 1 hour, compactor uses the last revision that was fetched before the compaction period to discard historical data.
  - The retention window of compaction period moves for every given compaction period or hour.
  - For instance, when hourly writes are 100 and --auto-compaction-mode=periodic --auto-compaction-retention=24h, v3.2.x, v3.3.0, v3.3.1, and v3.3.2 compact revision 2400, 2640, and 2880 every 2.4 hours, while v3.3.3 or later compacts revision 2400, 2500, 2600 every 1 hour.
  - Furthermore, when --auto-compaction-mode=periodic --auto-compaction-retention=30m and writes per minute are about 1000, v3.3.0, v3.3.1, and v3.3.2 compact revision 30000, 33000, and 36000 every 3 minutes, while v3.3.3 or later compacts revision 30000, 60000, and 90000 every 30 minutes.
- Add missing etcd_network_peer_sent_failures_total count.
- Compile with Go 1.9.5.
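The change in periodic compaction frequency described in this release can be sketched as two small functions. This is only an arithmetic model built from the examples above (1/10 of the retention period before v3.3.3; the retention period itself, capped at one hour, from v3.3.3). The function names are illustrative and do not correspond to etcd code.

```python
from datetime import timedelta

ONE_HOUR = timedelta(hours=1)


def interval_legacy(retention: timedelta) -> timedelta:
    """v3.2.x through v3.3.2: compact every 1/10 of the retention period."""
    return retention / 10


def interval_current(retention: timedelta) -> timedelta:
    """v3.3.3 and later: compact every retention period when it is under
    one hour, otherwise every hour. The retention window is unchanged."""
    return min(retention, ONE_HOUR)


for retention in (timedelta(hours=72), timedelta(minutes=30)):
    print(retention, interval_legacy(retention), interval_current(retention))
```

For a 72-hour retention this gives 7.2 hours versus 1 hour, and for a 30-minute retention 3 minutes versus 30 minutes, matching the examples in the list above.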
v3.3.2 (2018-03-08)
See code changes and v3.3 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.3 upgrade guide.
- Fix server panic on invalid Election Proclaim/Resign HTTP(S) requests.
  - Previously, wrong-formatted HTTP requests to Election API could trigger panic in etcd server.
  - e.g. curl -L http://localhost:2379/v3/election/proclaim -X POST -d '{"value":""}', curl -L http://localhost:2379/v3/election/resign -X POST -d '{"value":""}'.
- Fix revision-based compaction retention parsing.
  - Previously, --auto-compaction-mode revision --auto-compaction-retention 1 was translated to revision retention 3600000000000.
  - Now, --auto-compaction-mode revision --auto-compaction-retention 1 is correctly parsed as revision retention 1.
- Prevent overflow by large TTL values for Lease Grant.
  - TTL parameter to Grant request is unit of second.
  - Leases with too large TTL values exceeding math.MaxInt64 expire in unexpected ways.
  - Server now returns rpctypes.ErrLeaseTTLTooLarge to client, when the requested TTL is larger than 9,000,000,000 seconds (which is >285 years).
  - Again, etcd Lease is meant for short-periodic keepalives or sessions, in the range of seconds or minutes. Not for hours or days!
- Enable etcd server raft.Config.CheckQuorum when starting with ForceNewCluster.
- Compile with Go 1.9.4.
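The 9,000,000,000-second lease TTL limit in this release can be checked with a little arithmetic. The sketch below assumes the overflow arises when a TTL in seconds is converted to nanoseconds in a signed 64-bit integer (as with Go's time.Duration); that assumption is not stated in the changelog. The constant and function names are illustrative, and the ValueError merely stands in for rpctypes.ErrLeaseTTLTooLarge.

```python
MAX_INT64 = 2**63 - 1
MAX_LEASE_TTL_SECONDS = 9_000_000_000  # limit quoted in the changelog
NANOS_PER_SECOND = 1_000_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def fits_in_int64_nanoseconds(ttl_seconds: int) -> bool:
    """True if the TTL, expressed in nanoseconds, fits in a signed 64-bit int."""
    return ttl_seconds * NANOS_PER_SECOND <= MAX_INT64


def check_lease_ttl(ttl_seconds: int) -> int:
    """Reject TTLs above the limit; a stand-in for the server-side check."""
    if ttl_seconds > MAX_LEASE_TTL_SECONDS:
        raise ValueError("lease TTL too large")
    return ttl_seconds


# The limit itself still fits; slightly larger values would overflow.
print(fits_in_int64_nanoseconds(MAX_LEASE_TTL_SECONDS))
print(fits_in_int64_nanoseconds(9_300_000_000))
# The limit expressed in years (the changelog says ">285 years").
print(MAX_LEASE_TTL_SECONDS / SECONDS_PER_YEAR)
```

Under this assumption the limit sits just below the point where the nanosecond value would no longer fit, which is consistent with the ">285 years" figure.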
v3.3.1 (2018-02-12)
See code changes and v3.3 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.3 upgrade guide.
- Add warnings on requests taking too long.
  - e.g. etcdserver: read-only range request "key:\"\\000\" range_end:\"\\000\" " took too long [3.389041388s] to execute
- Fix mvcc "unsynced" watcher restore operation.
  - An "unsynced" watcher is a watcher that needs to be in sync with events that have happened.
  - That is, an "unsynced" watcher is the slow watcher that was requested on an old revision.
  - The "unsynced" watcher restore operation was not correctly populating its underlying watcher group.
  - This could cause missing events from "unsynced" watchers.
- Compile with Go 1.9.4.
v3.3.0 (2018-02-01)
See code changes and v3.3 upgrade guide for any breaking changes. Again, before running upgrades from any previous release, please make sure to read change logs below and v3.3 upgrade guide.
- v3.3.0-rc.4 (2018-01-22), see code changes.
- v3.3.0-rc.3 (2018-01-17), see code changes.
- v3.3.0-rc.2 (2018-01-11), see code changes.
- v3.3.0-rc.1 (2018-01-02), see code changes.
- v3.3.0-rc.0 (2017-12-20), see code changes.
- Use coreos/bbolt to replace boltdb/bolt.
- Support database size larger than 8GiB (8GiB is now a suggested maximum size for normal environments)
- Reduce memory allocation on Range operations.
- Rate limit and randomize lease revoke on restart or leader elections.
- Prevent spikes in Raft proposal rate.
- Support clientv3 balancer failover under network faults/partitions.
- Better warning on mismatched --initial-cluster flag.
  - etcd compares --initial-advertise-peer-urls against corresponding --initial-cluster URLs with forward-lookup.
  - If resolved IP addresses of --initial-advertise-peer-urls and --initial-cluster do not match (e.g. due to DNS error), etcd will exit with errors.
    - v3.2 error: --initial-cluster must include s1=https://s1.test:2380 given --initial-advertise-peer-urls=https://s1.test:2380.
    - v3.3 error: failed to resolve https://s1.test:2380 to match --initial-cluster=s1=https://s1.test:2380 (failed to resolve "https://s1.test:2380" (error ...)).
- Require google.golang.org/grpc v1.7.4 or v1.7.5.
  - Deprecate metadata.Incoming/OutgoingContext.
  - Deprecate grpclog.Logger, upgrade to grpclog.LoggerV2.
  - Deprecate grpc.ErrClientConnTimeout errors in clientv3.
  - Use MaxRecvMsgSize and MaxSendMsgSize to limit message size, in etcd server.
- Translate gRPC status error in v3 client Snapshot API.
- v3 etcdctl lease timetolive LEASE_ID on expired lease now prints "lease LEASE_ID already expired".
  - <=3.2 prints "lease LEASE_ID granted with TTL(0s), remaining(-1s)".
- Replace gRPC gateway endpoint /v3alpha with /v3beta.
  - To deprecate /v3alpha in v3.4.
  - In v3.3, curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' still works as a fallback to curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}', but curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' won't work in v3.4. Use curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' instead.
- Change --auto-compaction-retention flag to accept string values with finer granularity.
  - Now that --auto-compaction-retention accepts string values, etcd configuration YAML file auto-compaction-retention field must be changed to string type.
  - Previously, --config-file etcd.config.yaml can have auto-compaction-retention: 24 field, now must be auto-compaction-retention: "24" or auto-compaction-retention: "24h".
  - If configured as --auto-compaction-mode periodic --auto-compaction-retention "24h", the time duration value for --auto-compaction-retention flag must be valid for time.ParseDuration function in Go.
- Upgrade boltdb/bolt from v1.3.0 to coreos/bbolt v1.3.1-coreos.6.
- Upgrade google.golang.org/grpc from v1.2.1 to v1.7.5.
- Upgrade github.com/ugorji/go/codec to v1.1, and regenerate v2 client.
- Upgrade github.com/ugorji/go/codec to ugorji/go@54210f4e0, and regenerate v2 client.
- Upgrade github.com/grpc-ecosystem/grpc-gateway from v1.2.2 to v1.3.0.
- Upgrade golang.org/x/crypto/bcrypt to golang/crypto@6c586e17d.
- Add etcd --listen-metrics-urls flag for additional /metrics endpoints.
  - Useful for bypassing critical APIs when monitoring etcd.
- Add etcd_server_version Prometheus metric.
  - To replace Kubernetes etcd-version-monitor.
- Add etcd_debugging_mvcc_db_compaction_keys_total Prometheus metric.
- Add etcd_debugging_server_lease_expired_total Prometheus metric.
  - To improve lease revoke monitoring.
- Document Prometheus 2.0 rules.
- Initialize gRPC server metrics with zero values.
- Fix range/put/delete operation metrics with transaction.
  - etcd_debugging_mvcc_range_total
  - etcd_debugging_mvcc_put_total
  - etcd_debugging_mvcc_delete_total
  - etcd_debugging_mvcc_txn_total
- Fix etcd_debugging_mvcc_keys_total on restore.
- Fix etcd_debugging_mvcc_db_total_size_in_bytes on restore.
  - Also change to prometheus.NewGaugeFunc.
See security doc for more details.
- Add CRL based connection rejection to manage revoked certs.
- Document TLS authentication changes.
  - Server accepts connections if IP matches, without checking DNS entries. For instance, if the peer cert contains IP addresses and DNS names in the Subject Alternative Name (SAN) field, and the remote IP address matches one of those IP addresses, the server just accepts the connection without further checking the DNS names.
  - Server supports reverse-lookup on wildcard DNS SAN. For instance, if the peer cert contains only DNS names (no IP addresses) in the Subject Alternative Name (SAN) field, the server first reverse-lookups the remote IP address to get a list of names mapping to that address (e.g. nslookup IPADDR). It then accepts the connection if any of those names matches one of the peer cert's DNS names (either by exact or wildcard match). If none matches, the server forward-lookups each DNS entry in the peer cert (e.g. looks up example.default.svc when the entry is *.example.default.svc), and accepts the connection only when the host's resolved addresses include the peer's remote IP address.
- Add etcd --peer-cert-allowed-cn flag.
  - To support CommonName(CN) based auth for inter peer connection.
- Swap priority of cert CommonName(CN) and username + password.
- Protect lease revoke with auth.
- Provide user's role on auth permission error.
- Fix auth store panic with disabled token.
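The peer SAN checks described under the TLS authentication changes above can be sketched as a three-step decision. This is only an illustration of the order of checks as the text describes them, not etcd's implementation: the function names are hypothetical, the resolvers are injected as plain callables so the example runs without network access, IP addresses are compared as strings, and falling through to the DNS checks when the cert lists IPs that do not match is an assumption.

```python
from typing import Callable, Iterable, List


def dns_name_matches(pattern: str, name: str) -> bool:
    """Exact match, or a '*.' wildcard covering exactly one leading label."""
    pattern, name = pattern.lower().rstrip("."), name.lower().rstrip(".")
    if pattern.startswith("*."):
        label, _, rest = name.partition(".")
        return bool(label) and rest == pattern[2:]
    return pattern == name


def accept_peer(
    remote_ip: str,
    cert_ips: Iterable[str],
    cert_dns_names: Iterable[str],
    reverse_lookup: Callable[[str], List[str]],
    forward_lookup: Callable[[str], List[str]],
) -> bool:
    cert_dns_names = list(cert_dns_names)
    # 1. An IP match is sufficient; DNS names are not consulted.
    if remote_ip in set(cert_ips):
        return True
    # 2. Reverse-lookup the remote IP and compare against the cert's DNS names.
    for name in reverse_lookup(remote_ip):
        if any(dns_name_matches(p, name) for p in cert_dns_names):
            return True
    # 3. Forward-lookup each DNS entry (wildcard prefix stripped) and
    #    accept only if the remote IP is among the resolved addresses.
    for pattern in cert_dns_names:
        host = pattern[2:] if pattern.startswith("*.") else pattern
        if remote_ip in forward_lookup(host):
            return True
    return False


# Fake resolver tables, used here instead of real DNS.
fake_ptr = {"10.0.0.7": ["etcd-0.example.default.svc."]}
fake_a = {"example.default.svc": ["10.0.0.9"]}

print(accept_peer("10.0.0.7", [], ["*.example.default.svc"],
                  lambda ip: fake_ptr.get(ip, []),
                  lambda host: fake_a.get(host, [])))
```

Injecting the lookups keeps the decision logic testable in isolation; a real caller would pass functions backed by the system resolver.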
- Add --experimental-initial-corrupt-check flag to check cluster database hashes before serving client/peer traffic.
  - --experimental-initial-corrupt-check=false by default.
  - v3.4 will enable --initial-corrupt-check=true by default.
- Add --experimental-corrupt-check-time flag to raise corrupt alarm monitoring.
  - --experimental-corrupt-check-time=0s disabled by default.
- Add --experimental-enable-v2v3 flag to emulate v2 API with v3.
  - --experimental-enable-v2v3=false by default.
- Add --max-txn-ops flag to configure maximum number operations in transaction.
- Add --max-request-bytes flag to configure maximum client request size.
  - If not configured, it defaults to 1.5 MiB.
- Add --client-crl-file, --peer-crl-file flags for Certificate revocation list.
- Add --peer-cert-allowed-cn flag to support CN-based auth for inter-peer connection.
- Add --listen-metrics-urls flag for additional /metrics endpoints.
  - Support additional (non) TLS /metrics endpoints for a TLS-enabled cluster.
  - e.g. --listen-metrics-urls=https://localhost:2378,http://localhost:9379 to serve /metrics in secure port 2378 and insecure port 9379.
  - Useful for bypassing critical APIs when monitoring etcd.
- Add --auto-compaction-mode flag to support revision-based compaction.
- Change --auto-compaction-retention flag to accept string values with finer granularity.
  - Now that --auto-compaction-retention accepts string values, etcd configuration YAML file auto-compaction-retention field must be changed to string type.
  - Previously, --config-file etcd.config.yaml can have auto-compaction-retention: 24 field, now must be auto-compaction-retention: "24" or auto-compaction-retention: "24h".
  - If configured as --auto-compaction-mode periodic --auto-compaction-retention "24h", the time duration value for --auto-compaction-retention flag must be valid for time.ParseDuration function in Go.
  - e.g. --auto-compaction-mode=revision --auto-compaction-retention=1000 automatically Compact on "latest revision" - 1000 every 5 minutes (when latest revision is 30000, compact on revision 29000).
  - e.g. --auto-compaction-mode=periodic --auto-compaction-retention=72h automatically Compact with a 72-hour retention window, every 7.2 hours.
  - e.g. --auto-compaction-mode=periodic --auto-compaction-retention=30m automatically Compact with a 30-minute retention window, every 3 minutes.
  - Periodic compactor continues to record latest revisions for every 1/10 of given compaction period (e.g. 1 hour when --auto-compaction-mode=periodic --auto-compaction-retention=10h).
  - For every 1/10 of given compaction period, compactor uses the last revision that was fetched before the compaction period to discard historical data.
  - The retention window of compaction period moves for every 1/10 of given compaction period.
  - For instance, when hourly writes are 100 and --auto-compaction-retention=10, v3.1 compacts revision 1000, 2000, and 3000 every 10 hours, while v3.2.x, v3.3.0, v3.3.1, and v3.3.2 compact revision 1000, 1100, and 1200 every 1 hour. Furthermore, when writes per minute are 1000, v3.3.0, v3.3.1, and v3.3.2 with --auto-compaction-mode=periodic --auto-compaction-retention=30m compact revision 30000, 33000, and 36000 every 3 minutes, with finer granularity.
  - Whether compaction succeeds or not, this process repeats for every 1/10 of given compaction period. If compaction succeeds, it just removes compacted revision from historical revision records.
- Add --grpc-keepalive-min-time, --grpc-keepalive-interval, --grpc-keepalive-timeout flags to configure server-side keepalive policies.
- Serve /health endpoint as unhealthy when alarm (e.g. NOSPACE) is raised or there's no leader.
  - Define etcdhttp.Health struct with JSON encoder.
  - Note that "health" field is string type, not bool.
    - e.g. {"health":"false"}, {"health":"true"}
  - Remove "errors" field since v3.3.0-rc.3 (did exist only in v3.3.0-rc.0, v3.3.0-rc.1, v3.3.0-rc.2).
- Move logging setup to embed package
  - Disable gRPC server info-level logs by default (can be enabled with etcd --debug flag).
- Use monotonic time in Go 1.9 for lease package.
- Warn on empty hosts in advertise URLs.
  - Address advertise client URLs accepts empty hosts.
  - etcd v3.4 will exit on this error.
    - e.g. --advertise-client-urls=http://:2379.
- Warn on shadowed environment variables.
  - Address error on shadowed environment variables.
  - etcd v3.4 will exit on this error.
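Because the "health" field of the /health response is a string rather than a boolean, a client has to compare it explicitly. A minimal sketch follows, using only the two response bodies quoted above; the function name is illustrative.

```python
import json


def is_healthy(body: str) -> bool:
    """Interpret a /health response whose "health" field is a string."""
    return json.loads(body).get("health") == "true"


print(is_healthy('{"health":"true"}'), is_healthy('{"health":"false"}'))

# A naive truthiness check gets the unhealthy case wrong, because any
# non-empty string (including "false") is truthy in Python.
print(bool(json.loads('{"health":"false"}')["health"]))
```

The same pitfall applies in any language that coerces non-empty strings to true, so comparing against the literal "true" is the safer reading of the response.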
- Support ranges in transaction comparisons for disconnected linearized reads.
- Add nested transactions to extend proxy use cases.
- Add lease comparison target in transaction.
- Add lease list.
- Add hash by revision for better corruption checking against boltdb.
- Add health balancer to fix watch API hangs, improve endpoint switch under network faults.
- Refactor balancer and add client-side keepalive pings to handle network partitions.
- Add MaxCallSendMsgSize and MaxCallRecvMsgSize fields to clientv3.Config.
  - Fix exceeded response size limit error in client-side.
  - Address kubernetes#51099.
  - In previous versions (v3.2.10, v3.2.11), client response size was limited to only 4 MiB.
  - MaxCallSendMsgSize default value is 2 MiB, if not configured.
  - MaxCallRecvMsgSize default value is math.MaxInt32, if not configured.
- Accept Compare_LEASE in clientv3.Compare.
- Add LeaseValue helper to Cmp LeaseID values in Txn.
- Add MoveLeader to Maintenance.
- Add HashKV to Maintenance.
- Add Leases to Lease.
- Add clientv3/ordering to enforce ordering in serialized requests.
- Fix "put at-most-once" violation.
- Fix WatchResponse.Canceled on compacted watch request.
- Fix concurrency/stm Put with serializable snapshot.
  - Use store revision from first fetch to resolve write conflicts instead of modified revision.
- Add --discovery-srv flag.
- Add --keepalive-time, --keepalive-timeout flags.
- Add lease list command.
- Add lease keep-alive --once flag.
- Make lease timetolive LEASE_ID on expired lease print lease LEASE_ID already expired.
  - <=3.2 prints lease LEASE_ID granted with TTL(0s), remaining(-1s).
- Add snapshot restore --wal-dir flag.
- Add defrag --data-dir flag.
- Add move-leader command.
- Add endpoint hashkv command.
- Add endpoint --cluster flag, equivalent to v2 etcdctl cluster-health.
- Make endpoint health command terminate with non-zero exit code on unhealthy status.
- Add lock --ttl flag.
- Support watch [key] [range_end] -- [exec-command…], equivalent to v2 etcdctl exec-watch.
  - Make watch -- [exec-command] set environmental variables ETCD_WATCH_REVISION, ETCD_WATCH_EVENT_TYPE, ETCD_WATCH_KEY, ETCD_WATCH_VALUE for each event.
- Support watch with environmental variables ETCDCTL_WATCH_KEY and ETCDCTL_WATCH_RANGE_END.
- Enable clientv3.WithRequireLeader(context.Context) for watch command.
- Print "del" instead of "delete" in txn interactive mode.
- Print ETCD_INITIAL_ADVERTISE_PEER_URLS in member add.
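As an illustration of the watch exec-command support listed above, a handler program can read the per-event environment variables that etcdctl sets. The following is a minimal sketch; the function name is illustrative, and the exact formatting of the variable values (for example, whether they are quoted) is not specified in this changelog, so the handler prints them as received.

```python
import os
from typing import Mapping

# Variable names as listed in the changelog entry for watch exec-command.
WATCH_VARS = (
    "ETCD_WATCH_REVISION",
    "ETCD_WATCH_EVENT_TYPE",
    "ETCD_WATCH_KEY",
    "ETCD_WATCH_VALUE",
)


def describe_event(env: Mapping[str, str]) -> str:
    """Format the per-event variables into a single log line."""
    return " ".join(f"{name}={env.get(name, '<unset>')}" for name in WATCH_VARS)


if __name__ == "__main__":
    # When run as the exec-command, the variables come from the process
    # environment; outside of a watch they are simply reported as unset.
    print(describe_event(os.environ))
```

Saved under a hypothetical name such as handler.py, it would be passed after the -- separator of the watch command, following the watch [key] [range_end] -- [exec-command…] form.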
- Handle empty key permission in etcdctl.
- Add backup --with-v3 flag.
- Add grpc-proxy start --experimental-leasing-prefix flag.
  - For disconnected linearized reads.
  - Based on V system leasing.
  - See "Disconnected consistent reads with etcd" blog post.
- Add grpc-proxy start --experimental-serializable-ordering flag.
  - To ensure serializable reads have monotonically increasing store revisions across endpoints.
- Add grpc-proxy start --metrics-addr flag for an additional /metrics endpoint.
  - Set --metrics-addr=http://[HOST]:9379 to serve /metrics in insecure port 9379.
- Serve /health endpoint in grpc-proxy.
- Add grpc-proxy start --debug flag.
- Add grpc-proxy start --max-send-bytes flag to configure maximum client request size.
- Add grpc-proxy start --max-recv-bytes flag to configure maximum client request size.
- Fix Snapshot API error handling.
- Fix KV API PrevKv flag handling.
- Fix KV API KeysOnly flag handling.
- Replace gRPC gateway endpoint /v3alpha with /v3beta.
  - To deprecate /v3alpha in v3.4.
  - In v3.3, curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' still works as a fallback to curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}', but curl -L http://localhost:2379/v3alpha/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' won't work in v3.4. Use curl -L http://localhost:2379/v3beta/kv/put -X POST -d '{"key": "Zm9v", "value": "YmFy"}' instead.
- Support "authorization" token.
- Support websocket for bi-directional streams.
- Upgrade gRPC gateway to v1.3.0.
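The key and value in the gateway curl examples above are base64-encoded: "Zm9v" is "foo" and "YmFy" is "bar". The sketch below builds the same request body for the /v3beta path; the helper names are illustrative, and no request is actually sent.

```python
import base64
import json


def b64(text: str) -> str:
    """Base64-encode a UTF-8 string, as the gateway JSON body expects."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def put_request(key: str, value: str, api_prefix: str = "/v3beta") -> tuple:
    """Return (path, JSON body) for a KV put through the gRPC gateway."""
    body = json.dumps({"key": b64(key), "value": b64(value)})
    return f"{api_prefix}/kv/put", body


print(put_request("foo", "bar"))
```

Passing api_prefix="/v3alpha" produces the fallback path that still works in v3.3 but is slated for removal in v3.4.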
- Fix backend database in-memory index corruption issue on restore (only 3.2.0 is affected).
- Fix watch restore from snapshot.
- Fix mvcc/backend.defragdb nil-pointer dereference on create bucket failure.
- Fix server crash on invalid transaction request from gRPC gateway.
- Prevent server panic from member update/add with wrong scheme URLs.
- Make peer dial timeout longer.
- See coreos/etcd-operator#1300 for more detail.
- Make server wait up to request time-out with pending RPCs.
- Fix grpc.Server panic on GracefulStop with TLS-enabled server.
- Fix "multiple peer URLs cannot start" issue.
- Fix server-side auth so concurrent auth operations do not return old revision error.
- Handle WAL renaming failure on Windows.
- Upgrade coreos/go-systemd to v15 (see https://github.com/coreos/go-systemd/releases/tag/v15).
- Fail-over v2 client to next endpoint on oneshot failure.
- Put back /v2/machines endpoint for python-etcd wrapper.
- Add non-voting member.
  - To implement Raft thesis 4.2.1 Catching up new servers.
  - Learner node does not vote or promote itself.
- Support previous two minor versions (see our new release policy).
- v3.3.x is the last release cycle that supports ACI.
  - AppC was officially suspended, as of late 2016.
  - acbuild is not maintained anymore.
  - *.aci files won't be available from etcd v3.4 release.
- Add container registry gcr.io/etcd-development/etcd.
  - quay.io/coreos/etcd is still supported as secondary.
- Require Go 1.9+.
- Compile with Go 1.9.3.
- Deprecate golang.org/x/net/context.