
physical/postgresql: Add support for high availability #2700

Closed
wants to merge 12 commits into from

Conversation

louis-paul

Are there any e2e tests to run for HA? Thanks!

@deverton
Contributor

Would this be better implemented with advisory locks? https://www.postgresql.org/docs/9.1/static/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS
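For context, a minimal sketch of what the advisory-lock approach could look like from Go. pg_try_advisory_lock and pg_advisory_unlock are standard PostgreSQL functions; the helper shape, lock ID, and error handling here are illustrative rather than code from this PR:

package postgresql

import (
  "context"
  "database/sql"

  _ "github.com/lib/pq"
)

// tryAdvisoryLeaderLock sketches a session-level advisory lock. The lock lives
// for as long as the pinned *sql.Conn does, which is why it cannot be taken
// through the shared connection pool.
func tryAdvisoryLeaderLock(ctx context.Context, db *sql.DB, lockID int64) (*sql.Conn, bool, error) {
  conn, err := db.Conn(ctx) // dedicate one connection; advisory locks are per-session
  if err != nil {
    return nil, false, err
  }
  var acquired bool
  if err := conn.QueryRowContext(ctx, "SELECT pg_try_advisory_lock($1)", lockID).Scan(&acquired); err != nil {
    conn.Close()
    return nil, false, err
  }
  if !acquired {
    conn.Close()
    return nil, false, nil // someone else is the leader; retry later
  }
  // Caller holds leadership until it runs pg_advisory_unlock(lockID) on this
  // connection or the session is closed.
  return conn, true, nil
}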

@louis-paul
Author

louis-paul commented Jun 29, 2017 via email

@jefferai
Member

jefferai commented Jul 1, 2017

Honestly, it sounds like advisory locks are the correct way to go. Compared to the traffic coming from Vault generally, an additional single connection with low traffic doesn't seem like a bad tradeoff for getting HA capabilities. That might just be my postgres naivete coming through though.

@louis-paul
Author

A particularity of PostgreSQL is that client connections are very expensive for servers, more so than for other databases. From my (limited) understanding of the internals of Postgres, each connection is a separate process with a lot of overhead. Connection pooling middleware (pgbouncer or pgpool) is commonly used to have fewer connections with more activity. The PostgreSQL wiki has a page on connection count tuning.

For reference, Vault seems to currently use 2 connections per instance (the default from database/sql).

I wanted to highlight the potential tradeoffs; I would be happy to rewrite the pull request using actual locks if you think that's best. Having the opinion of someone with more experience administering PostgreSQL could be valuable as well.
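As a point of reference for the connection-count concern: database/sql keeps an idle pool of two connections by default and does not cap open connections, so a backend that wants to stay predictable can bound its footprint explicitly. A small sketch, with illustrative limits that are not part of this PR:

package postgresql

import (
  "database/sql"
  "time"

  _ "github.com/lib/pq"
)

// openBoundedDB opens the backend's pool with explicit limits so the extra HA
// traffic cannot grow the PostgreSQL session count unexpectedly. The numbers
// are illustrative only.
func openBoundedDB(connURL string) (*sql.DB, error) {
  db, err := sql.Open("postgres", connURL)
  if err != nil {
    return nil, err
  }
  db.SetMaxOpenConns(4)                   // hard cap on sessions per Vault instance
  db.SetMaxIdleConns(2)                   // database/sql's default idle-pool size
  db.SetConnMaxLifetime(30 * time.Minute) // recycle sessions; plays nicely with pgbouncer/pgpool
  return db, nil
}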

@jefferai
Member

jefferai commented Jul 3, 2017

Yeah -- I don't have that experience, and I don't know that anyone on the Vault team does. I just know that whenever postgres HA has been discussed in the past, advisory locks were deemed the correct approach. I have someone in mind to loop in; I'll try to get them to comment.

Contributor

@sean- sean- left a comment

This looks like a great start. There are a number of little changes. The length of the review is indicative of some nits that predate @louis-paul and of my trying to dry-code some of the suggestions.

There are two primary changes that I'd like to see with this: 1) the DELETE should happen after the lock has been acquired. I think this needs to happen in a different way where all Vault servers perform an INSERT and then the election happens as an UPDATE. 2) The renewal should be automatically calculated based on the lockTTL as described in the doc comments and elsewhere throughout the review.

I've actually come full circle on advisory locks for PostgreSQL HA. To perform a correct hand-off between two different PostgreSQL backends when a failover happens faster than the lock duration, or when using something like pl_paxos or CockroachDB, a heartbeat table like this is actually the most stable way to go.

In the future I can see an additional configuration parameter that would possibly modify this behavior to do something backend specific, but for the time being, the most universally useful and reliable way to do HA PostgreSQL is to use a heartbeat table and poll the database every poll_interval.
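A rough sketch of the loop shape being described, where followers poll every poll_interval and the leader refreshes its heartbeat row halfway through lock_ttl; the tryAcquire and renewLease helpers are hypothetical stand-ins for the SQL discussed below:

package postgresql

import "time"

// heartbeatLoop sketches the polling/renewal cadence: followers poll every
// pollInterval, the leader renews its lease every lockTTL/2, and a failed
// renewal drops back to polling.
func heartbeatLoop(lockTTL, pollInterval time.Duration, stopCh <-chan struct{},
  tryAcquire func() bool, renewLease func() error) {

  renewTicker := time.NewTicker(lockTTL / 2)
  defer renewTicker.Stop()
  pollTicker := time.NewTicker(pollInterval)
  defer pollTicker.Stop()

  holding := false
  for {
    select {
    case <-pollTicker.C:
      if !holding {
        holding = tryAcquire() // e.g. an UPDATE that only succeeds once the old lease expired
      }
    case <-renewTicker.C:
      if holding {
        if err := renewLease(); err != nil {
          holding = false // lost the lease; go back to polling
        }
      }
    case <-stopCh:
      return
    }
  }
}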

@@ -39,6 +56,12 @@ func newPostgreSQLBackend(conf map[string]string, logger log.Logger) (Backend, e
}
quoted_table := pq.QuoteIdentifier(unquoted_table)

unquoted_lock_table, ok := conf["lock_table"]
if !ok {
unquoted_lock_table = "vault_lock"
Contributor

Add the following to the const section at the top of the file:

DefaultPostgreSQLLockTable = "vault_lock"

And then this becomes:

if !ok {
  unquotedLockTable = DefaultPostgreSQLLockTable
}

const (
// PostgreSQLLockRetryInterval is the interval of time to wait between new
// locking attempts
PostgreSQLLockRetryInterval = time.Second
Contributor

Prefix with Default; I changed it from "Retry" to "Poll" because that's how it's being used further below: s/PostgreSQLLockRetryInterval/DefaultPostgreSQLPollInterval/g

DefaultPostgreSQLPollInterval = 1 * time.Second

PostgreSQLLockRetryInterval = time.Second
// PostgreSQLLockErrorRetryMax is the number of retries to make after an
// error fetching the status of a lock
PostgreSQLLockErrorRetryMax = 5
Contributor

Let's remove PostgreSQLLockErrorRetryMax altogether. We're mixing different dimensions of liveness (time and counts). Everything else here is specified in terms of time, so let's stick with that.

// error fetching the status of a lock
PostgreSQLLockErrorRetryMax = 5
// PostgreSQLLockTTL is the maximum length of time of a lock, in seconds
PostgreSQLLockTTL = 10
Contributor

Prefix with Default: s/PostgreSQLLockTTL/DefaultPostgreSQLLockTTL/g

Also, let's explicitly type PostgreSQLLockTTL so that the compiler will catch any uses of a negative value:

DefaultPostgreSQLLockTTL = uint(10)

// or better yet:

DefaultPostgreSQLLockTTL = 10 * time.Second

PostgreSQLLockTTL is assumed to be a positive integer further down in the code. Either a uint or preferably let's move this to a time.Duration. Further below I add a small test for this as a parameter. The rest of this review assumes a time.Duration.

unquoted_lock_table = "vault_lock"
}
quoted_lock_table := pq.QuoteIdentifier(unquoted_lock_table)

Contributor

Can we add configuration parameters? Something like:

// Move this to the top of the file with the other consts
const (
  PostgreSQLLockTTLConf = "lock_ttl"
  PostgreSQLPollIntervalConf = "poll_interval"
  PostgreSQLLockTableConf = "lock_table"
  PostgreSQLLockSchemaConf = "lock_schema"
  MinimumPostgreSQLPollInterval = 1 * time.Second
  MinimumPostgreSQLLockGracePeriod = 1 * time.Second

  // Meta-comment: I am explicitly not using "public" as the default schema so that the DBA caring for this instance can perform an ALTER ROLE vault_user SET search_path = 'hashicorp' or something and have it just work from either the DBA's perspective or Vault's perspective.  TL;DR: it is not okay to explicitly use or suggest the use of the public schema even though it may be the default.
  DefaultPostgreSQLSchema = ""
  DefaultPostgreSQLLockTable = "vault_lock"
)

var lockTTL time.Duration
{
  rawLockTTL, found := conf[PostgreSQLLockTTLConf]
  if found {
    var err error
    if lockTTL, err = time.ParseDuration(rawLockTTL); err != nil {
      return nil, fmt.Errorf("%s error: %v", PostgreSQLLockTTLConf, err)
    }
  } else {
    lockTTL = DefaultPostgreSQLLockTTL
  }
}

var pollInterval time.Duration
{
  rawPollInterval, found := conf[PostgreSQLPollIntervalConf]
  if found {
    var err error
    if pollInterval, err = time.ParseDuration(rawPollInterval); err != nil {
      return nil, fmt.Errorf("%s error: %v", PostgreSQLPollIntervalConf, err)
    }
  } else {
    pollInterval = DefaultPostgreSQLPollInterval
  }
}

var lockTableName string
{
  rawLockTableName, found := conf[PostgreSQLLockTableConf]
  if found {
    lockTableName = pq.QuoteIdentifier(strings.TrimSpace(rawLockTableName))
  } else {
    lockTableName = DefaultPostgreSQLLockTable
  }
}

var lockSchemaName string
{
  rawLockSchemaName, found := conf[PostgreSQLLockSchemaConf]
  if found {
    lockSchemaName = pq.QuoteIdentifier(strings.TrimSpace(rawLockSchemaName))
  } else {
    lockSchemaName = DefaultPostgreSQLSchema
  }
}

// Sanity check inputs
if pollInterval < 0 {
  return nil, fmt.Errorf("%s (%q) must be a positive time duration", PostgreSQLPollIntervalConf, pollInterval)
}

if !(pollInterval < lockTTL) {
  return nil, fmt.Errorf("%s (%q) must be smaller than the %s (%q)", PostgreSQLPollIntervalConf, pollInterval, PostgreSQLLockTTLConf, lockTTL)
}

if pollInterval < MinimumPostgreSQLPollInterval {
  return nil, fmt.Errorf("%s (%q) can not be less than %q", PostgreSQLPollIntervalConf, pollInterval, MinimumPostgreSQLPollInterval)
}

if lockTTL - pollInterval < MinimumPostgreSQLLockGracePeriod {
  return nil, fmt.Errorf("There must be at least %s between the %s (%q) and %s (%q)", MinimumPostgreSQLLockGracePeriod, PostgreSQLPollIntervalConf, pollInterval, PostgreSQLLockTTLConf, lockTTL)
}

if lockTableName == "" {
  return nil, fmt.Errorf("%s error: can not be an empty string", PostgreSQLLockTableConf)
}

// No sanity check on lockSchemaName
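For illustration, a hypothetical configuration map exercising the parameters sketched above (the key names mirror the consts in this comment; the values are examples only):

var exampleConf = map[string]string{
  "connection_url": "postgres://vault@localhost:5432/vault?sslmode=disable",
  "table":          "vault_kv_store",
  "lock_table":     "vault_lock",
  "lock_schema":    "hashicorp",
  "lock_ttl":       "10s",
  "poll_interval":  "1s",
}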

WHERE key = $1`,
m.key,
).Scan(&held, &value)
return held, value, err
Contributor

valueSQL := fmt.Sprintf(`SELECT expiration > now(), value FROM %s WHERE key = $1`, m.relationName())
err := m.client.QueryRow(valueSQL, m.key).Scan(&held, &value)

value TEXT,
vault_id TEXT NOT NULL,
expiration TIMESTAMP NOT NULL
);
Contributor

CREATE TABLE vault_lock (
key        TEXT COLLATE "C" PRIMARY KEY,
value      TEXT COLLATE "C",
vault_id   TEXT COLLATE "C" NOT NULL,
expiration TIMESTAMP WITHOUT TIME ZONE NOT NULL
);

Be explicit here: use TIMESTAMP WITHOUT TIME ZONE instead of TIMESTAMP, and use COLLATE "C" instead of leaving it up to the database's defaults. COLLATE "C" also has the nice benefit of consistent ordering because it bypasses calls to iconv(3) (important when replicating across multiple architectures or distributions, and ~2x faster if for some reason this ever became a performance concern).

@@ -82,6 +89,9 @@ LANGUAGE plpgsql;
which to write Vault data. This table must already exist (Vault will not
attempt to create it).

- `lock_table` `(string: "vault_lock")` – Specifies the name of the table to use
for high availability locks. Like `table`, this table must already exist.

Contributor

Other doc items to add:

* `lock_ttl` (string: "10s") - Specifies the duration of the leader's lease. Once the leader has acquired the lock, it will refresh its lock halfway through its lease. The first time the leader fails to renew its lease, it will retry every `poll_interval` until it is either successful or loses its lease.
* `poll_interval` (string: "1s") - Specifies how often clients should poll for the lock.
* `lock_schema` (string: "") - Specifies the schema name to use. If specified, `lock_schema` is used to provide the fully-qualified name of the `lock_table` and bypass looking for the table using the connection's [`search_path`](https://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH).

if err != nil {
return nil, err
}
lockTTL := strconv.Itoa(PostgreSQLLockTTL)
Contributor

Remove lockTTL.

Add a small helper function, relationName():

func (m *PostgreSQLLock) relationName() string {
  relationName := pq.QuoteIdentifier(m.lockTableName)
  if m.lockSchemaName != "" {
    relationName = fmt.Sprintf("%s.%s", pq.QuoteIdentifier(m.lockSchemaName), pq.QuoteIdentifier(m.lockTableName))
  }
  return relationName
}

Both lockTableName and lockSchemaName were already pq.QuoteIdentifier()'ed, but if this gets moved to a helper function then we can remove the pq.QuoteIdentifier() at config time. This is defensive in advance of a HUP signal handler.

Contributor

lockSQL := fmt.Sprintf(`INSERT INTO %s (key, value, vault_id, expiration)
    VALUES ($1, $2, $3, now() + $4::INTERVAL)`, relationName)

Instead of concatenating strings, use fmt.Sprintf() to construct the SQL.

That way we're no longer assembling SQL by string concatenation. It's low risk here, but worth observing best practices.

m.key,
m.value,
m.vaultID,
)
Contributor

Rely on the database driver to perform this logic now that lock expiration, lockTTL, is of type time.Duration.

_, err := m.client.ExecContext(ctxt, lockSQL, m.key, m.value, m.vaultID, m.lockTTL.String())

 - Remove string literals in the code
 - Stop constructing SQL by hand
 - Simplify configuration options
 - Store hostname in the lock key
 - Add support for custom schemas
Lock refreshes now only use the PostgreSQLPollInterval and
PostgreSQLLockTTL options. Locks are refreshed halfway through their
expiration. Failure to refresh a lock causes the leader to increase the
frequency of its attempts until the lock is lost.
 - Small changes to the lock table structure
 - Add new HA configuration option
@louis-paul
Author

louis-paul commented Jul 12, 2017

Sean, thanks for the extensive review. I've fixed most of the style/configuration issues you mentioned. Let me know if there are more.

I have changed the lock-watching logic; it would be great to know whether that's what you are looking for. Something I have not changed yet is the main locking logic; I've replied to your comment with my questions.

Contributor

@sean- sean- left a comment

Excellent, these changes look good. I responded to the locking protocol inline.

@@ -50,17 +58,68 @@ func newPostgreSQLBackend(conf map[string]string, logger log.Logger) (Backend, e
return nil, fmt.Errorf("missing connection_url")
}

unquoted_table, ok := conf["table"]
unquoted_table, ok := conf["lockTableName"]
Contributor

The Go variable name for the configuration parameter lock_table should be lockTableName. Actually, the line should just read:

unquotedTable, found := conf[PostgreSQLLockTableConf]

Author

That was some heavy-handed refactoring with my editor 🙂

@@ -221,13 +289,23 @@ func (m *PostgreSQLBackend) LockWith(key, value string) (Lock, error) {
if err != nil {
return nil, err
}
// Record the hostname to give DBAs a chance to figure out which Vault
// service has the lock
Contributor

Enhance this comment and add something like, "this error is not fatal; if the hostname lookup fails for some reason, use a sensible default." Or something. :)

"strings"
"time"

"github.com/armon/go-metrics"
"github.com/hashicorp/go-uuid"
"github.com/lib/pq"
log "github.com/mgutz/logxi/v1"
"github.com/mgutz/logxi/v1"
Contributor

Did goimports remove the log prefix here? The sorting looks right, however.

log "github.com/mgutz/logxi/v1"

refreshTicker := time.NewTicker(m.lockTTL / 2)
defer refreshTicker.Stop()
pollTicker := time.NewTicker(m.pollInterval)
defer pollTicker.Stop()
Contributor

Thank you for remembering to Stop these tickers.

@@ -376,17 +382,21 @@ func (m *PostgreSQLLock) watch() {
r, err := m.client.Exec(refreshLockSQL, m.lockTTL.String(), m.key,
m.vaultID)
if err != nil || r == nil {
Contributor

If err != nil, log the error but stay in watch().

If r == nil, then test and conditionally return.
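A small sketch of the decision logic being asked for here; the helper name and shape are illustrative, not code from the branch:

package postgresql

import "database/sql"

// shouldStepDown: a transient database error keeps the watcher alive (log it
// and retry on the next tick), while a refresh that affected zero rows means
// another node now owns the lock and the watcher should return.
func shouldStepDown(res sql.Result, err error) bool {
  if err != nil {
    return false // transient failure: log it and stay in watch()
  }
  if res == nil {
    return true // no result at all; treat the lock as lost
  }
  if n, rowsErr := res.RowsAffected(); rowsErr == nil && n == 0 {
    return true // zero rows updated: someone else holds the lock now
  }
  return false
}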

Author

Please have a look at the new version and let me know if that's what you meant.

)
if err != nil {
return nil, err
}
Contributor

@sean- sean- Jul 14, 2017

The locking protocol could race if two servers start at the same time. Also, the key isn't very sophisticated so I wouldn't recommend using it for much of anything right now (it's important to make it part of the lock, but I wouldn't rely on it exclusively):

vault/vault/core.go

Lines 39 to 41 in f75f5b0

// coreLockPath is the path used to acquire a coordinating lock
// for a highly-available deploy.
coreLockPath = "core/lock"

vault/vault/core.go

Lines 736 to 737 in f75f5b0

// Initialize a lock
lock, err := c.ha.LockWith(coreLockPath, "read")

lock, err := c.ha.LockWith(coreLockPath, uuid)

I think it's necessary to do something like the following:

  1. I'd change the table definition to be something like the following (range types FTW):
-- Optional:
--
-- CREATE EXTENSION IF NOT EXISTS btree_gist;
--
-- if you want to incorporate the `key` into the lock domain - good practice, not required.  Not all installations have the `btree_gist` extension.

CREATE TABLE vault_lock (
  vault_id   TEXT COLLATE "C" NOT NULL,
  key        TEXT COLLATE "C",
  value      TEXT COLLATE "C",

  -- lock_lease
  lock_lease tstzrange NOT NULL,

  -- row_expiration is the time that we should unconditionally remove this row.  row_expiration should be at least 2x lock_ttl.
  row_expiration TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW() + '30 minutes'::INTERVAL,
  EXCLUDE USING gist (/* key WITH = ,*/ lock_lease WITH &&)
);

CREATE INDEX vault_id_idx ON vault_lock(vault_id);
-- CREATE INDEX vault_lock_row_expiration_idx ON vault_lock(row_expiration); -- only suggested if the row_expiration is very large and a large number of rows accumulate.  This is not necessary but couldn't hurt much.
CREATE INDEX vault_key_idx ON vault_lock(vault_id) WHERE key IS NULL;
  2. The easy part is cleaning old rows: DELETE FROM vault_lock WHERE row_expiration < NOW(). Simple and easy to understand. This could be run by any node (ideally just the one that has the lock, after it acquires or renews its lock).

  3. Registration of a node is a prerequisite: INSERT INTO vault_lock (key, value, vault_id) VALUES ($1, $2, $3)

  4. Lock acquisition is an UPDATE: UPDATE vault_lock SET key = $1, lock_lease = ('[' || NOW() || ',' || NOW() + $2 || ']')::tstzrange WHERE vault_id = $3 AND key IS NULL;

  5. If the above lock has been acquired, you can expire all old rows with the DELETE from step 2.

  6. Lock renewal is the following UPDATE: UPDATE vault_lock SET lock_lease = ('[' || lower(lock_lease) || ',' || NOW() + $1 || ']')::tstzrange WHERE vault_id = $2 AND key = $3

For followers attempting to acquire the lock, poll using:

SELECT TRUE AS lock_held, vault_id AS lock_held_by, upper(lock_lease) - NOW() AS remaining_lock_duration FROM vault_lock WHERE lock_lease @> NOW()

If zero rows come back, attempt to acquire the lock. If the lock acquisition fails, there was a race and someone else acquired the lock, so the follower should go back to polling. It's tempting to use the remaining_lock_duration as the time that the followers should sleep, but that also means the lock won't be acquired until the lease expires. Maybe do that if the poll_interval is set to 0s? Food for thought.
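To make the follower side concrete, a minimal Go sketch of the poll described above; the function shape, the EXTRACT(EPOCH FROM ...) conversion, and the error handling are illustrative additions rather than part of the proposal:

package postgresql

import (
  "context"
  "database/sql"
  "time"
)

// pollForLock reports whether a live lease exists, who holds it, and how long
// it has left. sql.ErrNoRows means nobody holds the lock, so the follower
// should attempt the election UPDATE; if that UPDATE claims no row, it lost
// the race and goes back to polling.
func pollForLock(ctx context.Context, db *sql.DB) (held bool, holder string, remaining time.Duration, err error) {
  const pollSQL = `SELECT vault_id, EXTRACT(EPOCH FROM upper(lock_lease) - NOW())
                     FROM vault_lock
                    WHERE lock_lease @> NOW()`

  var secs float64
  switch err = db.QueryRowContext(ctx, pollSQL).Scan(&holder, &secs); err {
  case nil:
    return true, holder, time.Duration(secs * float64(time.Second)), nil
  case sql.ErrNoRows:
    return false, "", 0, nil // nobody holds the lock; try the election UPDATE
  default:
    return false, "", 0, err
  }
}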

This lock table is now able to accept registration of all Vault
instances. Locks are held by updating the `lock_lease_end` field of a
row. Unresponsive Vault instances are periodically garbage collected.
The lock system now has one main loop that does all the lock-related
maintenance (lock polling, lock updates, `row_expiration` updates, row
garbage collection). A channel is used for signals to the loop.
@louis-paul
Author

I revamped the locking logic using your suggestions @sean-. I could not wrap my head around your CockroachDB-compatible transaction, but I tested the SQL statements against CockroachDB v1.0.4. After changing the table a bit, they seem compatible.

All instances now register in the table when starting the locking process. Active locks are represented with a lock_lease_end field that is after NOW(). Instances check for the availability of the lock and grab it within the same SERIALIZABLE transaction. I removed the lock_lease_start, which did not seem useful for now. I assumed the row_expiration field you mentioned is a means of garbage collecting dead Vault instances and added logic for that purpose. I’ve hard-coded row_expiration to be 2 × lock_ttl, but we could add a configuration setting for that.
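For readers following along, a sketch of the "check, then grab, inside one SERIALIZABLE transaction" flow described above; the exact statements in the branch may differ, and the column names simply follow the lock_lease_end model mentioned here:

package postgresql

import (
  "context"
  "database/sql"
)

// grabLockTx checks that no live lease exists for the key and then bumps this
// instance's own lease end, all inside one SERIALIZABLE transaction. Two
// instances racing through these two steps will conflict at commit time, so
// only one of them can win.
func grabLockTx(ctx context.Context, db *sql.DB, key, vaultID, ttl string) (bool, error) {
  tx, err := db.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelSerializable})
  if err != nil {
    return false, err
  }
  defer tx.Rollback() // no-op once Commit has succeeded

  // 1. Is anybody holding a live lease for this key?
  var holders int
  if err := tx.QueryRowContext(ctx,
    `SELECT COUNT(*) FROM vault_lock WHERE key = $1 AND lock_lease_end >= NOW()`,
    key).Scan(&holders); err != nil {
    return false, err
  }
  if holders > 0 {
    return false, nil // someone else has the lock; keep polling
  }

  // 2. Grab it by extending our own row's lease.
  if _, err := tx.ExecContext(ctx,
    `UPDATE vault_lock SET lock_lease_end = NOW() + $3::INTERVAL WHERE vault_id = $1 AND key = $2`,
    vaultID, key, ttl); err != nil {
    return false, err
  }

  // A serialization failure (SQLSTATE 40001) here means another instance raced
  // us between steps 1 and 2; the caller simply treats it as "not acquired".
  if err := tx.Commit(); err != nil {
    return false, err
  }
  return true, nil
}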

@sokoow

sokoow commented Oct 10, 2017

Bump, where are we with this? Will it ever get merged?

@jefferai
Member

@sokoow waiting on review/discussion.

@mytototo

Any news regarding the merge of the HA feature for PostgreSQL?

@louis-paul louis-paul closed this Aug 30, 2018
@jefferai
Member

@louis-paul why'd you close it?

@louis-paul
Author

@jefferai: There has not been any significant activity for over a year. Unless I’m mistaken, this is despite the latest review comments having been addressed 😕

@jefferai
Member

@louis-paul Probably better to just try pinging @sean- again instead of closing it :-)

@jefferai jefferai reopened this Aug 30, 2018
@jefferai
Member

(P.S. I've also pinged @sean- through a back channel...hopefully I'll at least hear whether or not he is going to review this, and then we can proceed from there)

@glerchundi

glerchundi commented Sep 3, 2018

This would be absolutely fantastic if it could get into Vault. Good job @louis-paul, and it's also CockroachDB-compatible (@sean- 👏)!

I know that everyone on the HashiCorp side is probably busy, but having an estimated time of action would help make this process a little more community-friendly. I experienced the same feeling some time ago when my PR took two years to get merged (hashicorp/terraform#3858).

At that time @grubernaut told me that they were trying to improve the reviewing process but, sadly, this PR shows that nothing has changed since then.

This means that someone like me, an independent developer, will think twice before starting a contribution to any HashiCorp project.

/cc @mitchellh @armon

@jefferai
Member

jefferai commented Sep 4, 2018

I know that everyone on the HashiCorp side is probably busy, but having an estimated time of action would help make this process a little more community-friendly. I experienced the same feeling some time ago when my PR took two years to get merged (hashicorp/terraform#3858).
At that time @grubernaut told me that they were trying to improve the reviewing process but, sadly, this PR shows that nothing has changed since then.

Please note that Terraform is a totally separate product with totally separate workflows, and Jake has not been with the company for a long time. I don't know what the holdup was with Terraform, but the problem with this issue is that HashiCorp explicitly does not maintain this storage backend. We do basic due diligence, but we don't test it out, and we aren't PG experts. So we try to get people that actually are using it and/or are PG experts, other than the PR submitter, to review it so that merging something doesn't break everybody else.

The problem right now is that the person who did the review and asked for a number of changes then disappeared, and in the interim time, past the initial ping after the changes were made, nobody tried pinging that person to see if they could come back and take another look.

The person has been pinged here and I've pinged them through a backchannel, but that was on Thursday and it's the first day back from a holiday weekend here in the U.S. It's not unreasonable that there's been no reply yet.

Another possibility would be to find other users of the storage backend (there are many; you could for instance post on the mailing list) and ask some of them to review it. Then you're not bottlenecked on the single reviewer from before.

@jefferai
Member

jefferai commented Sep 4, 2018

I should also mention that there are merge conflicts which have to be solved before we can merge post-review anyways...

Contributor

@sean- sean- left a comment

I'm not in a position to test this code, but it looks good to this eyeball-linter, modulo some of the comments in the review. I had to think through the locking protocol a few times, but it looks sound and portable. Most of this should work for crdb with a little work (the list_query attr for PostgreSQLBackend is the only item that actually jumped out at me as a crdb portability issue).


if lockTTL-pollInterval < MinimumPostgreSQLLockGracePeriod {
return nil, fmt.Errorf(
"There must be at least %s between the %s (%q) and %s (%q)",
Contributor

Lowercase the "t" in "There".

}
existingSQL := `AND (lock_lease_end IS NULL OR lock_lease_end < NOW())`
if existing {
existingSQL = `AND lock_lease_end >= NOW()`
Contributor

This may read like a petty nit, but I have a style preference to prepend the whitespace in the SQL that is appended to the statement below.

return err
}
}
existingSQL := `AND (lock_lease_end IS NULL OR lock_lease_end < NOW())`
Contributor

Prepend a single space character ( ), see the next few comments for rationale.

}

grabSQL := fmt.Sprintf(`UPDATE %s SET lock_lease_end = NOW() + $3::INTERVAL
WHERE vault_id = $1 AND key = $2 %s`, m.relationName(), existingSQL)
Contributor

WHERE vault_id = $1 AND key = $2%s

That way if there is an option to have an empty existingSQL, there is no extra trailing space in the SQL that is emitted.

}
if res == nil {
tx.Rollback()
return errors.New("Tried updating a lock but affected 0 rows")
Contributor

Lowercase the T (the first character of an error message should be lowercase); if you can, make a sweep for that style idiom.

@@ -74,6 +156,11 @@ func newPostgreSQLBackend(conf map[string]string, logger log.Logger) (Backend, e
"UNION SELECT DISTINCT substring(substr(path, length($1)+1) from '^.*?/') FROM " +
quoted_table + " WHERE parent_path LIKE concat($1, '%')",
logger: logger,

Contributor

This is unrelated to this PR, but something I noticed in this review (it can be tackled in the future as a new issue or quickly bolted onto this PR): it would be good if the above SQL were constructed using fmt.Sprintf()-like semantics, as is done elsewhere in the provider.

I need to re-read what list_query is doing and how/where it's being used, but I don't think the above UNION SELECT DISTINCT query will work for CockroachDB (CC @bdarnell ).


Pinging @knz, who is one of the masters & commanders of CockroachDB, in case this is not supported yet and he has an idea of how it could be rewritten in a way that is.


I'm not exactly sure what the question is? You already use pq.QuoteIdentifier properly; you could format your query with fmt.Sprintf("SELECT value FROM %s WHERE path = $1 AND key = $2", quoted_table) and that should work. Was there any reason to not do that?

Meanwhile regarding the list_query:

  • the UNION should work; however
  • I'm not sure what you are doing with substring( ... from '^.*?/'). This doesn't look right; substring does not work with regular expressions. There are regexp_xxx() functions that can help though.


It was for the UNION SELECT DISTINCT thingy so thanks for the quick response!

}

// Lock grabs a lock, or waits until it is available
func (m *PostgreSQLLock) Lock(stopCh <-chan struct{}) (<-chan struct{}, error) {
Contributor

Woof. What I wouldn't give to see this moved to use context.Context instead of a dedicated channel to indicate shutdown. I don't think that's possible or in-scope for this PR. CC @jefferai in case there's a change to the interface coming down the pike soon. I see context.Context is being used immediately below, but this now sticks out when reading this code.
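A small sketch of the bridge being discussed, assuming the Lock(stopCh) signature stays channel-based for now; the helper is illustrative:

package postgresql

import "context"

// lockContext adapts the channel-based stop signal to a context.Context so
// the SQL calls inside Lock can use the *Context variants.
func lockContext(stopCh <-chan struct{}) (context.Context, context.CancelFunc) {
  ctx, cancel := context.WithCancel(context.Background())
  go func() {
    select {
    case <-stopCh:
      cancel() // the caller asked us to stop trying to acquire the lock
    case <-ctx.Done():
      // the lock attempt finished some other way; stop watching stopCh
    }
  }()
  return ctx, cancel
}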


ctx, cancel := context.WithCancel(context.Background())
go func() {
m.stepDownCh <- <-stopCh
Contributor

I'm not terribly worried about this given the timescale at which promotions come and go, but all step-down and promotion events should be serialized. If this node flaps, it is a low-probability event that this node steps down after being promoted. It would be nice if these step-down events were aligned to some configuration epoch. Unrelated to Vault, but this is something we've had to code against in the last few months, and it is a prudent thing to guard against. I don't think a spurious step-down is fatal in this case, and I think this code is just cargo-culted from elsewhere, so all backends could have this issue.

@glerchundi

@jefferai now is crystal clear, thanks for giving us an update.

@louis-paul in case you have no time to invest in this, I can take it over and make the requested changes.

@jefferai
Member

jefferai commented Oct 4, 2018

@glerchundi It seems like the OP isn't interested -- if you can address this feedback it'd be great!

@glerchundi

glerchundi commented Oct 4, 2018 via email

@bjorndolk
Contributor

bjorndolk commented Nov 1, 2018

Hello!
I started working on addressing the feedback in this PR. The file structure under physical/... has been refactored since this PR was opened, so it is kind of a mess to merge.
@jefferai
How about I create a new PR and do a manual merge of the work louis-paul has done?

@bjorndolk
Contributor

We (silverrail) may also work on a docker image for postgres to get this more testable, but we will see about that.

@bjorndolk
Contributor

@louis-paul would you consider allowing me write access to your repo, thereby allowing me to contribute to this PR instead of opening a new one?

@louis-paul
Author

Hi @bjorndolk, I’ve added you as a collaborator to the fork’s repo (closing this PR and opening a new one is also fine).

I have looked at updating the branch, but the amount of work is non-trivial and I found my code’s quality to be worse than I expected. Sorry for the inactivity on this change, but I can’t justify spending more time on this as my company has dropped Vault.

@bjorndolk
Contributor

@louis-paul Thanks !! I got this merged and compiling now. Next is to add the generic HA tests and see how that goes.

@bjorndolk
Contributor

Running the generic tests, the normal backend tests still work, so the merge didn't break those.
However, the HA tests break; these tests did not exist when this was originally written. I will dig into this.

@bjorndolk
Contributor

The HA tests get stuck when trying to lock for some reason.
I find this model for locks somewhat messy: why does every instance need to create an entry with a null lease_end, instead of there being only one lock entry owned by some instance (similar to the dynamodb implementation's logic)? The row_expiration column seems unnecessary to me; lock_lease_end should be sufficient, and expired locks should always be safe to remove.

I am itching to rewrite this to function similarly to dynamodb, with a simple insert ... on conflict .. update .. where .. statement to try to steal the lock (see the sketch below). Since the lock steal is done in one bang, there is no need for high transaction isolation levels; even dirty reads should work.

May I rewrite this logic? @jefferai @louis-paul
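A sketch of the single-row, dynamodb-style model being proposed: one row per lock key and a single upsert that either creates the row or takes it over once the previous lease has expired. Table, column, and helper names are illustrative, not a final schema:

package postgresql

import (
  "context"
  "database/sql"
)

// tryStealLock attempts to create or take over the single lock row in one
// atomic statement, so no elevated isolation level is needed.
func tryStealLock(ctx context.Context, db *sql.DB, key, vaultID, ttl string) (bool, error) {
  const stealSQL = `
    INSERT INTO vault_ha_locks (ha_key, ha_identity, valid_until)
    VALUES ($1, $2, NOW() + $3::INTERVAL)
    ON CONFLICT (ha_key) DO UPDATE
       SET ha_identity = EXCLUDED.ha_identity,
           valid_until = EXCLUDED.valid_until
     WHERE vault_ha_locks.valid_until < NOW()                -- previous lease expired
        OR vault_ha_locks.ha_identity = EXCLUDED.ha_identity -- or we are renewing our own lease`

  res, err := db.ExecContext(ctx, stealSQL, key, vaultID, ttl)
  if err != nil {
    return false, err
  }
  n, err := res.RowsAffected()
  if err != nil {
    return false, err
  }
  // One affected row means we inserted or took over the lock; zero means the
  // current holder's lease is still live.
  return n == 1, nil
}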

@bjorndolk
Contributor

@jefferai please consider #5731 for postgres ha support.

@jefferai jefferai closed this Nov 8, 2018
@jefferai
Member

jefferai commented Nov 8, 2018

Closed this as the only person that seems interested in getting it across the finish line has rewritten it in a new PR (linked above).

@glerchundi

Sorry for the inactivity on this change, but I can’t justify spending more time on this as my company has dropped Vault.

Could I ask what you are using instead of Vault, hand-rolled or other software? We're evaluating the use of Vault and it would be good to know why people drop it!

Thanks!
