
physical/postgresql: Add support for high availability #2700

Closed
wants to merge 12 commits into from

Conversation

louis-paul

Are there any e2e tests to run for HA? Thanks!

@deverton
Contributor

Would this be better implemented with advisory locks? https://www.postgresql.org/docs/9.1/static/functions-admin.html#FUNCTIONS-ADVISORY-LOCKS
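For context, a minimal sketch of what the advisory-lock approach could look like from Go. pg_try_advisory_lock and pg_advisory_unlock are standard PostgreSQL functions; the helper shape, lock ID, and error handling here are illustrative rather than code from this PR:

package postgresql

import (
  "context"
  "database/sql"

  _ "github.com/lib/pq"
)

// tryAdvisoryLeaderLock sketches a session-level advisory lock. The lock lives
// for as long as the pinned *sql.Conn does, which is why it cannot be taken
// through the shared connection pool.
func tryAdvisoryLeaderLock(ctx context.Context, db *sql.DB, lockID int64) (*sql.Conn, bool, error) {
  conn, err := db.Conn(ctx) // dedicate one connection; advisory locks are per-session
  if err != nil {
    return nil, false, err
  }
  var acquired bool
  if err := conn.QueryRowContext(ctx, "SELECT pg_try_advisory_lock($1)", lockID).Scan(&acquired); err != nil {
    conn.Close()
    return nil, false, err
  }
  if !acquired {
    conn.Close()
    return nil, false, nil // someone else is the leader; retry later
  }
  // Caller holds leadership until it runs pg_advisory_unlock(lockID) on this
  // connection or the session is closed.
  return conn, true, nil
}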

@louis-paul
Author

louis-paul commented Jun 29, 2017 via email

@jefferai
Member

jefferai commented Jul 1, 2017

Honestly, it sounds like advisory locks are the correct way to go. Compared to the traffic coming from Vault generally, an additional single connection with low traffic doesn't seem like a bad tradeoff for getting HA capabilities. That might just be my postgres naivete coming through though.

@louis-paul
Author

A particularity of PostgreSQL is that client connections are very expensive for servers, more so than for other databases. From my (limited) understanding of the internals of Postgres, each connection is a separate process with a lot of overhead. Connection pooling middleware (pgbouncer or pgpool) is commonly used to have fewer connections with more activity. The PostgreSQL wiki has a page on connection count tuning.

For reference, Vault seems to currently use 2 connections per instance (the default from database/sql).

I wanted to highlight the potential tradeoffs; I would be happy to rewrite the pull request using actual locks if you think that's best. Having the opinion of someone with more experience administering PostgreSQL could be valuable as well.
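As a point of reference for the connection-count concern: database/sql keeps an idle pool of two connections by default and does not cap open connections, so a backend that wants to stay predictable can bound its footprint explicitly. A small sketch, with illustrative limits that are not part of this PR:

package postgresql

import (
  "database/sql"
  "time"

  _ "github.com/lib/pq"
)

// openBoundedDB opens the backend's pool with explicit limits so the extra HA
// traffic cannot grow the PostgreSQL session count unexpectedly. The numbers
// are illustrative only.
func openBoundedDB(connURL string) (*sql.DB, error) {
  db, err := sql.Open("postgres", connURL)
  if err != nil {
    return nil, err
  }
  db.SetMaxOpenConns(4)                   // hard cap on sessions per Vault instance
  db.SetMaxIdleConns(2)                   // database/sql's default idle-pool size
  db.SetConnMaxLifetime(30 * time.Minute) // recycle sessions; plays nicely with pgbouncer/pgpool
  return db, nil
}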

@jefferai
Member

jefferai commented Jul 3, 2017

Yeah -- I don't have that experience, and I don't know that anyone on the Vault team does. I just know that whenever postgres HA has been discussed in the past, advisory locks were deemed the correct approach. I have someone in mind to loop in; I'll try to get them to comment.

Contributor

@sean- sean- left a comment

This looks like a great start. There are a number of little changes. The length of the review is indicative of some nits that predate @louis-paul and of my trying to dry-code some of the suggestions.

There are two primary changes that I'd like to see with this: 1) the DELETE should happen after the lock has been acquired. I think this needs to happen in a different way where all Vault servers perform an INSERT and then the election happens as an UPDATE. 2) The renewal should be automatically calculated based on the lockTTL as described in the doc comments and elsewhere throughout the review.

I've actually come full circle on advisory locks for PostgreSQL HA. To perform a correct hand-off between two different PostgreSQL backends when a failover happens faster than the lock duration, or when using something like pl_paxos or CockroachDB, a heartbeat table like this is actually the most stable way to go.

In the future I can see an additional configuration parameter that would possibly modify this behavior to do something backend specific, but for the time being, the most universally useful and reliable way to do HA PostgreSQL is to use a heartbeat table and poll the database every poll_interval.
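A rough sketch of the loop shape being described, where followers poll every poll_interval and the leader refreshes its heartbeat row halfway through lock_ttl; the tryAcquire and renewLease helpers are hypothetical stand-ins for the SQL discussed below:

package postgresql

import "time"

// heartbeatLoop sketches the polling/renewal cadence: followers poll every
// pollInterval, the leader renews its lease every lockTTL/2, and a failed
// renewal drops back to polling.
func heartbeatLoop(lockTTL, pollInterval time.Duration, stopCh <-chan struct{},
  tryAcquire func() bool, renewLease func() error) {

  renewTicker := time.NewTicker(lockTTL / 2)
  defer renewTicker.Stop()
  pollTicker := time.NewTicker(pollInterval)
  defer pollTicker.Stop()

  holding := false
  for {
    select {
    case <-pollTicker.C:
      if !holding {
        holding = tryAcquire() // e.g. an UPDATE that only succeeds once the old lease expired
      }
    case <-renewTicker.C:
      if holding {
        if err := renewLease(); err != nil {
          holding = false // lost the lease; go back to polling
        }
      }
    case <-stopCh:
      return
    }
  }
}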

@@ -39,6 +56,12 @@ func newPostgreSQLBackend(conf map[string]string, logger log.Logger) (Backend, e
}
quoted_table := pq.QuoteIdentifier(unquoted_table)

unquoted_lock_table, ok := conf["lock_table"]
if !ok {
unquoted_lock_table = "vault_lock"
Contributor

Add the following to the const section at the top of the file:

DefaultPostgreSQLLockTable = "vault_lock"

And then this becomes:

if !ok {
  unquotedLockTable = DefaultPostgreSQLLockTable
}

const (
// PostgreSQLLockRetryInterval is the interval of time to wait between new
// locking attempts
PostgreSQLLockRetryInterval = time.Second
Contributor

Prefix with Default; I changed it from "Retry" to "Poll" because that's how it's being used further below: s/PostgreSQLLockRetryInterval/DefaultPostgreSQLPollInterval/g

DefaultPostgreSQLPollInterval = 1 * time.Second

PostgreSQLLockRetryInterval = time.Second
// PostgreSQLLockErrorRetryMax is the number of retries to make after an
// error fetching the status of a lock
PostgreSQLLockErrorRetryMax = 5
Contributor

Let's remove PostgreSQLLockErrorRetryMax altogether. We're mixing different dimensions of liveness (time and counts). Everything else here is specified in terms of time, so let's stick with that.

// error fetching the status of a lock
PostgreSQLLockErrorRetryMax = 5
// PostgreSQLLockTTL is the maximum length of time of a lock, in seconds
PostgreSQLLockTTL = 10
Contributor

Prefix with Default: s/PostgreSQLLockTTL/DefaultPostgreSQLLockTTL/g

Also, let's explicitly type PostgreSQLLockTTL so that the compiler will catch any uses of a negative value:

DefaultPostgreSQLLockTTL = uint(10)

// or better yet:

DefaultPostgreSQLLockTTL = 10 * time.Second

PostgreSQLLockTTL is assumed to be a positive integer further down in the code. Either a uint or preferably let's move this to a time.Duration. Further below I add a small test for this as a parameter. The rest of this review assumes a time.Duration.

unquoted_lock_table = "vault_lock"
}
quoted_lock_table := pq.QuoteIdentifier(unquoted_lock_table)

Contributor

Can we add configuration parameters? Something like:

// Move this to the top of the file with the other consts
const (
  PostgreSQLLockTTLConf = "lock_ttl"
  PostgreSQLPollIntervalConf = "poll_interval"
  PostgreSQLLockTableConf = "lock_table"
  PostgreSQLLockSchemaConf = "lock_schema"
  MinimumPostgreSQLPollInterval = 1 * time.Second
  MinimumPostgreSQLLockGracePeriod = 1 * time.Second

  // Meta-comment: I am explicitly not using "public" as the default schema so that the DBA caring for this instance can perform an ALTER ROLE vault_user SET search_path = 'hashicorp' or something and have it just work from either the DBA's perspective or Vault's perspective.  TL;DR: it is not okay to explicitly use or suggest the use of the public schema even though it may be the default.
  DefaultPostgreSQLSchema = ""
  DefaultPostgreSQLLockTable = "vault_lock"
)

var lockTTL time.Duration
{
  rawLockTTL, found := conf[PostgreSQLLockTTLConf]
  if found {
    var err error
    if lockTTL, err = time.ParseDuration(rawLockTTL); err != nil {
      return nil, fmt.Errorf("%s error: %v", PostgreSQLLockTTLConf, err)
    }
  } else {
    lockTTL = DefaultPostgreSQLLockTTL
  }
}

var pollInterval time.Duration
{
  rawPollInterval, found := conf[PostgreSQLPollIntervalConf]
  if found {
    var err error
    if pollInterval, err = time.ParseDuration(rawPollInterval); err != nil {
      return nil, fmt.Errorf("%s error: %v", PostgreSQLPollIntervalConf, err)
    }
  } else {
    pollInterval = DefaultPostgreSQLPollInterval
  }
}

var lockTableName string
{
  rawLockTableName, found := conf[PostgreSQLLockTableConf]
  if found {
    lockTableName = pq.QuoteIdentifier(strings.TrimSpace(rawLockTableName))
  } else {
    lockTableName = DefaultPostgreSQLLockTable
  }
}

var lockSchemaName string
{
  rawLockSchemaName, found := conf[PostgreSQLLockSchemaConf]
  if found {
    lockSchemaName = pq.QuoteIdentifier(strings.TrimSpace(rawLockSchemaName))
  } else {
    lockSchemaName = DefaultPostgreSQLSchema
  }
}

// Sanity check inputs
if pollInterval < 0 {
  return nil, fmt.Errorf("%s (%q) must be a positive time duration", PostgreSQLPollIntervalConf, pollInterval)
}

if !(pollInterval < lockTTL) {
  return nil, fmt.Errorf("%s (%q) must be smaller than the %s (%q)", PostgreSQLPollIntervalConf, pollInterval, PostgreSQLLockTTLConf, lockTTL)
}

if pollInterval < MinimumPostgreSQLPollInterval {
  return nil, fmt.Errorf("%s (%q) can not be less than %q", PostgreSQLPollIntervalConf, pollInterval, MinimumPostgreSQLPollInterval)
}

if lockTTL - pollInterval < MinimumPostgreSQLLockGracePeriod {
  return nil, fmt.Errorf("There must be at least %s between the %s (%q) and %s (%q)", MinimumPostgreSQLLockGracePeriod, PostgreSQLPollIntervalConf, pollInterval, PostgreSQLLockTTLConf, lockTTL)
}

if lockTableName == "" {
  return nil, fmt.Errorf("%s error: can not be an empty string", PostgreSQLLockTableConf)
}

// No sanity check on lockSchemaName
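For illustration, a hypothetical configuration map exercising the parameters sketched above (the key names mirror the consts in this comment; the values are examples only):

var exampleConf = map[string]string{
  "connection_url": "postgres://vault@localhost:5432/vault?sslmode=disable",
  "table":          "vault_kv_store",
  "lock_table":     "vault_lock",
  "lock_schema":    "hashicorp",
  "lock_ttl":       "10s",
  "poll_interval":  "1s",
}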

WHERE key = $1`,
m.key,
).Scan(&held, &value)
return held, value, err
Contributor

valueSQL := fmt.Sprintf(`SELECT expiration > now(), value FROM %s WHERE key = $1`, m.relationName())
err := m.client.QueryRow(valueSQL, m.key).Scan(&held, &value)

value TEXT,
vault_id TEXT NOT NULL,
expiration TIMESTAMP NOT NULL
);
Contributor

CREATE TABLE vault_lock (
key        TEXT COLLATE "C" PRIMARY KEY,
value      TEXT COLLATE "C",
vault_id   TEXT COLLATE "C" NOT NULL,
expiration TIMESTAMP WITHOUT TIME ZONE NOT NULL
);

Be explicit here: use TIMESTAMP WITHOUT TIME ZONE instead of TIMESTAMP, and use COLLATE "C" instead of leaving it up to the database's defaults. COLLATE "C" also has the nice benefit of consistent ordering because it bypasses calls to iconv(3) (important when replicating across multiple architectures or distributions, and ~2x faster if for some reason this ever became a performance concern).

@@ -82,6 +89,9 @@ LANGUAGE plpgsql;
which to write Vault data. This table must already exist (Vault will not
attempt to create it).

- `lock_table` `(string: "vault_lock")` – Specifies the name of the table to use
for high availability locks. Like `table`, this table must already exist.

Contributor

Other doc items to add:

* `lock_ttl` (string: "10s") - Specifies the duration of the leader's lease. Once the leader has acquired the lock, it will refresh its lock halfway through its lease. The first time the leader fails to renew its lease, it will retry every `poll_interval` until it is either successful or loses its lease.
* `poll_interval` (string: "1s") - Specifies how often clients should poll for the lock.
* `lock_schema` (string: "") - Specifies the schema name to use. If specified, `lock_schema` is used to provide the fully-qualified name of the `lock_table` and bypass looking for the table using the connection's [`search_path`](https://www.postgresql.org/docs/current/static/ddl-schemas.html#DDL-SCHEMAS-PATH).

if err != nil {
return nil, err
}
lockTTL := strconv.Itoa(PostgreSQLLockTTL)
Contributor

Remove lockTTL.

Add a small helper function, relationName():

func (m *PostgreSQLLock) relationName() string {
  relationName := pq.QuoteIdentifier(m.lockTableName)
  if m.lockSchemaName != "" {
    relationName = fmt.Sprintf("%s.%s", pq.QuoteIdentifier(m.lockSchemaName), pq.QuoteIdentifier(m.lockTableName))
  }
  return relationName
}

Both lockTableName and lockSchemaName were already pq.QuoteIdentifier()'ed, but if this gets moved to a helper function then we can remove the pq.QuoteIdentifier() at config time. This is defensive in advance of a HUP signal handler.

Contributor

lockSQL := fmt.Sprintf(`INSERT INTO %s (key, value, vault_id, expiration)
    VALUES ($1, $2, $3, now() + $4::INTERVAL)`, relationName)

Instead of concatenating strings, use fmt.Sprintf() to construct the SQL.

That way we're no longer assembling SQL by string concatenation. It's low risk here, but worth observing best practices.

m.key,
m.value,
m.vaultID,
)
Contributor

Rely on the database driver to perform this logic now that lock expiration, lockTTL, is of type time.Duration.

_, err := m.client.ExecContext(ctxt, lockSQL, m.key, m.value, m.vaultID, m.lockTTL.String())

 - Remove string literals in the code
 - Stop constructing SQL by hand
 - Simplify configuration options
 - Store hostname in the lock key
 - Add support for custom schemas
Lock refreshes now only use the PostgreSQLPollInterval and
PostgreSQLLockTTL options. Locks are refreshed halfway through their
expiration. Failure to refresh a lock causes the leader to increase the
frequency of its attempts until the lock is lost.
 - Small changes to the lock table structure
 - Add new HA configuration option
@louis-paul
Author

louis-paul commented Jul 12, 2017

Sean, thanks for the extensive review. I've fixed most of the style/configuration issues you mentioned. Let me know if there are more.

I have changed the lock-watching logic; it would be great to know whether that's what you are looking for. Something I have not changed yet is the main locking logic; I've replied to your comment with my questions.

Contributor

@sean- sean- left a comment

Excellent, these changes look good. I responded to the locking protocol inline.

@@ -50,17 +58,68 @@ func newPostgreSQLBackend(conf map[string]string, logger log.Logger) (Backend, e
return nil, fmt.Errorf("missing connection_url")
}

unquoted_table, ok := conf["table"]
unquoted_table, ok := conf["lockTableName"]
Contributor

The Go variable name for the configuration parameter lock_table should be lockTableName. Actually, the line should just read:

unquotedTable, found := conf[PostgreSQLLockTableConf]

Author

That was some heavy-handed refactoring with my editor 🙂

@@ -221,13 +289,23 @@ func (m *PostgreSQLBackend) LockWith(key, value string) (Lock, error) {
if err != nil {
return nil, err
}
// Record the hostname to give DBAs a chance to figure out which Vault
// service has the lock
Contributor

Enhance this comment and add something like, "this error is not fatal; if the hostname lookup fails for some reason, use a sensible default." Or something. :)

"strings"
"time"

"github.com/armon/go-metrics"
"github.com/hashicorp/go-uuid"
"github.com/lib/pq"
log "github.com/mgutz/logxi/v1"
"github.com/mgutz/logxi/v1"
Contributor

Did goimports remove the log prefix here? The sorting looks right, however.

log "github.com/mgutz/logxi/v1"

refreshTicker := time.NewTicker(m.lockTTL / 2)
defer refreshTicker.Stop()
pollTicker := time.NewTicker(m.pollInterval)
defer pollTicker.Stop()
Contributor

Thank you for remembering to Stop these tickers.

@@ -376,17 +382,21 @@ func (m *PostgreSQLLock) watch() {
r, err := m.client.Exec(refreshLockSQL, m.lockTTL.String(), m.key,
m.vaultID)
if err != nil || r == nil {
Contributor

If err != nil, log the error but stay in watch().

If r == nil, then test and conditionally return.
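A small sketch of the decision logic being asked for here; the helper name and shape are illustrative, not code from the branch:

package postgresql

import "database/sql"

// shouldStepDown: a transient database error keeps the watcher alive (log it
// and retry on the next tick), while a refresh that affected zero rows means
// another node now owns the lock and the watcher should return.
func shouldStepDown(res sql.Result, err error) bool {
  if err != nil {
    return false // transient failure: log it and stay in watch()
  }
  if res == nil {
    return true // no result at all; treat the lock as lost
  }
  if n, rowsErr := res.RowsAffected(); rowsErr == nil && n == 0 {
    return true // zero rows updated: someone else holds the lock now
  }
  return false
}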

Author

Please have a look at the new version and let me know if that's what you meant.

)
if err != nil {
return nil, err
}
Contributor

@sean- sean- Jul 14, 2017

The locking protocol could race if two servers start at the same time. Also, the key isn't very sophisticated so I wouldn't recommend using it for much of anything right now (it's important to make it part of the lock, but I wouldn't rely on it exclusively):

vault/vault/core.go

Lines 39 to 41 in f75f5b0

// coreLockPath is the path used to acquire a coordinating lock
// for a highly-available deploy.
coreLockPath = "core/lock"

vault/vault/core.go

Lines 736 to 737 in f75f5b0

// Initialize a lock
lock, err := c.ha.LockWith(coreLockPath, "read")

lock, err := c.ha.LockWith(coreLockPath, uuid)

I think it's necessary to do something like the following:

  1. I'd change the table definition to be something like the following (range types FTW):
-- Optional:
--
-- CREATE EXTENSION IF NOT EXISTS btree_gist;
--
-- if you want to incorporate the `key` into the lock domain - good practice, not required.  Not all installations have the `btree_gist` extension.

CREATE TABLE vault_lock (
  vault_id   TEXT COLLATE "C" NOT NULL,
  key        TEXT COLLATE "C",
  value      TEXT COLLATE "C",

  -- lock_lease
  lock_lease tstzrange NOT NULL,

  -- row_expiration is the time that we should unconditionally remove this row.  row_expiration should be at least 2x lock_ttl.
  row_expiration TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW() + '30 minutes'::INTERVAL,
  EXCLUDE USING gist (/* key WITH = ,*/ lock_lease WITH &&)
);

CREATE INDEX vault_id_idx ON vault_lock(vault_id);
-- CREATE INDEX vault_lock_row_expiration_idx ON vault_lock(row_expiration); -- only suggested if the row_expiration is very large and a large number of rows accumulate.  This is not necessary but couldn't hurt much.
CREATE INDEX vault_key_idx ON vault_lock(vault_id) WHERE key IS NULL;
  2. The easy part is cleaning old rows: DELETE FROM vault_lock WHERE row_expiration < NOW(). Simple and easy to understand. This could be run by any node (ideally just the one that has the lock, after it acquires or renews its lock).

  3. Registration of a node is a prerequisite: INSERT INTO vault_lock (key, value, vault_id) VALUES ($1, $2, $3)

  4. Lock acquisition is an UPDATE: UPDATE vault_lock SET key = $1, lock_lease = ('[' || NOW() || ',' || NOW() + $2 || ']')::tstzrange WHERE vault_id = $3 AND key IS NULL;

  5. If the above lock has been acquired, you can expire all old rows with the DELETE from step 2.

  6. Lock renewal is the following UPDATE: UPDATE vault_lock SET lock_lease = ('[' || lower(lock_lease) || ',' || NOW() + $1 || ']')::tstzrange WHERE vault_id = $2 AND key = $3

For followers attempting to acquire the lock, poll using:

SELECT TRUE AS lock_held, vault_id AS lock_held_by, upper(lock_lease) - NOW() AS remaining_lock_duration FROM vault_lock WHERE lock_lease @> NOW()

If zero rows come back, attempt to acquire the lock. If the lock acquisition fails, there was a race and someone else acquired the lock, so the follower should go back to polling. It's tempting to use the remaining_lock_duration as the time that the followers should sleep, but that also means the lock won't be acquired until the lease expires. Maybe do that if the poll_interval is set to 0s? Food for thought.
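To make the follower side concrete, a minimal Go sketch of the poll described above; the function shape, the EXTRACT(EPOCH FROM ...) conversion, and the error handling are illustrative additions rather than part of the proposal:

package postgresql

import (
  "context"
  "database/sql"
  "time"
)

// pollForLock reports whether a live lease exists, who holds it, and how long
// it has left. sql.ErrNoRows means nobody holds the lock, so the follower
// should attempt the election UPDATE; if that UPDATE claims no row, it lost
// the race and goes back to polling.
func pollForLock(ctx context.Context, db *sql.DB) (held bool, holder string, remaining time.Duration, err error) {
  const pollSQL = `SELECT vault_id, EXTRACT(EPOCH FROM upper(lock_lease) - NOW())
                     FROM vault_lock
                    WHERE lock_lease @> NOW()`

  var secs float64
  switch err = db.QueryRowContext(ctx, pollSQL).Scan(&holder, &secs); err {
  case nil:
    return true, holder, time.Duration(secs * float64(time.Second)), nil
  case sql.ErrNoRows:
    return false, "", 0, nil // nobody holds the lock; try the election UPDATE
  default:
    return false, "", 0, err
  }
}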

This lock table is now able to accept registration of all Vault
instances. Locks are held by updating the `lock_lease_end` field of a
row. Unresponsive Vault instances are periodically garbage collected.
The lock system now has one main loop that does all the lock-related
maintenance (lock polling, lock updates, `row_expiration` updates, row
garbage collection). A channel is used for signals to the loop.
@louis-paul
Author

I revamped the locking logic using your suggestions @sean-. I could not wrap my head around your CockroachDB-compatible transaction, but I tested the SQL statements against CockroachDB v1.0.4. After changing the table a bit, they seem compatible.

All instances now register in the table when starting the locking process. Active locks are represented with a lock_lease_end field that is after NOW(). Instances check for the availability of the lock and grab it within the same SERIALIZABLE transaction. I removed the lock_lease_start, which did not seem useful for now. I assumed the row_expiration field you mentioned is a means of garbage collecting dead Vault instances and added logic for that purpose. I’ve hard-coded row_expiration to be 2 × lock_ttl, but we could add a configuration setting for that.
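For readers following along, a sketch of the "check, then grab, inside one SERIALIZABLE transaction" flow described above; the exact statements in the branch may differ, and the column names simply follow the lock_lease_end model mentioned here:

package postgresql

import (
  "context"
  "database/sql"
)

// grabLockTx checks that no live lease exists for the key and then bumps this
// instance's own lease end, all inside one SERIALIZABLE transaction. Two
// instances racing through these two steps will conflict at commit time, so
// only one of them can win.
func grabLockTx(ctx context.Context, db *sql.DB, key, vaultID, ttl string) (bool, error) {
  tx, err := db.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelSerializable})
  if err != nil {
    return false, err
  }
  defer tx.Rollback() // no-op once Commit has succeeded

  // 1. Is anybody holding a live lease for this key?
  var holders int
  if err := tx.QueryRowContext(ctx,
    `SELECT COUNT(*) FROM vault_lock WHERE key = $1 AND lock_lease_end >= NOW()`,
    key).Scan(&holders); err != nil {
    return false, err
  }
  if holders > 0 {
    return false, nil // someone else has the lock; keep polling
  }

  // 2. Grab it by extending our own row's lease.
  if _, err := tx.ExecContext(ctx,
    `UPDATE vault_lock SET lock_lease_end = NOW() + $3::INTERVAL WHERE vault_id = $1 AND key = $2`,
    vaultID, key, ttl); err != nil {
    return false, err
  }

  // A serialization failure (SQLSTATE 40001) here means another instance raced
  // us between steps 1 and 2; the caller simply treats it as "not acquired".
  if err := tx.Commit(); err != nil {
    return false, err
  }
  return true, nil
}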

@sokoow

sokoow commented Oct 10, 2017

Bump, where are we with this? Will it ever get merged?

@jefferai
Member

@sokoow waiting on review/discussion.

@mytototo

Any news regarding the merge of the HA feature for PostgreSQL?

@louis-paul louis-paul closed this Aug 30, 2018
@jefferai
Member

@louis-paul why'd you close it?

@louis-paul
Author

@jefferai: There has not been any significant activity for over a year. Unless I’m mistaken, this is despite the latest review comments having been addressed 😕

@jefferai
Member

@louis-paul Probably better to just try pinging @sean- again instead of closing it :-)

@jefferai jefferai reopened this Aug 30, 2018
@jefferai
Member

(P.S. I've also pinged @sean- through a back channel...hopefully I'll at least hear whether or not he is going to review this, and then we can proceed from there)

@glerchundi

glerchundi commented Sep 3, 2018

This would be absolutely fantastic if it could get into Vault. Good job @louis-paul, and it's also CockroachDB-compatible (@sean- 👏)!

I know that everyone on the HashiCorp side is probably busy, but having an estimated time of action would help make this process a little more community-friendly. I experienced the same feeling some time ago when my PR took two years to get merged (hashicorp/terraform#3858).

At that time @grubernaut told me that they were trying to improve the reviewing process but, sadly, this PR shows that nothing has changed since then.

This means that someone like me, an independent developer, will think twice before starting a contribution to any HashiCorp project.

/cc @mitchellh @armon

@jefferai
Member

jefferai commented Sep 4, 2018

I know that everyone on the HashiCorp side is probably busy, but having an estimated time of action would help make this process a little more community-friendly. I experienced the same feeling some time ago when my PR took two years to get merged (hashicorp/terraform#3858).
At that time @grubernaut told me that they were trying to improve the reviewing process but, sadly, this PR shows that nothing has changed since then.

Please note that Terraform is a totally separate product with totally separate workflows, and Jake has not been with the company for a long time. I don't know what the holdup was with Terraform, but the problem with this issue is that HashiCorp explicitly does not maintain this storage backend. We do basic due diligence, but we don't test it out, and we aren't PG experts. So we try to get people that actually are using it and/or are PG experts, other than the PR submitter, to review it so that merging something doesn't break everybody else.

The problem right now is that the person who did the review and asked for a number of changes then disappeared, and in the interim time, past the initial ping after the changes were made, nobody tried pinging that person to see if they could come back and take another look.

The person has been pinged here and I've pinged them through a backchannel, but that was on Thursday and it's the first day back from a holiday weekend here in the U.S. It's not unreasonable that there's been no reply yet.

Another possibility would be to find other users of the storage backend (there are many; you could for instance post on the mailing list) and ask some of them to review it. Then you're not bottlenecked on the single reviewer from before.

@jefferai
Member

jefferai commented Sep 4, 2018

I should also mention that there are merge conflicts which have to be solved before we can merge post-review anyways...

Contributor

@sean- sean- left a comment

I'm not in a position to test this code, but it looks good to this eyeball-linter, modulo some of the comments in the review. I had to think through the locking protocol a few times, but it looks sound and portable. Most of this should work for crdb with a little work (the list_query attr for PostgreSQLBackend is the only item that actually jumped out at me as a crdb portability issue).


if lockTTL-pollInterval < MinimumPostgreSQLLockGracePeriod {
return nil, fmt.Errorf(
"There must be at least %s between the %s (%q) and %s (%q)",
Contributor

Lowercase the "t" in "There".

}
existingSQL := `AND (lock_lease_end IS NULL OR lock_lease_end < NOW())`
if existing {
existingSQL = `AND lock_lease_end >= NOW()`
Contributor

This may read like a petty nit, but I have a style preference to prepend the whitespace in the SQL that is appended to the statement below.

return err
}
}
existingSQL := `AND (lock_lease_end IS NULL OR lock_lease_end < NOW())`
Contributor

Prepend a single space character ( ), see the next few comments for rationale.

}

grabSQL := fmt.Sprintf(`UPDATE %s SET lock_lease_end = NOW() + $3::INTERVAL
WHERE vault_id = $1 AND key = $2 %s`, m.relationName(), existingSQL)
Contributor

WHERE vault_id = $1 AND key = $2%s

That way if there is an option to have an empty existingSQL, there is no extra trailing space in the SQL that is emitted.

}
if res == nil {
tx.Rollback()
return errors.New("Tried updating a lock but affected 0 rows")
Contributor

Lowercase the T (the first character of an error message should be lowercase); if you can, make a sweep for that style idiom.

@@ -74,6 +156,11 @@ func newPostgreSQLBackend(conf map[string]string, logger log.Logger) (Backend, e
"UNION SELECT DISTINCT substring(substr(path, length($1)+1) from '^.*?/') FROM " +
quoted_table + " WHERE parent_path LIKE concat($1, '%')",
logger: logger,

Contributor

This is unrelated to this PR, but something I noticed in this review (it can be tackled in the future as a new issue or quickly bolted onto this PR): it would be good if the above SQL were constructed using fmt.Sprintf()-like semantics, as is done elsewhere in the provider.

I need to re-read what list_query is doing and how/where it's being used, but I don't think the above UNION SELECT DISTINCT query will work for CockroachDB (CC @bdarnell ).


Pinging @knz, who is one of the masters & commanders of CockroachDB, in case this is not supported yet and he has an idea of how it could be rewritten in a way that is.


I'm not exactly sure what the question is? You already use pq.QuoteIdentifier properly; you could format your query with fmt.Sprintf("SELECT value FROM %s WHERE path = $1 AND key = $2", quoted_table) and that should work. Was there any reason to not do that?

Meanwhile regarding the list_query:

  • the UNION should work; however
  • I'm not sure what you are doing with substring( ... from '^.*?/'). This doesn't look right; substring does not work with regular expressions. There are regexp_xxx() functions that can help though.


It was for the UNION SELECT DISTINCT thingy so thanks for the quick response!

}

// Lock grabs a lock, or waits until it is available
func (m *PostgreSQLLock) Lock(stopCh <-chan struct{}) (<-chan struct{}, error) {
Contributor

Woof. What I wouldn't give to see this moved to use context.Context instead of a dedicated channel to indicate shutdown. I don't think that's possible or in-scope for this PR. CC @jefferai in case there's a change to the interface coming down the pike soon. I see context.Context is being used immediately below, but this now sticks out when reading this code.
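A small sketch of the bridge being discussed, assuming the Lock(stopCh) signature stays channel-based for now; the helper is illustrative:

package postgresql

import "context"

// lockContext adapts the channel-based stop signal to a context.Context so
// the SQL calls inside Lock can use the *Context variants.
func lockContext(stopCh <-chan struct{}) (context.Context, context.CancelFunc) {
  ctx, cancel := context.WithCancel(context.Background())
  go func() {
    select {
    case <-stopCh:
      cancel() // the caller asked us to stop trying to acquire the lock
    case <-ctx.Done():
      // the lock attempt finished some other way; stop watching stopCh
    }
  }()
  return ctx, cancel
}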


ctx, cancel := context.WithCancel(context.Background())
go func() {
m.stepDownCh <- <-stopCh
Contributor

I'm not terribly worried about this given the timescale at which promotions come and go, but all step-down and promotion events should be serialized. If this node flaps, it is a low-probability event that this node steps down after being promoted. It would be nice if these step-down events were aligned to some configuration epoch. Unrelated to Vault, but this is something we've had to code against in the last few months, and it is a prudent thing to guard against. I don't think a spurious step-down is fatal in this case, and I think this code is just cargo-culted from elsewhere, so all backends could have this issue.

@glerchundi

@jefferai now is crystal clear, thanks for giving us an update.

@louis-paul in case you have no time to invest in this, I can take it over and make the requested changes.

@jefferai
Member

jefferai commented Oct 4, 2018

@glerchundi It seems like the OP isn't interested -- if you can address this feedback it'd be great!

@glerchundi

glerchundi commented Oct 4, 2018 via email

@bjorndolk
Contributor

bjorndolk commented Nov 1, 2018

Hello!
I started working on addressing the feedback in this PR. The file structure under physical/... has been refactored since this PR was opened, so it is kind of a mess to merge.
@jefferai
How about I create a new PR and do a manual merge of the work louis-paul has done?

@bjorndolk
Contributor

We (silverrail) may also work on a docker image for postgres to get this more testable, but we will see about that.

@bjorndolk
Contributor

@louis-paul would you consider allowing me write access to your repo, thereby allowing me to contribute to this PR instead of opening a new one?

@louis-paul
Author

Hi @bjorndolk, I’ve added you as a collaborator to the fork’s repo (closing this PR and opening a new one is also fine).

I have looked at updating the branch, but the amount of work is non-trivial and I found my code’s quality to be worse than I expected. Sorry for the inactivity on this change, but I can’t justify spending more time on this as my company has dropped Vault.

@bjorndolk
Contributor

@louis-paul Thanks !! I got this merged and compiling now. Next is to add the generic HA tests and see how that goes.

@bjorndolk
Contributor

Running the generic tests, the normal backend tests still work, so the merge didn't break those.
However, the HA tests break; these tests did not exist when this was originally written. I will dig into this.

@bjorndolk
Contributor

The HA tests get stuck when trying to lock for some reason.
I find this model for locks somewhat messy: why does every instance need to create an entry with a null lease_end, instead of there being only one lock entry owned by some instance (similar to the dynamodb implementation's logic)? The row_expiration column seems unnecessary to me; lock_lease_end should be sufficient, and expired locks should always be safe to remove.

I am itching to rewrite this to function similarly to dynamodb, with a simple insert ... on conflict .. update .. where .. statement to try to steal the lock (see the sketch below). Since the lock steal is done in one bang, there is no need for high transaction isolation levels; even dirty reads should work.

May I rewrite this logic? @jefferai @louis-paul
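A sketch of the single-row, dynamodb-style model being proposed: one row per lock key and a single upsert that either creates the row or takes it over once the previous lease has expired. Table, column, and helper names are illustrative, not a final schema:

package postgresql

import (
  "context"
  "database/sql"
)

// tryStealLock attempts to create or take over the single lock row in one
// atomic statement, so no elevated isolation level is needed.
func tryStealLock(ctx context.Context, db *sql.DB, key, vaultID, ttl string) (bool, error) {
  const stealSQL = `
    INSERT INTO vault_ha_locks (ha_key, ha_identity, valid_until)
    VALUES ($1, $2, NOW() + $3::INTERVAL)
    ON CONFLICT (ha_key) DO UPDATE
       SET ha_identity = EXCLUDED.ha_identity,
           valid_until = EXCLUDED.valid_until
     WHERE vault_ha_locks.valid_until < NOW()                -- previous lease expired
        OR vault_ha_locks.ha_identity = EXCLUDED.ha_identity -- or we are renewing our own lease`

  res, err := db.ExecContext(ctx, stealSQL, key, vaultID, ttl)
  if err != nil {
    return false, err
  }
  n, err := res.RowsAffected()
  if err != nil {
    return false, err
  }
  // One affected row means we inserted or took over the lock; zero means the
  // current holder's lease is still live.
  return n == 1, nil
}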

@bjorndolk
Contributor

@jefferai please consider #5731 for postgres ha support.

@jefferai jefferai closed this Nov 8, 2018
@jefferai
Member

jefferai commented Nov 8, 2018

Closed this as the only person that seems interested in getting it across the finish line has rewritten it in a new PR (linked above).

@glerchundi

Sorry for the inactivity on this change, but I can’t justify spending more time on this as my company has dropped Vault.

Could I ask what you are using instead of Vault, hand-rolled or other software? We're evaluating the use of Vault and it would be good to know why people drop it!

Thanks!
