Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove explicit "init" step to start a cluster #4027

Merged
merged 1 commit into from
Feb 3, 2016

Conversation

spencerkimball
Copy link
Member

Previously, you had to create a cluster by init'ing and starting first node:

./cockroach init --stores=...
./cockroach start --stores=... --gossip=self=

and add new nodes with:

./cockroach start --stores=... --gossip=<node[,node,...]>

Now, a cluster is created just by running start:

./cockroach start --stores=...

and add new nodes with:

./cockroach start --stores=... --join=<node[,node,...]>

Also cleaned up the gossip address storage code. It had gotten out of
whack somehow and wasn't storing addresses in the event that a node
had no bootstrapped stores on start.

Review on Reviewable

@vivekmenezes
Copy link
Contributor

Awesome Spencer!

On Thu, Jan 28, 2016 at 8:40 PM Spencer Kimball notifications@github.com
wrote:

Previously, you had to create a cluster by init'ing and starting first
node:

./cockroach init --stores=...
./cockroach start --stores=... --gossip=self=

and add new nodes with:

./cockroach start --stores=... --gossip=

Now, a cluster is created just by running start:

./cockroach start --stores=...

and add new nodes with:

./cockroach start --stores=... --join=

Also cleaned up the gossip address storage code. It had gotten out of
whack somehow and wasn't storing addresses in the event that a node

had no bootstrapped stores on start.

You can view, comment on, or merge this pull request online at:

#4027
Commit Summary

  • Remove explicit "init" step to start a cluster

File Changes

Patch Links:


Reply to this email directly or view it on GitHub
#4027.

@petermattis
Copy link
Collaborator

I didn't give this a completely thorough review, just enough to understand how it was working. Looks good.


Review status: 0 of 24 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


cli/flags.go, line 78 [r1] (raw file):
Is the unix resolver used?


cli/start.go, line 92 [r1] (raw file):
If the node is initialized, what happens if the --join flag is specified? I would assume it is a no-op. Probably worth calling out that case.


server/context.go, line 245 [r1] (raw file):
Why isn't this being done in the cli package? I would set the default value of the --join flag to os.Getenv("COCKROACH_JOIN").


server/node.go, line 270 [r1] (raw file):
Heh, is this the new style for log messages?


Comments from the review on Reviewable.io

@nvanbenschoten
Copy link
Member

Reviewed 24 of 24 files at r1.
Review status: all files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


gossip/gossip.go, line 403 [r1] (raw file):
Wouldn't ex == addr work? If not, might be worth adding a Equal method.


Comments from the review on Reviewable.io

@tamird
Copy link
Contributor

tamird commented Jan 29, 2016

This seems like it's going to require different run scripts for bootstrapped and non-bootstrapped nodes, just like the current state. What's the benefit?


Reviewed 24 of 24 files at r1.
Review status: all files reviewed at latest revision, 9 unresolved discussions, some commit checks failed.


main.go, line 38 [r1] (raw file):
why lose the error here?


server/node_test.go, line 133 [r1] (raw file):
directly


server/server.go, line 208 [r1] (raw file):
s/the //


server/server.go, line 226 [r1] (raw file):
comment is not useful


Comments from the review on Reviewable.io

@bdarnell
Copy link
Contributor

I worry that this optimizes for ease of use for toy deployments but makes things worse for non-toy ones. Deployments that want to use the http-lb resolver type, for example, will have to start one node with no flags, kill it (unlike the old init step where it died after doing its job), and then they can start everything with the identical http-lb argument.

With gossip persistence, the fancier resolvers are less significant, so maybe many deployments can do without them. But even then, things are tricky. If --join is a single node (or list of nodes) instead of a load balancer (or DNS alias, etc), then it must be a node that was up at the time this node was bootstrapped. However, this means that either the --join flags get out of sync across all nodes or nodes get restarted just to update this flag which no longer has any effect. Many deployment schemes do not have a good way to handle this kind of per-node variance in command arguments. (maybe an environment variable makes this easier in some deployment systems, but none of the ones I've personally worked with have had this property).

I think that getting rid of the init command is a good idea only if we can come up with a solution that has the following properties:

  • All nodes can be started with the same command line flags.
  • If the --join flag specifies a load balancer or alias then those flags never need to change over time.
  • Based on those flags, one node will bootstrap exactly once.

My suggestion is to add a --bootstrap-on=addr flag to specify the node that should bootstrap. This flag would have no effect on nodes other than the indicated one, but cause that node to bootstrap rather than failing if it has no bootstrapped stores. There is some chance that a node may be removed from service and a new node installed in its place, and we wouldn't want that store to re-bootstrap. This could be prevented with a --bootstrap-by=date flag to ensure that after a day has passed we won't try to bootstrap again.

(This would've been a good candidate for an RFC, BTW, and it may be worth doing that before coding up another iteration of the idea)


Review status: all files reviewed at latest revision, 15 unresolved discussions, some commit checks failed.


acceptance/cluster/localcluster.go, line 246 [r1] (raw file):
I think this can be simplified further. Maybe use Entrypoint: "/bin/true" and don't reference the cockroach binary at all?


gossip/gossip.go, line 273 [r1] (raw file):
Why was this removed? addr.String() is not guaranteed to be unique across different network types.


gossip/gossip.go, line 466 [r1] (raw file):
This will ultimately be quadratic in the number of nodes. Might be worth e.g. keeping the bootstrap info sorted so we can use a binary search.


gossip/resolver/node_lookup.go, line 108 [r1] (raw file):
This needs a comment to explain why nodeLookupResolver is permanently exhausted.


server/context.go, line 197 [r1] (raw file):
The --join flag will be empty in this case so can't we proceed to parse zero resolvers instead of treating this as a special case?


server/context.go, line 245 [r1] (raw file):
+1. By doing this here tests may be affected by COCKROACH_JOIN which is presumably unintended.


server/node.go, line 270 [r1] (raw file):
This should perhaps be written to stdout or stderr directly instead of going through the logs. (the logs have structured prefixes and generally don't contain newlines, and we may do something to reduce the default verbosity of the logs to stderr, such as defaulting --stderrthreshold to WARNING)


storage/stores.go, line 48 [r1] (raw file):
Why did this change? I'd still prefer to avoid a member variable for this.


Comments from the review on Reviewable.io

@spencerkimball
Copy link
Member Author

Lot of points here.

  • Toy deployments matter. I think they matter a great deal in this stage of what we're doing. The init step makes plenty of sense when either you've conceived it yourself or have banged your head against it a couple of times. But when approaching CockroachDB with no prior knowledge, it makes more sense to start the seed node and then have other nodes join it (notably without having to introduce the concept of gossip). I'm optimizing for new users' conceptual flow, and I'm taking that cue directly from some of our new engineers.
  • Some objections mentioned in comments are mitigated via use of the COCKROACH_JOIN environment variable – something that can be set on AWS on a per-EC2 instance basis. This will allow you to run all of your instances with the same command line flags or supervisord config files. @tamird, this addresses your comment. @bdarnell, I don't believe that the http-lb resolver type is all that relevant anymore, though I'm keeping it because I think it could make manual scaling a lot simpler with pre-existing clusters.
  • @bdarnell to you points: 1) All nodes CAN be started with the same command line flags, 2) seems irrelevant with gossip persistence, 3) would be great, but I don't think possible if you never want flags to change over time; if you insist on this point, then we won't be able to improve what I think are the deeper conceptual problems with our current process.
  • --bootstrap-on=addr seems reasonable for non-"toy" deployments, but it's pretty clunky when trying to illustrate to new users how simple CockroachDB is to setup. It's more difficult to explain what's going on there than what we already have with the init step.

Review status: all files reviewed at latest revision, 15 unresolved discussions, some commit checks failed.


cli/flags.go, line 78 [r1] (raw file):
No, I'll remove it.


cli/start.go, line 92 [r1] (raw file):
It's not a noop, it's always merged with the persisted gossip addresses to provide additional potential bootstrap hosts. You might still need to specify --join in the event that you re-networked all of your nodes.


gossip/gossip.go, line 466 [r1] (raw file):
I'll add a TODO.


gossip/resolver/node_lookup.go, line 108 [r1] (raw file):
Done.


main.go, line 38 [r1] (raw file):
It's already printed by cobra commander.


server/context.go, line 197 [r1] (raw file):
We could do that if we were sure the resolvers would be empty. If someone specifies --dev and also --join because they're confused though, this would cause the ephemeral single node to hang. Safer to ignore the --join flag.


server/node.go, line 270 [r1] (raw file):
@petermattis I wanted to call it out.
@bdarnell I'll write it to stdout.


storage/stores.go, line 48 [r1] (raw file):
Here's the situation:

  • Node is started with all uninitialized stores
  • gossip.SetStorage is called with an empty Stores object, as there are no stores yet
  • Gossip connects and finds other nodes in the cluster
  • Gossip calls gossip.Storage.WriteBootstrapInfo(), but nothing is written as no stores have been added to the Stores object.
  • Node and stores are initialized by allocating IDs. Stores are added to Stores object

In the old model, in that last step, it would just try to read the existing data and then update that to the newly added stores but the newly added stores have no data to read.


Comments from the review on Reviewable.io

@spencerkimball spencerkimball force-pushed the spencerkimball/remove-init-step branch from c22ac62 to fccf43d Compare January 29, 2016 16:23
@spencerkimball
Copy link
Member Author

Fixes #3909

@spencerkimball spencerkimball force-pushed the spencerkimball/remove-init-step branch from fccf43d to fabb223 Compare January 29, 2016 16:36
@spencerkimball spencerkimball force-pushed the spencerkimball/remove-init-step branch from fabb223 to d90d638 Compare January 29, 2016 16:56
@tamird
Copy link
Contributor

tamird commented Jan 29, 2016

Reviewed 8 of 8 files at r2, 2 of 2 files at r3.
Review status: all files reviewed at latest revision, 12 unresolved discussions, some commit checks failed.


server/node.go, line 270 [r1] (raw file):
How come stderr now?


Comments from the review on Reviewable.io

@petermattis
Copy link
Collaborator

Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


server/node.go, line 270 [r1] (raw file):
If you're writing directly to stderr and expecting logs to go somewhere else (not stderr), I think all of the hoopla is overkill. I'd like to see a nice simple indication when a cluster is created. And when a node joins a cluster for the first time. I'm not sure this is the right approach, though I'm also not sure what the right approach is.


server/node.go, line 265 [r3] (raw file):
s/initialized/created/g? The term created seems simpler and more direct to me.


Comments from the review on Reviewable.io

@bdarnell
Copy link
Contributor

First impressions and toy deployments are definitely very important; I'm just worried that this tips the scales too far towards first impressions in a way that will make things more difficult later on.

You may be able to set environment variables on a per-node basis when creating EC2 nodes by hand, but if you're using any sort of automation to set up your instances it may not be as easy. So while the env var may help in some cases, it's not always applicable.

In your response to my points you say that "all nodes CAN be started with the same command line flags", are you assuming the use of per-node environment variables? Because I'm not seeing another way to do that. On point 3, I do insist that the system be reasonably robust at avoiding accidental re-bootstraps.

I think http-lb remains useful as a poor-man's DNS alias.

I agree that --bootstrap-on is not necessarily better than the current init command. So let's keep init as an option, along with the rest of this change. That would satisfy me by allowing deployments that choose this workflow to start every node with a --join flag.

More alternatives:

  • Real deployments will have TLS certificates. Could we use information embedded in the certs to choose a node to bootstrap (or if not in the certs themselves, could we use the same scripts to create certs and run init)? Can we streamline/automate the cert process enough that people can use it even for toy deployments?
  • What if we reversed the order: start all the nodes, which will link up the gossip network but nothing will be bootstrapped yet, and then send an RPC to one node to ask it to bootstrap. It's still an extra step but it's less of a conceptual hurdle because it's more like the CREATE DATABASE command that users will need to do after startup anyway. The risk, of course, is that there is a window during each new node's creation during which it could be tricked into re-bootstrapping.

Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


Comments from the review on Reviewable.io

@JackKrupansky
Copy link

For reference, here's the process for initializing a Cassandra cluster:
https://docs.datastax.com/en/cassandra/3.x/cassandra/initialize/initSingleDS.html

I won't hold up Cassandra as being the ideal to pursue, but certainly you probably want to be at least as easy if not a lot easier than them.

I would hold them up as being the king of the hill to beat in the domain of fully-distributed horizontal scaleout databases (LSM with sstables and all of that, the Big Table heritage.)

They do use gossip coupled with designated seed nodes for bootstrapping.

@tbg
Copy link
Member

tbg commented Feb 1, 2016

I haven't given this a close look, but I tend to agree with @bdarnell's objections. On the other hand, I like the --join syntax and would like to make it work for all deployments (like http-lb). I'm confident that with some tweaks (bootstrap-on, bootstrap-by, perhaps bootstrap-only) we can manage that, and that we don't have to figure all of it out right now. I'm not a fan of keeping both the init step and the --join parameter since there should be an "obvious" way of settings things up.

I wouldn't add the COCKROACH_JOIN variable (right now). Might prove useful eventually, but having the COCKROACH_DEV flag around has cost me some time already due to the intransparency with which it works. I'm not opposed to using environment variables for configuration, but I would want to guard them behind a flag (defaulting to false) so that
./cockroach <op> --env forces your hand to supply all flags from the environment with names that clearly map to the flag names, and forbids you to mix-and-match how you pass your options. But that sounds like a separate PR to me.

@spencerkimball
Copy link
Member Author

A quick google search yields plenty of examples of people setting environment variables programmatically via automated AWS tools.

Yes, I'm assuming per-node env variables. Or else you could set up one node without --join and then stop it, then start them all with --join set to DNS round robin or http-lb endpoint. This change really doesn't make it any more difficult to do automated deploys.

@bdarnell, I'm not sure what you mean by http-lbbeing useful as a poor-man's DNS alias. If you have an http load balancer to play with, you certainly don't need a DNS alias. Also, I'm confused about why you'd want to keep init as an option. Are you saying make --init a command line flag, which acts to initialize the node to which it is supplied? If so, that would mean different flags again. If you mean keep it as a command, it's no different from running that node without --join, stopping it, then starting all of the nodes with identical --join flags.

In terms of the TLS certificates, I agree broadly with your point, but it argues that getting a "real" production cluster deployed should not necessarily focus on insisting that each node be run exactly once with identical flags. I think it's OK to have some one time setup steps when bootstrapping a cluster. What I'm really optimizing for with this change is making first time manual installs more approachable.

The second alternative seems considerably more complicated than what we have now and what this change provides.

@JackKrupansky, that Cassandra setup is surprisingly complicated, so we're looking good by contrast.

@tschottdorf, the --join syntax does work for all deployments (not sure what you mean by "like http-lb, though). I'll remove the use of environment variables for now.


Review status: all files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.


server/node.go, line 270 [r1] (raw file):
@tamird because this was messing up a lot of Example style unittests. I think it probably should be a log message actually, so I'm going to switch it back.

@petermattis it really felt like this needs to be very clearly communicated in the logs and it usually just gets lost in the log noise.


Comments from the review on Reviewable.io

@spencerkimball spencerkimball force-pushed the spencerkimball/remove-init-step branch from d90d638 to 274f865 Compare February 1, 2016 15:07
@tbg
Copy link
Member

tbg commented Feb 1, 2016

I was referring to @bdarnell's comment about per-node variables. Agree though that there will be ways around it that are simple to handle in automated deploys (where a command more or less doesn't really matter).

Cassandra is indeed very intransparent with respect to bootstrapping (this is obvious from the docs, but also from my ops experience I remember a bunch of confusion around it), so I wouldn't take it as a benchmark here. Looking at RethinkDB it works exactly as in this suggested change.

@bdarnell
Copy link
Contributor

bdarnell commented Feb 1, 2016

I called http-lb a poor man's DNS alias because I think the DNS alias is a superior solution (clients can use it too!), but in my experience people are more likely to have a suitable HTTP server/proxy available than an easily-configurable DNS server.

Yes, init is equivalent to "start, wait, stop". It just avoids the "wait" step of uncertain duration (running it by hand is fine, but if you're submitting it to some job scheduler it may not be obvious how long you need to wait before killing the job). It could also be spelled something like start --stop-after-bootstrap.

@spencerkimball
Copy link
Member Author

Just like I've removed the environment variable code pending actual use case, I'll wait on the --stop-after-bootstrap until progress on various deployment mechanisms warrants it.

@bdarnell Is this an LGTM?

@spencerkimball spencerkimball force-pushed the spencerkimball/remove-init-step branch from 274f865 to 5f94d5a Compare February 1, 2016 22:15
@bdarnell
Copy link
Contributor

bdarnell commented Feb 1, 2016

I think the need for --stop-after-boostrap is more clear than the need for an environment variable but other than that LGTM. @mberhault also needs to weigh in.


Review status: 10 of 24 files reviewed at latest revision, 9 unresolved discussions.


cli/start.go, line 90 [r5] (raw file):
s/healhty/healthy/


gossip/resolver/node_lookup.go, line 108 [r1] (raw file):
But why is it always true instead of always false?


server/context.go, line 246 [r5] (raw file):
Why is this an error? I would think that --join=$MY_ADDR would act essentially the same as no --join flag but would prevent bootstrapping. This seems like a useful configuration after the initial setup has been done (so that all nodes get node zero's address on the command line, even node zero itself)


Comments from the review on Reviewable.io

bootstrap. Each item in the list has an optional type:
"join": wrapText(`
A comma-separated list of addresses to use when a new node is joining
an existing cluster. Each address in the list has an optional type:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may want to be more explicit about leaving it empty for the very first node. It's implied here, but emphasis wouldn't hurt.

@mberhault
Copy link
Contributor

LGTM. I started converting my tools and scripts. I'm hoping to have it ready to merge by the time you're done, but even then, a broken nightly run or two won't hurt.

@spencerkimball
Copy link
Member Author

Review status: 10 of 24 files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.


gossip/resolver/node_lookup.go, line 108 [r1] (raw file):
Oh, good point. Oops.


server/context.go, line 246 [r5] (raw file):
OK changed.


Comments from the review on Reviewable.io

@spencerkimball spencerkimball force-pushed the spencerkimball/remove-init-step branch from 5f94d5a to dac13d8 Compare February 2, 2016 20:34
Previously, you had to create a cluster by init'ing and starting first node:

./cockroach init --stores=...
./cockroach start --stores=... --gossip=self=

  and add new nodes with:

./cockroach start --stores=... --gossip=<node[,node,...]>

Now, a cluster is created just by running start:

./cockroach start --stores=...

  and add new nodes with:

./cockroach start --stores=... --join=<node[,node,...]>

Also cleaned up the gossip address storage code. It had gotten out of
whack somehow and wasn't storing addresses in the event that a node
had no bootstrapped stores on start.
@spencerkimball spencerkimball force-pushed the spencerkimball/remove-init-step branch from dac13d8 to 783d21b Compare February 3, 2016 14:16
spencerkimball added a commit that referenced this pull request Feb 3, 2016
…step

Remove explicit "init" step to start a cluster
@spencerkimball spencerkimball merged commit 175bd54 into master Feb 3, 2016
@spencerkimball spencerkimball deleted the spencerkimball/remove-init-step branch February 3, 2016 14:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants