Figure out what's up with 5.0 #98

Closed
tianon opened this issue Apr 14, 2016 · 68 comments

Comments

@tianon
Member

tianon commented Apr 14, 2016

Full log from running it:

$ docker run ...
[2016-04-14 16:41:30,521][WARN ][bootstrap                ] unable to install syscall filter: 
java.lang.UnsupportedOperationException: seccomp unavailable: your kernel is buggy and you should upgrade
    at org.elasticsearch.bootstrap.Seccomp.linuxImpl(Seccomp.java:279)
    at org.elasticsearch.bootstrap.Seccomp.init(Seccomp.java:616)
    at org.elasticsearch.bootstrap.JNANatives.trySeccomp(JNANatives.java:215)
    at org.elasticsearch.bootstrap.Natives.trySeccomp(Natives.java:99)
    at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:98)
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:151)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:263)
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:111)
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:106)
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:88)
    at org.elasticsearch.cli.Command.main(Command.java:53)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:74)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)
Exception in thread "main" java.lang.RuntimeException: please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster.
    at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:79)
    at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:60)
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:187)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:263)
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:111)
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:106)
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:88)
    at org.elasticsearch.cli.Command.main(Command.java:53)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:74)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)
Refer to the log for complete error details.

(from #96)

@tianon
Member Author

tianon commented Apr 14, 2016

Wagering a guess, I'm thinking that Docker's seccomp filter is blocking that unimplemented syscall outright instead of returning ENOSYS.

@tianon
Member Author

tianon commented Apr 14, 2016

If I run it on Docker 1.9.1 (before seccomp filtering was implemented), it gets past that line and I get errors about missing cluster config instead, so that's pretty good confirmation of my guess.

@tianon
Member Author

tianon commented Apr 14, 2016

Same with --security-opt seccomp=unconfined -- so we might have to come up with a seccomp profile or try to convince upstream that this check should handle blocked syscalls too. 😢
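
One way to see whether a seccomp filter is applied to a process, sketched under the assumption of a Linux `/proc` filesystem: the `Seccomp:` field in `/proc/<pid>/status` reports 0 (disabled), 1 (strict) or 2 (filter). The helper name `seccomp_mode` is made up for illustration and is not part of Docker or Elasticsearch.

```shell
#!/bin/sh
# Report the seccomp mode recorded in a /proc-style status file.
# Field values: 0 = disabled, 1 = strict, 2 = filter.
# seccomp_mode is a hypothetical helper name.
seccomp_mode() {
    status_file="${1:-/proc/self/status}"
    # Match "Seccomp:" exactly so that "Seccomp_filters:" is ignored.
    mode=$(awk '/^Seccomp:/ { print $2 }' "$status_file" 2>/dev/null) || mode=""
    case "$mode" in
        0) echo "disabled" ;;
        1) echo "strict" ;;
        2) echo "filter" ;;
        *) echo "unknown" ;;
    esac
}
```

Calling `seccomp_mode` inside a container started with and without `--security-opt seccomp=unconfined` should show whether Docker's profile is in effect for that run.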

@tianon
Member Author

tianon commented Apr 14, 2016

Here's the next hurdle (which is more one of new functionality than Docker blocking us):

Exception in thread "main" java.lang.RuntimeException: please set [discovery.zen.minimum_master_nodes] to a majority of the number of master eligible nodes in your cluster.
    at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:79)
    at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:60)
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:187)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:263)
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:111)
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:106)
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:88)
    at org.elasticsearch.cli.Command.main(Command.java:53)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:74)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)
Refer to the log for complete error details.

@tianon
Member Author

tianon commented Apr 14, 2016

https://www.elastic.co/guide/en/elasticsearch/reference/master/breaking_50_settings_changes.html#_discovery_settings

The discovery.zen.minimum_master_nodes must be set for nodes that have network.host, network.bind_host, network.publish_host, transport.host, transport.bind_host, or transport.publish_host configuration options set. We see those nodes as in "production" mode and thus require the setting.
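
For reference, "a majority of the number of master eligible nodes" works out to floor(n / 2) + 1. A minimal sketch of that arithmetic; the function name is hypothetical, and only the setting name comes from the error message above.

```shell
#!/bin/sh
# Print the majority (quorum) for a given number of master-eligible nodes:
# floor(n / 2) + 1. The function name is illustrative only.
minimum_master_nodes() {
    n="${1:-}"
    case "$n" in
        ''|*[!0-9]*)
            echo "usage: minimum_master_nodes <count>" >&2
            return 1
            ;;
    esac
    [ "$n" -ge 1 ] || { echo "count must be at least 1" >&2; return 1; }
    echo $(( n / 2 + 1 ))
}
```

For a three-node cluster this gives 2, which would be passed as `-E discovery.zen.minimum_master_nodes=2` or set in the config file.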

tianon added a commit to infosiftr/stackbrew that referenced this issue Apr 18, 2016
- `elasticsearch`: remove EOL 1.3, add 5.0.0-alpha1 (has some seccomp issues; see docker-library/elasticsearch#98)
- `ghost`: 0.7.9
- `java`: 9~b113-1
- `kibana`: 5.0.0-alpha1
- `logstash`: 5.0.0-alpha1
- `mongo`: 3.2.5
- `tomcat`: 7.0.69
@tianon
Member Author

tianon commented May 5, 2016

The saga continues with alpha2:

Exception in thread "main" java.lang.RuntimeException: bootstrap checks failed
initial heap size [268435456] not equal to maximum heap size [1073741824]; this can cause resize pauses and prevents mlockall from locking the entire heap
max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]
    at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:93)
    at org.elasticsearch.bootstrap.BootstrapCheck.check(BootstrapCheck.java:66)
    at org.elasticsearch.bootstrap.Bootstrap$5.validateNodeBeforeAcceptingRequests(Bootstrap.java:191)
    at org.elasticsearch.node.Node.start(Node.java:323)
    at org.elasticsearch.bootstrap.Bootstrap.start(Bootstrap.java:206)
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:269)
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:111)
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:106)
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:91)
    at org.elasticsearch.cli.Command.main(Command.java:53)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:74)
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:67)
Refer to the log for complete error details.
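
The `vm.max_map_count` part of this check can be reproduced before starting a container. This is a rough sketch, assuming a Linux host where the value is readable at `/proc/sys/vm/max_map_count`; the function and variable names are hypothetical, and 262144 is the minimum quoted in the error above.

```shell
#!/bin/sh
# Minimum taken from the bootstrap check message above.
REQUIRED_MAX_MAP_COUNT=262144

# Compare a vm.max_map_count value against the required minimum.
# With no argument, the current value is read from /proc (Linux only).
check_max_map_count() {
    current="${1:-$(cat /proc/sys/vm/max_map_count 2>/dev/null)}"
    required="${2:-$REQUIRED_MAX_MAP_COUNT}"
    if [ -z "$current" ]; then
        echo "unknown: could not read vm.max_map_count"
        return 2
    fi
    if [ "$current" -ge "$required" ]; then
        echo "ok: vm.max_map_count=$current"
    else
        echo "too low: vm.max_map_count=$current, need at least $required"
        return 1
    fi
}
```

Running `check_max_map_count` with no arguments on the Docker host reports whether the current value would satisfy the check.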

@tianon
Member Author

tianon commented May 6, 2016

Ok, we might want to add -E es.bootstrap.seccomp=false to our default config to overcome the seccomp filtering issue, but alpha3 of 5.0 might include -E es.bootstrap.ignore_system_bootstrap_checks=true, if we're lucky. 😅

@tianon
Member Author

tianon commented May 6, 2016

(At least for our tests -- I don't think our default config should reasonably include es.bootstrap.ignore_system_bootstrap_checks.)

@tianon
Member Author

tianon commented May 6, 2016

(from elastic/elasticsearch@8e178c4)

@xificurC

xificurC commented May 9, 2016

@tianon hi, the error you show from alpha2, how can one fix that? Running the current image gives me the same error.

@tianon
Member Author

tianon commented May 9, 2016

I think adjusting vm.max_map_count is going to have to be something that happens on the host system.

I'm not sure yet how to overcome the heap size check properly 😞

@tianon
Member Author

tianon commented May 9, 2016

$ docker run -it --rm --sysctl vm.max_map_count=262144 elasticsearch:5.0 -E es.bootstrap.seccomp=false
invalid value "vm.max_map_count=262144" for flag --sysctl: sysctl 'vm.max_map_count=262144' is not whitelisted
See 'docker run --help'.

@xificurC

xificurC commented May 9, 2016

I see, so the alpha2 image is unusable right now. I will look around tomorrow to see if I can be of any help.

@tobstarr

@xificurC I increased max_map_count on my docker host via sudo sysctl -w vm.max_map_count=262144 and then started elasticsearch using docker run -e ES_JAVA_OPTS="-Xms1g -Xmx1g" elasticsearch:5.0.0-alpha2 and it seems to work.

Without the ES_JAVA_OPTS elasticsearch is complaining about

initial heap size [268435456] not equal to maximum heap size [1073741824]; this can cause resize pauses and prevents mlockall from locking the entire heap

and then just crashes.
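
The two numbers in that message are byte counts: 268435456 is 256 MiB and 1073741824 is 1 GiB, so the check fails because the initial and maximum heap differ. A small sketch of helpers that make this visible and build matching flags; both function names are hypothetical, and the `docker run` line is left as a comment because it needs a Docker host.

```shell
#!/bin/sh
# Convert a byte count (as printed by the bootstrap check) to MiB.
bytes_to_mib() {
    bytes="${1:-0}"
    echo $(( bytes / 1048576 ))
}

# Build JVM heap flags with identical initial (-Xms) and maximum (-Xmx)
# sizes, which is what the heap size check asks for.
es_java_opts() {
    size="${1:-1g}"
    printf '%s\n' "-Xms${size} -Xmx${size}"
}

# Usage (requires Docker, not executed here):
#   docker run -e ES_JAVA_OPTS="$(es_java_opts 1g)" elasticsearch:5.0.0-alpha2
```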

@xificurC

@tobstarr Thanks, these changes make alpha2 work

@jmreicha

jmreicha commented Jun 3, 2016

Just ran across this as well. Is there a fix or a workaround?

I am using the Docker beta app and don't know of a way to adjust vm.max_map_count.

@owjprice

@jmreicha I am running into the same issue and using the Beta app as well. Did you find a workaround for this?

@jmreicha

@owjprice No I didn't spend much time on it, I just reverted back to 2.3 for now.

@tianon
Member Author

tianon commented Jun 10, 2016

I'm able to launch alpha3 successfully with the following, although it's far from ideal:

$ docker run -it --rm -e ES_JAVA_OPTS='-Xms1g -Xmx1g' elasticsearch:5 -E bootstrap.ignore_system_bootstrap_checks=true
....

@dror-g

dror-g commented Jun 10, 2016

I can confirm @tianon's command works, but the container seems to hang after a while.

RichardScothern pushed a commit to RichardScothern/official-images that referenced this issue Jun 14, 2016
- `elasticsearch`: remove EOL 1.3, add 5.0.0-alpha1 (has some seccomp issues; see docker-library/elasticsearch#98)
- `ghost`: 0.7.9
- `java`: 9~b113-1
- `kibana`: 5.0.0-alpha1
- `logstash`: 5.0.0-alpha1
- `mongo`: 3.2.5
- `tomcat`: 7.0.69
@nguoianphu

@tianon, I'm sorry for commenting on a closed issue, but I'm using Docker Toolbox on Windows. I can run sysctl -w vm.max_map_count=262144, but the setting is reset each time I restart my Docker host machine. Is there any way to deal with it? :(

@pires

pires commented Oct 31, 2016

bootstrap.ignore_system_bootstrap_checks doesn't exist in 5.0.0.

[2016-10-31T13:50:41,262][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalArgumentException: unknown setting [bootstrap.ignore_system_bootstrap_checks] please check that any required plugins are installed, or check the breaking changes documentation for removed settings
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:116) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:103) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.cli.SettingCommand.execute(SettingCommand.java:54) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:96) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.cli.Command.main(Command.java:62) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:80) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:73) ~[elasticsearch-5.0.0.jar:5.0.0]
Caused by: java.lang.IllegalArgumentException: unknown setting [bootstrap.ignore_system_bootstrap_checks] please check that any required plugins are installed, or check the breaking changes documentation for removed settings
    at org.elasticsearch.common.settings.AbstractScopedSettings.validate(AbstractScopedSettings.java:271) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.common.settings.AbstractScopedSettings.validate(AbstractScopedSettings.java:239) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.common.settings.SettingsModule.<init>(SettingsModule.java:138) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.node.Node.<init>(Node.java:311) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.node.Node.<init>(Node.java:220) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:191) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:191) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:286) ~[elasticsearch-5.0.0.jar:5.0.0]
    at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:112) ~[elasticsearch-5.0.0.jar:5.0.0]
    ... 6 more

@rajasoun

rajasoun commented Nov 1, 2016

Increase max_map_count on the host system (please note: not inside the container) using the following command:
sudo sysctl -w vm.max_map_count=262144

I have added this to the Dockerfile itself, which fixed the problem.

@tianon
Member Author

tianon commented Nov 1, 2016

@nguoianphu with Docker Toolbox, your VM is likely boot2docker, so you'll want to add it to /var/lib/boot2docker/profile so that it runs at boot; if you're using Docker for Mac or Docker for Windows, I believe their settings have a place to add additional sysctl values for the VM host 👍
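
On a plain Linux Docker host (not boot2docker), the usual way to make a sysctl value survive reboots is a drop-in file under `/etc/sysctl.d/`. The sketch below only writes such a file. The function name and the example file name are hypothetical, and whether a given distribution reads that directory is an assumption to verify.

```shell
#!/bin/sh
# Write a sysctl drop-in line for vm.max_map_count to the given file.
# The caller chooses the path, so this can be tried on a scratch file first.
write_max_map_count_conf() {
    conf_file="${1:-}"
    value="${2:-262144}"
    if [ -z "$conf_file" ]; then
        echo "usage: write_max_map_count_conf <file> [value]" >&2
        return 1
    fi
    printf 'vm.max_map_count=%s\n' "$value" > "$conf_file"
}

# Usage on the host (needs root; the file name is only an example):
#   write_max_map_count_conf /etc/sysctl.d/99-elasticsearch.conf
#   sysctl --system    # reload sysctl configuration files
```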

@codefromthecrypt

hey, folks. just chiming in with an opinion.

Elasticsearch is (or was) an extremely easy storage setup. No weird knowledge required. I'd highly recommend someone from Elasticsearch try to figure out a pragmatic update to ES itself which allows it to work with "listen all" and without requiring host modifications. Otherwise, downstream users, especially those with limited experience, will perceive Elasticsearch as expert-only and/or fail. Being able to quickly start without wizard skills is probably a part of ES's success and something that it should be able to retain.

@nubunto

nubunto commented Nov 29, 2016

I also noticed that a few of the options (e.g. default.http.port) are also missing from 5.0. Do we have a changelog on those changes?

@tianon
Member Author

tianon commented Nov 29, 2016

@nubunto I think https://www.elastic.co/guide/en/elasticsearch/reference/5.0/es-release-notes.html might be what you're looking for.

https://www.elastic.co/guide/en/elasticsearch/reference/5.0/breaking-changes-5.0.html is probably also useful for what you're looking for.

@aliasdhacker

aliasdhacker commented Dec 12, 2016

sysctl -w vm.max_map_count=262144

Worked for me, thank you very much.

First try: (Failure)

elasticsearch | [2016-12-12T10:31:36,346][INFO ][o.e.n.Node ] [Fovo3av] starting ...
elasticsearch | [2016-12-12T10:31:36,730][INFO ][o.e.t.TransportService ] [Fovo3av] publish_address {172.19.0.3:9300}, bound_addresses {[::]:9300}
elasticsearch | [2016-12-12T10:31:36,736][INFO ][o.e.b.BootstrapCheck ] [Fovo3av] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
elasticsearch | ERROR: bootstrap checks failed
elasticsearch | max virtual memory areas vm.max_map_count [65530] likely too low, increase to at least [262144]
elasticsearch | [2016-12-12T10:31:36,777][INFO ][o.e.n.Node ] [Fovo3av] stopping ...

Second try after updating vm.max_map_count: (Success)

elasticsearch | [2016-12-12T10:34:54,902][INFO ][o.e.n.Node ] [Fovo3av] initialized
elasticsearch | [2016-12-12T10:34:54,909][INFO ][o.e.n.Node ] [Fovo3av] starting ...
elasticsearch | [2016-12-12T10:34:55,368][INFO ][o.e.t.TransportService ] [Fovo3av] publish_address {172.19.0.3:9300}, bound_addresses {[::]:9300}
elasticsearch | [2016-12-12T10:34:55,374][INFO ][o.e.b.BootstrapCheck ] [Fovo3av] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
elasticsearch | [2016-12-12T10:34:56,978][WARN ][o.e.m.j.JvmGcMonitorService] [Fovo3av] [gc][young][1][2] duration [1.2s], collections [1]/[1.9s], total [1.2s]/[1.8s], memory [217.2mb]->[74.1mb]/[1.9gb], all_pools {[young] [198.3mb]->[36.4mb]/[266.2mb]}{[survivor] [18.9mb]->[33.1mb]/[33.2mb]}{[old] [0b]->[8.1mb]/[1.6gb]}

@ghost

ghost commented Dec 12, 2016

This is still not solved...

For those who "solve" this by modifying host settings: seriously? What is the point of running this in a container if you have to modify host settings?

As of right now the master is completely unusable with docker-compose. Docker compose doesn't allow one to specify sysctl values, and this ridiculous requirement on the ES part is making this a no-go.

@tianon
Member Author

tianon commented Dec 12, 2016

@least-olegs the value in question has to be modified on the host -- it is not scoped to any namespace, and Elasticsearch 5.x is more aggressive about enforcing it than previous versions were. There's not really anything further we can do from the Dockerization itself to solve this.

@codefromthecrypt

codefromthecrypt commented Dec 13, 2016 via email

@ghost

ghost commented Dec 15, 2016

@tianon you are plain wrong. For these reasons:

  1. This value is mindless nonsense and is not a real requirement. Some idiot in Elasticsearch discovered there was a setting they didn't really understand the purpose of, and decided to make it mandatory.
  2. And, if you cannot do anything: you cannot close the ticket. Just go home and do something else. Your code doesn't work and you have no way of making it work, so how is the problem solved if it isn't?
  3. But, in fact, you can make it work! Just don't use docker-compose, or roll your own version of docker-compose. The problem is not in the Docker, however crappy that program is. The problem is simply that you cannot specify a configuration parameter through the configuration tool that you chose. The answer? - ditch / modify your tool. It's that simple.

@BretFisher

@least-olegs FYI you're in a docker repo complaining about ES's new requirement to a docker person.

@ain

ain commented Dec 19, 2016

I've just reproduced on elasticsearch:5 and elasticsearch:5.1.

Could anyone please explain why it is closed? Host modification cannot be a way to run a container.

@tianon
Member Author

tianon commented Dec 19, 2016

Upstream's startup checks in 5.x are now more explicit about the host environment settings. There is nothing we can do from the image to configure those for you, which is why the image documentation now explicitly mentions configuring those bits on your host yourself. The settings in question are not namespaced to containers, and thus cannot be adjusted by a container, or by a docker run flag, and must be set on the host. There is nothing else we, the Docker image maintainers, can do here.

@docker-library docker-library locked and limited conversation to collaborators Dec 19, 2016
@tianon
Member Author

tianon commented Dec 19, 2016

See also elastic/elasticsearch#4978 (comment) for some notes from upstream on why they've made this change:

As of 5.0, Elasticsearch will not start in production mode if vm.max_map_count is not high enough. Silencing the warning when we are not able to apply the MAX_MAP_COUNT setting on openvz will just make debugging the issue harder, as it will not be obvious why the setting is not being applied. Instead, if you are running on a system where you cannot set vm.max_map_count, but it is set to be high enough for Elasticsearch's bootstrap checks, then you can silence the warning by removing the MAX_MAP_COUNT setting. If the value on your system is NOT high enough, then your cluster is going to crash and burn at some stage and you will lose data. Instead of trying to work around it with hacks like #4978 (comment), you should either speak to your sysadmin to configure vm.max_map_count correctly, or move to a platform where you can set it.

@tianon
Member Author

tianon commented Jan 11, 2017

See #153 for a possible pending fix to this issue (essentially no longer having the image in "production" mode by default). 👍

@yosifkit
Member

We have merged a fix in #153 so that elasticsearch will run anywhere by default (by not needing to run the bootstrap checks). The minimal set of flags to make it work for clustering is the following:

$ docker run -d --name elas elasticsearch -Etransport.host=0.0.0.0 -Ediscovery.zen.minimum_master_nodes=1
$ # note, running it with these flags would require the bootstrap checks to pass (vm.max_map_count, etc)

These same configs could also be put in a custom config file to replace what is currently there.

This will be available on the Docker Hub once a PR to official-images has been made and merged (hopefully later today).
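
As a sketch of the config-file route mentioned above, the two `-E` flags map to two lines of `elasticsearch.yml`. The helper below writes such a file to a path of the caller's choosing. The function name is hypothetical, and the mount path shown in the usage comment is an assumption about the image layout that should be checked against the image documentation.

```shell
#!/bin/sh
# Write a minimal elasticsearch.yml equivalent to:
#   -Etransport.host=0.0.0.0 -Ediscovery.zen.minimum_master_nodes=1
# write_cluster_config is an illustrative name, not part of the image.
write_cluster_config() {
    config_file="${1:-}"
    if [ -z "$config_file" ]; then
        echo "usage: write_cluster_config <file>" >&2
        return 1
    fi
    cat > "$config_file" <<'EOF'
transport.host: 0.0.0.0
discovery.zen.minimum_master_nodes: 1
EOF
}

# Usage (requires Docker; the container path is an assumption):
#   write_cluster_config "$PWD/elasticsearch.yml"
#   docker run -d --name elas \
#     -v "$PWD/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml" \
#     elasticsearch
```

As with the flags, setting `transport.host` this way puts the node in "production" mode, so the bootstrap checks (vm.max_map_count, etc.) would need to pass.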
