This repository has been archived by the owner on Nov 7, 2018. It is now read-only.

Upgrade to ElasticSearch v5.0 #60

Closed
aadu opened this issue Oct 26, 2016 · 31 comments

Comments

@aadu

aadu commented Oct 26, 2016

No description provided.

@pires
Owner

pires commented Oct 31, 2016

Unfortunately, can't do. Running Elasticsearch 5.0.0 on Docker is proving to be really hard. I just can't support having users changing their Kubernetes nodes configuration with sysctl, as it seems too cumbersome.

@AtzeDeVries

I think it would still be good to already have a 5.0.0 branch. As far as I know, the sysctl issue is the only blocking issue.

If you install ES from a .deb or .rpm package on a machine, it automatically updates the vm.max_map_count setting. Check:

https://www.elastic.co/guide/en/elasticsearch/reference/5.0/vm-max-map-count.html

So there is probably no way around it.
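
For anyone who wants to see where a node currently stands, here is a minimal sketch that compares the host value with 262144, the minimum the linked Elasticsearch page asks for. The function name `below_required` is just an illustration, and the script only reads the setting; it does not change anything.

```shell
#!/bin/sh
# Minimum vm.max_map_count that the linked Elasticsearch 5.0 page asks for.
REQUIRED=262144

# Hypothetical helper: succeeds when the given value is below the minimum.
below_required() {
  [ "$1" -lt "$REQUIRED" ]
}

# Read the current host value; fall back to 0 where /proc is not available.
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)

if below_required "$current"; then
  echo "vm.max_map_count=$current is too low; as root: sysctl -w vm.max_map_count=$REQUIRED"
else
  echo "vm.max_map_count=$current is sufficient"
fi
```

The `sysctl -w` change does not survive a reboot, which is why a host-level fix alone is awkward on nodes that get replaced.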

@webwurst

webwurst commented Nov 8, 2016

We are trying to set this via init-container: giantswarm/kubernetes-elastic-stack/manifests/elasticsearch-deployment.yaml#L13

Obviously not a nice thing to change settings on the host by a privileged container. But seems to work for now.

@aeneaswiener

aeneaswiener commented Nov 9, 2016

@pires the init-container solution described above is acceptable for me. Would you consider putting 5.0.0 on a feature branch for now as suggested by @AtzeDeVries?

@pires
Owner

pires commented Nov 9, 2016

Will have to take some time to try it out on a few different setups, namely GKE.

@aeneaswiener

@pires I would be happy to test on GKE also if you provide a 5.0.0 image.

@pires
Owner

pires commented Nov 9, 2016

OK, I have the changes to be pushed. Will do in the next few hours and ping you.

@puja108
Contributor

puja108 commented Nov 9, 2016

@pires the initContainer was the only thing that came to my mind that would run whether or not the host got reset. I didn't want a solution that was bound to break in case of an "ephemeral host", but if you find another solution I would be glad to hear about it and test it.

@pires
Owner

pires commented Nov 9, 2016

I have released an image but I haven't been able to test it - I'm revamping pires/kubernetes-vagrant-coreos-cluster and it's broken as we speak.

@aeneaswiener

I am testing image: quay.io/pires/docker-elasticsearch-kubernetes:5.0.0

but all the ES pods error out with the following log message:

$ kubectl logs -f es-0-client-dgadp
su-exec: /elasticsearch/bin/elasticsearch: Text file busy

Has anyone been able to successfully run @pires's new version yet?

@aeneaswiener

aeneaswiener commented Nov 14, 2016

I think the error may be related to the installation of the elasticsearch-cloud-kubernetes plugin via

RUN /elasticsearch/bin/elasticsearch-plugin install io.fabric8:elasticsearch-cloud-kubernetes:5.0.0 --verbose

which results in the following error in the output of the docker image build:

- Plugin information:
Name: discovery-kubernetes
Description: Elasticsearch Kubernetes cloud plugin
Version: 5.0.0
 * Classname: io.fabric8.elasticsearch.plugin.discovery.kubernetes.KubernetesDiscoveryPlugin
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.io.FilePermission <<ALL FILES>> read,execute,readlink
* java.lang.RuntimePermission accessClassInPackage.sun.security.ssl
* java.lang.RuntimePermission accessDeclaredMembers
* java.lang.RuntimePermission setFactory
* java.lang.reflect.ReflectPermission suppressAccessChecks
* java.net.NetPermission getCookieHandler
* java.net.NetPermission getProxySelector
See http://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Though at least part of these warnings appear to have been there already in the 2.x version, see https://hub.docker.com/r/imelnik/docker-elasticsearch-kubernetes/builds/brzpwudjamraxdechd5xnak/

@pires
Owner

pires commented Nov 14, 2016

That's not an error but a warning about the fact the plug-in is requiring additional permissions.

I will have to give this a try. I will do it later this week.

@HTChang

HTChang commented Nov 17, 2016

@pires I saw the 5.0.0 image in pires/docker-elasticsearch-kubernetes.
Is it ready for production use now?

@pires
Owner

pires commented Nov 17, 2016

I was able to run it. I am updating this repo later today, if I'm able to reproduce my local setup on GKE.

@pires
Owner

pires commented Nov 17, 2016

@aeneaswiener it kinda works for me:

$ kubectl get pods
NAME                         READY     STATUS    RESTARTS   AGE
es-master-1133701621-tphop   1/1       Running   0          19m

$ kubectl logs es-master-1133701621-tphop
[2016-11-17T17:32:13,732][WARN ][o.e.c.l.LogConfigurator  ] ignoring unsupported logging configuration file [/elasticsearch/config/logging.yml], logging is configured via [/elasticsearch/config/log4j2.properties]
[2016-11-17T17:32:13,998][INFO ][o.e.n.Node               ] [] initializing ...
[2016-11-17T17:32:14,072][INFO ][o.e.e.NodeEnvironment    ] [uRvMVPn] using [1] data paths, mounts [[/data (/dev/sda1)]], net usable_space [90.2gb], net total_space [98.3gb], spins? [possibly], types [ext4]
[2016-11-17T17:32:14,073][INFO ][o.e.e.NodeEnvironment    ] [uRvMVPn] heap size [1007.3mb], compressed ordinary object pointers [true]
[2016-11-17T17:32:14,074][INFO ][o.e.n.Node               ] [uRvMVPn] node name [uRvMVPn] derived from node ID; set [node.name] to override
[2016-11-17T17:32:14,076][INFO ][o.e.n.Node               ] [uRvMVPn] version[5.0.0], pid[6], build[253032b/2016-10-26T04:37:51.531Z], OS[Linux/3.16.0-4-amd64/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_92-internal/25.92-b14]
[2016-11-17T17:32:15,236][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [aggs-matrix-stats]
[2016-11-17T17:32:15,236][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [ingest-common]
[2016-11-17T17:32:15,237][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [lang-expression]
[2016-11-17T17:32:15,237][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [lang-groovy]
[2016-11-17T17:32:15,237][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [lang-mustache]
[2016-11-17T17:32:15,238][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [lang-painless]
[2016-11-17T17:32:15,239][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [percolator]
[2016-11-17T17:32:15,240][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [reindex]
[2016-11-17T17:32:15,240][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [transport-netty3]
[2016-11-17T17:32:15,241][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded module [transport-netty4]
[2016-11-17T17:32:15,241][INFO ][o.e.p.PluginsService     ] [uRvMVPn] loaded plugin [discovery-kubernetes]
[2016-11-17T17:32:18,445][INFO ][o.e.n.Node               ] [uRvMVPn] initialized
[2016-11-17T17:32:18,446][INFO ][o.e.n.Node               ] [uRvMVPn] starting ...
[2016-11-17T17:32:18,692][INFO ][o.e.t.TransportService   ] [uRvMVPn] publish_address {10.0.2.4:9300}, bound_addresses {10.0.2.4:9300}
[2016-11-17T17:32:18,702][INFO ][o.e.b.BootstrapCheck     ] [uRvMVPn] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[2016-11-17T17:32:22,675][INFO ][o.e.c.s.ClusterService   ] [uRvMVPn] new_master {uRvMVPn}{uRvMVPnnRwOv3EyNyY7PsQ}{uV68ryBGR1ycz5YXs1RVGw}{10.0.2.4}{10.0.2.4:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2016-11-17T17:32:22,682][INFO ][o.e.n.Node               ] [uRvMVPn] started
[2016-11-17T17:32:22,711][INFO ][o.e.g.GatewayService     ] [uRvMVPn] recovered [0] indices into cluster_state

I say kinda because it gets killed from time to time. I used the init-container as per @puja108's instructions.

@pires
Owner

pires commented Nov 17, 2016

Actually, I just realized I'm having the same issue as you, but after a while and a couple of pod restarts, it just works!

@thuandt

thuandt commented Nov 17, 2016

If I have a cluster running 2.4 in GKE, how can I upgrade it to 5.0?

@pires
Owner

pires commented Nov 21, 2016

To anyone trying this, I released quay.io/pires/docker-elasticsearch-kubernetes:5.0.1. The "/elasticsearch/bin/elasticsearch: Text file busy" problem is still present but occurs randomly. Help is much needed!

@thuandt

thuandt commented Nov 22, 2016

@pires I tried the new image in my GKE cluster and everything is working fine.
The only small change is in the environment:

        - name: NODE_MASTER
          value: "true"
        - name: NODE_DATA
          value: "false"
        - name: NODE_INGEST
          value: "false"
        - name: HTTP_ENABLE
          value: "false"
        - name: ES_JAVA_OPTS
          value: "-Xms256m -Xmx256m"

and, of course, an init container to set vm.max_map_count.

@pires
Owner

pires commented Nov 22, 2016

How many instances of each do you have? How much memory did you set? Also, is your init container the one suggested above?

@puja108
Contributor

puja108 commented Nov 22, 2016

Just FYI, as ES storage can get pretty big with time, we also added a Scheduled Job running the ES Curator once a day to clean up old indices: https://github.com/giantswarm/kubernetes-elastic-stack/blob/master/manifests/curator-scheduledjob.yaml

The config is kept in a Config Map: https://github.com/giantswarm/kubernetes-elastic-stack/blob/master/manifests/curator-configmap.yaml

@pires
Owner

pires commented Nov 22, 2016

@puja108 that's really cool and it would be awesome to have that as an add-on to this repo, if you're willing to contribute it.

@puja108
Contributor

puja108 commented Nov 22, 2016

Will do a PR

@pires
Owner

pires commented Nov 24, 2016

Has anyone tried the latest 5.0.1 image and found it to work or any issues?

@pires pires closed this as completed in 63110c4 Nov 24, 2016
@vganapathy1

vganapathy1 commented Nov 25, 2016

I'm able to deploy and the cluster comes up properly. But when I scale the data nodes (or any node, for that matter), it fails with the message below:

Caused by: java.lang.IllegalStateException: failed to obtain node locks, tried [[/data/data/myesdb]] with lock id [0]; maybe these locations are not writable or multiple nodes were started without increasing [node.max_local_storage_nodes] (was [1])?

Can you please add an environment variable for "node.max_local_storage_nodes"?

@pires
Owner

pires commented Nov 25, 2016

Sure. Can you open an issue on github.com/pires/docker-elasticsearch?

@kmvenkatesh

kmvenkatesh commented Nov 26, 2016 via email

@devth

devth commented Apr 17, 2017

The init container hack to set vm.max_map_count no longer appears to work on GKE running K8s 1.6.x. It errors with:

sysctl: error setting key 'vm.max_map_count': Read-only file system

Trying to figure out a workaround.

@devth

devth commented Apr 17, 2017

False alarm. Destroying and recreating pods appears to have applied the vm.max_map_count setting correctly.

@attwad

attwad commented Jul 30, 2017

For posterity, with Kubernetes 1.6 it seems init containers moved out of beta, and the syntax is now:

spec:
  initContainers:
    - name: init-sysctl
      image: busybox
      imagePullPolicy: IfNotPresent
      command: ["sysctl", "-w", "vm.max_map_count=262144"]
      securityContext:
        privileged: true
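
To confirm the init container actually took effect, one option is to read the value from inside the running pods. This is only a rough sketch: the `SELECTOR` variable and the `classify` helper are hypothetical names, and the label you pass has to match whatever your own manifests set.

```shell
#!/bin/sh
# Value the init container above writes with sysctl -w.
REQUIRED=262144

# Hypothetical helper: label a value read from a pod as "ok" or "too low".
classify() {
  if [ "$1" -ge "$REQUIRED" ]; then
    echo "ok"
  else
    echo "too low"
  fi
}

# Only query the cluster when SELECTOR is set, for example:
#   SELECTOR=component=elasticsearch sh check-pods.sh
# (the label is an assumption; use the labels from your own manifests).
if [ -n "${SELECTOR:-}" ]; then
  for pod in $(kubectl get pods -l "$SELECTOR" -o name); do
    name=${pod##*/}   # strip the resource prefix kubectl prints before the name
    value=$(kubectl exec "$name" -- cat /proc/sys/vm/max_map_count 2>/dev/null)
    echo "$name: ${value:-unknown} ($(classify "${value:-0}"))"
  done
fi
```

If a pod still reports a low value, deleting it so it gets recreated reruns the init container, which matches what was reported above for GKE.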

@pires
Owner

pires commented Jul 31, 2017

Done.
