Elasticsearch tarball install fails to start under systemd #585

Closed
iancward opened this issue Jun 15, 2017 · 15 comments
Labels
Bug Something isn't working

Comments

@iancward

We used this cookbook to deploy Elasticsearch 5.3.0 from a tarball on RHEL7. It deploys both a SysV init script (which sets vm.max_map_count) and a systemd unit file (which doesn't set vm.max_map_count). When Chef goes to start the service, Elasticsearch reports that vm.max_map_count is too low and fails to start.

bootstrap checks failed
max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

If I run /sbin/service elasticsearch start, the command gets redirected to systemctl (part of systemd) and the process fails to start. However, if I explicitly run /etc/init.d/elasticsearch start, the process will start, as that script uses sysctl to set vm.max_map_count.

I've been trying to find a way to set vm.max_map_count via systemd and have not found anything. This cookbook might need to set the max_map_count by some other means before it tries to start the service.
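
For anyone landing here with the same bootstrap failure, one quick way to confirm it is to compare the live kernel value against the minimum from the error message. What follows is a minimal Ruby sketch, assuming the value is readable at /proc/sys/vm/max_map_count (the usual procfs location for sysctl values). The helper name `max_map_count_ok?` is made up for illustration and is not part of the cookbook.

```ruby
# Minimum taken from the bootstrap check message quoted above.
REQUIRED_MAX_MAP_COUNT = 262_144

# Hypothetical helper (not part of the cookbook): returns true when the
# integer stored in `path` is at least `required`.
def max_map_count_ok?(path = '/proc/sys/vm/max_map_count',
                      required = REQUIRED_MAX_MAP_COUNT)
  File.read(path).strip.to_i >= required
end

# Only report when the procfs entry exists (i.e. on Linux).
if File.exist?('/proc/sys/vm/max_map_count')
  puts "vm.max_map_count ok? #{max_map_count_ok?}"
end
```

On a host showing the error above, the helper would be expected to return false until the value is raised.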

@iancward
Author

iancward commented Jun 15, 2017

Relevant info regarding package install taking care of this: elastic/elasticsearch#21507

@martinb3
Contributor

Hi @iancward -- thanks for the heads up! We're actually following the upstream packaging as much as possible. Looks like you've found that as well ^. We'll update our files based on the latest from upstream as soon as they are released. I'll leave this open to remind us to do so 👍

martinb3 added the Bug and upstream labels Jun 15, 2017
@randyrue

Sorry to throw a likely newbie question into a possibly only partially related thread.

I'm trying to set this up using the tarball install, for an openSUSE build that includes both the init.d and systemd frameworks. By default it appears to be trying to install an init script, which fails, and in any case I'd like to use systemd.

I can't figure out a working syntax for the elasticsearch_service part of my wrapper that will configure the service under systemd. I've been reading every part of the cookbook and crawling the intertubes for two days, found some references to specifying my own init script, but no luck on how to force a systemd install or specify a services file (looks like the /templates/default/systemd_unit.erb would work OK in any case).

I'd be grateful for any help including an RTFM if you'll just send me a link to the M I need to FR.

Randy

@martinb3
Contributor

Hi @randyrue -- the deb and rpm packages install both, and this cookbook reproduces that behavior. As far as actually starting or stopping Elasticsearch with a specific one -- this cookbook relies on Chef to determine what init system to use. What version of openSUSE are we talking about? I could compare and see what behavior I see...
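
When a host carries both /etc/init.d and systemd, it can help to check which one actually booted the machine. The sketch below uses the directory test described in sd_booted(3), which checks whether /run/systemd/system exists. This is an assumption about a useful heuristic, not a description of how Chef itself picks a service provider, and the helper name `systemd_booted?` is hypothetical.

```ruby
# Hypothetical helper: true when the host was booted with systemd,
# using the directory check described in sd_booted(3).
def systemd_booted?(marker = '/run/systemd/system')
  File.directory?(marker)
end

puts "booted with systemd: #{systemd_booted?}"
puts "/etc/init.d present: #{File.directory?('/etc/init.d')}"
```

Both lines can print true on the same host, which is the situation described for openSUSE Leap in this thread.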

@randyrue

randyrue commented Aug 15, 2017

Hi Martin - openSUSE Leap 42.2. I've inherited this effort from someone who previously found a workaround by temporarily renaming /etc/init.d and then restoring it after the elasticsearch_service call. I'm thinking it was meant to be a quick hack and now I'm trying to do it right.
Without the hack, chef-client fails when it can't find a suitable init script template:

ERROR: elasticsearch_service[elasticsearch] (scharp_base_server::elasticsearch line 70) had an error: Chef::Exceptions::FileNotFound: template[/etc/init.d/elasticsearch] (/var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb line 32) had an error: Chef::Exceptions::FileNotFound: Cookbook 'elasticsearch' (3.0.5) does not contain a file at any of these locations:
templates/opensuseleap-42.2/initscript.erb
templates/opensuseleap/initscript.erb
templates/default/initscript.erb
templates/initscript.erb
This cookbook does contain: ['templates/default/elasticsearch.in.sh.erb','templates/default/elasticsearch.yml.erb','templates/default/jvm_options.erb','templates/amazon/initscript.erb','templates/redhat/initscript.erb','templates/centos/initscript.erb','templates/debian/initscript.erb','templates/default/log4j2.properties.erb','templates/default/systemd_unit.erb','templates/ubuntu/initscript.erb','templates/oracle/initscript.erb']
[2017-08-15T10:55:04-07:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

martinb3 reopened this Aug 30, 2017
@martinb3
Contributor

Hi folks --

@randyrue please try the v3.3.0 release, and set init_source on elasticsearch_service to nil, and it'll skip that whole section that you're trying to work around. You can do the same for systemd_source too, if you want to skip that.

@iancward still going to leave this open -- it looks like we need to do some of the same postinst steps that the packaging does for systemd. In the meantime, a workaround would be to run those same commands yourself:

  1. Add vm.max_map_count = 262144 to /usr/lib/sysctl.d/elasticsearch.conf
  2. Run /usr/lib/systemd/systemd-sysctl
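
The two steps above could be scripted roughly as follows. This is only a sketch in plain Ruby, not what the cookbook or the upstream packaging ships. The helper names `write_sysctl_conf` and `apply_sysctl` are hypothetical, and both need root when pointed at the real paths.

```ruby
require 'fileutils'

# Paths taken from the two workaround steps above.
SYSCTL_CONF = '/usr/lib/sysctl.d/elasticsearch.conf'
SYSTEMD_SYSCTL = '/usr/lib/systemd/systemd-sysctl'

# Step 1 (hypothetical helper): write the sysctl drop-in, return its path.
def write_sysctl_conf(path = SYSCTL_CONF, value = 262_144)
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, "vm.max_map_count = #{value}\n")
  path
end

# Step 2 (hypothetical helper): ask systemd to apply the sysctl.d files.
# Returns true on success, false or nil if the command fails or is missing.
def apply_sysctl(command = SYSTEMD_SYSCTL)
  system(command)
end
```

Calling `write_sysctl_conf` and then `apply_sysctl` as root mirrors steps 1 and 2. Because the drop-in lives under sysctl.d, the setting should also be picked up again on the next boot.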

@randyrue

randyrue commented Sep 7, 2017

Using 3.3.0 and my service section now reads:

elasticsearch_service 'elasticsearch' do
  init_source 'nil' # skip the init.d script setup
  service_actions [:enable, :start]
end

Do I have that syntax right?

Still fails with:
[2017-09-07T09:57:38-07:00] ERROR: Running exception handlers
Running handlers complete
[2017-09-07T09:57:38-07:00] ERROR: Exception handlers complete
Chef Client failed. 6 resources updated in 31 seconds
[2017-09-07T09:57:38-07:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2017-09-07T09:57:38-07:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2017-09-07T09:57:38-07:00] ERROR: elasticsearch_service[elasticsearch] (scharp_base_server::elasticsearch line 71) had an error: Chef::Exceptions::FileNotFound: template[/etc/init.d/elasticsearch] (/var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb line 33) had an error: Chef::Exceptions::FileNotFound: Cookbook 'elasticsearch' (3.3.0) does not contain a file at any of these locations:
templates/opensuseleap-42.2/nil
templates/opensuseleap/nil
templates/default/nil
templates/nil

This cookbook does contain: ['templates/default/elasticsearch.in.sh.erb','templates/default/elasticsearch.yml.erb','templates/amazon/initscript.erb','templates/default/jvm_options.erb','templates/debian/initscript.erb','templates/default/log4j2.properties.erb','templates/redhat/initscript.erb','templates/centos/initscript.erb','templates/default/systemd_unit.erb','templates/ubuntu/initscript.erb','templates/oracle/initscript.erb']
[2017-09-07T09:57:38-07:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

@martinb3
Contributor

Hi @randyrue -- no, 'nil' in quotes is a string. You should pass just nil, without quotes. Thanks!
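
The difference is easy to see in plain Ruby. In the sketch below, `skip_init_script?` is a hypothetical stand-in for a "skip this template" check, not the cookbook's actual provider code. It only shows that the quoted form is an ordinary String (which then gets treated as a template name to look up), while the bare form is Ruby's nil object.

```ruby
# Hypothetical stand-in for a "skip this template" check; this is not
# the cookbook's real provider code.
def skip_init_script?(init_source)
  init_source.nil?
end

puts skip_init_script?('nil') # a String with the text "nil": prints false
puts skip_init_script?(nil)   # the nil object: prints true
```

This matches the error output in this thread, where the template lookup searches for a file literally named nil.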

@randyrue

randyrue commented Sep 11, 2017

Still no love, sorry...

Removed the ticks and still get:
`* template[/etc/init.d/elasticsearch] action create

  ================================================================================
  Error executing action `create` on resource 'template[/etc/init.d/elasticsearch]'
  ================================================================================

  Chef::Exceptions::FileNotFound
  ------------------------------
  Cookbook 'elasticsearch' (3.3.0) does not contain a file at any of these locations:
    templates/opensuseleap-42.2/nil
    templates/opensuseleap/nil
    templates/default/nil
    templates/nil

  This cookbook _does_ contain: ['templates/debian/initscript.erb','templates/centos/initscript.erb','templates/amazon/initscript.erb','templates/oracle/initscript.erb','templates/default/jvm_options.erb','templates/default/log4j2.properties.erb','templates/default/systemd_unit.erb','templates/default/elasticsearch.yml.erb','templates/redhat/initscript.erb','templates/ubuntu/initscript.erb','templates/default/elasticsearch.in.sh.erb']

  Cookbook Trace:
  ---------------
  /var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb:45:in `action_configure'

  Resource Declaration:
  ---------------------
  # In /var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb

   33:       init_r = template "/etc/init.d/#{new_resource.service_name}" do
   34:         source new_resource.init_source
   35:         cookbook new_resource.init_cookbook
   36:         owner 'root'
   37:         mode '0755'
   38:         variables(
   39:           # we need to include something about #{progname} fixed in here.
   40:           program_name: new_resource.service_name
   41:         )
   42:         only_if { ::File.exist?('/etc/init.d') }
   43:         action :nothing
   44:       end
   45:       init_r.run_action(:create)

  Compiled Resource:
  ------------------
  # Declared in /var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb:33:in `action_configure'

  template("/etc/init.d/elasticsearch") do
    action [:nothing]
    default_guard_interpreter :default
    source "nil"
    cookbook "elasticsearch"
    variables {:program_name=>"elasticsearch"}
    declared_type :template
    cookbook_name "scharp_base_server"
    mode "0755"
    path "/etc/init.d/elasticsearch"
    owner "root"
    group nil
    verifications []
    only_if { #code block }
  end

  System Info:
  ------------
  chef_version=13.3.42
  platform=opensuseleap
  platform_version=42.2
  ruby=ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
  program_name=chef-client worker: ppid=1329;start=10:15:07;
  executable=/opt/chef/bin/chef-client


================================================================================
Error executing action `configure` on resource 'elasticsearch_service[elasticsearch]'
================================================================================

Chef::Exceptions::FileNotFound
------------------------------
template[/etc/init.d/elasticsearch] (/var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb line 33) had an error: Chef::Exceptions::FileNotFound: Cookbook 'elasticsearch' (3.3.0) does not contain a file at any of these locations:
  templates/opensuseleap-42.2/nil
  templates/opensuseleap/nil
  templates/default/nil
  templates/nil

This cookbook _does_ contain: ['templates/debian/initscript.erb','templates/centos/initscript.erb','templates/amazon/initscript.erb','templates/oracle/initscript.erb','templates/default/jvm_options.erb','templates/default/log4j2.properties.erb','templates/default/systemd_unit.erb','templates/default/elasticsearch.yml.erb','templates/redhat/initscript.erb','templates/ubuntu/initscript.erb','templates/default/elasticsearch.in.sh.erb']

Cookbook Trace:
---------------
/var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb:45:in `action_configure'

Resource Declaration:
---------------------
# In /var/chef/cache/cookbooks/scharp_base_server/recipes/elasticsearch.rb

 71: elasticsearch_service 'elasticsearch' do
 72:   init_source 'nil' # skip the init.d script setup
 73:   service_actions [:enable, :start]
 74: end
 75:

Compiled Resource:
------------------
# Declared in /var/chef/cache/cookbooks/scharp_base_server/recipes/elasticsearch.rb:71:in `from_file'

elasticsearch_service("elasticsearch") do
  action [:configure]
  default_guard_interpreter :default
  declared_type :elasticsearch_service
  cookbook_name "scharp_base_server"
  recipe_name "elasticsearch"
  init_source "nil"
  service_actions [:enable, :start]
  service_name "elasticsearch"
end

System Info:
------------
chef_version=13.3.42
platform=opensuseleap
platform_version=42.2
ruby=ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
program_name=chef-client worker: ppid=1329;start=10:15:07;
executable=/opt/chef/bin/chef-client

Running handlers:
[2017-09-11T10:15:33-07:00] ERROR: Running exception handlers
Running handlers complete
[2017-09-11T10:15:33-07:00] ERROR: Exception handlers complete
Chef Client failed. 2 resources updated in 25 seconds
[2017-09-11T10:15:33-07:00] FATAL: Stacktrace dumped to /var/chef/cache/chef-stacktrace.out
[2017-09-11T10:15:33-07:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2017-09-11T10:15:33-07:00] ERROR: elasticsearch_service[elasticsearch] (scharp_base_server::elasticsearch line 71) had an error: Chef::Exceptions::FileNotFound: template[/etc/init.d/elasticsearch] (/var/chef/cache/cookbooks/elasticsearch/libraries/provider_service.rb line 33) had an error: Chef::Exceptions::FileNotFound: Cookbook 'elasticsearch' (3.3.0) does not contain a file at any of these locations:
templates/opensuseleap-42.2/nil
templates/opensuseleap/nil
templates/default/nil
templates/nil

This cookbook does contain: ['templates/debian/initscript.erb','templates/centos/initscript.erb','templates/amazon/initscript.erb','templates/oracle/initscript.erb','templates/default/jvm_options.erb','templates/default/log4j2.properties.erb','templates/default/systemd_unit.erb','templates/default/elasticsearch.yml.erb','templates/redhat/initscript.erb','templates/ubuntu/initscript.erb','templates/default/elasticsearch.in.sh.erb']
[2017-09-11T10:15:33-07:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)`

@martinb3
Contributor

martinb3 commented Sep 12, 2017

Hi @randyrue -- that output shows you're still passing a string (see the source "nil" and 72: init_source 'nil' # skip the init.d script setup lines) to the resource. It needs to be just nil, not a string.

@randyrue

I swear that's what I had :)

I had removed the line and replaced our 'mv /etc/init.d' workaround. When I added "init_source nil" to the service section, and disabled the mv hack, all appears to be well.

Thanks!

@randyrue

randyrue commented Oct 4, 2017

Hi Again,

Found a new issue but still within the area of systemd, don't know if it warrants a new issue or if it's appropriate to put it here.

ES fails to start because it's trying to create a PID file in /var/run/elasticsearch. But adding some lines to my recipe to create that subdir is no good, as newer OSes use tmpfs for /var/run (any changes are lost on reboot).

I added a 'path.pid' => '/var/run' line to the configuration entries in my recipe, and I can see a matching entry in the resulting /etc/elasticsearch/elasticsearch.yml. But the unit file created for systemd still has a hard-coded entry calling for /var/run/elasticsearch/. It looks like the template in the cookbook also hard-codes that path, with no substitution.

What are my options? Rip out the tmpfs /var/run and mount it "real?" (yuck)
Use the "line" cookbook to replace that unit file line? Yuck. Also, I'd have to run (or wait for a run) chef-client after a reboot before I could start ES.

Let me know. And let me know if you want me to delete this and repost as new...
Randy


More to report after some progress:
I've removed the path.pid entry, as on subsequent runs it seemed to be killing the start of ES by being passed along directly to java; I got errors along the lines of "no such argument path.pid".
I'm playing with adding a conf file to /etc/tmpfiles.d to configure the addition of a /var/run/elasticsearch directory on startup, no luck so far.
I've noticed if I run chef-client, /var/run/elasticsearch is being created. But using that as my workaround still requires me to run chef-client after every reboot, before I can start ES. This should be hands-free.

@randyrue

randyrue commented Oct 4, 2017

After some flailing the less than ideal workaround is to add a "file" entry in the recipe that creates /usr/lib/tmpfiles.d/elasticsearch.conf, containing one line:

d /run/elasticsearch 0777 - - - -

Note that the resulting subdir /var/run/elasticsearch is owned by root: our elasticsearch user exists in our LDAP, so the elasticsearch_user call in the recipe fails, and systemd-tmpfiles fails to create the subdir if I specify elasticsearch as the owner. Instead I'm giving it 777 permissions (bleah), which are corrected the first time chef-client runs.
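
For reference, the tmpfiles.d entry above follows the field order documented in tmpfiles.d(5): Type, Path, Mode, User, Group, Age, Argument. The sketch below builds such a line in plain Ruby. The helper name `tmpfiles_dir_entry` is hypothetical, and passing an owner only makes sense where that user resolves at boot, which was not the case with the LDAP account described here.

```ruby
# Hypothetical helper: builds one tmpfiles.d "d" (create directory) entry.
# Field order per tmpfiles.d(5): Type Path Mode User Group Age Argument.
# A "-" leaves a field at its default.
def tmpfiles_dir_entry(path, mode: '0755', user: '-', group: '-')
  ['d', path, mode, user, group, '-', '-'].join(' ')
end

# Reproduces the one-line workaround from the comment above.
puts tmpfiles_dir_entry('/run/elasticsearch', mode: '0777')
```

With a locally defined service account, something like `tmpfiles_dir_entry('/run/elasticsearch', user: 'elasticsearch', group: 'elasticsearch')` would avoid the world-writable mode, assuming systemd-tmpfiles can resolve that user.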

good enough for now

@martinb3
Contributor

martinb3 commented Oct 6, 2017

Hi @randyrue -- please open separate issues for your other questions/follow ups.

@martinb3
Contributor

martinb3 commented Oct 6, 2017

Closing this issue since I pulled down the latest scripts from the latest packaging earlier this week.
