Logging is not working on Selinux enabled setups #30949

Closed
aaronyeeski opened this issue Jan 22, 2021 · 12 comments
Labels
area/logging internal kind/bug-qa Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement kind/enhancement Issues that improve or augment existing functionality priority/0 QA/M release-note Note this issue in the milestone's release notes

Contributor

aaronyeeski commented Jan 22, 2021

What kind of request is this (question/bug/enhancement/feature request):
Bug

Steps to reproduce (least amount of steps as possible):
Install Rancher v2.5.5 on a RHEL instance (ami-0057f60d0abc2fcc7) with SELinux enabled
Launch a downstream cluster
Install Logging v2 via the cluster explorer
Set up Cluster Flow and Cluster Output using Elasticsearch.
Check whether logs flow into Elasticsearch

Result:
Saw the following errors in Elasticsearch:
[Screenshot of the errors; image not preserved]

No logs are flowing from the rancher-logging-rke-aggregator pods
Errors seen in rancher-logging-rke-aggregator pods:

[2021/01/21 18:22:25] [Warning] [config] I cannot open /fluent-bit/etc/..2021_01_21_18_22_19.819627234/parsers.conf file
[2021/01/21 18:22:25] [ info] [engine] started (pid=1)
[2021/01/21 18:22:25] [ info] [storage] version=1.0.6, initializing...
[2021/01/21 18:22:25] [ info] [storage] in-memory
[2021/01/21 18:22:25] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/01/21 18:22:25] [error] [input:tail:tail.0] parser 'json' is not registered
[2021/01/21 18:22:25] [error] [sqldb] cannot open database /tail-db/tail-containers-state.db
[2021/01/21 18:22:25] [error] [input:tail:tail.0] could not open/create database
[2021/01/21 18:22:25] [ info] [sp] stream processor started
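
To triage aggregator output like the excerpt above, it can help to keep only the warning and error lines. The following is a minimal sketch, not part of the original report: the function name `fluentbit_problems` is hypothetical, and the pattern assumes the bracketed "[timestamp] [level]" layout seen in these logs.

```shell
#!/bin/sh
# Hypothetical helper: read Fluent Bit output on stdin and print only the
# lines whose level is error or warning. Assumes each line starts with
# "[timestamp] [level]", as in the excerpt above.
fluentbit_problems() {
    grep -E '^\[[^]]+\] \[ *(error|warn|Warning)\]'
}
```

One possible use is to pipe the pod logs into it, for example `kubectl logs <aggregator-pod> | fluentbit_problems`, where the pod name is a placeholder.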

Other details that may be helpful:
This behavior has also been seen on CentOS 7.9 clusters running Docker CE with SELinux enabled.

Environment information

  • Rancher version (rancher/rancher/rancher/server image tag or shown bottom left in the UI): v2.5.5
  • Installation option (single install/HA): Single install

Cluster information

  • Cluster type (Hosted/Infrastructure Provider/Custom/Imported): Hosted AWS
  • Kubernetes version (use kubectl version): v1.19.6
@aaronyeeski aaronyeeski added area/logging kind/bug-qa Issues that have not yet hit a real release. Bugs introduced by a new feature or enhancement labels Jan 22, 2021
@aaronyeeski aaronyeeski added this to the v2.5.6 milestone Jan 22, 2021
@aaronyeeski aaronyeeski self-assigned this Jan 22, 2021
Contributor

dweomer commented Jan 22, 2021

There are two things that need to happen for this to work, I think:

  1. The fluentbit container (and anything else that needs to read the log database) should have the type portion of its SELinux label set to container_logreader_t (this is set in the security context).
  2. We may need to develop/install policy to ensure that the logging db file has the proper type label, i.e. container_var_log_t.
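
The first point can be sketched as a container-level security context fragment. This is only an illustration: the function name and output file are hypothetical, and where the fragment would go in the logging chart is not settled in this thread. `seLinuxOptions.type` is the standard Kubernetes security context field for the SELinux type.

```shell
#!/bin/sh
# Hypothetical helper: write a securityContext fragment that sets the SELinux
# type of a container to container_logreader_t (the type named above).
# The fragment is only written to a file for review; nothing is applied.
write_logreader_context() {
    out_file="$1"
    cat > "$out_file" << 'EOF'
securityContext:
  seLinuxOptions:
    type: container_logreader_t
EOF
}
```

Calling `write_logreader_context ./selinux-context.yaml` produces a three-line YAML fragment that could be merged into the container spec of whatever needs to read the logs.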

Contributor

cbron commented Feb 16, 2021

Unless someone disagrees, we will leave this as a YAML option in the chart and not put it in the UI as a checkbox. cc @ebauman


ebauman commented Feb 17, 2021

Is it a big ask to put it in the UI? It seems like a worthy inclusion. "Check this box if you're using an SELinux-enabled container runtime".

Contributor

cbron commented Feb 17, 2021

It's not hard to do, but we have a lot of settings in the values.yaml and almost none of them are in the UI:
[Screenshot; image not preserved]

@paynejacob @nickgerace was there ever a decision made here, or did we just not get around to adding anything?

Contributor

izaac commented Mar 16, 2021

Reproduced in Rancher version: v2.5-head (03/16/2021) ff68b3a
Logging Chart version: rancher-logging:3.8.201

  • Single Docker Rancher install. (I installed it using the SELinux-enabled AMI from the description, but that might not be necessary.)
  • Downstream cluster using a CentOS 7.9 AMI (EC2 RKE driver)
    • Without SELinux enforcing and without Docker's selinux-enabled option
  • Wait for the cluster to come up active
  • SSH into each node and enable SELinux in Docker by setting {"selinux-enabled": true} in /etc/docker/daemon.json
  • Set SELinux to Enforcing on CentOS in the /etc/selinux/config file.
  • Reboot the node
  • Do the same with all the cluster nodes.
  • Deploy the Logging chart 3.8.201 from dashboard (use defaults, you can also pick a project)
  • Check the pod logs in the rancher-logging-rke-aggregator DaemonSet
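
The per-node part of the steps above can be sketched as a small shell function. It is an illustration under stated assumptions, not the exact procedure used here: the function name and the optional root-directory argument are hypothetical, and the function overwrites daemon.json rather than merging into it.

```shell
#!/bin/sh
# Hypothetical helper mirroring the node steps above. With no argument it
# edits the real /etc; pass a scratch directory to try it safely first.
enable_selinux_on_node() {
    root="${1:-}"
    mkdir -p "$root/etc/docker" "$root/etc/selinux"

    # NOTE: this overwrites daemon.json; merge by hand if the file already
    # holds other Docker settings.
    printf '%s\n' '{"selinux-enabled": true}' > "$root/etc/docker/daemon.json"

    # Switch the SELINUX= line to enforcing, or add it if it is missing.
    cfg="$root/etc/selinux/config"
    touch "$cfg"
    if grep -q '^SELINUX=' "$cfg"; then
        sed 's/^SELINUX=.*/SELINUX=enforcing/' "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
    else
        printf '%s\n' 'SELINUX=enforcing' >> "$cfg"
    fi
}
```

As in the steps, the node still has to be rebooted afterwards and the same change repeated on every cluster node.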

Some file access issues are reported in the logs. No log files were collected.

Fluent Bit v1.6.4
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2021/03/16 19:28:02] [ info] [engine] started (pid=1)
[2021/03/16 19:28:02] [ info] [storage] version=1.0.6, initializing...
[2021/03/16 19:28:02] [ info] [storage] in-memory
[2021/03/16 19:28:02] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2021/03/16 19:28:02] [error] [sqldb] cannot open database /tail-db/tail-containers-state.db
[2021/03/16 19:28:02] [error] [input:tail:tail.0] could not open/create database
[2021/03/16 19:28:02] [ info] [sp] stream processor started

Contributor

izaac commented Mar 18, 2021

Reopening as there are access issues when using SELinux in the latest 3.9.002-rc01
Reported here - Steps to repro.
More debugging here from Dev (response to the above linked report)

Contributor

cbron commented Apr 1, 2021

Waiting on the new RPM setup; then we can re-test.

Contributor

cbron commented Apr 2, 2021

Back in test. @izaac note that the RPM has changed.

Contributor

izaac commented Apr 5, 2021

Rancher v2.5-head (04/05/2021) 2f2cfe6
Logging Chart version: 3.9.002-rc03
OS version: CentOS 7.8 SELinux Enforcing
Docker: 19.03.15 SELinux ON.

Rancher SELinux policies RPM repository used:

cat << EOF > /etc/yum.repos.d/rancher-testing.repo
[rancher-testing]
name=Rancher Testing
baseurl=https://rpm-testing.rancher.io/rancher/testing/centos/7/noarch
enabled=1
gpgcheck=1
gpgkey=https://rpm-testing.rancher.io/public.key
EOF
yum -y install rancher-selinux
rpm -qa |grep rancher
rancher-selinux-0.1~alpha2-1.el7.noarch

I was able to see the control plane logs flowing to the ClusterOutput.
The log files come from this path on the nodes: /var/lib/rancher/rke/log/
Same scenario as tested in #31309 (comment), but using the RPM repo (Docker configured after cluster provisioning).

Issue opened about cluster owner not able to install v2 charts: #31919

Contributor

izaac commented Apr 6, 2021

RKE 1

Rancher v2.5-head (04/06/2021) 93d921f
Logging Chart version: 3.9.400-rc02

OS version: RHEL 8.3 SELinux Enforcing
Docker: 20.10.5 SELinux ON.
Docker configured after cluster provisioning.
Rancher SELinux policies from Rancher RPM

OS version: RHEL 7.9 SELinux Enforcing
Docker: 19.03.15 SELinux ON
Rancher SELinux policies from Rancher RPM
AMI configured before provisioning

OS version: Oracle Linux 7.9 SELinux Enforcing
Docker: 19.03.15 SELinux ON
Rancher SELinux policies from Rancher RPM
AMI configured before provisioning

OS version: Oracle Linux 8.3 SELinux Enforcing
Docker: 20.10.5 SELinux ON
Rancher SELinux policies from Rancher RPM
AMI configured before provisioning

OS version: CentOS 8.3 SELinux Enforcing
Docker: 20.10.5 SELinux ON
Rancher SELinux policies from Rancher RPM
AMI configured before provisioning

Validated the RKE logs were flowing to the cluster output using the logging example project.
https://github.com/paynejacob/rancher-logging-examples

Contributor

izaac commented Apr 7, 2021

#31964 Dashboard is not available on master-head cc29360

Contributor

izaac commented Apr 12, 2021

We're still missing validation of RKE2 on v2.5-head, but that validation is the same as what will be done as part of validating #31309.

master-head issue:
#32064

@izaac izaac closed this as completed Apr 12, 2021
@cbron cbron added the release-note Note this issue in the milestone's release notes label Apr 15, 2021
@shpwrck shpwrck added the kind/enhancement Issues that improve or augment existing functionality label May 4, 2021
@zube zube bot removed the [zube]: Done label Jul 12, 2021