[BUG]: driver logs gets flooded with "Checking if host information is added to array" in debug mode #108

Closed
karthikk92 opened this issue Nov 25, 2021 · 5 comments
Labels: area/csi-unity, type/bug

@karthikk92

Customer provided info:
I set LogLevel to debug and found that 15 minutes after the pods start, the driver container floods the log with the following message. CPU usage then rises.
time="2021-11-11T11:50:47Z" level=debug runid=node-1 msg="Checking if host information is added to array" func="github.com/dell/csi-unity/service.(*service).syncNodeInfoRoutine()" file="/go/src/csi-unity/service/node.go:1648"
time="2021-11-11T11:50:47Z" level=debug runid=node-1 msg="Checking if host information is added to array" func="github.com/dell/csi-unity/service.(*service).syncNodeInfoRoutine()" file="/go/src/csi-unity/service/node.go:1648"
edit 1:
I guess this is related to this env var: X_CSI_UNITY_SYNC_NODEINFO_INTERVAL.
edit 2:
It seems that the function is stuck in the for loop: https://github.com/dell/csi-unity/blob/a6688df88ddf23d45fd268dd1a70880903ddd320/service/node.go#L1668
node.go
func (s *service) syncNodeInfoRoutine(ctx context.Context) {

@karthikk92 karthikk92 added type/question Ask a question. This is the default label associated with a question issue. area/csi-unity Issue pertains to the CSI Driver for Dell EMC Unity labels Nov 25, 2021
@karthikk92 karthikk92 self-assigned this Nov 25, 2021
@mkej

mkej commented Nov 25, 2021

I have also observed this behavior with csi-unity driver versions v1.6.0 and v2.0.0, on a Kubernetes cluster running v1.21.3 and later v1.22.4 (same cluster, upgraded).

Changing X_CSI_UNITY_SYNC_NODEINFO_INTERVAL to 5 made the issue appear 5 minutes after pod creation, as expected.

@rajendraindukuri rajendraindukuri changed the title [QUESTION]: driver logs gets flooded with "Checking if host information is added to array" in debug mode [BUG]: driver logs gets flooded with "Checking if host information is added to array" in debug mode Dec 8, 2021
@rajendraindukuri rajendraindukuri added type/bug Something isn't working. This is the default label associated with a bug issue. and removed type/question Ask a question. This is the default label associated with a question issue. labels Dec 8, 2021
@muellerfabi

Hello,
I was the one creating that initial issue on Slack and want to give an update.

I just updated both the Operator and the Driver, but the issue still occurs on:

  • Openshift 4.8.20 (K8s 1.21.4)
  • with Dell CSI Operator v1.6.0
  • and Unity Driver v2.1.0
oc adm top pods --use-protocol-buffers  
NAME                                                   CPU(cores)   MEMORY(bytes)     
dell-csi-operator-controller-manager-8c8b9f98f-m7njb   2m           47Mi              
unity-controller-74bfdc88f5-ctwh9                      0m           135Mi             
unity-node-gcrm6                                       1012m        39Mi              
unity-node-pmhfr                                       1270m        42Mi              
unity-node-zpssf                                       1274m        40Mi          

oc get po
NAME                                                   READY   STATUS      RESTARTS   AGE
dell-csi-operator-controller-manager-8c8b9f98f-m7njb   1/1     Running     1          114m
installplan-approver-txc88                             0/1     Completed   0          4h11m
unity-controller-74bfdc88f5-ctwh9                      5/5     Running     5          114m
unity-node-gcrm6                                       2/2     Running     1          114m
unity-node-pmhfr                                       2/2     Running     2          114m    
unity-node-zpssf                                       2/2     Running     2          114m

Cheers

@rensyct

rensyct commented Dec 10, 2021

Thank you for the update, fubsle. Unity driver v2.1.0 does not include the fix for this issue. Our team has started looking into it and work is in progress. We will update this ticket once the fix is available.

@karthikk92
Author

karthikk92 commented Feb 8, 2022

This issue is fixed. Please use the latest nightly build tag available.

Pass "nightly" as the version when installing, as shown below:

./csi-install.sh --namespace unity --skip-verify-node --version "nightly" --values ./myvalues.yaml

Also set the version to nightly in myvalues.yaml:
version: nightly
@fubsle

@karthikk92
Author

Hi,

The fix was tested on an OCP 4.8 environment with the drivers installed and the csi-unity tests run against it.
Users can pick the latest ‘nightly’ build tag for csi-unity:
https://hub.docker.com/r/dellemc/csi-unity/tags

NOTE: when deploying the CSI driver using the operator, the user needs to edit the value of SYNC_NODE_INFO_TIME_INTERVAL in the ConfigMap as follows:

apiVersion: v1
kind: ConfigMap
metadata:
  name: unity-config-params
  namespace: test-unity
data:
  driver-config-params.yaml: |
    CSI_LOG_LEVEL: "info"
    ALLOW_RWO_MULTIPOD_ACCESS: "false"
    MAX_UNITY_VOLUMES_PER_NODE: "0"
    SYNC_NODE_INFO_TIME_INTERVAL: "15"
    TENANT_NAME: ""

This change to the operator's sample default value will be included in the upcoming release of dell-csi-operator.

Thanks,
Keerthi vardhan

csmbot pushed a commit that referenced this issue Aug 1, 2023
* [replication] Added upgrade page and updated install info (#57)

* Added note about repctl logs file

* Added upgrade instructions for both controller and sidecar

* modified installation\upgrade section

* Fixed couple of grammar mistakes

* Added new entry to troubleshooting page

* Addressed review comments

* Changed link address

Co-authored-by: Maxim Sklyarov <Maxim_Sklyarov@dell.com>

* Update deployment steps for CSM Authorization (#58)

* begin updating deployment

* fixed typos

* add auth upgrade doc

* updated powerscale with authorization

* updated authorization documentation for powermax, powerflex, and powerscale

* refactored for powermax

* added vxflexos related docs for auth deployment and configuration

* consolidated proxy server root cert

* fix grammar, notes, value.yaml parameters, update auth deployment

* added note for driver configurations with auth

* updated note

* add auth note to drivers

* update upgrade path

Co-authored-by: atye <tyeaaron@gmail.com>
Co-authored-by: sharmilarama <sharmila.ramamoorthy@dell.com>
Co-authored-by: Logan Jones <logan_jones2@dell.com>

* Fix operator install docs (#62)

* Small update to the contributing doc (#54)

* Update _index.md

* Update _index.md

* fixed sidecar instructions

* Update _index.md

* making changes requested by Aron

* trying to get rid of unwanted changes

Co-authored-by: gallacher <35462391+gallacher@users.noreply.github.com>

* add Volume Health Monitor section (#67)

* add Volume Health Monitor section

* PR feedback

* pv/pvc metrics csi-powerstore changes (#64)

* Added troubleshooting documentation about gateway timeout for authorization (#63)

* Upgrade and Rollback Support for CSM for Authorization proxy server (#66)

* added auth upgrade and rollback, updated auth notes for drivers

* fixed spacing

* [replication] Added uninstall page, updated repctl readme (#70)

* static provisioning and ephemeral changes (#71)

* Update uninstall.md

* updated auuth deployment steps (#72)

* add  healthMonitorInterval to values table (#79)

* Helm install update (#74)

* updating helm install instructions

* adding troubleshooting for helm update

* minor changes and updates

* more minor changes

* word change

* more minor changes

* addressing comments from Jacob

* fixing numbers

* update code owners (#76)

* Move health monitor section to correct file  (#81)

* update correct file

* remove feature from wrong file

* Removed older OpenShift and added new driver versions (#84)

* Feature rwop csi powerstore (#89)

* Documentation for RWOP - CSI Powerstore

* Addressed review comment

* Update powerstore.md

Co-authored-by: shanmydell <82038610+shanmydell@users.noreply.github.com>

* Feature rwop accessmode support for csi-powerscale (#90)

Co-authored-by: shanmydell <82038610+shanmydell@users.noreply.github.com>

* Tenant documentation for both csi-unity and operator (#85)

Co-authored-by: shanmydell <82038610+shanmydell@users.noreply.github.com>

* Replication prerequisites & troubleshooting (#93)

Co-authored-by: shanmydell <82038610+shanmydell@users.noreply.github.com>

* Feature/pvc metrics csi powerstore update (#91)

* volume health monitoring update (#92)

* volume health monitoring update

* Update powerscale.md

* update documentation for health monitoring

Co-authored-by: shanmydell <82038610+shanmydell@users.noreply.github.com>
Co-authored-by: Randeep Sharma <92301596+randeepsharma@users.noreply.github.com>
Co-authored-by: Bahubali Jain <bahubali_jain@dell.com>

* Changed replication support matrix (#94)

* Changed replication support matrix

* Changed to X

* Add health values (#95)

* add new values to values table

* Add note to features section

* fix typo

* Common changes (#86)

* Unity - RWOP Access Mode and Volume Health Monitoring (#77)

* RWOP support matrix change (#96)

* Added known issue for unity (#97)

* Update powerflex.md (#98)

* powerscale release notes updated (#99)

* Operator Docs changes related to Unity features (#102)

* Operator upgrade documentation for volume health monitor changes (#104)

* Added note about how to list volume snapshots (#101)

* restructured deployment docs (#106)

* Improve operator install steps (#107)

* Update versions (#100)

* Added note that clarifies keys for csm installer (#108)

* Added volume health monitor in CSI spec support (#109)

* updated sample update for topology usage (#112)

#82

Co-authored-by: Andrey Schipilo <superdron97@yandex.ru>
Co-authored-by: Maxim Sklyarov <Maxim_Sklyarov@dell.com>
Co-authored-by: shaynafinocchiaro <66699024+shaynafinocchiaro@users.noreply.github.com>
Co-authored-by: atye <tyeaaron@gmail.com>
Co-authored-by: sharmilarama <sharmila.ramamoorthy@dell.com>
Co-authored-by: Logan Jones <logan_jones2@dell.com>
Co-authored-by: Jooseppi Luna <jooseppi_luna@dell.com>
Co-authored-by: JacobGros <jacobgrosner4@gmail.com>
Co-authored-by: Ashish Verma <32611022+ashish2207@users.noreply.github.com>
Co-authored-by: Trevor Dawe <trevor.dawe@dell.com>
Co-authored-by: gilltaran <91598969+gilltaran@users.noreply.github.com>
Co-authored-by: hoppea2 <33433874+hoppea2@users.noreply.github.com>
Co-authored-by: Francis Nijay <francis.nijay@dell.com>
Co-authored-by: shanmydell <82038610+shanmydell@users.noreply.github.com>
Co-authored-by: Bahubali Jain <66621574+bpjain2004@users.noreply.github.com>
Co-authored-by: karthikk92 <92289639+karthikk92@users.noreply.github.com>
Co-authored-by: Sakshi-dell <75004921+Sakshi-dell@users.noreply.github.com>
Co-authored-by: Randeep Sharma <92301596+randeepsharma@users.noreply.github.com>
Co-authored-by: Bahubali Jain <bahubali_jain@dell.com>
Co-authored-by: rensyct <80810999+rensyct@users.noreply.github.com>
Co-authored-by: Rajendra Indukuri <82365588+rajendraindukuri@users.noreply.github.com>
Co-authored-by: abhi16394 <32352976+abhi16394@users.noreply.github.com>
Co-authored-by: panigs7 <92028646+panigs7@users.noreply.github.com>
Co-authored-by: Prasanna M <35757638+prablr79@users.noreply.github.com>