Added ability to enable DCGM library log output #276

Merged: 1 commit merged into main from enable-dcgm-logging-options on Mar 5, 2024

Conversation

nvvfedorov
Collaborator

Users can now pass the "--enable-dcgm-log" and "--dcgm-log-level=[NONE,FATAL,ERROR,WARN,INFO,DEBUG,VERB]" flags to write DCGM library log output into the dcgm-exporter logs.

Important: The default value for the "--dcgm-log-level" is "NONE". To see DCGM logs, you must set at least the INFO log level.
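
For illustration only, here is a minimal sketch of how the two flags and the "NONE" default could be parsed and validated. It uses Go's standard `flag` package rather than whatever CLI library dcgm-exporter actually uses, and `parseArgs` and `validateDCGMLogLevel` are hypothetical names, not functions from this PR. Only the flag names, the accepted values, and the default come from the description above.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// dcgmLogLevels lists the values accepted by --dcgm-log-level,
// as given in the PR description.
var dcgmLogLevels = []string{"NONE", "FATAL", "ERROR", "WARN", "INFO", "DEBUG", "VERB"}

// validateDCGMLogLevel is a hypothetical helper: it returns the level
// unchanged if it is one of the accepted values, or an error otherwise.
func validateDCGMLogLevel(level string) (string, error) {
	for _, l := range dcgmLogLevels {
		if l == level {
			return level, nil
		}
	}
	return "", fmt.Errorf("invalid DCGM log level %q; expected one of %s",
		level, strings.Join(dcgmLogLevels, ","))
}

// parseArgs is a hypothetical stand-in for the exporter's flag handling.
// The default level is "NONE", matching the note above.
func parseArgs(args []string) (enabled bool, level string, err error) {
	fs := flag.NewFlagSet("dcgm-exporter", flag.ContinueOnError)
	fs.BoolVar(&enabled, "enable-dcgm-log", false, "enable DCGM library log output")
	fs.StringVar(&level, "dcgm-log-level", "NONE", "DCGM library log level")
	if err = fs.Parse(args); err != nil {
		return false, "", err
	}
	level, err = validateDCGMLogLevel(level)
	return enabled, level, err
}

func main() {
	enabled, level, err := parseArgs([]string{"--enable-dcgm-log", "--dcgm-log-level=INFO"})
	fmt.Println(enabled, level, err)
}
```

With no arguments, this sketch leaves logging disabled and the level at "NONE", so no DCGM output would be expected until both flags are set.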

Here is an example of the DCGM log output:

time="2024-03-01T14:24:35-06:00" level=info msg="version:3.3.3;arch:x86_64;buildtype:Release;buildid:11;builddate:2024-01-18;commit:c3aed64480553cd5ba1a32d165c7967936446631;branch:rel_dcgm_3_3;buildplatform:Linux 4.15.0-180-generic #189-Ubuntu SMP Wed May 18 14:13:57 UTC 2022 x86_64;;crc:39ff792f5514bdc5af56ce313a8fa90e [/workspaces/dcgm-rel_dcgm_3_3-postmerge/dcgmlib/src/DcgmApi.cpp:5012] [{anonymous}::StartEmbeddedV2]" dcgm_level=INFO
time="2024-03-01T14:24:35-06:00" level=info msg="Signal 12 is already handled. Nothing to do. [/workspaces/dcgm-rel_dcgm_3_3-postmerge/common/DcgmThread/DcgmThread.cpp:394] [DcgmThread::InstallSignalHandler]" dcgm_level=INFO
time="2024-03-01T14:24:35-06:00" level=info msg="Parsed driver string is 5452902, IsR450OrNewer: 1, IsR520OrNewer: 1 [/workspaces/dcgm-rel_dcgm_3_3-postmerge/dcgmlib/src/DcgmCacheManager.cpp:2592] [DcgmCacheManager::ReadAndCacheDriverVersions]" dcgm_level=INFO
time="2024-03-01T14:24:35-06:00" level=info msg="Detected 0 NVLinks for GPU 0 [/workspaces/dcgm-rel_dcgm_3_3-postmerge/dcgmlib/src/DcgmCacheManager.cpp:1263] [DcgmCacheManager::InitializeNvLinkCount]" dcgm_level=INFO
time="2024-03-01T14:24:35-06:00" level=info msg="Got 0 excluded GPUs [/workspaces/dcgm-rel_dcgm_3_3-postmerge/dcgmlib/src/DcgmCacheManager.cpp:1087] [DcgmCacheManager::ReadAndCacheGpuExclusionList]" dcgm_level=INFO
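
Each line above is a set of key=value fields, and DCGM-originated entries carry an extra `dcgm_level` field. As a rough sketch of how a consumer might pick those entries out of the exporter's logs, the following parses one line into its fields. `parseFields` is a hypothetical helper written for this example, not part of dcgm-exporter, and its escape handling is deliberately simplified.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFields splits a key=value log line into a map. Values may be
// bare (level=info) or double-quoted (msg="..."). Simplification: a
// backslash inside quotes just keeps the following byte verbatim, so
// sequences such as \n are not decoded.
func parseFields(line string) map[string]string {
	fields := make(map[string]string)
	i := 0
	for i < len(line) {
		// Skip separating spaces.
		for i < len(line) && line[i] == ' ' {
			i++
		}
		// Read the key up to '=' (or a space, for a bare token).
		start := i
		for i < len(line) && line[i] != '=' && line[i] != ' ' {
			i++
		}
		key := line[start:i]
		if i >= len(line) || line[i] != '=' {
			continue // bare token without a value; ignore it
		}
		i++ // consume '='
		var value string
		if i < len(line) && line[i] == '"' {
			i++ // consume opening quote
			var sb strings.Builder
			for i < len(line) && line[i] != '"' {
				if line[i] == '\\' && i+1 < len(line) {
					i++ // drop the backslash, keep the next byte
				}
				sb.WriteByte(line[i])
				i++
			}
			i++ // consume closing quote
			value = sb.String()
		} else {
			start = i
			for i < len(line) && line[i] != ' ' {
				i++
			}
			value = line[start:i]
		}
		if key != "" {
			fields[key] = value
		}
	}
	return fields
}

func main() {
	line := `time="2024-03-01T14:24:35-06:00" level=info msg="Got 0 excluded GPUs [/workspaces/dcgm-rel_dcgm_3_3-postmerge/dcgmlib/src/DcgmCacheManager.cpp:1087] [DcgmCacheManager::ReadAndCacheGpuExclusionList]" dcgm_level=INFO`
	f := parseFields(line)
	// Entries without a dcgm_level field come from the exporter itself.
	if lvl, ok := f["dcgm_level"]; ok {
		fmt.Println(lvl, f["msg"])
	}
}
```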

@nvvfedorov nvvfedorov self-assigned this Mar 1, 2024
Resolved review threads:
pkg/cmd/app.go
pkg/cmd/app.go (outdated)
pkg/dcgmexporter/const.go
pkg/stdout/logprocessor.go (outdated)
pkg/stdout/logprocessor.go (outdated)
pkg/stdout/capture.go
pkg/stdout/capture.go (outdated)
pkg/stdout/capture.go (outdated)
internal/pkg/logging/logger_adapter.go (outdated)
@nvvfedorov nvvfedorov force-pushed the enable-dcgm-logging-options branch 3 times, most recently from c138138 to 57643f8 on March 5, 2024 05:20
@nvvfedorov nvvfedorov force-pushed the enable-dcgm-logging-options branch 2 times, most recently from 14766d6 to 8667564 on March 5, 2024 19:19
Collaborator

@rohit-arora-dev rohit-arora-dev left a comment

Just a couple of minor suggestions based on your recent changes; otherwise looking good. Thanks for addressing the review comments.

Resolved review threads:
internal/pkg/logging/logger_adapter.go (outdated)
pkg/cmd/app.go
pkg/cmd/app.go
checkout

Signed-off-by: Vadym Fedorov <vfedorov@nvidia.com>
Collaborator

@rohit-arora-dev rohit-arora-dev left a comment

Thanks @nvvfedorov for this change and for addressing the review comments.

/LGTM

@nvvfedorov nvvfedorov merged commit 7ec7bb7 into main Mar 5, 2024
1 check passed
@nvvfedorov nvvfedorov deleted the enable-dcgm-logging-options branch March 5, 2024 22:40
2 participants