libcontainer: intelrdt: add user-friendly diagnostics for Intel RDT operation errors #1913
…peration errors

Linux kernel v4.15 introduces better diagnostics for Intel RDT operation
errors. If an error is returned when making new directories or writing to
any of the control files in the resctrl filesystem, reading the file
/sys/fs/resctrl/info/last_cmd_status can provide more information than
can be conveyed in the error returned from the file operation.

Some examples:
  echo "L3:0=f3;1=ff" > /sys/fs/resctrl/container_id/schemata
  -bash: echo: write error: Invalid argument
  cat /sys/fs/resctrl/info/last_cmd_status
  mask f3 has non-consecutive 1-bits

  echo "MB:0=0;1=110" > /sys/fs/resctrl/container_id/schemata
  -bash: echo: write error: Invalid argument
  cat /sys/fs/resctrl/info/last_cmd_status
  MB value 0 out of range [10,100]

  cd /sys/fs/resctrl
  mkdir 1 2 3 4 5 6 7 8
  mkdir: cannot create directory '8': No space left on device
  cat /sys/fs/resctrl/info/last_cmd_status
  out of CLOSIDs

See 'last_cmd_status' for more details in kernel documentation:
https://www.kernel.org/doc/Documentation/x86/intel_rdt_ui.txt

In runc, we could append the diagnostics information to the error
message of Intel RDT operation errors to provide more user-friendly
information.

Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
lastCmdStatus, err := getIntelRdtParamString(path, "last_cmd_status")
if err != nil {
	return "", err
}
What if the system doesn't support "last_cmd_status"? Is it going to fall back to the error we got from getIntelRdtParamString? That doesn't sound quite right.
Thank you for the kind code review!
This change handles the case where the system doesn't support "last_cmd_status":
First, getLastCmdStatus() returns the error it gets from getIntelRdtParamString() to its caller, NewLastCmdError().
In NewLastCmdError(), the parameter 'err' is the original error, and 'err1' is the error returned by getLastCmdStatus().
- err1 != nil indicates that "last_cmd_status" is not supported. We just return the original 'err' directly.
- Only when err1 == nil, which indicates that "last_cmd_status" is supported, do we return a LastCmdError, whose Error() method appends the extra diagnostic information to the original 'err'.
See NewLastCmdError() for more details:
func NewLastCmdError(err error) error {
	lastCmdStatus, err1 := getLastCmdStatus()
	if err1 == nil {
		return &LastCmdError{
			LastCmdStatus: lastCmdStatus,
			Err:           err,
		}
	}
	return err
}
OK, I misread that. Thanks for the explanation.
RFC @opencontainers/runc-maintainers
Sorry for the duplicate comments here. It looks like something went wrong with GitHub due to this incident: https://blog.github.com/2018-10-21-october21-incident-report/
pull approve is a little off recently
ya I'll poke around