output of uname -r on the node: 5.4.0-1009-aws (Ubuntu 20.04 AMI)
Platform:
AWS (with kops)
Problem:
I've been able to successfully run this command: kubectl trace run -e "tracepoint:syscalls:sys_enter_* { if(args->ret < 0) {@[ustack] = count();} }" pod/some-pod -a
However, when I try to run this command in a different namespace (on a pod in the kube-system namespace for example), I get:
if your program has maps to print, send a SIGINT using Ctrl-C, if you want to interrupt the execution send SIGINT two times
definitions.h:55:3: error: unknown type name 'key_serial_t'
definitions.h:123:3: error: unknown type name 'cap_user_header_t'
definitions.h:124:3: error: unknown type name 'cap_user_data_t'
definitions.h:133:3: error: unknown type name 'cap_user_header_t'
definitions.h:134:9: error: unknown type name 'cap_user_data_t'
definitions.h:375:9: error: unknown type name 'sigset_t'
definitions.h:1102:3: error: unknown type name 'aio_context_t'
definitions.h:1113:3: error: unknown type name 'aio_context_t'
definitions.h:1122:3: error: unknown type name 'aio_context_t'
definitions.h:1135:3: error: unknown type name 'aio_context_t'
definitions.h:1150:3: error: unknown type name 'aio_context_t'
definitions.h:1159:3: error: unknown type name 'aio_context_t'
definitions.h:1174:9: error: unknown type name 'sigset_t'
definitions.h:1987:3: error: unknown type name 'siginfo_t'
definitions.h:2071:9: error: unknown type name 'sigset_t'
definitions.h:2124:3: error: unknown type name 'rwf_t'
definitions.h:2229:3: error: unknown type name 'rwf_t'
definitions.h:2240:3: error: unknown type name 'qid_t'
definitions.h:2417:3: error: unknown type name 'key_serial_t'
fatal error: too many errors emitted, stopping now [-ferror-limit=]
exit status 1
Steps taken:
At first I thought it had something to do with trying to use the default service account, but following the instructions in the readme does not appear to make a difference.
Afterwards, I noticed this issue on bpftrace which seems to indicate that this is an issue with headers not being installed; however, running the same command with --fetch-headers appended yields:
WARNING: Cannot find distro-specific headers for "Ubuntu". Fetching generic headers.
++ uname -r
+ BUILD_DIR=/linux-generic-5.4.0-1009-aws
++ uname -r
+ SOURCES_DIR=/usr/src/linux-generic-5.4.0-1009-aws
+ '[' '!' -e /usr/src/linux-generic-5.4.0-1009-aws/.installed ']'
+ echo 'Installing kernel headers for generic kernel'
+ fetch_generic_linux_sources
Installing kernel headers for generic kernel
++ uname -r
+ kernel_version=5.4.0-1009-aws
++ echo 5.4.0-1009-aws
Fetching upstream kernel sources for 5.4.0-1009-aws.
+ major_version=5
+ echo 'Fetching upstream kernel sources for 5.4.0-1009-aws.'
+ mkdir -p /linux-generic-5.4.0-1009-aws
+ curl -sL https://www.kernel.org/pub/linux/kernel/v5.x/linux-5.4.0-1009-aws.tar.gz
+ tar --strip-components=1 -xzf - -C /linux-generic-5.4.0-1009-aws
tar: invalid magic
tar: short read
real 0m0.271s
user 0m0.038s
sys 0m0.004s
+ generate_headers
Generating kernel headers
+ echo 'Generating kernel headers'
+ cd /linux-generic-5.4.0-1009-aws
+ zcat /proc/config.gz
zcat: /proc/config.gz: No such file or directory
+ make ARCH=x86 oldconfig
make: *** No rule to make target 'oldconfig'. Stop.
+ make ARCH=x86 prepare
make: *** No rule to make target 'prepare'. Stop.
+ find /linux-generic-5.4.0-1009-aws -regex '.*\.c\|.*\.txt\|.*Makefile\|.*Build\|.*Kconfig' -type f -delete
real 0m0.006s
user 0m0.005s
sys 0m0.000s
+ mv /linux-generic-5.4.0-1009-aws /usr/src
real 0m0.001s
user 0m0.001s
sys 0m0.000s
+ touch /usr/src/linux-generic-5.4.0-1009-aws/.installed
+ HEADERS_TARGET=/usr/src/linux-generic-5.4.0-1009-aws
++ uname -r
+ mkdir -p /lib/modules/5.4.0-1009-aws
++ uname -r
+ ln -sf /usr/src/linux-generic-5.4.0-1009-aws /lib/modules/5.4.0-1009-aws/source
++ uname -r
+ ln -sf /usr/src/linux-generic-5.4.0-1009-aws /lib/modules/5.4.0-1009-aws/build
+ touch /lib/modules/.installed
stream closed
kubectl-trace-67b1691a-9ae6-11ea-be33-acde48001122 if your program has maps to print, send a SIGINT using Ctrl-C, if you want to interrupt the execution send SIGINT two times
kubectl-trace-67b1691a-9ae6-11ea-be33-acde48001122 fatal error: '/lib/modules/5.4.0-1009-aws/source/include/linux/kconfig.h' file not found
When changing the bpftrace expression to just "tracepoint:syscalls:sys_enter_* { @[ustack] = count(); }", it works regardless of pod or namespace, which leads me to believe that the problem has something to do with the args parameter.
Apologies again if this is the wrong place to ask this kind of question or if my assessment of my problem is incorrect.
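The "tar: invalid magic" lines in the log above are what one would expect if the download step returned something other than a gzip archive (an error page, for example), although the log alone does not confirm this. A minimal sketch of a check that could run before extraction follows; looks_like_gzip is a hypothetical helper name and is not part of the fetch-headers script.

```shell
#!/bin/sh
# Hypothetical helper (not from kubectl-trace): succeed only if the file
# begins with the two gzip magic bytes 0x1f 0x8b.
looks_like_gzip() {
    # od prints the first two bytes as hex; tr strips spaces and newlines.
    magic=$(od -An -tx1 -N2 "$1" | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Small demonstration with a temporary file.
tmp=$(mktemp)
printf '\037\213payload' > "$tmp"
looks_like_gzip "$tmp" && echo "gzip header found"
printf '<html>not found</html>' > "$tmp"
looks_like_gzip "$tmp" || echo "not a gzip file"
rm -f "$tmp"
```

A check like this only inspects the header; it does not prove that the rest of the archive is intact.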
The URL in question has no entry for the -aws... kernel version. In the source code for fetch-headers I see this line, which is supposed to strip out such information.
Running this script locally (macOS with the supplied version of AWK, i.e., BSD AWK rather than GNU AWK), I get 5.4.0-1009-aws. However, with GNU AWK (and with mawk, which is apparently standard on Ubuntu and Debian), I get 5.4.0, as expected. I'll see if I can create a new cluster with the same or a similar image, and check what happens when I run the linked command in the cluster later on.
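To illustrate the stripping step without depending on which AWK is installed, here is a minimal sketch that uses only POSIX shell parameter expansion. It is not the code in fetch-headers, and upstream_version is a hypothetical function name chosen for this example.

```shell
#!/bin/sh
# Hypothetical helper (not taken from fetch-headers): reduce a `uname -r`
# style string to its leading numeric version.
upstream_version() {
    # "${1%%-*}" removes the longest suffix that starts at the first "-",
    # so "5.4.0-1009-aws" becomes "5.4.0". A string with no "-" is unchanged.
    printf '%s\n' "${1%%-*}"
}

upstream_version "5.4.0-1009-aws"   # prints 5.4.0
```

Because this relies on the shell itself, the result should be the same under any POSIX sh, whichever AWK implementation happens to be present.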
This is my first day using kubectl-trace, so I apologize if this is a silly question or if my assessment of it is wrong.
output of kubectl trace version:
git commit: d34d1d5
build date: 2019-09-19 09:00:13 -0600 MDT