Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fatal error condition occurred: aws-c-common string.c:217: allocator && aws_byte_cursor_is_valid(cursor) #411

Closed
thomas-roos opened this issue May 8, 2023 · 16 comments
Labels
bug Something isn't working

Comments

@thomas-roos
Copy link
Contributor

Describe the bug

root@qemux86-64:/usr/lib/aws-iot-device-client/ptest# ./test-aws-iot-device-client
[==========] Running 276 tests from 32 test suites.
[----------] Global test environment set-up.
[----------] 48 tests from ConfigTestFixture
[ RUN ] ConfigTestFixture.AllFeaturesEnabled
Fatal error condition occurred in /usr/src/debug/aws-c-common/0.8.19-r0/git/source/string.c:217: allocator && aws_byte_cursor_is_valid(cursor)
Exiting Application
################################################################################
Stack trace:
################################################################################
/usr/lib/libaws-c-common.so.1(aws_backtrace_print+0x62) [0x7f974e575282]
/usr/lib/libaws-c-common.so.1(aws_fatal_assert+0x48) [0x7f974e55a428]
/usr/lib/libaws-c-common.so.1(+0x37599) [0x7f974e57d599]
/usr/lib/libaws-c-common.so.1(aws_json_value_get_from_object+0x29) [0x7f974e56cbf9]
/usr/lib/libaws-crt-cpp.so(_ZNK3Aws3Crt8JsonView11ValueExistsEPKc+0x1d) [0x7f974e6ef47d]
./test-aws-iot-device-client(+0x9b1b9) [0x555e6911d1b9]
./test-aws-iot-device-client(+0x11ce97) [0x555e6919ee97]
./test-aws-iot-device-client(+0x26223d) [0x555e692e423d]
./test-aws-iot-device-client(+0x2562e6) [0x555e692d82e6]
./test-aws-iot-device-client(+0x256572) [0x555e692d8572]
./test-aws-iot-device-client(+0x25671c) [0x555e692d871c]
./test-aws-iot-device-client(+0x256ed3) [0x555e692d8ed3]
./test-aws-iot-device-client(+0x26275d) [0x555e692e475d]
./test-aws-iot-device-client(+0x257051) [0x555e692d9051]
./test-aws-iot-device-client(+0x73d6f) [0x555e690f5d6f]
/lib/libc.so.6(__libc_start_main+0xeb) [0x7f974e0e3cfb]
./test-aws-iot-device-client(+0x8475a) [0x555e6910675a]
Aborted

Environment (please complete the following information):

  • OS: yocto + meta-aws
  • Architecture: any
  • Device Client commit: bb7ff67
@thomas-roos thomas-roos added the bug Something isn't working label May 8, 2023
@thomas-roos
Copy link
Contributor Author

of course this was not seen as this is not working: #404

@thomas-roos
Copy link
Contributor Author

This is the backtrace:
backtrace.txt

@thomas-roos
Copy link
Contributor Author

BTW: as easy as adding

EXTRA_IMAGE_FEATURES ?= "debug-tweaks dbg-pkgs tools-debug src-pkgs"
QB_MEM = "-m 8192" 

to your Yocto local.conf

thomas-roos added a commit to thomas-roos/yocto_example that referenced this issue May 23, 2023
@thomas-roos
Copy link
Contributor Author

@dbouras
Copy link

dbouras commented May 31, 2023

I was able to build a static binary using an Ubuntu 20.04 VM that at least doesn't have the issue with startup and also does not crash when connecting to the endpoint when a secure tunnel is open.

To recap, when using the Siemens kas container, the issue is outlined in this thread and also here. At a 10,000 feet level, the gcc environment in kas is:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 10.2.1-6' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-gcn/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-mutex
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.1 20210110 (Debian 10.2.1-6) 

and in the Ubuntu 20.04 VM:

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.5.0-3ubuntu1~18.04' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04) 

The client still dumps core, however, when using child processes to parse /etc/host.conf or /etc/hosts. The contents do not matter - if the file size of either /etc/hosts or /etc/host.conf is anything > 0, the issue occurs. An indicative strace -f excerpt is as follows (only /etc/hosts exists in this test):

execve("/sbin/aws-iot-device-client", ["aws-iot-device-client", "--config-file", "./client.conf"], 0x7ffe10c990c8 /* 24 vars */) = 0
brk(NULL)                               = 0xfa8000
brk(0xfaa6c0)                           = 0xfaa6c0
arch_prctl(ARCH_SET_FS, 0xfa9d80)       = 0
uname({sysname="Linux", nodename="XXXXXXXX", ...}) = 0
set_tid_address(0xfaa050)               = 3506
set_robust_list(0xfaa060, 24)           = 0
rt_sigaction(SIGRTMIN, {sa_handler=0x9df3a0, sa_mask=[], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x9de8f0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {sa_handler=0x9df440, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART|SA_SIGINFO, sa_restorer=0x9de8f0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
prlimit64(0, RLIMIT_STACK, NULL, {rlim_cur=8192*1024, rlim_max=RLIM64_INFINITY}) = 0
readlink("/proc/self/exe", "/sbin/aws-iot-device-client", 4096) = 27
brk(0xfcb6c0)                           = 0xfcb6c0
brk(0xfcc000)                           = 0xfcc000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
futex(0xf9e424, FUTEX_WAKE_PRIVATE, 2147483647) = 0
futex(0xf9e430, FUTEX_WAKE_PRIVATE, 2147483647) = 0
openat(AT_FDCWD, "/home/root/.aws-iot-device-client/aws-iot-device-client.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
openat(AT_FDCWD, "./client.conf", O_RDONLY) = 3
close(3)                                = 0
openat(AT_FDCWD, "./client.conf", O_RDONLY) = 3
close(3)                                = 0
stat("./client.conf", {st_mode=S_IFREG|0644, st_size=1366, ...}) = 0
stat("./", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=73054094}) = 0
stat("./client.conf", {st_mode=S_IFREG|0644, st_size=1366, ...}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=74008911}) = 0
openat(AT_FDCWD, "./client.conf", O_RDONLY) = 3
read(3, "{\n    \"endpoint\": \"a33mq08dt6fpr"..., 8191) = 1366
read(3, "", 8191)                       = 0
openat(AT_FDCWD, "/edge/aws/certs/AmazonRootCA1.pem.crt", O_RDONLY) = 4
close(4)                                = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=77417641}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=77798479}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=78255752}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=78701869}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=79029140}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=79341974}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=79766095}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=80091772}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=80416059}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=80848218}) = 0
close(3)                                = 0
mkdir("/home", 0700)                    = -1 EEXIST (File exists)
mkdir("/home/root", 0700)               = -1 EEXIST (File exists)
mkdir("/home/root/.aws-iot-device-client", 0700) = -1 EEXIST (File exists)
mkdir("/home/root/.aws-iot-device-client/sample-shadow", 0700) = -1 EEXIST (File exists)
mkdir("/home/root/.aws-iot-device-client/sample-shadow/", 0700) = -1 EEXIST (File exists)
stat("/home/root/.aws-iot-device-client/sample-shadow/", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=83773115}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=84119105}) = 0
openat(AT_FDCWD, "/home/root/.aws-iot-device-client/sample-shadow/default-sample-shadow-document", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
close(3)                                = 0
chmod("/home/root/.aws-iot-device-client/sample-shadow/default-sample-shadow-document", 0600) = 0
stat("/home/root/.aws-iot-device-client/sample-shadow/default-sample-shadow-document", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
getuid()                                = 0
stat("/home/root/.aws-iot-device-client/sample-shadow/default-sample-shadow-document", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=86717790}) = 0
openat(AT_FDCWD, "/home/root/.aws-iot-device-client/aws-iot-device-client-runtime.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=87373800}) = 0
openat(AT_FDCWD, "/home/root/.aws-iot-device-client/http-proxy.conf", O_RDONLY) = -1 ENOENT (No such file or directory)
clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=87909912}) = 0
openat(AT_FDCWD, "/edge/aws/certs/AmazonRootCA1.pem.crt", O_RDONLY) = 3
close(3)                                = 0
stat("/edge/aws/certs/", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
getuid()                                = 0
stat("/edge/aws/certs/", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
stat("/edge/aws/certs/AmazonRootCA1.pem.crt", {st_mode=S_IFREG|0644, st_size=1188, ...}) = 0
getuid()                                = 0
stat("/edge/aws/certs/AmazonRootCA1.pem.crt", {st_mode=S_IFREG|0644, st_size=1188, ...}) = 0
openat(AT_FDCWD, "/edge/aws/certs/XXXXXXXX.pem.crt", O_RDONLY) = 3
close(3)                                = 0
openat(AT_FDCWD, "/edge/aws/certs/XXXXXXXX.private.key", O_RDONLY) = 3
close(3)                                = 0
stat("/edge/aws/logs/", {st_mode=S_IFDIR|0745, st_size=4096, ...}) = 0
stat("/edge/aws/logs/", {st_mode=S_IFDIR|0745, st_size=4096, ...}) = 0
openat(AT_FDCWD, "/edge/aws/logs/client.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
lseek(3, 0, SEEK_END)                   = 4605
stat("/edge/aws/logs/client.log", {st_mode=S_IFREG|0600, st_size=4605, ...}) = 0
mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f982b246000
mprotect(0x7f982b247000, 8388608, PROT_READ|PROT_WRITE) = 0
clone(child_stack=0x7f982ba44970, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTIDstrace: Process 3507 attached
, parent_tid=[3507], tls=0x7f982ba46700, child_tidptr=0x7f982ba469d0) = 3507
[pid  3506] getcwd( <unfinished ...>
[pid  3507] set_robust_list(0x7f982ba469e0, 24 <unfinished ...>
[pid  3506] <... getcwd resumed>"/edge/aws/client", 4096) = 17
[pid  3507] <... set_robust_list resumed>) = 0
[pid  3506] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=93036071}) = 0
[pid  3507] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0 <unfinished ...>
[pid  3506] futex(0xfab0d8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3507] <... mmap resumed>)         = 0x7f9823246000
[pid  3507] munmap(0x7f9823246000, 14393344) = 0
[pid  3507] munmap(0x7f9828000000, 52715520) = 0
[pid  3507] mprotect(0x7f9824000000, 135168, PROT_READ|PROT_WRITE) = 0
[pid  3507] futex(0xfab0d8, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3506] <... futex resumed>)        = 0
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3506] futex(0xfab0d8, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  3506] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=93777609}) = 0
[pid  3506] openat(AT_FDCWD, "/run/lock/devicecl.lock", O_RDONLY) = 4
[pid  3506] read(4, "XXXXXXXX\n3493", 8191) = 16
[pid  3506] read(4, "", 8191)           = 0
[pid  3506] kill(3493, 0)               = -1 ESRCH (No such process)
[pid  3506] unlink("/run/lock/devicecl.lock") = 0
[pid  3506] close(4 <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3506] <... close resumed>)        = 0
[pid  3507] openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid  3506] openat(AT_FDCWD, "/run/lock/devicecl.lock", O_WRONLY|O_CREAT|O_EXCL|O_TRUNC, 0666 <unfinished ...>
[pid  3507] <... openat resumed>)       = -1 ENOENT (No such file or directory)
[pid  3506] <... openat resumed>)       = 5
[pid  3506] getpid( <unfinished ...>
[pid  3507] write(3, "2023-05-31T03:17:36.073Z [WARN] "..., 169 <unfinished ...>
[pid  3506] <... getpid resumed>)       = 3506
[pid  3506] fstat(5,  <unfinished ...>
[pid  3507] <... write resumed>)        = 169
[pid  3506] <... fstat resumed>{st_mode=S_IFREG|0644, st_size=0, ...}) = 0
[pid  3506] write(5, "XXXXXXXX\n3506", 16 <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3506] <... write resumed>)        = 16
[pid  3506] close(5)                    = 0
[pid  3506] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=95704751}) = 0
[pid  3506] rt_sigprocmask(SIG_BLOCK, [HUP INT], NULL, 8) = 0
[pid  3506] futex(0xfa61c0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] openat(AT_FDCWD, "/dev/urandom", O_RDONLY) = 4
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.074Z [WARN] "..., 180 <unfinished ...>
[pid  3506] read(4, "\370\3427\371G\32\203W7\27\212\7u\355\t\240\23\270N\274Z\"\1.*0\335\200\301\370\301\5", 32) = 32
[pid  3507] <... write resumed>)        = 180
[pid  3506] read(4,  <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3506] <... read resumed>"\242\340\31\303,<\356\257\262\361JvY8:\264\353v\\\241z\333\330\34S\0327k\27\310@\217", 32) = 32
[pid  3506] futex(0xf9cac0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] futex(0xf9cab4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] futex(0xf9af9c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] futex(0xf9c964, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] futex(0xf9c930, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] brk(0xfed000 <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3506] <... brk resumed>)          = 0xfed000
[pid  3507] write(3, "2023-05-31T03:17:36.077Z [WARN] "..., 132) = 132
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3506] futex(0xf9caac, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] futex(0xf9cb58, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.077Z [WARN] "..., 127 <unfinished ...>
[pid  3506] futex(0xf9caa0, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3507] <... write resumed>)        = 127
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3506] futex(0xf9ca98, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] openat(AT_FDCWD, "/usr/lib/ssl/cert.pem", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  3506] futex(0xf9c93c, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3506] openat(AT_FDCWD, "/usr/lib/ssl/cert.pem", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  3506] stat("/etc/ssl/certs", {st_mode=S_IFDIR|0755, st_size=16384, ...}) = 0
[pid  3506] stat("/etc/ssl/certs/ca-certificates.crt", {st_mode=S_IFREG|0644, st_size=195453, ...}) = 0
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.078Z [WARN] "..., 129) = 129
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3506] stat("/edge/aws/logs/", {st_mode=S_IFDIR|0745, st_size=4096, ...}) = 0
[pid  3506] stat("/edge/aws/logs/", {st_mode=S_IFDIR|0745, st_size=4096, ...}) = 0
[pid  3506] openat(AT_FDCWD, "/edge/aws/logs/sdk.log", O_RDONLY) = 5
[pid  3506] close(5)                    = 0
[pid  3506] stat("/edge/aws/logs/sdk.log", {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
[pid  3506] openat(AT_FDCWD, "/edge/aws/logs/sdk.log", O_RDWR|O_CREAT|O_APPEND, 0666) = 5
[pid  3506] mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f982aa45000
[pid  3506] mprotect(0x7f982aa46000, 8388608, PROT_READ|PROT_WRITE <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3506] <... mprotect resumed>)     = 0
[pid  3507] write(3, "2023-05-31T03:17:36.078Z [WARN] "..., 132 <unfinished ...>
[pid  3506] clone(child_stack=0x7f982b243970, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID <unfinished ...>
[pid  3507] <... write resumed>)        = 132
strace: Process 3508 attached
[pid  3506] <... clone resumed>, parent_tid=[3508], tls=0x7f982b245700, child_tidptr=0x7f982b2459d0) = 3508
[pid  3508] set_robust_list(0x7f982b2459e0, 24 <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3506] sched_setaffinity(3508, 128, [0] <unfinished ...>
[pid  3508] <... set_robust_list resumed>) = 0
[pid  3506] <... sched_setaffinity resumed>) = 0
[pid  3508] futex(0x7f982b245d18, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3506] futex(0x7f982b245d18, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3506] <... futex resumed>)        = 0
[pid  3508] futex(0x7f982b245d18, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3506] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3506] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=102339929}) = 0
[pid  3508] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0 <unfinished ...>
[pid  3506] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] <... mmap resumed>)         = 0x7f981c000000
[pid  3506] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=102497909}) = 0
[pid  3508] munmap(0x7f9820000000, 67108864 <unfinished ...>
[pid  3506] epoll_create(100 <unfinished ...>
[pid  3508] <... munmap resumed>)       = 0
[pid  3506] <... epoll_create resumed>) = 6
[pid  3508] mprotect(0x7f981c000000, 135168, PROT_READ|PROT_WRITE <unfinished ...>
[pid  3506] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] <... mprotect resumed>)     = 0
[pid  3506] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=102709526}) = 0
[pid  3508] fstat(5,  <unfinished ...>
[pid  3506] eventfd2(0, EFD_CLOEXEC|EFD_NONBLOCK <unfinished ...>
[pid  3508] <... fstat resumed>{st_mode=S_IFREG|0600, st_size=0, ...}) = 0
[pid  3506] <... eventfd2 resumed>)     = 7
[pid  3508] futex(0xfe5168, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3506] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=102921921}) = 0
[pid  3506] futex(0xfe5168, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3506] mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0 <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3506] <... mmap resumed>)         = 0x7f982a244000
[pid  3508] <... futex resumed>)        = 0
[pid  3506] mprotect(0x7f982a245000, 8388608, PROT_READ|PROT_WRITE <unfinished ...>
[pid  3508] futex(0xfe516c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3506] <... mprotect resumed>)     = 0
[pid  3506] clone(child_stack=0x7f982aa42970, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTIDstrace: Process 3509 attached
, parent_tid=[3509], tls=0x7f982aa44700, child_tidptr=0x7f982aa449d0) = 3509
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3506] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=103556261}) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.079Z [WARN] "..., 131 <unfinished ...>
[pid  3506] futex(0xfe516c, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3507] <... write resumed>)        = 131
[pid  3506] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3506] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=103790449}) = 0
[pid  3508] <... futex resumed>)        = 0
[pid  3506] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3508] futex(0xfe5168, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3506] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=103910031}) = 0
[pid  3506] futex(0xfe5168, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3509] set_robust_list(0x7f982aa449e0, 24 <unfinished ...>
[pid  3506] <... futex resumed>)        = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3506] mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0 <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3506] <... mmap resumed>)         = 0x7f9829a43000
[pid  3508] <... futex resumed>)        = 0
[pid  3506] mprotect(0x7f9829a44000, 8388608, PROT_READ|PROT_WRITE <unfinished ...>
[pid  3508] futex(0xfe516c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3506] <... mprotect resumed>)     = 0
[pid  3509] <... set_robust_list resumed>) = 0
[pid  3506] clone(child_stack=0x7f982a241970, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTIDstrace: Process 3510 attached
 <unfinished ...>
[pid  3509] mmap(0x7f9820000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0 <unfinished ...>
[pid  3506] <... clone resumed>, parent_tid=[3510], tls=0x7f982a243700, child_tidptr=0x7f982a2439d0) = 3510
[pid  3510] set_robust_list(0x7f982a2439e0, 24 <unfinished ...>
[pid  3509] <... mmap resumed>)         = 0x7f9820000000
[pid  3506] futex(0x7f982a2439d0, FUTEX_WAIT, 3510, NULL <unfinished ...>
[pid  3510] <... set_robust_list resumed>) = 0
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3509] mprotect(0x7f9820000000, 135168, PROT_READ|PROT_WRITE <unfinished ...>
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=104626632}) = 0
[pid  3509] <... mprotect resumed>)     = 0
[pid  3510] mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f9814000000
[pid  3510] munmap(0x7f9818000000, 67108864 <unfinished ...>
[pid  3509] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3510] <... munmap resumed>)       = 0
[pid  3510] mprotect(0x7f9814000000, 135168, PROT_READ|PROT_WRITE <unfinished ...>
[pid  3509] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=104864282}) = 0
[pid  3510] <... mprotect resumed>)     = 0
[pid  3509] futex(0xfe516c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] stat("/edge/aws/certs/XXXXXXXX.private.key", {st_mode=S_IFREG|0600, st_size=1679, ...}) = 0
[pid  3509] <... futex resumed>)        = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3510] stat("/edge/aws/certs/",  <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3510] <... stat resumed>{st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid  3510] getuid( <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3510] <... getuid resumed>)       = 0
[pid  3510] stat("/edge/aws/certs/",  <unfinished ...>
[pid  3507] write(3, "2023-05-31T03:17:36.079Z [WARN] "..., 134 <unfinished ...>
[pid  3510] <... stat resumed>{st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid  3510] stat("/edge/aws/certs/XXXXXXXX.private.key",  <unfinished ...>
[pid  3507] <... write resumed>)        = 134
[pid  3510] <... stat resumed>{st_mode=S_IFREG|0600, st_size=1679, ...}) = 0
[pid  3510] getuid( <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3510] <... getuid resumed>)       = 0
[pid  3510] stat("/edge/aws/certs/XXXXXXXX.private.key",  <unfinished ...>
[pid  3509] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] <... stat resumed>{st_mode=S_IFREG|0600, st_size=1679, ...}) = 0
[pid  3510] stat("/edge/aws/certs/XXXXXXXX.pem.crt",  <unfinished ...>
[pid  3509] <... futex resumed>)        = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3510] <... stat resumed>{st_mode=S_IFREG|0644, st_size=1220, ...}) = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] stat("/edge/aws/certs/",  <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3510] <... stat resumed>{st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid  3508] futex(0xfe5168, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3510] getuid()                    = 0
[pid  3509] epoll_ctl(6, EPOLL_CTL_ADD, 7, {events=EPOLLIN|EPOLLERR|EPOLLHUP|EPOLLRDHUP|EPOLLET, data={u32=536873984, u64=140291348630528}} <unfinished ...>
[pid  3510] stat("/edge/aws/certs/", {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
[pid  3509] <... epoll_ctl resumed>)    = 0
[pid  3510] stat("/edge/aws/certs/XXXXXXXX.pem.crt", {st_mode=S_IFREG|0644, st_size=1220, ...}) = 0
[pid  3509] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3510] getuid()                    = 0
[pid  3509] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=106288718}) = 0
[pid  3510] stat("/edge/aws/certs/XXXXXXXX.pem.crt", {st_mode=S_IFREG|0644, st_size=1220, ...}) = 0
[pid  3509] futex(0xfe5168, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] openat(AT_FDCWD, "/edge/aws/certs/XXXXXXXX.pem.crt", O_RDONLY <unfinished ...>
[pid  3509] <... futex resumed>)        = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3510] <... openat resumed>)       = 8
[pid  3508] futex(0xfe50d0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3510] fstat(8, {st_mode=S_IFREG|0644, st_size=1220, ...}) = 0
[pid  3509] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] fstat(8,  <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3510] <... fstat resumed>{st_mode=S_IFREG|0644, st_size=1220, ...}) = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3507] write(3, "2023-05-31T03:17:36.079Z [WARN] "..., 133 <unfinished ...>
[pid  3510] lseek(8, 0, SEEK_SET <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3507] <... write resumed>)        = 133
[pid  3510] <... lseek resumed>)        = 0
[pid  3508] futex(0xfe516c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3510] read(8,  <unfinished ...>
[pid  3509] <... futex resumed>)        = 1
[pid  3510] <... read resumed>"-----BEGIN CERTIFICATE-----\nMIID"..., 1220) = 1220
[pid  3510] lseek(8, 1220, SEEK_SET <unfinished ...>
[pid  3509] epoll_wait(6,  <unfinished ...>
[pid  3510] <... lseek resumed>)        = 1220
[pid  3510] close(8)                    = 0
[pid  3510] openat(AT_FDCWD, "/edge/aws/certs/XXXXXXXX.private.key", O_RDONLY) = 8
[pid  3510] fstat(8, {st_mode=S_IFREG|0600, st_size=1679, ...}) = 0
[pid  3510] fstat(8, {st_mode=S_IFREG|0600, st_size=1679, ...}) = 0
[pid  3510] lseek(8, 0, SEEK_SET)       = 0
[pid  3510] read(8, "-----BEGIN RSA PRIVATE KEY-----\n"..., 1679) = 1679
[pid  3510] lseek(8, 1679, SEEK_SET)    = 1679
[pid  3510] close(8)                    = 0
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.080Z [WARN] "..., 138) = 138
[pid  3510] openat(AT_FDCWD, "/edge/aws/certs/AmazonRootCA1.pem.crt", O_RDONLY <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3510] <... openat resumed>)       = 8
[pid  3510] fstat(8, {st_mode=S_IFREG|0644, st_size=1188, ...}) = 0
[pid  3510] fstat(8, {st_mode=S_IFREG|0644, st_size=1188, ...}) = 0
[pid  3510] lseek(8, 0, SEEK_SET)       = 0
[pid  3510] read(8, "-----BEGIN CERTIFICATE-----\nMIID"..., 1188) = 1188
[pid  3510] lseek(8, 1188, SEEK_SET)    = 1188
[pid  3510] close(8)                    = 0
[pid  3510] openat(AT_FDCWD, "/usr/lib/ssl/cert.pem", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  3510] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=108896682}) = 0
[pid  3510] futex(0xfe516c, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid  3508] futex(0xfe5168, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.080Z [WARN] "..., 143) = 143
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3510] openat(AT_FDCWD, "/dev/urandom", O_RDONLY) = 8
[pid  3510] fstat(8, {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x9), ...}) = 0
[pid  3510] openat(AT_FDCWD, "/dev/random", O_RDONLY) = 9
[pid  3510] fstat(9, {st_mode=S_IFCHR|0666, st_rdev=makedev(0x1, 0x8), ...}) = 0
[pid  3510] openat(AT_FDCWD, "/dev/srandom", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  3510] futex(0xf9cbc8, FUTEX_WAKE_PRIVATE, 2147483647 <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3510] <... futex resumed>)        = 0
[pid  3507] writev(3, [{iov_base="2023-05-31T03:17:36.080Z [INFO] "..., iov_len=47}, {iov_base="Successfully fetched JSON config"..., iov_len=1406}], 2 <unfinished ...>
[pid  3510] read(4,  <unfinished ...>
[pid  3507] <... writev resumed>)       = 1453
[pid  3510] <... read resumed>"\270\365\360\216\343\307\357]\242\322ij\2*^P\353\20\343uEk\0\344\253&\f\273Z\20+\20", 32) = 32
[pid  3507] write(3, "\n", 1 <unfinished ...>
[pid  3510] read(4,  <unfinished ...>
[pid  3507] <... write resumed>)        = 1
[pid  3510] <... read resumed>"\324\235 N\244\n\346:2\363\260\206\1\2121u\246\37\330B\234O\320\261!:<\343\264\353\233X", 32) = 32
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000}, 0x7f982ba44710) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.083Z [INFO] "..., 160) = 160
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3510] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=112462358}) = 0
[pid  3510] futex(0xfe5168, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=112660417}) = 0
[pid  3508] <... futex resumed>)        = 0
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] futex(0xfe516c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=112738269}) = 0
[pid  3510] futex(0xfe516c, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=112910162}) = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3507] write(3, "2023-05-31T03:17:36.084Z [INFO] "..., 118 <unfinished ...>
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3507] <... write resumed>)        = 118
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=113013278}) = 0
[pid  3508] futex(0xfe5168, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3510] futex(0xfe5168, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3510] <... futex resumed>)        = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3510] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3510] <... futex resumed>)        = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=113270015}) = 0
[pid  3508] futex(0xfe516c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3510] futex(0xfe516c, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3510] <... futex resumed>)        = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3510] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3510] <... futex resumed>)        = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=113517406}) = 0
[pid  3508] futex(0xfe5168, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3510] futex(0xfe5168, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3510] <... futex resumed>)        = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3510] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3510] <... futex resumed>)        = 0
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] clock_gettime(CLOCK_REALTIME,  <unfinished ...>
[pid  3508] <... futex resumed>)        = 0
[pid  3510] <... clock_gettime resumed>{tv_sec=1685503056, tv_nsec=113750967}) = 0
[pid  3508] futex(0xfe516c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3510] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=113829332}) = 0
[pid  3510] futex(0xfe516c, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3510] mmap(NULL, 8392704, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0 <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3510] <... mmap resumed>)         = 0x7f9829242000
[pid  3508] <... futex resumed>)        = 0
[pid  3510] mprotect(0x7f9829243000, 8388608, PROT_READ|PROT_WRITE <unfinished ...>
[pid  3508] futex(0xfe5168, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3510] <... mprotect resumed>)     = 0
[pid  3510] clone(child_stack=0x7f9829a40970, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
strace: Process 3511 attached
[pid  3507] write(3, "2023-05-31T03:17:36.086Z [INFO] "..., 200 <unfinished ...>
[pid  3511] set_robust_list(0x7f9829a429e0, 24 <unfinished ...>
[pid  3510] <... clone resumed>, parent_tid=[3511], tls=0x7f9829a42700, child_tidptr=0x7f9829a429d0) = 3511
[pid  3507] <... write resumed>)        = 200
[pid  3511] <... set_robust_list resumed>) = 0
[pid  3510] futex(0x7f98140081f0, FUTEX_WAIT, 2147483648, NULL <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3511] mmap(0x7f9818000000, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = 0x7f9818000000
[pid  3511] mprotect(0x7f9818000000, 135168, PROT_READ|PROT_WRITE) = 0
[pid  3511] clock_gettime(CLOCK_REALTIME, {tv_sec=1685503056, tv_nsec=114547760}) = 0
[pid  3511] futex(0xfe5168, FUTEX_WAKE_PRIVATE, 1) = 1
[pid  3508] <... futex resumed>)        = 0
[pid  3511] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid  3511] <... futex resumed>)        = 0
[pid  3508] <... futex resumed>)        = -1 EAGAIN (Resource temporarily unavailable)
[pid  3511] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0 <unfinished ...>
[pid  3508] futex(0xfe50d0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid  3511] <... socket resumed>)       = 10
[pid  3508] <... futex resumed>)        = 0
[pid  3511] connect(10, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110 <unfinished ...>
[pid  3508] futex(0xfe516c, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  3511] <... connect resumed>)      = -1 ENOENT (No such file or directory)
[pid  3511] close(10)                   = 0
[pid  3511] socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 10
[pid  3511] connect(10, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid  3511] close(10)                   = 0
[pid  3511] openat(AT_FDCWD, "/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 10
[pid  3511] fstat(10, {st_mode=S_IFREG|0644, st_size=509, ...}) = 0
[pid  3511] read(10, "# /etc/nsswitch.conf\n#\n# Example"..., 4096) = 509
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3511] read(10,  <unfinished ...>
[pid  3507] write(3, "2023-05-31T03:17:36.087Z [DEBUG]"..., 147 <unfinished ...>
[pid  3511] <... read resumed>"", 4096) = 0
[pid  3507] <... write resumed>)        = 147
[pid  3511] close(10 <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3511] <... close resumed>)        = 0
[pid  3511] stat("/etc/resolv.conf", {st_mode=S_IFREG|0644, st_size=788, ...}) = 0
[pid  3511] openat(AT_FDCWD, "/etc/host.conf", O_RDONLY|O_CLOEXEC) = -1 ENOENT (No such file or directory)
[pid  3511] futex(0xfa6df4, FUTEX_WAKE_PRIVATE, 2147483647) = 0
[pid  3511] openat(AT_FDCWD, "/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 10
[pid  3511] fstat(10, {st_mode=S_IFREG|0644, st_size=788, ...}) = 0
[pid  3511] read(10, "# This is /run/systemd/resolve/r"..., 4096) = 788
[pid  3511] read(10, "", 4096)          = 0
[pid  3511] close(10)                   = 0
[pid  3511] openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 10
[pid  3511] fstat(10, {st_mode=S_IFREG|0644, st_size=11195, ...}) = 0
[pid  3511] mmap(NULL, 11195, PROT_READ, MAP_PRIVATE, 10, 0) = 0x7f982923f000
[pid  3511] close(10)                   = 0
[pid  3511] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3511] openat(AT_FDCWD, "/lib/libnss_files.so.2", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid  3507] write(3, "2023-05-31T03:17:36.087Z [DEBUG]"..., 184 <unfinished ...>
[pid  3511] <... openat resumed>)       = 10
[pid  3507] <... write resumed>)        = 184
[pid  3511] read(10,  <unfinished ...>
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3511] <... read resumed>"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\0\0\0\0\0\0\0"..., 832) = 832
[pid  3511] fstat(10, {st_mode=S_IFREG|0755, st_size=14232, ...}) = 0
[pid  3511] mmap(NULL, 16400, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 10, 0) = 0x7f982923a000
[pid  3511] mmap(0x7f982923b000, 4096, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x1000) = 0x7f982923b000
[pid  3511] mmap(0x7f982923c000, 4096, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x2000) = 0x7f982923c000
[pid  3511] mmap(0x7f982923d000, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x2000) = 0x7f982923d000
[pid  3511] close(10)                   = 0
[pid  3511] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid  3511] openat(AT_FDCWD, "/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 10
[pid  3511] read(10, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@\327\2\0\0\0\0\0"..., 832) = 832
[pid  3511] lseek(10, 64, SEEK_SET)     = 64
[pid  3511] read(10, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784) = 784
[pid  3511] lseek(10, 848, SEEK_SET)    = 848
[pid  3511] read(10, "\4\0\0\0000\0\0\0\5\0\0\0GNU\0\2\0\0\300\4\0\0\0\3\0\0\0\0\0\0\0"..., 64) = 64
[pid  3511] lseek(10, 912, SEEK_SET <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3511] <... lseek resumed>)        = 912
[pid  3507] write(3, "2023-05-31T03:17:36.093Z [DEBUG]"..., 261 <unfinished ...>
[pid  3511] read(10,  <unfinished ...>
[pid  3507] <... write resumed>)        = 261
[pid  3511] <... read resumed>"\4\0\0\0\24\0\0\0\3\0\0\0GNU\0\223\t\304D\236j\331_N\t\366j\217t\346\v"..., 68) = 68
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3511] fstat(10, {st_mode=S_IFREG|0755, st_size=2035928, ...}) = 0
[pid  3511] lseek(10, 64, SEEK_SET)     = 64
[pid  3511] read(10, "\6\0\0\0\4\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0@\0\0\0\0\0\0\0"..., 784) = 784
[pid  3511] mmap(NULL, 2087984, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 10, 0) = 0x7f982903c000
[pid  3511] mprotect(0x7f9829068000, 1830912, PROT_NONE) = 0
[pid  3511] mmap(0x7f9829068000, 1478656, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x2c000) = 0x7f9829068000
[pid  3511] mmap(0x7f98291d1000, 348160, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x195000) = 0x7f98291d1000
[pid  3511] mmap(0x7f9829227000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x1ea000) = 0x7f9829227000
[pid  3511] mmap(0x7f982922d000, 52272, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f982922d000
[pid  3511] close(10)                   = 0
[pid  3511] access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
[pid  3511] openat(AT_FDCWD, "/lib/ld-linux-x86-64.so.2", O_RDONLY|O_CLOEXEC) = 10
[pid  3511] read(10, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\320\1\0\0\0\0\0"..., 832) = 832
[pid  3511] fstat(10, {st_mode=S_IFREG|0755, st_size=227280, ...}) = 0
[pid  3511] mmap(NULL, 230040, PROT_READ, MAP_PRIVATE|MAP_DENYWRITE, 10, 0 <unfinished ...>
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3511] <... mmap resumed>)         = 0x7f9829003000
[pid  3507] write(3, "2023-05-31T03:17:36.093Z [DEBUG]"..., 68 <unfinished ...>
[pid  3511] mprotect(0x7f9829005000, 208896, PROT_NONE <unfinished ...>
[pid  3507] <... write resumed>)        = 68
[pid  3511] <... mprotect resumed>)     = 0
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3511] mmap(0x7f9829005000, 159744, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x2000) = 0x7f9829005000
[pid  3511] mmap(0x7f982902c000, 45056, PROT_READ, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x29000) = 0x7f982902c000
[pid  3511] mmap(0x7f9829038000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 10, 0x34000) = 0x7f9829038000
[pid  3511] close(10)                   = 0
[pid  3511] arch_prctl(0x3001 /* ARCH_??? */, 0x7f9829a3f570) = -1 EINVAL (Invalid argument)
[pid  3511] mprotect(0x7f9829038000, 8192, PROT_READ) = 0
[pid  3511] mprotect(0x7f9829227000, 12288, PROT_READ) = 0
[pid  3511] mprotect(0x7f982923d000, 4096, PROT_READ) = 0
[pid  3511] munmap(0x7f982923f000, 11195) = 0
[pid  3511] getrandom("\xc8\x1b\xf3\xf7\x7a\x36\x55\x72", 8, GRND_NONBLOCK) = 8
[pid  3511] mmap(NULL, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9828f03000
[pid  3507] <... nanosleep resumed>0x7f982ba44710) = 0
[pid  3507] write(3, "2023-05-31T03:17:36.095Z [INFO] "..., 103 <unfinished ...>
[pid  3511] openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC <unfinished ...>
[pid  3507] <... write resumed>)        = 103
[pid  3511] <... openat resumed>)       = 10
[pid  3507] nanosleep({tv_sec=0, tv_nsec=1000000},  <unfinished ...>
[pid  3511] newfstatat(10, "", {st_mode=S_IFREG|0644, st_size=262, ...}, AT_EMPTY_PATH) = 0
[pid  3511] lseek(10, 0, SEEK_SET)      = 0
[pid  3511] read(10, "127.0.0.1\tlocalhost.localdomain\t"..., 4096) = 262
[pid  3511] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x63} ---
[pid  3506] <... futex resumed>)        = ?
[pid  3510] <... futex resumed>)        = ?
[pid  3509] <... epoll_wait resumed> <unfinished ...>) = ?
[pid  3508] <... futex resumed>)        = ?
[pid  3507] <... nanosleep resumed> <unfinished ...>) = ?
[pid  3508] +++ killed by SIGSEGV +++
[pid  3510] +++ killed by SIGSEGV +++
[pid  3509] +++ killed by SIGSEGV +++
[pid  3507] +++ killed by SIGSEGV +++
[pid  3511] +++ killed by SIGSEGV +++
+++ killed by SIGSEGV +++

It appears that SIGSEGV happens when attempting to read file contents into a buffer, so maybe related to the issue at hand. By moving /etc/hosts and /etc/host.conf out of the way I was able to stop it from dumping core and successfully tested the secure tunnel connection.

@dbouras
Copy link

dbouras commented May 31, 2023

Maybe a thread-safety issue?
(as in this example: Qihoo360/evpp#164)

@dbouras
Copy link

dbouras commented Jun 1, 2023

Just to be sure, I rebuilt the entire Yocto image from scratch with the most recent Kirkstone release, and meta-aws from aws4embeddedlinux/meta-aws@912bb30  as suggested by @thomas-roos. Hashes and versions as follows:

BB_VERSION           = "2.0.0"
BUILD_SYS            = "x86_64-linux"
NATIVELSBSTRING      = "universal"
TARGET_SYS           = "x86_64-poky-linux"
MACHINE              = "zeronorth-edge"
DISTRO               = "poky"
DISTRO_VERSION       = "4.0.10"
TUNE_FEATURES        = "m64 core2"
TARGET_FPU           = ""
meta-aws             = "HEAD:912bb30955c7b28e70d3440dc6b24d8794aa7042"
meta-intel           = "HEAD:1edf26e5b90371c9acf7bd6ac7155000de85f133"
meta-filesystems     
meta-networking      
meta-oe              
meta-python          = "HEAD:5f120a926b0fcd55cfe7565bb7ddf23661cad498"
meta-python2         = "HEAD:f02882e2aa9279ca7becca8d0cedbffe88b5a253"
meta-swupdate        = "HEAD:91de63c006e886eba8d9c263c9aa55d4c7632072"
meta                 
meta-poky            = "HEAD:f53ab3a2ff206a130cdc843839dd0ea5ec4ad02f"

The startup problem when aws-iot-device-client would dump core while trying to read the conf file is not there, but it still dumps core when trying to receive the secure tunnel notification:

023-06-01T07:45:36.298Z [INFO]  {Main.cpp}: Now running AWS IoT Device Client version v1.8.0-bb7ff67
2023-06-01T07:45:36.307Z [INFO]  {SharedCrtResourceManager.cpp}: SDK logging is enabled. Check /edge/aws/logs/sdk.log for SDK logs.
2023-06-01T07:45:36.308Z [DEBUG] {Retry.cpp}: Retryable function starting, it will retry until success
2023-06-01T07:45:36.317Z [INFO]  {SharedCrtResourceManager.cpp}: Establishing MQTT connection with client id XXXXXXXX...
2023-06-01T07:45:36.805Z [INFO]  {SharedCrtResourceManager.cpp}: MQTT connection established with return code: 0
2023-06-01T07:45:36.805Z [INFO]  {SharedCrtResourceManager.cpp}: Shared MQTT connection is ready!
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Provisioning with Secure Elements is disabled
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Config shadow is disabled
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Jobs is disabled
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Secure Tunneling is enabled
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Device Defender is disabled
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Sample shadow is disabled
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Pub Sub is disabled
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Sensor Publish is disabled
2023-06-01T07:45:36.806Z [INFO]  {SharedCrtResourceManager.cpp}: Starting Device Client features.
2023-06-01T07:45:36.806Z [DEBUG] {FeatureRegistry.cpp}: Attempting to start Secure Tunneling
2023-06-01T07:45:36.806Z [INFO]  {SecureTunnelingFeature.cpp}: Running Secure Tunneling!
2023-06-01T07:45:36.806Z [INFO]  {Main.cpp}: Client base has been notified that Secure Tunneling has started
2023-06-01T07:45:36.928Z [DEBUG] {SecureTunnelingFeature.cpp}: Subscribed to tunnel notification topic
2023-06-01T07:45:37.720Z [DEBUG] {SecureTunnelingFeature.cpp}: Received MQTT Tunnel Notification
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid
Aborted

@thomas-roos
Copy link
Contributor Author

This explains why the ptest did work but your error was already existing back in March...

@RogerZhongAWS
Copy link
Contributor

I was able to look into this issue and as far as I'm aware, the errors started appear after this commit: https://github.com/awslabs/aws-crt-cpp/pull/460/files#diff-ec35b151e2548713e7314d9a56c645646f3d625be574601453d153d5a95d9d1f, which moves much of the JSON utility away from C++ Class member functions towards the aws-c-common implementation.

In the device client, we use the CRT upon reading config json, by checking if a value was provided from each key in the config json file. https://github.com/awslabs/aws-crt-cpp/blob/0e747152208198b56033339a9d6af4d541c74c0a/source/JsonObject.cpp#L521
When attempting to get the value from the key, this function: https://github.com/awslabs/aws-c-common/blob/2e01315574255ceb6cdcf6734d4d164b9f586877/source/json.c#L107 attempts to construct an aws_string... presumably to append a null terminator to the end of whatever byte cursor is passed as an arg.
When constructing the aws_string, the assertion allocator && aws_byte_cursor_is_valid(cursor) fails https://github.com/awslabs/aws-c-common/blob/2e01315574255ceb6cdcf6734d4d164b9f586877/source/string.c#L216

From the backtrace pasted by @thomas-roos and my own testing with gdb, it appears that the allocator pointer is set to NULL while the byte cursor is non-null and valid

    (gdb) s
    Aws::Crt::JsonView::ValueExists (this=0x7fffffffd638,
        key=0x5555561775d0 <Aws::Iot::DeviceClient::PlainConfig::JSON_KEY_ENDPOINT> "endpoint")
        at /home/ubuntu/iot/aws-iot-device-client/build/aws-iot-device-sdk-cpp-v2-src/crt/aws-crt-cpp/source/JsonObject.cpp:525
    525	                aws_json_value *item = aws_json_value_get_from_object(m_value, aws_byte_cursor_from_c_str(key));
    (gdb) info args
    this = 0x7fffffffd638
    key = 0x5555561775d0 <Aws::Iot::DeviceClient::PlainConfig::JSON_KEY_ENDPOINT> "endpoint"
    (gdb) info local
    item = 0x555556a926f0
    (gdb) s
    aws_json_value_get_from_object (
        object=0x5555561775d0 <Aws::Iot::DeviceClient::PlainConfig::JSON_KEY_ENDPOINT>, key=...)
        at /home/ubuntu/iot/aws-iot-device-client/build/aws-iot-device-sdk-cpp-v2-src/crt/aws-crt-cpp/crt/aws-c-common/source/json.c:107
    107	struct aws_json_value *aws_json_value_get_from_object(const struct aws_json_value *object, struct aws_byte_cursor key) {
    (gdb) s
    109	    void *return_value = NULL;
    (gdb) info args
    object = 0x555556a8d140
    key = {len = 8,
      ptr = 0x5555561775d0 <Aws::Iot::DeviceClient::PlainConfig::JSON_KEY_ENDPOINT> "endpoint"}
    (gdb) s
    110	    struct aws_string *tmp = aws_string_new_from_cursor(s_aws_json_module_allocator, &key);
    (gdb) s
    aws_string_new_from_cursor (allocator=0x555556a9598f, cursor=0x555556a9598f)
        at /home/ubuntu/iot/aws-iot-device-client/build/aws-iot-device-sdk-cpp-v2-src/crt/aws-crt-cpp/crt/aws-c-common/source/string.c:216
    216	struct aws_string *aws_string_new_from_cursor(struct aws_allocator *allocator, const struct aws_byte_cursor *cursor) {
    (gdb) s
    217	    AWS_PRECONDITION(allocator && aws_byte_cursor_is_valid(cursor));
    (gdb) info args
    allocator = 0x0
    cursor = 0x7fffffffd410
    (gdb) s
    aws_fatal_assert (cond_str=0x555556a926f0 "", file=0x7fffffffd4f0 "", line=21845)
        at /home/ubuntu/iot/aws-iot-device-client/build/aws-iot-device-sdk-cpp-v2-src/crt/aws-crt-cpp/crt/aws-c-common/source/assert.c:13

By design, the only way for the allocator to be set is to transitively call the init function in the crt module by properly initializing the CRT with an allocator. We did not do that in the device client... We initialize the CRT before starting our features, but after we initialize our config parser, which means any code in the config parser that references the SDK JSON api has a chance to fail with a null pointer exception.

TLDR; Some time ago we introduced a latent circular dependency that only uncovered itself after a recent update to the CRT... Because our CRT resource manager requires args to be passed from the config, which themselves are parsed using the CRT JSON model! We need to push a fix to the client and the unit tests to solve issues stemming from both.

@RogerZhongAWS
Copy link
Contributor

However, Device Client dumping core when using secure tunneling is most likely a separate issue, will continue investigating that one next week.

@HarshGandhi-AWS
Copy link
Contributor

I was able to reproduce this issue last week and I am actively working on resolving it. I came across multiple other issues in device client few of which are resolved and others will be addressed soon. Right now I am working on resolving the integration test failures in DC I saw after updating SDK version.

@HarshGandhi-AWS
Copy link
Contributor

After updating SDK version, Device Defender code started failing. With DD feature enabled, device client is failing with an segmentation error at this line of code.
It is happening because allocator passed over here is set to NULL.

@HarshGandhi-AWS
Copy link
Contributor

HarshGandhi-AWS commented Jul 13, 2023

I was able to resolve all of the integration and unit test failures in device client we saw after updating the SDK version. Device Defender integration tests are failing once in every 3 times. Manually tested changes and I can confirm that DD feature is working fine. I was able to see the device side metrics being uploaded from the device.

I suspect the test failure is related to the Secure Tunneling issue which @RogerZhongAWS is working on. Secure Tunneling is failing to establish SSH connection with source (web based local proxy) after SDK version was updated. Once the Secure Tunneling issue on both DC and SDK side is resolved, I will be able to merge my changes.

@HarshGandhi-AWS
Copy link
Contributor

Hello @thomas-roos Since it will take some time to resolve this issue, to unblock you I would suggest you to use the older version of aws-meta repo where yocto recipe is using the this SDK commit.

@HarshGandhi-AWS
Copy link
Contributor

Hello @thomas-roos , we have resolved this issue and made a new DC release with tag v1.9. Can you please update the tag on meta-aws repo and test this again?

https://github.com/awslabs/aws-iot-device-client/releases/tag/v1.9

thomas-roos added a commit to thomas-roos/meta-aws that referenced this issue Dec 22, 2023
thomas-roos added a commit to aws4embeddedlinux/meta-aws that referenced this issue Dec 22, 2023
github-actions bot pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Dec 22, 2023
github-actions bot pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Dec 22, 2023
github-actions bot pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Dec 22, 2023
@thomas-roos
Copy link
Contributor Author

this issue is fixed with version 1.9

dwalkes added a commit to Trellis-Logic/meta-aws that referenced this issue Mar 13, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: aws4embeddedlinux@80c74bf
dwalkes added a commit to Trellis-Logic/meta-aws that referenced this issue Mar 31, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: aws4embeddedlinux@80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
dwalkes added a commit to Trellis-Logic/meta-aws that referenced this issue Mar 31, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: aws4embeddedlinux@80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
dwalkes added a commit to Trellis-Logic/meta-aws that referenced this issue Mar 31, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: aws4embeddedlinux@80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
thomas-roos pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Apr 2, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: 80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
github-actions bot pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Apr 2, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: 80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
github-actions bot pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Apr 2, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: 80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
github-actions bot pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Apr 2, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: 80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
aws-iot-embedded-linux-ci2 pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Apr 2, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: 80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
aws-iot-embedded-linux-ci2 pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Apr 2, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: 80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
aws-iot-embedded-linux-ci2 pushed a commit to aws4embeddedlinux/meta-aws that referenced this issue Apr 2, 2024
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with
[2].

1: awslabs/aws-iot-device-client#411 (comment)
2: 80c74bf

Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

4 participants