Fatal error condition occurred: aws-c-common string.c:217: allocator && aws_byte_cursor_is_valid(cursor) #411
Comments
Of course, this was not noticed earlier because this is not working: #404 |
This is the backtrace: |
BTW: as easy as adding
to your Yocto local.conf |
I created an easily reproducible example: https://github.com/thomas-roos/yocto_example/tree/awslabs_aws-iot-device-client_issues_411 |
I was able to build a static binary in an Ubuntu 20.04 VM that at least doesn't have the startup issue and also does not crash when connecting to the endpoint while a secure tunnel is open. To recap, when using the Siemens kas container, the issue is outlined in this thread and also here. At a 10,000-foot level, the gcc environment in kas is:
and in the Ubuntu 20.04 VM:
The client still dumps core, however, when using child processes to parse /etc/host.conf or /etc/hosts. The contents do not matter - if the file size of either /etc/hosts or /etc/host.conf is anything > 0, the issue occurs. An indicative
It appears that SIGSEGV happens when attempting to read file contents into a buffer, so maybe related to the issue at hand. By moving /etc/hosts and /etc/host.conf out of the way I was able to stop it from dumping core and successfully tested the secure tunnel connection. |
Maybe a thread-safety issue? |
Just to be sure, I rebuilt the entire Yocto image from scratch with the most recent Kirkstone release, and meta-aws from aws4embeddedlinux/meta-aws@912bb30 as suggested by @thomas-roos. Hashes and versions as follows:
The startup problem, where aws-iot-device-client would dump core while trying to read the conf file, is gone, but the client still dumps core when trying to receive the secure tunnel notification:
|
This explains why the ptest worked even though your error already existed back in March... |
I was able to look into this issue and, as far as I'm aware, the errors started appearing after this commit: https://github.com/awslabs/aws-crt-cpp/pull/460/files#diff-ec35b151e2548713e7314d9a56c645646f3d625be574601453d153d5a95d9d1f, which moves much of the JSON utility away from C++ class member functions towards the aws-c-common implementation. In the device client, we use the CRT when reading the config JSON, checking whether a value was provided for each key in the config file. https://github.com/awslabs/aws-crt-cpp/blob/0e747152208198b56033339a9d6af4d541c74c0a/source/JsonObject.cpp#L521 From the backtrace pasted by @thomas-roos and my own testing with gdb, it appears that the allocator pointer is set to NULL while the byte cursor is non-null and valid.
By design, the only way for the allocator to be set is to transitively call the init function in the crt module by properly initializing the CRT with an allocator. We did not do that in the device client: we initialize the CRT before starting our features, but after we initialize our config parser, which means any code in the config parser that references the SDK JSON API can fail on the NULL allocator. TL;DR: some time ago we introduced a latent circular dependency that only surfaced after a recent update to the CRT, because our CRT resource manager requires args to be passed from the config, which are themselves parsed using the CRT JSON model! We need to push a fix to the client and the unit tests to solve issues stemming from both. |
However, Device Client dumping core when using secure tunneling is most likely a separate issue. I will continue investigating that one next week. |
I was able to reproduce this issue last week and I am actively working on resolving it. I came across multiple other issues in Device Client, a few of which are resolved; the others will be addressed soon. Right now I am working on resolving the integration test failures in DC that I saw after updating the SDK version. |
After updating the SDK version, the Device Defender code started failing. With the DD feature enabled, Device Client fails with a segmentation fault at this line of code. |
I was able to resolve all of the integration and unit test failures in Device Client that we saw after updating the SDK version. The Device Defender integration tests still fail about one run in three. I manually tested the changes and can confirm that the DD feature is working fine: I was able to see the device-side metrics being uploaded from the device. I suspect the test failure is related to the Secure Tunneling issue which @RogerZhongAWS is working on. Secure Tunneling has been failing to establish an SSH connection with the source (web-based local proxy) since the SDK version was updated. Once the Secure Tunneling issue is resolved on both the DC and SDK side, I will be able to merge my changes. |
Hello @thomas-roos. Since it will take some time to resolve this issue, to unblock you I would suggest using the older version of the meta-aws repo, where the Yocto recipe uses this SDK commit. |
Hello @thomas-roos, we have resolved this issue and made a new DC release with tag v1.9. Can you please update the tag in the meta-aws repo and test this again? https://github.com/awslabs/aws-iot-device-client/releases/tag/v1.9 |
This issue is fixed with version 1.9. |
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with [2]. 1: awslabs/aws-iot-device-client#411 (comment) 2: aws4embeddedlinux@80c74bf
Based on comments in [1] this was resolved in 1.19, however it wasn't added back in with [2]. 1: awslabs/aws-iot-device-client#411 (comment) 2: aws4embeddedlinux@80c74bf Signed-off-by: Dan Walkes <danwalkes@trellis-logic.com>
Describe the bug
root@qemux86-64:/usr/lib/aws-iot-device-client/ptest# ./test-aws-iot-device-client
[==========] Running 276 tests from 32 test suites.
[----------] Global test environment set-up.
[----------] 48 tests from ConfigTestFixture
[ RUN ] ConfigTestFixture.AllFeaturesEnabled
Fatal error condition occurred in /usr/src/debug/aws-c-common/0.8.19-r0/git/source/string.c:217: allocator && aws_byte_cursor_is_valid(cursor)
Exiting Application
################################################################################
Stack trace:
################################################################################
/usr/lib/libaws-c-common.so.1(aws_backtrace_print+0x62) [0x7f974e575282]
/usr/lib/libaws-c-common.so.1(aws_fatal_assert+0x48) [0x7f974e55a428]
/usr/lib/libaws-c-common.so.1(+0x37599) [0x7f974e57d599]
/usr/lib/libaws-c-common.so.1(aws_json_value_get_from_object+0x29) [0x7f974e56cbf9]
/usr/lib/libaws-crt-cpp.so(_ZNK3Aws3Crt8JsonView11ValueExistsEPKc+0x1d) [0x7f974e6ef47d]
./test-aws-iot-device-client(+0x9b1b9) [0x555e6911d1b9]
./test-aws-iot-device-client(+0x11ce97) [0x555e6919ee97]
./test-aws-iot-device-client(+0x26223d) [0x555e692e423d]
./test-aws-iot-device-client(+0x2562e6) [0x555e692d82e6]
./test-aws-iot-device-client(+0x256572) [0x555e692d8572]
./test-aws-iot-device-client(+0x25671c) [0x555e692d871c]
./test-aws-iot-device-client(+0x256ed3) [0x555e692d8ed3]
./test-aws-iot-device-client(+0x26275d) [0x555e692e475d]
./test-aws-iot-device-client(+0x257051) [0x555e692d9051]
./test-aws-iot-device-client(+0x73d6f) [0x555e690f5d6f]
/lib/libc.so.6(__libc_start_main+0xeb) [0x7f974e0e3cfb]
./test-aws-iot-device-client(+0x8475a) [0x555e6910675a]
Aborted
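The only frame in this trace with a C++ symbol is the mangled `_ZNK3Aws3Crt8JsonView11ValueExistsEPKc`. A small helper like the one below can turn it into a readable name. It is a sketch that assumes a GCC or Clang toolchain providing `<cxxabi.h>`; the function name `demangle` is chosen here for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Demangle an Itanium C++ ABI symbol name as it appears in a backtrace.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const std::string &mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc, so it must be released with free.
    char *out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

Passing the symbol from the trace yields `Aws::Crt::JsonView::ValueExists(char const*) const`, which matches the `JsonView` key lookup discussed in the comments. The `c++filt` command-line tool does the same job without writing any code.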
Environment (please complete the following information):