Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[vcpkg-tool] Tool sometimes segfaults in manifest mode #26651

Closed
omartijn opened this issue Sep 2, 2022 · 13 comments · Fixed by microsoft/vcpkg-tool#796
Closed

[vcpkg-tool] Tool sometimes segfaults in manifest mode #26651

omartijn opened this issue Sep 2, 2022 · 13 comments · Fixed by microsoft/vcpkg-tool#796
Assignees
Labels
category:vcpkg-bug The issue is with the vcpkg system (including helper scripts in `scripts/cmake/`) requires:tool-release An issue that has been fixed in the microsoft/vcpkg-tool repo and is waiting for a release thereof

Comments

@omartijn
Copy link
Contributor

omartijn commented Sep 2, 2022

Describe the bug

On rare occasions vcpkg-tool will crash during a CMake configure. It crashes right after calculatng the installation plan in popen syscall, from what I can tell this is where it tries to restore packages from cache.

Environment

  • OS: macOS 12.0.1
  • Compiler: AppleClang 13.0

To Reproduce
Steps to reproduce the behavior:

  1. Create a vcpkg.json
  2. Create a CMake project that uses manifest mode
  3. Start configure
  4. See error

Expected behavior
vcpkg-tool should not crash

I've attached a crash log, it crashes in Thread 46 which does not really seem to start, it doesn't get to the posix_spawn_file_actions_addclose like the other threads. My gut feeling tells me that it's because one of the input arguments is bogus and it's trying to read invalid memory because of it.

vcpkg-crash.txt

@Thomas1664
Copy link
Contributor

I think the reason for this is that cmd_execute_and_stream_data uses std::function<void(StringView)> data_cb, instead of std::function<void(const std::string&)> data_cb,. I recently had a similar issue when using string views in a multithreaded context.

How reliably can you reproduce the bug and what inputs do you use, i.e. what are the contents of your vcpkg.json?

@omartijn
Copy link
Contributor Author

omartijn commented Sep 2, 2022

It's quite hard to reproduce. It happens maybe once or twice, but only on our CI, I'm not having any success reproducing it, otherwise I'd be able to attach a debugger.

I've even tried to build vcpkg-tool with ASAN and TSAN enabled. Even like that, I couldn't trigger a failure. I added some logging to get the command that is executed, for me it's something like this:

/Users/martijn/git/software2/vcpkg/vcpkg install --triplet x64-osx --vcpkg-root /Users/martijn/git/software2/vcpkg --x-wait-for-lock --x-manifest-root=/Users/martijn/git/software2 --x-install-root=/Users/martijn/git/software2/build/vcpkg_installed --overlay-triplets=/Users/martijn/git/software2/vcpkg-triplet

@dg0yt
Copy link
Contributor

dg0yt commented Sep 4, 2022

I guess this is related to the second sub-item in #21905.

@Thomas1664
Copy link
Contributor

Thomas1664 commented Sep 4, 2022

@dg0yt @omartijn I think I found the reason for the crash: microsoft/vcpkg-tool#695 Because this is currently the only place where multithreading is in use, there should be no crashes anymore in CI.

@omartijn Thanks again for providing debug information!

@LilyWangLL LilyWangLL added the category:vcpkg-bug The issue is with the vcpkg system (including helper scripts in `scripts/cmake/`) label Sep 5, 2022
@Thomas1664
Copy link
Contributor

Thomas1664 commented Sep 5, 2022

Thread 46 Crashed:
0   libsystem_c.dylib             	    0x7ff8197edd22 popen + 478
1   vcpkg                         	       0x10a10ca3d vcpkg::cmd_execute_and_stream_data(vcpkg::Command const&, std::__1::function<void (vcpkg::StringView)>, vcpkg::WorkingDirectory const&, vcpkg::Environment const&, vcpkg::Encoding) + 1037
2   vcpkg                         	       0x10a10bc86 vcpkg::cmd_execute_and_capture_output(vcpkg::Command const&, vcpkg::WorkingDirectory const&, vcpkg::Environment const&, vcpkg::Encoding, vcpkg::EchoInDebug) + 118
3   vcpkg                         	       0x10a10da60 std::__1::__async_func<vcpkg::cmd_execute_and_capture_output_parallel(vcpkg::Span<vcpkg::Command const>, vcpkg::WorkingDirectory const&, vcpkg::Environment const&)::$_2>::operator()() + 128
4   vcpkg                         	       0x10a10d97a std::__1::__async_assoc_state<void, std::__1::__async_func<vcpkg::cmd_execute_and_capture_output_parallel(vcpkg::Span<vcpkg::Command const>, vcpkg::WorkingDirectory const&, vcpkg::Environment const&)::$_2> >::__execute() + 26
5   vcpkg                         	       0x10a10daee void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<vcpkg::cmd_execute_and_capture_output_parallel(vcpkg::Span<vcpkg::Command const>, vcpkg::WorkingDirectory const&, vcpkg::Environment const&)::$_2> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<vcpkg::cmd_execute_and_capture_output_parallel(vcpkg::Span<vcpkg::Command const>, vcpkg::WorkingDirectory const&, vcpkg::Environment const&)::$_2> >*> >(void*) + 62
6   libsystem_pthread.dylib       	    0x7ff8198cf514 _pthread_start + 125
7   libsystem_pthread.dylib       	    0x7ff8198cb02f thread_start + 15
...
"exception" : {"codes":"0x0000000000000001, 0x0000116d9afd8010","rawCodes":[1,19162449412112],"type":"EXC_BAD_ACCESS","signal":"SIGSEGV","subtype":"KERN_INVALID_ADDRESS at 0x0000116d9afd8010"},
"termination" : {"flags":0,"code":11,"namespace":"SIGNAL","indicator":"Segmentation fault: 11","byProc":"exc handler","byPid":87602},

The crash occurs in popen. This is most likely an issue with Xcode. Compiling vcpkg with a new version of Xcode should fix the issue.

@Thomas1664
Copy link
Contributor

Thomas1664 commented Sep 6, 2022

CI logs: https://dev.azure.com/vcpkg/public/_build/results?buildId=77580&view=logs&jobId=cf796865-97b7-5cd1-be8e-6e00ce4fd8cf&j=cf796865-97b7-5cd1-be8e-6e00ce4fd8cf&t=6d56bf67-9f63-54b3-7bd2-1eeca5f88481

I did some further debugging and it seems like there is some null termination missing. The difficulty is to debug this issue as I don't have a Mac and the issue seems to depend on code generation because adding more print statements makes it disappear. Since the issue reproduces 100% in CI runs with the current setup @omartijn can you please try to debug this on your Mac using the linked PR branch (microsoft/vcpkg-tool#695)?

That being said my new theory is that there is some wrong code generation/ optimization going on and that it is not caused by vcpkg.

@BillyONeal What do you think about this? Is this issue worth adding fuzzing to it?

@omartijn
Copy link
Contributor Author

omartijn commented Sep 7, 2022

Alright, with all the work you'd done to be able to explicitly trigger this bug, I managed to find a source of undefined behaviour.

vcpkg-tool uses a vcpkg::StringView class, which, as the name suggests, gives you a view over some string-data. It's of course expected that the source remains available as long as the vcpkg::StringView is used. This is violated, however, in places where the output of Strings::concat is directly used as an argument to a function that accepts a vcpkg::StringView.

What happens in that case is that a temporary std::string is created, then an implicit string view is created, the std::string gets discarded and the string view is now pointing to invalid memory. To fix this, we should add a deleted constructor to the vcpkg::StringView class:

StringView(std::string&&) = delete;

This way, it'll still work if there is an existing string, but not when a temporary is passed. This leads to many compile errors - which are all bugs triggering UB.

@JackBoosY
Copy link
Contributor

cc @BillyONeal

@omartijn
Copy link
Contributor Author

omartijn commented Sep 7, 2022

BTW: If I compile this example program:

#include <string_view>
#include <string>

int main()
{
    std::string_view bla = std::string{"Hello, world!"};
}

and compile it with clang, even without specifying warning flags, I get the following warning:

string_view.cpp:6:28: warning: object backing the pointer will be destroyed at the end of the full-expression [-Wdangling-gsl]
    std::string_view bla = std::string{"Hello, world!"};
                           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

which is really helpful in diagnosing this particular class of bug. This warning doesn't come when using the home-grown StringView class (and I don't think you can enable this for your own classes yet, unfortunately).

I guess this class was made to avoid requiring c++17? Is that really important nowadays that the tool is provided pre-built to vcpkg? If so, could we maybe check whether c++17 is available and then simply alias vcpkg::StringView to std::string_view? That way we get the advantages of useful lifetime checks.

@Thomas1664
Copy link
Contributor

Alright, with all the work you'd done to be able to explicitly trigger this bug, I managed to find a source of undefined behaviour.

vcpkg-tool uses a vcpkg::StringView class, which, as the name suggests, gives you a view over some string-data. It's of course expected that the source remains available as long as the vcpkg::StringView is used. This is violated, however, in places where the output of Strings::concat is directly used as an argument to a function that accepts a vcpkg::StringView.

What happens in that case is that a temporary std::string is created, then an implicit string view is created, the std::string gets discarded and the string view is now pointing to invalid memory. To fix this, we should add a deleted constructor to the vcpkg::StringView class:

StringView(std::string&&) = delete;

This way, it'll still work if there is an existing string, but not when a temporary is passed. This leads to many compile errors - which are all bugs triggering UB.

Thanks a lot for reproducing and explaining the bug. It finally turns out that my initial guess that StringView caused the bug was actually right.

If so, could we maybe check whether c++17 is available and then simply alias vcpkg::StringView to std::string_view?

C++ 17 is already in use. I really don't know why StringView wasn't already replaced by std::string_view. This is something else that @BillyONeal has to answer.

[To avoid any confusion: I'm not working for Microsoft]

@BillyONeal
Copy link
Member

vcpkg-tool uses a vcpkg::StringView class, which, as the name suggests, gives you a view over some string-data. It's of course expected that the source remains available as long as the vcpkg::StringView is used. This is violated, however, in places where the output of Strings::concat is directly used as an argument to a function that accepts a vcpkg::StringView.

That's not true: Strings::concat returns a prvalue, which goes through a temporary materialization conversion in order to call StringView's ctor, materializing a temporary which dies at the end of the full-expression. You would have to show a StringView handed such a temporary outliving it; plain

f(StringView);
int main() {
    f(Strings::concat("a", "b"));
}

does not do that. (Both the materialized temporary std::string and the StringView die at the ; in reverse order)

StringView(std::string&&) = delete;

This way, it'll still work if there is an existing string, but not when a temporary is passed. This leads to many compile errors - which are all bugs triggering UB.

No, StringView is explicitly intended to replace const std::string& and adding such a ctor would break that.

@BillyONeal
Copy link
Member

Standard references for the above:

@omartijn
Copy link
Contributor Author

omartijn commented Sep 7, 2022

Ah, you're right about that. So the issue is (luckily) confined to places where the StringView outlives the temporary std::string, like in vcpkg::parse_download_configuration, where an AssetSourcesParser is constructed taking a temporary std::string that's then used to create a StringView.

The parser is then used a line later - in anothre full-expression - when the temporary is destructed. In this case, clang would've caught the bug had we used std::string_view instead.

@BillyONeal BillyONeal reopened this Nov 7, 2022
@BillyONeal BillyONeal added the requires:tool-release An issue that has been fixed in the microsoft/vcpkg-tool repo and is waiting for a release thereof label Nov 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:vcpkg-bug The issue is with the vcpkg system (including helper scripts in `scripts/cmake/`) requires:tool-release An issue that has been fixed in the microsoft/vcpkg-tool repo and is waiting for a release thereof
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants