Crash during loading with multiple WorkerThreads enabled on Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz #5387
Comments
If you're going to recompile, I'd recommend using our latest master, since it contains bug fixes and new features (and more relevant bugs to discover :) ). Hmm, looking at the PKGBUILD file, I don't see how it's getting the 2022-02 version; to me it looks like it's getting the latest master (I could very well be wrong). If you still have the git repo that pacman downloaded, you can always run "git log" in it and see what the latest commit is. Has Pioneer worked on your machine before? |
Yes, despite the initial version string, it updates to the latest master version at every package rebuild. I don't know how either :-)
Yes, I believe it was the version from Jun 8, 2022 or slightly earlier. |
Indeed, that makes sense. Would be interesting if you could find out which commit your release is on.
I believe L19 in the PKGBUILD file then updates L4 with the value of the |
As I stated, it's probably the latest commit, as I updated my package just two days ago and the latest commit on master is 12 days old. And here is proof; I just extracted the git log from the package source directory: commit feb4169 (HEAD -> master, origin/master, origin/HEAD)
|
I will need a backtrace with symbols present to debug this particular error - I'd recommend uploading your output.txt as well, as some log messages are intentionally not logged to the console due to being too verbose (though I don't think it'll make a significant difference). |
Ok, finally I got this:
Also: during compilation I got a strange "compiler internal error", similar to the ones I encountered a few weeks ago under similar conditions, so I was afraid that my RAM or other hardware is failing. But after the final, successful recompilation the main issue persisted unchanged. So it seems to be "stable". |
@pirogronian thanks for bt & bug report! I've taken the liberty to edit it as I did with the first, by adding code block, such that the numbers don't reference issue 1-13. Click "edit" on your post and you can see what needed to be added to code-format it. (Adding: |
I ran memtest and found several RAM errors at addresses above 6113M. So I disabled that region with mem=6113M and started the compilation again. Let's check it again. I hope my PC will not overheat... 😄
Thanks a lot! I cannot figure out how to do this... The code button inserts only a single ` |
Hmm... as far as I can tell, I want to consider this as an issue with memory corruption - the program fails immediately after filling the array with "real stars" when it attempts to write to the remaining space pre-reserved in the array. I've confirmed on my end that the same configuration with respect to exact number of stars being generated and number of worker threads has no issues at all, so I think this may be a problem with your memory setup. The code writes to the exact same array across multiple worker threads, so it makes very little sense why it would work properly during the parallel phase, but fail immediately upon returning to the single-threaded phase and indexing through the stack-allocated array handle. |
I'm also worried that this is about my RAM, but I can't figure out why the issue is so predictable and persistent between runs and even builds. RAM corruption should lead to more random issues, as memory management is dynamic. I have now replaced the newer, corrupted chip with the previous one, which seems to be OK, and I'm recompiling Pioneer again (along with a new kernel with memtest support). For now the issue persists, so it's almost certainly not a dynamic runtime problem.
Edit: I just thought of another option: remembering the not-so-old problems with gcc's "internal compiler error" messages, I checked that the gcc package was updated more than a month ago. Maybe something is broken with its installation or in other devel packages. I'm going to reinstall it now and recompile Pioneer once again...
Update: Reinstalled group base-devel, gcc-libs and glibc. System rebooted. Issue persisted. Recompiling Pioneer once again... I decided to use another PC with Arch Linux for reference. But that will take some time...
Update: I forcefully reinstalled my whole system. Issue persisted. Then I recompiled Pioneer from fresh sources. Issue persisted.
Update: Finally... Yesterday I started to suspect it could be something with the CPU, specifically with multithreading. And edited config.ini, changing Just for sure, I tested also |
Hi! I'm the maintainer of the PKGBUILD file. I'm glad to see it is used by someone. I'm doing it as a hobby and to learn the Arch Linux packaging system, but I'm trying to make it the best quality I can. I'm having the same issue. I have an AMD Ryzen 5 2600 (12) @ 3.400GHz CPU and 8 GiB RAM. My memory is not damaged. Looking at the error I can see the line: /usr/include/c++/12.1.0/bits/stl_vector.h ...... The version of GCC being used is 12. I've tried to compile using version 11, but I still have the same issue. I haven't experienced this issue before; it's the first time. As @pirogronian indicates, it works with WorkerThreads=1. It seems to be an issue related to multithreading. Perhaps a compile option is needed to indicate the CPU type? |
I've compiled commit 291a495, just before commit "Web-eWorks/multithread-improvements", and I can load the game without problems, but I can't save the game. When I save the game, this error appears: /usr/include/c++/12.1.0/bits/stl_vector.h:1142: std::vector<_Tp, _Alloc>::const_reference std::vector<_Tp, _Alloc>::operator[](size_type) const [with _Tp = unsigned int; _Alloc = std::allocator<unsigned int>; const_reference = const unsigned int&; size_type = long unsigned int]: Assertion '__n < this->size()' failed. And when compiling, these warnings appear: It seems Arch Linux uses very new versions of programs and libraries, and perhaps Pioneer is using an iterator that is considered deprecated. |
Good to know! I wasn't expecting it to be actually related to corrupted memory, but unfortunately system errors of that sort mean there's very little actionable information that we can use from the bug report as nothing about the system can be considered in a consistent, stable state.
Interestingly, this doesn't actually disable multithreading at all; it just reduces the number of threads involved to a 'core' thread and a 'worker' thread, which both still work to fill the vector in question. It's not related to 'cpu type', as a Core i3-2100 is a very different CPU from a Ryzen 5 2600. ...and looking at the code I just realized what the actual problem is. Try adding:
stars.pos.resize(NUM_BG_STARS);
stars.color.resize(NUM_BG_STARS);
stars.brightness.resize(NUM_BG_STARS);
to
This is a completely unrelated warning (the code it's warning about is not even in scope when the error is triggered) and is not relevant to the problem at hand. Inheriting from std::iterator will remain deprecated-but-allowed for a very long time. |
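For readers following along, here is a minimal sketch of why the suggested resize() calls would matter, assuming the original code only reserved capacity for the arrays (as the "pre-reserved" wording earlier in the thread suggests). The struct name, the element type and the value of NUM_BG_STARS below are hypothetical stand-ins, not Pioneer's actual definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical value; the real constant in Pioneer is not shown in this thread.
constexpr std::size_t NUM_BG_STARS = 8;

// Hypothetical stand-in for the object holding the three arrays named in the fix.
struct StarData {
    std::vector<float> pos;
    std::vector<float> color;
    std::vector<float> brightness;
};

// reserve() only allocates capacity: size() stays 0, so every index is
// out of bounds for operator[], even though the memory exists.
StarData make_reserved() {
    StarData s;
    s.pos.reserve(NUM_BG_STARS);
    s.color.reserve(NUM_BG_STARS);
    s.brightness.reserve(NUM_BG_STARS);
    return s;
}

// resize() creates the elements, so indices [0, NUM_BG_STARS) become valid.
StarData make_resized() {
    StarData s;
    s.pos.resize(NUM_BG_STARS);
    s.color.resize(NUM_BG_STARS);
    s.brightness.resize(NUM_BG_STARS);
    return s;
}

// Writes every slot through operator[]; only legal once size() covers the range.
void fill_brightness(StarData &s, float value) {
    assert(s.brightness.size() == NUM_BG_STARS);
    for (std::size_t i = 0; i < NUM_BG_STARS; ++i)
        s.brightness[i] = value;
}
```

Calling fill_brightness() on the result of make_resized() is well defined. Indexing into the result of make_reserved() is undefined behaviour that often appears to work, which would be consistent with the crash showing up only in some builds.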
I've compiled using Debian 11 (Bullseye) and it works without problems, on the same computer. This version of Debian uses GCC 10. Perhaps the issue is with GCC 11 and GCC 12. |
The code in question (before or after the modification above) compiles and runs fine on GCC 12 for me (Solus Linux). It's much more likely to be a build-flags issue than a compiler-specific issue. |
I've compiled manually in Arch Linux, with GCC 12 and it works without problem. So the game only fails when it is packaged, perhaps there is a build flag causing the error. These are the build flags I have configured for packaging:
@Web-eWorks I haven't tested your patch yet, I will try tomorrow. Perhaps it will fix the problem. Thank you! |
Finally, I have tested your fix (don't leave for tomorrow what you can do today). It works perfectly!!! Now I can have the game packaged again. Thank you |
Sorry, I celebrated too soon. The game loads and this issue is solved, but there are still some random crashes related to the same file: /usr/include/c++/12.1.0/bits/stl_vector.h What should I do? Open a new issue, or will you reopen this one? The manually compiled game works without problems, but the game packaged with the Arch Linux packager crashes in these situations:
It seems a strict memory-protection build flag is used when you package the game in Arch Linux. The flags used are mentioned in this comment. Which flag could be causing the crash? -fstack-clash-protection? -D_GLIBCXX_ASSERTIONS? These are default flags, meaning they are build flags that the Arch Linux developers consider adequate. Well, for the moment we can compile the game manually following the COMPILING.txt guide for Linux. |
If I have some time, I'll look into those other crashes you've mentioned; for now I'd recommend manually compiling Pioneer (without said flag) or temporarily/permanently removing the flag given that it does have a runtime performance cost. |
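As a rough illustration of why a packaged build could crash where a manual build does not, the sketch below assumes the flag under discussion is -D_GLIBCXX_ASSERTIONS, one of the two candidates named above. With that macro defined, libstdc++ checks the index inside std::vector::operator[] and aborts with the "Assertion '__n < this->size()' failed" message quoted earlier; without it, the same out-of-range access is silent undefined behaviour. The function names are hypothetical helpers for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reports whether this translation unit was compiled with -D_GLIBCXX_ASSERTIONS.
constexpr bool libstdcxx_assertions_enabled() {
#ifdef _GLIBCXX_ASSERTIONS
    return true;
#else
    return false;
#endif
}

// Human-readable summary, e.g. for logging at startup.
std::string describe_build() {
    return libstdcxx_assertions_enabled()
        ? "_GLIBCXX_ASSERTIONS is defined: operator[] is bounds-checked"
        : "_GLIBCXX_ASSERTIONS is not defined: operator[] is unchecked";
}

// Performs explicitly the check that the flag adds implicitly to operator[],
// so an out-of-range index is caught in any build where assert() is active.
int checked_read(const std::vector<int> &v, std::size_t i) {
    assert(i < v.size());
    return v[i];
}
```

Compiling the same file with and without -D_GLIBCXX_ASSERTIONS and printing describe_build() is a quick way to confirm which configuration a given binary was built in.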
Flatpak and Fedora packages are built with this flag, and it was causing a crash on save. Also encountered by manolollr in pioneerspacesim#5387.
Flatpak and Fedora packages are built with this flag, and it was causing a crash on save. Fixes pioneerspacesim#5570. Also encountered by manolollr in pioneerspacesim#5387.
Observed behaviour
Crash during loading, always in the same place and manner.
Console log with gdb session (last lines):
Gdb backtrace:
I know, it's not very helpful, despite my adding "-D CMAKE_BUILD_TYPE=Debug" and "options=(debug !strip)" to the PKGBUILD file. Maybe I should recompile everything again, but I'm afraid it would take a long time again. Well, I'll try it, but meanwhile can anybody already look at it?
Expected behaviour
Normal loading and operation, as it was about two or three weeks ago.
Steps to reproduce
My pioneer version (and OS):
pioneer-git 20220203.r74.gfeb4169a0-1 (from here: https://aur.archlinux.org/packages/pioneer-git)
OS: Arch Linux, up to date.
EDIT: Impaktor added code formatting to not reference issues 1-16