Monerod Syncing problem #9141
Comments
Please see #9139
How do you know the segfault is related to this longer chain? Can you share a backtrace?
No segfault issues during sync until it switched to the longer chain. I had to run a while loop to keep monerod from stopping. Tell me what to do to get the backtrace.
It didn't switch to this longer chain, it just logged that a node has sent a new top block candidate. So far I haven't seen another person report that this crashed their node, so I'm not sure if it's related to your issue.
Which OS are you using?
This is on Alpine Linux; I'm new to this distro.
What kind of hardware do you use?
3900X / 16 GB RAM / 2.5-inch SSD / ASRock B550
For a backtrace you need gdb installed, and then execute gdb with the monerod binary
then wait for it to load and enter
then monerod should start to sync, wait for it to segfault and enter
and share the output.
I blocked the IP with the longer chain and the problem went away. Unable to reproduce the problem now. Thanks for the replies. Edit: never mind, I'm seeing the segmentation fault again and will try to reproduce the problem again.
alpine:~/monero-x86_64-linux-gnu-v0.18.3.1$ gdb monerod For help, type "help". Thread 39 "ld-musl-x86_64." received signal SIGSEGV, Segmentation fault. (gdb) Thread 52 (LWP 26610 "ld-musl-x86_64."): Thread 51 (LWP 26609 "ld-musl-x86_64."): Thread 50 (LWP 26608 "ld-musl-x86_64."): Thread 49 (LWP 26607 "ld-musl-x86_64."): Thread 48 (LWP 26606 "ld-musl-x86_64."): Thread 47 (LWP 26605 "ld-musl-x86_64."): Thread 46 (LWP 26604 "ld-musl-x86_64."): Thread 45 (LWP 26603 "ld-musl-x86_64."): Thread 44 (LWP 26602 "ld-musl-x86_64."): Thread 43 (LWP 26601 "ld-musl-x86_64."): Thread 42 (LWP 26600 "ld-musl-x86_64."): Thread 41 (LWP 26599 "ld-musl-x86_64."): Thread 40 (LWP 26598 "ld-musl-x86_64."): Thread 39 (LWP 26597 "ld-musl-x86_64."): Thread 38 (LWP 26596 "ld-musl-x86_64."): Thread 37 (LWP 26595 "ld-musl-x86_64."): Thread 36 (LWP 26594 "ld-musl-x86_64."): Thread 35 (LWP 26593 "ld-musl-x86_64."): Thread 34 (LWP 26592 "ld-musl-x86_64."): Thread 33 (LWP 26591 "ld-musl-x86_64."): Thread 32 (LWP 26590 "ld-musl-x86_64."): Thread 31 (LWP 26589 "ld-musl-x86_64."): Thread 30 (LWP 26588 "ld-musl-x86_64."): Thread 29 (LWP 26587 "ld-musl-x86_64."): Thread 28 (LWP 26586 "ZMQbg/IO/0"): Thread 27 (LWP 26585 "ZMQbg/Reaper"): Thread 26 (LWP 26584 "ld-musl-x86_64."): Thread 25 (LWP 26583 "ld-musl-x86_64."): Thread 24 (LWP 26582 "ld-musl-x86_64."): Thread 23 (LWP 26581 "ld-musl-x86_64."): Thread 22 (LWP 26580 "ld-musl-x86_64."): Thread 21 (LWP 26579 "ld-musl-x86_64."): Thread 20 (LWP 26578 "ld-musl-x86_64."): Thread 19 (LWP 26577 "ld-musl-x86_64."): Thread 18 (LWP 26576 "ld-musl-x86_64."): Thread 17 (LWP 26575 "ld-musl-x86_64."): Thread 16 (LWP 26574 "ld-musl-x86_64."): Thread 15 (LWP 26573 "ld-musl-x86_64."): Thread 14 (LWP 26572 "ld-musl-x86_64."): Thread 13 (LWP 26571 "ld-musl-x86_64."): Thread 12 (LWP 26570 "ld-musl-x86_64."): Thread 11 (LWP 26569 "ld-musl-x86_64."): Thread 10 (LWP 26568 "ld-musl-x86_64."): Thread 9 (LWP 26567 "ld-musl-x86_64."): Thread 8 (LWP 26566 "ld-musl-x86_64."): Thread 
7 (LWP 26565 "ld-musl-x86_64."): Thread 6 (LWP 26564 "ld-musl-x86_64."): Thread 5 (LWP 26563 "ld-musl-x86_64."): Thread 4 (LWP 26562 "ld-musl-x86_64."): Thread 2 (LWP 26560 "ld-musl-x86_64."): Thread 1 (LWP 26557 "ld-musl-x86_64."): |
Can you use paste.debian.net to share the backtrace? Also, is this the full log? I'm specifically looking for thread 39, which seems to be missing from your comment.
Sorry about that, some parts of the log got cut off.
Can you make sure everything is updated in your Alpine install? I see a similar error due to ABI compatibility. And what version of Alpine are you using?
$ cat /etc/alpine-release
Great. I downloaded the exact version and synced monerod with my local node, but was not able to reproduce. How familiar are you with package compilation in Alpine? Can you build a debug monero package? I am not familiar with Alpine at all; I did a quick search and saw this [1].
Hi, thanks for the response. I'm not familiar at all with package compilation in Alpine.
If we change this line [1]:
to
that would at least generate better debugging information when you are debugging it. It seems the official link for how to build
In the meantime, I left my Alpine VM running, but so far no luck. If you are using any specific flags or config to run
Strange, I see the same behavior.
I [165.232.190.164:50514 INC] Sync data returned a new top block candidate: 3076846 -> 3493679 [Your node is 416833 blocks (1.6 years) behind] ? 3493679
@Haraade you can ignore it, it's just a node sending false data. It's harmless.
Switched to Debian, problem solved itself.
I am closing this issue as it looks like it is an Alpine issue.
My guess is that the pthread stack size is too small, as musl (not Alpine specifically) has a much lower default than glibc. The (presumed) fix is for Monero to explicitly increase its thread stack size when the system default is too low. This is monerod failing to run in a widely used environment. While we can declare the environment at fault (an entire libc that people have many reasons to use), I'm affected and would like monerod + musl to work as expected.
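To make the presumed fix above concrete, here is a minimal sketch of "raise the thread stack size when the system default is too low" using the POSIX pthread attribute calls. It is not Monero's actual code. The 8 MiB floor, the function names, and the 1 MiB test workload are assumptions chosen for illustration, and the 128 KiB figure in the comment is a commonly cited musl default, not something measured in this thread.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical floor for illustration; pick whatever the workload needs. */
#define MIN_STACK_BYTES ((size_t)8 * 1024 * 1024)

/* Stack size reported by a default-initialised attribute object, or 0 on error.
   On musl this is expected to be far smaller than on glibc. */
size_t default_thread_stack_size(void) {
    pthread_attr_t attr;
    size_t size = 0;
    if (pthread_attr_init(&attr) != 0) return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0) size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Starts fn(arg) on a thread whose stack is at least MIN_STACK_BYTES.
   Returns 0 on success or the error number from the failing pthread call. */
int spawn_with_min_stack(pthread_t *thread, void *(*fn)(void *), void *arg) {
    pthread_attr_t attr;
    size_t size = 0;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_attr_getstacksize(&attr, &size);
    if (rc == 0 && size < MIN_STACK_BYTES)
        rc = pthread_attr_setstacksize(&attr, MIN_STACK_BYTES);
    if (rc == 0)
        rc = pthread_create(thread, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return rc;
}

/* Example workload: touches about 1 MiB of stack, which would overflow a
   small (for example 128 KiB) default thread stack. Sets *(int *)arg to 1. */
void *touch_one_mib(void *arg) {
    volatile char buf[1024 * 1024];
    for (size_t i = 0; i < sizeof buf; i += 4096) buf[i] = 1;
    *(int *)arg = 1;
    return NULL;
}
```

Calling spawn_with_min_stack(&t, touch_one_mib, &flag) and then pthread_join(t, NULL) should complete on both glibc and musl, whereas the same workload on a plain pthread_create with musl's default attributes would be the kind of case that is expected to crash. Checking the default first keeps the change a no-op on systems that already provide a large stack.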
What are the steps that reproduce this bug?
Run monerod on Alpine. If you have a rootless Docker and Rust toolchain, the following will do that:
git clone https://github.com/serai-dex/serai
cd serai
git checkout f0694172ef2cdf7dfde0d286e693243e4bdcacca
cargo run -p serai-orchestrator -- key_gen testnet
cargo run -p serai-orchestrator -- setup testnet
cargo run -p serai-orchestrator -- start testnet monero-daemon
This will create a key in a file under
The container should SIGSEGV, presumably due to the pthread stack size, within a few minutes (<30, I'd expect, yet I think likely as soon as 5-10). Effectively all users complained of this, and @j-berman can confirm trivial replication.
While we've moved to Debian, that has an increased surface, increased memory requirements, and slower boot times. This isn't specific to Serai either, as Alpine is largely preferred for Docker containers. Alpine is also a Linux distro not exclusive to Docker, so this potentially impacts personal machines. If it is the theorized issue (pthread stack size defaults), this actually affects all musl systems.
Thanks, I ran and synced the entire mainnet blockchain on Alpine and didn't have this issue [1]. If you have specific steps that reproduce this issue, I am happy to take a look at it.
Given that effectively every participant I've had has reported the SIGSEGV, that's my current recommendation. I'll also note that configuration doesn't sync the mainnet blockchain and does have a variety of CLI flags.
Change made due to a segfault incurred when locally testing. monero-project/monero#9141 for the upstream.
* Remove unsafe creation of dalek_ff_group::EdwardsPoint in BP+ * Rename Bulletproofs to Bulletproof, since they are a single Bulletproof Also bifurcates prove with prove_plus, and adds a few documentation items. * Make CLSAG signing private Also adds a bit more documentation and does a bit more tidying. * Remove the distribution cache It's a notable bandwidth/performance improvement, yet it's not ready. We need a dedicated Distribution struct which is managed by the wallet and passed in. While we can do that now, it's not currently worth the effort. * Tidy Borromean/MLSAG a tad * Remove experimental feature from monero-serai * Move amount_decryption into EncryptedAmount::decrypt * Various RingCT doc comments * Begin crate smashing * Further documentation, start shoring up API boundaries of existing crates * Document and clean clsag * Add a dedicated send/recv CLSAG mask struct Abstracts the types used internally. Also moves the tests from monero-serai to monero-clsag. * Smash out monero-bulletproofs Removes usage of dalek-ff-group/multiexp for curve25519-dalek. Makes compiling in the generators an optional feature. Adds a structured batch verifier which should be notably more performant. Documentation and clean up still necessary. * Correct no-std builds for monero-clsag and monero-bulletproofs * Tidy and document monero-bulletproofs I still don't like the impl of the original Bulletproofs... * Error if missing documentation * Smash out MLSAG * Smash out Borromean * Tidy up monero-serai as a meta crate * Smash out RPC, wallet * Document the RPC * Improve docs a bit * Move Protocol to monero-wallet * Incomplete work on using Option to remove panic cases * Finish documenting monero-serai * Remove TODO on reading pseudo_outs for AggregateMlsagBorromean * Only read transactions with one Input::Gen or all Input::ToKey Also adds a helper to fetch a transaction's prefix. 
* Smash out polyseed * Smash out seed * Get the repo to compile again * Smash out Monero addresses * Document cargo features Credit to @hinto-janai for adding such sections to their work on documenting monero-serai in #568. * Fix deserializing v2 miner transactions * Rewrite monero-wallet's send code I have yet to redo the multisig code and the builder. This should be much cleaner, albeit slower due to redoing work. This compiles with clippy --all-features. I have to finish the multisig/builder for --all-targets to work (and start updating the rest of Serai). * Add SignableTransaction Read/Write * Restore Monero multisig TX code * Correct invalid RPC type def in monero-rpc * Update monero-wallet tests to compile Some are _consistently_ failing due to the inputs we attempt to spend being too young. I'm unsure what's up with that. Most seem to pass _consistently_, implying it's not a random issue yet some configuration/env aspect. * Clean and document monero-address * Sync rest of repo with monero-serai changes * Represent height/block number as a u32 * Diversify ViewPair/Scanner into ViewPair/GuaranteedViewPair and Scanner/GuaranteedScanner Also cleans the Scanner impl. * Remove non-small-order view key bound Guaranteed addresses are in fact guaranteed even with this due to prefixing key images causing zeroing the ECDH to not zero the shared key. * Finish documenting monero-serai * Correct imports for no-std * Remove possible panic in monero-serai on systems < 32 bits This was done by requiring the system's usize can represent a certain number. * Restore the reserialize chain binary * fmt, machete, GH CI * Correct misc TODOs in monero-serai * Have Monero test runner evaluate an Eventuality for all signed TXs * Fix a pair of bugs in the decoy tests Unfortunately, this test is still failing. 
* Fix remaining bugs in monero-wallet tests * Reject torsioned spend keys to ensure we can spend the outputs we scan * Tidy inlined epee code in the RPC * Correct the accidental swap of stagenet/testnet address bytes * Remove unused dep from processor * Handle Monero fee logic properly in the processor * Document v2 TX/RCT output relation assumed when scanning * Adjust how we mine the initial blocks due to some CI test failures * Fix weight estimation for RctType::ClsagBulletproof TXs * Again increase the amount of blocks we mine prior to running tests * Correct the if check about when to mine blocks on start Finally fixes the lack of decoy candidates failures in CI. * Run Monero on Debian, even for internal testnets Change made due to a segfault incurred when locally testing. monero-project/monero#9141 for the upstream. * Don't attempt running tests on the verify-chain binary Adds a minimum XMR fee to the processor and runs fmt. * Increase minimum Monero fee in processor I'm truly unsure why this is required right now. * Distinguish fee from necessary_fee in monero-wallet If there's no change, the fee is difference of the inputs to the outputs. The prior code wouldn't check that amount is greater than or equal to the necessary fee, and returning the would-be change amount as the fee isn't necessarily helpful. Now the fee is validated in such cases and the necessary fee is returned, enabling operating off of that. * Restore minimum Monero fee from develop
Hi,
trying to sync a pruned node, getting some weird log output.
2024-01-29 20:01:36.247 I [207.244.240.82:18080 OUT] Sync data returned a new top block candidate: 3067647 -> 3072795 [Your node is 5148 blocks (7.2 days) behind]
2024-01-29 20:01:36.248 I SYNCHRONIZATION started
2024-01-29 20:01:37.555 I [110.40.229.103:18080 OUT] Sync data returned a new top block candidate: 3067647 -> 3344842 [Your node is 277195 blocks (1.1 years) behind]
There seems to be another chain, and it's 1.1 years ahead? (3344842 blocks)
I know this cannot be right, because the current chain is only 30676XX blocks.
Feels like an attacker trying to disrupt network stability.
Also getting a segmentation fault after auto-switching to this longer chain.
Is the DNS blacklist not working properly?
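As a sanity check on the log lines above, the "blocks behind" and "years behind" figures can be reproduced from nothing more than the two heights in the message. The sketch below assumes the time estimate is the height gap multiplied by Monero's 120-second target block time, which is consistent with the numbers shown; the function names are made up for illustration and are not monerod APIs.

```c
#include <stdio.h>
#include <string.h>

/* Monero's target block time in seconds. */
#define TARGET_BLOCK_SECONDS 120.0

/* Pulls "<local> -> <claimed>" out of a "new top block candidate" log line.
   Returns 1 on success, 0 if the line does not contain the pattern. */
int parse_candidate(const char *line, unsigned long *local, unsigned long *claimed) {
    const char *p = strstr(line, "candidate: ");
    if (p == NULL) return 0;
    return sscanf(p, "candidate: %lu -> %lu", local, claimed) == 2;
}

/* Days of chain implied by the gap between the local height and the height
   a peer claims, assuming one block per TARGET_BLOCK_SECONDS. */
double days_behind(unsigned long local, unsigned long claimed) {
    if (claimed <= local) return 0.0;
    return (double)(claimed - local) * TARGET_BLOCK_SECONDS / 86400.0;
}
```

For the two lines in the log, this gives a gap of 5148 blocks (about 7.15 days) and 277195 blocks (about 385 days, a little over a year), in line with the "7.2 days" and "1.1 years" that monerod printed. The estimate depends only on the height the peer reports, so a peer can produce an arbitrarily large figure simply by claiming a higher number.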