only getting compute errors #156
Comments
I'm getting the same problem with a Vega 56. @xmrig, can you help? |
Don't use latest drivers |
I use the blockchain drivers and get the same problem; even they don't work. @xmrig |
Which one should we use then? Can you give the number of the latest version that still works well? |
18.3.4, 18.5.1, 18.5.2, 18.6.1. |
I have been testing the code for the Monero algo against the moneroworld pool. It has been working well. However, I did have some occasional compute errors. Is the new algo affected by this AMD driver bug at all? |
So are we basically waiting on AMD to resolve this issue then, or is there anything the mining software can do? I am using the AMD Pro drivers on Ubuntu 18.04, by following this guide: https://github.com/xmrminer01102018/VegaToolsNConfigs/blob/master/VegaUbuntuGuide I'll admit it's not the easiest guide to follow, but the TL;DR is basically:
This has been working fine for a few months, but as of yesterday, my mining pool started giving me a "warning" that I need to update to 2.8.0 (for the upcoming PoW change). Even though this is supposed to be a warning, I can no longer mine at all -- I just get "n/a" for all of my Vega 56 cards. I tried rolling back to the 18.10 drivers, pulling the latest xmrig-amd from git, and then recompiling. Then I reinstalled the 18.30 drivers and now I just get thread compute errors. I'm a little bit lost on exactly what the problem is here, and whether or not there is any end in sight. Any further updates would be appreciated. |
@xmrig Are you saying that the latest code will now build properly on the 18.30 drivers if we simply use |
No |
The issue with invalid shares may be fixed; see fireice-uk/xmr-stak#1866 |
I'm not using xmr-stak at all, only xmrig-amd. I do see your commit on the dev branch, so I'll pull from that instead of master and give it a shot. However, I don't know if that fix is relevant or not as it specifically references the rocm open cl implementation on xmr-stak#1866 and I am using pal (AMD Pro drivers). I will certainly pull the latest commit from the dev branch on this repo (xmrig-amd) tonight, build it on 18.10 with I should hopefully have an update in about 7-8 hours or so. |
@bdmayes Your issue is actually with the 18.30 drivers. For whatever reason the opencl cache files are incorrectly generated on 18.30. Presumably you tested your miner on 18.10 and pre-generated your cache files before moving to 18.30 so the miner did not have to regenerate them until recently. Any worksize change will require a new cache file to be created. I also run a Vega rig on Linux and only by chance did I figure this out. Using Ubuntu 18.04 LTS which is incompatible with the 18.10 driver, I copied my cache files from an Ethos installation I was testing xmrig on. This development is very interesting though and I also may have to pull and compile it to test. |
The definitive thread on the bug is here. The gstoner posts are mostly the good parts. It's a question of whether the shader C code (mostly the same code across all CN OpenCL miners) works properly when the intermediate representation is HSAIL/PAL/SPIR or whatever the compiler within the driver uses. That got changed in both AMDGPU-Pro and ROCm around the same time, and now ROCm also has at least two variants: PAL and whatever the other one is. |
@unsivilaudio I think you nailed it! I am BACK! I don't login as root, but I have to launch my miner with sudo privileges, and I remember being unable to remove the entire directory (before a fresh
Now you should be able to do whatever setup you want to do on your cards (I set fan speed, overclock, and set power play tables to lower wattage). Run the miner and it should work again, getting around 1.9 KH/s per card. 😁
|
Unfortunately, not all is well. I just added a third card today, so I cannot be certain whether it's related to the third card or to the new code, but I am getting some compute errors on some of the threads:
|
Just FYI, I tried removing the third card entirely and I'm still getting compute errors. I can try lowering the intensity, but these settings have been stable for me without a single compute error on my rig for about a month. Now that I have removed the 3rd card, the only change is that I updated the code to the latest dev branch. :( |
You're going to need to restore your cache file. I hope you backed it up, or you're going to have to downgrade your driver to recreate it again. |
@unsivilaudio Maybe I don't quite follow then. I already downgraded to 18.10 last night and tested the miner, which generated the cache file. After getting it all back up and running on 18.30 I simply powered down the machine, installed the third card, and then started the miner back up. Are you saying that each time I add a new card, I have to downgrade to 18.10, regenerate the cache files, then go back up to 18.30? |
OK, that might have been the problem. I did the downgrade, cache recreate, and then upgrade. It has been almost an hour and so far, no compute errors. I'm going to let it run overnight and see what it shows in the morning. It seems that my setup is considerably more finicky than I realized it would be. I'll report back if there are further errors in the morning, but it seems stable thus far. This cache file thing seems incredibly important. I'm surprised your comment yesterday was the first I've ever read about it. Thanks again! |
It gets corrupted on occasion, not sure what causes it but occasionally I
have to delete the file and restore my backup.
|
Looks like I have been stable throughout the night. It has been running for almost 8 hours without a single error. I guess I'll set up a cron job to periodically copy the cache elsewhere for safekeeping. |
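For anyone who wants to do the same, here is a minimal sketch of such a backup step, assuming the cache is a directory of plain files. The function name `backup_cache` and the folder naming are made up for illustration; the thread does not say where the miner writes its cache, so the caller has to supply both paths.

```python
import shutil
import time
from pathlib import Path


def backup_cache(cache_dir: Path, backup_root: Path) -> Path:
    """Copy every regular file in cache_dir into a new timestamped
    folder under backup_root and return that folder.

    Both paths are supplied by the caller; nothing here knows where
    the miner actually keeps its OpenCL cache.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"opencl-cache-{stamp}"
    target.mkdir(parents=True, exist_ok=True)
    for item in sorted(cache_dir.iterdir()):
        if item.is_file():
            # copy2 also preserves modification times
            shutil.copy2(item, target / item.name)
    return target
```

A small script that calls `backup_cache(Path(...), Path(...))` with your own two directories can then be scheduled from cron. Keeping timestamped copies rather than overwriting one backup means a corrupted cache never replaces the last good one.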
Regarding stability, as far as I'm concerned, as long as I don't have to use Windows again I'm winning. My setup is fine as long as I don't touch it; it will run for days. :) |
Hi there, I am using version 2.8.1 and I am just getting compute errors. My rig: Gigabyte AX370 Gaming 5 I tried to compile it with the -DSTRICT_CACHE=OFF flag but it makes no difference. Do I have to downgrade the AMD drivers? Thanks a lot! |
|
Is there an update on which drivers are usable? |
@bdmayes I compiled 2.8.2-dev with -DSTRICT_CACHE=OFF on the 18.10 drivers, tested it, and created the cache file, all OK. Then I removed 18.10, installed the 18.30 drivers, and rebooted. But when xmrig-amd starts, it compiles the cache again, gets 3 thread compute errors, and then hangs until reboot. Any ideas? Have you tried 2.8.2? |
@micotito See the thread I referenced above. Unfortunately, even when I pull down the latest code and recompile, I can't get anything to work. It seems like things are completely unusable since yesterday's fork. 😢 |
v2.8.4 in the dev branch now works correctly with 18.10 <-> 18.30 driver switching; short manual: https://github.com/xmrig/xmrig-amd/blob/master/doc/DRIVERS.md I checked it and got a slightly better hashrate compared to Windows. In addition, autoconfig for Vega should now work well. @xmrminer01102018 Good job, but all |
@xmrig I'm still unable to get any hashrate reported. I do like the new output with the cache file being printed out though. Very useful!
|
nanopool seems to report a hash rate, but every share is rejected due to "low difficulty"
|
Yes, you need to update a lot of things, so it is better to use the default config.json and fill in the required fields to add your pool; the miner will create threads with proper settings. That is the easy way; the hard way is below:
|
@xmrig I just made those changes but I'm getting the following now. I'm not sure what it means:
Perhaps I need to make a symbolic link to the CL headers instead of using the defaults? |
Yeah 1920 is for 2 threads, you literally are trying to use more memory than you have available. Try 896 if you are sticking with 4 threads. |
I see. Sorry -- I just copied this config from a friend that had things working several months ago. I have no idea if it's better to have 2 threads per GPU or 4. I just tried changing my config to 2 threads with the values above and it seemed to just hang my entire system. I had to just hold the power button to turn it off. So I just went back for 4 threads with 896 intensity, and it started working with 18.10 drivers. 🎉 I'm going to reinstall 18.30 and see how it goes. Will report back shortly. |
I personally like 4 threads at 896 (these intensities are only applicable to the 2 MB scratchpad, aka regular cryptonight v0/1/2; cn-lite has a 1 MB scratchpad, and cn-heavy has 4 MB). |
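The arithmetic behind those numbers can be sketched roughly as follows. This assumes each unit of intensity allocates one scratchpad per thread and ignores any other GPU memory overhead; the helper `scratchpad_memory_mb` is made up for illustration, and the 8 GB figure is the Vega 56's memory size.

```python
# Scratchpad sizes in MB, as quoted above.
SCRATCHPAD_MB = {"cn": 2, "cn-lite": 1, "cn-heavy": 4}

VEGA56_VRAM_MB = 8 * 1024  # a Vega 56 has 8 GB of memory


def scratchpad_memory_mb(intensity: int, threads: int, algo: str = "cn") -> int:
    """Rough scratchpad memory for one GPU.

    Assumption: every unit of intensity needs its own scratchpad,
    and every thread on the GPU uses the same intensity.
    """
    return intensity * threads * SCRATCHPAD_MB[algo]


for intensity, threads in [(1920, 4), (1920, 2), (896, 4)]:
    need = scratchpad_memory_mb(intensity, threads)
    verdict = "fits" if need <= VEGA56_VRAM_MB else "does not fit"
    print(f"{threads} threads x {intensity}: {need} MB, {verdict} in {VEGA56_VRAM_MB} MB")
```

Under these assumptions, 4 threads at 1920 would need 15360 MB, which is why that config could not work, while 2 threads at 1920 (7680 MB) and 4 threads at 896 (7168 MB) both stay under 8192 MB.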
My bad, 1920 is for double threads. |
Well it seems to be working with the 4 thread solution again on 2.8.4. You are both amazingly helpful. Thank you so much. The only downside is that my rig is pulling more watts now, and my hashrate is down a bit. The new fork is cutting into my profitability. 😢 I'm back up and running. I believe the reason was that I needed to update variant for each pool and strided_index for each thread in the config. Thank you both again!
|
tbh I haven't messed with cn/2 on my Vega rig. However, I think you can gain some by keeping worksize at 8 and lowering unroll to 4 or 2, but my understanding of these settings is still rudimentary; please experiment. It's weird how Polaris only sees about a 2% drop in hashrate. |
I'm not choosing cn/2 -- my mining pool is, as it switches algorithms. I believe there is a way to configure the miner to only accept certain algorithms, and ignore others? I just haven't looked into how to set that up yet, especially because I was getting 5.7 KH/s at around 520W prior to the fork. Right now I'm getting 5 KH/s and pulling 595W off the wall. |
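To put those two measurements on one scale, here is a quick sketch that converts them to hashes per watt. The helper name `hashes_per_watt` is made up for illustration; the inputs are the figures quoted above, with KH/s converted to H/s.

```python
def hashes_per_watt(hashrate_hs: float, watts: float) -> float:
    """Efficiency as hashes per second per watt drawn at the wall."""
    return hashrate_hs / watts


before = hashes_per_watt(5700, 520)  # prior to the fork
after = hashes_per_watt(5000, 595)   # after the fork
drop_percent = (1 - after / before) * 100

print(f"before: {before:.2f} H/W, after: {after:.2f} H/W, drop: {drop_percent:.1f}%")
```

That works out to roughly 11 H/W before and 8.4 H/W after, an efficiency drop of about 23%, which is noticeably worse than the hashrate drop alone suggests.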
This is a great development. I made some startup scripts, in case anyone needs them for fully automated setups, following your notes, xmrminer01102018! |
No matter which method I use, for some reason xmrig and xmr-stak both keep recompiling. I saved the cache files generated on the 18.10 drivers, tarred them up, and untarred them after installing 18.30, but it is still recompiling on 18.30. Any suggestions? I must be doing something wrong. I'm using cast in the meantime! |
I used the blockchain drivers and mining was OK. Now I have installed the 18.6.1 driver and get THREAD #0 COMPUTE ERROR:
|
@knittycatkitty - As of right now, once you have compiled the code with the older driver, you cannot change the config.json settings. If you do, it will recompile, and when it does, you will get compute errors. So the best route is:
1. Tweak config.json on 18.30 and ignore the compute errors.
2. Once you get the best hash rate, uninstall 18.30 and install 18.10.
3. Compile xmrig-amd with "cmake -DSTRICT_CACHE=OFF ..".
4. Uninstall 18.10 and reinstall 18.30.
5. Reboot, update and run the miner. |
@xmrminer01102018 Any thoughts on my issues described in #180 ? Specifically, the last post there is the most relevant. The PPT definitely seems to be the problem. I just ran the following steps, and still one card has entirely n/a threads:
Now it's working at least but the power consumption is very high and I want to tune it to help lower some of the power draw. So then I tried:
Any idea why PPT applied to one card can affect a totally different card? |
AMD says PowerPlay is (still) broken in newer drivers. That also explains, as well as I've seen anywhere, how the AMD triple-layer stack ends up so broken all the time: three groups do different things, apparently without communicating with each other very often. |
@Spudz76 Thanks for that link. Finally an explanation! For what it's worth, I stumbled upon the ability to set power levels in |
@bdmayes - I have seen this on what I categorize as lower-grade-memory V56 cards. It is mostly caused by inferior memory or a bad riser. I use a different PPT file for that situation. If you want to try it, the files are in the PPTDIR folder on my GitHub site. I named it LGMV56PPT. |
If I start the miner, I only get "job received" messages and compute errors; I don't get any shares, only errors.
I am using an RX 480 with the latest drivers.
Does anyone have a solution?