worker_id must be less than task_worker_num #851
Comments
Hey. I was investigating this, and I have no idea why this happens (in theory it shouldn't). Swoole, the tool used to serve shlink, allows you to spin up a number of processes used to serve requests and to process background tasks. Each one of them is a worker, and has a worker ID. When you try to run a background task, swoole enqueues it and then it gets assigned to the first free worker. Apparently, it is trying to assign it to a non-existent worker (for example, there are 16 workers and it tried to assign it to the 17th one), which is producing this error. The problem is that this logic is all handled by swoole internally, so I don't know how to prevent this 🤔. Maybe there's a bug either in swoole or in the library I use to integrate with it. I'll try to reproduce it with that version of shlink (I have a couple of raspberries somewhere), and then with a newer version using updated dependencies, to see if it solves the problem. |
Hey, |
Hey @leolivier. Can you open a new ticket for this? |
Hi @acelaya
|
Yep, I saw the part about the base path, but I didn't think it had anything to do with this. Also, I'm not really familiar with traefik. So, to summarize, the 404 issue is probably caused by not passing the prefix to shlink, while you have actually declared that it is using a prefix through the BASE_PATH env var. You have to either make sure the URL keeps the prefix until it reaches shlink and BASE_PATH is defined, or use something in front of shlink to strip out the prefix and then not define BASE_PATH at all. That probably closes #855. That said, we still have the issue with the workers, which is most probably related to the architecture on which it is being run. I'll have to investigate this one further. |
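The two consistent setups described above can be sketched with a toy model. This is only an illustration of the reasoning, not Shlink's or traefik's actual routing code: the function names and the `/shlink` prefix are hypothetical, and the routing check is deliberately simplified.

```python
def path_reaching_app(request_path: str, prefix: str, proxy_strips_prefix: bool) -> str:
    """Return the path the application sees after the reverse proxy handled it."""
    if proxy_strips_prefix and request_path.startswith(prefix):
        # The proxy removes the prefix; an empty remainder becomes the root path.
        return request_path[len(prefix):] or "/"
    return request_path


def app_can_route(path: str, base_path: str) -> bool:
    """Toy check: when a base path is configured, only paths under it match."""
    if not base_path:
        return True
    return path == base_path or path.startswith(base_path + "/")


PREFIX = "/shlink"  # hypothetical prefix, chosen only for this example

# Option 1: the proxy keeps the prefix and the base path is defined.
keep = app_can_route(path_reaching_app("/shlink/abc123", PREFIX, False), PREFIX)

# Option 2: the proxy strips the prefix and no base path is defined.
strip = app_can_route(path_reaching_app("/shlink/abc123", PREFIX, True), "")

# Mismatch: the proxy strips the prefix but the base path is still defined.
mismatch = app_can_route(path_reaching_app("/shlink/abc123", PREFIX, True), PREFIX)

print(keep, strip, mismatch)
```

In this model only the mismatched combination fails to route, which corresponds to the 404 symptom discussed above.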
I closed #855, can you reopen this one, please? |
Reopened. I saw it from the phone and didn't remember I closed it 😅 |
I think the issue could be related to this: swoole/swoole-src#3176 |
Hi @acelaya, I saw the issue in swoole is closed. Any news on your side? |
Hey @leolivier, I'm afraid I haven't had time to continue investigating this one. I'm currently focused on pushing v2.5 out. I'll try to make some time after that to investigate a bit more. The last update is that I wasn't able to reproduce it. |
Hey @SidneySun. I received a notification for a comment you posted here, but you seem to have deleted it. Did you manage to find something out? Could either you or @leolivier try with the docker image using The problem is most probably with swoole, and I have to spend some time setting up a raspberry with docker and such, so any help trying to find out how to report this is more than welcome. |
Hi, I have the same problem with Raspberry Pi.
I found that the maximum worker id is 32, which is greater than task_worker_num (16). I have tried with the docker image using the latest tag. In fact, I only came across this project and pulled the image tonight. |
Thanks for clarifying. The numbers should be correct. Swoole allows setting up some processes that will handle web requests, and some that will handle tasks. By default, shlink sets up 16 of each. Maybe that's too high for the raspberry. It can be changed by using the |
Yes, I tried smaller numbers yesterday. That is how I discovered the quantitative relationship between them. All in all, the total number of workers must be greater than task_worker_num. |
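The check implied by the error message can be illustrated with the numbers reported above. This is a toy model only, not Swoole's actual implementation: the function names are hypothetical, and the only inputs taken from this thread are the error text, the worker id of 32 and the task_worker_num of 16.

```python
def validate_task_destination(worker_id: int, task_worker_num: int) -> None:
    """Reject a task destination whose id is not below task_worker_num."""
    if worker_id >= task_worker_num:
        raise ValueError("worker_id must be less than task_worker_num")


def is_rejected(worker_id: int, task_worker_num: int) -> bool:
    """Return True when the destination id would trigger the error."""
    try:
        validate_task_destination(worker_id, task_worker_num)
    except ValueError:
        return True
    return False


# Numbers reported in this thread: worker id 32 against task_worker_num 16.
print(is_rejected(32, 16))
# An id inside the task worker range passes the check.
print(is_rejected(15, 16))
```

Under this model, any id at or above task_worker_num is rejected, which matches the symptom seen when the reported id (32) exceeds the configured 16 task workers.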
Ok. Thanks for the help. I'll probably try reporting the issue to the swoole project with this information. |
Hi, I haven't had time to test the latest version yet. I'll keep you posted when I do. |
Thanks @leolivier |
I have finally set up my old raspberry 2B with raspbian, installed docker and run shlink on it. With stable versions, I can reproduce the issue described here. However, with the latest version, which uses swoole 4.6.2, I get a different error, something related to clock_gettime. I'll try to report both to the swoole project and see how it goes. |
Hi, I'm sorry, I used the stable image yesterday.
Then, you will find the original error about worker id. |
Cool, thanks for checking. |
FYI, the issue I opened in the swoole project has been closed by a PR, so I imagine it has been fixed. I subscribed to the project, and I'll test v4.6.3 once it's out, to check if the error is gone. |
Thanks Alejandro, please keep us posted. |
Once this job finishes https://github.com/shlinkio/shlink/actions/runs/552753589, the I'll check it tomorrow afternoon after work (it's 11:15pm in my timezone), but feel free to give it a try. |
I have just tried, and I can confirm the issue has been fixed 🙂 |
Great! Thanks @acelaya, I'll also test this weekend. |
I have bad news. Swoole 4.6.3 seems to have a regression that makes requests never close. I'm trying to find a solution to this, but I might have to roll back to v4.6.2 and reopen this. I'll keep you posted. |
I found a quite simple workaround to this issue ☝🏼 This build https://github.com/shlinkio/shlink/runs/1879379513?check_suite_focus=true includes the fix and still uses swoole 4.6.3. No need to rollback anything. Once that finishes, |
Hi @acelaya
|
Can you share how you run the docker command? Also, did you run |
I have checked again, just in case I missed something or I have introduced some unintended regression since I tested for the first time, and I can confirm it works. Swoole tasks are now properly executed when a short URL is visited. I took some pictures. The best thing is that I have just released v2.6.0, which includes this fix. As soon as this build finishes, the corresponding docker image will be available. https://github.com/shlinkio/shlink/actions/runs/564130076 |
I found the solution by looking at your Dockerfile:
And now it's the 3rd one with shlink and yet another symptom... |
How Shlink is set-up
Summary
When accessing a generated short link, I get a 500 error.
Current behavior
When accessing a generated link, I get a 500 error
and the log shows the following exception
Expected behavior
No exception
How to reproduce
Here is the docker-compose file I use, started with docker-compose up -d. It uses traefik for routing, but I don't think that's related.
Then I create a link with e.g.:
And when I access the short link, I get the error (whatever the target)