Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System cron instead of webcron? #55

Closed
agates opened this issue Dec 28, 2020 · 27 comments · Fixed by #292
Closed

System cron instead of webcron? #55

agates opened this issue Dec 28, 2020 · 27 comments · Fixed by #292

Comments

@agates
Copy link

agates commented Dec 28, 2020

Apps like News currently require system cron, which by my understanding isn't available with this helm chart by default.

Is it possible to run cron.sh as in some of the docker compose setups, instead of webcron?

@webwurst
Copy link
Contributor

Solution for the Nextcloud News app would probably be to run Nextcloud News Updater as a Kubernetes CronJob.

@sseneca
Copy link

sseneca commented Feb 18, 2021

It wouldn't be that hard to implement system cron into the chart though, right? According to the docker compse file, it's just a matter of mounting the directory and executing the PHP script at /var/www/html/cron.php. It even seems less troublesome than the webcron method, unless I'm misunderstanding something. I thought permissions could complicate things (the script must be executed as user www-data), but the docker compose file doesn't specify anything to do with that, so it doesn't look like an issue.

Alternatively, maybe the commands listed here could be run each time the container is created, though this wouldn't create a dedicated Kubernetes CronJob object.

@sseneca
Copy link

sseneca commented Feb 18, 2021

I did some more investigation, and found that crontab -u www-data -l in the nextcloud container already spits out:

*/5 * * * * php -f /var/www/html/cron.php

I simply ran crond and now system cron (along with Nextcloud News) seems to work fine. This could be incorporated into the Helm chart (e.g. current cronjob.enabled could become cronjob.webcron.enabled, and cronjob.system.enabled could be added that just runs crond each time the nextcloud container is created/recreated/whatever).

@agates
Copy link
Author

agates commented Feb 18, 2021

The only caveat would be if one had HPA and 2 or more pods running, right?

@webwurst
Copy link
Contributor

webwurst commented Feb 20, 2021

Running multiple processes in one container is discouraged since this makes handling more complicated. Resource usage might be harder to predict, maybe a misbehaving cronjob takes down the main Nextcloud process. This will be harder to debug and if some less experienced user shows up with problems like this in a forum it might be not be easy to help and lead to frustration..

@sseneca Your idea of calling /var/www/html/cron.php in a Kubernetes CronJob makes the most sense to me. But if the executed jobs are directly accessing the filesystem then this solution depends on a shared read-write-many filesystem the same as for other multi-pod installations, like @agates is suggesting.

I am lacking knowledge about the difference of webcron vs. crond. I don't see a technical reason why those just don't trigger the same jobs? But obviously I am missing something here.

@agates
Copy link
Author

agates commented Feb 20, 2021

I believe the primary concern with webcron is the http timeout for some long running tasks. My understanding is fixing that would solve the issue, as long as you can convince the News app devs it's a workable solution.

Currently, News just refuses to update feeds with webcron -- they don't let the user make a decision if they know what they are doing -- even though it appears to be functionally the same otherwise.

@sseneca
Copy link

sseneca commented Feb 20, 2021

Apart from Nextcloud News, my understanding is that system cron is preferred anyway because it is independent of the web sever and any issues it may have.

Another thought I had was rather than implementing the CronJob as a separate Pod, implementing it as a sidecar container within the same Pod, which I think would avoid requiring the ReadWriteMany access mode on the PV. But without additional logic in the containers this isn't possible because k8s doesn't have good native support for sidecars yet (and won't for some time, see: kubernetes/enhancements#753).

So that approach isn't possible yet, and relying on ReadWriteMany isn't ideal. I'm going to stick to using crond in the same container, I think it's the best solution for me right now. There's also not much point adding that option to the chart, since the same can (probably) be achieved by using lifecycle.postStartCommand.

@ishioni
Copy link
Contributor

ishioni commented Mar 4, 2021

Apart from Nextcloud News, my understanding is that system cron is preferred anyway because it is independent of the web sever and any issues it may have.

Another thought I had was rather than implementing the CronJob as a separate Pod, implementing it as a sidecar container within the same Pod, which I think would avoid requiring the ReadWriteMany access mode on the PV. But without additional logic in the containers this isn't possible because k8s doesn't have good native support for sidecars yet (and won't for some time, see: kubernetes/enhancements#753).

So that approach isn't possible yet, and relying on ReadWriteMany isn't ideal. I'm going to stick to using crond in the same container, I think it's the best solution for me right now. There's also not much point adding that option to the chart, since the same can (probably) be achieved by using lifecycle.postStartCommand.

Even with a sidecar this would still not be fixed because even with a sidecar container you're breaking replicas as ALL of them would have the sidecar and all of them would trigger at the same time.

Unfortunately this looks like something that nextcloud-news would have to adapt on their end

@Routhinator
Copy link

I'm not sure why the CronJob isn't just reconfigured to run the cron.php with the PVC mounted. There is the caveat the PVC needs to be readwritemany - but it would work just fine and it would only run once.

@ishioni
Copy link
Contributor

ishioni commented Mar 14, 2021

I'm not sure why the CronJob isn't just reconfigured to run the cron.php with the PVC mounted. There is the caveat the PVC needs to be readwritemany - but it would work just fine and it would only run once.

Because RWX storage providers are not that common. You'd be excluding anyone who uses iscsi for example

@Routhinator
Copy link

I'm not sure why the CronJob isn't just reconfigured to run the cron.php with the PVC mounted. There is the caveat the PVC needs to be readwritemany - but it would work just fine and it would only run once.

Because RWX storage providers are not that common. You'd be excluding anyone who uses iscsi for example

That would imply they only have one pod then since they would not be able to deploy in HA if this was true, so the sidecar would work.

Implement both methods and put them behind a feature flag to ensure only one method is deployed at once.

@JensErat
Copy link

From reading the code, I don't think parallel cronjobs should really be an issue:

		// We only ask for jobs for 14 minutes, because after 5 minutes the next
		// system cron task should spawn and we want to have at most three
		// cron jobs running in parallel.

It looks like there's some locking in place.

So for supporting HA, not spawning unrelated processes in a container and having stable and robust system-cron, we can use a simple sidecar:

      - command:
        - /cron.sh
        image: nextcloud:21.0.0-apache
        imagePullPolicy: IfNotPresent
        name: cron
        volumeMounts:
        - mountPath: /var/www/
          name: nextcloud-data
          subPath: root
        - mountPath: /var/www/html
          name: nextcloud-data
          subPath: html
        - mountPath: /var/www/html/data
          name: nextcloud-data
          subPath: data
        - mountPath: /var/www/html/config
          name: nextcloud-data
          subPath: config
        - mountPath: /var/www/html/custom_apps
          name: nextcloud-data
          subPath: custom_apps
        - mountPath: /var/www/tmp
          name: nextcloud-data
          subPath: tmp
        - mountPath: /var/www/html/themes
          name: nextcloud-data
          subPath: themes

This makes use of:

No need for any of the database/... env variables, looks like the config written to disk is used. I guess the first run of the cronjob might fail on first start until the main container finally wrote the configuration, but this does not seem like an issue to me.

I currently monkeypatched with kubectl edit deployment. It works fine for ~1h now, including nextcloud news updates. I am willing to put together a PR if we get some agreement if this is a good way to solve the issue and on how configure (alternative to webcron in values.yaml?).

@pfuhrmann
Copy link
Contributor

I fail to see what is the issue with the current webcron implementation.

Using a sidecar container is not a good idea. You will end up running potentially N parallel jobs where N is the same as a number of replicas. That can have potentially bad side effects as hinted by the comments from the source code.

we want to have at most three cron jobs running in parallel.

If the concern is the uptime of the web server, the reality is, if the server is down you have probably bigger problems to worry about than the execution of cron tasks anyways.

@agates
Copy link
Author

agates commented Apr 2, 2021

I fail to see what is the issue with the current webcron implementation.

It's literally in the first sentence of this issue. Nextcloud devs apparently don't talk to each other :/

@pfuhrmann
Copy link
Contributor

It is not the responsibility of this chart to be working around the issues of the particular apps.

Webcron is an officially supported way of running scheduled tasks and makes the most sense for the K8s environment. For this reason, this should be fixed on the Nextcloud News side as previously suggested by @webwurst and @ishioni.

@agates
Copy link
Author

agates commented Apr 2, 2021

I opened an issue there and they closed it. It's like two vendors pointing fingers at each other.

@JensErat
Copy link

JensErat commented Apr 2, 2021

If you're running nextcloud with multiple replicas and don't want too many instances, you've already got the infrastructure to support extra volume mounts (RWX, ...), ie. you could also add the "native" cronjob as extra pod. Or you could even go for a Kubernetes Cronjob. If you have RWX available, you have possibilities for a whole bunch of possible solutions.

A sidecar solution would at least be an easy-to-use solution for those running nextcloud with few instances, or only a single one. I totally agree this is maybe not the solution for everybody, but even a simple extraContainers configuration option would already be fine. Most mature helm charts have such options, to flexibly handle special requirements.

Also from (very brief) analysis of the code, I don't see an obvious issue with parallel cronjobs, apart from some wasted CPU cycles. The instances started by the cron jobs just seem to be polling tasks from a queue, they don't seem to interfere with each other at all.

I fail to see what is the issue with the current webcron implementation.

Webcron suffers from issues with long-running background tasks, like the one from Nextcloud news. If you can't get the task finished within the HTTP call's deadline, you won't get it finished with webcron. At least that's my understanding from the underlying issue. I don't think this is something that can be solved by Nextcloud News, at least not withing instructing the admin to heavily increase timeouts (that also affects general Nextcloud traffic).

@Tha-Fox
Copy link

Tha-Fox commented Apr 21, 2021

I did some more investigation, and found that crontab -u www-data -l in the nextcloud container already spits out:

*/5 * * * * php -f /var/www/html/cron.php

I simply ran crond and now system cron (along with Nextcloud News) seems to work fine. This could be incorporated into the Helm chart (e.g. current cronjob.enabled could become cronjob.webcron.enabled, and cronjob.system.enabled could be added that just runs crond each time the nextcloud container is created/recreated/whatever).

I borrowed your solution and now my news app running on Rpi k3s cluster updates the feeds. Thanks!

@provokateurin
Copy link
Member

Hello is there any update? It doesn't seem that it's fixed on the Nextcloud News side and the webcron doesn't work for me for updating the feeds. I'm only running a single instance of Nextcloud on k8s and I'd love to have this working officially and not in some hacky way.

@wilrnh
Copy link

wilrnh commented Jan 17, 2022

this would also be useful for the google/onedrive integrations which can take super long to import files

👍 for the extraContainers idea

@quentinus95
Copy link

quentinus95 commented Mar 27, 2022

Hello, I do not use News, but I still get some tasks that are ignored when running with WebCron.

Nextcloud itself points that system CRON should be used.

image

Can this issue be fixed? Would a pull request providing an alternative using cron.sh fix (as pointed above) be accepted?

@ntrp
Copy link

ntrp commented May 3, 2022

Found an easy workaround to make this work, add this to your values:

lifecycle:
  postStartCommand:
  - "sh"
  - "-c"
  - "apk add openrc && start-stop-daemon --background /cron.sh"

I am using the fpm-alpine image, in case of another image you need to come up with another way to start a daemon.

With this approach you do not need a sidecar properly configured to mount all disks:

/var/www/html # ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 php-fpm: master process (/usr/local/etc/php-fpm.conf)
   38 www-data  0:08 php-fpm: pool www
   39 www-data  0:08 php-fpm: pool www
   40 root      0:00 sh
   54 www-data  0:02 php-fpm: pool www
  162 root      0:00 busybox crond -f -l 0 -L /dev/stdout
  163 root      0:00 ps aux

@vitobotta
Copy link

Found an easy workaround to make this work, add this to your values:

lifecycle:
  postStartCommand:
  - "sh"
  - "-c"
  - "apk add openrc && start-stop-daemon --background /cron.sh"

I am using the fpm-alpine image, in case of another image you need to come up with another way to start a daemon.

With this approach you do not need a sidecar properly configured to mount all disks:

/var/www/html # ps aux
PID   USER     TIME  COMMAND
    1 root      0:00 php-fpm: master process (/usr/local/etc/php-fpm.conf)
   38 www-data  0:08 php-fpm: pool www
   39 www-data  0:08 php-fpm: pool www
   40 root      0:00 sh
   54 www-data  0:02 php-fpm: pool www
  162 root      0:00 busybox crond -f -l 0 -L /dev/stdout
  163 root      0:00 ps aux

Hi. cron.sh contains this:

#!/bin/sh
set -eu

exec busybox crond -f -l 0 -L /dev/stdout

It doesn't seem to execute cron.php. How did you get it working?

@vitobotta
Copy link

In case anyone uses the Ubuntu/Apache version:

        lifecycle:
          postStart:
            exec:
              command:
              - "sh"
              - "-c"
              - "apt-get update && apt-get install -y openrc && start-stop-daemon --start --background --pidfile /cron.pid --exec /cron.sh"

This works just fine. Thanks @ntrp for the input :)

@quentinus95
Copy link

quentinus95 commented May 26, 2022

On my side I changed the command of the container using the following:

command: ["/entrypoint.sh"]
args: ["bash", "-c", "/cron.sh & apache2-foreground"]

Do you know which approach is the best practice? Main advantage is that I do not install anything here.

Edit: it looks like it breaks the upgrade feature.

@SNThrailkill
Copy link

Any update on this? Would love to stop having to change these settings every day.

@StephenLasseter
Copy link

My horrible solution until this issue is resolved was to create the following cron job on one of my master nodes at the host OS level. It works and continues to work across chart updates and deployment cycles.

*/5 * * * * /usr/local/bin/k3s kubectl exec -n nextcloud deploy/nextcloud -- su -s /bin/bash www-data -c 'php -f /var/www/html/cron.php'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.