APScheduler 4.0 progress tracking #465
Comments
The master branch is now in a state where both the async and sync schedulers work, albeit with a largely incomplete feature set. Next I will focus on getting the first implementation of shareable data stores, based on asyncpg. I've made some progress on that a while back but got sidetracked by other projects, particularly AnyIO. |
Regarding Twisted scheduler on the chopping block for APScheduler v4. My main OSS project is a multi-process app, that spins up many Twisted reactors in those processes, where several of the sub-processes use APScheduler inside the reactor (https://github.com/opencontentplatform/ocp). What would be a safe replacement scheduler if the twisted version is being removed? |
So you run multiple schedulers? Are you sharing job stores among them? The main reason I'm thinking of dropping (explicit) Twisted support is because it carries a heavy burden of legacy with it. I will play around with it and see if I can make it work at least with the asyncio reactor. If it can be made to work with a small amount of glue, I will take it off the chopping block. |
Yes, it runs multiple instances of the schedulers - with their own independent job stores. I understand the need for software redesigns, and I'm certainly not pushing back or trying to make more work for you. Just trying to understand what the recommendation would be. Maybe I could fall back to using APS' BackgroundScheduler since I don't spin it up until after the reactors are running? Either way, I saw the note and want to ensure I follow whatever happens on that one. Either way, thank you for the solid project. |
Are the jobs you run typically asynchronous (returning Deferreds) or synchronous (run in threads)? |
The initial setup with creating job definitions is synchronous. Any updates to previous job definitions or newly created jobs (stored/managed in a DB) occur regularly in an asynchronous manner (LoopingCall that returns a Deferred). And all the work with job runtime (execution/management/reporting/cleanup) occurs in non-reactor threads. |
Ok, so it sounds like the actual job target functions are synchronous, correct? Then you would be able to make do with the synchronous scheduler, yes? |
If you're saying so, then yes. I defer to your knowledge there. I went with TwistedScheduler since the "choosing the right scheduler" section of the user guide said to do so when building a Twisted application. I apologize for compounding the response with a question, but it's related. How are the thread pool and thread count handled if I use something other than the TwistedScheduler? Will the job run inside Twisted's thread pool, or inside BackgroundScheduler's thread pool? Do I need to extend both? Does constructing the BackgroundScheduler with an explicit max_workers count (example below) do anything when it's running inside Twisted's reactor? self.scheduler = BackgroundScheduler({ |
The sync scheduler (including 3.x's I want to provide first class async support in APScheduler 4.x. If I can do that with Twisted without having to create an entire ecosystem of Twisted specific components, then I'm open to doing that. |
I just added a few items to description:
|
I am open to it, but only as soon as their API stabilizes. As it stands, every beta release breaks backward compatibility. I have more important issues to work on. I don't think v4.0 will have OpenTelemetry support but I will consider adding it to a minor update release once they are in GA. |
A lot of progress has been made on the core improvements of v4.0. Vast code refactorings have taken place. The data store system is really taking shape now. I've added "Failure resilience for persistent data stores" to the task list. It's one of the most frequent deployment issues with APScheduler, so I'm making sure that it's adequately addressed in v4.0. I'm not sure what to do with the event system. I may rip it out entirely until I can figure out exactly how it should work. I know users will want to know when a job completes or a misfire occurs etc., so it will be implemented in some form at least before the first release. I will post another comment when I've pushed these changes to the repository. |
I hit a snag with the synchronous version of the scheduler. I tried to use the AnyIO blocking portal system to run background tasks but I had to conclude that it won't work that way. I have an idea for that though. |
@agronholm do you have any estimate when 4.0 would be released? |
I had hoped at least for an alpha at this point, but the design problems in the sync version killed the momentum I had. I have not done any significant F/OSS development since. I am still committed to getting 4.0 done, but due to pressure at work I don't think I can work on it before Christmas holidays. |
@agronholm How will you make it possible for the job store to be shared among multiple schedulers? |
By coordination and notifications shared between schedulers. Notifications are optional but recommended, and without notifications the schedulers will periodically check for due schedules. How all this works is specific to each store implementation. |
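To illustrate the idea of optional notifications with a polling fallback, here is a minimal sketch. It is not APScheduler's actual data store API: `InMemoryStore`, `wait_for_change` and `acquire_due` are hypothetical names, and a real store would coordinate through a database rather than process memory.

```python
import asyncio


class InMemoryStore:
    """Hypothetical stand-in for a shared data store (assumption, not the real API)."""

    def __init__(self):
        self.due_schedules = []
        self._changed = asyncio.Event()

    def add_schedule(self, schedule_id, notify=True):
        self.due_schedules.append(schedule_id)
        if notify:
            # Optional notification: wakes up a waiting scheduler right away
            self._changed.set()

    async def wait_for_change(self, poll_interval):
        """Return True if woken by a notification, False if the poll interval elapsed."""
        try:
            await asyncio.wait_for(self._changed.wait(), timeout=poll_interval)
        except asyncio.TimeoutError:
            return False  # no notification arrived; fall back to a periodic check
        self._changed.clear()
        return True

    def acquire_due(self):
        """Hand over all due schedules to the caller and empty the list."""
        acquired, self.due_schedules = self.due_schedules, []
        return acquired


async def demo():
    store = InMemoryStore()
    store.add_schedule("a", notify=False)             # silent change
    woken_first = await store.wait_for_change(0.01)   # found only by polling
    first = store.acquire_due()
    store.add_schedule("b")                           # change with notification
    woken_second = await store.wait_for_change(5)     # returns immediately
    second = store.acquire_due()
    return woken_first, first, woken_second, second
```

Running `asyncio.run(demo())` shows that the schedule added without a notification is still picked up, only later, once the poll interval has elapsed.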
Hello @agronholm. Impressive task list, and thanks for APScheduler. My big Christmas wish is "locking" (probably the idea of persistent storage). I use APScheduler on several web nodes, and each node has some workers. Today, I subclass the scheduler, store, etc. to add locking. Instead of using For me it's mandatory that a task never belongs to a worker: the job must be in a queue, and then another worker or the same one could process that task. To achieve this I added in Redis (like the jobs and running keys) "ready", "locked", "dead", "failed", "done"
I'm a big fan of Sidekiq (and also Faktory) And I will be very happy with something like In the "main"
Then in code
Why not Celery? I don't want to set up the full Celery/Flower stack. My tasks are simple, and I'm a bit lazy to repackage an entire app or split a few lines of code into small libs just to allow Celery to run my code (and also split config, creds, etc.). I don't know if I'm clear (I'm not a native English speaker). |
@ahmet2mir APScheduler 4.0 already has the proper synchronization mechanisms in place. What's still missing is the synchronous API. I've come to a realization that I cannot simply copy the async API and remove the |
While 4.0 is being worked on, I've gone back to the 3.x branch for a bit and fixed a number of bugs and other annoyances. |
Tests on async/sync workers (formerly: executors) are passing now, but the sync worker tests are strangely slow and I want to get to the bottom of that before moving forward. |
Slowness in worker tests resolved: it was a race condition in which the notification about the newly added job was sent before the listener was in place, causing the data store to wait for the 1 second timeout to expire before checking for new jobs again. I'll move on to completing the synchronous scheduler code now. I'm also very close to releasing AnyIO v2.1.0 which is a critical dependency for APScheduler 4. |
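The race described above can be reproduced in a few lines. This is a sketch under stated assumptions, not APScheduler's event broker: `Broker`, `wait_for_job`, `racy` and `fixed` are hypothetical names, and the timeout stands in for the 1 second fallback mentioned in the comment.

```python
import asyncio


class Broker:
    """Hypothetical broker that delivers events only to current subscribers."""

    def __init__(self):
        self._queues = []

    def subscribe(self):
        queue = asyncio.Queue()
        self._queues.append(queue)
        return queue

    def publish(self, event):
        for queue in self._queues:
            queue.put_nowait(event)


async def wait_for_job(queue, timeout):
    """Report how the wait ended: by notification or by timeout."""
    try:
        await asyncio.wait_for(queue.get(), timeout)
    except asyncio.TimeoutError:
        return "timed out"
    return "notified"


async def racy(timeout):
    broker = Broker()
    broker.publish("job_added")   # sent before anyone listens, so it is lost
    queue = broker.subscribe()
    return await wait_for_job(queue, timeout)


async def fixed(timeout):
    broker = Broker()
    queue = broker.subscribe()    # listener is in place first
    broker.publish("job_added")
    return await wait_for_job(queue, timeout)
```

In the `racy` ordering the waiter sits out the full timeout before it would check for new jobs, which matches the slowness seen in the tests. Subscribing before publishing removes the delay.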
Tests for both sync and async schedulers pass, but the tests run into delays caused by the new schedule/job notifications not working as intended, plus the sync scheduler tests are causing lots of "Task exception was never retrieved" errors outside of the actual tests which I will have to investigate. I'm considering making an alpha release once these issues have been ironed out. |
After hours of debugging, I finally figured out that I was needlessly creating a new task group in the worker's |
I've just pushed a big batch of changes that implement data store sharing on PostgreSQL (via asyncpg) and MongoDB (via motor). There are a lot of rough edges but at least the whole test suite passes now (at least locally – CI seems to have some troubles). In the coming days I'll try to polish the code base to the point where I can at least make an alpha release. Feel free to try it out, but you'll have to look at the test suite for some idea on how to use it since I haven't updated the docs yet. Also, the database schema will change before the final release (tasks accounting is not currently done) so expect to have to throw out your schedules and jobs. |
Stateful triggers contain state which is saved after the trigger is used to calculate new fire times for a schedule. All triggers are stateful in APScheduler 4. This was necessary in order to correctly implement combination triggers ( |
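As a rough illustration of what "stateful" means here, the sketch below shows a trigger whose internal counter changes each time a fire time is calculated, and which can be saved and restored between calls. `CountedIntervalTrigger`, `get_state` and `from_state` are hypothetical names chosen for this example and are not taken from the APScheduler code base.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


class CountedIntervalTrigger:
    """Hypothetical trigger: fires every `interval`, at most `max_runs` times."""

    def __init__(self, start: datetime, interval: timedelta, max_runs: int, runs: int = 0):
        self.start = start
        self.interval = interval
        self.max_runs = max_runs
        self.runs = runs  # internal state that changes on every next() call

    def next(self) -> Optional[datetime]:
        """Return the next fire time, or None once the trigger is exhausted."""
        if self.runs >= self.max_runs:
            return None
        fire_time = self.start + self.interval * self.runs
        self.runs += 1
        return fire_time

    def get_state(self) -> dict:
        """Serializable snapshot that a data store could persist with the schedule."""
        return {
            "start": self.start.isoformat(),
            "interval": self.interval.total_seconds(),
            "max_runs": self.max_runs,
            "runs": self.runs,
        }

    @classmethod
    def from_state(cls, state: dict) -> "CountedIntervalTrigger":
        """Rebuild the trigger, including its progress, from a saved snapshot."""
        return cls(
            start=datetime.fromisoformat(state["start"]),
            interval=timedelta(seconds=state["interval"]),
            max_runs=state["max_runs"],
            runs=state["runs"],
        )
```

If the state were not saved after each call to `next()`, a scheduler that reloads the schedule would start counting from zero again and recalculate fire times that were already used.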
Got it. And it isn't possible to share state between jobs on the same worker currently, right? E.g. I want to reuse a database connection for a schedule (and then close it once the schedule is "done"). Maybe it can be done via events now that I think of it. |
For schedules, "stateful" means that its jobs retain some internal state which is then saved after the execution of the job. Sharing database connections is out of scope anyway since you can't serialize them. |
I've released another alpha, with tons of fixes/workarounds for less capable RDBMS (sqlite, mysql). Explicit task configuration is also in there. As usual, this update requires wiping your data store and starting over. |
Thank you so much Alex, I just started using 3.x, do you see any specific date around a production ready 4.x release? @agronholm |
I'm sure I can get a beta out before the end of the year (I am furloughed most of December so I have plenty of time to work on APScheduler), but production? That depends on what issues come up in testing. Q2/2024? Not impossible at least. |
Thank you so much for being transparent, really appreciate your community effort <3 |
Some good news again. I'm making significant progress on the cleanup feature which periodically purges expired job results, and now also finished schedules which are no longer purged right after the last job is submitted to the store. With luck, I can push these changes to GitHub this weekend. I've also opened two discussions I would like your input on: |
The automated cleanup is now in. That's 3 out of 4 blockers completed for the beta release. My idea of a "beta" release is that it's feature complete but may still contain bugs. I would like to get the data store schema settled so that there won't be any need for nuking the data stores after an upgrade to a newer beta. To that end, the first bullet point of my previous comment needs to be addressed ASAP. In the absence of any feedback on that issue, my plan is to introduce dynamic fields and to correspondingly reduce the number of columns to only those that need to be indexed and queried against. The last blocker is now the implementation of maximum running jobs limits. Ideally there would be two levels of such limits: task and schedule level. The total number of jobs with the same task ID would never be allowed to rise above the task-level limit, and the number of jobs with the same schedule ID would never exceed the schedule-level limit. The promised import/export feature will likely not land in the first beta, but probably in the second one. |
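The two-level limit described above could be checked along these lines. This is only a sketch of the stated rule, not the planned implementation: the `Job` class, `can_start`, and the limit dictionaries are hypothetical, and a real data store would have to perform the check atomically across schedulers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Job:
    id: str
    task_id: str
    schedule_id: Optional[str] = None  # None for jobs added directly


def can_start(job, running, task_limits, schedule_limits) -> bool:
    """Return True if starting `job` keeps both limit levels satisfied."""
    running = list(running)

    # Task level: count running jobs that share the task ID
    task_limit = task_limits.get(job.task_id)
    if task_limit is not None:
        same_task = sum(1 for other in running if other.task_id == job.task_id)
        if same_task >= task_limit:
            return False

    # Schedule level: count running jobs spawned by the same schedule
    if job.schedule_id is not None:
        schedule_limit = schedule_limits.get(job.schedule_id)
        if schedule_limit is not None:
            same_schedule = sum(
                1 for other in running if other.schedule_id == job.schedule_id
            )
            if same_schedule >= schedule_limit:
                return False

    return True
```

A missing entry in either dictionary is treated as "no limit" in this sketch, which is an assumption made for the example.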
Alright, so we're in 2024 now. I know what I said about the beta, but I got sidetracked by two other projects of mine that needed urgent work on them. There will be another alpha as soon as:
|
Hey @agronholm , wanted to ask whether it would be helpful to submit issues for bugs in the 4.0.0 releases at this point, or if it would just be best to wait? I'd like to move my system to 4.0.0 because it has some settings that 3.10.9 does not, but I'm running into some problems here and there. Thanks again. |
It might be helpful, but remember that it's still in alpha state for a reason. |
Hey, I am really loving the new version. It is a lot easier to use when compared to the other options available. I am using AsyncScheduler. Diving into the code, I can see why it is a lot of work to get this version released to the world! I will take a look at the issue below and see if I can make some changes. I also noted that the latest commits may address a few of these issues. I read through much of the discussion, but thought it prudent to add my own thoughts on v4.0.0a4:
Workaround, only for scheduling; if you are manually running or adding jobs, this will fail to help you. Of course, I think there is another fix, referenced here.
|
Hi, it's been a while! I just released a new alpha with a metric ton of fixes, and a handful of new features too! Importantly, data stores now finally have a clean-up procedure which will remove expired job results and finished schedules. Schedules can now also be paused and unpaused (contributed by @WillDaSilva). This restores a 3.x series feature in an even more powerful form. Kudos to the 3 people who contributed fixes too! As usual, the data store schemas have changed in a backwards incompatible manner, so you need to start from scratch when updating. This should stop happening once the beta is out. |
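For readers wondering what such a clean-up pass amounts to, here is a minimal sketch. The dictionary layout and the `cleanup` function are hypothetical, and the definition of a "finished" schedule used here (no next fire time and no outstanding jobs) is an assumption made for the example rather than the data stores' actual rule.

```python
from datetime import datetime, timedelta, timezone


def cleanup(job_results: dict, schedules: dict, now: datetime) -> tuple:
    """Purge expired job results and finished schedules; return the removed IDs."""
    # A job result is dropped once its expiration time has passed
    expired = [
        job_id for job_id, result in job_results.items()
        if result["expires_at"] <= now
    ]
    for job_id in expired:
        del job_results[job_id]

    # Assumed rule: a schedule is finished when its trigger is exhausted
    # and none of its jobs are still waiting or running
    finished = [
        schedule_id for schedule_id, schedule in schedules.items()
        if schedule["next_fire_time"] is None and schedule["pending_jobs"] == 0
    ]
    for schedule_id in finished:
        del schedules[schedule_id]

    return expired, finished
```

A scheduler would run a pass like this periodically, so exhausted schedules stay in the store until their last jobs are done instead of being purged right away.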
There are tests making sure |
In the current state (not yet beta), it would be great to add restarting a worker after X processed jobs (the same behaviour as Gunicorn) to battle memory leaks in third-party libs and code. Reasons why I would like it:
Desired flow:
Please give feedback on the idea, and if people like it I offer my help to make a PR. |
Gunicorn is an entry point (or a launcher, if you like), so it's fine for it to do restarts at will. APScheduler, however, is merely a library, and it is not acceptable for it to be restarting its host process. If you want such a mechanism, it will have to be implemented elsewhere. |
I agree it's out of scope. Many thanks for the fast response; I really appreciate your work. |
hi, I just noticed this project. I would like to know if the documentation for v4.0 can be queried now? |
What do you mean by "queried"? |
sorry, is "view the document" |
The documentation is here. You just had to select |
thanks |
@agronholm, any idea when we move to beta or stable release which can be used for production? TIA |
I was already going to release the beta at the end of July, but then some critical issues were reported by users against |
Thanks @agronholm. We hit a roadblock while trying to use 3.10 with clustering. We got undesired behaviors when we ran 3 instances with a shared data store. We are eagerly waiting for a stable release. We have done PoCs with master, and for our simple use case master is working fine. |
@agronholm. I'm using APScheduler 4.0.0a5. I have a problem with it: I'm using Redis as the event broker and MongoDB as the data store, and my Flask application is running in multiple instances, which in turn creates multiple schedule instances. This causes duplication of tasks. Any idea on this? Thank you |
Could you please not ask questions in this thread, and create a new discussion for that? Any messages posted here will notify a lot of people. My suggestion while waiting for the beta is to use the master directly, as it has tons of fixes already in it. |
It's been a while again. This weekend I decided to give the old 3.x branch an overhaul because it was causing difficulties for users. I got a lot of work done over there, including some that will help with the eventual 4.x migration:
Check out the full details in the version history. I realize that a year ago I estimated the beta to be released in December (2023), and that hasn't happened yet. I was going to do it in July but then a flurry of bug reports came in which I deemed necessary to be fixed before the beta, and the mental exhaustion, plus work from a gazillion other projects hit me hard. But I'm still working on it when I can. I have a hard deadline on finishing another major release before the end of the year, but I'll try to dedicate some time towards fixing the APScheduler 4.x issues too. |
I'm opening this issue as an easy way for interested parties to track the development progress of the next major APScheduler release (v4.0).
Terminology changes in v4.0
The old term of "Job", as it was, is gone, replaced by the following concepts which are closer to the terminology used by Celery:
Also, the term "executor" is now being changed to "worker".
Notice that the terminology may still change before the final release!
Planned major changes
v4.0 is a ground-up redesign that aims to fix all the long-standing flaws found in APScheduler over the years.
Checked boxes are changes that have already been implemented.
threshold value for AndTrigger (resolves issues with contained IntervalTrigger instances)
Potential extra features I would like to have:
OrTrigger (Having the threshold also on OrTrigger? #453)
You will notice that I have dropped a number of features from master. Some I may never add back to v4.0, even if requested, but do voice your wishes in this issue (and this issue only – I will summarily close such requests in new tickets). Others have been removed only temporarily to give me space for the redesign.
Features on the chopping block
Qt scheduler (difficult to test/maintain)
Being on the chopping block does not mean the feature will be gone forever! It may return in a subsequent minor release or even before the 4.0 final release if I deem it feasible to implement on top of the new architecture.