3.0.0 Performance collapse #3329
After solving my issue in #3330, I have to say I am having much the same problem. Basic operations are slow enough that it's realistically a barrier to me actually using 3.0, so I will probably have to downgrade for now. I'm happy to help in any way I can in terms of testing and such, but my know-how is severely limited with respect to anything needed for development. That all said, I am glad to see work is ongoing on the project, and as noted in another update-related thread, I am sorry that you usually only hear from the userbase when we are unhappy!

---
Are you using any hooks on the database? I am mainly asking because in #3312 as well as in #3314, the undesired behaviour appears only when hooks are enabled, and I also see some strange reporting of the modification count when adding a new task. (Adding one new task results in four "local" changes; I would expect one change.)

---
I'm having the same issue.
The old task list isn't available for comparison, but it always took less than a second.
I have a lot of UDAs, due to bugwarrior synchronizing with GitLab.

---
I am not using any hooks, but hooks are enabled.

Memory usage can also be a bit bonkers. Running the following on around 3,300 tasks resulted in all of my RAM being eaten and my computer crashing, though, granted, it took a very long time to get there, over an hour. The database is around 22 MB in size. I have more RAM than Taskwarrior should ever need.

I decided to make the hook directory and give it rwx permissions. No improvement. Here is the same set of commands on task 2.6.2:

---
There may be a few issues here. In general, performance is weak right now. Part of that should be relatively easy to fix -- we do a lot of individual queries, especially when modifying data, that could just as well be handled in a single transaction. There may be some unanticipated scaling issues, both in terms of the number of tasks (BTW, is that 4,500 pending tasks, @ashprice, or does that include completed?) and the number of operations ("There are 95472 local changes"). I don't think we have anything that scales with the number of operations, but maybe I've missed something there. I have a vague recollection that some DBs accomplish count(*) by scanning all rows, which would be expensive for 95,472 rows! So, let's see if we can split those out into individual issues.

Running a

As I've said elsewhere, developers for Taskwarrior are thin on the ground right now, so I appreciate the kind words and support.
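The single-transaction point can be sketched with Python's stdlib sqlite3 module. This is only an illustration of the shape of the fix, not the actual taskchampion code (which is Rust with rusqlite); the `tasks (uuid, data)` layout is a simplified stand-in:

```python
import sqlite3

def insert_per_statement(conn, rows):
    # One commit per row: every commit forces a journal write, so the
    # cost grows with the number of statements, not just the data size.
    for uuid, data in rows:
        conn.execute("INSERT INTO tasks (uuid, data) VALUES (?, ?)", (uuid, data))
        conn.commit()

def insert_one_transaction(conn, rows):
    # All rows in a single transaction: one journal write in total.
    with conn:
        conn.executemany("INSERT INTO tasks (uuid, data) VALUES (?, ?)", rows)
```

With thousands of modifications (or a large sync backlog), the difference between these two shapes is typically orders of magnitude on a file-backed database.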
---

That is 4,850 total tasks; 1,769 are pending. I thought it was pretty damn big, but I see from some of the other threads here that other people have even double this. The backlog is quite high, yeah. I basically never used the built-in sync in 2.6.2 and earlier -- I've always just made sure I'm not doing any tw-related I/O and then run something like syncthing or rsync on the directory. Careful use meant that never caused any issue, even with recurrence. Occasionally I'd delete the backlog file and even strip out purged tasks from undo.data with tools like ripgrep, sed, and uniq. (Pretty sure I never broke anything in ways that are relevant here.) I set up the 3.0.0 sync to a local directory and synced without issue. Unfortunately, performance is much the same. (I'll edit in the output when it arrives!)

---
OK, thanks for checking!

---
In case it helps, too, here are my current

---
I'm stressed just thinking of having 1,769 pending tasks! I added #3334 to track this particular perf issue.

---
Is there a way to disable sync entirely so that it just skips over any sync-related counts or logic? I looked through the man pages and didn't see anything. In my use case, I don't need to sync the data to any other devices, so it would be superfluous to set up a local sync that has no real use. |
---

Yes, that's #3297.

---
I had a look into the sqlite file because vit performance is horrible right now, and the task database is just two columns, where the second column is JSON for all the real task data? So we have to scan and re-parse the whole db every time the user wants to run a filter, or check what tags exist, etc.? What even is the point of using sqlite then? With all due respect and tons of love for this project ... wtf.

---
Okay, forget my complaint about not using the db as a db for now. ElectricR's comment about bombarding the db with queries is way more important. If I'm reading correctly (I probably am not, since I'm unfamiliar with the codebase and with Rust), I see that

```rust
fn get_task(&mut self, uuid: Uuid) -> Result<Option<TaskMap>> {
    let t = self.get_txn()?;
    let result: Option<StoredTaskMap> = t
        .query_row(
            "SELECT data FROM tasks WHERE uuid = ? LIMIT 1",
            [&StoredUuid(uuid)],
            |r| r.get("data"),
        )
        .optional()?;
    // Get task from "stored" wrapper
    Ok(result.map(|t| t.0))
}
```

So, tl;dr, I think there are a lot of cases where we are doing something like (pseudocode)

which ends up killing performance. Sorry if I'm being annoying with this. Just trying to make sense of it.
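The pattern being described is the classic "N+1 query" shape. As a hedged illustration (Python stdlib sqlite3 with a simplified two-column table, not the actual Rust code), it looks roughly like:

```python
import sqlite3

def load_all_n_plus_one(conn):
    # Anti-pattern: list every uuid, then run one SELECT per uuid.
    # Each query pays parse/plan/lookup overhead, so the total cost is
    # (number of tasks) x (per-query overhead).
    uuids = [u for (u,) in conn.execute("SELECT uuid FROM tasks")]
    out = []
    for u in uuids:
        row = conn.execute(
            "SELECT data FROM tasks WHERE uuid = ? LIMIT 1", (u,)
        ).fetchone()
        if row is not None:
            out.append(row[0])
    return out

def load_all_single_query(conn):
    # Same result with one statement and one pass over the table.
    return [d for (d,) in conn.execute("SELECT data FROM tasks")]
```

Both return the same data; only the number of round trips through the query machinery differs.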
---

Yep, your analysis is correct. There's some work in the Taskchampion repo that will help write performance. If you'd like to work on read performance, please do!

---
It's been a minute since I've programmed anything in C and I'm completely unfamiliar with FFI, so let me know if there's anything I've misunderstood. From what I can tell, the read performance could be improved by adding a

```rust
fn get_tasks(&mut self, uuids: Vec<Uuid>) -> Result<Vec<Option<TaskMap>>>
```

This loop could then simply push the uuid to a

Would this be a workable solution?
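A batched lookup along those lines can be done with a single `IN (...)` query. Here is a hedged Python sqlite3 sketch of the idea; the proposed Rust signature would do the equivalent through rusqlite, and the table layout here is an illustrative stand-in:

```python
import sqlite3

def get_tasks(conn, uuids):
    # One round trip for the whole batch instead of one query per uuid.
    # Note: SQLite caps bound parameters per statement (999 in older
    # builds), so a real implementation would chunk very large batches.
    if not uuids:
        return []
    placeholders = ",".join("?" for _ in uuids)
    rows = conn.execute(
        f"SELECT uuid, data FROM tasks WHERE uuid IN ({placeholders})", uuids
    )
    found = dict(rows)
    # Preserve the caller's order; None for uuids with no stored task.
    return [found.get(u) for u in uuids]
```

Returning `Vec<Option<TaskMap>>` in caller order, as in the proposed signature, matches the `found.get(u)` step above.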
---

Yep! The pseudocode in @imyxh's comment is, essentially,

BTW, like most DBs, SQLite is page-based, and pretty quickly pages its DB into memory. With a decent OS, those pages stay in memory, so at some level we are reading from memory. However, there's a lot of overhead to lock and open the DB (see also #3418), and a lot of overhead to parse each query and return the result, so this

We already have

I suspect the implementation of this would be relatively straightforward, and based on existing examples at both the

---
UPDATE: I created GothenburgBitFactory/taskchampion#452 for that work. Feel free to comment there and I can assign it to you!

---
I think I'll close this bug out, as it's more of a theme than an issue with a resolution. I think anything actionable here has either been addressed or is covered in
---
Hello.
So I've just upgraded to 3.0 and immediately noticed a huge performance drop in Taskwarrior on my system.
I primarily use TW on my Raspberry Pi 5 home server, which is not something you would expect great performance from. Anyway, the new TW clearly shows that something is wrong in its new implementation.
I've run a simple

```
time task list
```

with 2.6.2 and 3.0.0 on 744 pending tasks. This is what I get with the old version:

```
task list > /dev/null  0.08s user 0.00s system 99% cpu 0.085 total
```
And this is from 3.0:

```
./task list > /dev/null  2.01s user 0.13s system 98% cpu 2.176 total
```
As we can see, the new Taskwarrior performs about 25 times slower than the previous version.
I've run both TWs under strace and noticed that there are over 160k syscalls in the new version, most of which relate to I/O on the SQLite DB. Each of these syscalls takes roughly 10-14us, which adds up to almost 2 seconds of context switches and random page-cache I/O.
I've also tested a direct SQL query with

```
sqlite3 taskchampion.sqlite3 "select * from tasks where json_extract(data, '$.status') = 'pending';" > /dev/null
```

which ran without any performance problems at all:

```
sqlite3 taskchampion.sqlite3 > /dev/null  0.01s user 0.00s system 94% cpu 0.012 total
```
So it seems that the new Taskwarrior just bombards the DB with a query for each task it gets. If that's the case, this is not the access pattern you want for synchronous SQL operations.
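As an aside: even though the data lives in a JSON blob column, the status filter itself need not scan every row. SQLite (3.9+, with the JSON1 functions available) supports indexes on expressions, so `json_extract(data, '$.status')` can be indexed directly. A small sketch, hedged as an illustration of SQLite capability rather than anything taskchampion currently does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (uuid TEXT PRIMARY KEY, data TEXT)")

# An index on the extracted JSON field lets the status filter do an
# index search instead of scanning and re-parsing every JSON blob.
conn.execute(
    "CREATE INDEX idx_task_status ON tasks (json_extract(data, '$.status'))"
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM tasks "
    "WHERE json_extract(data, '$.status') = 'pending'"
).fetchall()
# The query plan should report a search using idx_task_status.
```

That said, as the strace numbers show, the bottleneck reported here is query volume and syscall overhead, not the per-query scan.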
Output of the `task diag` command:

Old:

New:
I think you've already heard some complaints about this release, but this is a perfect example of how you would not want to make one. I've been using this tool for about two years and genuinely love it. This tool does everything it should and nothing else, and I really appreciate all the work that has been done here by the devs and community. But it is really frustrating to see how this update just dropped on the heads of its users without proper testing, gradual rollout, and automation tools. Please, don't make the same mistake twice.