
3.0.0 Performance collapse #3329

Closed
ElectricR opened this issue Apr 2, 2024 · 18 comments
Labels
topic:performance Issues related to performance

Comments

@ElectricR

Hello.

So I've just upgraded to 3.0 and immediately noticed a huge performance drop in Taskwarrior on my system.

I primarily use TW on my Raspberry Pi 5 home server, which is not something you would expect great performance from. Even so, the new TW clearly shows that something is wrong in its new implementation.

  • What command(s) did you run?

I ran simple time task list commands with 2.6.2 and 3.0.0 against 744 pending tasks. This is what I get with the old version:

task list > /dev/null 0.08s user 0.00s system 99% cpu 0.085 total

And this is from 3.0:

./task list > /dev/null 2.01s user 0.13s system 98% cpu 2.176 total

As we can see, the new Taskwarrior is about 25 times slower than the previous version.

  • What actually happened?

I ran both versions under strace and noticed that there are over 160k syscalls in the new version, most of which relate to I/O on the SQLite DB. Each syscall takes roughly 10-14us, which adds up to almost 2 seconds of context switches and random page cache I/O.

I also tested a direct SQL query with sqlite3 taskchampion.sqlite3 "select * from tasks where json_extract(data, '$.status') = 'pending';" > /dev/null, which ran with no performance problems at all:

sqlite3 taskchampion.sqlite3 > /dev/null 0.01s user 0.00s system 94% cpu 0.012 total

  • What can I say?

So it seems that the new Taskwarrior just bombards the DB with a query for each task it reads. If that's the case, this is not something you would want from synchronous SQL operations.

  • Paste the output of the task diag command.
Old:
task 2.6.2
   Platform: Linux

Compiler
    Version: 12.1.0
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
      CMake: 3.26.3
    libuuid: libuuid + uuid_unparse_lower
  libgnutls: 3.8.0
 Build type: 

Configuration
       File: /home/er/.config/task/taskrc (found), 2090 bytes, mode 100644
       Data: /home/er/.local/share/task (found), dir, mode 40755
    Locking: Enabled
         GC: Enabled
    $EDITOR: helix
     Server: 
         CA: -
Certificate: -
        Key: -
      Trust: strict
    Ciphers: NORMAL
      Creds: 

Hooks
     System: Enabled
   Location: /home/er/.config/task/hooks
     Active: on-modify.timewarrior (executable)
   Inactive: 

Tests
   Terminal: 261x127
       Dups: Scanned 3597 tasks for duplicate UUIDs:
             Found duplicate 90cc89aa-1b0f-47d1-888f-4e9ca3616dff
 Broken ref: Scanned 3597 tasks for broken references:
             No broken references found
New:
task 3.0.0
   Platform: Linux

Compiler
    Version: 12.1.0
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
     Commit: 3e41fb604
      CMake: 3.29.0
    libuuid: libuuid + uuid_unparse_lower
 Build type: 

Configuration
       File: /home/er/.config/task/taskrc (found), 2090 bytes, mode 100644
       Data: /home/er/.local/share/task (found), dir, mode 40755
    Locking: Enabled
         GC: Enabled
    $EDITOR: helix
Hooks
     System: Enabled
   Location: /home/er/.config/task/hooks
     Active: on-modify.timewarrior (executable)
   Inactive: 

Tests
   Terminal: 261x127
       Dups: Scanned 3607 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 3607 tasks for broken references:
             No broken references found

  • P.S.

I think you have already heard some complaints about this release, but this is a perfect example of how not to make one. I have used this tool for about two years and genuinely love it. It does everything it should and nothing else, and I really appreciate all the work that has been done here by the devs and the community. But it is really frustrating to see how this update was just dropped on the heads of its users without proper testing, a gradual rollout, or automation tools. Please don't make the same mistake twice.

@ashprice

ashprice commented Apr 3, 2024

After solving my issue in #3330, I have to say I am having much the same problem. Basic stuff like task ready with my list of ~4,500 tasks is virtually unusable - I'm talking the kind of slow where you run the command, go make a cup of coffee and a meal, and manage to drink and eat them before it finishes. (I'll get you some time and task debug output in an hour, or however long it takes...)

It's realistically a barrier to me actually using 3.0, so I will probably have to downgrade for now. I'm happy to help in any way I can with testing and such, but my know-how is severely limited w.r.t. anything needed for development.

That all said, I am glad to see work is ongoing on the project, and as noted in another update-related thread, I am sorry that you usually only hear from the userbase when we are unhappy!

@felixschurk
Collaborator

Are you using any hooks on the database?

I am mainly asking because in #3312 as well as in #3314, the undesired behaviour occurs when hooks are enabled, and I also see some strange reporting of the modification count when adding a new task. (Adding one new task results in four "local" changes; I would expect just one.)

@Andonome
Contributor

Andonome commented Apr 3, 2024

I'm having the same issue.

Command        task 3.0.0
task ready     3.629s
task list      14.127s
task next      1.889s
task calendar  1.970s

The old task list isn't available for comparison, but it always took less than a second.

strace task next outputs 31205 lines.

I have a lot of UDAs, due to bugwarrior synchronizing with GitLab.
However, the times don't change when reducing the taskrc to a minimum.

task diagnostics

task 3.0.0
   Platform: Linux

Compiler
    Version: 12.1.0
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
     Commit: 3e41fb604
      CMake: 3.29.0
    libuuid: libuuid + uuid_unparse_lower
 Build type: 

Configuration
       File: /home/ghost/.config/task/taskrc (found), 1933 bytes, mode 100755
       Data: /home/ghost/.local/state/ (found), dir, mode 40700
    Locking: Enabled
         GC: Enabled
    $VISUAL: vim
Hooks
     System: Enabled
   Location: /home/ghost/.config/task/hooks
     Active: on-modify             (executable) (symlink)
             on-modify.timewarrior (executable) (symlink)
   Inactive: 

Tests
   Terminal: 241x45
       Dups: Scanned 6114 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 6114 tasks for broken references:
             No broken references found

@ashprice

ashprice commented Apr 3, 2024

Are you using any hooks on the database?

I am not using any hooks, but hooks are enabled.

$ task diag
task 3.0.0
   Platform: Linux

Compiler
    Version: 13.2.1 20230801
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
     Commit: 8a0a98d3e
      CMake: 3.29.0
    libuuid: libuuid + uuid_unparse_lower
 Build type: 

Configuration
       File: /home/hearth/.taskrc (found), 4718 bytes, mode 100644
       Data: /home/hearth/.task (found), dir, mode 40755
    Locking: Enabled
         GC: Enabled
Hooks
     System: Enabled
   Location: /home/hearth/.task/hooks
             (-none-)

Tests
   Terminal: 175x60
       Dups: Scanned 4850 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 4850 tasks for broken references:
             No broken references found

Memory usage can also be a bit bonkers. Running the following:

task pri:5.000000 mod pri:5

on around 3,300 tasks resulted in all of my RAM being eaten and my computer crashing - though, granted, it took a very long time to get there, over an hour. The database is around 22 MB. I have more RAM than Taskwarrior should ever need.

$ time task ready >/dev/null
There are 95472 local changes.  Sync required.

________________________________________________________
Executed in   20.13 mins    fish           external
   usr time   19.67 mins  332.00 micros   19.67 mins
   sys time    0.42 mins  145.00 micros    0.42 mins
$ time task rc.hooks=0 >/dev/null
Configuration override rc.hooks=0
There are 95472 local changes.  Sync required.

________________________________________________________
Executed in   20.04 mins    fish           external
   usr time   19.60 mins    0.00 micros   19.60 mins
   sys time    0.41 mins  390.00 micros    0.41 mins
$ task rc.debug=1 ready
Timer Config::load (/home/hearth/.taskrc) 0.000174 sec
No context set
Hook directory not readable: /home/hearth/.task/hooks
Parse Tree (before command-specific processing)
  _original_args
    task rc.debug=1 ready
  _args
    word basename='task' raw='task' BINARY 
    pair modifier='debug' name='rc' raw='rc.debug=1' separator='=' value='1' CONFIG ORIGINAL 
    identifier canonical='ready' raw='ready' ORIGINAL CMD READONLY SHOWSID RUNSGC USESCONTEXT ALLOWSFILTER 

No context set
Filtered 4850 tasks --> 1618 tasks [all tasks]
ViewTask::render min=75 ideal=303 overage=119 width=194
Perf task 3.0.0 8a0a98d3e 20240403T215306Z init:847 load:0 gc:48105 filter:10014 commit:0 sort:1203478756 render:19919182 hooks:0 other:1668783 total:1225125687

Decided to make the hook directory and give it rwx permissions. No improvement.

Here is the same set of commands on task 2.6.2:

$ task diag
task 2.6.2
   Platform: Linux

Compiler
    Version: 13.1.1 20230429
       Caps: +stdc +stdc_hosted +LP64 +c8 +i32 +l64 +vp64 +time_t64
 Compliance: C++17

Build Features
      CMake: 3.26.3
    libuuid: libuuid + uuid_unparse_lower
  libgnutls: 3.8.0
 Build type: 

Configuration
       File: /home/hearth/.taskrc (found), 4718 bytes, mode 100644
       Data: /home/hearth/.task (found), dir, mode 40755
    Locking: Enabled
         GC: Enabled
     Server: 
         CA: -
Certificate: -
        Key: -
      Trust: strict
    Ciphers: NORMAL
      Creds: 

Hooks
     System: Enabled
   Location: /home/hearth/.task/hooks
             (-none-)

Tests
   Terminal: 302x51
       Dups: Scanned 4850 tasks for duplicate UUIDs:
             No duplicates found
 Broken ref: Scanned 4850 tasks for broken references:
             No broken references found
 
$ time task ready >/dev/null
________________________________________________________
Executed in    2.61 secs    fish           external
   usr time    2.52 secs  251.00 micros    2.52 secs
   sys time    0.08 secs  150.00 micros    0.08 secs
$ time task rc.hooks=0 ready >/dev/null
Configuration override rc.hooks=0

________________________________________________________
Executed in    2.59 secs    fish           external
   usr time    2.53 secs  421.00 micros    2.53 secs
   sys time    0.06 secs   85.00 micros    0.06 secs 
$ task rc.debug=1 ready
Timer Config::load (/home/hearth/.taskrc) 0.000311 sec
No context set
Parse Tree (before command-specific processing)
  _original_args
    task rc.debug=1 ready
  _args
    word basename='task' raw='task' BINARY 
    pair modifier='debug' name='rc' raw='rc.debug=1' separator='=' value='1' CONFIG ORIGINAL 
    identifier canonical='ready' raw='ready' ORIGINAL CMD READONLY SHOWSID RUNSGC USESCONTEXT ALLOWSFILTER 

No context set
Filtered 4850 tasks --> 1618 tasks [all tasks]
ViewTask::render min=75 ideal=303 overage=119 width=194
  pending.data rw - T1946+000~000-000 L1946+000
completed.data rw - T2904+000~000-000 L2904+000
     undo.data rw - T0000+000~000-000 L0000+000
  backlog.data rw - T0000+000~000-000 L0000+000
 
Perf task 2.6.2 - 20240403T220157Z init:1046 load:33314 gc:370795 filter:6333 commit:42 sort:2043416 render:17276 hooks:0 other:61495 total:2533717

@djmitche
Collaborator

djmitche commented Apr 3, 2024

There may be a few issues here.

In general, performance is weak right now. Part of that should be relatively easy to fix -- we do a lot of individual queries, especially when modifying data, that could just as well be handled in a single transaction. There may also be some unanticipated scaling issues, both in the number of tasks (BTW, is that 4,500 pending tasks, @ashprice, or does that include completed?) and in the number of operations ("There are 95472 local changes"). I don't think we have anything that scales with the number of operations, but maybe I've missed something there. I have a vague recollection that some DBs implement count(*) by scanning all rows, which would be expensive for 95,472 rows! So let's see if we can split those out into individual issues. Running a task sync with local sync configured should take care of those 95k rows, so if things speed up at that point, we know that leaving a lot of un-sync'd operations leads to poor performance.
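
For illustration, here is a rough rusqlite-style sketch of the "single transaction" idea -- this is not TaskChampion's actual code, and the table and statement are simplified:

    use rusqlite::{params, Connection, Result};

    // Illustrative only: wrap a batch of modifications in one transaction so
    // SQLite performs a single BEGIN/COMMIT (and journal sync) for the whole
    // batch, rather than one implicit transaction per statement in autocommit
    // mode.
    fn update_tasks_batched(conn: &mut Connection, updates: &[(String, String)]) -> Result<()> {
        let tx = conn.transaction()?;
        {
            // Prepare once, execute many times.
            let mut stmt = tx.prepare("UPDATE tasks SET data = ?2 WHERE uuid = ?1")?;
            for (uuid, data) in updates {
                stmt.execute(params![uuid, data])?;
            }
        } // the statement borrows the transaction, so drop it before commit
        tx.commit()?;
        Ok(())
    }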

As I've said elsewhere, developers for Taskwarrior are thin on the ground right now, so I appreciate the kind words and support.

@ashprice

ashprice commented Apr 3, 2024

That is 4,850 total tasks, of which 1,769 are pending. I thought it was pretty damn big, but I see from some of the other threads here that other people have even double this.

The backlog is quite high, yeah. I basically never used the built-in sync in 2.6.2 and earlier - I've always just made sure I'm not doing any TW-related I/O and then run something like syncthing or rsync on the directory. Careful use meant that never caused any issues, even with recurrence. Occasionally I'd delete the backlog file and even strip purged tasks out of undo.data with tools like ripgrep, sed, and uniq. (Pretty sure I never broke anything in ways that are relevant here.)

I set up the 3.0.0 sync to a local directory and synced without issue. Unfortunately, performance is much the same. (I'll edit in the output when it arrives!)

@djmitche
Collaborator

djmitche commented Apr 3, 2024

OK, thanks for checking!

@ashprice

ashprice commented Apr 3, 2024

In case it helps, too, here are my current task stats:

Category                   Data         
Pending                    1769         
Waiting                    4
Recurring                  173          
Completed                  1984
Deleted                    920          
Total                      4850
Annotations                66           
Unique tags                36
Projects                   30           
Blocked tasks              151
Blocking tasks             244          
Undo transactions          0
Sync backlog transactions  0            
Tasks tagged               72.6%
Oldest task                2022-08-13   
Newest task                2024-04-03
Task used for              1.6y         
Task added every           2h
Task completed every       7h           
Task deleted every         15h
Average time pending       3mo          
Average desc length        30 characters

@djmitche
Collaborator

djmitche commented Apr 4, 2024

I'm stressed just thinking of having 1,769 pending tasks! I added #3334 to track this particular perf issue.

@JoshMock

Running a task sync with local sync configured should take care of those 95k rows

Is there a way to disable sync entirely so that it just skips over any sync-related counts or logic? I looked through the man pages and didn't see anything. In my use case, I don't need to sync the data to any other devices, so it would be superfluous to set up a local sync that has no real use.

@djmitche
Collaborator

Yes, that's #3297.

@djmitche added the topic:performance label on May 27, 2024
@djmitche moved this to Backlog in Taskwarrior Development on Jun 4, 2024
@imyxh

imyxh commented Jul 22, 2024

I had a look into the sqlite file because vit performance is horrible right now, and

sqlite> PRAGMA table_info(tasks);
0|uuid|STRING|0||1
1|data|STRING|0||0
sqlite> select * from tasks limit 1;
03fcc33c-52ae-44a3-aaf4-3af177e90bf0|{"status":"deleted","end":"1711502131","description":"test","modified":"1711502132","entry":"1711502121"}

The task database is just two columns, where the second column is JSON holding all the real task data? So we have to scan and re-parse the whole DB every time the user wants to run a filter, or check what tags exist, etc.? What even is the point of using SQLite then?

with all due respect and tons of love for this project ... wtf.
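
For what it's worth, SQLite can index a deterministic expression such as json_extract(), so a status filter doesn't have to parse the JSON in every row. A hypothetical sketch -- this is not something Taskwarrior/TaskChampion currently does, just an illustration of what this schema allows:

    use rusqlite::{Connection, Result};

    // Hypothetical: create an expression index over json_extract(data, '$.status')
    // so a WHERE clause using the exact same expression can be answered from the
    // index instead of re-parsing every row's JSON.
    fn list_pending(conn: &Connection) -> Result<Vec<String>> {
        conn.execute_batch(
            "CREATE INDEX IF NOT EXISTS tasks_status
                 ON tasks (json_extract(data, '$.status'));",
        )?;
        let mut stmt = conn.prepare(
            "SELECT uuid FROM tasks WHERE json_extract(data, '$.status') = 'pending'",
        )?;
        stmt.query_map([], |row| row.get::<_, String>(0))?
            .collect()
    }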

@imyxh

imyxh commented Jul 22, 2024

Okay, forget my complaint about not using the db as a db for now. ElectricR's comment about bombarding the db with queries is way more important.

If I'm reading correctly (I probably am not, since I'm unfamiliar with the codebase and with Rust), I see that

  1. TDB2::pending_tasks() calls replica.get_task for all tasks in the working set.

    • I think whoever wrote that code assumed that the replica would be an in-memory replica, so looping through the whole db would be fast.
    • But this seems not to be the case.
      a. open_replica calls tc::Replica::Replica (the version with 2 args),
      b. which calls tc_replica_new_on_disk,
      c. which calls TCReplica::from(Replica::new(storage)) where storage is StorageConfig::OnDisk
      d. I gave up on following the rabbit hole here, but I don't think this ends up copying the DB to memory.
  2. replica.get_task then calls self.taskdb.get_task,

  3. which I am assuming ends up being this one, because we're seeing so much I/O on the SQLite DB:

    fn get_task(&mut self, uuid: Uuid) -> Result<Option<TaskMap>> {
        let t = self.get_txn()?;
        let result: Option<StoredTaskMap> = t
            .query_row(
                "SELECT data FROM tasks WHERE uuid = ? LIMIT 1",
                [&StoredUuid(uuid)],
                |r| r.get("data"),
            )
            .optional()?;

        // Get task from "stored" wrapper
        Ok(result.map(|t| t.0))
    }

So, TL;DR, I think there are a lot of cases where we are doing something like (pseudocode)

for u in pending_uuids
    SELECT data FROM tasks WHERE uuid = u LIMIT 1

which ends up killing performance.

Sorry if I'm being annoying with this. Just trying to make sense of it.

@djmitche
Collaborator

Yep, your analysis is correct. There's some work in the TaskChampion repo that will help write performance. If you'd like to work on read performance, please do!

@Scandiravian

Scandiravian commented Sep 6, 2024

Yep, your analysis is correct. There's some work in the TaskChampion repo that will help write performance. If you'd like to work on read performance, please do!

It's been a minute since I've programmed anything in C and I'm completely unfamiliar with FFI, so let me know if there's anything I've misunderstood.

From what I can tell, the read performance could be improved by adding a get_tasks function to the StorageTxn trait:

fn get_tasks(&mut self, uuid: Vec<Uuid>) -> Result<Vec<Option<TaskMap>>>

This loop could then simply push the UUIDs into a std::vector and pass that to get_tasks. The maybe_task.is_some() check could potentially happen on the Rust side of things as well, in which case get_tasks would simply return Result<Vec<TaskMap>>.

Would this be a workable solution?
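
As a rough sketch -- reusing the names from the get_task snippet above, and assuming the transaction handle t supports prepare() the way a rusqlite transaction does -- the batched version could prepare the SELECT once and reuse it for every UUID:

    // Sketch only: mirrors get_task above, but prepares the statement a single
    // time and reuses it for every uuid in the batch.
    fn get_tasks(&mut self, uuids: Vec<Uuid>) -> Result<Vec<Option<TaskMap>>> {
        let t = self.get_txn()?;
        let mut stmt = t.prepare("SELECT data FROM tasks WHERE uuid = ? LIMIT 1")?;
        let mut tasks = Vec::with_capacity(uuids.len());
        for uuid in uuids {
            let result: Option<StoredTaskMap> = stmt
                .query_row([&StoredUuid(uuid)], |r| r.get("data"))
                .optional()?;
            // Keep None for uuids that are not present in the table
            tasks.push(result.map(|t| t.0));
        }
        Ok(tasks)
    }

A single SELECT ... WHERE uuid IN (...) would avoid the per-row round trip entirely, at the cost of building the parameter list dynamically.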

@djmitche
Collaborator

djmitche commented Sep 7, 2024

Yep! The pseudocode in @imyxh's comment is, essentially, TDB2::pending_tasks. And that's used a lot!

BTW, like most DBs, SQLite is page-based and pretty quickly pages its DB into memory. With a decent OS, those pages stay in memory, so at some level we are reading from memory. However, there's a lot of overhead to lock and open the DB (see also #3418), and a lot of overhead to parse each query and return the result, so this pending_tasks is certainly not as fast as it could be. But it may not be the worst offender in terms of performance -- some profiling would be required to determine that. Running a profiler is probably the action item from this issue.

We already have Replica::all_tasks and all_task_data, so it may make sense to add Replica::pending_tasks and pending_task_data, which could then join against the working_set table (as opposed to taking an arbitrary list of UUIDs).

I suspect the implementation of this would be relatively straightforward and could be based on existing examples at both the Replica and underlying TaskDB levels in TaskChampion. The connection to C++ is now much simpler than it used to be, using cxx, so that should also be relatively straightforward. And then calling the result from TDB2 is also pretty easy. The tricky bit is implementing it in TaskChampion and then developing against it in Taskwarrior, since they are different repos -- but I can help with that. This is definitely worth doing even if it's not the biggest contributor to poor performance. Do you want to give it a shot? If so, let's open a new issue to track the work.
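
For illustration, the query behind a pending_task_data could look roughly like the sketch below; the working_set column names here are an assumption, and the real TaskChampion schema may differ:

    use rusqlite::{Result, Transaction};

    // Hypothetical sketch: fetch every working-set task in one statement by
    // joining tasks against working_set, instead of issuing one SELECT per uuid.
    fn pending_task_data(txn: &Transaction<'_>) -> Result<Vec<(String, String)>> {
        let mut stmt = txn.prepare(
            "SELECT t.uuid, t.data
               FROM tasks t
               JOIN working_set w ON w.uuid = t.uuid",
        )?;
        stmt.query_map([], |row| Ok((row.get::<_, String>(0)?, row.get::<_, String>(1)?)))?
            .collect()
    }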

@djmitche
Collaborator

djmitche commented Sep 7, 2024

UPDATE: I created GothenburgBitFactory/taskchampion#452 for that work. Feel free to comment there and I can assign it to you!

@djmitche
Collaborator

djmitche commented Sep 7, 2024

I think I'll close this bug out, as it's more of a theme than an issue with a single resolution. I think anything actionable here has either been addressed or is covered by the topic:performance label. If not, please add new issues!

@djmitche closed this as completed on Sep 7, 2024
The github-project-automation bot moved this from Backlog to Done in Taskwarrior Development on Sep 7, 2024