Component
server, other
Describe the bug
Since updating to 2.7 we've observed a massive spike in database load. Most of it seems to come from updates to the `agents` table. More specifically, the `last_work` field of several agents is updated multiple times per second.

This is greatly impacting database performance (which seems to feed into #3999).
I haven't done a full profiling, but if I had to guess, this seems to come from this change: https://github.com/woodpecker-ci/woodpecker/pull/3844/files#diff-0f4ca4733649eb6707a0dd7e0ca0083cdc587b5cdced5b3ac051fc32cc9353cbR361-R368. If I understand correctly, every time a log line is persisted to the database, the respective agent is updated. The frequency of these updates seems quite high.
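To illustrate the pattern described above, here is a minimal, self-contained sketch (the `Store`, `Agent`, and function names are hypothetical stand-ins, not Woodpecker's actual API): when each persisted log line also touches the agent row, the number of `agents` UPDATEs scales one-to-one with log volume.

```go
package main

import (
	"fmt"
	"time"
)

// Agent is a hypothetical stand-in for the agents table row.
type Agent struct {
	ID       int64
	LastWork int64
}

// countingStore counts writes instead of hitting a real database.
type countingStore struct {
	logInserts   int
	agentUpdates int
}

func (s *countingStore) LogAppend(line string) error { s.logInserts++; return nil }
func (s *countingStore) AgentUpdate(a *Agent) error  { s.agentUpdates++; return nil }

// persistLogLine mirrors the suspected pattern: each log line triggers
// both an INSERT into log_entries and an UPDATE of the agent row.
func persistLogLine(s *countingStore, a *Agent, line string) error {
	if err := s.LogAppend(line); err != nil {
		return err
	}
	a.LastWork = time.Now().Unix()
	return s.AgentUpdate(a) // one extra UPDATE per log line
}

func main() {
	s := &countingStore{}
	a := &Agent{ID: 42}
	for i := 0; i < 1000; i++ {
		persistLogLine(s, a, "step output")
	}
	fmt.Println(s.logInserts, s.agentUpdates)
}
```

With 10 agents each streaming logs for up to 10 workflows, this would explain why `agents` write volume tracks log throughput rather than agent count.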
For context, we were running Woodpecker 2.3 just fine with a PostgreSQL database on an Amazon RDS db.t4g.small (2 vCPU and 2 GiB RAM). After 2.7 we had to upgrade to a db.t4g.xlarge (4 vCPU and 8 GiB RAM) and it's still struggling on CPU. We're running on Kubernetes with 10 agents and up to 10 workflows per agent.
Steps to reproduce
Install Woodpecker 2.7
Run multiple Workflows
Observe multiple updates on the `agents` table
Expected behavior
From #3844 this seems to be the intended behavior; however, it clearly comes at a cost.
Maybe we could update the agents only every X minutes instead of on every log line/every second (not sure how it's implemented right now; need to look deeper). Possibly the same could be said for `log_entries` updates; their frequency might be just a tad too high. Of course there are risks here (e.g. losing logs).
We will test an internal version with the `last_work` update on every log line disabled and see how that goes.
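The "update every X minutes" idea could be sketched roughly as follows. This is only an illustration under assumed names (`lastWorkThrottle`, `Touch`, the `persist` callback are all hypothetical, not Woodpecker's actual code): the throttle remembers when each agent's `last_work` was last persisted and skips the UPDATE while that timestamp is fresh.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// lastWorkThrottle persists an agent's last_work timestamp at most once
// per interval instead of on every log line. All names here are
// hypothetical stand-ins, not Woodpecker's actual API.
type lastWorkThrottle struct {
	mu       sync.Mutex
	interval time.Duration
	lastSave map[int64]time.Time // agent ID -> time of last persisted update
	persist  func(agentID int64) // the actual UPDATE on the agents table
}

// Touch is called on every log line; it only hits the database when the
// stored last_work value is older than the configured interval.
func (t *lastWorkThrottle) Touch(agentID int64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	now := time.Now()
	if now.Sub(t.lastSave[agentID]) < t.interval {
		return // recent enough: skip the write
	}
	t.lastSave[agentID] = now
	t.persist(agentID)
}

// simulateBurst feeds n log lines from one agent through the throttle
// within a single interval and returns how many UPDATEs were issued.
func simulateBurst(n int) int {
	writes := 0
	th := &lastWorkThrottle{
		interval: time.Minute,
		lastSave: map[int64]time.Time{},
		persist:  func(int64) { writes++ },
	}
	for i := 0; i < n; i++ {
		th.Touch(42)
	}
	return writes
}

func main() {
	fmt.Println(simulateBurst(1000)) // a burst of 1000 log lines causes 1 write
}
```

The trade-off is that `last_work` can lag behind reality by up to one interval, so anything that uses it for liveness checks would need a tolerance larger than the interval.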
System Info
Woodpecker 2.7
Kubernetes installation
Additional context
Here's our Amazon Performance Insights view for the database. On Friday the 9th, in the evening, we updated Woodpecker from 2.3 to 2.7, but we barely ran any pipelines then or during the weekend. Then on Monday you can see that the load is many times higher than the previous week.
Most of it is from INSERTs coming from `log_entries` and `agents`; however, a more detailed analysis shows that in query volume, `log_entries` updates have not increased much after the update, while `agents` updates have increased tremendously.

On Monday we increased the database from a t4g.small to a t4g.xlarge, which helped but did not solve the issue.
Validations
Checked that the bug isn't fixed in the next version already [https://woodpecker-ci.org/faq#which-version-of-woodpecker-should-i-use]