Web UI Hanging #3276

Closed
2 tasks done
SlothCroissant opened this issue Jun 16, 2023 · 30 comments · May be fixed by #3595
Labels
area:dashboard (The main dashboard page where monitors' status are shown), bug (Something isn't working), question (Further information is requested)

Comments

@SlothCroissant
Contributor

⚠️ Please verify that this bug has NOT been raised before.

  • I checked and didn't find similar issue

🛡️ Security Policy

Description

Often (more often than not), the web UI hangs on initial load for me.

As you can see in the screenshot below, the websocket is open and data is flying in for monitors, so the browser is definitely connected to the server and there don't seem to be any noticeable issues with server connectivity.

Issue reproduces on Edge for Mac (114.0.1823.43), Firefox for Mac (112.0.1) and Safari for Mac (Version 16.5 (18615.2.9.11.4)).

When it DOES work, it's often very slow to initially load (10+ seconds before you see anything happening as monitors load).

I can also confirm that some push monitors experience latency/timeouts when calling UptimeKuma.

System: Proxmox LXC

  • 8 Cores (2X E5-2680 v3)
  • RAM: 8GB
  • Disk: 3-way ZFS Mirror on Dell Enterprise SAS SSDs

Average CPU usage in htop looks fairly straightforward, nothing too exciting:

Screenshot 2023-06-15 at 19 21 49

👟 Reproduction steps

N/A really, just open the web page and observe.

👀 Expected behavior

Page loads as normal

😓 Actual Behavior

Page hangs and shows:

Screenshot 2023-06-15 at 19 01 26

🐻 Uptime-Kuma Version

version : "1.21.3"

💻 Operating System and Arch

Debian 11 x64 via https://tteck.github.io/Proxmox/

🌐 Browser

Microsoft Edge Version 114.0.1823.43 (Official build) (arm64)

🐋 Docker Version

N/A - LXC

🟩 NodeJS Version

9.5.1

📝 Relevant log output

I don't see anything interesting in the logs output. Let me know what I can provide to assist.
@SlothCroissant SlothCroissant added the bug Something isn't working label Jun 16, 2023
@CommanderStorm
Collaborator

How many monitors do you have? What is their type?
10s seems quite long

@joe-eklund

I am experiencing the same behavior. On initial load it seems very slow, but on subsequent loads it seems much faster. Running 96 monitors (each checking every minute). My database size is 2.3 GB.

@UllenBullen

Have the same problem with the hanging user interface.
We are using Docker. I am experiencing the issue in the v1.21.3 release as well as the 1.22.0-beta.0 version.

After a container restart, the first time I use UptimeKuma, the monitors are loaded and displayed.
Also the monitoring events work without problems on this first call.

But if I open a second window (no matter which browser, or even a different device), the information is no longer loaded.
The page remains empty.

The first page, however, continues to run without problems. If I refresh the complete page, everything is gone.

Unfortunately there are not really any error messages.

The only thing I see in the Docker log is the message:
[RATE-LIMIT] INFO: remaining requests: 19

Some side information that may help:

  • We have about 700+ monitors running
  • The database is about 400 MB in size
  • We keep the log data for 7 days.

@ChrisLStark

I'm having the same issue with 216 monitors. Sometimes they randomly display, but then when drilling down they disappear!
I cannot see any errors in the logs.

Running as a container on Docker/Unraid. Version: 1.22.1
DB size: 1.15GB

Has anyone managed to find a solution?

@kevin-dunas

I'm having the same issue. When I log in with a new browser, I only see the WEB UI, no data.
When I check with developer tools, I am only receiving data on the socket.
When I restart via the "pm2 restart uptime-kuma" command, I see the data.
However, when I connect with a different browser, I don't see the data again.

monitoring : 716 monitors running
Database size : 1.4gb+
OS : Windows 2012 R2
CPU : Xeon 2.9
Memory : 16GB
Keep Log data : 7 days
uptime kuma version : 1.22.1
node.js version : 18.16.1

@Saibamen
Contributor

Please retest with latest version 1.23

@chakflying
Collaborator

Note that this could be caused by #3515 which is not merged yet.

@CommanderStorm
Collaborator

CommanderStorm commented Aug 18, 2023

could be caused

I doubt that it is, given the reported monitor count and DB size:

Running 96 monitors [...] My database size is 2.3 GB

@kevin-dunas's and @ChrisLStark's comments are likely not related to this issue and are likely fixed by the PR you linked.
@UllenBullen's comment might be as well.

This might be solved by the call to optimise introduced in #3380 and released in 1.23
=> @Saibamen's request stands: @SlothCroissant, can you retest whether this is still a problem in 1.23?
This might also be solved by #3595 or #2750.

@CommanderStorm CommanderStorm added question Further information is requested area:dashboard The main dashboard page where monitors' status are shown labels Dec 6, 2023
@raqua

raqua commented Apr 19, 2024

I have the same issue.
Running under Proxmox, LXC Debian 12, docker container inside.
Version 1.23.11
DB size: 51.5 MB, keeping only 1 day of history, as lowering it seems to improve this issue somewhat. At least that was my impression; it might be false.
83 monitors.

@CommanderStorm
Collaborator

CommanderStorm commented Apr 19, 2024

Read: with the lowered retention are you still having problems or not?

@raqua

raqua commented Apr 19, 2024

@CommanderStorm if I understood your question correctly (not sure because of the typos), you are asking if I still have issues with lowered retention. I still do, but it seems to happen less often.
Normally this works OK, but after a few days the response time keeps growing until it becomes really long, and then I just reboot the machine. These "few days" are usually about a week.

@nullxx

nullxx commented May 6, 2024

Why isn't this project utilizing Node worker threads? It's common for the UI to freeze when the thread is blocked by monitor processing, in addition to sqlite3.

@CommanderStorm
Collaborator

CommanderStorm commented May 6, 2024

Which monitors are you using?
Monitor processing should not take a significant amount of time, as all monitors use async/await, which allows switching to another part of the codebase while they wait.

The usual answer: we are just not using them currently, for no good reason. We simply have not investigated this optimisation.
=> benchmark it to prove that there is an improvement and contribute a fix ^^
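
A minimal sketch of the async/await point, using Python's asyncio as a stand-in for the Node.js event loop (an analogy, not Uptime Kuma code; `max_tick_gap`, `cooperative_check` and `blocking_check` are hypothetical names). A check that awaits its I/O lets other tasks run, while a check that does synchronous work holds the loop and stalls everything else, which is roughly what a frozen UI would look like:

```python
import asyncio
import time


async def max_tick_gap(work, workers: int = 5) -> float:
    """Run `workers` copies of `work` next to a ticker and return the
    ticker's longest stall in seconds."""
    gaps = []
    done = asyncio.Event()

    async def ticker():
        last = time.perf_counter()
        while not done.is_set():
            await asyncio.sleep(0.01)
            now = time.perf_counter()
            gaps.append(now - last)
            last = now

    tick_task = asyncio.create_task(ticker())
    await asyncio.sleep(0)  # give the ticker a chance to start
    await asyncio.gather(*(work() for _ in range(workers)))
    done.set()
    await tick_task
    return max(gaps)


async def cooperative_check():
    # Awaiting hands control back to the event loop while "waiting on I/O".
    await asyncio.sleep(0.05)


async def blocking_check():
    # Synchronous work (stand-in for CPU-heavy processing) holds the loop.
    time.sleep(0.05)


cooperative_gap = asyncio.run(max_tick_gap(cooperative_check))
blocking_gap = asyncio.run(max_tick_gap(blocking_check))
print(f"cooperative: {cooperative_gap:.3f}s, blocking: {blocking_gap:.3f}s")
```

If the ticker's longest stall grows with the number of checks, something in the checks is not yielding, and that is the kind of evidence a benchmark for worker threads would need to show.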

@nullxx

nullxx commented May 6, 2024

Currently, we are monitoring 82 services, including ping, HTTP, DNS, MySQL, MariaDB, MongoDB, PostgreSQL, and others. It is taking more than 60s to load. I recently experimented with adding worker_threads, which significantly enhanced our performance. Could I contribute further to this area?

@CommanderStorm
Collaborator

Yes, sounds obviously interesting. (Duuh..)

The problem is that the migration to the new, far more performant data schema of v2 is not finished yet.
=> directly migrating the experiment's data (I am assuming you have a high amount of retention configured) won't be possible.

I think if you configure a few hundred/thousand ping monitors @ 20s (see https://pypi.org/project/uptime-kuma-api/) and set up a local mock server to ping against, you can create a good test case/"benchmark" for this.
Be aware that some monitors have different scaling requirements.
For example, ping is surprisingly CPU intensive => this is an example where I think worker threads might be especially interesting.
Also monitor the impact on RAM (I think that might be the downside of this approach) ^^
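
As a rough sketch of the "local mock server" part of such a benchmark, the following uses only the Python standard library. It covers HTTP monitors only (ICMP ping monitors would need a reachable host instead), and `MockTarget` and `start_mock_server` are hypothetical names, not part of Uptime Kuma or uptime-kuma-api:

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class MockTarget(BaseHTTPRequestHandler):
    """Answers every GET with a tiny 200 response."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet under heavy polling


def start_mock_server(port: int = 0) -> ThreadingHTTPServer:
    """Start the server in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), MockTarget)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


server = start_mock_server()
port = server.server_address[1]

# Quick self-check: request the mock endpoint once.
connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
connection.request("GET", "/health")
status = connection.getresponse().status
connection.close()

server.shutdown()
server.server_close()
print(status)
```

For an actual benchmark the server would stay running, and the monitors would be pointed at `http://127.0.0.1:<port>/` so that network latency does not distort the measurement.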

@chakflying
Collaborator

Uptime Kuma is designed to be usable in a container with 256MB of memory. Since a worker thread is basically a separate Node.js instance with its own V8 heap, every thread will take up precious memory and interfere with V8's memory management (#3039). Of course there could be a dynamic scaling algorithm based on the available memory, but a lot more engineering effort is needed.

Also, in my limited testing, v2 running with an external database has not experienced any performance issues, which suggests to me that the current bottleneck is still SQLite.

@nullxx

nullxx commented May 6, 2024

I have been benchmarking why it takes so long to load, and I found the Monitor.toJSON function. It's very slow because it fetches lots of data recursively.
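
To illustrate the general pattern being described (one extra query per monitor while serializing), here is a minimal sketch. It is not the actual `Monitor.toJSON` code; the `monitor` and `monitor_tag` tables and both function names are assumptions made up for this example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE monitor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE monitor_tag (monitor_id INTEGER, tag TEXT);
""")
conn.executemany(
    "INSERT INTO monitor (id, name) VALUES (?, ?)",
    [(i, f"monitor-{i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO monitor_tag (monitor_id, tag) VALUES (?, ?)",
    [(i, f"tag-{i % 5}") for i in range(1, 101)],
)


def serialize_per_monitor(connection):
    """Fetch related rows inside the loop: 1 + N statements in total."""
    statements = 1
    result = {}
    monitors = connection.execute("SELECT id, name FROM monitor").fetchall()
    for monitor_id, name in monitors:
        tags = [
            row[0]
            for row in connection.execute(
                "SELECT tag FROM monitor_tag WHERE monitor_id = ?", (monitor_id,)
            )
        ]
        statements += 1
        result[monitor_id] = {"name": name, "tags": tags}
    return result, statements


def serialize_batched(connection):
    """Load related rows once and group them in memory: 2 statements."""
    tags_by_monitor = {}
    for monitor_id, tag in connection.execute("SELECT monitor_id, tag FROM monitor_tag"):
        tags_by_monitor.setdefault(monitor_id, []).append(tag)
    result = {
        monitor_id: {"name": name, "tags": tags_by_monitor.get(monitor_id, [])}
        for monitor_id, name in connection.execute("SELECT id, name FROM monitor")
    }
    return result, 2


slow_result, slow_statements = serialize_per_monitor(conn)
fast_result, fast_statements = serialize_batched(conn)
print(slow_statements, fast_statements)
```

Both functions return the same data; the difference is only in how many statements hit the database, which tends to matter more as the number of monitors grows.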

@TimK42

TimK42 commented May 7, 2024

I ran into this issue too, with 240 monitors.

After trying many approaches, I found a way to work around it temporarily.

  1. Restart the service.
  2. Try to open the dashboard. If failed, go to step 1.
  3. Select ALL items, and pause them.
  4. Try to open the dashboard.
  5. Resume ALL the items.
  6. The dashboard returns to normal.

Hope this can help you.

@CommanderStorm
Collaborator

CommanderStorm commented May 7, 2024

@TimK42

You are right, we probably should have added a > [!TIP] to the related issues.
I have updated #4500 with this tip:

Tip

If you are affected by the performance limits mentioned above in v1, you need to reduce the amount of data you store.
The core problem is that Uptime Kuma in v1 has to do a table scan of the entire heartbeat table for some operations.

The solution boils down to storing less data, so that Uptime Kuma does not have to read through those gigabytes of data:

  • reduce the retention and execute a manual cleanup under /settings/monitor-history
  • increase the time between checks
  • pause or reduce the number of monitors
  • delete the specific history of less essential monitors (to "lower" their retention below the configured maximum)
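
As a back-of-the-envelope sketch of why these levers help: assuming one heartbeat row is stored per monitor per check (an assumption for illustration, not something confirmed in this thread), the row count scales with monitors × checks per day × retention days. The function name is hypothetical, and the 7-day retention in the first example is made up:

```python
def estimated_heartbeat_rows(monitors: int, interval_seconds: int, retention_days: int) -> int:
    """Rough row count, assuming one heartbeat row per monitor per check."""
    checks_per_day = 86_400 // interval_seconds
    return monitors * checks_per_day * retention_days


# 96 monitors checking every 60 s, kept for 7 days
print(estimated_heartbeat_rows(96, 60, 7))   # → 967680
# Same monitors, retention lowered to 1 day
print(estimated_heartbeat_rows(96, 60, 1))   # → 138240
# Same monitors, 7 days, but checking every 120 s
print(estimated_heartbeat_rows(96, 120, 7))  # → 483840
```

Each lever in the list above reduces one of the three factors, so they can be combined.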

@olli991

olli991 commented Jul 7, 2024

I'm experiencing the same on the latest 1.x release in Docker. I couldn't really find a solution in this thread. The tip at the end here only seems to be for v2.x, because I don't have the settings folder in my Docker drive.
Any new info on this?

Only running 15 monitors, though.

@CommanderStorm
Collaborator

/settings/monitor-history is a URL

@olli991

olli991 commented Jul 8, 2024

Oh, my bad 🙈 The bad thing is I can't use anything anymore. The web UI is completely frozen and useless. Even push monitor messages are affected now. I'm getting some "is up again" notifications but never got the "went down" ones :/

@UllenBullen

Is there a possibility to deactivate or limit the monitoring history? I currently have 900 monitors, and when only a few events are registered, the page loads without problems. If a lot of events arrive while the page is loading, Uptime Kuma crashes.
I used the Monitor History setting to keep data for only 1 day. But when we do our maintenance, the events go crazy.

@CommanderStorm
Collaborator

@UllenBullen
No, you can't currently deactivate the history. The problem with maintenance you are describing is likely unrelated to the history.

@CommanderStorm
Collaborator

@olli991
In that case, you will need to go into the sqlite console and manually delete items.

First ensure that the system is offline, then run

  • VACUUM
  • DELETE FROM heartbeat WHERE 1=1 (adjust the where clause to your needs)
  • VACUUM

A vacuum alone could solve your problem; you will have to check this.
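
The delete-then-vacuum steps above can be sketched against a throwaway database as follows. The simplified `heartbeat` schema (including the `time` column format) and the helper names are assumptions for this example; check the real schema and take a backup before running anything like this against a live kuma.db:

```python
import os
import sqlite3
import tempfile
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def prune_heartbeats(db_path: str, keep_days: int) -> int:
    """Delete heartbeat rows older than keep_days, then VACUUM.
    Returns the number of deleted rows."""
    cutoff = (datetime.now() - timedelta(days=keep_days)).strftime(TIME_FORMAT)
    # isolation_level=None means autocommit; VACUUM cannot run inside a transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        deleted = conn.execute("DELETE FROM heartbeat WHERE time < ?", (cutoff,)).rowcount
        conn.execute("VACUUM")  # rebuild the file so the freed pages are released
        return deleted
    finally:
        conn.close()


def make_demo_db(path: str, days: int = 10, rows_per_day: int = 100) -> None:
    """Create a throwaway database with an assumed, simplified heartbeat table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE heartbeat (id INTEGER PRIMARY KEY, monitor_id INTEGER, time TEXT, msg TEXT)"
    )
    now = datetime.now()
    rows = [
        (1, (now - timedelta(days=d, minutes=1)).strftime(TIME_FORMAT), "x" * 200)
        for d in range(days)
        for _ in range(rows_per_day)
    ]
    conn.executemany("INSERT INTO heartbeat (monitor_id, time, msg) VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()


demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
make_demo_db(demo_path)
size_before = os.path.getsize(demo_path)
deleted = prune_heartbeats(demo_path, keep_days=2)
size_after = os.path.getsize(demo_path)
print(f"deleted {deleted} rows, file size {size_before} -> {size_after} bytes")
```

The file only shrinks after the VACUUM; a DELETE on its own marks pages as free but leaves the file size unchanged.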

@itmcdonalds

We are experiencing the same problem.

  • Running Version: 1.23.15 in a docker container
  • Monitoring 1198 devices, mostly ping
  • Monitoring history is set to 7 days
  • The database is 905.6MB
  • Heartbeat Interval is set to 120 seconds
  • Packet Size is 56

When accessing the web interface, it reloads at least 3 times before being able to do anything.

@CommanderStorm
Collaborator

Monitoring 1198 devices

That is far beyond our capabilities in v1. V2 does improve upon this (see #4500 for mitigations in the meantime).

Likely the issue needs #5025 too, but I have not had time for a proper re-review of said PR.

@laborb-sb

I'm probably in the same boat.
I have about 50 monitors running. Monitoring is reliable, notifications are coming in.
However, the web UI regularly crashes after it has been up for a while (haha!). Then the entries/monitors are not updated, or are missing, or the whole app does not load.
Browser Reload does not help, Server Restart does (mostly).
The log is unremarkable. Only http monitors, I keep history for 100 days, my database is around 800 MB.

@vRobM

vRobM commented Dec 18, 2024

Could another table/index be added which removes the need for the code to scan through the entire DB recursively?
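
One way to check whether an index would help a given query is SQLite's `EXPLAIN QUERY PLAN`. The sketch below uses an assumed, simplified `heartbeat` table, a made-up query and a hypothetical index name; it does not claim that this index is missing in Uptime Kuma, only shows how to compare plans before and after adding one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE heartbeat (id INTEGER PRIMARY KEY, monitor_id INTEGER, time TEXT, status INTEGER)"
)

# Hypothetical query: latest heartbeat of one monitor.
QUERY = "SELECT * FROM heartbeat WHERE monitor_id = ? ORDER BY time DESC LIMIT 1"


def query_plan(connection, sql, params):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)


plan_without_index = query_plan(conn, QUERY, (1,))

# Composite index covering both the filter and the sort order.
conn.execute("CREATE INDEX idx_heartbeat_monitor_time ON heartbeat (monitor_id, time)")
plan_with_index = query_plan(conn, QUERY, (1,))

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

The exact wording of the plan differs between SQLite versions, but a full scan shows up as SCAN and an indexed lookup as SEARCH ... USING INDEX.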

@CommanderStorm
Collaborator

We have released V2.0.0-beta1 with lots of performance fixes.
I am going to close this issue to keep possible V2 and V1 performance issues in different threads.
