Critical Performance Issues Post-Upgrade from Umbraco 11.5.0 to 13.4.1 #16803

Closed
piotrbach opened this issue Jul 19, 2024 · 10 comments · Fixed by #16837

@piotrbach

Which Umbraco version are you using? (Please write the exact version, example: 10.1.0)

13.4.1

Bug summary

After upgrading Umbraco from version 11.5.0 to 13.4.1, we are experiencing critical issues with basic operations such as publishing and unpublishing content.
Additionally, we are often encountering database lock errors.
We have noticed significant changes in the database schema and indexes.

Please let us know if this upgrade/change has been performance-tested on a large volume of data similar to ours.
We would greatly appreciate any guidance or solutions to resolve this critical issue.

cc: @wtct

Specifics

We are managing a large dataset with ~325 000 documents.

We have attempted various solutions to resolve the issue, including:

  • Disabling nucache local db

  • Forcing rebuild of Database/Memory Cache

  • Disabling all custom handlers (ContentPublishingNotificationHandler, ContentUnpublishingNotificationHandler, ContentSavingNotificationHandler, etc.)

  • Rebuilding database indexes

  • Disabling DeliveryApi and Webhooks via configuration in appsettings.json

  • Disabling Content Cleanup Hosted Service to reduce background processes

Additionally, we observed that the upgrade process takes so long that the https://localhost:44392/install view does not redirect after completion.
We can only see from the logs that the process has finished.
When accessing https://localhost:44392/ directly and removing /install, the site still shows 'Website is Under Maintenance'.
Only after restarting Umbraco does the site function normally.

Steps to reproduce

  1. Start with an Umbraco 11.5.0 instance containing 325K documents.

  2. Perform an upgrade to Umbraco 13.4.1.

  3. Attempt to publish or unpublish a document.

  4. Observe the performance and any errors related to database locks.

Expected result / actual result

  • Expected: Publishing and unpublishing operations should be fast and error-free.

  • Actual: Operations are extremely slow, and database lock errors occur. Working with the CMS is impossible.

[image: umbraco-lock-issue]

Hi there @piotrbach!

Firstly, a big thank you for raising this issue. Every piece of feedback we receive helps us to make Umbraco better.

We really appreciate your patience while we wait for our team to have a look at this but we wanted to let you know that we see this and share with you the plan for what comes next.

  • We'll assess whether this issue relates to something that has already been fixed in a later version of the release that it has been raised for.
  • If it's a bug, is it related to a release that we are actively supporting or is it related to a release that's in the end-of-life or security-only phase?
  • We'll replicate the issue to ensure that the problem is as described.
  • We'll decide whether the behavior is an issue or if the behavior is intended.

We wish we could work with everyone directly and assess your issue immediately but we're in the fortunate position of having lots of contributions to work with and only a few humans who are able to do it. We are making progress though and in the meantime, we will keep you in the loop and let you know when we have any questions.

Thanks, from your friendly Umbraco GitHub bot 🤖 🙂

@pvhees

pvhees commented Jul 24, 2024

I encountered the same issue (upgrading from 11.5 to 13.4.0). The 'DeleteVersions' job is blocking everything. However, once it's done, the issue is resolved and doesn't come back. To help with the database timeouts and locking issues:

  • Increase the connect timeout in the connection string to a large value.
  • Add the following to appsettings.json under Umbraco -> CMS -> Global:

    "DistributedLockingReadLockDefaultTimeout": "00:05:00",
    "DistributedLockingWriteLockDefaultTimeout": "00:00:20"

Then I let it run for a while.

@wtct
Contributor

wtct commented Jul 29, 2024

Hello Guys,

Just letting you know that we have found a critical performance bottleneck caused by the refactoring in #14806.

We are going to prepare a new pull request.

If you have started investigating, please hold on :)

cc: @nul800sebastiaan, @Zeegaan, @elit0451, @nikolajlauridsen, @leekelleher, @mattbrailsford

wtct pushed a commit to wtct/Umbraco-CMS that referenced this issue Jul 29, 2024
…vestigation of umbraco#16803 - well tested on db with 300k+ nodes
@piotrbach
Author

Hi @pvhees, and big thanks for your suggestion 🤝.

Yes, we initially disabled all background services, including 'DeleteVersions', and experimented with the timeouts, but this did not resolve the issue.
We have found the problem, and PR #16837 is waiting to be merged.
It's worth noting that we are managing a dataset of 325,000 documents.
I suspect that with fewer than 100,000 documents, the problem might not be noticeable.

@wtct

@Zeegaan
Member

Zeegaan commented Aug 2, 2024

I am not sure why this is a problem at all? 🤔 The PR just changed the default behavior. But it might be a documentation issue 🤔
To revert to the old behavior, you have to set the UsePagedSqlQuery option to false in your appsettings.json like so:

    "Umbraco": {
      "CMS": {
        "NuCache": {
          "UsePagedSqlQuery": false
        }
      }
    }

Let me know if that helps 😁

@PerplexDaniel
Contributor

@Zeegaan Looking at PR #16837 it seems your change from earlier (in #14806) removed a WHERE clause from the COUNT query in some cases. It seems you refactored 3 different ones into 1 without a WHERE, but 2 of the original 3 had a WHERE clause.

For example, in #14806 this was changed:

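Original (using a .WhereIn):
[screenshot of the original code, not reproduced]

Your change (without .WhereIn). This version is used for all 3 cases:
[screenshot of the changed code, not reproduced]

Likewise for this one, it also uses the version without any WHERE clause.
[screenshot of the code, not reproduced]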
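
To illustrate the difference described above, here is a minimal sketch in Python using an in-memory SQLite database. It is not Umbraco or NPoco code: the table `content`, the column `content_type_id`, and both helper functions are hypothetical names chosen for this example. It only shows how a COUNT query without the WHERE ... IN filter reports the size of the whole table rather than the size of the matching subset.

```python
import sqlite3

# Hypothetical table standing in for a content table; not Umbraco's schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE content (id INTEGER PRIMARY KEY, content_type_id INTEGER)"
)
# 1000 rows spread evenly over 10 content types (100 rows per type).
conn.executemany(
    "INSERT INTO content VALUES (?, ?)",
    [(i, i % 10) for i in range(1, 1001)],
)


def count_unfiltered(conn):
    """COUNT with no WHERE clause: always the size of the whole table."""
    return conn.execute("SELECT COUNT(*) FROM content").fetchone()[0]


def count_filtered(conn, type_ids):
    """COUNT restricted by WHERE ... IN, matching the filter of the data query."""
    placeholders = ",".join("?" * len(type_ids))
    sql = (
        "SELECT COUNT(*) FROM content "
        f"WHERE content_type_id IN ({placeholders})"
    )
    return conn.execute(sql, list(type_ids)).fetchone()[0]


print(count_unfiltered(conn))        # 1000
print(count_filtered(conn, [3, 7]))  # 200
```

If the count feeds a paging loop, the unfiltered value overstates the number of pages that actually contain matching rows.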

@wtct
Contributor

wtct commented Aug 2, 2024

I am not sure why this is a problem at all ?

Hi @Zeegaan,

It seems @PerplexDaniel has clarified this case even further, but you can still take a look at the critical line you forgot during your refactoring:

https://github.com/umbraco/Umbraco-CMS/pull/14806/files#diff-1c5f9167e84fb37137bd239d3fddb0aebe0ac78b01e3b8c51871b227791d8ce0L349

    "Umbraco": {
      "CMS": {
        "NuCache": {
          "UsePagedSqlQuery": false
        }
      }
    }

Let me know if that helps 😁

This is probably only a workaround, and we don't want to go this way because it could cause other issues.

Honestly, the issue is caused by an obvious bug introduced by the refactoring performed in #14806.

Therefore, we created a new PR, #16837, which keeps the logic of your PR #14806 but restores the WHERE clauses where needed.

I recommend debugging the QueryPaged extension method while publishing a node that has some children:

https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Infrastructure/Persistence/NPocoDatabaseExtensions.cs#L35

BTW, I have to mention that I'm a little shocked that the performance of basic CRUD operations is not tested enough before a new version is released. On a database with a small number of nodes everything works as expected, even when there is a critical performance issue like this one.

@Zeegaan
Member

Zeegaan commented Aug 2, 2024

@PerplexDaniel Thank you for the clarification, that made it really clear 😁
I will take a look at merging the PR ASAP 😄

@piotrbach
Author

Hi,
@PerplexDaniel big thanks for your input and extra clarification 🤝.

@Zeegaan I suggest that you:

  1. Set the first breakpoint in ContentService https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Core/Services/ContentService.cs#L1387
  2. Set the second breakpoint in the while loop at https://github.com/umbraco/Umbraco-CMS/blob/contrib/src/Umbraco.Infrastructure/Persistence/NPocoDatabaseExtensions.cs#L63
  3. Publish a node with debugging on. A node without children should be fine, I guess.
  4. Debug through the cache refreshing code.
  5. Inspect itemCount in the public static IEnumerable QueryPaged method. You will notice that it equals the total number of Umbraco documents (no filtering, no WHERE).

[screenshot of the debugger, not reproduced]

As a result, the while loop takes forever and it's killing the system.
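
The effect on the loop can be reproduced with a simplified model. The sketch below is Python with SQLite, not the actual QueryPaged implementation: the `doc` table, the `parent_id` column, the page size of 100, and the function names are all assumptions made for illustration. It pages through the children of one parent and keeps going until `page_index * page_size` reaches the item count, so an unfiltered count makes it issue many page queries that return nothing.

```python
import sqlite3

PAGE_SIZE = 100   # hypothetical page size, chosen for the example
PARENT_ID = 42    # hypothetical parent whose children are being refreshed


def build_db(total_docs=5000, children=3):
    """Create a toy table: `children` rows under PARENT_ID, the rest elsewhere."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE doc (id INTEGER PRIMARY KEY, parent_id INTEGER)")
    conn.executemany(
        "INSERT INTO doc VALUES (?, ?)",
        [(i, PARENT_ID if i <= children else 0) for i in range(1, total_docs + 1)],
    )
    return conn


def query_paged(conn, count_sql, count_params=(), page_size=PAGE_SIZE):
    """Fetch the children page by page; return (rows, number of page queries)."""
    item_count = conn.execute(count_sql, count_params).fetchone()[0]
    rows, page_index, page_queries = [], 0, 0
    while True:
        page = conn.execute(
            "SELECT id FROM doc WHERE parent_id = ? ORDER BY id LIMIT ? OFFSET ?",
            (PARENT_ID, page_size, page_index * page_size),
        ).fetchall()
        page_queries += 1
        rows.extend(page)
        page_index += 1
        # The loop stops only once the pages cover item_count rows.
        if page_index * page_size >= item_count:
            break
    return rows, page_queries


conn = build_db()
# Count without WHERE: the loop runs over the whole table's worth of pages.
rows_bad, queries_bad = query_paged(conn, "SELECT COUNT(*) FROM doc")
# Count with the same WHERE as the data query: one page is enough.
rows_ok, queries_ok = query_paged(
    conn, "SELECT COUNT(*) FROM doc WHERE parent_id = ?", (PARENT_ID,)
)
print(len(rows_bad), queries_bad)  # 3 50
print(len(rows_ok), queries_ok)    # 3 1
```

Both calls return the same 3 rows, but the unfiltered count costs 50 page queries instead of 1. The same pattern on a table of roughly 325,000 documents would be consistent with the slowdown reported in this issue.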

Simply merging PR #16837 is a quick solution here.

Zeegaan pushed a commit that referenced this issue Aug 2, 2024
…tion of #16803 - well tested on db with 300k+ nodes (#16837)

Co-authored-by: Wojciech Tengler <wtengler@umbracare.net>
@Zeegaan
Member

Zeegaan commented Aug 2, 2024

@piotrbach thanks for the debugging steps, it helped during testing 🚀

Fixed in #16837

@Zeegaan Zeegaan closed this as completed Aug 2, 2024
Zeegaan pushed a commit that referenced this issue Aug 2, 2024
…tion of #16803 - well tested on db with 300k+ nodes (#16837)

Co-authored-by: Wojciech Tengler <wtengler@umbracare.net>
(cherry picked from commit 688790e)
@nul800sebastiaan nul800sebastiaan linked a pull request Aug 6, 2024 that will close this issue