Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: Restoring multiple big folders from trash bin fails #43759

Open
4 of 8 tasks
mgallien opened this issue Feb 22, 2024 · 10 comments
Open
4 of 8 tasks

[Bug]: Restoring multiple big folders from trash bin fails #43759

mgallien opened this issue Feb 22, 2024 · 10 comments
Assignees
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap 29-feedback bug feature: trashbin

Comments

@mgallien
Copy link
Contributor

⚠️ This issue respects the following points: ⚠️

Bug description

Restoring multiple big folders at once from trash bin fails with an error message: "Failed to restore some elements"

Steps to reproduce

  1. create 4 folders named bigFolder1, bigFolder2, bigFolder3 and bigFolder4
  2. create many empty files inside each folder for i in {1..10000}; do touch test${i}.txt; done
  3. get the files scan to finish sudo -u www-data php ./occ files:scan <uid>
  4. delete the 4 folders from web UI
  5. try to restore them with trash bin web UI

Expected behavior

all folders are restored without error

Installation method

Community Manual installation with Archive

Nextcloud Server version

master

Operating system

Debian/Ubuntu

PHP engine version

PHP 8.2

Web server

Apache (supported)

Database engine version

MariaDB

Is this bug present after an update or on a fresh install?

None

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

What user-backends are you using?

  • Default user-backend (database)
  • LDAP/ Active Directory
  • SSO - SAML
  • Other

Configuration report

sudo -u www-data php occ config:list system 
{
    "system": {
        "debug": true,
        "profiler": true,
        "apps_paths": [
            {
                "path": "\/var\/www\/html\/apps",
                "url": "\/apps",
                "writable": false
            },
            {
                "path": "\/var\/www\/html\/apps-extra",
                "url": "\/apps-extra",
                "writable": false
            },
            {
                "path": "\/var\/www\/html\/apps-shared",
                "url": "\/apps-shared",
                "writable": false
            },
            {
                "path": "\/var\/www\/html\/apps-writable",
                "url": "\/apps-writable",
                "writable": true
            }
        ],
        "allow_local_remote_servers": true,
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpmode": "smtp",
        "mail_sendmailmode": "smtp",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpport": "1025",
        "skeletondirectory": "\/skeleton",
        "setup_create_db_user": false,
        "loglevel": 0,
        "log_query": false,
        "query_log_file": "\/var\/www\/html\/data\/query.log",
        "query_log_file_requestid": "yes",
        "diagnostics.logging": true,
        "diagnostics.logging.threshold": 0,
        "log.condition": {
            "apps": [
                "diagnostics",
                "admin_audit"
            ]
        },
        "bulkupload.enabled": false,
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "localhost",
            "nextcloud",
            "nextcloud.local",
            "192.168.21.6",
            "localhost"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "dbtype": "mysql",
        "version": "29.0.0.5",
        "overwrite.cli.url": "http:\/\/nextcloud.local",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "updater.release.channel": "git",
        "installed": true,
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "lookup_server": "",
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "port": 6379
        },
        "memcache.local": "\\OC\\Memcache\\Redis",
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***"
    }
}

List of activated Apps

sudo -u www-data php occ app:list
Enabled:
  - cloud_federation_api: 1.12.0
  - comments: 1.19.0
  - contactsinteraction: 1.10.0
  - dashboard: 7.9.0
  - dav: 1.30.0
  - federatedfilesharing: 1.19.0
  - federation: 1.19.0
  - files: 2.1.0
  - files_reminders: 1.2.0
  - files_sharing: 1.21.0
  - files_trashbin: 1.19.0
  - files_versions: 1.22.0
  - lookup_server_connector: 1.17.0
  - oauth2: 1.17.0
  - provisioning_api: 1.19.0
  - settings: 1.12.0
  - sharebymail: 1.19.0
  - systemtags: 1.19.0
  - theming: 2.4.0
  - twofactor_backupcodes: 1.18.0
  - updatenotification: 1.19.0
  - user_oidc: 1.3.6
  - user_status: 1.9.0
  - weather_status: 1.9.0
  - workflowengine: 2.11.0
Disabled:
  - admin_audit: 1.19.0
  - encryption: 2.17.0
  - files_external: 1.21.0
  - testing: 1.19.0
  - user_ldap: 1.20.0

Nextcloud Signing status

No response

Nextcloud Logs

No response

Additional info

No response

@mgallien mgallien added bug 0. Needs triage Pending check for reproducibility or if it fits our roadmap labels Feb 22, 2024
@mgallien
Copy link
Contributor Author

@sorbaugh

@sorbaugh
Copy link
Contributor

sorbaugh commented Feb 22, 2024

This reminds me a bit of a reported issue where an instance runs out of memory when doing operations on super large directories. This is very infrastructure/setup dependant, and It might be unrelated, but an idea then was to create some sort of admin setting to limit the number of allowed files in a directory, including the trashbin. While certainly not a satisfactory solution, we're bound to run into some kind of friction when dealing with absurdly large amounts of files/data.

Depending on what analysis of the root cause reveals, maybe this could be an alternative approach.

@sorbaugh
Copy link
Contributor

Looping in @come-nc & @Altahrim

@come-nc
Copy link
Contributor

come-nc commented Feb 27, 2024

["An exception occurred while executing a query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","/remote.php/dav/trashbin/bob/trash/bigFolder4.d1709033304"]
["An exception occurred while executing a query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","/remote.php/dav/trashbin/bob/trash/bigFolder3.d1709033304"]
["An exception occurred while executing a query: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction","/remote.php/dav/trashbin/bob/trash/bigFolder1.d1709033304"]
["Transaction took 36.729679107666s","/remote.php/dav/trashbin/bob/trash/bigFolder2.d1709033304"]

Seems to be a concurrency problem.

@come-nc
Copy link
Contributor

come-nc commented Feb 27, 2024

Info on the deadlock:

------------------------
LATEST DETECTED DEADLOCK
------------------------
2024-02-27 13:10:47 0x7f031a5a6700
*** (1) TRANSACTION:
TRANSACTION 61551, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
LOCK WAIT 80 lock struct(s), heap size 24696, 2950 row lock(s), undo log entries 2917
MariaDB thread id 2630, OS thread handle 139651303761664, query id 223713 192.168.21.6 root Updating
UPDATE `oc_filecache` SET `storage` = 3, `path_hash` = MD5(CONCAT('files/bigFolder2', SUBSTR(`path`, 44))), `path` = CONCAT('files/bigFolder2', SUBSTR(`path`, 44)) WHERE (`storage` = 3) AND (`path` LIKE 'files\\_trashbin/files/bigFolder2.d1709039279/%')
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 10 page no 4456 n bits 304 index fs_storage_path_hash of table `nextcloud`.`oc_filecache` trx id 61551 lock mode S waiting
Record lock, heap no 221 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 8000000000000003; asc         ;;
 1: len 30; hex 343962373630373532383339653161373739323330383531303737393462; asc 49b760752839e1a77923085107794b; (total 32 bytes);
 2: len 8; hex 800000000000b768; asc        h;;

*** CONFLICTING WITH:
RECORD LOCKS space id 10 page no 4456 n bits 304 index fs_storage_path_hash of table `nextcloud`.`oc_filecache` trx id 61550 lock_mode X locks rec but not gap
Record lock, heap no 221 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
 0: len 8; hex 8000000000000003; asc         ;;
 1: len 30; hex 343962373630373532383339653161373739323330383531303737393462; asc 49b760752839e1a77923085107794b; (total 32 bytes);
 2: len 8; hex 800000000000b768; asc        h;;


*** (2) TRANSACTION:
TRANSACTION 61550, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
LOCK WAIT 350 lock struct(s), heap size 57464, 2877 row lock(s), undo log entries 2825
MariaDB thread id 2629, OS thread handle 139651533489920, query id 223703 192.168.21.6 root Updating
UPDATE `oc_filecache` SET `storage` = 3, `path_hash` = MD5(CONCAT('files/bigFolder4', SUBSTR(`path`, 44))), `path` = CONCAT('files/bigFolder4', SUBSTR(`path`, 44)) WHERE (`storage` = 3) AND (`path` LIKE 'files\\_trashbin/files/bigFolder4.d1709039279/%')
*** WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 10 page no 321 n bits 304 index fs_storage_path_hash of table `nextcloud`.`oc_filecache` trx id 61550 lock mode S waiting
Record lock, heap no 3 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 8; hex 8000000000000003; asc         ;;
 1: len 30; hex 333238343130336335663666633766326566306530636333303438333334; asc 3284103c5f6fc7f2ef0e0cc3048334; (total 32 bytes);
 2: len 8; hex 8000000000002a4a; asc       *J;;

*** CONFLICTING WITH:
RECORD LOCKS space id 10 page no 321 n bits 304 index fs_storage_path_hash of table `nextcloud`.`oc_filecache` trx id 61551 lock_mode X locks rec but not gap
Record lock, heap no 3 PHYSICAL RECORD: n_fields 3; compact format; info bits 32
 0: len 8; hex 8000000000000003; asc         ;;
 1: len 30; hex 333238343130336335663666633766326566306530636333303438333334; asc 3284103c5f6fc7f2ef0e0cc3048334; (total 32 bytes);
 2: len 8; hex 8000000000002a4a; asc       *J;;

*** WE ROLL BACK TRANSACTION (1)

@come-nc
Copy link
Contributor

come-nc commented Feb 27, 2024

It’s not clear to me why the DELETE does not deadlock the same. It does thing in reverse order and update cache first but otherwise it’s the same kind of logic.

@come-nc
Copy link
Contributor

come-nc commented Feb 27, 2024

Looking at the query log for both delete and restore, it looks quite similar, there is only one more UPDATE query on restore side, related to storage_mtime.

@come-nc
Copy link
Contributor

come-nc commented Feb 27, 2024

When deleting the 4 folders, it seems the concurrency does not slow down deletion at all, each request takes less than 2 sec on my setup.
For restoring, it take close to 5s, and when restoring the 4, 3 requests fail with deadlock in about 1s, and the fourth one finishes close to 5s as when alone.

@sorbaugh sorbaugh self-assigned this Feb 29, 2024
@come-nc
Copy link
Contributor

come-nc commented Mar 7, 2024

I tried #43531 but it does not avoid the deadlock.

@come-nc
Copy link
Contributor

come-nc commented Mar 7, 2024

Current @Altahrim theory:

  • The update locks more rows than it’s actually updating
  • Because of that the deadlock appears

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0. Needs triage Pending check for reproducibility or if it fits our roadmap 29-feedback bug feature: trashbin
Projects
None yet
Development

No branches or pull requests

5 participants