Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically Set MySQL/MariaDB Session Transaction Isolation to "Read Committed" for Upon Reconnection #17947

Closed
GuyPaddock opened this issue Nov 14, 2019 · 10 comments · Fixed by #18227
Labels
4. to release Ready to be released and/or waiting for tests to finish enhancement

Comments

@GuyPaddock
Copy link

GuyPaddock commented Nov 14, 2019

Is your feature request related to a problem? Please describe.
As someone responsible for deploying and maintaining our Nextcloud deployment, I'd like it if Nextcloud automatically set the transaction isolation level to "READ COMMITTED" when reconnecting to the database during a long-running request so that I don't have to set it at the server level, affecting all other applications using the same application server.

Describe the solution you'd like
It appears that Nextcloud automatically sets the transaction isolation level to "Read Committed" whenever Nextcloud opens a database connection to MySQL or MariaDB, but it doesn't restore this if the connection dies and Nextcloud has to reconnect.

This transaction isolation level is necessary to ensure that the transaction isolation level is as it needs to be for Nextcloud not to deadlock the DB on file operations.

Describe alternatives you've considered
Change the global default value for transaction_isolation on the MySQL server per the installation docs, and then either:

  1. Modify all other applications that use the same database server to set their transaction isolation levels as needed for their own purposes; OR
  2. Use a separate database server for other applications.

Neither option is great.

@GuyPaddock GuyPaddock added 0. Needs triage Pending check for reproducibility or if it fits our roadmap enhancement labels Nov 14, 2019
@GuyPaddock GuyPaddock changed the title Automatically Set MySQL/MariaDB Session Transaction Isolation to "Read Committed" Automatically Set MySQL/MariaDB Session Transaction Isolation to "Read Committed" for Each Connection Nov 14, 2019
@kesselb
Copy link
Contributor

kesselb commented Nov 14, 2019

parent::setTransactionIsolation(parent::TRANSACTION_READ_COMMITTED);

Whenever Nextcloud opens a database connection to MySQL or MariaDB, it should execute:

I think the above code should do that. Does not work for you? 🤔

@GuyPaddock
Copy link
Author

GuyPaddock commented Nov 14, 2019

Hmm, that's interesting...

This appears to be a separate root cause that was not actually affected by this bug; we thought it was related to seeing "Failed to send QUERY packet" in the logs, but it's actually separate. It was not clear from the logs that Nextcloud is actually able to recover from "Failed to send QUERY packet" and it is not fatal.

We're finding that large, chunked uploads fail at the end unless transaction_isolation is set to READ-COMMITTED per the install docs. Basically, the chunked upload gets to 100% and then fails during the "Processing files..." step with a "Error when assembling chunks, status code 403".

The response to the browser on the .file MOVE request is:

<?xml version="1.0" encoding="utf-8"?>
<d:error xmlns:d="DAV:" xmlns:s="http://sabredav.org/ns">
  <s:exception>Sabre\DAV\Exception\Forbidden</s:exception>
  <s:message/>
</d:error>

The error in the logs when this happens is always like this:

{
    "reqId": "1UhRDgTUW4Mc4UDd6DLM", 
    "level": 3, 
    "time": "2019-11-14T05:21:56+00:00", 
    "remoteAddr": "123.45.67.123", 
    "user": "TestUser", 
    "app": "PHP", 
    "method": "MOVE", 
    "url": "/remote.php/dav/uploads/TestUser/web-file-upload-7799eafe3dcfe3ec9bb691f16742b576-1573707497231/.file", 
    "message": "Error while sending QUERY packet. PID=83 at /var/www/html/3rdparty/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOConnection.php#107", 
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36", 
    "version": "16.0.6.1"
}

At first, we tried increasing wait_timeout from the default of 120 to 300 but that did not fix it on its own. We had to also set the database transaction_isolation to READ-COMMITTED instead of REPEATABLE-READ in conjunction with increasing the wait_timeout.

So, based on this, my assumption is either that:

  1. The linked code is not having the desired effect at all; OR
  2. The linked code does not affect all code paths that are involved in chunked uploads. So, it has an effect on some parts of Nextcloud but the code that's needed for chunked uploads errors out unless the global default is modified on the MySQL server because those code paths are inheriting the DB default; OR
  3. Chunked uploads are actually unaffected by this setting but changing it affects request timing just enough that we're getting lucky.~~

@kesselb
Copy link
Contributor

kesselb commented Nov 15, 2019

Thanks for sharing the log message. I have a guess ;) Probably the database connection is lost and we retry a reconnect. The transactionIsolation is set for new connections (if we create a new instance of the connection object) but not for a reconnected connection (we call connect again but do not create a new instance).

@GuyPaddock
Copy link
Author

GuyPaddock commented Nov 17, 2019

@kesselb I appreciate the PR... but it looks like there's actually one other log message I missed that may make my original log message a red herring.

I'm in the process of debugging it further... but it looks like the 403 might actually be coming from this error:

{
    "reqId": "yRypcPE7j3zummFkYoSb", 
    "level": 3, 
    "time": "2019-11-17T06:50:38+00:00", 
    "remoteAddr": "123.45.67.123", 
    "user": "TestUser", 
    "app": "PHP", 
    "method": "MOVE", 
    "url": "/remote.php/dav/uploads/TestUser/web-file-upload-5514056e9ca8e7927f2255f80a3085d1-1573971043550/.file", 
    "message": "rmdir(/var/www/html/data/TestUser/uploads/web-file-upload-5514056e9ca8e7927f2255f80a3085d1-1573971043550): Directory not empty at /var/www/html/lib/private/Files/Storage/Local.php#119", 
    "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36", 
    "version": "16.0.6.1"
}

So far, it looks like this comes from \OCA\DAV\Connector\Sabre\Directory::delete on line 314, but I still need to dig further. At first glance it looks like the chunked upload is trying to remove the source folder containing the chunks by using rmdir which only works for empty folders...

If this is the case, then the isolation level may or may not be the root cause that's impacting our chunked uploads.

@GuyPaddock GuyPaddock changed the title Automatically Set MySQL/MariaDB Session Transaction Isolation to "Read Committed" for Each Connection Automatically Set MySQL/MariaDB Session Transaction Isolation to "Read Committed" for Upon Reconnection Nov 18, 2019
@GuyPaddock
Copy link
Author

GuyPaddock commented Nov 18, 2019

@kesselb I rewrote this ticket to cover the bug you believe you've found. I'm writing up a separate ticket for what appears to be to blame for the issues we're now seeing with chunked uploads and 403 errors.

For verifying that the bug in this ticket exists, and verifying that your fix works, I would think the following sequence would work:

  1. Set the wait_timeout on the database server very short (10 secs).
  2. Have Nextcloud connect to the database.
  3. Execute a query on the connection to query the isolation level and confirm that it's set to READ COMMITTED; for example, SELECT @@SESSION.tx_isolation
  4. Sleep for 15 seconds.
  5. Execute the same query as in step 3 on the connection (forcing a reconnect), to confirm that it's still set to READ COMMITTED.

@GuyPaddock
Copy link
Author

Ok, so, the root cause for our most-recently chunked upload failures is coming from #17980. We thought it was directly related to the transaction isolation because we made the mistake of assuming that the issue would appear for all large uploads when in fact it only appears for uploads exactly greater than 620 MB. One of our test files is 617 MB and that happened to be the file we were testing with after adjusting the isolation level. The user who reported the issue was testing with a file that was 1.5 GB.

@ChristophWurst ChristophWurst added 4. to release Ready to be released and/or waiting for tests to finish and removed 0. Needs triage Pending check for reproducibility or if it fits our roadmap labels Feb 21, 2020
@ChristophWurst
Copy link
Member

Fix is at #18227

@simonwiles
Copy link

I found this issue trying to research whether/why I need to set transaction_isolation = READ-COMMITTED at the server level (which is inconvenient for other applications I'd like to share the same DB installation). The code at

parent::setTransactionIsolation(parent::TRANSACTION_READ_COMMITTED);
(linked above) seems to take care of it, but that line isn't present in the current master or the latest releases. Does anyone know why that is?

I assume there must be a reason why it can't be done this way? Is there any way I can use READ-COMMITTED just for NextCloud and leave REPEATABLE-READ as the global default? Grateful for any insight!

@kesselb
Copy link
Contributor

kesselb commented Oct 4, 2023

(linked above) seems to take care of it, but that line isn't present in the current master or the latest releases. Does anyone know why that is?

#18227

@simonwiles
Copy link

@kesselb Thank you -- I had read that, but the fact that the docs still state under "Database Requirements" that READ-COMMITTED is "currently required" and that "you need to configure the transaction isolation level" (and that a committer commented that "we" still set it that way!) gave me considerable pause.

I'll leave my global transaction isolation level set to REPEATABLE-READ, then, and let the NextCloud codebase take care of configuring its own db connections appropriately for its needs.

Thank you very much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4. to release Ready to be released and/or waiting for tests to finish enhancement
Projects
None yet
4 participants