kill queries on timeout #791

Sovianum · 2018-05-11T16:55:45Z

Description

Added mechanism to kill orphan queries after context timeout.
Used approach is similar to one used in postgres driver: if context is cancelled, new connection is opened and than KILL query is executed.
Query is killed using id of corresponding connection which is saved on initial handshake.

Related issues: #731, #496

Checklist

Code compiles correctly
Created tests which fail without the change (if possible)
All tests passing
Extended the README / documentation, if necessary
Added myself / the copyright holder to the AUTHORS file

methane · 2018-05-12T00:24:34Z

I think this should be opt-in option, because it creates unexpected connections and it won't work in some environment.

And I want to wait adding this feature until next version (v1.4) is tagged.

methane · 2018-05-12T00:28:54Z

connection.go

+}
+
+func (mc *mysqlConn) kill() error {
+	conn, err := mc.d.Open(mc.dsn)


Don't creating new connection pool! Create mysqlConn directly.
Addtionally, opening new connection should be have short timeout.

I agree about timeout but didn't catch your idea about connection pool. Here I'm using MySQLDriver.Open method, which returns driver.Conn and error (underlying type of driver.Conn is *mysqlConn)

I'm sorry, I was wrong about the pool.

#705 should be used instead of Driver.Open.

Ok, will use after it merged to master

methane · 2018-05-12T00:41:58Z

Killing query is difficult than you think. Apart from load balanced MySQL server, there are many race scenarios.

For example:

The query is slow... Timeout happened
Start opening connection for sending KILL query
The query is finished and MySQL server sent normal response
New connection for KILL query is opened / or timed out.

In this scenario, what should we do?
Query is finished successful, it's just bit slow. We can get normal result of the query.

Sovianum · 2018-05-15T11:42:13Z

Do you mean that timeout shout be used on the query execution stage, but not while response is being sent back to client?

connection_go18.go

methane · 2018-05-15T12:35:20Z

Do you mean that timeout shout be used on the query execution stage, but not while response is being sent back to client?

I just meant there are many race conditions we should care about before merging this.

packets.go

methane · 2018-05-15T13:47:17Z

Even though MySQL is not clustered, connection can be not unique.
For example, server closed the connection and client failed to receive FIN packet.
When timeout happened, the connection id (=thread id) may be used for another connection.

Additionally, when MySQL is under very heavy state, many concurrent queries can be timed out at same time.
In this situation, DB.SetMaxConns(N) may use 2*N connections. (1/2 of them are for kill queries)

So I think (1) mc.kill() should be have very short timeout, and (2) this should be opt-in feature.

Sovianum · 2018-05-15T14:09:25Z

(1) 50 ms?
(2) Ok, I will add extra option to config.

Sovianum · 2018-05-15T17:19:05Z

Made this feature optional and added tests to check that query is not killed if flag is not activated.
You said, you don't want to add this feature until version 1.4, but do you have any forecast when it will be tagged? I'm asking this since we extensively use this driver in production and periodically find our database loaded with long orphaned queries. So this feature is really important for us.

methane · 2018-06-13T09:04:10Z

@Sovianum @ceshihao
This feature is very difficult to review (many possible race conditions which can be very difficult to debug. I need to check whole code, not only diff). So I can't estimate when we can ship this.

I recommend using timeout by optimizer-hint. It's much safer than this.

pjebs · 2019-03-14T00:53:08Z

See my solution: https://medium.com/@rocketlaunchr.cloud/canceling-mysql-in-go-827ed8f83b30

methane · 2019-03-14T01:27:17Z

@pjebs Thanks for writing it. My comments:

It can't be used for load balanced environment. It may even kill different query running on another DB instance, because thread id is not UUID.
When you need only timeout for heavy select query, MAX_EXECUTION_TIME optimizer hint is better than KILL QUERY on timeout.

pjebs · 2019-03-16T13:02:09Z

I've improved my design for reverse proxy scenarios for mysql using SELECT @@global.server_uuid to identify the correct server. When it's time to send the kill signal, I first check if the server is correct.

Unfortunately I wasn't able to find a reliable way to get a unique server identifier for mariadb.

@methane Thanks for the feedback.

RafeKettler · 2019-07-01T15:21:44Z

@Sovianum @ceshihao
This feature is very difficult to review (many possible race conditions which can be very difficult to debug. I need to check whole code, not only diff). So I can't estimate when we can ship this.

I recommend using timeout by optimizer-hint. It's much safer than this.

@methane is there a path to getting this PR merged from here? While I'm not sure if @Sovianum still needs this feature or has found a workaround, I could really use this feature on an opt-in basis. The MAX_EXECUTION_TIME optimizer hint you refer to doesn't work for my use case -- I'd like to be able to cancel queries other than a SELECT with a timeout. I am currently using KILL QUERY as a workaround, but it is pretty painful to have to add this logic everywhere I want to ensure a query is canceled after it times out.

pjebs · 2019-07-01T22:58:59Z

@RafeKettler https://github.com/rocketlaunchr/mysql-go

methane · 2019-07-01T23:44:06Z

@methane is there a path to getting this PR merged from here?

At least, new Connector interface should be used. At most one connection should be used for cancelling many queries.

pjebs · 2019-07-02T01:35:32Z

It definitely should be optional because you need to waste time querying for the connection id.
Most production queries would be milliseconds in duration, so cancellation itself is "overkill" or the query time to fetch the connection id is "overkill"

jpcope · 2019-09-12T09:10:08Z

I would love to see this functionality added to mysql.

I would not mind taking a stab at it. Circuit breaking is an important part of reliable, timely delivery in a microservice world so data stores remain highly available.

Is now the right time to add this functionality or are there other high priority refactors in the works that could lead to PRs being closed?

Sovianum force-pushed the kill-queries-on-timeout branch from 2c780b2 to 959332b Compare May 11, 2018 17:03

kill queries on timeout

9ed2daa

Sovianum force-pushed the kill-queries-on-timeout branch from 959332b to 9ed2daa Compare May 11, 2018 17:10

methane reviewed May 12, 2018

View reviewed changes

methane reviewed May 15, 2018

View reviewed changes

connection_go18.go Outdated Show resolved Hide resolved

methane reviewed May 15, 2018

View reviewed changes

packets.go Outdated Show resolved Hide resolved

Sovianum force-pushed the kill-queries-on-timeout branch 2 times, most recently from aea431d to 1251784 Compare May 15, 2018 17:01

review 1 fix

eabce5d

Sovianum force-pushed the kill-queries-on-timeout branch from 1251784 to eabce5d Compare May 15, 2018 17:05

julienschmidt added this to the v1.5.0 milestone May 20, 2018

This comment has been minimized.

Sign in to view

julienschmidt mentioned this pull request Oct 8, 2018

Weird behaviour with query cancellation using context #863

Closed

hanchuanchuan mentioned this pull request Apr 4, 2019

hope API for ConnectionID() #946

Closed

julienschmidt modified the milestones: v1.5.0, v1.6.0 Apr 24, 2019

Sovianum closed this Jul 22, 2019

jpcope mentioned this pull request Sep 12, 2019

Does context cancellation cancel the underlying query? #731

Closed

pjebs mentioned this pull request Oct 12, 2019

Feature Request: Add mysql context cancellation ent/ent#87

Closed

cia-rana mentioned this pull request Jul 2, 2022

RDBのクエリのキャンセルについて oreilly-japan/practical-go-programming#38

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

kill queries on timeout #791

kill queries on timeout #791

Sovianum commented May 11, 2018

methane commented May 12, 2018

methane May 12, 2018

Sovianum May 15, 2018

methane May 15, 2018

methane May 15, 2018

Sovianum May 15, 2018

methane commented May 12, 2018

Sovianum commented May 15, 2018

methane commented May 15, 2018

methane commented May 15, 2018

Sovianum commented May 15, 2018 •

edited

Loading

Sovianum commented May 15, 2018

This comment has been minimized.

methane commented Jun 13, 2018

pjebs commented Mar 14, 2019

methane commented Mar 14, 2019

pjebs commented Mar 16, 2019 •

edited

Loading

RafeKettler commented Jul 1, 2019

pjebs commented Jul 1, 2019

methane commented Jul 1, 2019

pjebs commented Jul 2, 2019

jpcope commented Sep 12, 2019

kill queries on timeout #791

kill queries on timeout #791

Conversation

Sovianum commented May 11, 2018

Description

Checklist

methane commented May 12, 2018

methane May 12, 2018

Choose a reason for hiding this comment

Sovianum May 15, 2018

Choose a reason for hiding this comment

methane May 15, 2018

Choose a reason for hiding this comment

methane May 15, 2018

Choose a reason for hiding this comment

Sovianum May 15, 2018

Choose a reason for hiding this comment

methane commented May 12, 2018

Sovianum commented May 15, 2018

methane commented May 15, 2018

methane commented May 15, 2018

Sovianum commented May 15, 2018 • edited Loading

Sovianum commented May 15, 2018

This comment has been minimized.

methane commented Jun 13, 2018

pjebs commented Mar 14, 2019

methane commented Mar 14, 2019

pjebs commented Mar 16, 2019 • edited Loading

RafeKettler commented Jul 1, 2019

pjebs commented Jul 1, 2019

methane commented Jul 1, 2019

pjebs commented Jul 2, 2019

jpcope commented Sep 12, 2019

Sovianum commented May 15, 2018 •

edited

Loading

pjebs commented Mar 16, 2019 •

edited

Loading