-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SELECT after every EXEC / non-standard command #1988
Comments
The motivation here is about correctness, and our confidence/knowledge of what a particular command has done, and in particular: could they have changed database? I thought the transaction logic only applied in the case of failure, but I'm happy to be corrected. When using Lua, the caller can certainly change the connection database, so we need to treat that defensively. I "presume* the same is true of an ad-hoc command, whether inbuilt redis or a custom module. Things we could consider here:
Thoughts? Note: the impact of this command is usually negligible, since we don't await it - it just becomes part of the same packet as the command, and the server processes it trivially. If you are seeing mearusable impact, I'm not against looking more, though. |
Hi @mgravell thanks for getting back so quick
I can confirm it's definitely happening after a successful transaction: var db = muxer.GetDatabase();
for (var i = 0; i < 3; i++)
{
var transaction = db.CreateTransaction();
var t1 = transaction.StringSetAsync("foo", "bar");
var t2 = transaction.StringGetAsync("foo");
await Task.WhenAll(transaction.ExecuteAsync(), t1, t2);
Console.WriteLine(t2.Result);
} The above trivial example corresponds to the following in Redis:
You'll notice the SELECTs called after each EXEC (I think it's probably more accurate to say it's called BEFORE each successive MULTI). Again, I'm not certain that there's a valid reason to set the database to unknown after an EXEC, since as far as I can tell there's no API to call SELECT from within a transaction? Please correct me if I'm missing something?
This was definitely true at one point, but as you can see in that switch statement, it appears that there was a change in Redis that prompted you (or one of the library's other contributors - I can't readily tell from the blame) to circle around this for scripts, as apparently SELECT is no longer applied to the calling client within a script here's the pertinent section of the release notes. To your point, I can definitely see the need to be defensive with arbitrary commands. I'm not sure about trying to divine the presence of other databases from the info, since this can change on the fly. I suppose if something were added to the configuration of the multiplexer that told it there was only a single database that could work. But then I guess my next question is: Does it toss an exception if you try to select the non-default database? Having a command flag "hey, pinky-swear I'm not changing the database" seems the most correct way to handle this with minimal disruption. If that's acceptable to you, I'll open a PR to add it. Circling back to the EXEC, given what I mentioned above, can I correct that behavior in the same shot? thanks again! |
Ah yes, you've reminded me about that change to Lua, thanks; I'll try to
think about EXEC over the weekend - I *think* it should be OK in any
success case, but I believe the problem is that due to the multiplexed
nature of the connection, we don't usually wait for the result before
issuing the next command - which means *if* the user issued a select inside
the transaction, there's both possibilities: the select runs, and the
select doesn't run. Because of that, we need to be cautious. However! I
suspect we should be able to do something like tracking whether we've
issued a DB-impacting command during the transaction (which could be any of
SELECT, EVAL/EVALSHA on down-level servers, or unknown commands) - and then
we could only mark the DB as unreliable if any of those apply. That
probably means that we'd avoid the SELECT in almost all cases, without the
caller having to do anything special. Sound reasonable?
…On Fri, 11 Feb 2022, 14:59 Steve Lorello, ***@***.***> wrote:
Hi @mgravell <https://github.com/mgravell> thanks for getting back so
quick
I thought the transaction logic only applied in the case of failure
I can confirm it's definitely happening after a successful transaction:
var db = muxer.GetDatabase();for (var i = 0; i < 3; i++)
{
var transaction = db.CreateTransaction();
var t1 = transaction.StringSetAsync("foo", "bar");
var t2 = transaction.StringGetAsync("foo");
await Task.WhenAll(transaction.ExecuteAsync(), t1, t2);
Console.WriteLine(t2.Result);
}
The above trivial example corresponds to the following in Redis:
1644588492.645289 [0 172.17.0.1:36414] "MULTI"
1644588492.645645 [0 172.17.0.1:36414] "SET" "foo" "bar"
1644588492.645673 [0 172.17.0.1:36414] "GET" "foo"
1644588492.645674 [0 172.17.0.1:36414] "EXEC"
1644588492.650322 [0 172.17.0.1:36414] "SELECT" "0"
1644588492.650345 [0 172.17.0.1:36414] "MULTI"
1644588492.650351 [0 172.17.0.1:36414] "SET" "foo" "bar"
1644588492.650353 [0 172.17.0.1:36414] "GET" "foo"
1644588492.650354 [0 172.17.0.1:36414] "EXEC"
1644588492.651078 [0 172.17.0.1:36414] "SELECT" "0"
1644588492.651082 [0 172.17.0.1:36414] "MULTI"
1644588492.651086 [0 172.17.0.1:36414] "SET" "foo" "bar"
1644588492.651087 [0 172.17.0.1:36414] "GET" "foo"
1644588492.651092 [0 172.17.0.1:36414] "EXEC"
You'll notice the SELECTs called after each EXEC (I think it's probably
more accurate to say it's called BEFORE each successive MULTI). Again, I'm
not certain that there's a valid reason to set the database to unknown
after an EXEC, since as far as I can tell there's no API to call SELECT
from within a transaction? Please correct me if I'm missing something?
When using Lua, the caller can certainly change the connection database,
so we need to treat that defensively
This was definitely true at one point, but as you can see in that switch
statement
<https://github.com/StackExchange/StackExchange.Redis/blob/main/src/StackExchange.Redis/PhysicalBridge.cs#L1412>,
it appears that there was a change in Redis that prompted you (or one of
the library's other contributors - I can't readily tell from the blame) to
circle around this for scripts, as apparently SELECT is no longer applied
to the calling client within a script here's the pertinent section of the release
notes <https://github.com/redis/redis/blob/2.8/00-RELEASENOTES#L322>.
To your point, I can definitely see the need to be defensive with
arbitrary commands.
I'm not sure about trying to divine the presence of other databases from
the info, since this can change on the fly. I suppose if something were
added to the configuration of the multiplexer that told it there was only a
single database that could work. But then I guess my next question is: Does
it toss an exception if you try to select the non-default database?
Having a command flag "hey, pinky-swear I'm not changing the database"
seems the most correct way to handle this with minimal disruption.
If that's acceptable to you, I'll open a PR to add it.
Circling back to the EXEC, given what I mentioned above, can I correct
that behavior in the same shot?
thanks again!
—
Reply to this email directly, view it on GitHub
<#1988 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AAAEHMFJFFNHRQKSWGSMUK3U2UP4PANCNFSM5OCFUT2Q>
.
Triage notifications on the go with GitHub Mobile for iOS
<https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
or Android
<https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
(and yes, I think you're right in that the API won't give rise to direct
SELECT usage mid-transaction, so that just leaves Lua on down-level
servers, and unknown ad-hoc commands)
…On Fri, 11 Feb 2022, 16:23 Marc Gravell, ***@***.***> wrote:
Ah yes, you've reminded me about that change to Lua, thanks; I'll try to
think about EXEC over the weekend - I *think* it should be OK in any
success case, but I believe the problem is that due to the multiplexed
nature of the connection, we don't usually wait for the result before
issuing the next command - which means *if* the user issued a select inside
the transaction, there's both possibilities: the select runs, and the
select doesn't run. Because of that, we need to be cautious. However! I
suspect we should be able to do something like tracking whether we've
issued a DB-impacting command during the transaction (which could be any of
SELECT, EVAL/EVALSHA on down-level servers, or unknown commands) - and then
we could only mark the DB as unreliable if any of those apply. That
probably means that we'd avoid the SELECT in almost all cases, without the
caller having to do anything special. Sound reasonable?
On Fri, 11 Feb 2022, 14:59 Steve Lorello, ***@***.***>
wrote:
> Hi @mgravell <https://github.com/mgravell> thanks for getting back so
> quick
>
> I thought the transaction logic only applied in the case of failure
>
> I can confirm it's definitely happening after a successful transaction:
>
> var db = muxer.GetDatabase();for (var i = 0; i < 3; i++)
> {
> var transaction = db.CreateTransaction();
> var t1 = transaction.StringSetAsync("foo", "bar");
> var t2 = transaction.StringGetAsync("foo");
> await Task.WhenAll(transaction.ExecuteAsync(), t1, t2);
> Console.WriteLine(t2.Result);
> }
>
> The above trivial example corresponds to the following in Redis:
>
> 1644588492.645289 [0 172.17.0.1:36414] "MULTI"
> 1644588492.645645 [0 172.17.0.1:36414] "SET" "foo" "bar"
> 1644588492.645673 [0 172.17.0.1:36414] "GET" "foo"
> 1644588492.645674 [0 172.17.0.1:36414] "EXEC"
> 1644588492.650322 [0 172.17.0.1:36414] "SELECT" "0"
> 1644588492.650345 [0 172.17.0.1:36414] "MULTI"
> 1644588492.650351 [0 172.17.0.1:36414] "SET" "foo" "bar"
> 1644588492.650353 [0 172.17.0.1:36414] "GET" "foo"
> 1644588492.650354 [0 172.17.0.1:36414] "EXEC"
> 1644588492.651078 [0 172.17.0.1:36414] "SELECT" "0"
> 1644588492.651082 [0 172.17.0.1:36414] "MULTI"
> 1644588492.651086 [0 172.17.0.1:36414] "SET" "foo" "bar"
> 1644588492.651087 [0 172.17.0.1:36414] "GET" "foo"
> 1644588492.651092 [0 172.17.0.1:36414] "EXEC"
>
> You'll notice the SELECTs called after each EXEC (I think it's probably
> more accurate to say it's called BEFORE each successive MULTI). Again, I'm
> not certain that there's a valid reason to set the database to unknown
> after an EXEC, since as far as I can tell there's no API to call SELECT
> from within a transaction? Please correct me if I'm missing something?
>
> When using Lua, the caller can certainly change the connection database,
> so we need to treat that defensively
>
> This was definitely true at one point, but as you can see in that switch
> statement
> <https://github.com/StackExchange/StackExchange.Redis/blob/main/src/StackExchange.Redis/PhysicalBridge.cs#L1412>,
> it appears that there was a change in Redis that prompted you (or one of
> the library's other contributors - I can't readily tell from the blame) to
> circle around this for scripts, as apparently SELECT is no longer applied
> to the calling client within a script here's the pertinent section of the release
> notes <https://github.com/redis/redis/blob/2.8/00-RELEASENOTES#L322>.
>
> To your point, I can definitely see the need to be defensive with
> arbitrary commands.
>
> I'm not sure about trying to divine the presence of other databases from
> the info, since this can change on the fly. I suppose if something were
> added to the configuration of the multiplexer that told it there was only a
> single database that could work. But then I guess my next question is: Does
> it toss an exception if you try to select the non-default database?
>
> Having a command flag "hey, pinky-swear I'm not changing the database"
> seems the most correct way to handle this with minimal disruption.
>
> If that's acceptable to you, I'll open a PR to add it.
>
> Circling back to the EXEC, given what I mentioned above, can I correct
> that behavior in the same shot?
>
> thanks again!
>
> —
> Reply to this email directly, view it on GitHub
> <#1988 (comment)>,
> or unsubscribe
> <https://github.com/notifications/unsubscribe-auth/AAAEHMFJFFNHRQKSWGSMUK3U2UP4PANCNFSM5OCFUT2Q>
> .
> Triage notifications on the go with GitHub Mobile for iOS
> <https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675>
> or Android
> <https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub>.
>
> You are receiving this because you were mentioned.Message ID:
> ***@***.***>
>
|
Cool, so given all this how does the following sound:
LMK what you think, if this all sounds good I'll open a PR for it. |
The discard/exec is a bit more complex than that; we need to know whether any command in the transaction might have changed db - essentially just an accumulator of the same logic, on a command-by-command basis, but: yes |
Question: what's stopping someone from doing an Overall question I still have: is the defensiveness an issue in terms of performance or anything else today? If we're getting more complicated or clever, it would be helpful to know what we're improving in the exchange. |
That would be the "we need to know whether any command in the transaction might have changed db" part |
@mgravell k, same page there - I'm wondering if that's worth doing and adding complexity for, back to...what problem are we solving? @slorello89 Could you elaborate on if this is causing an issue today or just looks odd? |
True. From a performance standpoint I'd be surprised if there was any significant impact here (reasons summarised below), but performance analysis based on data is often surprising. Do we have any such data? Reasons:
|
@mgravell @NickCraver , so, this is actually surprisingly a bit of an issue on Redis Enterprise. The proxy routes all the incoming SELECTS to slot-0 by default (I'm not entirely sure why this is the case, I'm guessing this is because SELECT isn't supported), so the impact on the first shard can be significant if you are sending lots of transactions/ad-hoc commands (as it's getting hit by everything in the cluster). We can work around this by disabling SELECTs at the proxy level, but since these commands aren't doing anything to alter the selected database the SELECT probably shouldn't be sent in the first place. |
Is Redis Enterprise cluster in this case? (Sorry I'm not familiar with all the hosting models going on) If the issue is sending the command at all in a cluster, that's a much simpler change and totally agree we should tweak that (which also has no consequences to reason about). I even have a new helper method for this as a result of the Envoy work :) |
Looked at where this happens - ultimately if |
if we can detect this scenario: no problem automatically disabling it; as a temporary workaround, I suspect if the user adds (Edit: or the client will explode in a shower of sparks and smoke; it'll be interesting to find out!) |
+1 - wondering if we can detect it and help out, with more details I can poke there (been in the area) - it looks like we think it's Standalone atm ( |
Aye, Clusters in Redis Enterprise have a proxy that accepts the incoming commands, so it looks sort of like a stand alone to the client. That's probably why the multiplexer thinks it can have multiple databases. |
I don't have redis enterprise to test against; any chance |
@mgravell @NickCraver Sorry for the delay, I needed to do a bit of digging and needed to speak to some people on our end. I've not been able to find a way to differentiate a Redis Enterprise instance with a single proxy (basically the transparent cluster I was describing earlier) from a stand-alone redis instance. The Also, with the single-proxy the Slight tangent for some extra background (this isn't really pertinent to the issue), that's not actually the end of the story. There are two modes for Cluster in Redis Enterprise. In addition to a single proxy cluster (the similar to stand-alone that I've been referring to up to this point) there's also an OSS Cluster API flavor. In which rather than having a single proxy, all Nodes (machines, not Redis instances) have a proxy, this is much less common, but in this case, the cluster API is supported, and therefore Back on track. To Marc's point about adding
And any further attempts to call any command will result in a timeout (something is probably deadlocked in the command queue?) Interestingly enough it looks like the third command in the catch fragment is making it through, just its result isn't being read back cleanly.
Code for that example (this is run in a console app so perhaps a context issue), you can replicate with the rejson image var muxer = await ConnectionMultiplexer.ConnectAsync("localhost:6379,$SELECT=", Console.Out);
var db = muxer.GetDatabase();
try
{
await db.ExecuteAsync("JSON.SET", "foo", ".", "{\"bar\":\"baz\"}"); // Goes through, no problem
await db.StringSetAsync("bar", "baz"); // dies on the Select
}
catch (Exception ex)
{
Console.WriteLine(ex);
await db.StringSetAsync("bar", "baz"); // Makes it through to Redis, but fails with a timeout
} In the PhysicalConnection's GetSelectDatabaseCommand there's no logic to interrogate the CommandMap in the multiplexer for the presence of thanks! |
I was looking at this but in a long session this week and didn't writeup: indeed there's no check here, but if we add one it needs to be as efficient as possible since it's inside the write lock. Most of the throughput performance that we're looking at is "how long is the lock taken?" and any extension of that is a net drain on throughput. I'll look at making |
@NickCraver, if a config-level change ends up being the answer. Perhaps making public bool HasDatabases => serverType == ServerType.Standalone; With a simple property: public bool HasDatabases { get; } Then when the ServerEndpoint is initialized, interrogate the config for either a disabled SELECT command or perhaps a config setting HasDatabases = config.CommandMap.IsAvailable(RedisCommand.SELECT) && config.HasDatabases; Then in the ServerType property's setter, (this is what gets called after a successful call to HasDatabase = HasDatabases && value == ServerType.Standalone Unless there are some other externalities I'm missing? |
@slorello89 This week is crazier than most so very slow here, likely won't dig in until next week and recharged but yes we can probably memoize it in some way, though I'm not sure if a) that's the correct location (it's not per-server that it's disabled, for example), and b) how that'll interact with config changes at runtime. Something in play here is we intend to add a method soon to allow config changes at runtime for example to change authentication passwords for next reconnect in rolling environments and such. How that plans with memoized things based on config downstream is a factor. Is there any chance you could post a full |
Hi @NickCraver, figured I'd wait a few days to get back to you as you seemed a bit swamped. What follows is the info from a RedisEnterprise instance with a similar configuration to the one we are talking about. Unfortunately, there's not much to go off of from the info.
|
@slorello89 Thanks for that - agree we can't do much here. I have some bigger fish in progress about 2.5 but haven't forgotten about this - we'll make the |
This resolves #1988 by respecting when the `SELECT` command has been disabled. It's memoized which seems bit extreme here but that's because we're inside the lock and every op matters. Measured locally to ensure no regressions.
Fix works, thanks @NickCraver & @mgravell :) |
@slorello89 awesome, thanks for the update! |
👋 I was debugging an issue today, a customer saw an abnormal number of SELECTS being sent to their Redis instance, which was pretty odd given they weren't passing any arguments into
GetDatabase
. After a bit of digging, I found this bit of code from the libraries earliest days, which pointed me to the fact that whenever they execute a transaction, aSELECT
is called immediately afterward.I then noticed that above that bit, the
UNKNOWN
command also falls through to the same condition. This means that if you execute any Redis Command via theExecute
/ExecuteAsync
API (which is the pattern in NRediSearch, Redis.OM, NReJSON, etc. . .) that the multiplexer will immediately callSELECT
afterward.I suppose my question is in 2 parts
1: What was the origination of this behavior for
EXEC
? My guess would be that it's because if you call aSELECT
in the middle of a transaction that it will change the client's selected database. But I don't think there's a way in which you can call an arbitrary command on a transaction is there? My other guess would be that since you are loaning a copy of the database to the transaction you need to be double sure that you set the selected database is correct afterward, but for some reason that doesn't ring correctly to me.2: Would it be possible to provide either a way to extend the
RedisCommand
enum to allow expanded command sets, or alternatively perhaps provide a configuration item to preventUNKNOWN
commands from encountering this?I'd of course be happy to open any PRs required myself.
Thanks!
The text was updated successfully, but these errors were encountered: