[C++][FlightRPC] FlightClient.authenticate is not thread safe #41919
Comments
Likely related: #38565
This will avoid the crash, but it seems that it may still use the wrong authentication handler.
By "this" I assume you mean the suggestion I made in the linked issue to change SetToken to take a shared_ptr argument by value instead of a raw pointer.
In what my team is trying to do, there is no "wrong" authentication handler. One thread is trying to use the currently registered handler, and another thread is trying to replace that handler with a new one, which may or may not use the same authentication token. Whichever handler is used, the token is valid: our server rotates authentication tokens after a period of time and keeps the old token valid for a while after rotation, because there is no way to prevent several RPCs in flight from different threads from racing with each other unless the client stops sending RPCs altogether before getting a new token, which it doesn't in our case.
More generally, and independently of what my team is trying to do: if some threads are using the flight client for operations like DoPut and DoGet while another thread is changing the authentication handler, there is no way to guarantee an ordering unless the user provides their own ordering protection between those threads. The idea of a "right" authentication handler only makes sense if a particular DoGet, DoPut, or other flight operation was issued with an expectation of which handler would be used. For that expectation to hold, the user has to do their own synchronization to enforce the desired ordering, e.g., stop issuing DoGets, DoPuts, and other operations, change the handler, and then resume. As far as I can tell, there is nothing the flight client itself can or should do about that.
What I think the flight client can do is ensure that a pointer that was already deleted is not used. Since things running on multiple threads can happen in any order, if one thread is changing the authentication handler and another is calling DoPut, the thread calling DoPut can't have an expectation about whether the old handler or the new handler will be used. What it can expect is that whichever one is used will be used correctly (as opposed to crashing).
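One way to get the guarantee described above (either the old or the new handler is used, and never a deleted one) is for each call to take its own shared_ptr copy of the current handler before using it. The sketch below is an illustration under assumptions, not Arrow's code and not necessarily what the eventual fix does: SketchClient, TokenHandler, DoCall, and CountUnexpectedTokens are hypothetical names, and the mutex is an added assumption, because copying a shared_ptr object while another thread assigns to that same object is itself a data race in C++.

```cpp
#include <atomic>
#include <memory>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an auth handler; not Arrow's ClientAuthHandler.
struct TokenHandler {
  explicit TokenHandler(std::string t) : token(std::move(t)) {}
  std::string token;
};

// Hypothetical client that illustrates only the ownership pattern.
class SketchClient {
 public:
  // Replace the current handler. The previous handler is destroyed only
  // once the last in-flight call has released its own copy.
  void Authenticate(std::shared_ptr<TokenHandler> handler) {
    std::lock_guard<std::mutex> lock(mu_);
    handler_ = std::move(handler);
  }

  // Simulated RPC: copy the shared_ptr under the lock, then use the copy
  // without holding the lock. The copy keeps the handler alive.
  std::string DoCall() const {
    std::shared_ptr<TokenHandler> snapshot;
    {
      std::lock_guard<std::mutex> lock(mu_);
      snapshot = handler_;
    }
    return snapshot ? snapshot->token : std::string();
  }

 private:
  mutable std::mutex mu_;
  std::shared_ptr<TokenHandler> handler_;
};

// Runs caller threads concurrently with a thread that keeps swapping the
// handler. Returns how many calls saw something other than "old" or "new".
int CountUnexpectedTokens(int num_callers, int calls_per_thread) {
  SketchClient client;
  client.Authenticate(std::make_shared<TokenHandler>("old"));
  std::atomic<int> unexpected{0};

  std::vector<std::thread> callers;
  for (int i = 0; i < num_callers; ++i) {
    callers.emplace_back([&client, &unexpected, calls_per_thread] {
      for (int j = 0; j < calls_per_thread; ++j) {
        const std::string token = client.DoCall();
        if (token != "old" && token != "new") ++unexpected;
      }
    });
  }

  std::thread rotator([&client] {
    for (int k = 0; k < 100; ++k) {
      client.Authenticate(
          std::make_shared<TokenHandler>(k % 2 == 0 ? "new" : "old"));
    }
  });

  for (auto& t : callers) t.join();
  rotator.join();
  return unexpected.load();
}
```

In this sketch a caller never learns which handler it got, only that the one it got stays alive for the duration of the call. A caller that needs a specific handler for a specific operation still has to order its own Authenticate and operation calls.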
Right.
If a server does that and a client is used by only one user, it doesn't use the wrong authentication, as you said. But, in general, I don't object to this approach because it's better than crashing. Could you open a PR? I think that we still need to update our documentation even with this change, though: #38565 (comment)
Thanks. Done: #41927
Describe the bug, including details regarding any error messages, version, and platform.
This is the Cython (pyx) implementation of flight_client.authenticate:
arrow/python/pyarrow/_flight.pyx, line 1440 in 6a28035
Note that the object stored inside the handler object is returned by to_handler, defined later in the same file:
arrow/python/pyarrow/_flight.pyx, line 2501 in 6a28035
The call to Authenticate on _flight.pyx line 1461 goes to grpc_client.cc line 860:
arrow/cpp/src/arrow/flight/transport/grpc/grpc_client.cc, line 860 in 6a28035
Many calls in the same file use the auth_handler_ member of the struct:
arrow/cpp/src/arrow/flight/transport/grpc/grpc_client.cc, line 1078 in 6a28035
That member is the one being assigned on line 862 above. One such call is DoPut:
arrow/cpp/src/arrow/flight/transport/grpc/grpc_client.cc, line 1002 in 6a28035
SetToken in the same file:
arrow/cpp/src/arrow/flight/transport/grpc/grpc_client.cc, line 87 in 6a28035
There is a race between calling auth_handler_.get() to get a raw pointer out of the auth_handler_ shared_ptr and using it inside SetToken, and another thread changing the value of auth_handler_ via its assignment operator in Authenticate, which can trigger the deletion of the previously held pointer value. That deletion can happen in another thread after the call to auth_handler_.get() that obtains the raw pointer value and before SetToken uses that raw pointer value.
My team builds a server and client libraries based on flight. One of our customers ran into an issue while using hundreds of concurrent sessions to our service. Our client library was calling FlightClient.authenticate from a timer once every 5 minutes; doing that concurrently with hundreds of calls to DoPut triggered the problem. We were able to reproduce the problem by artificially increasing the frequency of calls to FlightClient.authenticate to once every 3 seconds; we saw the same problem with either concurrent DoPut or DoGet. We saw the problem on pyarrow 16.0.0 on Ubuntu Linux 22.04.
More details about the symptoms we saw here: deephaven/deephaven-core#5489
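To make the lifetime problem in this report concrete, here is a minimal single-threaded C++ sketch that replays the problematic interleaving deterministically. The Handler type and both function names are hypothetical and are not taken from Arrow's source. The raw pointer is never dereferenced, since doing so after the reassignment would be undefined behavior; a std::weak_ptr is used only to observe whether the old object is still alive.

```cpp
#include <memory>

// Hypothetical handler type; stands in for whatever auth_handler_ points to.
struct Handler {
  explicit Handler(int i) : id(i) {}
  int id;
};

// Replays the racy order of events with a raw pointer:
//   1. "DoPut" obtains a raw pointer via get()
//   2. "Authenticate" reassigns the owning shared_ptr
//   3. "SetToken" would now use the raw pointer
// Returns whether the old object is still alive at step 3.
bool AliveAfterReassignWithRawPointer() {
  std::shared_ptr<Handler> owner = std::make_shared<Handler>(1);
  std::weak_ptr<Handler> watch = owner;  // observes lifetime only

  Handler* raw = owner.get();            // step 1
  (void)raw;                             // never dereferenced on purpose

  owner = std::make_shared<Handler>(2);  // step 2: old object is deleted

  return !watch.expired();               // step 3: false, raw is dangling
}

// Same order of events, but step 1 takes a shared_ptr copy instead.
bool AliveAfterReassignWithSharedCopy() {
  std::shared_ptr<Handler> owner = std::make_shared<Handler>(1);
  std::weak_ptr<Handler> watch = owner;

  std::shared_ptr<Handler> copy = owner;  // step 1: shares ownership

  owner = std::make_shared<Handler>(2);   // step 2: old object survives

  // step 3: the copy still points at the old, live handler
  return !watch.expired() && copy->id == 1;
}
```

In the multi-threaded case the same two steps run on different threads, so the copy in step 1 and the assignment in step 2 would additionally need to be synchronized with each other.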
Component(s)
FlightRPC