Extreme Latency when Using Google Cloud Functions and Cloud Datastore #2374
Comments
Can you see if subsequent calls also have a lag? The first request has to sort out the authentication details, so it's known to be slower. |
If I call the datastore API within the same function call, I do see a reduction in subsequent call time -
I've also discovered that using a key file (i.e. creating a service account and specifying keyFilename) decreases the first call time from ~1.4 seconds to 0.85 s. So it looks like your suggestion that it's related to first-request auth is correct. However, with Cloud Functions the penalty is incurred on each HTTP invocation, so the overhead is still very high (i.e. 0.8 s per HTTP call). |
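To separate the one-time auth cost from the per-request cost, it can help to time several sequential calls within one invocation. The following is only a sketch: `makeFakeLookup` is a hypothetical stand-in that simulates a client paying a one-time auth delay, and the delays are made-up values, not measurements from this thread.

```javascript
const { performance } = require('perf_hooks');

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for a Datastore call: the first request pays a
// one-time "auth" delay, later requests only pay the request delay.
function makeFakeLookup(authCostMs, requestCostMs = 5) {
  let authenticated = false;
  return async function lookup() {
    if (!authenticated) {
      await sleep(authCostMs); // simulated one-time token fetch
      authenticated = true;
    }
    await sleep(requestCostMs); // simulated request round trip
    return { ok: true };
  };
}

// Runs `fn` sequentially `n` times and returns each call's duration in ms.
async function timeCalls(fn, n) {
  const durations = [];
  for (let i = 0; i < n; i += 1) {
    const start = performance.now();
    await fn();
    durations.push(performance.now() - start);
  }
  return durations;
}
```

In a real function, `timeCalls` would be given a closure around the actual client call instead of the fake lookup. If the first duration is much larger than the rest, the overhead is most likely in setup rather than in the query itself.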
Yes, unfortunately. I'll close the issue, as we can't do anything within the client library. If any ideas come to mind for how this can be addressed, feel free to discuss. |
@stephenplusplus - We are running into the same issue. Datastore and Functions seem like the perfect combination, paying only a per-request cost. However, the latency is much too high to be usable in the real world. As was mentioned, it is related to the auth. This problem must be solvable, as other similar systems such as Lambda + DynamoDB don't run into this issue. It seems unlikely to be fixable in the Node client; it probably needs some setup on the function/lambda side so that it is 'pre-authenticated' (AWS IAM roles?). Who would be the correct team/repo to work on this issue? |
@ro-savage we were/are using App Engine with Datastore. We also ran into big issues with the latency of Datastore. Would love to hear some updates from @google-admin about this issue. |
A few weeks into development, we're finding Datastore to not be a viable solution for use in Functions due to the consistently poor performance on even very simple queries. 4000ms on first (authenticating) request, and then 800-1200ms afterwards. |
We are also experiencing 60ms-200ms latencies on a lookup by key (runQuery is even worse) from GAE Flexible in node. |
Also reported in #2727 @lukesneeringer this seems to be a common problem. The most recent report shows 10 seconds spent on a query that should take in the tens of milliseconds. |
Thank you @stephenplusplus. I can confirm I'm having this issue with the following environment:
Additionally, I'm noticing the issue on Cloud Functions with the same node and NPM versions. |
If this is something that is fixed by switching to the gRPC transport we use elsewhere, then this should be a reasonably (~week or so) quick fix. |
We also have the same high latency issues mentioned by @ahume with CloudFunctions and Datastore node.js client v1.1.0 |
I guess if you convert to gRPC then when used in a CloudFunction the Datastore comms will start having this problem like Spanner and PubSub #2438. If that happens then at least the problem will be more visible. |
I get a significant reduction in latency in calls from CloudFunctions to Datastore by embedding a service account. For warm functions, latency was previously several seconds (mostly around 2-3 seconds but sometimes much higher) and is now mostly 1 second with the occasional 2 seconds. These calls are a mixture of CRUD operations. FYI I am using an AppEngine cron to trigger a job every minute that hits all functions several times in order to keep them warm. We only have indexes on a small subset of attributes in the Datastore entities. We have around 60 functions in all. Every function has a health mode where it immediately returns instead of calling Datastore, and the latency of the functions when operating in this mode is around 200-400ms with the occasional 500-600ms. Old code (without the 'at' prefix): New code (without the 'at' prefix): The service account is encrypted with GCP KMS, so we load it and cache it prior to the function running, basically wrapping the entry point in a promise. |
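The "load it and cache it, wrapping the entry point in a promise" approach might look roughly like the sketch below. This is not the commenter's code: `loadClient` is a hypothetical placeholder for decrypting the key and constructing the client, and the `req.health` flag is an assumed way of signalling health mode.

```javascript
// Module scope survives between invocations on a warm instance, so the
// expensive setup runs at most once per instance.
let clientPromise = null;
let loadCount = 0; // only here to show how often setup actually runs

// Hypothetical loader: stands in for decrypting the service account key
// and building a real client. Returns a fake client for illustration.
async function loadClient() {
  loadCount += 1;
  return { get: async (key) => ({ key, value: 'example' }) };
}

// Returns the cached promise, starting the load on first use.
function getClient() {
  if (clientPromise === null) {
    clientPromise = loadClient().catch((err) => {
      clientPromise = null; // do not cache a failure; allow a retry
      throw err;
    });
  }
  return clientPromise;
}

// Wraps a handler so it receives the cached client. Health checks
// (assumed to arrive as req.health) return before any setup happens.
function withClient(handler) {
  return async (req) => {
    if (req && req.health) {
      return { status: 'ok' };
    }
    const client = await getClient();
    return handler(client, req);
  };
}

const readEntity = withClient((client, req) => client.get(req.key));
```

Caching the promise rather than the resolved client means concurrent first requests share one load instead of each starting their own.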
Does anyone know what the expected latency of Datastore is? We are considering moving to Go and AppEngine (which we used before with AWS and DynamoDB) or will that have the same issues? Another person (@ignaciovazquez) in this thread mentioned that AppEngine and Datastore also have issues but perhaps that is just the Node.js lib? Would it be better for us to look at using CloudSQL if we want low latency? |
Today I deployed our code to app engine flex using the same node.js version used by gcp functions. In both cases (same code) auth is an embedded service account decrypted (and cached) using GCP KMS. Pings average around 250ms compared to 350ms in functions. However the latency is more erratic from functions with a % of responses falling into the 500-1000ms window. So when not calling out to other integrations, functions are only slightly slower but more erratic. When the code also reads from Datastore there is a more pronounced difference in mean latency between app engine 589ms and functions 924ms. Function latency is also much more erratic e.g. a % of requests >> 1000ms. Data is attached. |
Just a side note to maintainers / Google. We raised this issue in a few places, including in person at Google Sydney, and pretty much were told there is no solution and no ETA. We ended up leaving Google Cloud and moving to AWS. Not only was it much faster, but it costs significantly less because all our functions run in under 100ms on AWS with Dynamo compared to multiple hundreds of ms (sometimes 1-2s) with Google Cloud. This issue is one of the things that kills the ability for people to use Cloud Functions and Datastore together on a production app. |
Thanks @ro-savage, I was coming to the same conclusion, and it is not just Functions: latency when using AppEngine Flex and Node.js (but not Go) is not that great either. To be fair, Datastore and DynamoDB are not drop-in replacements, because Datastore has transactions, which are occasionally very useful and help keep the code clean because you do not need to cope with inconsistent data where that can happen. DynamoDB is very fast since it's basically key-value, and it has proper support for updating documents, unlike Datastore, where an update replaces the entire entity from the point of view of the developer. On the other hand, Datastore supports many more query indexes. I have used both extensively and would prefer Datastore if it were faster. Lambda is of course far ahead of Functions, which is unsurprising given the latter is beta. AWS support is incredible; I get AWS engineers phoning me to ask how they can help out. Firebase hosting, auth etc. are great though, much better than CloudFront, S3 and Cognito. So both clouds have strengths, with AWS being flexible, more work and fast, and GCP being easy to use, less ops overhead and fast for most things too. |
Just did the same tests but using App Engine Flex with Go 1.9.2 instead of Node.js (App Engine Flex and Functions) and latency for pings average around 240ms and reads 390ms. Both the pings and the read latency are much less erratic compared to Node.js (App Engine Flex and Functions). We are the other side of the world from the infra (but still on fiber) so it's the delta between the ping and read that is important not the absolute number. So we will stick with Datastore but just use Go and App Engine Flex. FYI when using Node.js it is easy to support both Functions and Node.js App Engine Flex as deployment targets, if it wasn't I wouldn't have done this testing in the first place. GCP NoOps is very addictive... Data is attached. |
I'm marking this as
We are almost switched over to the GAPIC API. We are working on the final steps of splitting the Datastore code out from this repo and into its own. After that, it would be great if anyone with patience remaining could test it out for us. If it doesn't have any effect, we will need to make sure this feedback gets noticed by the Cloud Functions team and we start a proper search for the problem. |
Happy to test when you're ready |
Thanks. A few things I've found that might be relevant findings to help diagnose -
|
We use --memory=1024MB in order to get a faster machine (as recommended in the docs). We haven't tried any higher memory settings. We see similar behaviour to @richardowright, and it's important to note that the mean latency is not the only problem; some requests can still take multiple seconds even when auth is hardcoded to give a lower mean latency. Similar behaviour is seen in App Engine Flex with Node.js, i.e. some requests take > 1 sec even though the mean for reads is lower, at ~600ms. |
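Since the mean hides those multi-second outliers, it may be worth reporting percentiles alongside it. Below is a small sketch using the nearest-rank method; the sample values are invented for illustration and are not taken from the data attached in this thread.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) {
    throw new Error('no samples');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function mean(samples) {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Summarizes latency samples (in ms) with both central and tail figures.
function summarize(samples) {
  return {
    mean: mean(samples),
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    max: Math.max(...samples),
  };
}

// Invented samples: nine "normal" reads and one slow outlier.
const exampleLatenciesMs = [550, 560, 570, 580, 590, 600, 610, 620, 630, 2500];
const summary = summarize(exampleLatenciesMs);
```

With samples like these, one slow request pulls the mean well above the median, while the p95 and max show the tail directly.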
I just did the same test against DynamoDB using a 1024MB Lambda behind API Gateway using Go and eawsy's aws-lambda-go-shim and the latency was ~490ms and pretty stable. This is all very rough and ready but seems there is no big difference between Go/AppEngine/Datastore compared to Go/API Gateway/Lambda/DynamoDB and so once Cloud Functions gets Go support I will be happy :) |
This issue was moved to googleapis/nodejs-datastore#9. |
Facing the very same issue with Google Cloud Functions and Firestore... The first call takes several seconds, generally from 3 to 6 seconds. Subsequent calls halve the latency. I have separate Functions talking to a MongoDB VM in Compute Engine, and the latency is < 1s, except on the first call where it's around 3-5 seconds. I've moved some functions to AWS Lambda with DynamoDB, where the latency is at most 1.5s, which is definitely doable in an end-user facing situation.... Sad. |
@lazharichir you should probably add your comment here googleapis/nodejs-datastore#9 since this issue is closed |
@lazharichir you can significantly improve latency in AWS Lambda if you use a CloudWatch scheduled Lambda to keep your function warm |
So far I have ~500ms latency for a simple API request. Do you have a suggestion how to trace it and find the cause of the issue? Or, is it totally normal for Cloud Functions to have such latency? https://firebase.reactstarter.com online demo + source code |
Can someone confirm whether or not using gRPC makes a notable difference? |
@karlbateman I'm seeing significant improvements with the latest version. For the function above, which was taking ~800ms per call, I now see call times around 150 ms. |
@koistya I'd post or look through https://stackoverflow.com/questions/tagged/google-cloud-functions. I took a quick look through the code, and it seems unrelated to cloud datastore. |
Environment details
Steps to reproduce
I experience high latency (~1 to 2 seconds) with pretty much every action.
Simple example (runs through Babel prior to deploy) -