DataStore - DeltaSync Very Slow on SQLite #8699
Hi @iartemiev - I've created the new GitHub issue as requested. I checked the AmplifyDataStore-ENV table and there are currently ZERO records in it, so I expected the DeltaSync to be almost instantaneous - yet it still took 20 minutes to process everything. It doesn't make sense to me why that is the case. When there was a large number of records in the AmplifyDataStore table, zero were relevant to my login and zero were downloaded to the device. I have not been able to test anything in a web app, as we discontinued development of that and I don't know of an easy way to test it. I don't believe we are changing any sync-related settings. Another observation: I am testing this with a large data set. If we log in with a smaller data set (development environment) we see a base sync of about 5 minutes and a delta sync of about 1 minute - so it is noticeably different on the smaller data set - but 1 minute to download zero changes is still too long.
Thank you for opening the issue, @sacrampton. It sounds to me like DataStore is performing another base sync instead of a delta. Let me try to reproduce this behavior and I'll know for sure. It should be an easy fix if that's the case.
I did some testing today and did not find any issues with the SQLite adapter's behavior here, i.e., if I reload the app inside the full sync interval (by default this is 1 day), DataStore will perform a delta sync. If it's outside the full sync interval, it performs a base sync.

Just to give you some context on how delta sync works: when DataStore starts, it retrieves the last sync timestamp for each model, checks whether that timestamp is inside the full sync interval, and if it is, passes that timestamp in the GraphQL sync query network request (a rough sketch of that query shape is below).

I'll do some more targeted testing on my end, but in the meantime, I have a few more follow-up questions:
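To illustrate the mechanism described above, here is a rough, hand-written sketch of what a delta sync request might look like for a hypothetical `Asset` model. The field names and types are assumptions based on the generated sync queries for DataStore-enabled APIs, not the exact request Amplify sends:

```js
// Rough sketch only: the shape of a delta sync request for a hypothetical
// `Asset` model. The exact query DataStore generates may differ.
import { API, graphqlOperation } from 'aws-amplify';

const syncAssets = /* GraphQL */ `
  query SyncAssets($lastSync: AWSTimestamp, $limit: Int, $nextToken: String) {
    syncAssets(lastSync: $lastSync, limit: $limit, nextToken: $nextToken) {
      items {
        id
        _version
        _deleted
        _lastChangedAt
      }
      nextToken
      startedAt
    }
  }
`;

async function runDeltaSyncQuery(lastSync) {
  // A lastSync timestamp inside the full sync interval lets AppSync serve
  // the request from the delta table; omitting it (or falling outside the
  // interval) results in a base sync against the base table.
  return API.graphql(graphqlOperation(syncAssets, { lastSync, limit: 1000 }));
}
```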
Hi @iartemiev - firstly, everything we are doing is within the full sync interval. We've run a bunch more tests in debug and can report as follows. When we first log in with an empty database we download the following quantities of records from these 20 models:
Total time taken in sync: approx. 19 minutes. The majority of the time is spent in these 3 models.
I think there is an issue with the base sync truncating. If I do a count of records in the database (we use ElasticSearch for that) I get the following quantities. The fact that the photo/assetVisit models are nice round numbers is a worry:
After the database is populated (i.e. we don't call DataStore.clear()) we restart the app (DataStore.start()). We do this within minutes of the initial sync, so it's still well within the full sync window. What we observe is that all 19 models except Asset get a delta sync completed in a few seconds, while the Asset model gets a full sync.

Our use case has the customer at a site sharing devices with staff as shifts change - so we want to swap users (i.e. not clear the database) rather than log out. If we swap users and do a delta sync we get the following:
Then if we swap back to the original user we see the following:
If we keep swapping users back and forth, the delta sync eventually gets down to a consistent 30 seconds. We have also been noticing that in many cases records are being updated during the delta sync when there has not actually been an update, and the number of records updated is equal to the number of records originally synced.
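For reference, a minimal sketch of the two flows being compared here, using the DataStore calls mentioned in this thread (swapping users keeps the local database so only a delta sync is needed, while a full logout clears it and forces a base sync):

```js
import { DataStore } from 'aws-amplify';

// Swap users on a shared device: keep the local SQLite data so the next
// shift only needs a delta sync, not a full base sync.
async function swapUser() {
  await DataStore.stop();  // halt sync for the outgoing user
  // ...re-authenticate the incoming user here...
  await DataStore.start(); // resume sync; should be a delta sync while
                           // still inside the full sync interval
}

// Full logout path: clearing wipes local data, so the next start
// triggers a complete base sync.
async function logoutCompletely() {
  await DataStore.clear();
}
```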
Hi @iartemiev - separately from this, I still think multi-tenanted delta sync is an issue. If there are 1,000 customers that have each made 1,000 changes in the last hour, that is a million changes to sort through. If I have another customer that has made zero changes, their users still have to sort through that million changes just to work out that zero changes impact them.
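One documented way to narrow what the sync queries return in a multi-tenant setup is DataStore's selective sync. A minimal sketch, assuming a hypothetical plantId field on an Asset model and a hypothetical getCurrentPlantId() helper:

```js
import { DataStore, syncExpression } from 'aws-amplify';
// `Asset` and its `plantId` field are hypothetical stand-ins for the
// multi-tenant models described in this thread.
import { Asset } from './models';

DataStore.configure({
  syncExpressions: [
    // Scope the sync queries to the plant the current user works at, so
    // only matching records are returned to the device.
    syncExpression(Asset, () => {
      const currentPlantId = getCurrentPlantId(); // hypothetical helper
      return asset => asset.plantId.eq(currentPlantId);
    }),
  ],
});
```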
Hi @iartemiev - I changed max records from 50K to 100K and this has resolved it. I still have concerns for multi-tenant at scale, but the immediate issue is resolved.
Hi @iartemiev - looks like I spoke too soon. The testing we did was on a virtual device against a development environment. When we put it on a real device with the production database, increasing the max records did not resolve the problem.
Hi @iartemiev - it seems like there is some issue with maxRecordsToSync. When we had the limit set to 50K it was downloading 61,907 records (see above), but there are actually 88,899 records in the data set. Changing the limit to 100K does not seem to make any difference. It seems the delta sync issues are related to the entire data set not being downloaded initially.
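For context, the limit being changed here is the maxRecordsToSync setting on DataStore.configure; a minimal sketch using the value mentioned above:

```js
import { DataStore } from 'aws-amplify';

DataStore.configure({
  // Upper bound on the number of records DataStore will sync
  // (the documented default is 10,000); 100000 is the value tried above.
  maxRecordsToSync: 100000,
});
```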
@sacrampton, I believe you were able to resolve this issue by increasing the DeltaTTL for certain tables. If so, are you fine with us closing this issue? Or is there more to this that has yet to be addressed?
Thanks @iartemiev - yes. For tables we are not syncing to DataStore I set the DeltaTTL to 1 minute, and for tables we are syncing to DataStore I set it to 720 minutes (12 hours). This has made an immediate and dramatic impact for our users. With the 30-minute DeltaTTL it was essentially syncing all the time, and after 5-6 hours of constant usage memory would become so bogged down that you had to kill the app and restart it. That has gone away now.
This sounds like a hacky workaround and, it seems to me, will likely come back to bite you.
Hi @nubpro - I'm always open to any suggestions that help us move forward, so if you have any ideas on better ways to proceed please share. I welcome any and all suggestions. Thanks.
@nubpro, can you elaborate on why increasing the DeltaSyncTableTTL is “hacky” or a “workaround”? This is a well-documented property of sync-enabled AppSync APIs. We default it to 30 minutes because we think this is a sensible default for most customers, but a longer TTL is certainly valid for certain use cases, such as @sacrampton’s.
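For anyone looking for where that property lives: the DeltaSyncTableTTL sits in the DeltaSyncConfig of the AppSync DynamoDB data source. A rough sketch of updating it with the AWS SDK for JavaScript v3; every identifier below (API id, data source name, table names, role ARN, region) is a placeholder:

```js
// Sketch: raise the delta sync TTL on an AppSync DynamoDB data source.
// All identifiers below are placeholders, not values from this issue.
import { AppSyncClient, UpdateDataSourceCommand } from '@aws-sdk/client-appsync';

const client = new AppSyncClient({ region: 'us-east-1' });

async function raiseDeltaSyncTtl() {
  return client.send(
    new UpdateDataSourceCommand({
      apiId: 'YOUR_APPSYNC_API_ID',
      name: 'AssetTable',
      type: 'AMAZON_DYNAMODB',
      serviceRoleArn: 'arn:aws:iam::123456789012:role/appsync-datasource-role',
      dynamodbConfig: {
        tableName: 'Asset-xxxxxxxx-ENV',
        awsRegion: 'us-east-1',
        versioned: true,
        deltaSyncConfig: {
          deltaSyncTableName: 'AmplifyDataStore-xxxxxxxx-ENV',
          deltaSyncTableTTL: 720, // minutes a change record stays in the delta table
          baseTableTTL: 43200,    // minutes, per AppSync's DeltaSyncConfig
        },
      },
    })
  );
}
```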
My bad, I read it incorrectly. I thought that if you increase the deltaTTL to a large number, you are simply delaying the memory usage issue to a later time. Instead:
This issue has been automatically locked since there hasn't been any recent activity after it was closed. Please open a new issue for related bugs. Looking for a help forum? We recommend joining the Amplify Community Discord server.
JavaScript Framework
React Native
Amplify APIs
DataStore
Amplify Categories
api
Environment information
Describe the bug
I am creating this as a separate GitHub issue, split out from #8405.
#8405 (comment)
Hi @iartemiev - I want to dig further into the slowness we are seeing with DeltaSync.
DataStore creates a separate table in DynamoDB, called "AmplifyDataStore-ENV", to manage the DeltaSync.
There are no secondary indexes on this table - just the partition key and sort key - where the partition key is table/date and the sort key is time/id/version.
Our database is multi-tenanted, and we deal with assets in industrial plants. So I could have hundreds of users in other plants making massive amounts of changes while I might not have any users working in my plant. As I see it, the DeltaSync is going to have to sort through everyone else's changes just to work out that there are zero changes applicable to me.
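To make that concrete: with a partition key of table/date, one day's changes to a model across every tenant share a single partition, so a delta query has to read through all of them. A hypothetical sketch (the attribute names ds_pk/ds_sk and key formats below are assumptions drawn from the description above, not the real table definition):

```js
// Hypothetical illustration of querying the delta sync table described
// above. Attribute names (ds_pk, ds_sk) and the key layout are assumptions
// based on "partition key = table/date, sort key = time/id/version".
import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient, QueryCommand } from '@aws-sdk/lib-dynamodb';

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({ region: 'us-east-1' }));

async function deltaChangesForDay(model, day, sinceIso) {
  // Every tenant's changes to `model` on `day` share this partition,
  // so the query reads all of them even if none apply to my plant.
  return ddb.send(new QueryCommand({
    TableName: 'AmplifyDataStore-ENV',
    KeyConditionExpression: 'ds_pk = :pk AND ds_sk >= :since',
    ExpressionAttributeValues: {
      ':pk': `${model}-${day}`, // e.g. "Asset-2021-11-17"
      ':since': sinceIso,       // e.g. "2021-11-17T10:00:00.000Z"
    },
  }));
}
```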
When we initially hydrate the cache we do a base query, which uses GSIs to get a quick response.
At the moment I'm seeing DeltaSync take about the same amount of time as the full sync (20 minutes). Today I know I was doing a lot of bulk updating of data in a few different plants through our web back end - not an unusually large workload. But the slowness I'm seeing and what I see in the DeltaSync table in DynamoDB have me worried that this is not scalable for a multi-tenanted environment.
You've been really good at coming up with solutions to get us moving - hopefully someone else has already come up with a solution to make the DeltaSync run in seconds rather than 20+ minutes.
#8405 (comment)
@sacrampton, I think this behavior likely warrants a separate GitHub issue, unless this is somehow related to the on-device database on React Native specifically (AsyncStorage or SQLite).
To better understand what's going on, I have some follow-up questions:
How many total records are in the delta sync table in DynamoDB at the time that you're seeing the 20 min delta sync time?
How many of those records are being synced down to the app?
Are you using DataStore.configure to change any of the sync-related settings (e.g., syncPageSize, fullSyncInterval, etc.)? If so, which settings are you using? (A sketch of these settings follows this list.)
Are you seeing roughly the same delta sync performance if you test this in a web app?
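A minimal sketch of the settings named in that last question, with the documented defaults shown purely for illustration (fullSyncInterval is expressed in minutes):

```js
import { DataStore } from 'aws-amplify';

DataStore.configure({
  syncPageSize: 1000,        // records requested per page in each sync query
  fullSyncInterval: 24 * 60, // minutes between full (base) syncs
});
```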
Expected behavior
DeltaSync processes changes very quickly - in seconds, not minutes.
Reproduction steps
DataStore.start
Code Snippet
// Put your code below this line.
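A minimal sketch of the reproduction flow described in this thread (hydrate once, then restart DataStore while still inside the full sync interval); the Hub listener is just one way to observe per-model sync timing:

```js
import { DataStore, Hub } from 'aws-amplify';

// Log DataStore sync events with timestamps so the delta sync duration
// and per-model record counts are visible.
Hub.listen('datastore', ({ payload: { event, data } }) => {
  console.log(new Date().toISOString(), event, data);
});

async function reproduce() {
  await DataStore.start(); // initial hydration (base sync)
  // ...wait a few minutes, still inside the full sync interval...
  await DataStore.stop();
  await DataStore.start(); // expected: a quick delta sync for every model
}
```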
Log output
aws-exports.js
No response
Manual configuration
No response
Additional configuration
No response
Mobile Device
No response
Mobile Operating System
No response
Mobile Browser
No response
Mobile Browser Version
No response
Additional information and screenshots
No response