Database dumps do not work on large databases #59
/bounty $250
Hi @Brayden. Instead of using the DO alarm API, could we use queues when the database exceeds a certain size (say 200 or 500 MB)? My suggestion is to attach Cloudflare Queues to the new private endpoints, with the producer and consumer being a single Worker that exports the data into R2 in batches of a certain duration (say 20 to 30 seconds, depending on usage limits). This way the queue keeps track of the last record exported, and no request/response gets blocked. We can rely on the `databaseSize` property of the Durable Object (https://developers.cloudflare.com/durable-objects/api/sql-storage/#databasesize) to decide whether to trigger the queue.
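A rough sketch of how that producer/consumer pairing could look, assuming a queue binding named `EXPORT_QUEUE`, an R2 binding named `EXPORT_BUCKET`, and a hypothetical internal `/export-chunk` route on the Durable Object (all illustrative names, none of them existing StarbaseDB bindings):

```ts
// Sketch only, not StarbaseDB's actual implementation.
// Assumed bindings: EXPORT_QUEUE (Cloudflare Queue), EXPORT_BUCKET (R2), and the existing
// Durable Object namespace. The internal /export-chunk route is hypothetical.
interface ExportJob {
  dumpId: string // e.g. "dump_20240101-170000"
  lastRowId: number // cursor: last row already exported
  part: number // chunk counter, used as the R2 object suffix
}

interface Env {
  EXPORT_QUEUE: Queue<ExportJob>
  EXPORT_BUCKET: R2Bucket
  DATABASE_DURABLE_OBJECT: DurableObjectNamespace
}

export default {
  // Producer: a request to the new private export endpoint enqueues the first job and
  // returns immediately instead of holding the connection open for the whole dump.
  async fetch(_request: Request, env: Env): Promise<Response> {
    const dumpId = `dump_${Date.now()}`
    await env.EXPORT_QUEUE.send({ dumpId, lastRowId: 0, part: 0 })
    return new Response(JSON.stringify({ dumpId }), { status: 202 })
  },

  // Consumer: each message exports one bounded chunk of work (say 20-30 seconds of rows),
  // writes it to R2 as its own object, then re-enqueues itself with the new cursor.
  async queue(batch: MessageBatch<ExportJob>, env: Env): Promise<void> {
    for (const message of batch.messages) {
      const { dumpId, lastRowId, part } = message.body
      const id = env.DATABASE_DURABLE_OBJECT.idFromName('default') // name is illustrative
      const stub = env.DATABASE_DURABLE_OBJECT.get(id)

      // Hypothetical internal route returning a chunk of SQL statements after lastRowId.
      const response = await stub.fetch(`https://internal/export-chunk?after=${lastRowId}`)
      const chunk = (await response.json()) as { sqlText: string; nextRowId: number; done: boolean }

      // R2 objects cannot be appended to, so each chunk becomes its own part object;
      // the parts can be concatenated (or listed in a manifest) once the export completes.
      await env.EXPORT_BUCKET.put(`${dumpId}/part-${String(part).padStart(5, '0')}.sql`, chunk.sqlText)

      if (!chunk.done) {
        await env.EXPORT_QUEUE.send({ dumpId, lastRowId: chunk.nextRowId, part: part + 1 })
      }
      message.ack()
    }
  },
}
```

Because each queue message is short-lived, no single request has to hold the Durable Object for the full dump.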
Hey @b4s36t4! Instead of using the DO alarm API, I think queues are also an acceptable approach. Just to talk through it a bit more so I fully understand how you're thinking of approaching it, I have a couple of questions.
Thanks for looking into this!
/attempt #59
💡 @onyedikachi-david submitted a pull request that claims the bounty. You can visit your bounty board to reward.
/attempt #59 May I ask for a Cloudflare paid account, because I don't have one? (It may be resolvable without one if I follow your guidance.)
Describe the bug
If you try to use any of the database dump endpoints (SQL, CSV, or JSON), the data is loaded into memory and then written out as a dump file. To support databases of any size, we should investigate enhancements that allow arbitrarily large databases to be exported. Currently Durable Objects are limited to 1GB of storage, with 10GB coming in the future, so operate under the assumption that we might be attempting to dump a 10GB database into a `.sql` file.

Another consideration is that because Durable Objects execute operations synchronously, we may need to allow for "breathing intervals". For example, we might let the export operation run for 5 seconds, then pause for 5 seconds if other requests are queued before it picks up again. The goal is to prevent locking the database for long periods of time (a rough sketch of this pattern follows below).

This then raises further questions about when to pause and resume, and where the resulting file should end up for the user.
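To make the "breathing interval" idea concrete, here is a minimal sketch of a time-sliced export driven by the Durable Object alarm API. The class name, the `exportState` storage key, the 5-second slice, the 500-row page size, and the `appendToDump()` helper are all illustrative assumptions, not existing StarbaseDB code:

```ts
// Sketch of a time-sliced export inside the Durable Object, not StarbaseDB's actual code.
import { DurableObject } from 'cloudflare:workers'

interface ExportState {
  tableName: string
  lastRowId: number
}

export class ExportingDurableObject extends DurableObject {
  async startExport(tableName: string): Promise<void> {
    await this.ctx.storage.put<ExportState>('exportState', { tableName, lastRowId: 0 })
    await this.exportSlice()
  }

  // Work for at most ~5 seconds, then schedule an alarm ~5 seconds out so that other
  // queued requests get a turn before the export picks up where it left off.
  private async exportSlice(): Promise<void> {
    const state = await this.ctx.storage.get<ExportState>('exportState')
    if (!state) return

    const deadline = Date.now() + 5_000
    let lastRowId = state.lastRowId

    while (Date.now() < deadline) {
      // Keyset pagination over rowid (assumes a regular rowid table).
      const rows = this.ctx.storage.sql
        .exec(`SELECT rowid, * FROM ${state.tableName} WHERE rowid > ? ORDER BY rowid LIMIT 500`, lastRowId)
        .toArray()

      if (rows.length === 0) {
        await this.ctx.storage.delete('exportState') // export finished
        return
      }

      await this.appendToDump(rows) // placeholder: serialize and persist this chunk
      lastRowId = rows[rows.length - 1].rowid as number
    }

    // Not finished yet: save the cursor and take a 5 second "breathing interval".
    await this.ctx.storage.put<ExportState>('exportState', { ...state, lastRowId })
    await this.ctx.storage.setAlarm(Date.now() + 5_000)
  }

  async alarm(): Promise<void> {
    await this.exportSlice()
  }

  private async appendToDump(rows: Record<string, unknown>[]): Promise<void> {
    // Illustrative no-op; a real implementation would turn rows into SQL text and persist it.
  }
}
```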
To Reproduce
Steps to reproduce the behavior:
1. Use the `/export/dump` endpoint on a large database.
2. Run the following command in Terminal (replace the URL with yours). If the operation exceeds 30 seconds you should see a failed network response instead of a dump file.
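For example (the URL is a placeholder for your own deployment, and the Authorization header may or may not be required depending on your configuration):

```
curl --fail \
  --header "Authorization: Bearer YOUR_TOKEN" \
  --output dump.sql \
  https://starbasedb.YOUR-IDENTIFIER.workers.dev/export/dump
```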
If you can't create a large enough test database, feel free to add code that sleeps for 29 seconds before proceeding with the `/export/dump` functional code; you should also see the failure.

Expected behavior
As a user I would expect any and all of the specified data to be dumped out without an error and without partial results. Where the file ends up for the user to access if the operation takes more than 30 seconds is up for discussion. Ideally, if the export takes less than 30 seconds, it could be returned directly, as the cURL above works today (downloading the file from the response of the origin request); but perhaps if it runs past the timeout, the export continues in the background and uploads the result to a destination the user can access afterwards?
Proposed Solution:
Export into a `.sql` file that gets created in R2, with a filename like `dump_20240101-170000.sql`, where the name represents `2024-01-01 17:00:00`.
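A small sketch of that naming scheme and the R2 write, assuming an R2 bucket binding (called `EXPORT_BUCKET` here purely for illustration):

```ts
// Sketch: build a dump_YYYYMMDD-HHMMSS.sql key from the current time and write the dump to R2.
// EXPORT_BUCKET is an assumed binding name, not an existing StarbaseDB binding.
function dumpFileName(now: Date = new Date()): string {
  const pad = (n: number) => String(n).padStart(2, '0')
  const date = `${now.getUTCFullYear()}${pad(now.getUTCMonth() + 1)}${pad(now.getUTCDate())}`
  const time = `${pad(now.getUTCHours())}${pad(now.getUTCMinutes())}${pad(now.getUTCSeconds())}`
  return `dump_${date}-${time}.sql` // e.g. dump_20240101-170000.sql for 2024-01-01 17:00:00
}

async function saveDumpToR2(env: { EXPORT_BUCKET: R2Bucket }, sqlText: string): Promise<string> {
  const key = dumpFileName()
  await env.EXPORT_BUCKET.put(key, sqlText, {
    httpMetadata: { contentType: 'application/sql' },
  })
  return key // returned so the caller can tell the user where to download the dump later
}
```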