
Fix pgdump-based backups to work again (even after result too long to hold in single nodejs string) #325

Closed
Venryx opened this issue Jun 3, 2024 · 0 comments

Venryx commented Jun 3, 2024

Problem

The production cluster's database is now large enough that when pg_dump is run to back it up, the output is too large for the NodeJS script on the caller's computer to receive as a single string (NodeJS has an upper limit on the length of any one string).

This causes the backup route to fail with the error: Got error during execution: Error: Cannot create a string longer than 0x1fffffe8 characters
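For context, that 0x1fffffe8 figure is Node's own per-string length cap, which Node exposes programmatically (the exact value can differ across Node/V8 builds):

```js
// Node reports the cap the error message refers to; 0x1fffffe8 (536,870,888) is typical on
// current 64-bit builds, but treat the exact value as build-dependent.
const { constants } = require("buffer");
console.log(constants.MAX_STRING_LENGTH);                // 536870888 on typical builds
console.log(constants.MAX_STRING_LENGTH === 0x1fffffe8); // true on those builds
```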

Solution

The GraphQL API will need to be changed so that the pg_dump contents can be transferred from the server to the NodeJS script in smaller chunks, which the script then simply appends to the backup file it is creating.

The Rust part of the backup process is here: https://github.com/debate-map/app/blob/15163d468b922c7e43eb13c3f3347ea19656cf44/Packages/app-server/src/db/general/backups.rs
The NodeJS script part is here: https://github.com/debate-map/app/blob/19a86cbe6e0de0e5bba1269f05bedb66aabdbba2/Scripts/DBBackups/GQLBackupHelper.js

Probably the easiest way to introduce "chunking" is to switch get_db_dump from a GraphQL query to a GraphQL subscription. You can see an example of a GraphQL subscription (sending data in multiple parts) here:

async fn testingLogEntries<'a>(&self, ctx: &'a async_graphql::Context<'_>, admin_key: String) -> impl Stream<Item = Result<TestingLogEntry, SubError>> + 'a {
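Building on that pattern, here is a rough sketch of what a chunked get_db_dump subscription could look like. This is only an illustration: the type and shard names (GetDbDumpChunk, SubscriptionShard_GetDbDump), the chunk size, and the placeholder dump string are assumptions, and the real resolver would keep the existing auth checks and obtain the dump from try_get_db_dump.

```rust
use async_graphql::{Context, SimpleObject, Subscription};
use futures_util::stream::{self, Stream};

// Placeholder payload type: one slice of the pg_dump output per subscription message.
#[derive(SimpleObject)]
struct GetDbDumpChunk {
    data: String,
}

struct SubscriptionShard_GetDbDump;

#[Subscription]
impl SubscriptionShard_GetDbDump {
    async fn get_db_dump<'a>(&self, _ctx: &'a Context<'_>, _admin_key: String) -> impl Stream<Item = GetDbDumpChunk> + 'a {
        // Placeholder: the real code would produce this via pg_dump / try_get_db_dump.
        let full_dump: String = "-- pg_dump output --\n".to_owned();

        // Split into chunks small enough that no single GraphQL message approaches Node's string limit.
        const CHUNK_CHARS: usize = 5 * 1024 * 1024;
        let chars: Vec<char> = full_dump.chars().collect();
        let chunks: Vec<String> = chars.chunks(CHUNK_CHARS).map(|c| c.iter().collect()).collect();

        // Emit one message per chunk; the subscription completing tells the client the dump is done.
        stream::iter(chunks.into_iter().map(|data| GetDbDumpChunk { data }))
    }
}
```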

On the NodeJS side this will complicate the logic, of course: instead of a simple fetch call, the script will need to open a websocket connection and then iteratively process that data stream, appending it to an output file as it arrives. It's unclear to me at the moment whether this can be done with reasonable ease using native NodeJS APIs, or whether a library like @apollo/client will need to be imported into the GQLBackupHelper.js file. (I'll leave that up to you to evaluate/decide.) A possible client-side sketch is shown below.
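As one possible starting point, here is a minimal sketch of the client side, assuming the server ends up exposing a getDbDump subscription whose payload has a data field per chunk. The field names, endpoint URL, and auth shape are placeholders, and the graphql-ws + ws packages are just one dependency option among those mentioned above.

```js
// Hypothetical sketch; getDbDump/data/adminKey, the endpoint URL, and how the JWT is passed
// are all placeholders, and graphql-ws/ws are one possible dependency choice among several.
const fs = require("fs");
const WebSocket = require("ws");
const { createClient } = require("graphql-ws");

async function backupViaSubscription({ url, jwt, outPath }) {
    const client = createClient({
        url,                      // e.g. a ws:// or wss:// GraphQL endpoint (placeholder)
        webSocketImpl: WebSocket, // older Node versions have no built-in WebSocket, so supply one
        connectionParams: { authorization: `Bearer ${jwt}` }, // auth shape is an assumption
    });

    const out = fs.createWriteStream(outPath);
    await new Promise((resolve, reject) => {
        client.subscribe(
            {
                query: `subscription($adminKey: String!) {
                    getDbDump(adminKey: $adminKey) { data }
                }`,
                variables: { adminKey: jwt }, // placeholder; may not match the final argument shape
            },
            {
                next: (msg) => out.write(msg.data.getDbDump.data), // append each chunk (backpressure handling omitted)
                error: reject,
                complete: resolve, // the server completing the stream means the dump is finished
            },
        );
    });
    await new Promise((resolve) => out.end(resolve));
    await client.dispose();
}
```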


To start things along, I have created a new branch for working on this feature named "alvinosh/add-chunking-to-dbdump": https://github.com/debate-map/app/tree/alvinosh/add-chunking-to-dbdump

I have temporarily modified the try_get_db_dump function here to generate enough fake data so that you can replicate the NodeJS string-length limit issue.

Then, to run the test backup command, run this (the NodeJS script has some instructions on how to obtain the JWT contents):

node ./Scripts/DBBackups/GQLBackupHelper.js backup --dev --jwt "PUT_YOUR_JWT_CONTENTS_HERE"

Venryx closed this as completed Jul 10, 2024