Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generically parameterizing queries #940

Closed
jtlapp opened this issue Sep 11, 2024 · 8 comments
Closed

Generically parameterizing queries #940

jtlapp opened this issue Sep 11, 2024 · 8 comments

Comments

@jtlapp
Copy link

jtlapp commented Sep 11, 2024

Hello. I'm trying to figure out how to complete this function:

async executeQuery(
    sql: ReturnType<typeof postgres>,
    query: string,
    args: { [key: string]: string | number | boolean }
  ): Promise<TBD> {
    // TBD
  }

query contains named placeholders for parameters, and args provides the values of those placeholders by key, which provides the placeholder name. I'm generically representing queries and dynamically providing their arguments.

I'm looking for the function's implementation and return type.

Is this possible to do in a safe way, with proper literal escaping?

(In case you're questioning the need to do this, it's for a series of benchmarking tests, each potentially written for a different platform, in a different framework, in a different language. Rather than copying the queries from implementation to implementation and maintaining them across implementations, I'm centralizing them. I don't even know what queries I'll end up using in the end, so I want to be able to centrally change the queries for all frameworks all at once as I experiment. The above is for the Deno implementation. Moreover, I won't be using the exact function above; it's just to teach me how to do this.)

@jtlapp
Copy link
Author

jtlapp commented Sep 13, 2024

If we had a method like file that accepted strings instead of files, that would do the job:

const result = await sql.file('query.sql', ['Murray', 68])

@tilemanrenovations
Copy link

typeof true);
Try it out.

@jtlapp
Copy link
Author

jtlapp commented Sep 13, 2024

typeof true);
Try it out.

Was there a copy-paste error? I'm grateful for any help!

@jtlapp
Copy link
Author

jtlapp commented Sep 13, 2024

It turns out that sql.unsafe(query, args) does exactly what I want. This isn't apparent from the documentation, which says nothing about how args are substituted into the query in this case. I was assuming that with a name like "unsafe" that it was literal substitution, not SQL escaping. From my experimentation, I found that this is not the case.

For others looking to solve this problem, here's what you need to know:

  • query accepts positional parameters of the form $1, $2, etc. (It does not also accept ${param} parameters.)
  • args is an array of values that are to be substituted for the positional parameters.
  • sql.unsafe will escape the values for safe embedding in SQL (though it may defer the work to postgres itself).

For completeness, can we assume that options are any of these?

@porsager
Copy link
Owner

porsager commented Sep 13, 2024

How about a PR to the readme then @jtlapp 😉 You'd be in the best position to make it since you know what's missing 👍 I'll help ensure it's correct if you start one.

The reasoning for the naming is that there is no way to ensure you haven't combined the query string yourself, so it's potentially unsafe even though you're using parameters.

query accepts positional parameters of the form $1, $2, etc. (It does not also accept ${param} parameters.)

There's nothing preventing a user from sticking a template literal for the query string in unsafe, where they can then use ${...}. The problem with that is that then they're again in the territory of just concatting strings. Hence unsafe.

args is an array of values that are to be substituted for the positional parameters.

This happens on the database level, Postgres.js does nothing to them. The string is sent as is, and the parameters are sent on the side as is (with the exception of serialization). I think conceptually looking at them like variables that the PostgreSQL planner uses in the query is a better mental picture.

sql.unsafe will escape the values for safe embedding in SQL (though it may defer the work to postgres itself).

The parameters are not escaped, they are transferred as is to PostgreSQL that then uses them in the $1, $2, etc positions.

For completeness, can we assume that options are any of these?

No, these are options for the connection / instance.

sql.unsafe currently responds to two options that are mutually exclusive:

  • prepare: will create a prepared statement with the query string as key for an automatically generated id. This is false by default because we can't know if the user is generating millions of different queries causing the DB to bloat in memory. Hence it's something the user explicitly needs to choose.

  • simple: will change the query to use the simple protocol which doesn't allow parameteres or prepared statements. But it does allow multiple statements. I'm thinking of deprecating this in favor of the .simple() chained method which will then work for both tagged queries and unsafe queries.

@porsager
Copy link
Owner

I'm looking for the function's implementation and return type.

Hehe.. Sorry, I actually disregarded this question as a Typescript issue because of this code sample.

Is this possible to do in a safe way, with proper literal escaping?

No it's not. Any time you can pass a regular string how should the library know what the user has cooked up?

(In case you're questioning the need to do this, it's for a series of benchmarking tests, each potentially written for a different platform, in a different framework, in a different language. Rather than copying the queries from implementation to implementation and maintaining them across implementations, I'm centralizing them. I don't even know what queries I'll end up using in the end, so I want to be able to centrally change the queries for all frameworks all at once as I experiment. The above is for the Deno implementation. Moreover, I won't be using the exact function above; it's just to teach me how to do this.)

I think that wouldn't give you the true picture of the various libraries - I think benchmarks ought to be written in the idiomatic style of each library. Even so, for Postgres.js you should add the { prepare: true }option to get closer to the way queries would be performed when using it the idiomatic way.

Be curious to see what you find. Here's some prior art if you're interested:

https://porsager.github.io/imdbench/sql.html
https://github.com/porsager/postgres-benchmarks

@jtlapp
Copy link
Author

jtlapp commented Sep 14, 2024

@porsager thank you for the detailed explanation! I posted a PR.

I'm primarily benchmarking concurrency, so the computational expense of the query shouldn't matter much, though memory usage may still be an issue.

I have found it challenging to find much that's helpful about what sorts of query combinations I need to create to get a good sense of the relative ability frameworks and platforms have for supporting concurrency. So I plan to play around. I want it easy and quick to do so, particularly because I'm building images for deployment to a Kubernetes cluster, where I'll be running the experiments.

But I do appreciate the reminder that the most reliable measures will come from using real-world implementations of the queries. Once I understand what sorts of queries I need, I'll likely then hard-code them to get trustworthy test results. And then as a bonus I'll also have meaningful latency benchmarks.

Thank you for your benchmark links, and thanks again for your assistance! I already have this working.

@jtlapp jtlapp closed this as completed Sep 14, 2024
@porsager
Copy link
Owner

@jtlapp sounds great :)

In the case of Postgres.js there is much more than computational. For instance using prepared statements implicitly is a huge gain, as well as ensuring pipelining the queries will work correctly. This all works the best when using Postgres.js properly with tagged template literals, but as mentioned, should be fine with { prepare: true } for .unsafe.

Would love a heads up here if the benchmarks are something you'll share.

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants