-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Improvements to sqlx::query()
for Query Builders, and Explicit Injection Safety
#1594
Comments
A thought I had was to not have a The main idea with that would be to make it more obvious that the parameter is still supposed to be "some sort of string" but with an extra marker trait attached. |
I'm stumbling across this right now, so let me add another use case. It's pretty common to want to sort a query based on user preferences or request parameters. The easiest way to do this is to dynamically build the I had to construct a truly grotesque SQL abomination to try to work around this, where the user's preferred sort order is expressed as a three-element array of an enum type: ORDER BY
CASE $2[0]
WHEN 'original' THEN COALESCE( a.sortable_name, a.name )
WHEN 'transcripted' THEN COALESCE( a.transcripted_sortable_name, a.transcripted_name )
WHEN 'translated' THEN COALESCE( a.translated_sortable_name, a.translated_name )
END ASC,
CASE $2[1]
WHEN 'original' THEN COALESCE( a.sortable_name, a.name )
WHEN 'transcripted' THEN COALESCE( a.transcripted_sortable_name, a.transcripted_name )
WHEN 'translated' THEN COALESCE( a.translated_sortable_name, a.translated_name )
END ASC,
CASE $2[2]
WHEN 'original' THEN COALESCE( a.sortable_name, a.name )
WHEN 'transcripted' THEN COALESCE( a.transcripted_sortable_name, a.transcripted_name )
WHEN 'translated' THEN COALESCE( a.translated_sortable_name, a.translated_name )
END ASC I'm not even sure if this works, because now I have other lifetime issues. Notably, the parameter passed to I'm happy to open another issue for that if that would be helpful. |
Oops, ignore the bit about |
Thanks. This looks like an interesting approach. FWIW, I initially tried using the compile-time query macros, but I gave up because there's no non-hacky way to support custom types (like |
I'm gonna add a bit of info as question that sparked this issue was asked by me. Bit that led me to investigation of
I can guarantee that Per my use case, and I failed to clarify that initially, it would be beneficial if I could
which currently causes postgres connections to hang indefinitely
which can't compile This is more of a nitpick about how code looks and how much extras are there rather than a real issue. |
We actually just released support for the The reason the version using |
If that's true I'm in awe 😄 |
Take a look at the Scala library Relate and its It's basically a string-like type for a known-safe query fragment that also carries a list of values to bind to the parameters. In Rust, maybe something like struct InterpolatedQuery {
query: Cow<'static, str>,
params: Vec<Box<dyn Encode>>,
} You could have a safe constructor from let where: InterpolatedQuery = format_query!("id = ?", id); // or maybe `format_query!("id = {id}")` and insert the placeholder
let sort = InterpolatedQuery::sql("foo");
sqlx::query(format_query!("select foo from bar where {where} order by {sort}")) |
I found a workaround for the issue I had with wanting to generate a query dynamically and then return a stream from calling use futures_util::stream::BoxStream;
use sqlx::PgPool;
#[derive(Clone, Debug)]
pub struct DB {
pool: PgPool,
}
impl DB {
pub fn get_item<'a>(
&'a self,
some_id: &str,
select: &'a mut String,
) -> BoxStream<'a, Result<Item, sqlx::Error>> {
*select = format!("...", ...);
sqlx::query_as::<_, Item>(select)
.bind(some_id)
.fetch(&self.pool)
}
} Then the caller can easily arrange to keep the string used for the |
I'm not sure if this proposal would cover it, but I'd love to be able to specify that the table identifiers were injection-safe and to provide the code myself for generating those identifiers. PostgreSQL's
And have the schema for PostGIS determined at runtime. |
I had the exact same issue, thanks for that idea. I do wish I had a way to make my own query with |
As per @luke-biel's earlier comment:
I am using something similar where I input |
SQLx desperately needs some facility like this, however it's spelled. I don't think the current API actually has much security benefit - in the most common case, you're relying on someone who is unaware of SQL injection issues being repelled by the code smell of having to take a reference to a constructed string. I think programmers that combine such a lack of awareness with such delicate sensibilities are rare. On the other hand, this pattern makes many more sophisticated uses an enormous pain. Progressive construction of query strings becomes impossible in many scenarios because of the lifetime on the |
We have worked around this interally using the async_stream crate, which can be used to trivially create anonymous
If the return type needs to be a boxed stream then this unfortunately adds a level of indirection, but under our workload we haven't noticed any significant performance impact from this. |
I'd love to see robust sanitisation of table and schema names for querybuilder. I can make whole sanitisation by myself, but... this would mean I don't need query builder. An example for tables for SQlite is below. Basically, it's possible to insert all sorts of characters. to name a few:
PS: nevermind, I used |
In case it's helpful for other people running into this issue who need a workaround for the time being: async fn query_with(
&self,
query: FancyQuery,
bind: impl for<'q> FnOnce(Query<'q, Sqlite, SqliteArguments<'q>>) -> Query<'q, Sqlite, SqliteArguments<'q>>,
) -> Result<Vec<Row>> {
let query_text = query.generate();
bind(sqlx::query(&query_text)).fetch_all(&self.db).await
} and then use it like: let rows = db.query_with(fancy_query, |q| q.bind(1).bind("hi")); The closure lets you provide the binds in the appropriate lifetime, and avoids dealing with traits for big tuples etc. that might be necessary with passing binds as parameters. |
Status Quo
At the core of the
sqlx::query()
family of functions lies a contentious design choice that goes all the way back to the initial prototypes of SQLx:The query string must be a string slice, usually
'static
but can also be borrowed. This was a deliberate choice, because allowingString
to be passed directly makes it extremely tempting to simplyformat!()
user input into the query string instead of using bind parameters, which would easily introduce SQL injection vulnerabilities into the application. For example, to co-opt XKCD #327:Then suddenly, you've lost all student data for the year because the query that got executed looked like this:
(Postgres would reject the attempt to prepare a query string with more than one statement, but MySQL and SQLite wouldn't.)
The idea is that by requiring dereffing to a string slice, it will at least make the code start to smell a bit and hopefully lead to the user re-evaluating their implementation:
The "easier" path is intended to be using bind parameters, which by design do not introduce injection vulnerabilities because the database knows not to interpret bound values as part of the SQL. By comparison, this implementation is completely injection-safe, and is the approach that the API design was intended to push users towards:
However, it does have its downsides.
The intention is not at all clear in the design.
Time and again, users have opened issues or questions on the discussion board or asked on Discord about why the API is designed this way, and it gets very tedious to explain the reasoning (I admittedly have copy-pasted large chunks of this first part from a Q&A so I didn't have to write it again). This isn't a complaint about people asking questions, of course, it's more an expression of annoyance that the question needs to be asked in the first place.
Of course, we could just explain it better in the docs, which we should be doing anyways. But it's still an issue that it's even necessary.
The design doesn't give off very obvious signals when the user is doing something wrong.
All you have to do for the function to accept a
String
is to add a&
and let auto-deref do the rest. Yeah, it's a code smell, but it reeks more of poor API design than injection vulnerability. I personally would complain about an API so inflexible that all I could pass is&str
; what if I want to passString
orArc<str>
?And if all a user has to do is add a
&
, is there really any opportunity for them to gain insight into why they shouldn't be doing that?There's legitimate use-cases for passing a constructed
String
if the user knows what they're doing.This design decision is notorious for making it very difficult to implement query builders on top of SQLx, because of the lifetime issues that it introduces: #1428
Proposal
I don't want to just allow
String
to be passed directly, because I think it's really important for the design of an API to avoid straight-up handing the user a footgun. We can warn about injection vulnerabilities in the docs all we want, but as long as it's the path of least resistance, some people will still end up shooting themselves. And allowing that to happen just isn't the Rust way.I think we could copy a page from
std
's notebook here and introduce something similar toUnwindSafe
: https://doc.rust-lang.org/stable/std/panic/trait.UnwindSafe.htmlExisting usage with
&'static str
would remain unchanged:However, usage of dynamic query strings would now be forced to assert that they do not contain SQL injection vulnerabilities, and in exchange would be freed of lifetime constraints:
Of course, it is still possible to bypass using
AssertInjectionSafe
, but it's a more significant code smell (and involves deliberately introducing a memory leak):And ideally, on the way the compiler would have informed them that their query string needs to implement
InjectionSafe
which would have prompted them to check the docs and learn about SQL injection and why what they're trying to do is deliberately annoying.The text was updated successfully, but these errors were encountered: