Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypeScript plugin or runtime tracing to allow query composition? #233

Closed
rattrayalex opened this issue Apr 18, 2021 · 19 comments
Closed

TypeScript plugin or runtime tracing to allow query composition? #233

rattrayalex opened this issue Apr 18, 2021 · 19 comments

Comments

@rattrayalex
Copy link

https://github.com/xialvjun/ts-sql-plugin is an interesting project that uses a TS plugin to typecheck sql query strings.
https://github.com/mmkal/slonik-tools/tree/master/packages/typegen#readme is also an interesting project that uses runtime tracing to detect query types on first run.

Both should allow you to do something roughly like this:

const notDeleted = (asOf = null) => sql`
  deleted_at is null or deleted_at > ${asOf || sql`now()`}
`

const myQuery = (bar) => sql`
  select * from mytable
  where ${notDeleted()}
  and foo = ${bar}
`

// result is fully typed
const result = await db.query(myQuery(bar))

Is pgTyped interested in experimenting with such techniques to enable query composition, for convenient sharing of common conditions etc?

@adelsz
Copy link
Owner

adelsz commented Apr 18, 2021

Yes, query composition would be great. I am open to proposals here.
I have tried brainstorming some approaches to make it work but all of them had significant problems.

As I see it, query composition is dynamic by the nature of it, so its resultant type cant be inferred at compile time.
It doesnt look like slonik can reliably achieve it either, extending your example:

const notDeleted = (asOf = null, panic = false) => {
  if (panic) return sql``;
  return sql`
    deleted_at is null or deleted_at > ${asOf || sql`now()`}
  `;
}

const myQuery = (bar, panic) => sql`
  select * from mytable
  where ${notDeleted(null, panic)}
  and foo = ${bar}
`

// result is fully typed
const result = await db.query(myQuery(bar, randomBool()))

Such a query would be invalid sometimes and its validity or resultant types can't be checked at compile time.

@rattrayalex
Copy link
Author

rattrayalex commented Apr 19, 2021

Cool, thanks for taking a look!

That particular example would be fine in terms of type safety, but an equivalent that conditionally injects something to the select clause, of course, would not be. The important thing is to prevent that.

I haven't peeked into the implementation of @slonik/typegen, but an approach like theirs could be used to perform runtime type validation of the response, ensuring the types do not lie.

For example (loosely):

const sql = specialMagic('./generated-queries');

const bookQuery = (id, getBar) => sql.FindBookByID`
  select foo, ${getBar ? sql`bar::text` : null} from books where id = ${id}
`
// on first run, introspects type, generates to ./generated-queries.js
const result = await bookQuery(1, true).run(conn)
result.bar // typed as unknown

// (reload project)

// on second run, types are verified – and pass!
const result = await bookQuery(1, true).run(conn)
result.bar // now typed as string, hooray!

// typescript will think this is fine, but bar is _not_ returned at runtime.
// but, the runtime validator will catch it and throw an error!
const bad = await bookQuery(1, false).run(conn)
result.bar // incorrectly typed as string 😱 but never reached 😌

If you configure whether and where to store these generated queries, this sql function imported can even just be the normal pgTyped import (you can require() and print to that file yourself).

The runtime type validation would come at a performance cost that might be unacceptable, however. And I'm not sure if it would have other advantages or disadvantages, for example when the db schema changes from what the running code expects.

@rattrayalex
Copy link
Author

I guess an alternative approach that wouldn't require runtime validation would be to allow template strings, but statically disallow them in select or returning clauses. I'm not sure how doable that would be.

const allowed = await sql.GetPerson`
  select id, name
  from persons
  where foo = bar
    ${checkAge ? sql`and age > 18` : null}
`.run(conn)

const notAllowed = await sql.GetPersonMaybeWithAge`
  select id, name ${getAge ? `, age` : null}
  from persons
  where foo = bar
`.run(conn)

I think this would be a pretty understandable restriction, and not too much of a problem – I imagine most places people want composition is in where clauses anyway.

Of course, thanks to the beautify of composition, if you really want the above, you could do this:

const personQuery = (checkAge) => sql`
  from persons
  where foo = bar
    ${checkAge ? sql`and age > 18` : null}
`
const idAndName = sql.GetPerson`select id, name ${personQuery(checkAge)}`.run();
const idNameAndAge = sql.GetPersonWithAge`select id, name, age ${personQuery(checkAge)}`.run();

It'd be really nice to be able to support static interpolations in these clauses if possible for something like sql`select ${commonFields}, another_field` but again I'm not sure how reasonable it'd be to implement (might require annotating as as const and making deep use of the TS language server or something).

@golergka
Copy link
Contributor

Could template literal types help with that? I didn't use this Typescript feature yet, but it looks like you could use it to detect all possible combinations of different strings at compile time:

https://www.typescriptlang.org/docs/handbook/release-notes/typescript-4-1.html#template-literal-types

@adelsz
Copy link
Owner

adelsz commented Apr 27, 2021

@golergka Good point, I actually did some heavy experimentation with them, even built a PoC type-level LR lexer+parser that built typed AST trees from string literals at compile-time.
In the end, it didn't work out well because Typescript has some very strict limits on typechecker recursion depths and the compile times were impacted as well. TS authors explicitly recommend against doing anything too smart with typed template literal types.

@maxpain
Copy link

maxpain commented Jun 16, 2021

@adelsz what do you think about this concept?

We could generate one pgtyped.d.ts for the entire project with the generic function query and mappings, where all our queries are typescript literals and mapped to corresponding params and results.

Example:

declare module '@pgtyped/query' {
  export type Mappings = {
    '\n			SELECT cluster_id\n			FROM cluster_subnets\n			WHERE $clientIP <<= subnet\n		': {
      params: {
        clientIP: string | null | void
      }
      result: {
        cluster_id: number
      }
    }
  }

  export function query<T extends keyof Mappings>(
    ...args: Mappings[T]['params'] extends void
      ? [T]
      : [T, Mappings[T]['params']]
  ): Promise<Mappings[T]['result'][]>
}

And use like this:

import query from '@pgtyped/query' 

// both result and params are strictly typed
const [subnet] = await query(`
  SELECT cluster_id
  FROM cluster_subnets
  WHERE $clientIP <<= subnet
`, { clientIP })

query function implementation:

import pg from 'postgres' // for example
import { parseTSQuery } from './loader/typescript'
import { processTSQueryAST } from './preprocessor-ts'

export default function query(...args) {
    const [query, params] = args

    // Actually, we should cache queryAst.
    const { query: queryAst } = parseTSQuery(query)

    const { query: processedQuery, bindings } = processTSQueryAST(
        queryAst,
        params
    )

    return pg.unsafe(processedQuery, bindings)
}

We also would make a type-safe generic es6 template tag (even with strictly typed params), but there is a problem in typescript itself

@maxpain
Copy link

maxpain commented Jun 22, 2021

Any thoughts, @adelsz, @rattrayalex ?

@adelsz
Copy link
Owner

adelsz commented Jun 22, 2021

This is an interesting idea @maxpain. I like the ergonomics of it. One thing I am worried about is its fragility.
Wouldn't adding a single whitespace to the query break its type (until pgtyped gets rerun again)?

@maxpain
Copy link

maxpain commented Jun 22, 2021

@adelsz yes, but what if a developer accidentally breaks SQL query and doesn't run pgtyped?

@maxpain
Copy link

maxpain commented Jun 22, 2021

In an ideal world, we would use TypeScript Type Providers, but this issue is unresolved since 2015 and we need something useful right now.

At the moment using pgtyped on big projects is pretty painful.

For example, at my projects, I have tons of imports at the beginning of the files. image
Moreover, we have to place all our queries at the beginning of the files to speed-up performance image

Yes, we tried to use SQL files and use imports like this:

import * as queries from './src/books/queries';

await queries.findBookById.run(
    {
      bookId: 42,
    },
    client,
  );

But it is very uncomfortable to scroll a file (or open a separate .sql file) every time you create a new query.
Moreover, you have to think about unique names of new SQL queries and when using separate SQL files you have to write all the annotations stuff:

/*
  @name selectSomeUsers
  @param ages -> (...)
  @param names -> (...)
*/
SELECT FROM users WHERE age in :ages or name in :names;

Someone may get benefits of reusing the same SQL queries, but in practice in different places you use different columns, joins, WHERE conditions, etc.

Also, we like to save context and write SQL queries in the same place we call/use them.

@adelsz
Copy link
Owner

adelsz commented Jun 22, 2021

Thanks for the detailed write-up @maxpain.

I can see how inconvenient some of the points you mentioned can be, but have to say that it also highly depends on developer preferences, IDE setup, etc.

The minimalism of the approach you are suggesting does look neat and might work for many people. As I mentioned, my core concern with it that it is very implicit about what is happening behind the scenes and it having some unexpected UX edgecases (like the whitespace one). This might make it unsuitable for developers who prefer less magical and more explicit code styles.

With that said, I will be glad to review & merge an implementation of the approach you are suggesting as long as it is a separate pgTyped flag/mode and doesn't break the existing TS mode. We might even make it a default at some point to simplify the onboarding of new pgTyped users. What is important for me is to leave our users a choice here.

rattrayalex added a commit to rattrayalex/ts-sql that referenced this issue Jun 22, 2021
I have not tested this (just made the intuitive change in github) but curious if it works / you are interested. This is intended as a proof-of-concept for adelsz/pgtyped#233
@rattrayalex
Copy link
Author

I also think the proposal from @maxpain looks interesting (and I really appreciate the motivating examples) though I don't think I see how it'd enable query composition? Perhaps I'm being dense…

I wonder if template literal types could be used to help with the whitespace issues? eg; https://github.com/codemix/ts-sql seems to handle multiple spaces, though it doesn't currently handle newlines (I put up a PR to see if this works).

I think you could ultimately then end up with something where the generated file looks a little more like this (using the second code block here as an example, and using the ts-sql AST for reference):

type Query1 = SelectStatement<
  Identifier<'id'> | Identifier<'name'>,
  From<Identifier<'persons'>>,
  | BinaryExpression<Identifier<'foo'>, '=', Identifier<'bar'>>
  | LogicalExpression<
      BinaryExpression<Identifier<'foo'>, '=', Identifier<'bar'>>,
      'AND',
      BinaryExpression<Identifier<'age'>, '>', NumericLiteral<18>>
    >
>;

type Query2 = SelectStatement<
  Identifier<'id'> | Identifier<'name'> | Identifier<'age'>,
  From<Identifier<'persons'>>,
  | BinaryExpression<Identifier<'foo'>, '=', Identifier<'bar'>>
  | LogicalExpression<
      BinaryExpression<Identifier<'foo'>, '=', Identifier<'bar'>>,
      'AND',
      BinaryExpression<Identifier<'age'>, '>', NumericLiteral<18>>
    >
>;

but I haven't thought about it a ton.

On an ergonomic note related to @maxpain's proposal and this tweak of it, I think it still may be helpful to require or allow arbitrary names for queries, to make it easier to command-click to the typedef for a given query.

@maxpain
Copy link

maxpain commented Jun 22, 2021

@rattrayalex

to make it easier to command-click to the typedef for a given query.

pgtyped

@rattrayalex
Copy link
Author

rattrayalex commented Jun 22, 2021

Ah, fair, thanks for that screencast. Looks pretty compelling to me! Excited to see the PR.

@adelsz
Copy link
Owner

adelsz commented Jun 22, 2021

I also think the proposal from @maxpain looks interesting (and I really appreciate the motivating examples) though I don't think I see how it'd enable query composition? Perhaps I'm being dense…

Sorry @rattrayalex, we did hijack this issue for a little, diverging from the original topic. I don't think @maxpain proposal is related to query composition. Please correct me if I am wrong here @maxpain.

I wonder if template literal types could be used to help with the whitespace issues?

I did some experiments with that, it is possible to remove whitespace using recursively inferred TS types like ts-sql is doing. The problem there is that such recursion does impact typechecking/compilation time and sooner or later you hit the typechecker recursion limit (especially with long SQL queries).

I guess an alternative approach that wouldn't require runtime validation would be to allow template strings, but statically disallow them in select or returning clauses. I'm not sure how doable that would be.

Unfortunately, statically disallowing certain interpolations would require statically parsing the query which isn't feasible due to the reasons above.

An alternative would be to use the TS language server for static analysis of such string interpolations. I am actually experimenting with this approach as part of a different project related to GraphQL query parsing, and the results so far aren't very reassuring. The TS typechecker API isn't well documented and has some serious limitations around type aliases and generics.

@maxpain
Copy link

maxpain commented Jun 22, 2021

@adelsz shall I create a new issue?

@rattrayalex
Copy link
Author

Ah, bummer. Thanks for all your deep research @adelsz. I think you're probably right that type checking performance might become too bad with an approach like the one I suggested there.

Though I suppose if it were to scale only based on depth of a given query (and not number of queries), a limit to how deep/complex you're allowed to interpolate might be a blessing in disguise anyway?

@adelsz
Copy link
Owner

adelsz commented Jun 22, 2021

@maxpain That would be great, thanks!

@adelsz
Copy link
Owner

adelsz commented Jun 22, 2021

Though I suppose if it were to scale only based on depth of a given query (and not number of queries), a limit to how deep/complex you're allowed to interpolate might be a blessing in disguise anyway?

To confidently limit the depth/complexity of the interpolation we would still need to parse the whole query and it is easy to hit the limits by just removing the whitespace. On top of that, SQL also has a very complex grammar requiring robust parsing.

@adelsz adelsz closed this as completed Jul 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants