Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Partially fix #3160 by properly handling resultsFormat query #5917

Merged
merged 10 commits into from
May 27, 2024

Conversation

kossnocorp
Copy link
Contributor

@kossnocorp kossnocorp commented May 25, 2024

TL;DR: Cloudflare D1 has a critical bug in Workers SDK reproducible on the Cloudflare D1 Console and when running D1 through Wrangler. The PR fixes the bug and also improves compatibility with workerd.

The problem

The essence of the problem is selecting two columns with similar names:

SELECT
    projects.name,
    organizations.name
FROM
    projects
    JOIN organizations ON projects.organization_id = organizations.id;

...must return in the row containing both columns (tested on PostgreSQL v16.3):

test=# SELECT projects.name, organizations.name
        FROM projects
        JOIN organizations ON projects.organization_id = organizations.id;
    name    |    name    
------------+------------
 Cloudflare | Cloudflare
(1 row)

SQLite (on a database created by Wrangler) gives a similar result:

sqlite> SELECT
    projects.name,
    organizations.name
FROM
    projects
    JOIN organizations ON projects.organization_id = organizations.id;
Daisy Chain|Daisy Chain
Fake Daisy Chain|Fake Org
Mind Control|Daisy Chain
sqlite> 

I also added a test for this behavior to workerd that is passing:

await itShould(
  // ...
  () =>
    DB.prepare(
      `SELECT projects.name, organizations.name
      FROM projects
      JOIN organizations ON projects.organization_id = organizations.id;`
    ).raw(),
  [['Cloudflare', 'Cloudflare']]
)

However, when sending the same request through Workers SDK, I get:

| name       |
| Cloudflare |

As both columns have the same name, name, they got squashed into a single column.

A GitHub user raised the issue a year ago and since then it got fixed in workerd (that my test confirms), but it is still present in Workers SDK.

The problem is also reproducible in Cloudflare D1 Console, where I get the same data:

| name       |
| Cloudflare |

It breaks people's code in unexpected ways and creates an impression of D1 as an unreliable product that might behave differently in production.

In my opinion, it's a critical bug that must be prioritized.

The source

I chased the bug through wranglerworkerdminiflare to these line in database.worker.ts:

const cursor = this.db.prepare(query.sql)(...params);
const results = convertResults(all(cursor));

When running the test:

await db
  .prepare(
	`SELECT ${tableColours}.name, ${tablePalettes}.name FROM ${tableColours} JOIN ${tablePalettes} ON ${tableColours}.id = ${tablePalettes}.colour_id`
  )
  .raw();

...the cursor object passed through all results in the given structure:

[{ name: "Night" }]

You can see that at this point, the information is already lost, and the name columns have been reduced to one property.

After further researching I found that workerd sends the resultsFormat query that determines the format of the results.

It worked before only because workerd handles the case when the API always returns an array of objects either by sheer luck or as a workaround for Workers SDK behavior.

The solution

The PR fixes the problem and adds proper handling of the resultsFormat query.

It makes raw preserve columns with the same name, addressing the original problem.

It also adds handling of NONE which workerd sends with run and exec (hence the change in existing test cases).

Extras

Here's the SQL I used to test the behavior:

CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT);

CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    name TEXT,
    organization_id INTEGER,
    FOREIGN KEY (organization_id) REFERENCES organizations(id)
);

INSERT INTO
    organizations (id, name)
VALUES
    (1, 'Cloudflare');
    
INSERT INTO
    projects (id, name, organization_id)
VALUES
    (1, 'Cloudflare', 1);
    
SELECT
    projects.name,
    organizations.name
FROM
    projects
    JOIN organizations ON projects.organization_id = organizations.id;

What this PR solves / how to test

The PR fixes #3160.

Author has addressed the following

  • Tests
    • TODO (before merge)
    • Included
    • Not necessary because:
  • E2E Tests CI Job required? (Use "e2e" label or ask maintainer to run separately)
    • I don't know
    • Required / Maybe required
    • Not required because:
  • Changeset (Changeset guidelines)
    • TODO (before merge)
    • Included
    • Not necessary because:
  • Public documentation

@kossnocorp kossnocorp requested a review from a team as a code owner May 25, 2024 07:39
Copy link

changeset-bot bot commented May 25, 2024

🦋 Changeset detected

Latest commit: 41624d9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
miniflare Minor
@cloudflare/pages-shared Patch
@cloudflare/vitest-pool-workers Patch
wrangler Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link
Contributor

github-actions bot commented May 25, 2024

A wrangler prerelease is available for testing. You can install this latest build in your project with:

npm install --save-dev https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-wrangler-5917

You can reference the automatically updated head of this PR with:

npm install --save-dev https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/prs/5917/npm-package-wrangler-5917

Or you can use npx with this latest build directly:

npx https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-wrangler-5917 dev path/to/script.js
Additional artifacts:
npx https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-create-cloudflare-5917 --no-auto-update
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-cloudflare-kv-asset-handler-5917
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-miniflare-5917
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-cloudflare-pages-shared-5917
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-cloudflare-vitest-pool-workers-5917

Note that these links will no longer work once the GitHub Actions artifact expires.


wrangler@3.57.1 includes the following runtime dependencies:

Package Constraint Resolved
miniflare workspace:* 3.20240512.0
workerd 1.20240512.0 1.20240512.0
workerd --version 1.20240512.0 2024-05-12

Please ensure constraints are pinned, and miniflare/workerd minor versions match.

This PR fixes cloudflare#3160 caused by ignoring the `resultsFormat` query that `workerd` sends.

The commits add proper handling of the query. It makes `raw` preserve columns with the same name by handling `resultsFormat=ROWS_AND_COLUMNS` and removes `results` from `run` as it sends `resultsFormat=NONE` (see https://github.com/cloudflare/workerd/blob/1d89f3b8e9cdcd898ea486656d72d9551e79f4a3/src/cloudflare/internal/d1-api.ts#L295)
@kossnocorp kossnocorp changed the title Add failing test case for #3160 Fix #3160 by properly handling resultsFormat query May 25, 2024
@kossnocorp
Copy link
Contributor Author

@geelen @petebacondarwin please take a look again. I found the root cause of the problem and fixed the issue.

I bealive it's a critical bug as it breaks people code in unexpected way and creates an impression of D1 as unreliable product that might behave differently in production.

I'm confident in the fix quality, so I think it won't create much work for you and fix an important problem for customers.

This PR fixes cloudflare#3160 caused by ignoring the `resultsFormat` query that `workerd` sends.

The commits add proper handling of the query. It makes `raw` preserve columns with the same name by handling `resultsFormat=ROWS_AND_COLUMNS` and removes `results` from `run` as it sends `resultsFormat=NONE` (see https://github.com/cloudflare/workerd/blob/1d89f3b8e9cdcd898ea486656d72d9551e79f4a3/src/cloudflare/internal/d1-api.ts#L295)
@rozenmd
Copy link
Contributor

rozenmd commented May 27, 2024

Sorry, looks like the build is failing - can you run prettier against packages/miniflare/test/plugins/d1/test.ts?

@rozenmd
Copy link
Contributor

rozenmd commented May 27, 2024

nvm found the issue, missing ;

@kossnocorp
Copy link
Contributor Author

@rozenmd sorry, I had to turn off automatic formatting as it was changing unrelated files, particularly in workerd. Do you want me try to fix formatting and force push the branch?

@rozenmd
Copy link
Contributor

rozenmd commented May 27, 2024

That's alright @kossnocorp - I can fix it up

@kossnocorp
Copy link
Contributor Author

@rozenmd thank you, I appreciate it!

@rozenmd
Copy link
Contributor

rozenmd commented May 27, 2024

Looked at this a bit more - can you remove this part?

It also adds handling of NONE which workerd sends with run and exec (hence the change in existing test cases).

We initially released this change in workerd and had to rapidly roll-forward a fix in our D1 Worker to prevent breaking compatibility with user Workers that relied on run/exec to return results. We intend to fix this properly via a compatibility_date later.

@kossnocorp
Copy link
Contributor Author

@rozenmd sure, I'm on it.

kossnocorp added a commit to kossnocorp/workers-sdk that referenced this pull request May 27, 2024
@kossnocorp
Copy link
Contributor Author

@rozenmd done!


// Check whether workerd raw test case passes here too
// Note that this test did not pass with the old binding
if (!t.context.bindings["__D1_BETA__DB"]) {
Copy link
Contributor

@rozenmd rozenmd May 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test suite is dynamic (yay) and gets run by both versions of the D1 binding.

I had to add this check to ensure that it's fixed only for new bindings (since old versions of wrangler have the old binding, and we have no control over that code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fascinating!

@rozenmd rozenmd requested a review from CarmenPopoviciu May 27, 2024 14:45
@rozenmd rozenmd changed the title Fix #3160 by properly handling resultsFormat query Partially fix #3160 by properly handling resultsFormat query May 27, 2024
@rozenmd rozenmd merged commit 64ccdd6 into cloudflare:main May 27, 2024
19 checks passed
@kossnocorp kossnocorp deleted the join-squash branch May 27, 2024 20:08
@kossnocorp
Copy link
Contributor Author

@rozenmd thank you for helping me to land this PR into the main branch 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
d1 Relating to D1
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

🐛 BUG: when trying to left join 2 tables that have same name columns - second column is not returned
5 participants