[BUG] Chrome downloads fail despite `acceptDownloads` set #1777

rwoll · 2020-04-14T17:48:18Z

Context:

Playwright Version: 0.13.0
Operating System: macOS
Extra: node v13.12.0

Code Snippet

const playwright = require('playwright');

(async () => {
  for (const browserType of ['chromium']) {
    const browser = await playwright[browserType].launch();
    // [RAW] NB: Ensure acceptDownloads is set per https://github.com/microsoft/playwright/blob/bb0b6cd90a9f95c2337ddfcafb516c48318e4c43/docs/api.md#class-download
    const context = await browser.newContext({ acceptDownloads: true });
    const page = await context.newPage();

    page.on("download", (dl) => {
      console.log("Captured download:", dl.url());
      dl.path().then(console.log).catch(console.error);
    });

    await page.goto("https://upbeat-nightingale-6e32da.netlify.com/download");
    await page.click("text=Download");
    await page.waitFor(4_000);
    await browser.close();
  }
})();

Output:

$ node index.js
Captured download: https://upbeat-nightingale-6e32da.netlify.com/Archive.zip
Error: Pass { acceptDownloads: true } when you are creating your browser context.
    at Download.path (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/download.js:41:19)
    at Download.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/helper.js:87:31)
    at Page.<anonymous> (/Users/workspace/playwright-downloads/index.js:12:10)
    at Page.emit (events.js:315:20)
    at new Download (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/download.js:32:20)
    at CRBrowser._downloadCreated (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/browser.js:37:26)
    at FrameSession._onDownloadWillBegin (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:535:47)
    at CRSession.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:275:100)
    at CRSession.emit (events.js:315:20)
    at /Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crConnection.js:137:47
  -- ASYNC --
    at Download.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/helper.js:66:23)
    at Page.<anonymous> (/Users/workspace/playwright-downloads/index.js:12:10)
    at Page.emit (events.js:315:20)
    at new Download (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/download.js:32:20)
    at CRBrowser._downloadCreated (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/browser.js:37:26)
    at FrameSession._onDownloadWillBegin (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:535:47)
    at CRSession.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:275:100)
    at CRSession.emit (events.js:315:20)
    at /Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crConnection.js:137:47
    at runNextTicks (internal/process/task_queues.js:62:5)

Describe the bug

Despite passing { acceptDownloads: true } to the browser context (per the docs and error message), downloading fails complaining that acceptDownloads is not set.

Expected Behavior: Archive.zip is downloaded.

Misc.

Thanks for this amazing project! It's awesome to see the rapid feature development and progress! 🎉

The text was updated successfully, but these errors were encountered:

rwoll · 2020-04-14T18:35:26Z

Looking at this further, what I though might be a bug might actually just be a question since it actually looks like I'm using dl captured in page.on('downloads, …)` incorrectly (although the error message from the original bug description is confusing).

Could you clarify (either here) or add an example in the documentation of getting the download path from within the page.on('download', …) hanlder?

If I run the following example which more closely follows the docs:

const playwright = require('playwright');

(async () => {
  for (const browserType of ['chromium']) {
    const browser = await playwright[browserType].launch();
    // [RAW] NB: Ensure acceptDownloads is set per https://github.com/microsoft/playwright/blob/bb0b6cd90a9f95c2337ddfcafb516c48318e4c43/docs/api.md#class-download
    const context = await browser.newContext({acceptDownloads: true });
    const page = await context.newPage();


    await page.goto("https://upbeat-nightingale-6e32da.netlify.com/download");
    const [ download ] = await Promise.all([
      page.waitForEvent('download'), // wait for download to start
      page.click("text=Download"),
    ]);

    const path = await download.path();
    console.log("Download Path:", path);

    await page.waitFor(4_000);
    await browser.close();
  }
})();

The download path is printed and no error is trigger.

What's the difference between the download object in page.on('download', …) and page.waitForEvent('download)? Intuitively, they seem like they are tied to the same event (download) but the availability of the dl.path() differs.

If I change the original code to:

    page.on("download", (dl) => {
      page.waitFor(2_000).then(() => dl.path().then(console.log).catch(console.error));
    });

The path does get printed; however, this seems like a bad practice since it waits for arbitrary instead of an event (but if I wait for download in their, I'll get a Target Detached error).

Thanks!

pavelfeldman · 2020-04-14T21:32:28Z

In your original snippet, page.on("download", (dl) => {}) did not wait for download to actually finish. Log was chained to the path promise, but that promise did not have a chance to fulfill. The main program went running, was waiting 4 seconds and exited, regardless of whether the archive was downloaded.

You should make sure control flow waits for the download to happen. The Promise.all version takes care of it since you explicitly wait for download to happen in const path = await download.path(); line, before you close the browser. You don't need the page.waitFor(4000) there - you properly await for the download.

Please feel free to reopen if something was missing in my response!

rwoll · 2020-04-14T22:08:51Z

Thanks for your quick response! Could you clarify how you can wait for download to actually finish within the page.on handler?

While the Promise.all(…) variant works under some circumstances, it implies you know there is going to be a download generated from the click where as the page.on variant allows you to passively observe the downloads without needing to know about the click.

In your original snippet, page.on("download", (dl) => {}) did not wait for download to actually finish.

Is it possible to wait for the download to actually finish from with the page.on("download", (dl) => {})?

In my original snippet:

    page.on("download", (dl) => {
      console.log("Captured download:", dl.url());
      dl.path().then(console.log).catch(console.error);
    });

I assumed the .then(console.log) would implicitly wait for dl.path() to resolve; however, it immediately throws an error stating acceptDownloads is not set (even though it is).

I'm assuming the following are equivalent:

// NON async context
dl.path().then(console.log)

// async context
const path = await dl.path();
console.log(path);

SeekDaSky · 2020-04-22T09:35:58Z

I am facing the same issue, I can't wrap my head around why playwright asks for acceptDownloads : true while it's already set

my snippet is:

return new Promise((accept,reject) => {
    this.page.once('download', download => {
        download.path().then((file) => {
            let rawJSON = readFileSync(file,{encoding: 'utf8'})
            accept(JSON.parse(rawJSON))
        }).catch(e => reject(e))
     })

    //click on a button that triggers the download
})

I tried both non async and async context. This behavior is not limited to chromium, firefox is also not able to download the file, but it does not fire the "download" event at all, it instead displays the message " could not be saved, because the source file could not be read.".

It might be helping to know that I'm trying to download a Blob with a URL generated by URL.createObjectURL. The download is working if I manually use my browser and I passed acceptDownloads: true when creating the context.

yury-s · 2020-04-22T16:51:14Z

@SeekDaSky

my snippet is:

Can you paste complete snippet including the code where you create the context and the page?

It might be helping to know that I'm trying to download a Blob with a URL generated by URL.createObjectURL.

A code snippet would be helpful here as well. Can it be that the blob gets garbage collected by the browser before you start downloading it?

SeekDaSky · 2020-04-22T16:56:16Z

I am going to fiddle with this possible bug tomorrow and open a new issue if I find out it is a bug related to Blobs. I'll make sure to include a reproducer.

rwoll · 2020-04-22T20:07:11Z

@pavelfeldman / @yury-s: It doesn't look like I have permission to re-open this issue, but I think I figured out what is happening (at least for the originally reported issue). It looks to be that an Events.Page.Download is emitted (with reference to the Download) that hasn't quite fully initialized.

I've included a patch at the end. If you don't see any red flags, let me know and I will contribute an MR (including some additional tests).

Here's the current [Download constructor] (https://github.com/microsoft/playwright/blob/f5942295d413af10c552ce3583f5ec7ad70fbe7f/src/download.ts) from playwright:

  constructor(page: Page, downloadsPath: string, uuid: string, url: string) {
    this._page = page;
    this._downloadsPath = downloadsPath;
    this._uuid = uuid;
    this._url = url;
    this._finishedCallback = () => {};
    this._finishedPromise = new Promise(f => this._finishedCallback = f);
    for (const barrier of this._page._frameManager._signalBarriers)
      barrier.addDownload();
    this._page.emit(Events.Page.Download, this);
    page._browserContext._downloads.add(this);
    this._acceptDownloads = !!this._page._browserContext._options.acceptDownloads;
  }

Notice how the event is emitted

this._page.emit(Events.Page.Download, this);

before this._acceptDownloads is set:

this._acceptDownloads = !!this._page._browserContext._options.acceptDownloads;

.

If I move the event emission to the end of the constructor, dl.path() behaves correctly in path
page.on('download', …) handlers and page.waitForEvent('download') strategy.

RAW Debugging Notes

Start with playwright@0.13.0.

The following doesn't work and you'll get a misleading error message ((node:2027) UnhandledPromiseRejectionWarning: Error: Pass { acceptDownloads: true } when you are creating your browser context.) despite setting acceptDownloads to true:

(async () => {
  for (const browserType of ['chromium']) {
    const browser = await playwright[browserType].launch();
    // [RAW] NB: Ensure acceptDownloads is set per https://github.com/microsoft/playwright/blob/bb0b6cd90a9f95c2337ddfcafb516c48318e4c43/docs/api.md#class-download
    const context = await browser.newContext({acceptDownloads: true });
    const page = await context.newPage();


    page.on("download", async (dl) => {
      const path = await dl.path();
      console.log(path);
    });

    await page.goto("https://upbeat-nightingale-6e32da.netlify.com/download");
    await page.click("text=Download");
    // Do some other processing…
    await page.waitFor(10_000);
    await browser.close();
  }
})();

But if you add an "internal" guard:

diff --git a/index.js b/index2.js
index 75b3e8a..f48d450 100644
--- a/index.js
+++ b/index2.js
@@ -9,6 +9,7 @@ const playwright = require('playwright');
 
 
     page.on("download", async (dl) => {
+      await new Promise(res => setTimeout(res, 200));
       const path = await dl.path();
       console.log(path);
     });

everything works and you'll get the path printed.

Now, that's not very robust, let's wait on the internal download promise:

diff --git a/index2.js b/index3.js
index f48d450..dbb1bba 100644
--- a/index2.js
+++ b/index3.js
@@ -9,7 +9,7 @@ const playwright = require('playwright');
 
 
     page.on("download", async (dl) => {
-      await new Promise(res => setTimeout(res, 200));
+      await this._downloadFinished;
       const path = await dl.path();
       console.log(path);
     });

This also works.

This then lead into me diving into the Download implementation and discovering the "race" between the event emission vs. acceptDownloads being set.

If we patch playwright:

diff --git a/src/download.ts b/src/download.ts
index 200bd75..26047a4 100644
--- a/src/download.ts
+++ b/src/download.ts
@@ -41,9 +41,9 @@ export class Download {
     this._finishedPromise = new Promise(f => this._finishedCallback = f);
     for (const barrier of this._page._frameManager._signalBarriers)
       barrier.addDownload();
-    this._page.emit(Events.Page.Download, this);
     page._browserContext._downloads.add(this);
     this._acceptDownloads = !!this._page._browserContext._options.acceptDownloads;
+    this._page.emit(Events.Page.Download, this);
   }
 
   url(): string {

We don't need the random sleep or the waiting on this._downloadFinished. dl.path().then(…) just works. 🎉

rwoll · 2020-04-22T20:44:49Z

@SeekDaSky I just confirmed that what I said in #1777 (comment) also applies to the following Blob example as well:

<!DOCTYPE html>
<html>
  <head>
    <title>SeekDaSky Blob Example</title>
  </head>
  <body>
    <script>
      const download = (data, filename) => {
      const a = document.createElement("a");
      a.style = "display: none";
      document.body.appendChild(a);
      a.style = "display: none";

      const json = JSON.stringify(data);
      const blob = new Blob([json], { type: "octet/stream" });
      const url = window.URL.createObjectURL(blob);
      a.href = url;
      a.download = filename;
      a.click();
      window.URL.revokeObjectURL(url);
      document.body.removeChild(a);
    };

    const downloadIt = () => {
      download({ foo: 1, bar: 2}, "example.json");
    }
    </script>
    <a onclick="javascipt:downloadIt();">Download</a>
  </body>
</html>

In each of the examples, you can sub the target url for https://c7ecc89a4366bf00.netlify.app/seekdasky-example (which hosts the above file) and the same conclusions apply.

…et (microsoft#1777)

This patch fixes `download.path()` throwing an error that `acceptDownloads` is not set when using a download within a `page.on` handler even when `acceptDownloads` had been set to `true`; Fixes microsoft#1777

SeekDaSky · 2020-04-23T08:19:10Z

Wow thank you very much @rwoll for the in-depth debugging and quick fix, this fixed my issues with chromium but firefox is still unable to download Blob (but it is able to download files). I opened a separate issue for this (#1936 )

rwoll changed the title ~~[BUG]~~ [BUG] Chrome downloads fail despite acceptDownloads set Apr 14, 2020

pavelfeldman closed this as completed Apr 14, 2020

rwoll added a commit to rwoll/playwright that referenced this issue Apr 22, 2020

fix(downloads): ensure download.path available when acceptDownloads s…

ef009fe

…et (microsoft#1777)

rwoll mentioned this issue Apr 22, 2020

fix(downloads): fix acceptDownloads complaint (#1777) #1923

Merged

pavelfeldman pushed a commit that referenced this issue Apr 23, 2020

fix(downloads): fix acceptDownloads complaint (#1777) (#1923)

53c78a8

SeekDaSky mentioned this issue Apr 23, 2020

[BUG] Firefox cannot download Blobs #1936

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG] Chrome downloads fail despite `acceptDownloads` set #1777

[BUG] Chrome downloads fail despite `acceptDownloads` set #1777

rwoll commented Apr 14, 2020 •

edited

Loading

rwoll commented Apr 14, 2020 •

edited

Loading

pavelfeldman commented Apr 14, 2020 •

edited

Loading

rwoll commented Apr 14, 2020 •

edited

Loading

SeekDaSky commented Apr 22, 2020 •

edited

Loading

yury-s commented Apr 22, 2020 •

edited

Loading

SeekDaSky commented Apr 22, 2020

rwoll commented Apr 22, 2020 •

edited

Loading

rwoll commented Apr 22, 2020

SeekDaSky commented Apr 23, 2020

[BUG] Chrome downloads fail despite acceptDownloads set #1777

[BUG] Chrome downloads fail despite acceptDownloads set #1777

Comments

rwoll commented Apr 14, 2020 • edited Loading

rwoll commented Apr 14, 2020 • edited Loading

pavelfeldman commented Apr 14, 2020 • edited Loading

rwoll commented Apr 14, 2020 • edited Loading

SeekDaSky commented Apr 22, 2020 • edited Loading

yury-s commented Apr 22, 2020 • edited Loading

SeekDaSky commented Apr 22, 2020

rwoll commented Apr 22, 2020 • edited Loading

rwoll commented Apr 22, 2020

SeekDaSky commented Apr 23, 2020

[BUG] Chrome downloads fail despite `acceptDownloads` set #1777

[BUG] Chrome downloads fail despite `acceptDownloads` set #1777

rwoll commented Apr 14, 2020 •

edited

Loading

rwoll commented Apr 14, 2020 •

edited

Loading

pavelfeldman commented Apr 14, 2020 •

edited

Loading

rwoll commented Apr 14, 2020 •

edited

Loading

SeekDaSky commented Apr 22, 2020 •

edited

Loading

yury-s commented Apr 22, 2020 •

edited

Loading

rwoll commented Apr 22, 2020 •

edited

Loading