Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Chrome downloads fail despite acceptDownloads set #1777

Closed
rwoll opened this issue Apr 14, 2020 · 9 comments · Fixed by #1923
Closed

[BUG] Chrome downloads fail despite acceptDownloads set #1777

rwoll opened this issue Apr 14, 2020 · 9 comments · Fixed by #1923

Comments

@rwoll
Copy link
Member

rwoll commented Apr 14, 2020

Context:

  • Playwright Version: 0.13.0
  • Operating System: macOS
  • Extra: node v13.12.0

Code Snippet

const playwright = require('playwright');

(async () => {
  for (const browserType of ['chromium']) {
    const browser = await playwright[browserType].launch();
    // [RAW] NB: Ensure acceptDownloads is set per https://github.com/microsoft/playwright/blob/bb0b6cd90a9f95c2337ddfcafb516c48318e4c43/docs/api.md#class-download
    const context = await browser.newContext({ acceptDownloads: true });
    const page = await context.newPage();

    page.on("download", (dl) => {
      console.log("Captured download:", dl.url());
      dl.path().then(console.log).catch(console.error);
    });

    await page.goto("https://upbeat-nightingale-6e32da.netlify.com/download");
    await page.click("text=Download");
    await page.waitFor(4_000);
    await browser.close();
  }
})();

Output:

$ node index.js
Captured download: https://upbeat-nightingale-6e32da.netlify.com/Archive.zip
Error: Pass { acceptDownloads: true } when you are creating your browser context.
    at Download.path (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/download.js:41:19)
    at Download.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/helper.js:87:31)
    at Page.<anonymous> (/Users/workspace/playwright-downloads/index.js:12:10)
    at Page.emit (events.js:315:20)
    at new Download (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/download.js:32:20)
    at CRBrowser._downloadCreated (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/browser.js:37:26)
    at FrameSession._onDownloadWillBegin (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:535:47)
    at CRSession.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:275:100)
    at CRSession.emit (events.js:315:20)
    at /Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crConnection.js:137:47
  -- ASYNC --
    at Download.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/helper.js:66:23)
    at Page.<anonymous> (/Users/workspace/playwright-downloads/index.js:12:10)
    at Page.emit (events.js:315:20)
    at new Download (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/download.js:32:20)
    at CRBrowser._downloadCreated (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/browser.js:37:26)
    at FrameSession._onDownloadWillBegin (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:535:47)
    at CRSession.<anonymous> (/Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crPage.js:275:100)
    at CRSession.emit (events.js:315:20)
    at /Users/workspace/playwright-downloads/node_modules/playwright-core/lib/chromium/crConnection.js:137:47
    at runNextTicks (internal/process/task_queues.js:62:5)

Describe the bug

Despite passing { acceptDownloads: true } to the browser context (per the docs and error message), downloading fails complaining that acceptDownloads is not set.

Expected Behavior: Archive.zip is downloaded.

Misc.

Thanks for this amazing project! It's awesome to see the rapid feature development and progress! 🎉

@rwoll rwoll changed the title [BUG] [BUG] Chrome downloads fail despite acceptDownloads set Apr 14, 2020
@rwoll
Copy link
Member Author

rwoll commented Apr 14, 2020

Looking at this further, what I though might be a bug might actually just be a question since it actually looks like I'm using dl captured in page.on('downloads, …)` incorrectly (although the error message from the original bug description is confusing).

Could you clarify (either here) or add an example in the documentation of getting the download path from within the page.on('download', …) hanlder?


If I run the following example which more closely follows the docs:

const playwright = require('playwright');

(async () => {
  for (const browserType of ['chromium']) {
    const browser = await playwright[browserType].launch();
    // [RAW] NB: Ensure acceptDownloads is set per https://github.com/microsoft/playwright/blob/bb0b6cd90a9f95c2337ddfcafb516c48318e4c43/docs/api.md#class-download
    const context = await browser.newContext({acceptDownloads: true });
    const page = await context.newPage();


    await page.goto("https://upbeat-nightingale-6e32da.netlify.com/download");
    const [ download ] = await Promise.all([
      page.waitForEvent('download'), // wait for download to start
      page.click("text=Download"),
    ]);

    const path = await download.path();
    console.log("Download Path:", path);

    await page.waitFor(4_000);
    await browser.close();
  }
})();

The download path is printed and no error is trigger.

What's the difference between the download object in page.on('download', …) and page.waitForEvent('download)? Intuitively, they seem like they are tied to the same event (download) but the availability of the dl.path() differs.

If I change the original code to:

    page.on("download", (dl) => {
      page.waitFor(2_000).then(() => dl.path().then(console.log).catch(console.error));
    });

The path does get printed; however, this seems like a bad practice since it waits for arbitrary instead of an event (but if I wait for download in their, I'll get a Target Detached error).

Thanks!

@pavelfeldman
Copy link
Member

pavelfeldman commented Apr 14, 2020

In your original snippet, page.on("download", (dl) => {}) did not wait for download to actually finish. Log was chained to the path promise, but that promise did not have a chance to fulfill. The main program went running, was waiting 4 seconds and exited, regardless of whether the archive was downloaded.

You should make sure control flow waits for the download to happen. The Promise.all version takes care of it since you explicitly wait for download to happen in const path = await download.path(); line, before you close the browser. You don't need the page.waitFor(4000) there - you properly await for the download.

Please feel free to reopen if something was missing in my response!

@rwoll
Copy link
Member Author

rwoll commented Apr 14, 2020

Thanks for your quick response! Could you clarify how you can wait for download to actually finish within the page.on handler?

While the Promise.all(…) variant works under some circumstances, it implies you know there is going to be a download generated from the click where as the page.on variant allows you to passively observe the downloads without needing to know about the click.


In your original snippet, page.on("download", (dl) => {}) did not wait for download to actually finish.

Is it possible to wait for the download to actually finish from with the page.on("download", (dl) => {})?

In my original snippet:

    page.on("download", (dl) => {
      console.log("Captured download:", dl.url());
      dl.path().then(console.log).catch(console.error);
    });

I assumed the .then(console.log) would implicitly wait for dl.path() to resolve; however, it immediately throws an error stating acceptDownloads is not set (even though it is).

I'm assuming the following are equivalent:

// NON async context
dl.path().then(console.log)
// async context
const path = await dl.path();
console.log(path);

@SeekDaSky
Copy link

SeekDaSky commented Apr 22, 2020

I am facing the same issue, I can't wrap my head around why playwright asks for acceptDownloads : true while it's already set

my snippet is:

return new Promise((accept,reject) => {
    this.page.once('download', download => {
        download.path().then((file) => {
            let rawJSON = readFileSync(file,{encoding: 'utf8'})
            accept(JSON.parse(rawJSON))
        }).catch(e => reject(e))
     })

    //click on a button that triggers the download
})

I tried both non async and async context. This behavior is not limited to chromium, firefox is also not able to download the file, but it does not fire the "download" event at all, it instead displays the message " could not be saved, because the source file could not be read.".

It might be helping to know that I'm trying to download a Blob with a URL generated by URL.createObjectURL. The download is working if I manually use my browser and I passed acceptDownloads: true when creating the context.

@yury-s
Copy link
Member

yury-s commented Apr 22, 2020

@SeekDaSky

my snippet is:

Can you paste complete snippet including the code where you create the context and the page?

It might be helping to know that I'm trying to download a Blob with a URL generated by URL.createObjectURL.

A code snippet would be helpful here as well. Can it be that the blob gets garbage collected by the browser before you start downloading it?

@SeekDaSky
Copy link

I am going to fiddle with this possible bug tomorrow and open a new issue if I find out it is a bug related to Blobs. I'll make sure to include a reproducer.

@rwoll
Copy link
Member Author

rwoll commented Apr 22, 2020

@pavelfeldman / @yury-s: It doesn't look like I have permission to re-open this issue, but I think I figured out what is happening (at least for the originally reported issue). It looks to be that an Events.Page.Download is emitted (with reference to the Download) that hasn't quite fully initialized.

I've included a patch at the end. If you don't see any red flags, let me know and I will contribute an MR (including some additional tests).

Here's the current [Download constructor] (https://github.com/microsoft/playwright/blob/f5942295d413af10c552ce3583f5ec7ad70fbe7f/src/download.ts) from playwright:

  constructor(page: Page, downloadsPath: string, uuid: string, url: string) {
    this._page = page;
    this._downloadsPath = downloadsPath;
    this._uuid = uuid;
    this._url = url;
    this._finishedCallback = () => {};
    this._finishedPromise = new Promise(f => this._finishedCallback = f);
    for (const barrier of this._page._frameManager._signalBarriers)
      barrier.addDownload();
    this._page.emit(Events.Page.Download, this);
    page._browserContext._downloads.add(this);
    this._acceptDownloads = !!this._page._browserContext._options.acceptDownloads;
  }

Notice how the event is emitted

this._page.emit(Events.Page.Download, this);

before this._acceptDownloads is set:

this._acceptDownloads = !!this._page._browserContext._options.acceptDownloads;

.

If I move the event emission to the end of the constructor, dl.path() behaves correctly in path
page.on('download', …) handlers and page.waitForEvent('download') strategy.


RAW Debugging Notes

Start with playwright@0.13.0.

The following doesn't work and you'll get a misleading error message ((node:2027) UnhandledPromiseRejectionWarning: Error: Pass { acceptDownloads: true } when you are creating your browser context.) despite setting acceptDownloads to true:

(async () => {
  for (const browserType of ['chromium']) {
    const browser = await playwright[browserType].launch();
    // [RAW] NB: Ensure acceptDownloads is set per https://github.com/microsoft/playwright/blob/bb0b6cd90a9f95c2337ddfcafb516c48318e4c43/docs/api.md#class-download
    const context = await browser.newContext({acceptDownloads: true });
    const page = await context.newPage();


    page.on("download", async (dl) => {
      const path = await dl.path();
      console.log(path);
    });

    await page.goto("https://upbeat-nightingale-6e32da.netlify.com/download");
    await page.click("text=Download");
    // Do some other processing…
    await page.waitFor(10_000);
    await browser.close();
  }
})();

But if you add an "internal" guard:

diff --git a/index.js b/index2.js
index 75b3e8a..f48d450 100644
--- a/index.js
+++ b/index2.js
@@ -9,6 +9,7 @@ const playwright = require('playwright');
 
 
     page.on("download", async (dl) => {
+      await new Promise(res => setTimeout(res, 200));
       const path = await dl.path();
       console.log(path);
     });

everything works and you'll get the path printed.

Now, that's not very robust, let's wait on the internal download promise:

diff --git a/index2.js b/index3.js
index f48d450..dbb1bba 100644
--- a/index2.js
+++ b/index3.js
@@ -9,7 +9,7 @@ const playwright = require('playwright');
 
 
     page.on("download", async (dl) => {
-      await new Promise(res => setTimeout(res, 200));
+      await this._downloadFinished;
       const path = await dl.path();
       console.log(path);
     });

This also works.

This then lead into me diving into the Download implementation and discovering the "race" between the event emission vs. acceptDownloads being set.

If we patch playwright:

diff --git a/src/download.ts b/src/download.ts
index 200bd75..26047a4 100644
--- a/src/download.ts
+++ b/src/download.ts
@@ -41,9 +41,9 @@ export class Download {
     this._finishedPromise = new Promise(f => this._finishedCallback = f);
     for (const barrier of this._page._frameManager._signalBarriers)
       barrier.addDownload();
-    this._page.emit(Events.Page.Download, this);
     page._browserContext._downloads.add(this);
     this._acceptDownloads = !!this._page._browserContext._options.acceptDownloads;
+    this._page.emit(Events.Page.Download, this);
   }
 
   url(): string {

We don't need the random sleep or the waiting on this._downloadFinished. dl.path().then(…) just works. 🎉

@rwoll
Copy link
Member Author

rwoll commented Apr 22, 2020

@SeekDaSky I just confirmed that what I said in #1777 (comment) also applies to the following Blob example as well:

<!DOCTYPE html>
<html>
  <head>
    <title>SeekDaSky Blob Example</title>
  </head>
  <body>
    <script>
      const download = (data, filename) => {
      const a = document.createElement("a");
      a.style = "display: none";
      document.body.appendChild(a);
      a.style = "display: none";

      const json = JSON.stringify(data);
      const blob = new Blob([json], { type: "octet/stream" });
      const url = window.URL.createObjectURL(blob);
      a.href = url;
      a.download = filename;
      a.click();
      window.URL.revokeObjectURL(url);
      document.body.removeChild(a);
    };

    const downloadIt = () => {
      download({ foo: 1, bar: 2}, "example.json");
    }
    </script>
    <a onclick="javascipt:downloadIt();">Download</a>
  </body>
</html>

In each of the examples, you can sub the target url for https://c7ecc89a4366bf00.netlify.app/seekdasky-example (which hosts the above file) and the same conclusions apply.

rwoll added a commit to rwoll/playwright that referenced this issue Apr 22, 2020
rwoll added a commit to rwoll/playwright that referenced this issue Apr 22, 2020
This patch fixes `download.path()` throwing an error that
`acceptDownloads` is not set when using a download within a `page.on`
handler even when `acceptDownloads` had been set to `true`;

Fixes microsoft#1777
rwoll added a commit to rwoll/playwright that referenced this issue Apr 22, 2020
This patch fixes `download.path()` throwing an error that
`acceptDownloads` is not set when using a download within a `page.on`
handler even when `acceptDownloads` had been set to `true`;

Fixes microsoft#1777
@SeekDaSky
Copy link

Wow thank you very much @rwoll for the in-depth debugging and quick fix, this fixed my issues with chromium but firefox is still unable to download Blob (but it is able to download files). I opened a separate issue for this (#1936 )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants