[WIP] feat: Rewrite to automation-extra, Support both Playwright and Puppeteer #303

Closed
wants to merge 159 commits into from

berstend
Owner

@berstend berstend commented Aug 31, 2020

Update: Beta versions are available for testing now


Gist

  • massive rewrite of the whole project 😅
  • adds a new automation-extra package, which is the underlying shared foundation for both playwright-extra and the new puppeteer-extra package
  • adds a new automation-extra-plugin (cleaner API, adds Playwright events, better type safety)
  • many more things, e.g. Plugins don't listen to events themselves anymore but get called from the mother ship (we had a set of issues with e.g. too many listeners that should be fixed by that)
  • much improved type safety all around
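
To illustrate the "called from the mother ship" point in the list above, here is a minimal sketch of a central dispatcher. It assumes only that the driver object exposes an `.on(event, handler)` method; the class name `PluginHost` and the hook names are hypothetical and not taken from the automation-extra code base. The idea is that one listener per event is registered no matter how many plugins are loaded, which is one plausible way to avoid piling up listeners.

```javascript
// Hypothetical sketch: PluginHost and the hook names are illustrative only,
// not the actual automation-extra implementation.
class PluginHost {
  constructor(plugins) {
    this.plugins = plugins;
  }

  // Register exactly one listener per event on the emitter,
  // regardless of how many plugins are loaded.
  attach(emitter, eventToHook) {
    for (const [eventName, hookName] of Object.entries(eventToHook)) {
      emitter.on(eventName, (...args) => this.dispatch(hookName, ...args));
    }
  }

  // Fan the event out to every plugin that implements the hook.
  dispatch(hookName, ...args) {
    for (const plugin of this.plugins) {
      if (typeof plugin[hookName] === 'function') {
        plugin[hookName](...args);
      }
    }
  }
}
```

Plugins in this sketch are plain objects with optional hook methods; a plugin that does not implement a hook is simply skipped.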

Motivation for adding playwright support: puppeteer/puppeteer#3667

Status

  • automation-extra - Finished: We have plugin support for both Playwright & Puppeteer
  • playwright-extra - Finished: It's just a small entry point to automation-extra to make things more clear & type safe
  • automation-extra-plugin - Finished for now: I want to add shims to make plugins supporting both pptr & playwright less verbose (e.g. pptr has page.evaluateOnNewDocument, playwright has context.addInitScript)
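
A shim of the kind mentioned in the last item could look roughly like the sketch below. It relies only on the two driver methods named above (Puppeteer's `page.evaluateOnNewDocument` and Playwright's `addInitScript`); the helper name `addInitScriptCompat` is made up for illustration and is not part of automation-extra-plugin.

```javascript
// Hypothetical helper, not part of automation-extra-plugin's published API.
// Registers a script that runs before any page script, on either driver.
function addInitScriptCompat(target, script, arg) {
  // Playwright: browser contexts and pages expose addInitScript().
  if (typeof target.addInitScript === 'function') {
    return target.addInitScript(script, arg);
  }
  // Puppeteer: pages expose evaluateOnNewDocument().
  if (typeof target.evaluateOnNewDocument === 'function') {
    return target.evaluateOnNewDocument(script, arg);
  }
  throw new TypeError(
    'Unsupported target: expected a Playwright context/page or a Puppeteer page'
  );
}
```

Feature detection on the method name keeps the plugin code free of explicit "which driver am I running on" branches; a plugin would call the helper and await its result on both drivers.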

I haven't switched puppeteer-extra over to use the new automation-extra code base yet, but specific tests against stealth, adblocker, recaptcha are all green 🚀

Next steps

I realized pretty late in the process of adding Playwright support that I need to keep the code backwards compatible with existing plugins: new automation-extra-plugin based plugins should work with both playwright-extra and puppeteer-extra, while puppeteer-extra keeps supporting existing puppeteer-extra-plugin based plugins. 😅

  • Replace puppeteer-extra with a small package using the new code in automation-extra
  • Make sure all tests in the whole monorepo are green after that
  • Port the first plugin over to support both Playwright & Puppeteer
  • Add/update/improve/fix documentation where needed
  • Final public beta version testing before release

Once that's done we should be able to push/publish this stuff without breaking anything. Afterwards I'll start porting other existing plugins to support both Playwright and Puppeteer (their APIs differ, so existing puppeteer plugins need to be modified to support both - this PR is about the underlying plugin framework itself).

In the process of porting the plugins I will extend automation-extra-plugin wherever it makes sense, e.g. by adding new unified events (such as addScript) or some sort of shim.

Known issues

  • I removed that crusty "data from plugins" stuff, so this breaks e.g. the puppeteer-extra-plugin-user-data-dir plugin (I'll come up with a cleaner replacement API)
  • Playwright's .launchPersistentContext() is not yet augmented with plugin functionality

@berstend berstend self-assigned this Aug 31, 2020
@XBeg9

XBeg9 commented Aug 31, 2020

Wow! That's a huge PR @berstend, thanks for the Playwright support!

@dilame

dilame commented Sep 29, 2020

I'm waiting for this PR as mom waits for her son from the army!!!

@berstend
Owner Author

> I'm waiting for this PR as mom waits for her son from the army!!!

Haha 😄 Unfortunately progress abruptly stopped due to an injury but I'm getting back to it now. :)

@opahopa

opahopa commented Feb 19, 2021

Hi!

I can see that the docs were updated to include playwright-extra usage, but the package on npm contains throw new Error("Coming soon: https://github.com/berstend/puppeteer-extra/pull/303"). Is it ready to be used for testing, or was that just a docs update?

@berstend
Owner Author

berstend commented Feb 20, 2021

@opahopa you need to install @next versions currently as mentioned here:
#303 (comment)

Docs have only been updated for the automation-extra branch, not master.

@j3lev

j3lev commented Feb 23, 2021

Thank you so much, this is amazing work! Are there any plans for porting stealth over to Playwright in the near future?

@wesley-campbell-q2

@berstend Excellent job on porting this to Playwright. I'm running into an issue when using the recaptcha plugin: if you import all the browsers with const { chromium, firefox, webkit } = require('playwright-extra'), it fails to find the executable for any of them.

@j3lev

j3lev commented Feb 24, 2021

@wesley-campbell-q2 did you install playwright alongside playwright-extra?

@wesley-campbell-q2

@j3lev I ran npm install playwright playwright-extra @extra/recaptcha.

@j3lev

j3lev commented Feb 24, 2021

@wesley-campbell-q2 Playwright just released 1.9 today and it might not work properly with this port yet. Try npm install playwright@1.8

@wesley-campbell-q2

wesley-campbell-q2 commented Feb 24, 2021

@j3lev Based on what I'm seeing, you need to install with npm i playwright@1.8 playwright-extra@next, but that still fails for me.

@lg

lg commented Feb 26, 2021

Same issue: it looks like it's downloading some versions of the browsers, axing them, and then downloading new ones.

@berstend
Owner Author

berstend commented Feb 26, 2021

Everyone willing to test the new versions:

That means:

  • Installing versions with a @next tag
    • the package READMEs are written for release, when that won't be necessary anymore
  • Please use yarn
    • the lerna monorepo canary release flow is a bit wonky and there might be issues with npm

TL;DR (😢):

yarn add playwright playwright-extra@next @extra/recaptcha@next

@berstend
Owner Author

> Thank you so much, this is amazing work! Are there any plans for porting stealth over to Playwright in the near future?

I started work on @extra/stealth but it's quite a big undertaking with a fair bit of busy work (just look at e.g. all the evasion tests that need to be modified to test both pptr & playwright, etc) :-)

@j3lev

j3lev commented Mar 12, 2021

How will playwright updates work? I'm assuming every release will be tied to a specific playwright version?

@berstend
Owner Author

berstend commented Mar 13, 2021

> How will playwright updates work? I'm assuming every release will be tied to a specific playwright version?

The same as with the existing puppeteer-extra: You install your own favorite puppeteer/playwright package version and the -extra package will use that (it's just a wrapper around those packages to add plugin lifecycle events). :-)
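
As a rough illustration of the "just a wrapper" idea: the sketch below takes whatever launcher object the user installed (for example the `puppeteer` module or Playwright's `chromium`, both of which expose `launch(options)`) and adds two lifecycle hooks around it. The function name `wrapLauncher` and the hook names `beforeLaunch`/`afterLaunch` are assumptions made for this example, not the published -extra API.

```javascript
// Hypothetical sketch: wrapLauncher and the beforeLaunch/afterLaunch hook names
// are illustrative, not the published playwright-extra/puppeteer-extra API.
function wrapLauncher(launcher, plugins = []) {
  return {
    async launch(options = {}) {
      // Let each plugin adjust the launch options before the real launch.
      let opts = options;
      for (const plugin of plugins) {
        if (typeof plugin.beforeLaunch === 'function') {
          opts = (await plugin.beforeLaunch(opts)) || opts;
        }
      }
      // Delegate to whichever driver version the user installed.
      const browser = await launcher.launch(opts);
      // Notify plugins once the browser exists.
      for (const plugin of plugins) {
        if (typeof plugin.afterLaunch === 'function') {
          await plugin.afterLaunch(browser);
        }
      }
      return browser;
    },
  };
}
```

Because the launcher is passed in rather than imported by the wrapper, the wrapper itself carries no dependency on a specific driver version.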

Regarding @extra/stealth (and speaking as a plugin framework developer):
Puppeteer has proven to be a lot easier to hack on and work with than Playwright. The latter switched to its own intermediate wire protocol instead of using/exposing CDP (Chrome DevTools Protocol) directly, which makes it harder to find solutions for the more advanced stuff that we have fixed through raw CDP so far (take for example the sourceurl evasion).

If someone has found a way to hook into the underlying CDP communication (the existing page session, not a new one) in Playwright, let me know :-)

@DrewRidley

Has the API changed for the humanize plugin? It does not appear to work and the syntax "firefox.use" throws a type error when using playwright. Maybe I am just oblivious but any pointers would be much appreciated.

@berstend
Owner Author

berstend commented Mar 13, 2021

> Has the API changed for the humanize plugin? It does not appear to work and the syntax "firefox.use" throws a type error when using playwright. Maybe I am just oblivious but any pointers would be much appreciated.

Are you following the instructions mentioned here?

The READMEs in the automation-extra branch are written from the perspective of being released, so they don't mention the current need to use a @next tag when installing the packages (for humanize: yarn add @extra/humanize@next).

If you or someone else has issues: Please mention how you installed the packages (the exact command line).

Edit: after a discussion with the user on Discord it turned out npm was the issue here. I made a more prominent note in the info post to use yarn for the temporary @next releases.

@berstend
Owner Author

I created a canonical issue with condensed info on how to install the new playwright-extra & puppeteer-extra beta versions and new plugins: #454

@berstend
Owner Author

berstend commented Apr 15, 2021

I locked this PR so subscribers only get pinged when there's official "updates" of sorts. :-)

For feedback and bug reports on the new automation-extra branch please use this ticket: #454

If you're interested in chatting about this type of stuff make sure to join our friendly community over on discord.

We're really close to finalizing the new code and the release should happen soon.

@berstend
Owner Author

berstend commented Jul 3, 2022

playwright-extra has landed in #664 😄
readme: https://github.com/berstend/puppeteer-extra/tree/master/packages/playwright-extra

I'm closing this PR/rewrite for now, it's a good reference for future additions though.

@berstend berstend closed this Jul 3, 2022
Labels

  • package: core (Affecting a core package)
  • plugin: automation-extra (AutomationExtra Plugin related)
  • plugin: puppeteer-extra (PuppeteerExtra Plugin related)
  • plugin: recaptcha 🏴 (reCAPTCHA plugin related)
  • plugin: stealth ㊙️ (Detection evasion related)