Optionally disable content security policies #114
Labels
enhancement
New feature or request
Comments
sesh added a commit to sesh/shot-scraper that referenced this issue (Jul 30, 2023)
I just ran into this today while testing Simon's TIL about running axe-core with shot-scraper. I've taken @jamesking's suggestion above and implemented it in a PR. The |
This is a really smart feature request, and #116 looks like a good implementation.
simonw added a commit that referenced this issue (Nov 1, 2023)
simonw added a commit that referenced this issue (Nov 1, 2023)
Problem
I have been following this TIL to run `Readability.js` on a page with Shot Scraper: https://til.simonwillison.net/shot-scraper/readability
This worked fine for pages with liberal content security policies; however, when I tried to scrape a page with a stronger CSP, I ran into this error:
A strong CSP like this limits Shot Scraper's ability to run JavaScript on a page before processing it.
Suggestion
The Playwright Python tools have an optional `bypass_csp` argument that can be passed to the `new_context` method.
As a test I monkey-patched `shot_scraper/cli.py` with the following:
And now the `Readability.js` script executes without a problem. :)
It would be really useful to give Shot Scraper a CLI argument like `--bypass-csp` that would then optionally add this argument in Playwright and allow more flexibility to run JavaScript on pages like this.
Thank you for a great tool!
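For anyone wanting to try the idea outside of Shot Scraper, here is a minimal sketch of how an opt-in `--bypass-csp` flag could be wired through to Playwright's `new_context(bypass_csp=True)`. It is not the patch from this issue or the code in the linked PR: the helper names `parse_args`, `context_options`, and `run_javascript` are hypothetical, and it uses `argparse` purely for illustration rather than Shot Scraper's actual CLI code.

```python
import argparse
from typing import Any, Dict, List


def parse_args(argv: List[str]) -> argparse.Namespace:
    """Parse a hypothetical CLI with an opt-in --bypass-csp flag."""
    parser = argparse.ArgumentParser(description="Run JavaScript against a page")
    parser.add_argument("url")
    parser.add_argument("javascript")
    parser.add_argument(
        "--bypass-csp",
        action="store_true",
        help="Ask Playwright to bypass the page's Content-Security-Policy",
    )
    return parser.parse_args(argv)


def context_options(bypass_csp: bool = False) -> Dict[str, Any]:
    """Build keyword arguments for Playwright's browser.new_context()."""
    options: Dict[str, Any] = {}
    if bypass_csp:
        # bypass_csp is an existing new_context() option in Playwright for Python
        options["bypass_csp"] = True
    return options


def run_javascript(url: str, javascript: str, bypass_csp: bool = False) -> Any:
    """Load url, evaluate javascript in the page, and return the result."""
    # Imported lazily so the helpers above work without Playwright installed
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        context = browser.new_context(**context_options(bypass_csp))
        page = context.new_page()
        page.goto(url)
        result = page.evaluate(javascript)
        browser.close()
        return result
```

Keeping the flag off by default means pages are loaded with their CSP intact unless the user explicitly asks otherwise, which matches the "optionally" in the request.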