Built-in privacy plugin does not download external assets of embedded .html file #7600

Closed
4 tasks done
mrsmrynk opened this issue Oct 7, 2024 · 4 comments
Labels
resolved by config change Issue can be mitigated by the reporter

Comments

@mrsmrynk
Contributor

mrsmrynk commented Oct 7, 2024

Context

As discussed in #7596

I use an inline frame (<iframe>) to embed .html files in my documentation.
The .html files are generated by folium and download .js files from various CDNs.
The built-in privacy plugin is used for self-hosting of these external assets.

Bug description

The built-in privacy plugin does not download the external assets of the embedded .html file.

I would expect it to:

  • download the external assets to the site directory
  • replace all references with links to the downloaded copies

Related links

Reproduction

9.5.39-privacy-with-embedded-html.zip

Steps to reproduce

  1. run mkdocs serve
  2. check the .cache/plugin/privacy directory - the external assets of the embedded .html file are not downloaded
  3. check the docs/maps/bounding_box.html file - the references are not replaced with links to the downloaded copies

Browser

No response

Before submitting

@squidfunk
Owner

squidfunk commented Oct 8, 2024

Thanks for reporting. While this might not be obvious, for MkDocs to consider your HTML file as something to process, it must be listed under extra_templates. If you add the following to your mkdocs.yml:

extra_templates:
  - maps/bounding_box.html

The privacy plugin will process it and download all assets:

INFO    -  Downloading external file: https://cdn.jsdelivr.net/npm/leaflet@1.9.3/dist/leaflet.js
INFO    -  Downloading external file: https://code.jquery.com/jquery-3.7.1.min.js
INFO    -  Downloading external file: https://cdn.jsdelivr.net/npm/bootstrap@5.2.2/dist/js/bootstrap.bundle.min.js
INFO    -  Downloading external file: https://cdnjs.cloudflare.com/ajax/libs/Leaflet.awesome-markers/2.0.2/leaflet.awesome-markers.js
INFO    -  Downloading external file: https://cdn.jsdelivr.net/npm/leaflet@1.9.3/dist/leaflet.css
INFO    -  Downloading external file: https://cdn.jsdelivr.net/npm/bootstrap@5.2.2/dist/css/bootstrap.min.css
INFO    -  Downloading external file: https://netdna.bootstrapcdn.com/bootstrap/3.0.0/css/bootstrap-glyphicons.css
INFO    -  Downloading external file: https://cdn.jsdelivr.net/npm/@fortawesome/fontawesome-free@6.2.0/css/all.min.css
INFO    -  Downloading external file: https://cdnjs.cloudflare.com/ajax/libs/Leaflet.awesome-markers/2.0.2/leaflet.awesome-markers.css
INFO    -  Downloading external file: https://cdn.jsdelivr.net/gh/python-visualization/folium/folium/templates/leaflet.awesome.rotate.min.css

I think this can be considered a design flaw in MkDocs, as all HTML files located in the docs_dir should probably be automatically considered for processing by the plugin pipeline. However, I'm not the one to decide. Maybe you can run this by the maintainers of MkDocs and ask them to include all such files automatically. We could also add an exception in the privacy plugin and try to detect *.html files in the docs_dir that are not explicitly listed under extra_templates, but honestly, we're fighting MkDocs on so many fronts that I don't want to create another battlefield.
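The detection idea mentioned above could be sketched roughly as follows. This is only an illustration, not code from the privacy plugin: the helper name unlisted_html_files is hypothetical, and it assumes that extra_templates entries are paths relative to the docs_dir.

```python
from pathlib import Path


def unlisted_html_files(docs_dir, extra_templates):
    """Return docs_dir-relative paths of *.html files not in extra_templates.

    Hypothetical helper for illustration; not part of MkDocs or the plugin.
    """
    root = Path(docs_dir)
    # Normalize configured entries to forward-slash form for comparison
    listed = {Path(entry).as_posix() for entry in extra_templates}
    # Walk the docs directory recursively and collect every HTML file
    found = (path.relative_to(root).as_posix() for path in root.rglob("*.html"))
    return sorted(uri for uri in found if uri not in listed)
```

Called as unlisted_html_files("docs", []) on the reproduction project, this would be expected to report maps/bounding_box.html, which is the file that needs to be added to extra_templates.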

Note that dynamically generated asset URLs in JavaScript (= map tiles) are not downloaded - it's not possible to know what to download and replace without executing the JavaScript. Marking as resolved via configuration.
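To illustrate why this limitation exists, here is a minimal sketch of static asset discovery using only the Python standard library. It is not the privacy plugin's actual implementation; the class and function names are made up for this example, and the tile URL (tile.example.org) is a placeholder. The parser sees URLs in src and href attributes, but a URL template inside inline JavaScript is just text to it.

```python
from html.parser import HTMLParser


class ExternalAssetFinder(HTMLParser):
    """Collect external URLs from <script src> and <link href> attributes."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # Only script and link tags are inspected in this sketch
        attr = {"script": "src", "link": "href"}.get(tag)
        if attr is None:
            return
        url = dict(attrs).get(attr)
        if url and url.startswith(("http://", "https://", "//")):
            self.urls.append(url)


def find_external_assets(html):
    """Return external asset URLs that are visible without running JavaScript."""
    finder = ExternalAssetFinder()
    finder.feed(html)
    finder.close()
    return finder.urls


# The third URL lives inside JavaScript source and is never seen as an attribute
SAMPLE = """
<script src="https://cdn.jsdelivr.net/npm/leaflet@1.9.3/dist/leaflet.js"></script>
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/leaflet@1.9.3/dist/leaflet.css"/>
<script>
  L.tileLayer("https://tile.example.org/{z}/{x}/{y}.png").addTo(map);
</script>
"""

for url in find_external_assets(SAMPLE):
    print(url)
```

Only the two jsDelivr URLs are reported. The tile template would need to be expanded at runtime for each zoom level and coordinate, which is why it cannot be resolved from the markup alone.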

@squidfunk squidfunk added the resolved by config change Issue can be mitigated by the reporter label Oct 8, 2024
@kamilkrzyskow
Collaborator

My suggestion about the bug came from reading the source code:

# Find all external style sheet and script files that are provided as
# part of the build (= already known to MkDocs on startup)
for initiator in files.media_files():

The privacy plugin uses files.media_files() to find all media files detected by MkDocs.
The on_page... events process each Markdown documentation file.
So it made sense to also use files.static_pages(), as this is the "canonical" way to find HTML files.

def on_files(files, *, config):
    print("Markdown Files:")
    for file in files.documentation_pages():
        print(" ", file.src_uri)

    print("Static Pages:")
    for file in files.static_pages():
        print(" ", file.src_uri)

$ mkdocs serve
INFO    -  Building documentation...
INFO    -  Cleaning site directory
Markdown Files:
  index.md
Static Pages:
  maps/bounding_box.html

extra_templates seems to me to be a way to use Jinja2 features, such as access to the context and variables.
This use case seems different: the HTML file is an embed with external content, so it doesn't have to be a template 🤔
Perhaps it would also be necessary to skip any static_pages that are already listed in extra_templates, so they are not processed twice 🤔
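A rough sketch of that deduplication as an MkDocs hook, under a few assumptions: static_pages_to_process is a hypothetical helper name, extra_templates entries are assumed to use the same forward-slash form as file.src_uri, and the hook only prints what it would process instead of doing any real work.

```python
def static_pages_to_process(files, config):
    """Return static HTML pages that are not already listed in extra_templates."""
    # Entries in extra_templates are rendered by MkDocs as templates already
    templates = set(config["extra_templates"])
    return [file for file in files.static_pages() if file.src_uri not in templates]


def on_files(files, *, config):
    # Illustration only: report the pages instead of processing them
    for file in static_pages_to_process(files, config):
        print("Would process:", file.src_uri)
    return files
```

With maps/bounding_box.html listed under extra_templates, the helper returns an empty list for the reproduction project, so the file would only be handled once, through the template path.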

@squidfunk
Owner

squidfunk commented Oct 8, 2024

Yes, MkDocs does not consider HTML files to be media files, but templates. Honestly, I'd consider this such an edge case that it's not worth trying to fix what's not broken. Two lines of config and it works.

We might add something to the documentation, though. PR appreciated ☺️

@mrsmrynk
Contributor Author

mrsmrynk commented Oct 8, 2024

Thanks to both of you for your help ☺️
I've opened a PR with some additions to the docs.
