Allow clerk to function behind a proxy / bundle dependencies in a release #147

Closed
zackteo opened this issue May 12, 2022 · 10 comments

Comments

@zackteo

zackteo commented May 12, 2022

#30

This is related to the issue linked above. In my work environment we can download Maven dependencies through a certain system, but we cannot access URLs like "https://storage.googleapis.com/nextjournal-cas-eu".

Access to that URL seems to be required as soon as (require '[nextjournal.clerk :as clerk]) is run.

@skovuri41

Same issue; this is a blocker to using Clerk at work behind a proxy.

@ikappaki
Contributor

Hi @mk,

just to give a summary, it appears that since v0.7.416 Clerk tries to download some data (assets?) from https://storage.googleapis.com/nextjournal-cas-eu while loading, courtesy of the global var nextjournal.clerk.config/!resource->url at

(defonce !resource->url
.

Apart from it seeming inappropriate, in general, for libraries to download objects in the background without users being aware of it, this impacts apps that are behind firewalls or offline for whatever reason (no wifi, running Clerk in isolation in a container, etc.).

It is common for corporate firewalls to allow http access only to browsers, and make it either difficult or impossible for any other process to access the internet.

This means that it could be either impossible to use Clerk in such environments, or necessary to introduce a specific proxy setup just for bootstrapping Clerk. The proxy settings might be difficult to figure out and could result in saving passwords in plaintext.

Is there any reason why these "assets" can't be packaged as part of the jar to avoid connecting to the internet while bootstrapping?

This also used to work in earlier revisions without needing to download anything from the internet.

As a workaround that might be effective in some cases (when there is actually a proxy that non-browser processes can connect to), here is how one would set up the proxy in deps.edn to make slurp and friends work:

  :aliases {:dev {:jvm-opts ["-Dhttp.proxyHost=w.x.y.z" "-Dhttp.proxyPort=NNNN"
                             "-Dhttps.proxyHost=w.x.y.z" "-Dhttps.proxyPort=NNNN"]}}

https://docs.oracle.com/javase/8/docs/technotes/guides/net/proxies.html
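
To check from a REPL whether the JVM actually picked up these options, something along the following lines might help. This is only a minimal sketch: the function names proxy-settings and http-status are made up for illustration and are not part of Clerk, and it assumes the REPL was started with the alias above.

```clojure
(import '(java.net URL HttpURLConnection))

(defn proxy-settings
  "Returns the proxy-related system properties as seen by this JVM
  (the value is nil for any property that is not set)."
  []
  (into {}
        (for [k ["http.proxyHost" "http.proxyPort"
                 "https.proxyHost" "https.proxyPort"]]
          [k (System/getProperty k)])))

(defn http-status
  "Sends a HEAD request to url and returns the HTTP status code,
  or a string describing the failure if no connection could be made."
  [url]
  (try
    (let [^HttpURLConnection conn (.openConnection (URL. url))]
      (.setConnectTimeout conn 5000) ; milliseconds
      (.setReadTimeout conn 5000)
      (.setRequestMethod conn "HEAD")
      (.getResponseCode conn))
    (catch java.io.IOException e
      (str "unreachable: " (.getMessage e)))))

(comment
  ;; Evaluate these forms manually at the REPL.
  (proxy-settings)
  (http-status "https://storage.googleapis.com/nextjournal-cas-eu"))
```

Any numeric status, even an error status, suggests that the JVM can reach the host through the proxy; an "unreachable: ..." string suggests the connection is still blocked.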

Thanks

@mk
Member

mk commented Jun 17, 2022

Hi @ikappaki,

Since its initial release, Clerk hasn't bundled all assets in the jar but has loaded them in the browser on demand. What changed with v0.7.416 is that we also load a resource manifest on the JVM to look up the URLs of those assets.

> Apart from it seeming inappropriate, in general, for libraries to download objects in the background without users being aware of it, this impacts apps that are behind firewalls or offline for whatever reason (no wifi, running Clerk in isolation in a container, etc.).

Work is in progress on making Clerk work when offline in #113.

> It is common for corporate firewalls to allow http access only to browsers, and make it either difficult or impossible for any other process to access the internet.

I wasn't aware of this distinction. So you can access https in the browser just fine, but not via the JVM?

We were going for a cache-on-first-use approach rather than bundling things up in the jar in #113, because folks can also dynamically require JS libraries, like in this example, and we could support those for offline use as well.

We could consider bundling the set of commonly used libs inside the jar, at the cost of introducing a pretty different code path for when Clerk is consumed via a jar vs. a git lib. Another, smaller change to consider is to bundle the resource manifest inside Clerk and still use the browser to fetch the files. It sounds like this would work in your case, but then Clerk would still not work when offline.

Curious to hear how widespread the issue is and what other workarounds exist.

@mk
Member

mk commented Jun 17, 2022

> Another, smaller change to consider is to bundle the resource manifest inside Clerk and still use the browser to fetch the files. It sounds like this would work in your case, but then Clerk would still not work when offline.

Another option that I thought of just now: don't make any requests from the JVM, but let the browser cache assets on first use using a ServiceWorker, possibly via Workbox (which seems to power the offline functionality of many JS frameworks).

@ikappaki
Contributor

> Since its initial release, Clerk hasn't bundled all assets in the jar but has loaded them in the browser on demand. What changed with v0.7.416 is that we also load a resource manifest on the JVM to look up the URLs of those assets.

this explains it then :)

> > Apart from it seeming inappropriate, in general, for libraries to download objects in the background without users being aware of it, this impacts apps that are behind firewalls or offline for whatever reason (no wifi, running Clerk in isolation in a container, etc.).
>
> Work is in progress on making Clerk work when offline in #113.

Sounds great, though it appears as if an internet connection is required at least once at first invocation to cache the assets (this wouldn't work for example if Clerk was a dep in a test container with no internet connectivity).

> > It is common for corporate firewalls to allow http access only to browsers, and make it either difficult or impossible for any other process to access the internet.
>
> I wasn't aware of this distinction. So you can access https in the browser just fine, but not via the JVM?

Correct, I believe this is normal practice: organisations still want people behind a firewall to be able to access the web for their work (given that the browser is a very well-managed application), but will restrict everything else.

> We were going for a cache-on-first-use approach rather than bundling things up in the jar in #113, because folks can also dynamically require JS libraries, like in this example, and we could support those for offline use as well.
>
> We could consider bundling the set of commonly used libs inside the jar, at the cost of introducing a pretty different code path for when Clerk is consumed via a jar vs. a git lib. Another, smaller change to consider is to bundle the resource manifest inside Clerk and still use the browser to fetch the files. It sounds like this would work in your case, but then Clerk would still not work when offline.

It appears to me that the ideal solution would be to bundle the standard assets with the jar so that Clerk works out of the box without any need for connectivity (as with any other library whose design does not require an internet connection to load, I guess), and let any dynamically loaded assets require internet connectivity to function (this seems to be a design requirement).

Yes, I believe bundling the manifest with the jar (as opposed to requesting it at load time) will solve the majority of proxy issues for users, since they are most likely to have access to a browser with Internet connectivity.

> Curious to hear how widespread the issue is and what other workarounds exist.

Is there perhaps a possibility of packaging the assets in a separate package, so that people who wish to can include them as a dependency on demand?

Why are the assets not bundled with Clerk to begin with? Was it perhaps a consideration of asset size?

Thanks for the quick reply!

@ikappaki
Contributor

> Another option that I thought of just now: don't make any requests from the JVM, but let the browser cache assets on first use using a ServiceWorker, possibly via Workbox (which seems to power the offline functionality of many JS frameworks).

Yes, this sounds like a good idea too for getting around proxy restrictions.

@ikappaki
Contributor

Hi @zackteo, @shyamkovuri3

Could you please advise whether you can access the following URL from your web browser at work? In my discussion above I assumed that people can generally access the Internet from behind the firewall via their browser, and that the problem only manifests when something tries to download by other means (i.e. when Clerk tries to make a direct connection to the internet from the JVM). Or is it that you can't connect to Google storage even from the browser you normally use to browse the web?

https://storage.googleapis.com/nextjournal-cas-eu

Thanks

@borkdude
Collaborator

borkdude commented Jun 20, 2022

This issue should now be solved with clerk {:mvn/version "0.8.470"}. Note that we still load from the internet when clerk is used as a git dep, but when used as a mvn dep, the problem should no longer happen. Please let us know.
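
For reference, a deps.edn entry for the Maven release might look like the fragment below. The version string is the one given above; the full coordinate io.github.nextjournal/clerk is an assumption taken from Clerk's README rather than from this thread.

```clojure
;; deps.edn (fragment); the coordinate is assumed from Clerk's README
{:deps {io.github.nextjournal/clerk {:mvn/version "0.8.470"}}}
```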

@ikappaki
Contributor

Thanks @borkdude. I can confirm it works now with 0.8.470.

@skovuri41

Thanks @borkdude and Clerk contributors.

5 participants