
Relocatable local website copy #15054

Closed


dradetsky

Similar to #15046, but implemented by replacing the raw links with helpers: since all links are now wrapped, we can just set links & assets to relative.
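
Roughly speaking, the "relative" part boils down to something like the following Middleman settings in website/config.rb (treat this as a sketch of the idea rather than the exact diff in this PR):

set :relative_links, true      # link_to/url_for emit relative URLs in the build
activate :relative_assets      # stylesheet/javascript/image paths become relative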

@dradetsky
Author

missed a few links before

@sethvargo
Contributor

Hi @dradetsky

Thank you for the PR. I'd like to understand a bit more about the requirement here, and I have a few concerns:

  1. This increases the barrier to entry for contributing to the website. In addition to knowing HTML and markdown, contributors also need to understand ERB for the links.
  2. The build is significantly slower, which is a problem for Terraform's already slow build. Middleman's link_to helper is rather inefficient...

For running the docs locally or internally, I usually recommend either:

  1. Run middleman server or make website, which will run on localhost:4567, but could also be bound to 0.0.0.0:80 in a "production-like" environment.
  2. Run make build to generate the HTML, and then throw it as the webroot in apache or nginx. Both of those should be able to handle the relative links.
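
To make those concrete, something along these lines should work; the port, paths, and config file location below are just examples, not anything specific to Terraform's setup:

# option 1: live preview server, defaults to http://localhost:4567
make website

# option 2: static build, then point a web server at the generated HTML
# (Middleman writes it to build/ under the website source by default)
make build

# a minimal nginx vhost for that webroot, e.g.:
cat <<'NGINX' | sudo tee /etc/nginx/conf.d/terraform-docs.conf
server {
    listen 8080;
    root   /path/to/terraform/website/build;
    index  index.html;
}
NGINX
sudo nginx -s reload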

@dradetsky
Author

requirements

The desire is to be able to build (and theoretically, distribute) a batch of docs that a user can read.

Of course I can run the result of make build off of nginx (I had to in order to confirm that my changes would probably not affect your production website), but then I have to have nginx running locally whenever I want to refer to the docs (systemd just kicked in, yo), and in a way which doesn't interfere with anything that also wants to run a local nginx server (my job's dev servers, for instance). Moreover, what about the 10 other projects that want to use local nginx servers for their own docs? Now I need some way of organizing them all, which I probably have to do myself. So I need to know how to write nginx config files, how nginx virtual servers (i.e. server_name) work, and other fun stuff like that. I mean, I sort of do know these things, but that doesn't mean I like the idea.

By contrast, tons of open-source projects have docs packages. In addition to installing the python package, I also install the python-doc package, and now I have all the python docs under /usr/share/doc. I don't have to mess with systemd or nginx because the linux kernel helpfully loads the file system code at boot time, and that's all we need. And although I suspect there isn't a whole lot of difference between running nginx for 1 project's docs and 50 projects' docs, I'm absolutely sure having 50 projects' docs on my hard drive is fine (df -h says: all good, bro).

concerns

I see the points about increasing requirements for devs. I did consider that, but I thought that most of the increase comes in the .erb templates, which I assumed were largely prepared by Hashicorp people (I only looked at logs for a few files though). And if they weren't, they were at least looked at by Hashicorp people before they went live. The typical contributor still basically just needs to know html and markdown. And even if she does have to deal with links in erb, she'll probably at least look at an example and see that it's actually pretty straightforward.

Regarding the build, I noticed this too. I assumed it wasn't a huge problem but that was on the theory that building terraform and building the website were two separate things happening independently. Is this not the case?

compromises

Suppose, for the sake of argument, that you cherry-picked only the first and last commit from this PR. That wouldn't actually add any new functionality, but it also wouldn't run into either of your 2 concerns. You're only adding a handful of uses of link_to, so it shouldn't affect the build much, and they're all wrapped around uses of inline_svg, so whoever deals with them already has to know a bit about ERB. Then I can make an arch AUR package which makes the rest of the changes with sed-in-place (which is how I actually edited the code in the first place) and builds the result.

Or for that matter, I could just add the sed-in-place script to the repo, and anyone who wants a local docs build (or to package one for his distro) can just use it. Possibly it would add a big ugly "warning! this will wipe out any uncommitted changes to source/! are you sure? [y/N]" with a flag to just do it without asking.
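
To make that concrete, here's a rough sketch of the kind of sed-in-place pass I mean; it's not the actual script, the file glob is a guess, and it assumes the markdown sources are also run through ERB:

# rewrite markdown-style site-absolute links into link_to helpers, in place.
# destructive, so only run it on a clean checkout of website/source
find website/source -name '*.markdown' -print0 | \
  xargs -0 sed -i -E 's@\[([^]]+)\]\((/[^)]+)\)@<%= link_to "\1", "\2" %>@g'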

@dradetsky
Author

@sethvargo I tried it out, taking the first and last commit and adding a simple find + sed-in-place script, which yields

% git diff relocatable-local-website-copy
diff --git a/website/source/index.html.erb b/website/source/index.html.erb
index aed4905..32cc78e 100644
--- a/website/source/index.html.erb
+++ b/website/source/index.html.erb
@@ -15,7 +15,7 @@ description: |-
         <h1>Write, Plan, and Create Infrastructure as Code</h1>
 
         <%= link_to "Get Started", "/intro/index.html", :class => "button primary" %>
-        <%= link_to "Download #{latest_version}", "/downloads.html", :class => "button" %>
+        <a class="button" href="/downloads.html">Download <%= latest_version %></a>
       </div>
     </div>
   </div>

So that means a total of 3 or 4 extra link_to helpers (depending on whether we want to fix the above; I think so) in order to make it possible to build static, relocatable docs readable off the local filesystem.

@sethvargo
Contributor

Hi @dradetsky

Thank you for your thoughtful reply. It sounds like we might be conflating a few different things here, so I want to distill that a bit more.

My original assumption was that you wanted to mirror Terraform's documentation internally, perhaps due to an air-gapped environment. In that use case, I still assert "just use your web server" is an appropriate answer.

By contrast, tons of open-source projects have docs packages.

This is totally true. But those open source projects also have packages which ship default configuration files, man pages, etc. We are actively having conversations about how to leverage OS-level packages better, but there's no resolution at this time.

The reason I'm trying to better understand your use case is because the build is actually going to get more complicated soon. For reasons I can't discuss in the open, significant changes to the doc structure and build process are "coming soon" ™️, and those changes won't work with this approach for "vendoring" the documentation.

If you want to run the docs locally due to versioning - a fix for that is in the works.

If you want to run the docs locally because you're in an air-gapped environment, I still assert "just use your web server" is the best approach. I understand your concerns about knowledge requirements for nginx, etc, but the venn diagram of "people who want to run their own internal documentation store" and "people who can configure nginx" has significant overlapping regions.

I'm certainly not trying to discount the work you did here - this is great work, but this approach isn't going to work as the documentation becomes... more "graph like". I think this is something we hadn't actually considered internally at HashiCorp, and it's a conversation I'll raise with folks internally to try and get a better grasp on how this might be possible with the future architecture.

@dradetsky
Author

My original assumption was that you wanted to mirror Terraform's documentation internally, perhaps due to an air-gapped environment. In that use case, I still assert "just use your web server" is an appropriate answer.

It's not that. I'm just a regular guy with a laptop who likes to keep shitloads of documentation on it. Like many developers who work across very complicated stacks, I have a whole workflow built around the theory that there are local documentation packages for things. Of course, "packages" doesn't necessarily mean os packages; sometimes it means local documentation built from somebody's git repo. But the result is the same. The os docs package is just usually the simplest way to get the docs. The whole thing works very nicely with no issues precisely because it has no "runtime" dependencies other than a browser (or in some cases, an editor) and a filesystem. Which is why people package docs for distribution in the first place.

If you want to run the docs locally due to versioning - a fix for that is in the works.

That's not the main point, but it's definitely a side benefit.

the venn diagram of "people who want to run their own internal documentation store" and "people who can configure nginx" has significant overlapping regions.

The intersection of "people who can configure nginx", "people who want to run their own internal documentation store" and "people whose time is valuable" is also nontrivial. Hence my desire to ensure that more such people spend less of their time configuring nginx. The point of having all these docs is to enable reference-at-the-speed-of-thought. Which confers greater benefits the more valuable your time is, and is more necessary the more complicated your stack is. And both of these factors argue against more complicated solutions for providing references.

Or what if I'm looking up my nginx docs because my nginx server won't start because some of the other 50 config files which I wrote to serve my local doc store are breaking because of the too-clever-by-half stuff I've done to get them to not conflict with the config files for the company's development servers (which expect to run on machines with nothing else on them). Fortunately I was smart enough to keep a packaged copy of the nginx docs on the filesyst...oh wait. Well fortunately my nginx config is sufficiently modular to disable just the ones that are...no, I didn't do that either, I was too busy. And so on. Local html files just work.

If you're planning to change the build & structure the website, I'd say: would merging my pr impede these changes? If so, fair enough, you've got your priorities. If not, why not just merge it? After all, it always takes longer than you think to roll out the new thing anyway. And maybe some of your guys will get to like having a local docs build on their own machines, and I won't have to do it again myself.

@apparentlymart
Contributor

Sorry to jump into the middle of all this, but I just wanted to make it known that we are about to split the docs across many separate repositories as part of the 0.10 change to manage each provider in its own repo. This reorg is already in progress, so merging this would create conflicts with that work.

Unfortunately due to some compromises we are making to get 0.10 out in a reasonable time it will temporarily get even harder to build an offline copy of the site, though that is an interim step on the path to the other work Seth is describing that will make this better in the long run.

Sorry for the awkwardness here... we started this work not expecting broad changes to the docs for a while.

@dradetsky
Author

@apparentlymart I assume when you say "we" you mean "Hashicorp," although you aren't a member of that org (maybe that's normal for hashicorp employees? I dunno). Is that right?

@dradetsky
Author

dradetsky commented Jun 6, 2017

@apparentlymart to clarify, I'm trying to confirm that the whole repo split thing you're talking about is an official hashicorp effort, rather than just something you and your brother have been doing for a while or whatever.

EDIT: n/m, I see you're contributor no. 11, so either way...

@dradetsky
Author

Anyway, I assume this repo split and @sethvargo's cryptic "more graph-like documentation" are the same thing. If so, fair enough. I wouldn't let you merge your pet change if it fucked up my overall engineering efforts either. I'm already able to cut my own local docs off this branch, and I imagine I'll be able to keep merging in master and repeating the process until 0.10 comes out.

If either @sethvargo or @apparentlymart is able to talk about this (you can also email me), to what extent do these documentation changes apply to Vagrant, Packer, et al? I ask because although my most immediate use-case was terraform docs, I plan to make more use of other Hashicorp tools in the future (probably packer to start with), and insofar as they all use the same website framework, I planned to adapt these to the other hashicorp tools, perhaps culminating in a hashicorp-doc AUR package. So basically, how bad will your plans screw up mine?

Also, fwiw, I'm not the only consumer of hashicorp docs for shitloads-of-dev-docs-in-one-env purposes. devdocs.io (which I hate, but apparently other people like) currently provides Vagrant docs through their framework, for example.

@sethvargo
Contributor

sethvargo commented Jun 6, 2017

Hi @dradetsky

Thank you for your replies, and thanks @apparentlymart for jumping in. I'll try to distill them all at once:

The intersection of "people who can configure nginx", "people who want to run their own internal documentation store" and "people whose time is valuable" is also nontrivial.

I totally empathize with this, and I hope my comment didn't come off as "do it yourself", but rather a statement that the target audience for something like this appears relatively small. I could be wrong, but we get thousands of issues on Terraform, so when balancing "build a new feature to benefit 100 users" vs "build a new feature that complicates the build process and benefits 10 users", it's really hard for us to prioritize the latter.

If you want local docs without managing them as a server, I would highly recommend Dash (Mac) or Zeal (Linux). I know we have Terraform docs in Dash, and I would bet we have them in Zeal too (if not, I'd be willing to personally take on the task of getting them added). The benefits of these tools are: 1. you don't need to run a webserver and 2. they can be completely offline and indexed. Dash even uses machine learning to show you things you've searched in the past as higher priority, etc. They also have plugins for most programming languages.

If you're planning to change the build & structure the website, I'd say: would merging my pr impede these changes? If so, fair enough, you've got your priorities. If not, why not just merge it? After all, it always takes longer than you think to roll out the new thing anyway. And maybe some of your guys will get to like having a local docs build on their own machines, and I won't have to do it again myself.

In short, yes and no. It would be a bit of an impediment, but we can work around that. The bigger problem is that your solution will stop working when we release 0.10. From my perspective, it's a very temporary solution to a much larger problem.

In the future, each provider and provisioner will be distributed as a separately versioned and maintained binary. You can see this work being completed out in the open over at https://github.com/terraform-providers. With each provider having its own release cycle, version, and documentation, there won't be "Terraform documentation" anymore. Yes, there will still be the core documentation on the syntax and internals, but after their initial introduction, most users spend time in the provider-specific documentation. Each provider will be distributing their own documentation. It will still "live" at terraform.io, but there won't be a single website build - just reverse proxies and rewrite rules everywhere.

As Martin said:

Unfortunately due to some compromises we are making to get 0.10 out in a reasonable time it will temporarily get even harder to build an offline copy of the site, though that is an interim step on the path to the other work Seth is describing that will make this better in the long run.

This is very true. The provider split is a step forward in our ability to independently release providers (we don't have to wait for a full Terraform release to fix a bug in GCP), but it's a step backward in the docs process. This was a known compromise that was actively discussed in the beginning. We have a number of potential plans to make offline docs easier, including automatically publishing docsets for tools like Dash/Zeal during our release process. As I said above, given the relatively small number of people who want offline docs, coupled with the massive benefits of splitting providers, we opted to move forward with the original 0.10 release knowing it would make docs harder for a bit.

I assume when you say "we" you mean "HashiCorp," although you aren't a member of that org (maybe that's normal for hashicorp employees? I dunno). Is that right? I'm trying to confirm that the whole repo split thing you're talking about is an official hashicorp effort, rather than just something you and your brother have been doing for a while or whatever.

This is a HashiCorp effort - @apparentlymart is a HashiCorp employee now, although he has always been a very active contributor and key community member 😄 . You can see the split in real-time here, and we will be making an official announcement sometime this week on the HashiCorp blog.

...what extent do these documentation changes apply to Vagrant, Packer, et al?

At this time we have no plans to adopt this strategy for other tools. While both Vagrant and Packer do feature plugin models, the numbers are far fewer and the churn is a bit less frequent. That doesn't mean we will never move to this model, but we do not have plans to do so in the immediate future. However, again, I would recommend looking at tools like Dash/Zeal for offline docs. They have significant market penetration (for free or very low cost pro features), and I would bet any future docs changes would integrate well with those tools.

...and insofar as they all use the same website framework, I planned to adapt these to the other hashicorp tools, perhaps culminating in a hashicorp-doc AUR package. So basically, how bad will your plans screw up mine?

All our sites will continue to be built using the hashicorp-middleman framework. It's also worth noting that all our sites can easily be "printed" as a PDF. If you wanted something super quick, hacky, but guaranteed to always work, you could write a script (wget --recursive) to "print" each page of the website as a PDF. That will continue to work, basically forever.
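
The wget --recursive part of that could look roughly like this (the URL and the extra flags are my guesses; turning the mirrored pages into PDFs would be a separate step):

# mirror the docs section with links rewritten for local browsing
wget --recursive --no-parent --page-requisites \
     --convert-links --adjust-extension \
     https://www.terraform.io/docs/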

Finally, it's worth noting that man pages and proper OS packages are also on our radar. If these are features you are interested in working on, we are hiring release engineers who will get to define, build, and manage these types of processes and decisions.

@dradetsky
Author

I totally empathize with this, and I hope my comment didn't come off as "do it yourself", but rather a statement that the target audience for something like this appears relatively small.

It's cool, and I understand your reasoning. Just don't discount the value of something just because the target audience is small. Lots of people put lots of effort into building things that appeal only to a very small group of very intelligent people precisely because those people get so much done. If you think about it, this is kind of the premise of devops in the first place.

As regards offline doc systems: I've rolled my own, for a variety of reasons. I might look into Zeal; the existence of zeal-at-point is encouraging. Even so, I really dislike the premise of offline doc systems being a solution to the lack of local docs builds, and I think it's one of the reasons I tend to dislike most of the offline doc systems. The difficulty of writing all those extensible scrapers means that the likelihood that the developer will also make a UI I like & that fits my workflow is very low. If there wasn't any scraping involved, all the competition would be over UI, and invariably there'd be one I liked. Hence my desire to ensure that more important open-source projects can cut local docs builds, and that best-of-breed open source companies role-model the right way to feed into this ecosystem.

In the future, each provider and provisioner will be distributed as a separately versioned and maintained binary.

I can just imagine the conversation now "No see, terraform is...okay, you know how people call gcc a 'compiler'? but technically it's a 'compiler driver'? No? Okay, so the difference is..." and then eventually you give up.

Seriously though, this sounds like a good idea, and I totally support not moving forward with this at the expense of my own project. Which I imagine will be simple enough to do for the whole ecosystem. For example, I originally proposed modifying a few of the anchor-on-svg links as special cases because of the difficulty of doing it with sed, but if there's 30 different repos, it's not like breaking out nokogiri is that hard, or that hitting 30 different github repo archive urls and walking over the results will be a problem.
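
E.g. something along these lines would probably do; the provider names and branch here are placeholders, and the org layout just follows https://github.com/terraform-providers:

# fetch and unpack each provider repo's archive, docs included
for p in aws google azurerm; do
  curl -sL "https://github.com/terraform-providers/terraform-provider-$p/archive/master.tar.gz" | tar xz
done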

Finally, it's worth noting that man pages and proper OS packages are also on our radar. If these are features you are interested in working on, we are hiring release engineers who will get to define, build, and manage these types of processes and decisions.

You know, a friend of mine told me that publishing a blog post on how I was moving to arch linux because I couldn't stand ubuntu anymore might have a negative effect on my career, but I said no, I can't imagine how that could come up. Christ...

@apparentlymart
Contributor

(The terraform-providers repositories got temporarily marked as private subsequent to Seth's comment, because they were causing some confusion in the community about what was the "source of truth". They'll go back to being public again once the dust settles.)

@sethvargo
Contributor

In light of the conversation, I'm going to close this out.

@ghost

ghost commented Apr 11, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 11, 2020