Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update tech-docs-github-pages-publisher to 1.5 #150

Merged
merged 4 commits into from
May 25, 2022
Merged

Conversation

NickWalt01
Copy link

No description provided.

@NickWalt01 NickWalt01 requested a review from a team as a code owner April 27, 2022 16:27
@SteveMarshall SteveMarshall self-assigned this Apr 28, 2022
Copy link
Contributor

@SteveMarshall SteveMarshall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @NickWalt01, it’s not clear to me from the commits why you’ve had to change all the title lines? If the code that was previously generating them no longer works, is there a replacement for that? And if there isn’t, and this is the only way to display the titles, do we still need the metadata in the YAML front-matter on each document?

All of that should be described in the commit message and PR title/description, because as it is this change appears as if you’re just changing them for no reason I can see…

@NickWalt01
Copy link
Author

Hi @SteveMarshall I've recently updated the repo https://github.com/ministryofjustice/tech-docs-github-pages-publisher to v1.5 so that it is using the latest dependencies which involved bumping govuk_tech_docs from v2.1.0 to v3.2.0, in this version when it complies the html.md.erb files it has an issue with those html titles, I solved it by replacing them with normal markdown titles and the compilation was successful. I've updated the operations-engineering page in a similar fashion. I understand what you are saying that this merge doesnt equate to the upgrade to v1.5, that was done in the previous merge 49521c8 which then failed its CI run as the updated html proof checker inside the docker image has failed the license url link that Tony has spotted in the other PR. This PR and Tony's PR should be one bit of work that should have gone into the previous merge but I didnt spot the makefile was using v1.3 when I tested it.

@NickWalt01
Copy link
Author

I'm not to sure on the 'do we still need the metadata in the YAML front-matter on each document?' as I dont know how it is used, there is this page that explains that they are still usable fields: https://tdt-documentation.london.cloudapps.digital/configure_project/frontmatter/#configure-page-settings

@SteveMarshall
Copy link
Contributor

@NickWalt01 Right, but my point is that if the titles are failing, that seems like a bug to me.

Tech Docs Template uses Middleman under the covers, and Middleman's documentation explicitly uses <%= current_page.data.title %> as an example of how to use front matter variables.

That would suggest that there's something wrong here, and the solution would be to identify what that is (and possibly fix it upstream, if necessary) rather than just work around it.

@SteveMarshall
Copy link
Contributor

@NickWalt01 Not to mention, too, that the context in your first comment should probably be summarised in the commit messages if it turns out that we can't use the variables anymore…

@NickWalt01
Copy link
Author

NickWalt01 commented Apr 28, 2022

Ah it make sense now. The issue is that our files are missing the <h1>List</h1> two lines above in that link you have provided. In markdown # is equal to <h1>, so I have replaced <%= current_page.data.title %> with # title and it solved the compilation problem but at the same time removed <%= current_page.data.title %>. what does <%= current_page.data.title %> do?

@SteveMarshall
Copy link
Contributor

<%= current_page.data.title %> takes the title value from the YAML front-matter and puts it into the page's body, so that we don't have to have the title in both YAML and the HTML, if you see what I mean?

@NickWalt01
Copy link
Author

yeah, I didn't know what it did, as it doesn't look like ruby or markdown code. I'll run a local instance and get some images of the error when it is used with the latest versions of middleman and dependencies

@NickWalt01
Copy link
Author

An exception is raised error when <%= current_page.data.title %> is used within the source file:

image

It was solved by changing it to # Title

The error appears to be govuk_tech_docs rather than middleman.

I found this: alphagov/tech-docs-gem#284 which is similar to something I experienced: the html-proof tool was creating links that didnt exist ie because if the same title appeared in different files it didnt treat them as separate namespaces and thus failed the url check, there is a fix like this alphagov/tech-docs-gem#150 (comment) but to implement that is a bigger change across all the heading and sub headings.

The examples in the gem also uses markdown for the title: https://github.com/alphagov/tech-docs-gem/blob/master/example/source/index.html.md.erb within https://github.com/alphagov/tech-docs-gem/tree/master/example/source

I havent come across anything that explains how <%= current_page.data.title %> is converted to a html h1 tag or a markdown # tag.

@NickWalt01
Copy link
Author

Hi @SteveMarshall I've found out the problem, it is in the YAML fields at the top of the page specially last_reviewed_on. When loading a date yaml field an exception occurs and thus the title yaml field cannot be read correctly. Example
'YAML Exception parsing /app/source/documentation/guides/using-git.html.md.erb: Tried to load unspecified class: Date'
From what I can see Middleman wants this: 2021-04-12 from 2021/04/12 to as shown here: https://middlemanapp.com/basics/blogging/#customizing-paths-and-urls
But then the alphagov/tech-docs-gem ruby code fails as it wants it in 2021-04-12 format.
Will continue looking into it.

@SteveMarshall
Copy link
Contributor

@NickWalt01 Good sleuthing 👍🏼 I wonder if it's worth raising that as a bug against https://github.com/alphagov/tech-docs-gem, or fixing it and raising a PR?

@NickWalt01
Copy link
Author

Middleman
In the latest tag v4.4.2 on 3/11/2021: YAML.load(content) this is an alias for YAML.unsafe_load()
As of 30/Jan/2022: YAML.safe_load(content, permitted_classes: [Date, Time, DateTime, Symbol, Regexp], aliases: true)

Ruby v3.1
Ruby uses psych as the yaml module. psych uses YAML.safe_load as of version 3.1 onwards.

The issue is discussed here
ruby/psych#489
ruby/psych#487
The change was made in May 2021.

And explained in this article: https://www.ctrl.blog/entry/ruby-psych4.html

govuk_tech_docs
The Ruby code here expects the last_reviewed_on in the html header to be a Date, which works on Ruby < v3 as it is using YAML.load(content) aka YAML.unsafe_load(), but when Ruby v3 is used Middleman fails and raises an exception.

tech-docs-github-pages-publisher
Reverting the Docker image base from Ruby v3.1.1 back to v2.7.6 the issue is resolved.

Summary
It seem Ruby v3+ uses a stricter version of yaml and middleman has not released a version that has the YAML.safe_load with the Date option specified. Need to wait for Middleman next release that has he YAML.safe_load with the Date option specified and allows a Date object like this 2022-05-24, at the moment it will allow a string date 2022/05/24 but this causes a failure in govuk_tech_docs. Have asked govuk_tech_docs to allow a string input, convert the string into a Date object then use it in the original code.

The code that loads the date has changed here in Middleman:
image

@NickWalt01 NickWalt01 changed the title correct titles in html and up the makefile publisher version update tech-docs-github-pages-publisher to 1.5 May 24, 2022
@NickWalt01
Copy link
Author

Hi @SteveMarshall based on the last message above have decided to revert the ruby inside the image back to v2.7 which caused the issue, just need your approval to close off this PR

@SteveMarshall
Copy link
Contributor

@NickWalt01 To be clear, are you saying “close the PR without merging and upgrading to v1.5”? I'm happy with that or with merging it and not upgrading Ruby if that resolves the issue…

@NickWalt01
Copy link
Author

Hi @SteveMarshall I wish to merge the bump from 1.4 to 1.5, 1.5 is using ruby 2.7, am using 1.5 on the operations-engineering website and we have managed to get the review date expiry alerts up and running again, version 1.5 mainly contains dependency updates to fix dependabot issues: https://github.com/ministryofjustice/tech-docs-github-pages-publisher/commits/main

@NickWalt01 NickWalt01 merged commit d2d94c1 into main May 25, 2022
@NickWalt01 NickWalt01 deleted the update-html-pages branch May 25, 2022 12:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants