Reorganize website source material to simplify updating #178
Thanks for writing this up Ray! I don't have super detailed thoughts at the moment. I think this largely makes sense and the "two pull requests" model didn't work as well as I'd hoped 😔 just two initial thoughts:
Thanks for the link to the previous related discussions. I think that thread is a good reminder of some of the ways that we've previously struggled with this.
My suggestion was to leave the "community" content associated with the website repo -- I agree that it is decoupled from Cantera's release cycle. I think the case of installation instructions fits pretty naturally as part of the main repo: updates related to changes in the development version, e.g. changes in build requirements or installation options, can be made only to the
The science docs have needed a significant update for several releases now, and I think the current organization is part of why we haven't at least managed to populate the "Science" section of the website with the relevant content that currently lives in the main repository, in the Doxygen docstrings for many of the C++ classes. I'm hoping that this restructuring will make it easier to update that content in smaller chunks so that it actually gets done. I think we should prioritize writing high-quality documentation for the current development version of Cantera. If we're also able to backport some of that content to the latest stable release, that's a nice bonus, but we shouldn't exert too much effort on that, since the development version will eventually be the stable release. If anything, I'm hoping this will reduce one of the points of friction for more frequent releases.
One other reason to keep the website as a separate repo is that it provides a clear way of managing and navigating the version-specific pages that are built from the main repo. This is something I remember being very tricky to manage back when the whole site was generated out of the main repo. By contrast, I think the current "documentation" landing page (https://cantera.org/documentation/index.html) handles this quite well. |
I definitely don't disagree with the direction of this suggestion, just trying to remember the context that led us to this point in the first place and make sure we're moving towards fixing the underlying problem 😊
As it happens, the pydata-sphinx-theme has a built-in version switcher widget, similar to the one on readthedocs. https://pydata-sphinx-theme.readthedocs.io/en/stable/user_guide/version-dropdown.html I haven't looked into how it's implemented and how that crosses over with our usage. It looks like we'd maybe want to "version" all the pages somehow. Cory also suggested something related over here: Cantera/cantera-website#229 |
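To make the version switcher idea more concrete, here is a minimal sketch of how it could be wired up. The theme reads a JSON file listing the available versions and is pointed at that file through `html_theme_options["switcher"]`. The version numbers, the `example.org` URLs, and the helper `switcher_entries` are hypothetical placeholders for illustration, not anything that exists in the Cantera repositories.

```python
# Sketch of a conf.py fragment plus a helper that builds the switcher JSON.
# All URLs, version strings, and the helper name are illustrative assumptions.
import json

release = "3.1"  # hypothetical version of the docs being built


def switcher_entries(versions, base_url, stable):
    """Build the list of entries that the theme expects in its switcher JSON."""
    entries = []
    for v in versions:
        entry = {"name": v, "version": v, "url": f"{base_url}/{v}/"}
        if v == stable:
            # Mark the stable release so the theme can highlight it
            entry["name"] = f"{v} (stable)"
            entry["preferred"] = True
        entries.append(entry)
    return entries


html_theme = "pydata_sphinx_theme"
html_theme_options = {
    "switcher": {
        # Where the built site can fetch the JSON file (placeholder URL)
        "json_url": "https://example.org/docs/switcher.json",
        # Which entry in the JSON corresponds to this build
        "version_match": release,
    },
    # Place the dropdown in the top navigation bar
    "navbar_end": ["version-switcher"],
}

if __name__ == "__main__":
    # Print the JSON that would be published next to the built docs
    entries = switcher_entries(["3.1", "3.0"], "https://example.org/docs", "3.0")
    print(json.dumps(entries, indent=2))
```

Since the JSON file lives outside any single build, every versioned copy of the docs can point at the same file, which is roughly the "version all the pages" idea mentioned above.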
@speth and @bryanwweber ... I likewise appreciate the writeup. From my side, my 2 cents are that it matters less where documents are located and more how they are written. The MyST markup envisioned in Cantera/cantera-website#211 would make documentation a lot easier, as it is more intuitive and has several extremely useful extensions (the Jupyter notebook integration/conversion is impressive). Regarding tutorials, things get a little iffy: I would certainly agree with moving the C++ examples over to the main repository (the tutorial examples are just simpler versions of samples anyways). For Python, I believe the most appropriate tutorial format is Jupyter notebooks; at the same time, this is where MyST shines (at least if I recall correctly), so I guess these could be moved to the main repo after we have a way to render content on the website (presumably the same is true for Jupyter notebooks). For MATLAB, the most appropriate tutorials would be live scripts, which are not git friendly. We definitely should not use As an aside, I would also like to point out that the state of the doxygen documentation is likewise a sore point. There is a lot of useful information in there that is almost impossible to find. While we do a good job of adding documentation for individual functions, the big picture is not great (although it's not too difficult to improve the situation at least somewhat; e.g. Cantera/cantera#1534). For some of the more technical documentation purposes, I'm wondering whether it may make sense to improve documentation within the source code (e.g. #169), rather than writing up separate sections? One example illustrating the current situation is the documentation of In summary, I believe that:
Thanks @ischoegl. On the topic of Doxygen, since we're switching to Sphinx, there are many extensions that support direct integration of Doxygen XML output into Sphinx sites. Among many options are Breathe and Exhale. I've not evaluated the options to see what might work best for us. As a note about Jupyter and Matlab Live notebooks which have inscrutable git histories, I wonder if git submodules in the main repo makes sense for those? Since we do submodule checkouts on CI anyways, the examples would be there for running without polluting the main git history. |
It would be interesting to see what works how. Getting a better feel for the Sphinx/doxygen integration would go a long way in terms of making decisions on where different parts of the documentation would 'live', especially if we want to house as much as possible in the main repo.
Based on my understanding, MyST will make Jupyter notebooks almost unnecessary, so the repo pollution issue is moot? At the same time, PS: exhale is - at least to me - on the lower end of what I'd like to see in terms of number of contributors (it's essentially a single developer). I feel a little safer about breathe. |
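For reference, a minimal sketch of what the Breathe route discussed above might look like in a Sphinx `conf.py`. It assumes Doxygen has been run with `GENERATE_XML = YES`. The project key `cantera` and the XML output path are hypothetical placeholders rather than the layout of the actual build.

```python
# Sketch of a conf.py fragment for Breathe; paths and project key are assumptions.
from pathlib import Path

# Hypothetical location of the XML that Doxygen writes when GENERATE_XML = YES
DOXYGEN_XML_DIR = Path("build") / "doxygen" / "xml"

extensions = ["breathe"]

# Map a project name to the directory holding Doxygen's XML output
breathe_projects = {"cantera": str(DOXYGEN_XML_DIR)}

# Project used when a directive does not name one explicitly
breathe_default_project = "cantera"
```

With this in place, Breathe directives such as `doxygenclass` can pull individual classes into Sphinx pages. Whether that output is preferable to Doxygen's own HTML is exactly the open question in this thread.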
One issue that remains to be resolved is how to link to documents that are imported by |
Well, that's just a link within version-specific API documentation, where both the source and target are already generated in the main repo. I think the only realistic alternative for links from Doxygen to Sphinx would be to use |
@speth / @bryanwweber After working with the doxygen interface, I honestly believe that moving some of the reST documents (example: YAML Input File Reference) into doxygen markdown pages is much simpler than the opposite direction (trying to integrate everything using Here's an example (using Cantera/cantera#1546 plus some I only did part of this to get a feel for the outcome; see this branch https://github.com/ischoegl/cantera/tree/move-yaml-api-docs ... the conversion isn't too difficult, but this is certainly something that warrants some feedback before investing more work. |
Thanks for this work @ischoegl. I actually believe any work towards moving files around or changing formats is premature until the website stack is further settled. I don't think things are in a state where we can evaluate how integration should be done yet. 🤷♂️ |
Thanks for the comment, @bryanwweber.
This is a fair point. At the same time, I strongly believe that from a maintenance aspect, keeping two distinct websites, i.e. one user-facing for tutorials, examples, Python/MATLAB documentation, etc. (Sphinx) plus one developer-centric (doxygen) may be the best of both worlds. Doxygen is extremely good at what it is made for, and what you see above can be achieved with basic elements of recent doxygen versions (the css theme just makes it look more up-to-date). Essentially, all of this works "out of the box". Regarding content: this was merely an attempt to see how markdown files integrate into doxygen. Overall, it works beautifully (with minor caveats, as some reST formatting abilities are missing - mainly indents). So what my trials boil down to is the following proposition:
This proposal probably deviates a little from current thinking, but I honestly believe that it simplifies the work flow and is overall much easier to maintain: documentation is written together with the code in the C++ header files, which takes care of most concerns voiced at the top of this issue report. Everything can be checked using |
The division between what's handled by Sphinx and what's handled by Doxygen within the docs that are built from the contents of the main repo is not really what I was trying to address with this proposal, and again, I think that's already what #115 is about. What I'm suggesting here is to migrate the installation instructions, tutorials, and science documentation, all of which are version specific, into the main repo. Some of this may go into the Sphinx docs, and other parts into the Doxygen docs. I would agree that there is some value in keeping the detailed "science" docs as close to the implementation as possible, in the form of class docstrings on the various models. |
Thanks for the comments, @speth.
I believe this by itself addresses a lot of the issues you linked at the top. It is ultimately irrelevant whether this is rendered by doxygen or Sphinx (i.e. #115) - presumably, either of the two approaches should be able to generate detailed documentation. |
I don't think I agree with this. I think there's a really important case of arranging the content in more of a narrative format to tie concepts together for teaching. I don't think class docstrings facilitate that mode. I'm also loath to duplicate content. |
While I see where you're going with this, I believe it's easy to get lost in the details if the narrative is too comprehensive (apart from my reservations about a convoluted workflow). My estimate is that 90% (or more) of the user base are only interested in a quick overview, and as long as we provide links to details (some of which are extremely lengthy, see this example), we should be good?
Agreed. At the moment, most of the Science details are still in doxygen docstrings. Changing this is a bear. PS: as an aside, putting the detailed docstring description on top of doxygen pages as proposed in Cantera/cantera#1546 makes things a lot more intuitive |
While this would address a lot of the (capital-I) Issues I linked to, it's only a small part of the version-specific updates in question. You can get a better feel for the scope of such changes in https://github.com/Cantera/cantera-website/pull/248/files.
Agreed.
I said some value, not that this was definitively the best option. To elaborate, one thing that is a bit of a struggle is to even know what all the things that have to be updated are when making an implementation change, when those changes are spread across many files (or worse, as now, repos). At least if the basic equations a model implements are right there with the corresponding class, it's obvious what someone should do when they've implemented a new class / method. We can try to remember all of these bits and pieces at PR time, but anything that makes this more obvious before that is useful. I agree that you can write better narrative documentation if you're not tied to the structure of the implementation, but it's also a lot harder to get anyone to actually do that. I'd say that the current state of #6 is proof of that. |
I couldn't agree more. Once a PR is approved, all the incentives for further work are gone; requiring decent docstrings is comparatively easy. Regarding version-specific stuff in Cantera/cantera-website#248, I appreciate the work! I'm definitely supportive of off-loading as much as we can to the main repo. |
Agree with both of you guys 😄 |
Fwiw, I took a deep dive into doxygen markdown, see Cantera/cantera#1548. The PR would consolidate the YAML Format Reference in the main repository (there are about 1k lines to be deleted from cantera-website, with quite a few redundancies removed and the browsing experience overall - I believe - considerably improved). Moving the YAML format reference to the Developer API (or "advanced"?) documentation is imho consistent, as most users don't need to know how to assemble YAML input from scratch (also, after the removal of CTI, input is no longer handled by Python). As MyST is just another flavor of markdown, I believe this effort to be portable within the context of #115 (obviously, with limitations). |
So, if we're agreed that some of the content that is currently in the website repo should be moved into the main repo, the question is then whether any given piece of it should be part of the Sphinx docs in the main repo, which is currently used for the Python, Matlab, and YAML API documentation, or in Doxygen, which is currently used for the C++ API documentation. While @ischoegl has already forged ahead on moving some content into Doxygen (Cantera/cantera#1548), my thinking had been to rely more heavily on Sphinx, due to some of the features it provides (with the appropriate extensions). For starters:
I don't necessarily think we should go so far as to use Sphinx to generate the HTML for the C++ API documentation, but I think it is a better tool for most of the rest of our documentation needs. |
AFAIK this is just sphinxcontrib-bibtex under the hood. |
Thanks for the continued discussion. From my perspective, I want to clarify that moving everything into Doxygen was never a goal; both Sphinx and Doxygen are fantastic tools (the latter surprisingly so, as the default looks like the 90's want their GUI back). They both have strengths and weaknesses. I am not impressed by the output of
We are in agreement on the first part; on the second part I believe Doxygen deserves more credit, as we were not using it effectively. As mentioned in my comment above, my suggestion would be to:
I believe that points (1) and (3) are critical:
Given the limited manpower of the Cantera project, I am hoping to create simple workflows with reduced maintenance overhead (which happens to be an objective of this issue report). Regarding the other points raised: some Doxygen run projects do have switchers (example: OpenCV), and the |
Fwiw, creating |
With work on #179 being completed, I believe that doxygen now presents a viable option to receive some content from the website. |
@speth and @bryanwweber From my perspective, I think this is probably one of the highest priority issues for 3.1. If we manage to move major parts of the website to the main repo and put things under CI, my personal hope is that the hurdle for releases will be lessened considerably (at least this was the premise of the description at the top). One question I have for @bryanwweber is about MyST - presumably, we can enable this on the main repo before the transition of the main website is done?
Yes, completing this is my main goal for 3.1.0. Partly because I think any partial implementation would make that next release complicated, but also because I think it will provide a pathway to resolving many of the documentation deficiencies that have been noted. Enabling MyST usage is trivial -- I've already tried it very briefly. All you need is to add the I'm currently working on modifying |
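As a rough illustration of how little configuration is involved, here is a sketch of a `conf.py` fragment that turns on MyST alongside existing reST sources. The exact extension named in the comment above is not preserved, so this assumes the standard `myst_parser` package. The selection of optional syntax extensions is also an assumption.

```python
# Sketch of a conf.py fragment enabling MyST; assumes the myst_parser package.
extensions = [
    "sphinx.ext.autodoc",  # stand-in for whatever extensions are already in use
    "myst_parser",         # adds Markdown (MyST) support to Sphinx
]

# Optional MyST syntax extensions (illustrative choice, not a recommendation)
myst_enable_extensions = [
    "dollarmath",   # $...$ and $$...$$ math
    "amsmath",      # bare LaTeX environments such as \begin{align}
    "colon_fence",  # ::: fences as an alternative to ``` for directives
]

# Keep building existing .rst files while accepting new .md files
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}
```

Because both suffixes stay registered, pages can be converted from reST to MyST one at a time rather than in a single large change.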
Also, I've been thinking about how to organize the content of the website, and looking at how this is done in a number of other peer projects, such as Matplotlib, Pandas, SunPy, and others. My current draft outline is:
The top-level headings would be the ones appearing in the site header. Compared to the current site, this means combining "Science" and "Documentation" into "Reference", replacing "Tutorials" with "User Guide", adding "Develop", and dropping "Blog". I'm very interested in any feedback on this layout, and getting to some consensus before we start implementing anything significant.
@speth Thank you for elaborating. I like the structure you suggested, and am fully on board with making this as painless to maintain as possible. In terms of execution, my preferred approach is to be as pragmatic as possible (as little customization as possible). The two tools we have at our disposal are Sphinx/pyData and doxygen/awesome-css, where the styling is thankfully relatively similar. My deep dive into doxygen left me with the impression that it's very good to resolve some of the details, while Sphinx is definitely the better approach for 'big-picture' documentation. Here are some of my thoughts:
@speth Re
I had been working on https://github.com/bryanwweber/sphinx-gooey to support this need because I didn't find what I wanted in This has been the biggest blocker to me moving on with the website style changes and switching to MyST more broadly, so I'm super interested to get it resolved. |
My own 2 cents in this context are that the extra effort of creating/maintaining infrastructure code requires continuous input. My impression is that @bryanwweber genuinely likes this work and I’m confident that any solution will look amazing; but I am also aware that there are bandwidth issues. As mentioned above, my personal approach is informed by pragmatism. I don’t think it’s necessary to have a 100% visually consistent solution if we can piggyback on other projects. Of course it’d be great if we can afford the manpower for a custom solution. Whether having the bandwidth is a realistic assumption is the real question here. |
In this case, I should clarify that I meant that I didn't like the source file layout and output structure that was enforced by sphinx gallery, because I didn't think it'd match well with how the website is laid out (again, both source and output). However, if the structure we're envisioning for the website is changing anyways, it may make sense to reshape that to fit the expectation of sphinx gallery. |
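To show what "source file layout and output structure" means in practice, here is a minimal sketch of a sphinx-gallery configuration. The directory names are hypothetical and would need to match wherever the examples end up living. The point is only that sphinx-gallery maps each source directory to one generated output directory.

```python
# Sketch of a conf.py fragment for sphinx-gallery; directory names are assumptions.
extensions = ["sphinx_gallery.gen_gallery"]

sphinx_gallery_conf = {
    # Where the example scripts live, relative to conf.py (hypothetical path)
    "examples_dirs": ["../samples/python"],
    # Where the generated gallery pages are written, relative to the docs source
    "gallery_dirs": ["examples/python"],
    # Only files whose path matches this pattern are executed during the build
    "filename_pattern": r"\.py$",
}
```

Each directory listed in `examples_dirs` pairs with the entry at the same position in `gallery_dirs` and needs a README file to serve as the gallery header. That pairing is the constraint the website layout would have to be reshaped around.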
@speth mentioned that his suggested layout is informed by other projects. From that perspective, I think that things can be made consistent. The main concern is how to deal with MATLAB examples. |
I think @ischoegl's idea about having some components of the documentation besides just the C++ API docs handled by Doxygen is an interesting one. It poses some integration challenges, though, in that mixing internal and external links in the navigation structure (e.g. in the menu bar across the top of the page) is a bit tricky in both Sphinx and Doxygen. One way to resolve that challenge is to use doxysphinx to integrate the Doxygen-generated HTML content into the Sphinx navigational structure. I've started trying this out, with a work-in-progress branch here: https://github.com/speth/cantera/tree/doxysphinx-trial, and a copy of the resulting docs here: https://cantera.org/~speth/doxysphinx-trial/reference.html (follow the link into the C++ documentation). I'd say this worked pretty well without too many modifications. The one bit of a hack I did introduce was to add some My observations on this approach so far:
My next step is to spend a bit of time on the alternate approach, where we would just be linking between Doxygen- and Sphinx-generated content so we can compare them before committing to one approach or the other. |
And here's the second option, which keeps the Doxygen and Sphinx HTML generation more separate, but with some improvements to navigating between them: https://cantera.org/~speth/doc-linking-trial/reference.html. This is built off of the branch https://github.com/speth/cantera/tree/doc-linking-trial. The main change here is to replace the top-of-page navigation links in Doxygen with ones that match Sphinx. Navigation within the Doxygen docs is handled by the navigation area on the left (unless your browser window is too narrow; we may want to find a fix for this, but hopefully most people aren't trying to read C++ API docs on their phones...). On the Sphinx side, I added a set of cards to the root "Reference" page instead of using a (visible) TOC. It is unfortunately impossible to add relative links to external pages to the Sphinx TOC system (i.e., to get something to appear in the left navigation area; this is despite significant demand; see sphinx-doc/sphinx#701). Observations:
While I'm quite impressed by what Doxysphinx manages to do, my inclination is to opt for this simpler approach, since I think it still works well while avoiding introduction of extra moving parts in the documentation machinery that could end up requiring significant effort to maintain in the future. |
I've made a few further updates to this second version (https://cantera.org/~speth/doc-linking-trial/reference.html), to flatten the layout (along the lines suggested in Cantera/cantera-website#229), to provide stubs for the other main sections, and to try an example of adding some of the "science" documentation to the reference section using the MyST format. Based on this test, I recognized that there is a significant advantage to putting the science documentation in MyST or other markdown files rather than keeping it in C++ docstrings. Namely, that you get all the editor capabilities that are available for standalone markdown files, like syntax highlighting and (in VS Code at least) live preview. With the flatter file layout, it's also not too difficult to make a relative link from the C++ documentation to a specific page in Sphinx, for example Unless there are any strong opinions to the contrary, I'm going to start migrating some of the existing content into this new structure. |
@speth ... thank you for some impressive work on this. I am in general 👍 with the proposed direction, and the prototype website you linked to looks great. Apart from being interested in @bryanwweber's input, the one concern I have is how to review. I am a little hesitant to merge Cantera/cantera#1621 until the upstream Aside: I haven't looked into MyST recently, but my recollections from whenever I spent some time with it are extremely positive. It definitely is the way to go. Going back to how to proceed, I'd suggest the following:
PS: Regarding the example Ultimately, we just need to provide/cross-link information on what models are available, how they are implemented, and how things are defined in the YAML input format. I am more concerned about creating this linkage (see #182) and less about moving existing things around (beyond moving source material from the website as proposed in this PR). |
To facilitate review of this work, I opened speth/cantera#6, which shows the delta between Cantera/cantera#1621 and my You're correct that the current migration of the On the other hand, moving all the content from Beyond the "science" docs, there are also the installation and compilation instructions, plus the pages that are currently lumped under Tutorials that need to be migrated from |
Perfect - thanks! If @bryanwweber is ok with it, I'd be 👍 with merging Cantera/cantera#1621 so we don't have to do this the roundabout way of PR's against forks. It is not ideal to pull from a fork for CI, but I'm fairly optimistic about I'm aware of the tutorials and other instructions. Based on my trial with the YAML tutorial, I believe they may be easier to transfer than the Science section. |
I'm feeling very much out of the loop, so I want to take myself out of the critical path here. Whatever you guys think is best, let me know how I can help! |
Abstract
In conjunction with the website backend updates envisioned by Cantera/cantera-website#209, Cantera/cantera-website#210, and Cantera/cantera-website#211, I think it's time to reconsider the split of what content belongs as part of the cantera-website repository, and what belongs in the main cantera repository alongside the source code.
Motivation
Having spent some time on the website updates for Cantera 3.0 (see Cantera/cantera-website#248), I'm coming to the conclusion that a large portion of what is currently in the website repo actually belongs with the source. Given the norm of having the unversioned website content be applicable to the stable Cantera release, many of these updates are things where we don't really have a way to stage changes for the new release as it's being developed, like documenting changes to the build dependencies, deleting references to the now-removed CTI format, and documenting new models. And even when updates can be made in parallel, the additional work of putting together a second PR for the website is often delayed or never happens.
This strong separation between code and its documentation is also I think one of the causes for a significant backlog of updates where changes and new capabilities are not documented at all, outside the somewhat obscure API docs. Just as a starting point, here are ones where issues have been created:
Description
Specific sections currently in the cantera-website repository that I think belong in the main code repository are:
What this leaves for the cantera-website repository is pretty light:
Using this organization, I think we should consider ourselves free to make documentation updates to the current stable branch and always use that branch to generate the active version of the website, even if we don't anticipate a future bugfix release.
I would say that my biggest question in all of this is how to go about implementing it, alongside @bryanwweber's suggested/planned changes to use a stack consisting of Sphinx / MyST / Pydata theme.