add markdown formatter / exporter #1976

Open
wants to merge 8 commits into
base: main

Contributor

@mayel mayel commented Dec 8, 2024

@@ -103,7 +103,7 @@ defmodule ExDoc.CLI do
   defp normalize_formatters(opts) do
     formatters =
       case Keyword.get_values(opts, :formatter) do
-        [] -> opts[:formatters] || ["html", "epub"]
+        [] -> opts[:formatters] || ["html", "epub", "markdown"]
Member

I don't think we should enable it by default :)

Contributor Author

Ah I misread and thought that was the allow-list of supported formatters
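For reference, keeping the formatter opt-in rather than default could look like the following in a project's mix.exs. This is only a sketch: `formatters` under `docs` is an existing ExDoc option, but the `"markdown"` formatter name comes from the diff above and assumes this PR's formatter is available; the project module, app name, and version are placeholders.

```elixir
# mix.exs of a hypothetical project (MyApp, :my_app and the version are placeholders)
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :my_app,
      version: "0.1.0",
      deps: deps(),
      docs: [
        # "html" and "epub" remain the defaults; "markdown" is requested
        # explicitly and assumes the formatter proposed in this PR exists.
        formatters: ["html", "epub", "markdown"]
      ]
    ]
  end

  defp deps do
    [{:ex_doc, ">= 0.0.0", only: :dev, runtime: false}]
  end
end
```

The same choice can presumably be made per invocation with `mix docs --formatter markdown`, since the code in the diff reads the `:formatter` values before falling back to `opts[:formatters]`.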

|> Enum.map(&elem(&1, 1))
end

defp generate_zip(output) do
Member

We should skip the zip as well. The simpler the output we generate, the better :)

@josevalim
Member

Thank you @mayel, I think the general direction is good and I think we can continue exploring it. The testing structure will be very important too.

Just a heads up, we will be slow with reviews on our side, since we are focused on Elixir v1.18 and launching Livebook Teams.

@mjrusso

mjrusso commented Dec 26, 2024

This is awesome @mayel! And thanks for letting me know about this PR :)

Having spent some time thinking about this, I have a few suggested requirements for discussion/debate. My ideal Markdown export would generate:

  • an exact mirror of existing HTML page structure, but in Markdown format (with .md extension for each file, and working hyperlinks between all .md files)
  • at least one "single file" export with all documentation included in a single Markdown file
    • (there might be opportunities to generate a few different types of "single file" exports; here's what hex2txt is currently doing, but it would probably make sense for exdoc to generate a version of the single-file export with all "extras" included)
  • a top-level Markdown file that simply links to every generated Markdown file (like a sitemap, i.e. the intention behind the llms.txt proposal)

I would also recommend generating all of this by default, so tooling can start to rely on these files existing :)

mayel added a commit to mayel/ex_doc that referenced this pull request Dec 26, 2024
@mayel
Contributor Author

mayel commented Dec 26, 2024

@mjrusso Thanks for the feedback!

The structure of the files and markdown contents should already match that of the html docs.

In terms of a single file, that was the intention of the ZIP archive containing all the md docs, so it can easily be downloaded from hexdocs and devs can choose which files/modules they want to add as context rather than always including everything, but now I'm thinking we could add a CLI flag to generate either a single file or a ZIP with separate files, and leave the discussion of what the default should be for later?...

I've pushed some WIP I hadn't staged which includes generating an index.md with structured links to all the .md docs.
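As an illustration of the kind of index.md described here, a minimal sketch follows. It is not the code in this PR: `IndexSketch`, its `render/2` function, and the grouping into sections are all hypothetical, and it assumes each page `Foo` is written to `Foo.md` next to the index.

```elixir
defmodule IndexSketch do
  @moduledoc """
  Hypothetical helper (not ExDoc's API): builds the body of an index.md
  that links to every generated Markdown page, grouped by section.
  """

  # `groups` is a list of {section_title, [page_name]} tuples.
  # Each page is assumed to live at "<page_name>.md" beside the index.
  def render(title, groups) do
    sections =
      Enum.map(groups, fn {section, pages} ->
        links = Enum.map(pages, fn page -> "- [#{page}](#{page}.md)\n" end)
        ["## ", section, "\n\n", links, "\n"]
      end)

    # Flatten the nested iodata into a single string.
    IO.iodata_to_binary(["# ", title, "\n\n", sections])
  end
end
```

For example, `IndexSketch.render("MyApp", [{"Modules", ["MyApp", "MyApp.Worker"]}, {"Extras", ["readme"]}])` returns one Markdown string with a heading per section and one relative link per page, which tooling could use as a sitemap.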

@mjrusso

mjrusso commented Dec 27, 2024

> The structure of the files and markdown contents should already match that of the html docs.

Perfect :) I was mostly trying to enumerate my ideal requirements independent of what was already written, just for ease of debate.

> add a CLI flag to generate either a single file or a ZIP with separate files, and leave the discussion of what the default should be for later?...

In general my preference would be to make Markdown generation (in whatever form we decide) the default, with no other configuration options (other than disabling it if you really don't want it) so tools can rely on a common approach.

On the topic of the zip, single file generation, etc.:

> In terms of a single file, that was the intention of the ZIP archive containing all the md docs, so it can easily be downloaded from hexdocs and devs can choose which files/modules they want to add as context rather than always including everything, but now I'm thinking we could add a CLI flag to generate either a single file or a ZIP with separate files, and leave the discussion of what the default should be for later?...

Since we can already download a tarball of all docs from hex.pm (the md files, if generated, would be included by default there as well, correct?), I think we can forgo the zip archive. It is easy enough to get the markdown files from there. (And would these be fetched by default with mix hex.docs?)

Thinking through this a bit more, I think we could forgo the single-file generation (at least for now). Realistically, for AI tooling integration that works, we are going to need a server in between that can manage pulling the right chunks of documentation for any given task. The individual md files being produced here provide the right building blocks.

(Also, instead of "Download Markdown version", perhaps "View Markdown version", which just links to the index.md file.)

@mayel
Contributor Author

mayel commented Dec 27, 2024

Ah, the downloadable docs from hex.pm had completely passed me by! It may be useful to also include that link in the doc footers next to the ePub.

And yeah all makes sense to me, hoping I find some time to work on it a bit more soon :)

@mayel
Contributor Author

mayel commented Dec 27, 2024

> we can already download a tarball of all docs from hex.pm (the md files, if generated, would be included by default there as well, correct?)

The epub is included in that ZIP so I'm guessing yes

@mayel
Contributor Author

mayel commented Dec 27, 2024

Alright I've experimented with updating the footer so it includes:

  • Hex Package (if a hex package is set)
  • View Code:
    • Source Repo (if a source url is known)
    • Hex Preview (if a hex package is set)
  • View Markdown version (if markdown formatter enabled)
  • Download docs archive (from hex, should include html, markdown and epub)
  • Search HexDocs

Note: source repo, hex preview, and view markdown version all link to the file matching the current module/page.
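The conditions in the list above can be sketched roughly as follows. This is an illustration only, not the template code in this PR: `FooterSketch`, the plain-map `config`, and its keys (`:package`, `:source_url`, `:formatters`) are assumed names, and the sketch returns link labels without the per-page URLs.

```elixir
defmodule FooterSketch do
  @moduledoc """
  Hypothetical sketch (not ExDoc's implementation) of choosing footer links
  from project config, following the conditions listed above.
  """

  # `config` is a plain map; every key used here is an assumed name.
  def links(config) do
    [
      # Shown only when a hex package is set.
      if(config[:package], do: "Hex Package"),
      # Shown only when a source URL is known.
      if(config[:source_url], do: "Source Repo"),
      # Shown only when a hex package is set.
      if(config[:package], do: "Hex Preview"),
      # Shown only when the markdown formatter is enabled.
      if("markdown" in List.wrap(config[:formatters]), do: "View Markdown version"),
      "Download docs archive",
      "Search HexDocs"
    ]
    # `if` without an else branch yields nil, so drop the skipped entries.
    |> Enum.reject(&is_nil/1)
  end
end
```

With an empty config only the two unconditional links remain; setting a package, a source URL, and the markdown formatter yields all six labels in the order listed.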

[Screenshot, 2024-12-27 at 15:11:32]

@mayel mayel marked this pull request as ready for review December 27, 2024 16:55
@mayel
Contributor Author

mayel commented Dec 27, 2024

OK, I'm starting to feel pretty good about the generated output (tested with a bunch of projects). I'm probably missing some things, but could use some feedback on the implementation and test coverage :)
