Docs generated from javadoc get mangled when using html tags or taglets (like link, code) #21
Labels: enhancement (New feature or request)
other html tags to consider: |
simonbasle added a commit to simonbasle/micrometer-docs-generator that referenced this issue on Sep 23, 2022:

This commit attempts to fix the generated asciidoc for javadoc blocks that are not trivial:

A custom parsing of Roaster `getJavadoc()` model is used instead of `getText()`/`getFullText()` to avoid mangling the original javadoc.

We attempt to convert a simple subset of HTML tags to their asciidoc equivalents and to convert inline taglets to a relevant asciidoc representation if any.

For HTML, we convert:
- `p`, `br`, `b` and `i` to their direct equivalents
- `strong` tags (including inline) to an `IMPORTANT:` admonition
- `ul`/`ol` lists and their `li` elements in a best effort fashion

In case an `ol` is detected we have to turn ALL `li` to asciidoc ordered list elements. All other unknown HTML tags are removed but their node content is kept.

For taglets:
- `@code` and `@value` taglets have their content turned into inline code blocks
- `@link` and `@linkplain` taglets consider whether an alias text is provided. If so, only the alias text is provided in the asciidoc. If not, the target of the link is provided in the asciidoc as an inline code block.
- unknown taglets are copied as an inline code block

Fixes micrometer-metrics#21.
simonbasle added a commit to simonbasle/micrometer-docs-generator that referenced this issue on Sep 23, 2022:

This commit attempts to fix the generated asciidoc for javadoc blocks that are not trivial:

A custom parsing of Roaster `getJavadoc()` model is used instead of `getText()`/`getFullText()` to avoid mangling the original javadoc.

We attempt to convert a simple subset of HTML tags to their asciidoc equivalents and to convert inline taglets to a relevant asciidoc representation if any.

For HTML, we convert:
- `p`, `br`, `b` and `i` to their direct equivalents
- `strong` tags (including inline) to an `IMPORTANT:` admonition
- `ul`/`ol` lists and their `li` elements in a best effort fashion

In case an `ol` is detected we have to turn ALL `li` to asciidoc ordered list elements. All other unknown HTML tags are removed but their node content is kept.

For taglets:
- `@code` and `@value` taglets have their content turned into inline code blocks
- `@link` and `@linkplain` taglets consider whether an alias text is provided. If so, only the alias text is provided in the asciidoc. If not, the target of the link is provided in the asciidoc as an inline code block.
- unknown taglets are copied as an inline code block

Additionally, in order to ensure these asciidoc javadoc conversions are correctly rendered in the output file, this commit polishes the syntax of quotes and tables:
- the `____` block style for quote is used instead of a single `>`
- column format instruction `[cols="a,a"]` is used for tables

Fixes micrometer-metrics#21.
marcingrzejszczak pushed a commit that referenced this issue on Sep 26, 2022:

This commit attempts to fix the generated asciidoc for javadoc blocks that are not trivial:

A custom parsing of Roaster `getJavadoc()` model is used instead of `getText()`/`getFullText()` to avoid mangling the original javadoc.

We attempt to convert a simple subset of HTML tags to their asciidoc equivalents and to convert inline taglets to a relevant asciidoc representation if any.

For HTML, we convert:
- `p`, `br`, `b` and `i` to their direct equivalents
- `strong` tags (including inline) to an `IMPORTANT:` admonition
- `ul`/`ol` lists and their `li` elements in a best effort fashion

In case an `ol` is detected we have to turn ALL `li` to asciidoc ordered list elements. All other unknown HTML tags are removed but their node content is kept.

For taglets:
- `@code` and `@value` taglets have their content turned into inline code blocks
- `@link` and `@linkplain` taglets consider whether an alias text is provided. If so, only the alias text is provided in the asciidoc. If not, the target of the link is provided in the asciidoc as an inline code block.
- unknown taglets are copied as an inline code block

Additionally, in order to ensure these asciidoc javadoc conversions are correctly rendered in the output file, this commit polishes the syntax of quotes and tables:
- the `____` block style for quote is used instead of a single `>`
- column format instruction `[cols="a,a"]` is used for tables

Fixes #21.
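To illustrate the `@link`/`@linkplain` rule described in the commit message, here is a minimal sketch. It is not the code from the commit, which works on Roaster's javadoc model. The class `LinkTagletSketch` and its `render` method are hypothetical names, and the sketch assumes the taglet body has already been extracted as a plain string (the part between `{@link` and `}`).

```java
final class LinkTagletSketch {

    // Renders the body of a {@link ...} or {@linkplain ...} taglet.
    // If an alias follows the link target, only the alias is returned;
    // otherwise the target is returned as an asciidoc inline code span.
    static String render(String tagletBody) {
        String body = tagletBody.trim();
        int depth = 0; // parenthesis depth, so "Foo#bar(int, String)" is not split
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')') {
                depth--;
            } else if (Character.isWhitespace(c) && depth == 0) {
                // First whitespace outside parentheses separates target and alias.
                return body.substring(i).trim();
            }
        }
        // No alias found: show the link target as inline code.
        return "`" + body + "`";
    }
}
```

With this sketch, a body of `Foo#bar(int, String) the bar method` renders as the alias text only, while `Foo#bar()` renders as an inline code span.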
Consider the following javadoc:
Which seems to produce the following kind of output:
Currently the tool uses Roaster's `getJavadoc().getText()` method, which strips the taglets and seems also to strip the HTML tags from the returned `String`.

Perhaps `getJavadoc().getFullText()` would be a better alternative, but then we have to consider HTML and taglets in the asciidoc output :/

Option 1: naive sanitization to HTML
One way of doing basic sanitization would be to:
- use `getFullText()`
- search for `{@xxx` and replace with an opening code tag `<code>`
- search for `}` and replace by a closing code tag `</code>`
- wrap the result in a `++++` passthrough block

Drawback: this is incompatible with asciidoctor-pdf generation.
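As a rough sketch of what Option 1 could look like, assuming the javadoc text has already been obtained as a `String` from `getFullText()`: the class `NaiveHtmlSanitizer` and its method name are hypothetical, and the regex assumes inline taglets contain no nested braces.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NaiveHtmlSanitizer {

    // Matches a simple inline taglet such as {@code foo} or {@link Bar#baz()}.
    // Assumption: the taglet body contains no nested '{' or '}'.
    private static final Pattern INLINE_TAGLET =
            Pattern.compile("\\{@\\w+\\s*([^{}]*)\\}");

    // Replaces each inline taglet with <code>body</code>, keeps existing HTML
    // untouched, and wraps everything in an asciidoc ++++ passthrough block.
    static String toPassthroughBlock(String fullText) {
        String html = INLINE_TAGLET.matcher(fullText).replaceAll(
                r -> Matcher.quoteReplacement("<code>" + r.group(1).trim() + "</code>"));
        return "++++\n" + html + "\n++++";
    }
}
```

The passthrough block hands the HTML to the HTML backend as-is, which is why this approach does not carry over to asciidoctor-pdf.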
Option 2: naive sanitization to Asciidoc
This implies more effort to convert a basic set of common HTML tags to asciidoc. I'd consider `p`, `br` for a start. For taglets, I'd consider `@link` and `@code` as the minimum viable set.
- use `getFullText()`
- convert `<p>` to double newline, remove `</p>`
- convert `<br>`/`<br/>` to newline
- convert `{@code xxx}` to `` `xxx` ``
- convert `{@link xxx}` to `` `xxx` `` too (not super reliable especially with links + description, but eh)
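The Option 2 steps can be sketched roughly as follows, assuming the input is the `String` returned by `getFullText()`. The class `NaiveAsciidocSanitizer` and its method are hypothetical names, only `p`, `br`, `@code` and `@link` are handled, and taglets with nested braces are not supported.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NaiveAsciidocSanitizer {

    // Matches {@code xxx} and {@link xxx}; assumes no nested braces in the body.
    private static final Pattern CODE_OR_LINK =
            Pattern.compile("\\{@(?:code|link)\\s+([^{}]*)\\}");

    static String toAsciidoc(String fullText) {
        String text = fullText
                .replaceAll("(?i)<p>", "\n\n")      // <p> becomes a paragraph break
                .replaceAll("(?i)</p>", "")         // </p> is simply dropped
                .replaceAll("(?i)<br\\s*/?>", "\n"); // <br> and <br/> become a newline
        // Both taglets become an inline code span holding the whole taglet body,
        // so a link description ends up inside the backticks as well.
        return CODE_OR_LINK.matcher(text).replaceAll(
                r -> Matcher.quoteReplacement("`" + r.group(1).trim() + "`"));
    }
}
```

As noted in the list above, a `{@link}` with a description is rendered with both target and description inside the backticks, which is the "not super reliable" part of this approach.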