Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: retrieve all markdown pages #266

Merged
merged 11 commits into from
Nov 15, 2022
Merged

feat!: retrieve all markdown pages #266

merged 11 commits into from
Nov 15, 2022

Conversation

dandhlee
Copy link
Collaborator

Recovered all markdown pages in nested directory structures. This was not correctly retrieved as I was only looking in the top directory.

Few things that's happening (happy to break down, but was hard to do so to have all the moving parts in one picture):

  • Keeping track of where pages are retrieved. This is vital as we need to know where markdown pages will need to end up in the TOC
  • Merging the Markdown and Package's TOC, based on the two sources pulled separately. I could choose to just move all the markdown files at the root directory, but I think a better experience will be to put the markdown pages where they are located.
  • Remove unused markdown pages that are generated. These are redundant API pages that Sphinx generated but are not used by DocFX Yaml plugin.

Only the handwritten TOC is updated and works as expected, gapic-combo and gapic-auto doesn't contain any markdown pages in nested directories.

Some markdown pages might come with a mix of handwritten guides and autogenerated API content. I plan on working with the library maintainers to have this updated throughout (<10) for the purpose of this plugin that separates Sphinx HTML and DocFX YAML content.

Breaking Change: Upgrading to using 3.9. All Python client libraries are able to use 3.9, so this is not a problem, but will be a backwards incompatible change.

Fixes #217.

  • Tests pass

@dandhlee dandhlee requested review from a team as code owners November 14, 2022 09:39
@product-auto-label product-auto-label bot added the size: xl Pull request size is extra large. label Nov 14, 2022
@@ -0,0 +1,52 @@
# Modules for Python Storage
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is generated as part of the library's structure for handwritten library. We shouldn't use this in production, but WAI with the way that the directory is set up. In practice, this will need to be removed in the actual library and have the Sphinx config re-structured.

Comment on lines 1578 to 1579
pkg_toc_queue = [package for package in pkg_toc_yaml]
for entry in pkg_toc_queue:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use list(pkg_toc_yaml) to create a copy?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

pkg_toc_queue = [package for package in pkg_toc_yaml]
for entry in pkg_toc_queue:
if (children := entry.get('items')):
pkg_toc_queue.extend(children)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This modifies the list being iterated over, which is generally bad. Consider something like flatten_items which can be recursive and flattens the list of items.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added _flatten_tocs entry to address this.

base_markdown_dir = Path(app.builder.outdir).parent / "markdown"

markdown_dir = (
base_markdown_dir / '/'.join(cwd)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
base_markdown_dir / '/'.join(cwd)
base_markdown_dir.joinpath(*cwd)

(Using '/' and pathlib at the same time is a red flag.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you for this!

@@ -4,6 +4,16 @@
- href: changelog.md
name: Changelog
- items:
- href: blobs.md
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these test files need to be as long as they are?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The content should be separated, but works as intended - blobs and buckets should be filtered but there is no class entry with the exact name, but blob and bucket exists. This is the case for ACL and other libraries: there is a markdown file generated but not moved over as we see that there is an entry for which it exists.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See blobs versus ACL: similar entries but blobs is not filtered out and ACL is filtered.

Copy link
Collaborator Author

@dandhlee dandhlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Please take a look again.

Comment on lines 1578 to 1579
pkg_toc_queue = [package for package in pkg_toc_yaml]
for entry in pkg_toc_queue:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

base_markdown_dir = Path(app.builder.outdir).parent / "markdown"

markdown_dir = (
base_markdown_dir / '/'.join(cwd)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thank you for this!

@@ -4,6 +4,16 @@
- href: changelog.md
name: Changelog
- items:
- href: blobs.md
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The content should be separated, but works as intended - blobs and buckets should be filtered but there is no class entry with the exact name, but blob and bucket exists. This is the case for ACL and other libraries: there is a markdown file generated but not moved over as we see that there is an entry for which it exists.

@@ -4,6 +4,16 @@
- href: changelog.md
name: Changelog
- items:
- href: blobs.md
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See blobs versus ACL: similar entries but blobs is not filtered out and ACL is filtered.

pkg_toc_queue = [package for package in pkg_toc_yaml]
for entry in pkg_toc_queue:
if (children := entry.get('items')):
pkg_toc_queue.extend(children)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added _flatten_tocs entry to address this.

@dandhlee dandhlee requested a review from tbpg November 15, 2022 07:26
Comment on lines 1584 to 1585
toc_queue.extend(_flatten_toc(children))
toc_queue.extend(children)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is still modifying toc_queue while looping over it. Can you iterate over toc_yaml_entry instead?

If I'm understanding correctly, we don't need a queue at all. Just need to iterate over the "graph" and put it all into a single list.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops 🤦 Made updates.

@dandhlee dandhlee requested a review from tbpg November 15, 2022 13:58
@dandhlee
Copy link
Collaborator Author

Please take a look again!

for entry in toc_yaml_entry:
if (children := entry.get('items')):
entries.extend(_flatten_toc(children))
entries.extend(children)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this line end up double-adding? _flatten_toc is called with children, which will then return children (and more), then we add children again right here.

Seems like this needs a test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦🤦 Yes you're right. Removed the redundant line, added unit test coverage as well. I didn't add it before because it required sphinx.app ... before, but not anymore so fine to add now.

@dandhlee dandhlee requested a review from tbpg November 15, 2022 16:56
@dandhlee
Copy link
Collaborator Author

Please take a look again!

@dandhlee dandhlee merged commit 1cee1ed into main Nov 15, 2022
@dandhlee dandhlee deleted the retrieve_directories branch November 15, 2022 19:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size: xl Pull request size is extra large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing markdown pages in directories
2 participants