Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add support for Property and missing content #41

Merged
merged 7 commits into from
Jun 17, 2021
Merged

Conversation

dandhlee
Copy link
Collaborator

@dandhlee dandhlee commented Jun 15, 2021

Before you open a pull request, note that this repository is forked from here.
Unless the issue you're trying to solve is unique to this specific repository,
please file an issue and/or send changes upstream to the original as well.


While investigating for adding support for property, I managed to add a section for content discrepancy as both fo them seem to be missing with the newer Sphinx upgrade from 1.x to 2.1.0+

This PR achieves the following:

  • Adds handler for property so that they properly show in the YAML hierarchy
  • Adds handler for missing documentation snippets for each entries
    • for each documentation snippet, sanitizes it into useful subsections for display in HTML pages
    • currently has handlers for :type, :rtype:, :returns:, :raises, and :param

@busunkim96 I added handlers for subsections based on the 5 given above from python-spanner, for example:

    def execute_sql(
        self,
        sql,
        params=None,
        param_types=None,
        query_mode=None,
        query_options=None,
        retry=google.api_core.gapic_v1.method.DEFAULT,
        timeout=google.api_core.gapic_v1.method.DEFAULT,
    ):  
        """Perform an ``ExecuteStreamingSql`` API request.

        :type sql: str
        :param sql: SQL query statement

        :type params: dict, {str -> column value}
        :param params: values for parameter replacement.  Keys must match
                       the names used in ``sql``.

        :type param_types:
            dict, {str -> :class:`~google.spanner.v1.types.TypeCode`}
        :param param_types: (Optional) explicit types for one or more param
                            values;  overrides default type detection on the
                            back-end.

        :type query_mode:
            :class:`~google.spanner.v1.types.ExecuteSqlRequest.QueryMode`
        :param query_mode: Mode governing return of results / query plan. See:
            `QueryMode <https://cloud.google.com/spanner/reference/rpc/google.spanner.v1#google.spanner.v1.ExecuteSqlRequest.QueryMode>`_.

        :type query_options:
            :class:`~google.cloud.spanner_v1.types.ExecuteSqlRequest.QueryOptions`
            or :class:`dict`
        :param query_options: (Optional) Options that are provided for query plan stability.

        :type retry: :class:`~google.api_core.retry.Retry`
        :param retry: (Optional) The retry settings for this request.

        :type timeout: float
        :param timeout: (Optional) The timeout for this request.

        :rtype: :class:`~google.cloud.spanner_v1.streamed.StreamedResultSet`
        :returns: a result set instance which can be used to consume rows.
        """

given with such documentation support. Should I expect the client libraries to be accustomed to this, or should I take a more general approach and try to guess based on given parameters or simply leave them blank if I'm not given these in the documentation?

Fixes #40

  • Tests pass
  • Appropriate changes to README are included in PR
  • Send part of this change upstream for support in missing content / handler for property

@dandhlee dandhlee added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Jun 15, 2021
@dandhlee dandhlee requested review from busunkim96 and tbpg June 15, 2021 07:13
@dandhlee dandhlee requested a review from a team as a code owner June 15, 2021 07:13
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jun 15, 2021
docfx_yaml/extension.py Outdated Show resolved Hide resolved
docfx_yaml/extension.py Outdated Show resolved Hide resolved
# Only add summary to parts of the code that we don't get it from the monkeypatch
if _type == MODULE:
summary_info = {
'variables': {}, # Stores mapping of variables and its description, types
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing something after "types" in the comment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meant to say and, will add &

docfx_yaml/extension.py Outdated Show resolved Hide resolved
'exceptions': [] # Stores the exception info
}

# Add extracted summary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any of this we can pull out into separate functions? The indentation gets pretty deep here, making it more difficult to follow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll create a separate function for it. I agree it was slightly hectic to follow myself.


if args or sig:
# Clean the string by cleaning newlines and backlashes, then split by white space.
parsed_text = " ".join(filter(None, re.split('\n| |\\\\',summary))).split(" ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we handle all forms of whitespace (\s)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should, given that we're not sure how Sphinx might port over the docs (it gets easily littered with newlines and carriage returns)


if args or sig:
# Clean the string by cleaning newlines and backlashes, then split by white space.
parsed_text = " ".join(filter(None, re.split('\n| |\\\\',summary))).split(" ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use r'...' to avoid escaping issues?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update for readability.


# Using counter iteration to easily extract names rather than
# coming up with more complicated stopping logic for each tags.
while index <= len(parsed_text):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall comment: I'm surprised Sphinx doesn't handle this parsing under the hood. If it doesn't and we have to do this, that's fine. But, just commenting to confirm there isn't some sort of already-parsed representation of the comment?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some research I've found this docstring module which helps take in GoogleDocstrings and clean things up. This would help clean the above few lines.

Given this reStructuredText format, Sphinx is able to display them nicely like shown here; however since we need to port these into the YAMLs the parser here would be necessary to implement.

I'll clean the code here as much as I can!

@@ -70,7 +70,9 @@ class Bcolors:
REFMETHOD = 'meth'
REFFUNCTION = 'func'
INITPY = '__init__.py'
REF_PATTERN = ':(py:)?(func|class|meth|mod|ref):`~?[a-zA-Z_\.<> ]*?`'
REF_PATTERN = ':(py:)?(func|class|meth|mod|ref|attr|exc):`~?[a-zA-Z0-9_\.<> ]*?`'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall comment (not just this line): Any tests for this file/these changes?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add some files for tests since we have the infrastructure for it already.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there tests that verify the output?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've realized the test simply generates content but it doesn't seem to verify the output. Filed #44 to track this, after offline chat moving forward at the moment with generation part and adding back the verification after.

Copy link
Contributor

@busunkim96 busunkim96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's odd that this needs to be handled separately. The docstrings in spanner are considered "default" by sphinx. The docstrings you see in the generated libraries and most newer libraries are "google style" and are processed through the Napoleon package. https://sphinxcontrib-napoleon.readthedocs.io/en/latest/index.html

try:
lines = inspect.getdoc(obj)
lines = lines.split("\n") if lines else []
except Exception as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to capture a more specific Exception type?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update to reflect more specific types.

'exceptions': [] # Stores the exception info
}

# Add extracted summary
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Collaborator Author

@dandhlee dandhlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! @busunkim96 I was able to find the napoleon module however since Sphinx stops after having reStructuredText I'd still have to take that and convert them to YAML friendly structure, needing this parser. I'll clean the code up in a bit.

docfx_yaml/extension.py Outdated Show resolved Hide resolved
# Only add summary to parts of the code that we don't get it from the monkeypatch
if _type == MODULE:
summary_info = {
'variables': {}, # Stores mapping of variables and its description, types
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

meant to say and, will add &

'exceptions': [] # Stores the exception info
}

# Add extracted summary
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll create a separate function for it. I agree it was slightly hectic to follow myself.


if args or sig:
# Clean the string by cleaning newlines and backlashes, then split by white space.
parsed_text = " ".join(filter(None, re.split('\n| |\\\\',summary))).split(" ")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should, given that we're not sure how Sphinx might port over the docs (it gets easily littered with newlines and carriage returns)


if args or sig:
# Clean the string by cleaning newlines and backlashes, then split by white space.
parsed_text = " ".join(filter(None, re.split('\n| |\\\\',summary))).split(" ")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update for readability.

try:
lines = inspect.getdoc(obj)
lines = lines.split("\n") if lines else []
except Exception as e:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will update to reflect more specific types.

@@ -70,7 +70,9 @@ class Bcolors:
REFMETHOD = 'meth'
REFFUNCTION = 'func'
INITPY = '__init__.py'
REF_PATTERN = ':(py:)?(func|class|meth|mod|ref):`~?[a-zA-Z_\.<> ]*?`'
REF_PATTERN = ':(py:)?(func|class|meth|mod|ref|attr|exc):`~?[a-zA-Z0-9_\.<> ]*?`'
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add some files for tests since we have the infrastructure for it already.


# Using counter iteration to easily extract names rather than
# coming up with more complicated stopping logic for each tags.
while index <= len(parsed_text):
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some research I've found this docstring module which helps take in GoogleDocstrings and clean things up. This would help clean the above few lines.

Given this reStructuredText format, Sphinx is able to display them nicely like shown here; however since we need to port these into the YAMLs the parser here would be necessary to implement.

I'll clean the code here as much as I can!

@dandhlee
Copy link
Collaborator Author

Updated to address the following:

  • refactored parser into _extract_docstring_info function
  • makes use of GoogleDocstring function to handle both reStructuredText content coming in, as well as Google style docstrings coming in.
    • keeping the sanitizer with re.split for further sanitizing the string for parsing purpose
  • ignores cls entries on parameters which are self arguments passed in
    • added coverage for *kwarg and **kwargs entries
  • catch specific types and not generic Exception types
  • added test cases to cover for updated regex and one of spanner's functions with various types to process

@dandhlee dandhlee requested review from busunkim96 and tbpg June 16, 2021 11:21
Copy link
Contributor

@tbpg tbpg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, looks good. Still more manual parsing than I'd like, but it sounds like it's necessary.

@@ -70,7 +70,9 @@ class Bcolors:
REFMETHOD = 'meth'
REFFUNCTION = 'func'
INITPY = '__init__.py'
REF_PATTERN = ':(py:)?(func|class|meth|mod|ref):`~?[a-zA-Z_\.<> ]*?`'
REF_PATTERN = ':(py:)?(func|class|meth|mod|ref|attr|exc):`~?[a-zA-Z0-9_\.<> ]*?`'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there tests that verify the output?

@dandhlee dandhlee requested review from a team and tbpg June 17, 2021 18:50
@dandhlee dandhlee merged commit 5ac499f into master Jun 17, 2021
@dandhlee dandhlee deleted the lost_content branch June 17, 2021 19:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fix content missing compared to googleapis.dev
3 participants