Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filters: --async and --function and --class and various typing ones #21

Closed
simonw opened this issue Jun 19, 2023 · 16 comments
Closed

Filters: --async and --function and --class and various typing ones #21

simonw opened this issue Jun 19, 2023 · 16 comments
Labels
enhancement New feature or request

Comments

@simonw
Copy link
Owner

simonw commented Jun 19, 2023

The --signatures option turns out to be a pretty great way to start navigating a new codebase.

It might be useful to be able to filter by types of content. Potentially the following:

  • --async - return async function (and method) definitions
  • --function - just functions
  • --class - just classes
  • --method - just class methods

These would be additive, so the following:

symbex --signatures --method --function

Would return all methods and all functions.

But... what would this do?

symbex -s --async

Would it return all async functions AND async methods? If so, would combining it with --function or --method not make a difference?

Or should there be a --async-method filter that's different from --async (which only gets async functions)?

@simonw simonw added the enhancement New feature or request label Jun 19, 2023
@simonw
Copy link
Owner Author

simonw commented Jun 19, 2023

I'll prototype it and play with it and see how it feels.

@simonw
Copy link
Owner Author

simonw commented Jun 19, 2023

This is a pretty fun prototype:

% symbex --async -d ../datasette -s | head -n 20 
# File: /Users/simon/Dropbox/Development/datasette/test-plugins/register_output_renderer.py Line: 6
async def can_render(datasette, columns, rows, sql, query_name, database, table, request, view_name)

# File: /Users/simon/Dropbox/Development/datasette/test-plugins/register_output_renderer.py Line: 26
async def render_test_all_parameters(datasette, columns, rows, sql, query_name, database, table, request, view_name, data)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 6
async def can_render(datasette, columns, rows, sql, query_name, database, table, request, view_name)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 26
async def render_test_all_parameters(datasette, columns, rows, sql, query_name, database, table, request, view_name, data)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/register_output_renderer.py Line: 60
async def render_response(request)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/memory_name.py Line: 6
async def startup(datasette)

# File: /Users/simon/Dropbox/Development/datasette/pugins-older/show_open_files.py Line: 6
async def show_open_files()

Needs a bit more thought, then tests and docs.

@simonw
Copy link
Owner Author

simonw commented Jun 19, 2023

I didn't implement --method yet, not completely convinced it is necessary.

@simonw
Copy link
Owner Author

simonw commented Jun 19, 2023

A --typed filter that just returns things that have type signatures - and a --untyped one that returns things without type signatures - might be neat too.

@simonw
Copy link
Owner Author

simonw commented Jun 19, 2023

Maybe this:

  • --typed - any function with at least one of its arguments or return using type annotations
  • --untyped - not a single type annotation
  • --fully-typed - all arguments plus the return value have type annotations (a subset of --typed)
  • --partially-typed - some but not all of the arguments and return value have annotations

If you are working on a project and trying to add types to every single function, you can iterate on it using this to find the functions that still need some work:

symbex --untyped --partially-typed --signatures

@simonw
Copy link
Owner Author

simonw commented Jun 19, 2023

Maybe --method isn't necessary because you can use a '*.*' selector instead?

Or maybe --method is a shortcut for that adds the '*.*' selector?

It would be slightly surprising that while most of these filter options add together, --method --async would filter methods to just the async ones.

For that reason I think using *.* to filter methods may be more consistent.

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

Confirmed, this works already to get all async methods:

symbex -d ../datasette -s --async '*.*'

@simonw simonw changed the title Maybe --async and --function and --class filters? Filters: --async and --function and --class and various typing ones Jun 20, 2023
@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

I built a messy prototype of the typing ones and I like them a lot:

symbex --typed -s
# File: tests/example_symbols.py Line: 50
def func_type_annotations(a: int, b: str) -> bool

# File: tests/example_symbols.py Line: 94
def function_with_non_pep_0484_annotation(x: ?, xx: ?, yy: ?, y: ?, zz: float) -> ?

# File: tests/example_symbols.py Line: 104
def complex_annotations(code: str, symbols: Iterable[str]) -> List[Tuple[(AST, Optional[str])]]

# File: symbex/lib.py Line: 11
def find_symbol_nodes(code: str, filename: str, symbols: Iterable[str]) -> List[Tuple[(AST, Optional[str])]]

# File: symbex/lib.py Line: 35
def code_for_node(code: str, node: AST, class_name: str, signatures: bool) -> Tuple[(str, int)]

# File: symbex/lib.py Line: 66
def match(name: str, symbols: Iterable[str]) -> bool

# File: symbex/lib.py Line: 91
def function_definition(function_node: AST)

# File: symbex/lib.py Line: 193
def annotation_definition(annotation: AST) -> str

# File: symbex/lib.py Line: 241
def annotation_summary(node: AST) -> AnnotationSummary
symbex --untyped -s
# File: tests/example_symbols.py Line: 10
def func_positional_args(a, b, c)

# File: tests/example_symbols.py Line: 15
async def async_func(a, b, c)

# File: tests/example_symbols.py Line: 20
def func_default_args(a, b=2, c=3)

...

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

Am I right that these should always be OR and not AND?

I just tried this and it didn't give me what I expected:

symbex --async --typed -s

Turns out I expected just typed async functions, but I got back ALL async functions and then ALL typed functions.

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

I'm going to try these as AND rather than OR and see how that feels.

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

Yes, AND for filters feels better. Switching to that.

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

This method is failing the test for --fully-typed:

class ClassForTypedTests:
    def method_fully_typed(self, a: int, b: str) -> bool:
        pass

Because self has no type. So I need a special case check for class methods.

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

I'm going to refactor AnnotationSummary - it looks like this:

AnnotationSummary = namedtuple(
    "Summary", ["num_arguments", "num_typed", "return_is_typed"]
)

I'm going to change it to this:

class TypeSummary:
    fully: bool
    partially: bool
)

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

Because self has no type. So I need a special case check for class methods.

That caused another problem: now --typed is returning any method with a self argument because I wrote code that says "treat that as typed" - even though there are no type annotations on anything else.

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

Here's a fun trick:

 llm --system 'Suggest several ways to refactor this' '    if (
        not symbols
        and not signatures
        and not async_
        and not function
        and not class_
        and not typed
        and not untyped
        and not partially_typed
        and not fully_typed
    ):
        ctx = click.get_current_context()
        click.echo(ctx.get_help())
        ctx.exit()'

Output:

There are several ways to refactor this code:

  1. Consolidate all the boolean checks into a list and use all() to check if all items are False:
checks = [
    not symbols, not signatures, not async_,
    not function, not class_, not typed,
    not untyped, not partially_typed,
    not fully_typed
]

if all(checks):
    ctx = click.get_current_context()
    click.echo(ctx.get_help())
    ctx.exit()
  1. Use De Morgan's Law to simplify the boolean checks:
if (
    symbols or signatures or async_ or function or
    class_ or typed or untyped or partially_typed or
    fully_typed
):
    pass
else:
    ctx = click.get_current_context()
    click.echo(ctx.get_help())
    ctx.exit()
  1. Use any() with the negation of the boolean checks:
if not any([
    symbols, signatures, async_, function,
    class_, typed, untyped, partially_typed,
    fully_typed
]):
    ctx = click.get_current_context()
    click.echo(ctx.get_help())
    ctx.exit()

All three options achieve the same result, so it's up to personal preference on which one to choose.

@simonw
Copy link
Owner Author

simonw commented Jun 20, 2023

One last thought: an __init__() method should not need a return type annotation to qualify as not breaing the fully typed rule.

@simonw simonw closed this as completed in 5dbbc20 Jun 20, 2023
simonw added a commit that referenced this issue Jun 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant