ref: Improve scrub_dict typing #2768

Merged
szokeasaurusrex merged 7 commits into master from szokeasaurusrex/scrubber-typing
Mar 11, 2024

szokeasaurusrex
Member

This change improves the typing of the scrub_dict method.

Previously, the scrub_dict method's type hints indicated that only dict[str, Any] was accepted as the parameter. However, the method is actually implemented to accept any object, since it checks the types of the parameters at runtime. Therefore, object is a more appropriate type hint for the parameter.
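
As an illustration of the pattern described above, here is a minimal sketch of a function that is annotated as accepting `object` and narrows the type itself at runtime. The standalone function, the `denylist` parameter, and the `FILTERED` placeholder are assumptions made for this example; they are not the SDK's actual implementation.

```python
from typing import FrozenSet

FILTERED = "[Filtered]"  # placeholder value chosen for this sketch


def scrub_dict(d, denylist):
    # type: (object, FrozenSet[str]) -> None
    """Replace denylisted values in place; ignore anything that is not a dict."""
    if not isinstance(d, dict):
        return  # any object is accepted, but only dicts are modified
    for key in d:
        # Reassigning an existing key does not resize the dict,
        # so it is safe to do while iterating over the keys.
        if isinstance(key, str) and key.lower() in denylist:
            d[key] = FILTERED
```

Because the runtime check lives inside the function, a caller may pass a value of unknown type and the call is still well defined: non-dict arguments are simply left alone.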

#2753 depends on this change for mypy to pass

Also, improve type annotations in the merged code.
These `isinstance` checks are already performed inside the methods that they were wrapping
szokeasaurusrex requested review from antonpirker and removed request for sl0thentr0py February 27, 2024 16:06
szokeasaurusrex self-assigned this Mar 8, 2024
sentrivana (Contributor) left a comment

Left some comments, please take a look.

sentry_sdk/scrubber.py (outdated)
@@ -66,7 +66,7 @@ def __init__(self, denylist=None, recursive=False):
         self.recursive = recursive

     def scrub_list(self, lst):
-        # type: (List[Any]) -> None
+        # type: (object) -> None
Contributor

I get where you're coming from (the method theoretically does support any type), but as a user I'd find it confusing that a method called scrub_list accepts an argument lst of any type.

Since types are not enforced at runtime, users are free to ignore the type hint that says a List[Any] should be given to this method, which is why the isinstance(lst, list) check is there now -- just to make sure we don't explode if someone misuses the function. I feel like by changing the type hint to say this function supports any object, we're making it harder for users to understand how this method is meant to be used.

Member Author

I need this change for the Event TypedDict changes (#2753). Originally, I was going to include these changes there, but I split these into a separate PR, since they are not strictly related.

We should use the object type here because, without this change, the sentry_sdk.scrubber.EventScrubber.scrub_request method triggers type errors on the #2753 code. The reason is that event["request"] has type dict[str, object], since the request object (at least as I understand it) can contain an arbitrary mapping from strings to any object. We then call scrub_dict on several objects within event["request"] (such as event["request"]["headers"]), which the type checker sees as having type object. However, without this PR, scrub_dict is declared as accepting only dict[str, Any], so type checking fails: based on the type hint, the code appears not to be type safe.

However, because scrub_dict performs the isinstance check itself, the code is in fact type safe. We can pass any object to scrub_dict and the code will behave correctly.
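
The mismatch described here can be reproduced in a few lines. This is a sketch under stated assumptions: `scrub_strict`, `scrub_lenient`, and the sample `request` dict are hypothetical names invented for the example, and the comment about the checker describes the expected behavior of a tool such as mypy rather than quoted output.

```python
from typing import Any, Dict


def scrub_strict(d):
    # type: (Dict[str, Any]) -> None
    """Narrow hint: callers must prove the argument is a dict."""
    if isinstance(d, dict):
        d.pop("authorization", None)


def scrub_lenient(d):
    # type: (object) -> None
    """Wide hint: the runtime check inside makes any argument safe."""
    if isinstance(d, dict):
        d.pop("authorization", None)


request = {
    "headers": {"authorization": "secret", "accept": "*/*"},
}  # type: Dict[str, object]

headers = request["headers"]  # static type: object

# A type checker is expected to reject the next call, because a value of
# type `object` is not compatible with a Dict[str, Any] parameter:
# scrub_strict(headers)

# Accepted without a cast or an isinstance guard at the call site:
scrub_lenient(headers)
```

Both functions behave identically at runtime; only the declared parameter type differs, and that alone decides whether the call site type-checks.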

> I feel like by changing the type hint to say this function supports any object, we're making it harder for users to understand how this method is meant to be used.

I understand your point, but there is already a doc comment here to explain that the method only does anything if the argument passed is a list, and I will also add a similar comment to the scrub_dict method to clarify.

I prefer object here, since in my opinion, the purpose of type hints is to communicate to users what type they need to pass in order for type safety to be ensured. For these methods, users can pass objects even with an unknown type without violating the method contract, and type safety is always guaranteed. Going back to the scrub_request example that prompted me to make this change, having type object makes it clear that I can safely pass event["request"]["headers"] without first adding an isinstance check to make sure that event["request"]["headers"] is a dictionary.

Contributor

Gotcha -- then let's go with your change.

sentry_sdk/scrubber.py (resolved)
sentry_sdk/scrubber.py (resolved)
Comment on lines 94 to 95
self.scrub_dict(v)
self.scrub_list(v)
Contributor

Same comment as above regarding removing the isinstance checks here.
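
For context, the two calls on the commented lines can be made unconditionally when each method checks its own argument and returns early on a type it does not handle. A minimal sketch of that recursion follows; `RecursiveScrubber` and `FILTERED` are hypothetical names for this example and the class is not the SDK's EventScrubber.

```python
from typing import List

FILTERED = "[Filtered]"  # placeholder value chosen for this sketch


class RecursiveScrubber:
    """Hypothetical class for illustration; not the SDK's EventScrubber."""

    def __init__(self, denylist):
        # type: (List[str]) -> None
        self.denylist = {key.lower() for key in denylist}

    def scrub_list(self, lst):
        # type: (object) -> None
        if not isinstance(lst, list):
            return
        for item in lst:
            # No isinstance guards here: each method returns early
            # when handed a value it does not handle.
            self.scrub_dict(item)
            self.scrub_list(item)

    def scrub_dict(self, d):
        # type: (object) -> None
        if not isinstance(d, dict):
            return
        for k, v in d.items():
            if isinstance(k, str) and k.lower() in self.denylist:
                d[k] = FILTERED  # overwrite in place; dict size is unchanged
            else:
                self.scrub_dict(v)
                self.scrub_list(v)
```

Wrapping the inner calls in `isinstance` checks would repeat the test each method already performs on entry, which is why the guards at the call sites are redundant in this design.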

szokeasaurusrex enabled auto-merge (squash) March 11, 2024 09:33
szokeasaurusrex merged commit 461bd59 into master Mar 11, 2024
123 checks passed
szokeasaurusrex deleted the szokeasaurusrex/scrubber-typing branch March 11, 2024 09:52
szokeasaurusrex added a commit that referenced this pull request Mar 13, 2024
* ref: Improve scrub_dict typing (#2768)

This change improves the typing of the scrub_dict method.

Previously, the scrub_dict method's type hints indicated that only dict[str, Any] was accepted as the parameter. However, the method is actually implemented to accept any object, since it checks the types of the parameters at runtime. Therefore, object is a more appropriate type hint for the parameter.

#2753 depends on this change for mypy to pass

* Propagate sentry-trace and baggage to huey tasks (#2792)

This PR enables passing `sentry-trace` and `baggage` headers to background tasks using the Huey task queue.

This allows easily correlating what happens inside a background task with whatever transaction (e.g. a user request in a Django application) queued the task in the first place.

Periodic tasks do not get these headers, because otherwise each execution of the periodic task would be tied to the same parent trace (the long-running worker process).

--- 

Co-authored-by: Anton Pirker <anton.pirker@sentry.io>

* OpenAI integration (#2791)

* OpenAI integration

* Fix linting errors

* Fix CI

* Fix lint

* Fix more CI issues

* Run tests on version pinned OpenAI too

* Fix pydantic issue in test

* Import type in TYPE_CHECKING gate

* PR feedback fixes

* Fix tiktoken test variant

* PII gate the request and response

* Rename set_data tags

* Move doc location

* Add "exclude prompts" flag as optional

* Change prompts to be excluded by default

* Set flag in tests

* Fix tiktoken tox.ini extra dash

* Change strip PII semantics

* More test coverage for PII

* notiktoken

---------

Co-authored-by: Anton Pirker <anton.pirker@sentry.io>

* Add a method for normalizing data passed to set_data (#2800)

* Discard open spans after 10 minutes (#2801)

OTel spans that are handled in the Sentry span processor can never be finished/closed. This leads to a memory leak. This change makes sure that open spans will be removed from memory after 10 minutes to prevent memory usage from growing constantly.

Fixes #2722
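
The idea of dropping spans that stay open for more than 10 minutes can be sketched as a registry that records when each span was opened and prunes stale entries. Everything named here (`OpenSpanRegistry`, `on_start`, `on_end`, `prune`, the `now` parameter) is an assumption for illustration and does not reflect how the SDK's span processor is actually written.

```python
import time
from typing import Dict, Optional

SPAN_MAX_AGE_SECONDS = 10 * 60  # ten minutes, as stated in the commit message


class OpenSpanRegistry:
    """Hypothetical registry of open spans, keyed by span id."""

    def __init__(self):
        # type: () -> None
        self._opened_at = {}  # type: Dict[str, float]

    def on_start(self, span_id, now=None):
        # type: (str, Optional[float]) -> None
        self._opened_at[span_id] = time.time() if now is None else now
        self.prune(now)  # piggyback cleanup on every new span

    def on_end(self, span_id):
        # type: (str) -> None
        self._opened_at.pop(span_id, None)

    def prune(self, now=None):
        # type: (Optional[float]) -> None
        current = time.time() if now is None else now
        cutoff = current - SPAN_MAX_AGE_SECONDS
        # Collect first, then delete, so the dict is not resized mid-iteration.
        expired = [sid for sid, ts in self._opened_at.items() if ts < cutoff]
        for sid in expired:
            del self._opened_at[sid]

    def __len__(self):
        # type: () -> int
        return len(self._opened_at)
```

Pruning on each `on_start` call avoids a background timer: memory stays bounded as long as new spans keep arriving, which is the case where the leak would otherwise grow.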

---------

Co-authored-by: Daniel Szoke <szokeasaurusrex@users.noreply.github.com>

* ref: Event Type (#2753)

Implements type hinting for Event via a TypedDict. This commit mainly adjusts type hints; however, there are also some minor code changes to make the code type-safe following the new changes.

Some items in the Event could have their types expanded by being defined as TypedDicts themselves. These items have been indicated with TODO comments.

Fixes GH-2357
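
To illustrate what typing an event as a TypedDict looks like, here is a heavily reduced sketch. `MiniEvent` and its three fields are assumptions chosen for the example; the real Event type has many more keys, and only the `dict[str, object]` type of `request` is taken from the PR discussion.

```python
from typing import Dict, TypedDict  # TypedDict requires Python 3.8+


class MiniEvent(TypedDict, total=False):
    """Hypothetical, reduced event shape for illustration only."""

    event_id: str
    level: str
    # Could itself be expanded into a TypedDict with known keys.
    request: Dict[str, object]


event: MiniEvent = {
    "event_id": "abc123",
    "request": {"headers": {"accept": "*/*"}},
}

# Values inside `request` are typed as `object`, so code that consumes
# them must either narrow the type or accept `object` directly.
headers = event["request"]["headers"]
```

At runtime a TypedDict is a plain dict, so this only affects what a static type checker can verify.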

* Fix mypy in `client.py`

* Fix functools import

* Fix CI config problem

... by running `python scripts/split-tox-gh-actions/split-tox-gh-actions.py`

---------

Co-authored-by: Christian Schneider <christian@cnschn.com>
Co-authored-by: Anton Pirker <anton.pirker@sentry.io>
Co-authored-by: colin-sentry <161344340+colin-sentry@users.noreply.github.com>