This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Close ijson coroutines ourselves instead of letting the GC close them #12875

Merged
squahtx merged 2 commits into release-v1.60 from squah/close_ijson_coroutines
May 27, 2022

@squahtx squahtx commented May 25, 2022

Hopefully this means that exceptions raised due to truncated JSON
get a sensible logging context and stack.

Signed-off-by: Sean Quah <seanq@matrix.org>


On matrix.org sentry, we are seeing a lot of Exception ignored in: <generator object utf8_encoder at 0x7f87c9b1ccf0> errors without a proper stack trace or logging context.

Digging into the logs, we find something like:

2022-05-25 05:15:16,092 - twisted - 279 - ERROR - sentinel - Exception ignored in: <generator object utf8_encoder at 0x7ff6c944b6d0>
2022-05-25 05:15:16,093 - twisted - 279 - ERROR - sentinel - Traceback (most recent call last):
2022-05-25 05:15:16,095 - twisted - 279 - ERROR - sentinel -   File "/.../site-packages/ijson/backends/python.py", line 46, in utf8_encoder
2022-05-25 05:15:16,098 - twisted - 279 - ERROR - sentinel -     target.close()
2022-05-25 05:15:16,099 - twisted - 279 - ERROR - sentinel -   File "/.../site-packages/ijson/backends/python.py", line 116, in Lexer
2022-05-25 05:15:16,100 - twisted - 279 - ERROR - sentinel -     target.send(EOF)
2022-05-25 05:15:16,101 - twisted - 279 - ERROR - sentinel -   File "/.../site-packages/ijson/backends/python.py", line 161, in parse_value
2022-05-25 05:15:16,102 - twisted - 279 - ERROR - sentinel -     raise common.IncompleteJSONError('Incomplete JSON content')
2022-05-25 05:15:16,103 - twisted - 279 - ERROR - sentinel - ijson.common.IncompleteJSONError: Incomplete JSON content
2022-05-25 05:15:16,104 - twisted - 279 - ERROR - sentinel - Exception ignored in: <generator object utf8_encoder at 0x7ff614ca97b0>
2022-05-25 05:15:16,105 - twisted - 279 - ERROR - sentinel - Traceback (most recent call last):
2022-05-25 05:15:16,106 - twisted - 279 - ERROR - sentinel -   File "/.../site-packages/ijson/backends/python.py", line 46, in utf8_encoder
2022-05-25 05:15:16,107 - twisted - 279 - ERROR - sentinel -     target.close()
2022-05-25 05:15:16,108 - twisted - 279 - ERROR - sentinel -   File "/.../site-packages/ijson/backends/python.py", line 116, in Lexer
2022-05-25 05:15:16,110 - twisted - 279 - ERROR - sentinel -     target.send(EOF)
2022-05-25 05:15:16,111 - twisted - 279 - ERROR - sentinel -   File "/.../site-packages/ijson/backends/python.py", line 161, in parse_value
2022-05-25 05:15:16,112 - twisted - 279 - ERROR - sentinel -     raise common.IncompleteJSONError('Incomplete JSON content')
2022-05-25 05:15:16,113 - twisted - 279 - ERROR - sentinel - ijson.common.IncompleteJSONError: Incomplete JSON content

I think this happens when Python garbage-collects (frees) ijson coroutines and throws a GeneratorExit into them.
So let's call close() ourselves, so that the exception is raised upfront.

Note that the ijson examples all include an explicit close() call:
https://pypi.org/project/ijson/#push-interfaces
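
The garbage-collection behaviour described above can be reproduced without ijson. The following is a minimal sketch, assuming CPython 3.8 or later (for `sys.unraisablehook`); `toy_parser`, `start` and `IncompleteInput` are hypothetical stand-ins, not ijson APIs. It shows that an exception raised while the interpreter finalizes a generator is only reported as "Exception ignored in: ...", while the same exception propagates to the caller when `close()` is called explicitly.

```python
import gc
import sys


class IncompleteInput(ValueError):
    """Hypothetical stand-in for ijson's IncompleteJSONError."""


def toy_parser():
    """Toy push parser: complains on close if the input did not end with '}'."""
    received = ""
    try:
        while True:
            chunk = yield
            received += chunk
    except GeneratorExit:
        pass  # fall through to the end-of-input check below
    if not received.endswith("}"):
        raise IncompleteInput("Incomplete content")


def start(data):
    coro = toy_parser()
    next(coro)       # prime the generator so it is suspended at `yield`
    coro.send(data)  # push one chunk of (truncated) input
    return coro


# Case 1: drop the last reference and let the interpreter close the generator.
# The exception cannot propagate anywhere, so it goes to sys.unraisablehook,
# which by default prints "Exception ignored in: <generator object ...>".
ignored = []
previous_hook = sys.unraisablehook
sys.unraisablehook = lambda info: ignored.append(info.exc_type)
try:
    coro = start('{"a": 1')
    del coro
    gc.collect()  # not needed on CPython, harmless elsewhere
finally:
    sys.unraisablehook = previous_hook

# Case 2: close the generator ourselves. The exception is raised at the call
# site, with the caller's stack (and, in Synapse's case, logging context).
coro = start('{"a": 1')
try:
    coro.close()
    caught = None
except IncompleteInput as exc:
    caught = exc
```

In the first case the caller never sees the error; in the second it can be handled like any other parse failure.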

@squahtx squahtx requested a review from a team as a code owner May 25, 2022 13:30
@squahtx squahtx force-pushed the squah/close_ijson_coroutines branch from 52ef0d6 to bb24948 Compare May 25, 2022 13:34
clokep previously approved these changes May 25, 2022
@clokep clokep left a comment

Seems like an improvement, but I think these won't be closed in situations where we don't call parser.finish(), e.g. a timeout, response, etc.

See

    value = parser.finish()
except BodyExceededMaxSize as e:
    # The response was too big.
    logger.warning(
        "{%s} [%s] JSON response exceeded max size %i - %s %s",
        request.txn_id,
        request.destination,
        MAX_RESPONSE_SIZE,
        request.method,
        request.uri.decode("ascii"),
    )
    raise RequestSendFailed(e, can_retry=False) from e
except ValueError as e:
    # The content was invalid.
    logger.warning(
        "{%s} [%s] Failed to parse response - %s %s",
        request.txn_id,
        request.destination,
        request.method,
        request.uri.decode("ascii"),
    )
    raise RequestSendFailed(e, can_retry=False) from e
except defer.TimeoutError as e:
    logger.warning(
        "{%s} [%s] Timed out reading response - %s %s",
        request.txn_id,
        request.destination,
        request.method,
        request.uri.decode("ascii"),
    )
    raise RequestSendFailed(e, can_retry=True) from e
except ResponseFailed as e:
    logger.warning(
        "{%s} [%s] Failed to read response - %s %s",
        request.txn_id,
        request.destination,
        request.method,
        request.uri.decode("ascii"),
    )
    raise RequestSendFailed(e, can_retry=True) from e
except Exception as e:
    logger.warning(
        "{%s} [%s] Error reading response %s %s: %s",
        request.txn_id,
        request.destination,
        request.method,
        request.uri.decode("ascii"),
        e,
    )
    raise
; not sure if we can refactor that or not?

I'm going to approve since I think this is an improvement.

squahtx commented May 25, 2022

Seems like an improvement, but I think these won't be closed in situations where we don't call parser.finish(), e.g. a timeout, response, etc.

Yuck, that's a good point. I'll have to think about this a little.

@squahtx squahtx changed the base branch from develop to release-v1.60 May 27, 2022 09:34
Sean Quah added 2 commits May 27, 2022 10:34
Hopefully this means that exceptions raised due to truncated JSON
get a sensible logging context and stack.

Signed-off-by: Sean Quah <seanq@matrix.org>
@squahtx squahtx force-pushed the squah/close_ijson_coroutines branch from e72b671 to b401839 Compare May 27, 2022 09:34
@squahtx squahtx dismissed clokep’s stale review May 27, 2022 09:36

Updated some code to always call .finish()
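
One way to make sure `.finish()` runs on every exit path, including timeouts and failed responses, is sketched below. This is not the code from this pull request: `ToyParser`, `read_body` and `failing_chunks` are hypothetical names used for illustration, and the sketch assumes that cleanup errors on the failure path are less useful than the original error.

```python
class ToyParser:
    """Hypothetical stand-in for a response parser wrapping push coroutines."""

    def __init__(self):
        self.chunks = []
        self.closed = False

    def write(self, chunk):
        self.chunks.append(chunk)

    def finish(self):
        # A real parser would close its underlying coroutines here.
        self.closed = True
        return b"".join(self.chunks)


def read_body(parser, chunks):
    """Feed chunks to the parser, calling finish() on every exit path."""
    try:
        for chunk in chunks:
            parser.write(chunk)
    except BaseException:
        # Reading failed (timeout, connection lost, ...): still clean up,
        # but let the original error win over any error from finish().
        try:
            parser.finish()
        except Exception:
            pass
        raise
    # Normal path: errors from finish() (e.g. truncated JSON) propagate.
    return parser.finish()


def failing_chunks():
    """Simulate a body that times out part-way through."""
    yield b'{"a":'
    raise TimeoutError("simulated timeout")
```

With this shape the parser is closed whether the body was read completely or the read was interrupted.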

@squahtx squahtx requested a review from a team May 27, 2022 09:36
@reivilibre reivilibre left a comment

Looks reasonable to me — thanks for adding the clean-up :)

squahtx commented May 27, 2022

The failing sytests are due to CI using develop. The failing tests have been changed recently on develop and pass on the release-v1.60 branch locally.

@squahtx squahtx enabled auto-merge (squash) May 27, 2022 10:02
@squahtx squahtx disabled auto-merge May 27, 2022 10:02
@squahtx squahtx merged commit bb7a637 into release-v1.60 May 27, 2022
@squahtx squahtx deleted the squah/close_ijson_coroutines branch May 27, 2022 10:03
squahtx pushed a commit that referenced this pull request May 27, 2022
Synapse 1.60.0rc2 (2022-05-27)
==============================

This release of Synapse adds a unique index to the `state_group_edges` table, in
order to prevent accidentally introducing duplicate information (for example,
because a database backup was restored multiple times). If your Synapse database
already has duplicate rows in this table, this could fail with an error and
require manual remediation.

Additionally, the signature of the `check_event_for_spam` module callback has changed.
The previous signature has been deprecated and remains working for now. Module authors
should update their modules to use the new signature where possible.

See [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1600)
for more details.

Features
--------

- Add an option allowing users to use their password to reauthenticate for privileged actions even though password login is disabled. ([\#12883](#12883))

Bugfixes
--------

- Explicitly close `ijson` coroutines once we are done with them, instead of leaving the garbage collector to close them. ([\#12875](#12875))

Internal Changes
----------------

- Improve URL previews by not including the content of media tags in the generated description. ([\#12887](#12887))
Danieloni1 added a commit to globekeeper/synapse that referenced this pull request Jun 8, 2022
Synapse 1.60.0 (2022-05-31)
===========================

This release of Synapse adds a unique index to the `state_group_edges` table, in
order to prevent accidentally introducing duplicate information (for example,
because a database backup was restored multiple times). If your Synapse database
already has duplicate rows in this table, this could fail with an error and
require manual remediation.

Additionally, the signature of the `check_event_for_spam` module callback has changed.
The previous signature has been deprecated and remains working for now. Module authors
should update their modules to use the new signature where possible.

See [the upgrade notes](https://github.com/matrix-org/synapse/blob/develop/docs/upgrade.md#upgrading-to-v1600)
for more details.

Bugfixes
--------

- Fix a bug introduced in Synapse 1.60.0rc1 that would break some imports from `synapse.module_api`. ([\matrix-org#12918](matrix-org#12918))

Synapse 1.60.0rc2 (2022-05-27)
==============================

Features
--------

- Add an option allowing users to use their password to reauthenticate for privileged actions even though password login is disabled. ([\matrix-org#12883](matrix-org#12883))

Bugfixes
--------

- Explicitly close `ijson` coroutines once we are done with them, instead of leaving the garbage collector to close them. ([\matrix-org#12875](matrix-org#12875))

Internal Changes
----------------

- Improve URL previews by not including the content of media tags in the generated description. ([\matrix-org#12887](matrix-org#12887))

Synapse 1.60.0rc1 (2022-05-24)
==============================

Features
--------

- Measure the time taken in spam-checking callbacks and expose those measurements as metrics. ([\matrix-org#12513](matrix-org#12513))
- Add a `default_power_level_content_override` config option to set default room power levels per room preset. ([\matrix-org#12618](matrix-org#12618))
- Add support for [MSC3787: Allowing knocks to restricted rooms](matrix-org/matrix-spec-proposals#3787). ([\matrix-org#12623](matrix-org#12623))
- Send `USER_IP` commands on a different Redis channel, in order to reduce traffic to workers that do not process these commands. ([\matrix-org#12672](matrix-org#12672), [\matrix-org#12809](matrix-org#12809))
- Synapse will now reload [cache config](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#caching) when it receives a [SIGHUP](https://en.wikipedia.org/wiki/SIGHUP) signal. ([\matrix-org#12673](matrix-org#12673))
- Add config options to allow for auto-tuning of caches. ([\matrix-org#12701](matrix-org#12701))
- Update [MSC2716](matrix-org/matrix-spec-proposals#2716) implementation to process marker events from the current state to avoid markers being lost in timeline gaps for federated servers which would cause the imported history to be undiscovered. ([\matrix-org#12718](matrix-org#12718))
- Add a `drop_federated_event` callback to `SpamChecker` to disregard inbound federated events before they take up much processing power, in an emergency. ([\matrix-org#12744](matrix-org#12744))
- Implement [MSC3818: Copy room type on upgrade](matrix-org/matrix-spec-proposals#3818). ([\matrix-org#12786](matrix-org#12786), [\matrix-org#12792](matrix-org#12792))
- Update to the `check_event_for_spam` module callback. Deprecate the current callback signature, replace it with a new signature that is both less ambiguous (replacing booleans with explicit allow/block) and more powerful (ability to return explicit error codes). ([\matrix-org#12808](matrix-org#12808))

Bugfixes
--------

- Fix a bug introduced in Synapse 1.7.0 that would prevent events from being sent to clients if there's a retention policy in the room when the support for retention policies is disabled. ([\matrix-org#12611](matrix-org#12611))
- Fix a bug introduced in Synapse 1.57.0 where `/messages` would throw a 500 error when querying for a non-existent room. ([\matrix-org#12683](matrix-org#12683))
- Add a unique index to `state_group_edges` to prevent duplicates being accidentally introduced and the consequential impact to performance. ([\matrix-org#12687](matrix-org#12687))
- Fix a long-standing bug where an empty room would be created when a user with an insufficient power level tried to upgrade a room. ([\matrix-org#12696](matrix-org#12696))
- Fix a bug introduced in Synapse 1.30.0 where empty rooms could be automatically created if a monthly active users limit is set. ([\matrix-org#12713](matrix-org#12713))
- Fix push to dismiss notifications when read on another client. Contributed by @SpiritCroc @ Beeper. ([\matrix-org#12721](matrix-org#12721))
- Fix poor database performance when reading the cache invalidation stream for large servers with lots of workers. ([\matrix-org#12747](matrix-org#12747))
- Delete events from the `federation_inbound_events_staging` table when a room is purged through the admin API. ([\matrix-org#12770](matrix-org#12770))
- Give a meaningful error message when a client tries to create a room with an invalid alias localpart. ([\matrix-org#12779](matrix-org#12779))
- Fix a bug introduced in 1.43.0 where a file (`providers.json`) was never closed. Contributed by @arkamar. ([\matrix-org#12794](matrix-org#12794))
- Fix a long-standing bug where finished log contexts would be re-started when failing to contact remote homeservers. ([\matrix-org#12803](matrix-org#12803))
- Fix a bug, introduced in Synapse 1.21.0, that led to media thumbnails being unusable before the index has been added in the background. ([\matrix-org#12823](matrix-org#12823))

Updates to the Docker image
---------------------------

- Fix the docker file after a dependency update. ([\matrix-org#12853](matrix-org#12853))

Improved Documentation
----------------------

- Fix a typo in the Media Admin API documentation. ([\matrix-org#12715](matrix-org#12715))
- Update the OpenID Connect example for Keycloak to be compatible with newer versions of Keycloak. Contributed by @nhh. ([\matrix-org#12727](matrix-org#12727))
- Fix typo in server listener documentation. ([\matrix-org#12742](matrix-org#12742))
- Link to the configuration manual from the welcome page of the documentation. ([\matrix-org#12748](matrix-org#12748))
- Fix typo in `run_background_tasks_on` option name in configuration manual documentation. ([\matrix-org#12749](matrix-org#12749))
- Add information regarding the `rc_invites` ratelimiting option to the configuration docs. ([\matrix-org#12759](matrix-org#12759))
- Add documentation for cancellation of request processing. ([\matrix-org#12761](matrix-org#12761))
- Recommend using docker to run tests against postgres. ([\matrix-org#12765](matrix-org#12765))
- Add missing user directory endpoint from the generic worker documentation. Contributed by @olmari. ([\matrix-org#12773](matrix-org#12773))
- Add additional info to documentation of config option `cache_autotuning`. ([\matrix-org#12776](matrix-org#12776))
- Update configuration manual documentation to document size-related suffixes. ([\matrix-org#12777](matrix-org#12777))
- Fix invalid YAML syntax in the example documentation for the `url_preview_accept_language` config option. ([\matrix-org#12785](matrix-org#12785))

Deprecations and Removals
-------------------------

- Require a body in POST requests to `/rooms/{roomId}/receipt/{receiptType}/{eventId}`, as required by the [Matrix specification](https://spec.matrix.org/v1.2/client-server-api/#post_matrixclientv3roomsroomidreceiptreceipttypeeventid). This breaks compatibility with Element Android 1.2.0 and earlier: users of those clients will be unable to send read receipts. ([\matrix-org#12709](matrix-org#12709))

Internal Changes
----------------

- Improve event caching mechanism to avoid having multiple copies of an event in memory at a time. ([\matrix-org#10533](matrix-org#10533))
- Preparation for faster-room-join work: return subsets of room state which we already have, immediately. ([\matrix-org#12498](matrix-org#12498))
- Add `@cancellable` decorator, for use on endpoint methods that can be cancelled when clients disconnect. ([\matrix-org#12586](matrix-org#12586), [\matrix-org#12588](matrix-org#12588), [\matrix-org#12630](matrix-org#12630), [\matrix-org#12694](matrix-org#12694), [\matrix-org#12698](matrix-org#12698), [\matrix-org#12699](matrix-org#12699), [\matrix-org#12700](matrix-org#12700), [\matrix-org#12705](matrix-org#12705))
- Enable cancellation of `GET /rooms/$room_id/members`, `GET /rooms/$room_id/state` and `GET /rooms/$room_id/state/$event_type/*` requests. ([\matrix-org#12708](matrix-org#12708))
- Improve documentation of the `synapse.push` module. ([\matrix-org#12676](matrix-org#12676))
- Refactor functions to on `PushRuleEvaluatorForEvent`. ([\matrix-org#12677](matrix-org#12677))
- Preparation for database schema simplifications: stop writing to `event_reference_hashes`. ([\matrix-org#12679](matrix-org#12679))
- Remove code which updates unused database column `application_services_state.last_txn`. ([\matrix-org#12680](matrix-org#12680))
- Refactor `EventContext` class. ([\matrix-org#12689](matrix-org#12689))
- Remove an unneeded class in the push code. ([\matrix-org#12691](matrix-org#12691))
- Consolidate parsing of relation information from events. ([\matrix-org#12693](matrix-org#12693))
- Convert namespace class `Codes` into a string enum. ([\matrix-org#12703](matrix-org#12703))
- Optimize private read receipt filtering. ([\matrix-org#12711](matrix-org#12711))
- Drop the logging level of status messages for the URL preview cache expiry job from INFO to DEBUG. ([\matrix-org#12720](matrix-org#12720))
- Downgrade some OIDC errors to warnings in the logs, to reduce the noise of Sentry reports. ([\matrix-org#12723](matrix-org#12723))
- Update configs used by Complement to allow more invites/3PID validations during tests. ([\matrix-org#12731](matrix-org#12731))
- Fix a long-standing bug where the user directory background process would fail to make forward progress if a user included a null codepoint in their display name or avatar. ([\matrix-org#12762](matrix-org#12762))
- Tweak the mypy plugin so that `@cached` can accept `on_invalidate=None`. ([\matrix-org#12769](matrix-org#12769))
- Move methods that call `add_push_rule` to the `PushRuleStore` class. ([\matrix-org#12772](matrix-org#12772))
- Make handling of federation Authorization header (more) compliant with RFC7230. ([\matrix-org#12774](matrix-org#12774))
- Refactor `resolve_state_groups_for_events` to not pull out full state when no state resolution happens. ([\matrix-org#12775](matrix-org#12775))
- Do not keep going if there are 5 back-to-back background update failures. ([\matrix-org#12781](matrix-org#12781))
- Fix federation when using the demo scripts. ([\matrix-org#12783](matrix-org#12783))
- The `hash_password` script now fails when it is called without specifying a config file. Contributed by @jae1911. ([\matrix-org#12789](matrix-org#12789))
- Improve and fix type hints. ([\matrix-org#12567](matrix-org#12567), [\matrix-org#12477](matrix-org#12477), [\matrix-org#12717](matrix-org#12717), [\matrix-org#12753](matrix-org#12753), [\matrix-org#12695](matrix-org#12695), [\matrix-org#12734](matrix-org#12734), [\matrix-org#12716](matrix-org#12716), [\matrix-org#12726](matrix-org#12726), [\matrix-org#12790](matrix-org#12790), [\matrix-org#12833](matrix-org#12833))
- Update EventContext `get_current_event_ids` and `get_prev_event_ids` to accept state filters and update calls where possible. ([\matrix-org#12791](matrix-org#12791))
- Remove Caddy from the Synapse workers image used in Complement. ([\matrix-org#12818](matrix-org#12818))
- Add Complement's shared registration secret to the Complement worker image. This fixes tests that depend on it. ([\matrix-org#12819](matrix-org#12819))
- Support registering Application Services when running with workers under Complement. ([\matrix-org#12826](matrix-org#12826))
- Disable 'faster room join' Complement tests when testing against Synapse with workers. ([\matrix-org#12842](matrix-org#12842))
Comment on lines +1414 to +1415
    for c in self._coros:
        c.close()
squahtx (Contributor Author) commented:

This'll only close the first ijson coroutine if it raises an IncompleteJSONError.
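
The limitation noted here comes from the plain loop stopping at the first `close()` that raises. A minimal sketch of one possible alternative follows, assuming we want every coroutine closed and the first error re-raised afterwards; `close_all` and `toy_coro` are hypothetical helpers, not part of this pull request or of ijson.

```python
def close_all(coros):
    """Close every coroutine, then re-raise the first error seen, if any."""
    first_error = None
    for c in coros:
        try:
            c.close()
        except Exception as e:
            # Remember the first failure but keep closing the rest.
            if first_error is None:
                first_error = e
    if first_error is not None:
        raise first_error


def toy_coro(name, fail, log):
    """Toy push coroutine that records being closed and may fail on close."""
    try:
        while True:
            yield
    except GeneratorExit:
        pass
    log.append(name)
    if fail:
        raise ValueError(f"{name}: incomplete input")


closed = []
coros = [toy_coro("first", True, closed), toy_coro("second", False, closed)]
for c in coros:
    next(c)  # prime each generator so close() has something to interrupt

try:
    close_all(coros)
    error = None
except ValueError as e:
    error = e
```

Here both coroutines end up closed even though the first one raises, and the caller still sees the first error.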
