Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix possible pipeline connections leak #3104

Merged
merged 5 commits into from
Jan 11, 2024
Merged

Conversation

ING-XIAOJIAN
Copy link
Contributor

@ING-XIAOJIAN ING-XIAOJIAN commented Jan 9, 2024

Pull Request check-list

Please make sure to review and check all of these items:

  • Do tests and lints pass with this change?
  • Do the CI tests pass with this change (enable it first in your forked repo and wait for the github action build to finish)?
  • Is the new or changed code fully tested?
  • Is a documentation update included (if this change modifies existing APIs, or introduces new ones)?
  • Is there an example added to the examples folder (if applicable)?
  • Was the change added to CHANGES file?

NOTE: these things are not required to open a PR and can be done
afterwards / while the PR is open.

Description of change

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released, finally lead to connection leak.

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released.
@ING-XIAOJIAN ING-XIAOJIAN changed the title Update cluster.py pipeline connections leak Jan 9, 2024
@codecov-commenter
Copy link

codecov-commenter commented Jan 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (18bb016) 92.05% compared to head (adc8893) 92.00%.
Report is 2 commits behind head on master.

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #3104      +/-   ##
==========================================
- Coverage   92.05%   92.00%   -0.06%     
==========================================
  Files         128      128              
  Lines       33195    33197       +2     
==========================================
- Hits        30557    30542      -15     
- Misses       2638     2655      +17     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@dvora-h
Copy link
Collaborator

dvora-h commented Jan 9, 2024

@ING-XIAOJIAN thanks for suggestion, looks good, but WDYT about:

    try:
        node_commands = nodes.values()
        for n in node_commands:
            n.write()

        for n in node_commands:
            n.read()
    finally:
        for n in nodes.values():
            n.connection_pool.release(n.connection)

@ING-XIAOJIAN
Copy link
Contributor Author

@ING-XIAOJIAN thanks for suggestion, looks good, but WDYT about:

    try:
        node_commands = nodes.values()
        for n in node_commands:
            n.write()

        for n in node_commands:
            n.read()
    finally:
        for n in nodes.values():
            n.connection_pool.release(n.connection)

I naively think "n.read" never generates some unknown errors,
I will take your suggestion.

release connection move to "try...finally"
@dvora-h
Copy link
Collaborator

dvora-h commented Jan 10, 2024

I agree that there is less chance to get an error in the reading, but I prefer to cover that as well just to be safe

@dvora-h
Copy link
Collaborator

dvora-h commented Jan 10, 2024

@ING-XIAOJIAN Can you fix the linters or wants me to do it?

@dvora-h dvora-h changed the title pipeline connections leak Fix possible pipeline connections leak Jan 10, 2024
@dvora-h dvora-h added the bug Bug label Jan 10, 2024
 fix the linters
@ING-XIAOJIAN
Copy link
Contributor Author

@ING-XIAOJIAN Can you fix the linters or wants me to do it?

I already fixed those linters

@dvora-h
Copy link
Collaborator

dvora-h commented Jan 10, 2024

Thanks!

redis/cluster.py Outdated
@@ -2190,8 +2195,8 @@ def _send_cluster_commands(
# buffered result back from a previous command and
# every single request after to that connection will always get
# a mismatched result.
for n in nodes.values():
n.connection_pool.release(n.connection)
# for n in nodes.values():
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you delete this lines from the comment and move the explanations of this lines before them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already fixed this. could you review it again?

@dvora-h dvora-h merged commit 254c8c0 into redis:master Jan 11, 2024
47 checks passed
dvora-h pushed a commit that referenced this pull request Feb 25, 2024
* Update cluster.py

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released.

* Update cluster.py

* Update cluster.py

release connection move to "try...finally"

* Update cluster.py

 fix the linters

* fix problems of code review
dvora-h added a commit that referenced this pull request Feb 26, 2024
* Bump actions/checkout from 3 to 4 (#2969)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump rojopolis/spellcheck-github-actions from 0.33.1 to 0.34.0 (#2970)

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* fix type hint (#2963)

Co-authored-by: d184230 <dubkov.a@skbkontur.ru>

* Don't perform blocking connect inside the BlockingConnectionQueue Condition variable. (#2997)

* Creating CODEOWNERS for the examples (#2993)

* Close various objects created during asyncio tests (#3005)

* Close various objects created during asyncio tests

* Fix resource leake in test_cwe_404.py
Need to wait for individual handler tasks when shutting down server.

* Linking to Redis resources (#3006)

* Add GEOSHAPE field type for index creation of RediSearch (#2957)

* first pass of geoshape index type

* first attempt at test, but demonstrates the initial commit is broken

* fix new field + fix tests

* work on linter

* more linter

* try to mark test with correct fixture

* fix linter

* Better deal with "lost" connections for async Redis (#2999)

* Allow tracking/reporting and closing of "lost" connections.
ConnectionPool keeps a WeakSet of in_use connections, allowing lost ones to be collected.
Collection produces a warning and closes the underlying transport.

* Add tests for the __del__ handlers of async Redis and Connection objects

* capture expected warnings in the test

* lint

* Update client.py sleep_time typing for run_in_thread function (#2977)

Changed from 
`sleep_time: int = 0`
to
`sleep_time: float = 0.0`
To avoid Pylance complaining: 
`Argument of type "float" cannot be assigned to parameter "sleep_time" of type "int" in function "run_in_thread"
  "float" is incompatible with "int"`

* Fix BlockingConnectionPool.from_url parsing of timeout in query args #2983 (#2984)

Co-authored-by: Romain Fliedel <romain@oqee.tv>

* Fix parsing resp3 dicts (#2982)

* Update ocsp.py (#3022)

* Bump rojopolis/spellcheck-github-actions from 0.34.0 to 0.35.0 (#3060)

Bumps [rojopolis/spellcheck-github-actions](https://github.com/rojopolis/spellcheck-github-actions) from 0.34.0 to 0.35.0.
- [Release notes](https://github.com/rojopolis/spellcheck-github-actions/releases)
- [Changelog](https://github.com/rojopolis/spellcheck-github-actions/blob/master/CHANGELOG.md)
- [Commits](rojopolis/spellcheck-github-actions@0.34.0...0.35.0)

---
updated-dependencies:
- dependency-name: rojopolis/spellcheck-github-actions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Use `disable_decoding` in async `read_response`. (#3042)

* Add "sum" to DUPLICATE_POLICY documentation of TS.CREATE, TS.ADD and TS.ALTER (#3027)

* Update advanced_features.rst (#3019)

* Fix typos. (#3016)

* Fix parsing of `FT.PROFILE` result (#3063)

* Fix parsing of ft.profile result

* test

* Make the connection callback methods public again, add documentation (#2980)

* Fix reported version of deprecations in asyncio.client (#2968)

* Allow the parsing of the asking command to forward original options (#3012)

Co-authored-by: dvora-h <67596500+dvora-h@users.noreply.github.com>

* Fix Specifying Target Nodes broken hyperlink (#3072)

The typo cause hyperlinks to fail.

* Fix return types in json commands (#3071)

* Fix return types in JSONCommands class

* Update CHANGES

* fix acl_genpass with bits (#3062)

* Bump github/codeql-action from 2 to 3 (#3096)

Bumps [github/codeql-action](https://github.com/github/codeql-action) from 2 to 3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@v2...v3)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump actions/upload-artifact from 3 to 4 (#3097)

Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3 to 4.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@v3...v4)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump actions/setup-python from 4 to 5 (#3095)

Bumps [actions/setup-python](https://github.com/actions/setup-python) from 4 to 5.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](actions/setup-python@v4...v5)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Always sending codecov (#3101)

* filter commits for main branch (#3036)

* fix(docs): organize cluster mode part of lua scripting (#3073)

* Fix typing for `HashCommand.hdel` (#3029)

Co-authored-by: dvora-h <67596500+dvora-h@users.noreply.github.com>

* Adding lock_name to LockError (#3023)

* Adding lock_name to LockError

* Adding lock_name to LockError

---------

Co-authored-by: dvora-h <67596500+dvora-h@users.noreply.github.com>

* Fix objlen type hint (#2966)

* Fix objlen type hint

* Update redis/commands/json/commands.py

Co-authored-by: dvora-h <67596500+dvora-h@users.noreply.github.com>

* linters

---------

Co-authored-by: dvora-h <67596500+dvora-h@users.noreply.github.com>

* Fix type hint of arbitrary argument lists (#2908)

* Fix: hset unexpectedly mutates the list passed to items (#3103)

* Fix possible pipeline connections leak (#3104)

* Update cluster.py

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released.

* Update cluster.py

* Update cluster.py

release connection move to "try...finally"

* Update cluster.py

 fix the linters

* fix problems of code review

* Add modules support to async RedisCluster (#3115)

* Fix grammer in BlockingConnectionPool class documentation (#3120)

Co-authored-by: ahmedabdou14 <root@xps>

* release already acquired connections on ClusterPipeline, when get_connection raises an exception (#3133)

Signed-off-by: zach.lee <zach.lee@sendbird.com>

* Bump actions/stale from 3 to 9 (#3132)

Bumps [actions/stale](https://github.com/actions/stale) from 3 to 9.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v3...v9)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump codecov/codecov-action from 3 to 4 (#3131)

Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 3 to 4.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/main/CHANGELOG.md)
- [Commits](codecov/codecov-action@v3...v4)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Allow to control the minimum SSL version (#3127)

* Allow to control the minimum SSL version

It's useful for applications that has strict security requirements.

* Add tests for minimum SSL version

The commit updates test_tcp_ssl_connect for both sync and async
connections. Now it sets the minimum SSL version. The test is ran with
both TLSv1.2 and TLSv1.3 (if supported).

A new test case is test_tcp_ssl_version_mismatch. The test added for
both sync and async connections. It uses TLS 1.3 on the client side,
and TLS 1.2 on the server side. It expects a connection error. The
test is skipped if TLS 1.3 is not supported.

* Add example of using a minimum TLS version

* docs: Add timeout parameter for get_message example (#3129)

The `get_message()` method in asyncio PubSub has a `timeout` argument that defaults to 0.0, causing it to immediately return. This can cause high CPU usage with the given code example and should not be recommended. By setting `timeout=None`, it works with much more efficient resource usage.

* Revert stale isuue version update (#3142)

* Update connection.py (#3149)

Exception ignored in: <function Redis.__del__ at ...>
Traceback ....
TypeError: 'NoneType' object cannot be interpreted as an integer.

This happens when closing the connection within a spawned Process (multiprocess).

* Fix retry logic for pubsub and pipeline (#3134)

* Fix retry logic for pubsub and pipeline

Extend the fix from bea7299 to apply to
pipeline and pubsub as well.

Fixes #2973

* fix isort

---------

Co-authored-by: dvora-h <67596500+dvora-h@users.noreply.github.com>

* Update install_requires (#3109)

---------

Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: zach.lee <zach.lee@sendbird.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Artem Diubkov <d184230@gmail.com>
Co-authored-by: d184230 <dubkov.a@skbkontur.ru>
Co-authored-by: Kristján Valur Jónsson <sweskman@gmail.com>
Co-authored-by: Chayim <chayim@users.noreply.github.com>
Co-authored-by: Shaya Potter <spotter@gmail.com>
Co-authored-by: Bosheng (Daniel) Zhang <740807262@qq.com>
Co-authored-by: Romain <r0ro@users.noreply.github.com>
Co-authored-by: Romain Fliedel <romain@oqee.tv>
Co-authored-by: Aniket Patil <128228805+AniketP04@users.noreply.github.com>
Co-authored-by: Stanisław Denkowski <mr.denkov@gmail.com>
Co-authored-by: Pedram Parsian <pedram.parsian@gmail.com>
Co-authored-by: BackflipPenguin <63213817+BackflipPenguin@users.noreply.github.com>
Co-authored-by: AYMEN Mohammed <53928879+AYMENJD@users.noreply.github.com>
Co-authored-by: Zachary Ware <zachary.ware@gmail.com>
Co-authored-by: Tyler Bream (Event pipeline) <97038416+tbbream@users.noreply.github.com>
Co-authored-by: Binbin <binloveplay1314@qq.com>
Co-authored-by: Pármenas Haniel <42049025+parmenashp@users.noreply.github.com>
Co-authored-by: Wei-Hsiang (Matt) Wang <mattwang44@gmail.com>
Co-authored-by: Dmitry Kulazhenko <dmkulazhenko@gmail.com>
Co-authored-by: Dan Lousqui <github@blusky.fr>
Co-authored-by: trkwyk <50520795+trkwyk@users.noreply.github.com>
Co-authored-by: SwordHeart <37992593+zxjlm@users.noreply.github.com>
Co-authored-by: Jason <31104990+ING-XIAOJIAN@users.noreply.github.com>
Co-authored-by: Ahmed Ashraf <104530599+ahmedabdou14@users.noreply.github.com>
Co-authored-by: ahmedabdou14 <root@xps>
Co-authored-by: Dongkeun Lee <3315213+zakaf@users.noreply.github.com>
Co-authored-by: poiuj <1099644+poiuj@users.noreply.github.com>
Co-authored-by: Qiangning Hong <hongqn@gmail.com>
Co-authored-by: wKollendorf <83725977+wKollendorf@users.noreply.github.com>
Co-authored-by: Will Miller <w-miller@users.noreply.github.com>
finally:
# release all of the redis connections we allocated earlier
# back into the connection pool.
# we used to do this step as part of a try/finally block,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems out-of-date — isn't it explaining why not to do exactly what this change is doing?

It should probably be update to explain why this is safe, not why it isn't. 😅

vladvildanov pushed a commit that referenced this pull request Sep 27, 2024
* Update cluster.py

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released.

* Update cluster.py

* Update cluster.py

release connection move to "try...finally"

* Update cluster.py

 fix the linters

* fix problems of code review
vladvildanov pushed a commit that referenced this pull request Sep 27, 2024
* Update cluster.py

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released.

* Update cluster.py

* Update cluster.py

release connection move to "try...finally"

* Update cluster.py

 fix the linters

* fix problems of code review
vladvildanov pushed a commit that referenced this pull request Sep 27, 2024
* Update cluster.py

When Executing "n.write()" may generate some unknown errors(e.g. DataError), which could result in the connection not being released.

* Update cluster.py

* Update cluster.py

release connection move to "try...finally"

* Update cluster.py

 fix the linters

* fix problems of code review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants