Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(dns): Eliminate asynchronous timer in syncQuery() to prevent hang risk #11900

Merged
merged 26 commits into from
Nov 15, 2023

Conversation

chobits
Copy link
Contributor

@chobits chobits commented Nov 1, 2023

Summary

Originally the first request to syncQuery() will trigger an asynchronous timer event, which added the risk of the thread pool hanging in the timerng library. With this patch, it will perform a synchronously DNS query for the first time if current phase is yieldable.

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Full changelog

  • syncQuery() will perform a synchronously DNS query for the first time if current phase is yieldable.
    • Keep the original async query using event timer for some unyieldable phase like init_worker_by_lua.
  • simplify the logic of yieldable checking by using kong.tools.yield.in_yieldable_phase()
  • simplify executeQuery() by using individualQuery()
  • revert fix(dns): set default dns_no_sync to on #11869

Issue reference

Fix KAG-2913 FTI-5348

@chobits chobits requested review from oowl, vm-001 and outsinre November 1, 2023 08:20
@chobits chobits marked this pull request as draft November 1, 2023 08:21
@pull-request-size pull-request-size bot added size/L and removed size/M labels Nov 1, 2023
@chobits chobits marked this pull request as ready for review November 1, 2023 08:49
@chobits chobits requested a review from dndx November 1, 2023 08:53
@chobits chobits force-pushed the fix_dns_blocking branch 6 times, most recently from 902787f to 601ec2f Compare November 1, 2023 16:01
@VicYP VicYP requested a review from Tieske November 2, 2023 02:52
@chobits chobits changed the title fix(dns): eliminate asynchronous timer in syncQuery() to prevent hang risk fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk Nov 2, 2023
kong/resty/dns/client.lua Outdated Show resolved Hide resolved
kong/resty/dns/client.lua Outdated Show resolved Hide resolved
kong/resty/dns/client.lua Outdated Show resolved Hide resolved
chobits added a commit that referenced this pull request Nov 16, 2023
…adlock risk (#11900)

* Revert "fix(conf): set default value of `dns_no_sync` to `on` (#11869)"

This reverts commit 3be2513.

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
chobits added a commit that referenced this pull request Nov 16, 2023
…adlock risk (#11900)

* Revert "fix(conf): set default value of `dns_no_sync` to `on` (#11869)"

This reverts commit 3be2513.

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
chobits added a commit that referenced this pull request Nov 16, 2023
…adlock risk (#11900)

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
bungle pushed a commit that referenced this pull request Nov 16, 2023
…adlock risk (#11900)

* Revert "fix(conf): set default value of `dns_no_sync` to `on` (#11869)"

This reverts commit 3be2513.

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
bungle pushed a commit that referenced this pull request Nov 16, 2023
…adlock risk (#11900)

* Revert "fix(conf): set default value of `dns_no_sync` to `on` (#11869)"

This reverts commit 3be2513.

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
dndx added a commit that referenced this pull request Nov 24, 2023
…adlock risk (#11900)

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
@team-gateway-bot
Copy link
Collaborator

The backport to release/3.2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release/3.2.x release/3.2.x
# Navigate to the new working tree
cd .worktrees/backport-release/3.2.x
# Create a new branch
git switch --create backport-11900-to-release/3.2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a6d647566991e339ea5126113df4bef21fe0115d
# Push it to GitHub
git push --set-upstream origin backport-11900-to-release/3.2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release/3.2.x

Then, create a pull request where the base branch is release/3.2.x and the compare/head branch is backport-11900-to-release/3.2.x.

@team-gateway-bot
Copy link
Collaborator

The backport to release/3.3.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-release/3.3.x release/3.3.x
# Navigate to the new working tree
cd .worktrees/backport-release/3.3.x
# Create a new branch
git switch --create backport-11900-to-release/3.3.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 a6d647566991e339ea5126113df4bef21fe0115d
# Push it to GitHub
git push --set-upstream origin backport-11900-to-release/3.3.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-release/3.3.x

Then, create a pull request where the base branch is release/3.3.x and the compare/head branch is backport-11900-to-release/3.3.x.

chobits added a commit that referenced this pull request Nov 29, 2023
…adlock risk (#11900)

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
dndx added a commit that referenced this pull request Nov 29, 2023
…adlock risk (#11900)

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
chobits added a commit that referenced this pull request Nov 30, 2023
…adlock risk (#11900)

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
dndx added a commit that referenced this pull request Nov 30, 2023
…adlock risk (#11900)

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
chobits added a commit that referenced this pull request Nov 30, 2023
…adlock risk (#11900)

* fix(dns): introduce the synchronous query in syncQuery() to prevent hang risk

Originally the first request to `syncQuery()` will trigger an asynchronous timer
event, which added the risk of thread pool hanging.

With this patch, cold synchronously DNS query will always happen in the current
thread if current phase supports yielding.

Fix FTI-5348

---------

Co-authored-by: Datong Sun <datong.sun@konghq.com>
AndyZhang0707 added a commit that referenced this pull request Jul 18, 2024
AndyZhang0707 added a commit that referenced this pull request Jul 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.