Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

THREESCALE-9003 in boot mode on worker init check configuration expiration #1399

Merged
merged 3 commits into from
Mar 31, 2023

Conversation

eguzki
Copy link
Member

@eguzki eguzki commented Mar 24, 2023

What

Fixes https://issues.redhat.com/browse/THREESCALE-9003

Note; no test provided. boot.init and boot.init_worker are hard to test and they are not tested. Manual test required.

Verification steps

Test 1 === Re-spawn nginx worker before TTL expired

  • Run development environment
make development
make dependencies
  • Start APIcast in development mode boot with a non-zero TTL (i.e. 30s) and 1 worker
APICAST_LOG_LEVEL=info APICAST_WORKERS=1 THREESCALE_PORTAL_ENDPOINT=https://ACCESS_TOKEN@3scale-admin.3scale-domain BACKEND_ENDPOINT_OVERRIDE=http://localhost:8081 APICAST_CONFIGURATION_LOADER=boot APICAST_CONFIGURATION_CACHE=30 THREESCALE_DEPLOYMENT_ENV=staging ./bin/apicast
  • Check logs for the initial configuration being loaded
2023/03/24 14:44:41 [info] 48229#48229: [lua] configuration_store.lua:132: store(): added service 1163 configuration with hosts: b.com ttl: 30
  • Before the TTL expires, kill the worker process. In other terminal:
docker exec -ti -u root apicast_build_0-development-1 /bin/bash
bash-4.4# kill $(ls -l /proc/*/exe | grep nginx | sed -n '2p' | awk -F/ '{print $3}')
  • Check APIcast logs. Configuration should not be immediately be re-loaded. It will wait the TTL time to re-load the configuration

Test 2 === Re-spawn nginx worker after TTL expired

  • Get any example product (echo API for example) working and add a HTTP Status Code Overwrite policy to change the return code to 200 (used only for debug purposes)
  • Run development environment
make development
make dependencies
  • Start APIcast in development mode boot with a non-zero TTL (i.e. 30s) and 1 worker
APICAST_LOG_LEVEL=info APICAST_WORKERS=1 THREESCALE_PORTAL_ENDPOINT=https://ACCESS_TOKEN@3scale-admin.3scale-domain BACKEND_ENDPOINT_OVERRIDE=http://localhost:8081 APICAST_CONFIGURATION_LOADER=boot APICAST_CONFIGURATION_CACHE=30 THREESCALE_DEPLOYMENT_ENV=staging ./bin/apicast

  • Check if the rule is working and returning the configured status code
curl -v -H "Host: b.com" http://${APICAST_IP}:8080/?user_key=123456
< HTTP/1.1 200 OK
< Server: openresty
....

The user_key value is irrelevant because the backend endpoint is overridden. The host must match the 3scale product host.

  • Changed the policy HTTP Status Code Overwrite of Response Code to return 201
  • Manually promote to production and staging
  • Wait for the loop response code to present the publication result 201
curl -v -H "Host: b.com" http://${APICAST_IP}:8080/?user_key=123456
< HTTP/1.1 201 OK
< Server: openresty
  • Kill the worker process. Make sure TTL secs have passed since apicast boot time. In other terminal:
docker exec -ti -u root apicast_build_0-development-1 /bin/bash
bash-4.4# kill $(ls -l /proc/*/exe | grep nginx | sed -n '2p' | awk -F/ '{print $3}')
  • After killing the worker process, before TTL expires, run request to verify the new response code 201 is returned indicating that the NGiNX process spawned a child and the child immediately loaded new configuration and did not use the one loaded at the apicast boot time.
curl -v -H "Host: b.com" http://${APICAST_IP}:8080/?user_key=123456
< HTTP/1.1 201 OK
< Server: openresty
  • APIcast logs should also show the killing of the worker process and immediate configuration load
2023/03/24 15:07:22 [notice] 49339#49339: signal 15 (SIGTERM) received from 48450, exiting
2023/03/24 15:07:22 [info] 49339#49339: epoll_wait() failed (4: Interrupted system call)
2023/03/24 15:07:22 [notice] 49339#49339: exiting
2023/03/24 15:07:22 [notice] 49339#49339: exit
2023/03/24 15:07:22 [notice] 49326#49326: signal 17 (SIGCHLD) received from 49339
2023/03/24 15:07:22 [notice] 49326#49326: worker process 49339 exited with code 0
2023/03/24 15:07:22 [notice] 49326#49326: start worker process 49875
2023/03/24 15:07:22 [notice] 49326#49326: signal 29 (SIGIO) received
2023/03/24 15:07:22 [info] 49875#49875: *72 [lua] configuration_loader.lua:215: init_worker(): boot time configuration has expired, context: init_worker_by_lua*
2023/03/24 15:07:22 [info] 49875#49875: *73 [lua] configuration_loader.lua:194: auto updating configuration, context: ngx.timer
2023/03/24 15:07:22 [info] 49875#49875: *73 [lua] configuration_store.lua:132: store(): added service 1163 configuration with hosts: b.com ttl: 30, context: ngx.timer

The log lines show that within one second, the worker process is re-spawned and configuration is loaded.

@eguzki eguzki marked this pull request as ready for review March 24, 2023 15:10
@eguzki eguzki requested a review from a team as a code owner March 24, 2023 15:10
Copy link
Contributor

@ernaniaz ernaniaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested on a quicklab and solved the issue.

@eguzki eguzki merged commit 91f39b1 into master Mar 31, 2023
@eguzki eguzki deleted the THREESCALE-9003 branch March 31, 2023 13:39
coderbydesign added a commit to RedHatInsights/APIcast that referenced this pull request Jun 7, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
coderbydesign added a commit to coderbydesign/APIcast that referenced this pull request Jun 7, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
tkan145 pushed a commit to tkan145/APIcast that referenced this pull request Jul 9, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
tkan145 pushed a commit to tkan145/APIcast that referenced this pull request Jul 10, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
tkan145 pushed a commit to tkan145/APIcast that referenced this pull request Jul 17, 2024
…sabled

Details of the issue as well as local reproducers are documented in 3scale#1467.

This change ensures that when `boot.init_worker` checks to see whether or not
the configuration needs to be reloaded, it also checks to confirm that the `boot.ttl()`
value set from `APICAST_CONFIGURATION_CACHE` is a positive number, in order to avoid
an immediate and infinite recursive loop of refreshing the configuration.

This should still preserve the initial fix [1] where this regression appears to
have been introduced, as we should never schedule reloading with the configuration settings of:

```
APICAST_CONFIGURATION_LOADER: boot
APICAST_CONFIGURATION_CACHE: -1
```

We should also never have a situation where we'd need to account for `interval = 0`
since we fail out in `boot.init` in that case [2].

Tested locally as documented in the issue, however based on the comment in the initial
fix PR, I've not added any tests in code. Happy to do so if there's now a good way
to accomplish that!

[1] 3scale#1399
[2] https://github.com/3scale/APIcast/blob/7e7eaf6f1d584c78d999c0d09f5b65203161c402/gateway/src/apicast/configuration_loader.lua#L146-L149
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants