[THREESCALE-9537] Configure batcher policy storage #1452
Conversation
Force-pushed from 1ba4383 to 90e18eb
Verification steps successful.
do you know what is the highest load a single gateway can handle?
That depends on the number of workers, policies, and available infra resources. Hard to tell; it needs to be measured for a given context and traffic type, such as credential length.
Question: Is the shared dict shared across all the 3scale products? So if I have 20m, is it shared between me and other 3scale users?
Maybe I would add some documentation with your tests, saying what you can get with the default values of the policy and the new env var for several key sizes. Then the same for half/double of the default value of `batch_report_seconds`, and the same for half/double of the default value of the new env var.
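For context, the `batch_report_seconds` knob discussed here is set per-service in the policy chain. A sketch of such an entry is shown below (the field names follow the usual APIcast policy-chain schema; treat the exact shape as an assumption):

```json
{
  "name": "3scale_batcher",
  "version": "builtin",
  "configuration": {
    "batch_report_seconds": 10
  }
}
```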
Regarding:

```nginx
lua_shared_dict test 1m;

location = /t {
    content_by_lua_block {
        local rep = string.rep
        local dt = ngx.shared.test
        local val = rep("v", 15)
        local key = rep("k", 32)
        local i = 0
        while i < 200000 do
            local ok, err = dict:safe_add("service_id:_" .. i .. ",user_key:" .. key .. ",metric:hits", i)
            if not ok then
                break
            end
            i = i + 1
        end
        ngx.say(n, " key/value pairs inserted")
    }
}
```
The `dt` local variable is not being used. The `n` variable is unknown to me as well.
Also fix the test steps.
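For reference, a sketch of the snippet with those two issues addressed (renaming `dt` to `dict` so the local is actually used, and printing the counter `i` instead of the undefined `n`):

```nginx
lua_shared_dict test 1m;

location = /t {
    content_by_lua_block {
        local rep = string.rep
        -- use the local we declare, instead of the unused `dt`
        local dict = ngx.shared.test
        local val = rep("v", 15)
        local key = rep("k", 32)
        local i = 0
        while i < 200000 do
            -- safe_add fails with "no memory" instead of evicting LRU entries
            local ok, err = dict:safe_add(
                "service_id:_" .. i .. ",user_key:" .. key .. ",metric:hits", val)
            if not ok then
                break
            end
            i = i + 1
        end
        -- print the loop counter, not the undefined `n`
        ngx.say(i, " key/value pairs inserted")
    }
}
```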
Yes, the shared dict is shared between workers and all 3scale products, and perhaps between users also.
Where do you think that doc would live? Inside the top-level doc or inside the policy?
I would say in the specific README for the batcher policy: https://github.com/3scale/APIcast/blob/master/gateway/src/apicast/policy/3scale_batcher/README.md
CHANGELOG.md
Outdated

```diff
@@ -55,6 +55,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/).

 - Added `APICAST_CLIENT_REQUEST_HEADER_BUFFERS` variable to allow configure of the NGINX `client_request_header_buffers` directive: [PR #1446](https://github.com/3scale/APIcast/pull/1446), [THREESCALE-10164](https://issues.redhat.com/browse/THREESCALE-10164)

+- Added `APICAST_POLICY_BATCHER_SHARED_MEMORY_SIZE` variable to allow configure batcher policy share memory size. [PR #1452](https://github.com/3scale/APIcast/pull/1452), [THREESCALE-9537](https://issues.redhat.com/browse/THREESCALE-9537)
```
Suggested change:

Added the `APICAST_POLICY_BATCHER_SHARED_MEMORY_SIZE` variable to allow configuration of the batcher policy shared memory size. [PR #1452](https://github.com/3scale/APIcast/pull/1452), [THREESCALE-9537](https://issues.redhat.com/browse/THREESCALE-9537)
updated
gateway/http.d/shdict.conf
Outdated

```nginx
lua_shared_dict limiter 1m;

# This shared dictionaries are only used in the 3scale batcher policy.
# This is not ideal, but they'll need to be here until we allow policies to
```
Suggested change:

```nginx
# These requirements will remain in place until we allow policy changes.
```
updated
| | 512 | 15 | 40712 |

In practice, the actual number will depend on the size of the key/value pair, the underlying OS architecture, memory segment sizes, etc. More details [here](https://blog.openresty.com/en/nginx-shm-frag/)
Suggested change:

underlying operating system (OS) architecture and memory segment sizes. More details [here](https://blog.openresty.com/en/nginx-shm-frag/)
updated
I've added some suggestions.
If `cache_handler` exists, the policy will need to call `update_downtime_cache` regardless of the backend status code. Thus, instead of passing `cache_handler` to other functions, it is much simpler to update the cache inside `access()`.
In some cases, the batcher policy will run out of storage space (`batched_reports`) and cause metrics to not be reported. This commit adds a new environment variable to configure the batcher policy shared memory storage size.
Force-pushed from f1e0a96 to c08731c
Thanks @dfennessy. I will need your approval also.
LGTM
What
Fixes: https://issues.redhat.com/browse/THREESCALE-9537
Dev notes
1. What shared dict value should we increase?

The 3scale batcher policy uses a few different shared dict caches, and `api_keys` if the Caching policy is included in the chain.

First, let's run a test to see how much `1m` of shared cache can hold. The reason to use `safe_set()` here is to prevent the automatic eviction of least recently used items upon memory shortage that happens with `set()`.

Querying `/t` gives the response body on my Linux x86_64 system. NOTE: you may get a different value, as this actually depends on the underlying architecture.
So a `1m` store can hold 4033 key/value pairs with 57-byte keys and 15-byte values. In reality, the actual available space will depend on memory fragmentation, but since these key/value pairs are consistent in size, we should have no problem. More details here.

Changing the "dict" store to `10m` gives: a `10m` store can hold 40673 pairs. It's linear growth, as expected.

So we can see that all dicts will grow equally, but due to the use of `safe_add` and batching reports for 10 seconds, only `batched_reports` will return a `no memory` error. A possible workaround is to set `batch_report_seconds` to a lower value.

2. Do I need to increase the batcher policy storage?

Let's do a small test and increase the key size: with a key that is ~400 bytes and the default report time of 10s, for `batched_reports` to be fully filled it would require `20400/10 = 2040 req/sec`. It's very unlikely that a single gateway will be hit with this much traffic.

@eguzki do you know what is the highest load a single gateway can handle?
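A quick sanity check of the arithmetic above (all figures are the measurements quoted in this description):

```python
# Measured capacities quoted in the PR description (assumed values).
pairs_1m = 4033        # 57-byte keys, 15-byte values in a 1m dict
pairs_10m = 40673      # same key/value shape in a 10m dict

# Capacity grows roughly linearly with the dict size.
assert abs(pairs_10m / pairs_1m - 10) < 0.2

# With ~400-byte keys the measured capacity is 20400 entries; filling it
# within the default 10-second report window requires this request rate:
capacity = 20400
batch_report_seconds = 10
required_rate = capacity / batch_report_seconds
print(required_rate)  # → 2040.0 req/sec
```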
Verification Steps

Filling the storage is a bit tricky, so I just check to see if the configuration file is filled with the correct value:

- `APICAST_POLICY_BATCHER_SHARED_MEMORY_SIZE` set to `40m`
- `lua_shared_dict batched_reports` is set to `40m`
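In other words, with the variable set to `40m`, the rendered shdict config would be expected to contain a line like this (a sketch, assuming the template only substitutes the size into the directive):

```nginx
# Expected when APICAST_POLICY_BATCHER_SHARED_MEMORY_SIZE=40m
lua_shared_dict batched_reports 40m;
```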