SGW Redis client side cache #5733

Closed
philiptrovato opened this issue Jan 18, 2024 · 1 comment · Fixed by #5734

@philiptrovato

I am trying to implement Redis “client-side” caching for the index cache and the chunks cache.

I have the setup below (Cortex 1.16.0, AWS Redis 7.0.7). It runs, and the SGWs talk to Redis, but I don’t see any evidence that the “client-side” caching is working.

I am not aware of any specific “client-side” metrics or logs that would confirm it, so I validated it as follows.

I looked at network I/O (bytes in/out) on both the SGWs and the Redis nodes, and I see no difference whether the “redis.cache-size” flags are set or removed. I would expect network I/O to drop once the “redis.cache-size” flags are set.
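Besides network I/O, the Redis server itself can give a hint. Redis server-assisted client-side caching relies on `CLIENT TRACKING`, and the `INFO` command reports counters such as `tracking_clients` and `tracking_total_keys`. The sketch below parses `INFO` text and checks those counters. The sample text is made up for illustration, the function names are hypothetical, and it assumes the managed Redis service exposes these fields and that `redis.cache-size` is meant to enable this kind of tracking.

```python
# Sketch: decide from Redis INFO output whether any client uses key tracking.
# SAMPLE_INFO is illustrative text, not output captured from the cluster in this issue.

TRACKING_FIELDS = (
    "tracking_clients",         # INFO clients: clients with CLIENT TRACKING enabled
    "tracking_total_keys",      # INFO stats: keys tracked by the server
    "tracking_total_items",     # INFO stats: tracked (key, client) items
    "tracking_total_prefixes",  # INFO stats: prefixes tracked in broadcast mode
)


def parse_info(text: str) -> dict[str, str]:
    """Parse the 'key:value' lines of an INFO reply, skipping section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def tracking_summary(info_text: str) -> dict[str, int]:
    """Return the tracking counters, using 0 for any field that is absent."""
    fields = parse_info(info_text)
    return {name: int(fields.get(name, "0")) for name in TRACKING_FIELDS}


def tracking_in_use(info_text: str) -> bool:
    """True if any tracking counter is non-zero."""
    return any(value > 0 for value in tracking_summary(info_text).values())


SAMPLE_INFO = (
    "# Clients\r\n"
    "connected_clients:12\r\n"
    "tracking_clients:0\r\n"
    "# Stats\r\n"
    "tracking_total_keys:0\r\n"
    "tracking_total_items:0\r\n"
    "tracking_total_prefixes:0\r\n"
)

if __name__ == "__main__":
    print(tracking_summary(SAMPLE_INFO))
    print("tracking in use:", tracking_in_use(SAMPLE_INFO))
```

If every counter stays at zero while the SGWs are serving queries, that would be consistent with the client never enabling tracking, although it does not prove where the problem lies.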

Additionally, I see no difference in memory utilization on the SGWs with the “redis.cache-size” flags set or removed. I would expect to see a memory increase once the flags are set, because of the client-side cache.
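To put a number on the expected memory increase, the configured sizes can be converted to GiB. This is only a rough sketch: it assumes each `cache_size` is an upper bound in bytes applied once per store-gateway process, which this issue does not confirm, and a cache only grows as it fills.

```python
# Sketch: convert the byte values from this issue's configuration to GiB.
# Assumption: each cache_size is a per-process upper bound in bytes.

GIB = 1024 ** 3

configured_bytes = {
    "index_cache.redis.cache_size": 1_500_000_000,
    "chunks_cache.redis.cache_size": 5_000_000_000,
    "max_chunk_pool_bytes": 2_147_483_648,
}


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / GIB


# Upper bound on extra memory attributable to the two client-side caches.
client_side_total = (
    configured_bytes["index_cache.redis.cache_size"]
    + configured_bytes["chunks_cache.redis.cache_size"]
)

if __name__ == "__main__":
    for name, value in configured_bytes.items():
        print(f"{name}: {to_gib(value):.2f} GiB")
    print(f"client-side caches combined: {to_gib(client_side_total):.2f} GiB")
```

Under these assumptions, an SGW with warm caches could hold up to roughly 6 GiB more than one without the flags, so a flat memory graph under query load would be noticeable.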

I am also a little unclear on what the "max_chunk_pool_bytes" flag does in relation to what I am trying to achieve (I haven’t changed it yet; it is running at the default of 2GB). Is this an in-memory chunks cache, or just working space that gets cleared quickly? If it is a cache, how is a “client-side” Redis chunks cache different?

Flags on the SGWs
-log.level=info
-frontend.log-queries-longer-than=5s
-blocks-storage.bucket-store.ignore-deletion-marks-delay=1h
-store-gateway.sharding-ring.wait-stability-min-duration=0s
-store-gateway.sharding-ring.wait-stability-max-duration=0s
-blocks-storage.bucket-store.chunks-cache.backend=redis
-blocks-storage.bucket-store.chunks-cache.redis.addresses=XXXXXXXX:6379
-blocks-storage.bucket-store.chunks-cache.redis.tls-enabled=true
-blocks-storage.bucket-store.index-cache.backend=redis
-blocks-storage.bucket-store.index-cache.redis.addresses=XXXXXXXX:6379
-blocks-storage.bucket-store.index-cache.redis.tls-enabled=true
-blocks-storage.bucket-store.index-cache.redis.cache-size=1500000000
-blocks-storage.bucket-store.chunks-cache.redis.cache-size=5000000000

Config
bucket_store:
  sync_dir: /opt/monitoring/cortex/tsdb-sync
  sync_interval: 5m0s
  max_concurrent: 100
  max_inflight_requests: 0
  tenant_sync_concurrency: 10
  block_sync_concurrency: 20
  meta_sync_concurrency: 20
  consistency_delay: 0s
  index_cache:
    backend: redis
    inmemory:
      max_size_bytes: 1073741824
      enabled_items: []
    memcached:
      addresses: ""
      timeout: 100ms
      max_idle_connections: 16
      max_async_concurrency: 50
      max_async_buffer_size: 10000
      max_get_multi_concurrency: 100
      max_get_multi_batch_size: 0
      max_item_size: 1048576
      auto_discovery: false
      enabled_items: []
    redis:
      addresses: XXXXXXXX:6379
      username: ""
      password: XXXXXXXX
      db: 0
      master_name: ""
      max_get_multi_concurrency: 100
      get_multi_batch_size: 100
      max_set_multi_concurrency: 100
      set_multi_batch_size: 100
      max_async_concurrency: 50
      max_async_buffer_size: 10000
      dial_timeout: 5s
      read_timeout: 3s
      write_timeout: 3s
      tls_enabled: true
      tls_cert_path: ""
      tls_key_path: ""
      tls_ca_path: ""
      tls_server_name: ""
      tls_insecure_skip_verify: false
      cache_size: 1500000000
      enabled_items: []
  chunks_cache:
    backend: redis
    memcached:
      addresses: ""
      timeout: 100ms
      max_idle_connections: 16
      max_async_concurrency: 50
      max_async_buffer_size: 10000
      max_get_multi_concurrency: 100
      max_get_multi_batch_size: 0
      max_item_size: 1048576
      auto_discovery: false
    redis:
      addresses: XXXXXXXX:6379
      username: ""
      password: XXXXXXXX
      db: 0
      master_name: ""
      max_get_multi_concurrency: 100
      get_multi_batch_size: 100
      max_set_multi_concurrency: 100
      set_multi_batch_size: 100
      max_async_concurrency: 50
      max_async_buffer_size: 10000
      dial_timeout: 5s
      read_timeout: 3s
      write_timeout: 3s
      tls_enabled: true
      tls_cert_path: ""
      tls_key_path: ""
      tls_ca_path: ""
      tls_server_name: ""
      tls_insecure_skip_verify: false
      cache_size: 5000000000
    subrange_size: 16000
    max_get_range_requests: 3
    attributes_ttl: 168h0m0s
    subrange_ttl: 24h0m0s
  metadata_cache:
    backend: ""
    memcached:
      addresses: ""
      timeout: 100ms
      max_idle_connections: 16
      max_async_concurrency: 50
      max_async_buffer_size: 10000
      max_get_multi_concurrency: 100
      max_get_multi_batch_size: 0
      max_item_size: 1048576
      auto_discovery: false
    redis:
      addresses: ""
      username: ""
      password: ""
      db: 0
      master_name: ""
      max_get_multi_concurrency: 100
      get_multi_batch_size: 100
      max_set_multi_concurrency: 100
      set_multi_batch_size: 100
      max_async_concurrency: 50
      max_async_buffer_size: 10000
      dial_timeout: 5s
      read_timeout: 3s
      write_timeout: 3s
      tls_enabled: false
      tls_cert_path: ""
      tls_key_path: ""
      tls_ca_path: ""
      tls_server_name: ""
      tls_insecure_skip_verify: false
      cache_size: 0
    tenants_list_ttl: 15m0s
    tenant_blocks_list_ttl: 5m0s
    chunks_list_ttl: 24h0m0s
    metafile_exists_ttl: 2h0m0s
    metafile_doesnt_exist_ttl: 5m0s
    metafile_content_ttl: 24h0m0s
    metafile_max_size_bytes: 1048576
    metafile_attributes_ttl: 168h0m0s
    block_index_attributes_ttl: 168h0m0s
    bucket_index_content_ttl: 5m0s
    bucket_index_max_size_bytes: 1048576
  ignore_deletion_mark_delay: 1h0m0s
  ignore_blocks_within: 0s
  bucket_index:
    enabled: true
    update_on_error_interval: 1m0s
    idle_timeout: 1h0m0s
    max_stale_period: 1h0m0s
  max_chunk_pool_bytes: 2147483648
  chunk_pool_min_bucket_size_bytes: 16000
  chunk_pool_max_bucket_size_bytes: 50000000
  index_header_lazy_loading_enabled: false
  index_header_lazy_loading_idle_timeout: 20m0s
  lazy_expanded_postings_enabled: false
  partitioner_max_gap_bytes: 524288
  estimated_max_series_size_bytes: 65536
  estimated_max_chunk_size_bytes: 16000
  postings_offsets_in_mem_sampling: 32
  series_batch_size: 10000
tsdb:
  dir: /opt/monitoring/cortex/tsdb
  block_ranges_period:
    - 2h0m0s
  retention_period: 6h0m0s
  ship_interval: 1m0s
  ship_concurrency: 10
  head_compaction_interval: 1m0s
  head_compaction_concurrency: 5
  head_compaction_idle_timeout: 1h0m0s
  head_chunks_write_buffer_size_bytes: 4194304
  stripe_size: 16384
  wal_compression_enabled: false
  wal_segment_size_bytes: 134217728
  flush_blocks_on_shutdown: false
  close_idle_tsdb_timeout: 0s
  head_chunks_write_queue_size: 0
  max_tsdb_opening_concurrency_on_startup: 10
  max_exemplars: 0
  memory_snapshot_on_shutdown: false
  out_of_order_cap_max: 32
.......
.......
store_gateway:
  sharding_enabled: true
  sharding_ring:
    kvstore:
      store: consul
      prefix: collectors/
      dynamodb:
        region: ""
        table_name: ""
        ttl: 0s
        puller_sync_time: 1m0s
        max_cas_retries: 10
      consul:
        host: localhost:8500
        acl_token: ""
        http_client_timeout: 20s
        consistent_reads: false
        watch_rate_limit: 1
        watch_burst_size: 1
      etcd:
        endpoints: []
        dial_timeout: 10s
        max_retries: 10
        tls_enabled: false
        tls_cert_path: ""
        tls_key_path: ""
        tls_ca_path: ""
        tls_server_name: ""
        tls_insecure_skip_verify: false
        username: ""
        password: ""
      multi:
        primary: ""
        secondary: ""
        mirror_enabled: false
        mirror_timeout: 2s
    heartbeat_period: 15s
    heartbeat_timeout: 1m0s
    replication_factor: 3
    tokens_file_path: ""
    zone_awareness_enabled: false
    keep_instance_in_the_ring_on_shutdown: false
    zone_stable_shuffle_sharding: false
    wait_stability_min_duration: 0s
    wait_stability_max_duration: 0s
    wait_instance_state_timeout: 10m0s
    final_sleep: 0s
    instance_id: ip-10-250-39-113.ec2.internal
    instance_interface_names:
      - eth1
      - eth0
    instance_port: 0
    instance_addr: ""
    instance_availability_zone: ""
  sharding_strategy: default

@yeya24
Contributor

yeya24 commented Jan 19, 2024

Hi @philiptrovato, this does indeed look like a bug. I will submit a fix.
