-
Notifications
You must be signed in to change notification settings - Fork 3.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove telemetry from gasKV and cacheKV store operation's get methods #10072
Labels
C: telemetry
Issues and features pertaining to SDK telemetry.
T: Performance
Performance improvements
Comments
9 tasks
clevinson
added
C: telemetry
Issues and features pertaining to SDK telemetry.
backport/0.42.x (Stargate)
labels
Sep 3, 2021
19 tasks
mergify bot
pushed a commit
that referenced
this issue
Sep 5, 2021
## Description Closes: #10072 Significantly speeds up all the GasKvStore and CacheKVStore operations. Now Get/Set for these stores no longer even appears on my Osmosis benchmarks, saving ~8% of those benchmark's times. (Only one of the internal methods for setCacheValue appear -- will try to separately get a PR for reducing those memory allocations if #10026 is merged) Talked to @alexanderbez about this on Discord, and he seemed in agreement with the approach of removing telemetry from these stores. This does technically change telemetry, but I don't know if this is / should be considered a breaking change? --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - is there something I should be updating? - [x] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable)
mergify bot
pushed a commit
that referenced
this issue
Sep 5, 2021
## Description Closes: #10072 Significantly speeds up all the GasKvStore and CacheKVStore operations. Now Get/Set for these stores no longer even appears on my Osmosis benchmarks, saving ~8% of those benchmark's times. (Only one of the internal methods for setCacheValue appear -- will try to separately get a PR for reducing those memory allocations if #10026 is merged) Talked to @alexanderbez about this on Discord, and he seemed in agreement with the approach of removing telemetry from these stores. This does technically change telemetry, but I don't know if this is / should be considered a breaking change? --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - is there something I should be updating? - [x] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 78b151d) # Conflicts: # CHANGELOG.md
mergify bot
pushed a commit
that referenced
this issue
Sep 16, 2021
## Description Closes: #10072 Significantly speeds up all the GasKvStore and CacheKVStore operations. Now Get/Set for these stores no longer even appears on my Osmosis benchmarks, saving ~8% of those benchmark's times. (Only one of the internal methods for setCacheValue appear -- will try to separately get a PR for reducing those memory allocations if #10026 is merged) Talked to @alexanderbez about this on Discord, and he seemed in agreement with the approach of removing telemetry from these stores. This does technically change telemetry, but I don't know if this is / should be considered a breaking change? --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - is there something I should be updating? - [x] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 78b151d) # Conflicts: # CHANGELOG.md
robert-zaremba
added a commit
that referenced
this issue
Sep 17, 2021
) * perf: Remove telemetry from wrappings of store (#10077) ## Description Closes: #10072 Significantly speeds up all the GasKvStore and CacheKVStore operations. Now Get/Set for these stores no longer even appears on my Osmosis benchmarks, saving ~8% of those benchmark's times. (Only one of the internal methods for setCacheValue appear -- will try to separately get a PR for reducing those memory allocations if #10026 is merged) Talked to @alexanderbez about this on Discord, and he seemed in agreement with the approach of removing telemetry from these stores. This does technically change telemetry, but I don't know if this is / should be considered a breaking change? --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - is there something I should be updating? - [x] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 78b151d) # Conflicts: # CHANGELOG.md * fix changelog Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
evan-forbes
pushed a commit
to evan-forbes/cosmos-sdk
that referenced
this issue
Oct 12, 2021
cosmos#10170) * perf: Remove telemetry from wrappings of store (cosmos#10077) ## Description Closes: cosmos#10072 Significantly speeds up all the GasKvStore and CacheKVStore operations. Now Get/Set for these stores no longer even appears on my Osmosis benchmarks, saving ~8% of those benchmark's times. (Only one of the internal methods for setCacheValue appear -- will try to separately get a PR for reducing those memory allocations if cosmos#10026 is merged) Talked to @alexanderbez about this on Discord, and he seemed in agreement with the approach of removing telemetry from these stores. This does technically change telemetry, but I don't know if this is / should be considered a breaking change? --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - is there something I should be updating? - [x] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 78b151d) # Conflicts: # CHANGELOG.md * fix changelog Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
evan-forbes
pushed a commit
to evan-forbes/cosmos-sdk
that referenced
this issue
Nov 1, 2021
cosmos#10170) * perf: Remove telemetry from wrappings of store (cosmos#10077) ## Description Closes: cosmos#10072 Significantly speeds up all the GasKvStore and CacheKVStore operations. Now Get/Set for these stores no longer even appears on my Osmosis benchmarks, saving ~8% of those benchmark's times. (Only one of the internal methods for setCacheValue appear -- will try to separately get a PR for reducing those memory allocations if cosmos#10026 is merged) Talked to @alexanderbez about this on Discord, and he seemed in agreement with the approach of removing telemetry from these stores. This does technically change telemetry, but I don't know if this is / should be considered a breaking change? --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - is there something I should be updating? - [x] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 78b151d) # Conflicts: # CHANGELOG.md * fix changelog Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
JeancarloBarrios
pushed a commit
to agoric-labs/cosmos-sdk
that referenced
this issue
Sep 28, 2024
cosmos#10170) * perf: Remove telemetry from wrappings of store (cosmos#10077) ## Description Closes: cosmos#10072 Significantly speeds up all the GasKvStore and CacheKVStore operations. Now Get/Set for these stores no longer even appears on my Osmosis benchmarks, saving ~8% of those benchmark's times. (Only one of the internal methods for setCacheValue appear -- will try to separately get a PR for reducing those memory allocations if cosmos#10026 is merged) Talked to @alexanderbez about this on Discord, and he seemed in agreement with the approach of removing telemetry from these stores. This does technically change telemetry, but I don't know if this is / should be considered a breaking change? --- ### Author Checklist *All items are required. Please add a note to the item if the item is not applicable and please add links to any relevant follow up issues.* I have... - [x] included the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] added `!` to the type prefix if API or client breaking change - [x] targeted the correct branch (see [PR Targeting](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#pr-targeting)) - [x] provided a link to the relevant issue or specification - [x] followed the guidelines for [building modules](https://github.com/cosmos/cosmos-sdk/blob/master/docs/building-modules) - [x] included the necessary unit and integration [tests](https://github.com/cosmos/cosmos-sdk/blob/master/CONTRIBUTING.md#testing) - [ ] added a changelog entry to `CHANGELOG.md` - [x] included comments for [documenting Go code](https://blog.golang.org/godoc) - [ ] updated the relevant documentation or specification - is there something I should be updating? - [x] reviewed "Files changed" and left comments if necessary - [ ] confirmed all CI checks have passed ### Reviewers Checklist *All items are required. Please add a note if the item is not applicable and please add your handle next to the items reviewed if you only reviewed selected items.* I have... - [ ] confirmed the correct [type prefix](https://github.com/commitizen/conventional-commit-types/blob/v3.0.0/index.json) in the PR title - [ ] confirmed `!` in the type prefix if API or client breaking change - [ ] confirmed all author checklist items have been addressed - [ ] reviewed state machine logic - [ ] reviewed API design and naming - [ ] reviewed documentation is accurate - [ ] reviewed tests and test coverage - [ ] manually tested (if applicable) (cherry picked from commit 78b151d) # Conflicts: # CHANGELOG.md * fix changelog Co-authored-by: Dev Ojha <ValarDragon@users.noreply.github.com> Co-authored-by: Robert Zaremba <robert@zaremba.ch>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
C: telemetry
Issues and features pertaining to SDK telemetry.
T: Performance
Performance improvements
Summary
The current telemetry work in GasKV and CacheKV store get operations causes significant overheads, antithetical to their purpose.
These telemetry overheads are (seemingly) for the use case of find all I/O time. However the approach being used here does not make sense to me (its effectively measuring 'time spent in getting from cache', not 'time from underlying store'), and is actually causing the overhead that its measuring!
If these are viewed as necessary, then they should only happen if the node has telemetry enabled.
Problem Definition
In our osmosis epoch time benchmarks, where the CacheKV store's cache size is huge, at least a third of the CacheKV store's Get function is spent in telemetry related operations. However per my (back of the envelope) estimate, our hashmap get operation time should probably be at least double the time of normal CacheKV operations used in the SDK.
However this telemetry overhead is then paid again in the gas kv store that is calling the cache kv store. (As they both do the same telemetry operation on each get)
This means that most of the time in fetching data from a catch (even in the case of a miss) is spent in telemetry.
Proposal
I don't view time spent in caches as the helpful telemetry operation. I think telemetry for getting cache timings is also bad, as it ruins lots of hardware cache efficiency work going on here. If you wanted telemetry for your caches, you should be doing this at a higher level (e.g. overall time with and without the cache). imo you should not be benchmarking low level things like this, as it artificially screws up things like stack/registers, cache lines, which are important for data caches.
The alternative take would be to 'gate' these defers to only be valid when telemetry is enabled, and make that propogate throughout the state machine. I think these telemetry ops should be removed personally.
pprof source timings for Osmosis epoch benchmarks' very heavy usage of cacheKVStore. (My understanding is that the runtime.NewObject comes from the defer? But I don't actually know that at all)
For Admin Use
The text was updated successfully, but these errors were encountered: