-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Initial Metrics #2499
base: main
Are you sure you want to change the base?
Add Initial Metrics #2499
Conversation
Starting early global discussions here:
Counter ideas:
Tip: GitHub will auto redirect PRs, so you can PR to a PR and when the first PR is merged downstream PRs automatically get redirected to their merge target, e.g. this would change to look at main upon merge, showing only relevant diff the whole time...very handy for these layering cases :) Happy to hop on a call next week if y'all want to talk though, just getting initial ideas out there. Thoughts? |
There isn't a proscribed format from OTel about what format counter names should look like. OTel has some counters they're defining for certain domains and they use dots to separate words. .NET counters use dashes. When exported to Prometheus, they're both turned into underscores.
Meter + counters don't necessarily have to be a singleton. In ASP.NET Core, the meter is created from A singleton is difficult because multiple tests can run in parallel, or work may still be happening from a finished test, which interferes with results. See https://github.com/JamesNK/MeterFactoryDemo/blob/d0660fb70ea54bfa358606830a4e02f507fc89f6/src/MeterFactoryDemo.Tests/BasicTests.cs#L14-L41 for an example of how easy it can be to test metrics once the meter is injected. I'm guessing the Redis client doesn't have close integration with DI so a more manual approach like what we're doing with HttpClient would be required: dotnet/runtime#86961
Check out dotnet/aspnetcore#47536 for ideas of the kind of counters that we're adding to ASP.NET Core + dimensions. |
You could consider combining these into one histogram counter: |
if (_operationCount.Enabled) | ||
{ | ||
_operationCount.Add(1, | ||
new KeyValuePair<string, object?>("endpoint", endpoint)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Counter.Add
tests for enabled internally. Having an extra check like this is really only useful if some work happens when building up the tag list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 - we should have a string version of the endpoint memoized so that this KVP is as cheap as possible if not already, no objections adding that.
@JamesNK great points all around (and I really appreciate the expertise and time!) I agree with collapsing and using dimensions for types (sync/async) and the testing story. I think my confusion on instance intent here is the direct usage of Also agree on DI assessment: we don't expose the |
Just as a heads up (and not at all specific to Redis) I've been questioning lately whether .NET should continue to use and recommend dashes in these names. I sent some email earlier that touched on it. I can't find an explicit written recommendation for '.' in OTel's guidance, yet '.' is used uniformly in every OTel example, every experimental semantic convention, and every implementation I can find. The fact that the OTel text doesn't explicitly say "we recommend using dot" feels more like an oversight rather than OTel deliberately not having a recommendation/preference. Some other other generalized guidance from OTel on naming is here.
If you can create an internal constructor for RedisMetrics and get your code under test to use it then I'd recommend making that internal constructor take a Meter object: public static readonly RedisMetrics Instance = new RedisMetrics(new Meter("StackExchange.Redis"));
internal RedisMetrics(Meter meter)
{
_meter = meter;
_operationCount = _meter.CreateCounter<long>(
"redis-operation-count",
description: "The number of operations performed.");
...
} And then use it something like: // arrange
// the name of this Meter doesn't matter and it doesn't need to be unique across tests
var meter = new Meter("StackExchange.Redis");
// This collector will only collect metrics recorded by this specific Meter object.
// Even if other tests use Meters with the same name, they will not be captured by this collector
using var collector = new MetricCollector(meter, "redis-operation-count");
var redisMetrics = new RedisMetrics(meter);
// act
DoSomeWorkThatEmitsMetrics(redisMetrics);
// assert
Assert.Equal(1, collector.GetMeasurementSnapshot().Count);
... Its certainly possible to inject strings and use them to create custom Meter names, but I think that winds up being more complex than injecting a Meter object and just using the object identity as the thing that keeps metrics for each test isolated. The other option if you can't inject anything (or prefer not to), is to create |
@noahfalk The dash feels weird to me too, everything was namespace delimited with dots when we did Bosun's system design and metrics because that's what we saw everywhere else. Dashes weren't a thing at all, and spaces were underscores, e.g. As for testing: great suggestion, I love it. I knew we could narrow in on a custom |
Yeah, I don't want Redis to look like an oddball either. If we go through with this I'm going to adjust names for all the metrics being added to ASP.NET and HttpClient for .NET 8, as well as change the written naming guidance in our public docs. I'm not proposing we change names that already got shipped in past releases as part of EventCounters API, but I expect the usage of those metrics will diminish over time and we'd converge towards a place where all the metrics people care about are using dot. |
- no dashes. - use `db.redis` prefix Implement duration histogram and collapse 4 counters into it.
Thanks for all the feedback so far! I still need to address the feedback to make this testable by allowing the unit tests to inject a Metrics object into the Multiplexer. I'll work on that next week. In the meantime, I've updated for better naming to align with OpenTelemetry (no more dash
@NickCraver @mgravell - my intention is for this PR to get the infrastructure and an initial set of metrics added to the library. And then add more metrics in separate PRs as we get the need for them. How does that approach sound? Do you have an idea for what a "minimal set" of initial metrics would be? |
@eerhardt I get the intent, but all past experience has taught me not to put a few metrics in and come back later. In every case we've done that, when we go from say the first 3 to the first 20 we found out the scheme/name/dimensions/etc. for the first 3 wasn't right and stands out. IMO, we need to figure out a larger set together. Looking at current, I see db.redis.operation.count and and db.redis.duration, wouldn't the second be db.redis.opration.duration? (I'm also not sure operation is the right word - we do a lot of operations, in Redis these are typically "commands" and I'd expect that here. This could be bias, but connections, reconnectiions, reconfigurations, etc. are all "operations". I'm not 100% sure operations/commands makes overall sense as an item. For example, we have commands sent and received. I'd guess based on most of our debugging these should be separate, e.g. the first being steady state and the second pausing indicates a server stall. On commands we probably want the command (e.g. Overall, I think we've got the approach figured out, we should now make an overall list of metrics to get a decent API/name/pattern/tag design that's a lot more comprehensive than starting with just a few. Happy to do a call this week if it helps, I just need to get some active blocking release stuff out the door to get more time on SE.Redis. Let me try and get a decent start on the full list here during the Tuesday call. |
…the Meter object.
Ok, I've updated the latest code to allow for injection of the If we like how this is structured, I can move forward with adding/removing/refactoring the metrics we really want to design. And then add more tests for them. |
var now = Stopwatch.GetTimestamp(); | ||
var duration = new TimeSpan((long)((now - message.CreatedTimestamp) * s_tickFrequency)); | ||
|
||
var tags = new TagList |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for upto 3 tags, its faster to pass them directly.
https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/metrics#instruments
When reporting measurements with 3 tags or less, pass the tags directly to the instrument API.
It's pity to see it inactive. Are there any plans to include metrics into near future releases? |
(Note: Leaving in Draft until #2497 is merged as this needs a
net6.0
target. You can skip the first commit in the review for now.)This adds the initial implementation of using System.Diagnostics.Metrics in the StackExchange.Redis client. For this first round, the following metrics are tracked:
Other metrics in
ConnectionCounters
need to use UpDownCounter, which is only available innet7.0
.cc @NickCraver @mgravell @JamesNK @noahfalk