Add Metrics #248

gregkalapos · 2019-05-31T21:15:39Z

Implements #154.

Adds the whole infrastructure to send metrics
Implements sending the following metrics:
- system.process.cpu.total.norm.pct
- system.process.memory.size
- system.process.memory.rss.bytes
- system.memory.actual.free (Windows and Linux only - no macOS support atm)
- system.memory.total (Windows and Linux only - no macOS support atm)
- system.cpu.total.norm.pct (Windows and Linux only - we dropped the xplat implementation)
~~Also adds public API interface to send custom metrics~~ decided to drop this part - current plan is to do it in a follow up PR.

codecov-io · 2019-06-03T22:50:49Z

Codecov Report

Merging #248 into master will increase coverage by 3.15%.
The diff coverage is 79.54%.

@@            Coverage Diff             @@
##           master     #248      +/-   ##
==========================================
+ Coverage   76.61%   79.76%   +3.15%     
==========================================
  Files          66       68       +2     
  Lines        2245     2239       -6     
  Branches      443      403      -40     
==========================================
+ Hits         1720     1786      +66     
+ Misses        399      283     -116     
- Partials      126      170      +44

| Impacted Files | Coverage Δ | |
|---|---|---|
| src/Elastic.Apm/Api/Tracer.cs | `95.79% <ø> (-0.07%)` | ⬇️ |
| src/Elastic.Apm/Model/Transaction.cs | `97.32% <100%> (+0.15%)` | ⬆️ |
| src/Elastic.Apm/Helpers/StacktraceHelper.cs | `74% <69.23%> (-8.76%)` | ⬇️ |
| src/Elastic.Apm.AspNetCore/ApmMiddleware.cs | `82.73% <90.9%> (+0.42%)` | ⬆️ |
| src/Elastic.Apm/Sampler.cs | `56.75% <0%> (-27.03%)` | ⬇️ |
| .../Elastic.Apm/Config/AbstractConfigurationReader.cs | `72.64% <0%> (-10.9%)` | ⬇️ |
| ...Apm.AspNetCore/Config/MicrosoftExtensionsConfig.cs | `73.52% <0%> (-9.81%)` | ⬇️ |
| ...gnosticListeners/HttpDiagnosticListenerImplBase.cs | `62.16% <0%> (-4.51%)` | ⬇️ |
| src/Elastic.Apm/Logging/LogValuesFormatter.cs | `88.34% <0%> (-0.98%)` | ⬇️ |
| src/Elastic.Apm/Logging/ConsoleLogger.cs | `85.71% <0%> (-0.96%)` | ⬇️ |

... and 17 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5628f4e...6929bdd.

gregkalapos · 2019-06-07T22:15:54Z

Failing test fixed.

Decided to remove the public API part. Also, this is something most agents don’t have, so this time it wouldn't be just copying ideas :)

I think it’s better to leave it out for now and iterate on it later.

I made everything internal, except IMetricSet and MetricSample.

The reason is that IPayloadSender has a new method that accepts IMetrics (which contains a MetricSample property). I think it’s fine, or actually necessary, to make this public: the point of IPayloadSender is to replace how the agent reports events. For example, in tests we keep events in memory instead of sending them to the APM Server. It is also a valid use case for users to inject another implementation (although I think no one really does this) that, let’s say, just writes events into a file; the API enables this. So users can see the metrics if they provide the agent with a custom IPayloadSender implementation, but they cannot send custom metrics. So I try not to hide event reporting on the PayloadSender; there wouldn’t be any benefit to that.

Obviously sending custom metrics will be the interesting use case - and that will be implemented in a future PR.
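To illustrate the in-memory payload sender idea from this comment, here is a minimal, language-neutral sketch written in Python. It is not the agent's C# API: `PayloadSender`, `queue_metrics`, `MetricSet`, and `InMemoryPayloadSender` are hypothetical stand-ins for IPayloadSender, its metrics method, and IMetricSet, and the sample values are made up.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MetricSample:
    """One key/value reading, e.g. a CPU percentage (hypothetical shape)."""
    key: str
    value: float


@dataclass(frozen=True)
class MetricSet:
    """Several samples that share a single timestamp (hypothetical shape)."""
    timestamp_us: int
    samples: List[MetricSample]


class PayloadSender(ABC):
    """Hypothetical stand-in for the interface that decides where events go."""

    @abstractmethod
    def queue_metrics(self, metrics: MetricSet) -> None:
        ...


class InMemoryPayloadSender(PayloadSender):
    """Keeps metric sets in a list instead of sending them to a server."""

    def __init__(self) -> None:
        self.metrics: List[MetricSet] = []

    def queue_metrics(self, metrics: MetricSet) -> None:
        self.metrics.append(metrics)


# A test can inject the in-memory sender and inspect what was reported.
sender = InMemoryPayloadSender()
sender.queue_metrics(
    MetricSet(
        timestamp_us=1_560_000_000_000_000,
        samples=[MetricSample("system.process.cpu.total.norm.pct", 0.25)],
    )
)
```

With this shape a consumer can read the reported metrics, but nothing here lets it create and submit its own, which matches the split described above.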

gregkalapos · 2019-06-07T22:35:00Z

~~The SystemCPU calculation is atm wrong, will reiterate on that - it's already too late here to start working on it, I need a sleep before that.~~

gregkalapos · 2019-06-07T23:06:31Z

> The SystemCPU calculation is atm wrong, will reiterate on that - it's already too late here to start working on it, I need a sleep before that.

Fixed.

gregkalapos · 2019-06-09T10:18:19Z

@SergeyKleyman please take another look. Thanks.

gregkalapos · 2019-06-10T15:58:52Z

Tests fail on Linux, and calling virtual from a .ctor is a terrible idea. Working on it...

gregkalapos · 2019-06-10T16:32:08Z

> Tests fail on Linux, and calling virtual from a .ctor is a terrible idea. Working on it...

That issue is fixed.

I made the SystemCpu test more strict (not accepting 0), which fails on Windows:

[2019-06-10T16:20:36.603Z] Failed   Elastic.Apm.Tests.MetricsTests.SystemCpu
[2019-06-10T16:20:36.603Z] Error Message:
[2019-06-10T16:20:36.603Z]  Expected metricSamples.First().KeyValue.Value to be greater than 0.0, but found 0.0.

...working on this one.

SergeyKleyman

LGTM

gregkalapos · 2019-06-10T21:24:18Z

> > Tests fail on Linux, and calling virtual from a .ctor is a terrible idea. Working on it...
>
> That issue is fixed.
>
> I made the SystemCpu test more strict (not accepting 0), which fails on Windows:
> [2019-06-10T16:20:36.603Z] Failed   Elastic.Apm.Tests.MetricsTests.SystemCpu
> [2019-06-10T16:20:36.603Z] Error Message:
> [2019-06-10T16:20:36.603Z]  Expected metricSamples.First().KeyValue.Value to be greater than 0.0, but found 0.0.
> ...working on this one.

Failing test on Windows also fixed - the perf counter API seems to simply return 0 for the first call. We previously had 2 calls there, which is why we never noticed it.

I also have some ideas why they return 0 the first time... we had the same discussion 😄

gregkalapos · 2019-06-10T21:38:40Z

Awesome, I think we are good here. I'll resolve the conflict tomorrow and merge. 🚀

gregkalapos added 14 commits May 7, 2019 14:04

Add MetricsCollector and impl. system.process.cpu.total.norm.pct - WIP

c367be9

Implement collecting multiple metrics for 1 timestamp

690b520

and started working on memory metrics - wip

Metrics - Add System.Management reference

d1152fb

Add more memory metrics

36dc0fa

Change TotalProcessorTime calculation

8e251b2

Add metrics benchmark

f0d67c6

Add total CPU x-plat and windows implementation

31b7856

The x-plat version is slower and its accuracy is questionable, therefore we only use it as a fallback on non-Windows OSs

Use GlobalMemoryStatusEx to get Total and Avail. Mem on Windows

fa8c20f

Reason: WMI seem to have very bad perf.

Add GetProcessWorkingSetAndVirtualMemory function

e29093a

This was already done, but now moved to a dedicated method. Also moved _timer.Start() to its own method.

Update src/Elastic.Apm/Metrics/MetricsCollector.cs

81f9f84

WIP x-plat CPU metrics

846c47d

Cleanup and add metrics to Public API

7a34bbb

Add MetricsInterval config

abb0604

Move MetricsCollector to AgentComponents

281377a

gregkalapos mentioned this pull request May 31, 2019

Implement metrics #154

Closed

Update src/Elastic.Apm/Metrics/MetricsCollector.cs

3c926d2

gregkalapos self-assigned this May 31, 2019

gregkalapos added 2 commits June 3, 2019 11:59

Add FakeMetricsCollector for tests that don't rely on metrics

1750475

Introduce IMetricsProvider

4b14ccd

gregkalapos added 2 commits June 4, 2019 15:04

Metrics: Add test with real agent, add comments to public classes/methods

099b718

Code cleanup

264e57e

gregkalapos changed the title ~~Add Metrics - WIP~~ Add Metrics Jun 4, 2019

gregkalapos marked this pull request as ready for review June 4, 2019 13:42

gregkalapos requested a review from SergeyKleyman June 4, 2019 13:42

gregkalapos added 2 commits June 4, 2019 16:28

Update test/Elastic.Apm.Tests/MetricsTests.cs

e4bbb06

Change CPU usage calculation

736e2f7

With this approach the time range in which we measure the CPU usage is smaller than the whole time range (totalMsPassed). With this it's impossible to have higher than 100% CPU usage - this wasn't the case previously.
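The nested-window idea in this commit message can be sketched as follows. This is a language-neutral illustration in Python, not the agent's C# code: `ProcessCpuSampler` and its parameters are hypothetical, and the exact order of clock reads in the real change is an assumption. The wall clock is read both before and after the CPU time, so the CPU-time window always lies inside the wall-clock window used as the divisor.

```python
import os
import time


class ProcessCpuSampler:
    """Hypothetical helper: normalized process CPU usage between two samples."""

    def __init__(self, cpu_count=None, cpu_time=time.process_time, clock=time.monotonic):
        self._cpu_count = cpu_count or os.cpu_count() or 1
        self._cpu_time = cpu_time
        self._clock = clock
        # The wall-clock window starts *before* the first CPU reading.
        self._window_start, self._last_cpu, _ = self._read()

    def _read(self):
        # Read the wall clock before and after the CPU time, so the CPU
        # reading always falls between the two wall-clock readings.
        wall_before = self._clock()
        cpu = self._cpu_time()
        wall_after = self._clock()
        return wall_before, cpu, wall_after

    def sample(self):
        """Return CPU usage as a fraction of all cores, or None if no time passed."""
        wall_before, cpu, wall_after = self._read()
        cpu_delta = cpu - self._last_cpu
        # The wall-clock window ends *after* the latest CPU reading.
        wall_delta = wall_after - self._window_start
        self._window_start, self._last_cpu = wall_before, cpu
        if wall_delta <= 0:
            return None
        return cpu_delta / (wall_delta * self._cpu_count)
```

Because the divisor covers at least as much time as the CPU readings do, the ratio should stay at or below 1.0, at the cost of slightly under-reporting usage.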

gregkalapos added the [zube]: In Review label Jun 4, 2019

SergeyKleyman added 2 commits June 5, 2019 13:57

Refactor parsing code

2184277

Replace usages of default value with a named constant

64ff726

Update src/Elastic.Apm/Metrics/MetricsProvider/SystemTotalCpuProvider.cs

5595d10

Update src/Elastic.Apm/Metrics/MetricsProvider/SystemTotalCpuProvider.cs

d175b0b

ProcessTotalCpuTimeProvider: add comment explaining the timeframes we measure

790a4f2

Addressing #248 (comment)

SergeyKleyman reviewed Jun 9, 2019

test/Elastic.Apm.Tests/MetricsTests.cs (outdated, resolved)

Update test/Elastic.Apm.Tests/MetricsTests.cs

a8b272f

SergeyKleyman reviewed Jun 9, 2019

src/Elastic.Apm/Metrics/MetricsProvider/SystemTotalCpuProvider.cs (outdated, resolved)

SergeyKleyman reviewed Jun 9, 2019

src/Elastic.Apm/Metrics/MetricsProvider/SystemTotalCpuProvider.cs (outdated, resolved)

SergeyKleyman reviewed Jun 9, 2019

src/Elastic.Apm/Metrics/MetricsProvider/SystemTotalCpuProvider.cs (outdated, resolved)

/proc/stat: parse values to long, trim empty spaces dynamically after 'cpu' + tests

fb96b54

SystemTotalCpuProvider: Inject StreamReader to test parsing logic

63c1a72
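The two commits above (parsing the aggregate 'cpu' line of /proc/stat and injecting the reader so the parsing can be tested) can be illustrated with a short sketch. It is written in Python rather than the agent's C#, and `read_cpu_times` and `cpu_usage` are hypothetical names. Treating only the `idle` column as idle time is an assumption; some tools also count `iowait`.

```python
import io


def read_cpu_times(stream):
    """Parse the aggregate 'cpu' line from a /proc/stat-like text stream.

    Returns (total, idle) in clock ticks, or None if the line is missing
    or malformed. The stream is injected so tests can pass in-memory text.
    """
    for line in stream:
        fields = line.split()  # split() copes with any amount of whitespace after 'cpu'
        if fields and fields[0] == "cpu":
            try:
                values = [int(v) for v in fields[1:]]
            except ValueError:
                return None
            if len(values) < 4:
                return None
            # Columns: user nice system idle iowait irq softirq steal guest guest_nice.
            # Guest time is already included in user time, so sum only the first 8.
            total = sum(values[:8])
            idle = values[3]
            return total, idle
    return None


def cpu_usage(previous, current):
    """Fraction of non-idle time between two (total, idle) readings."""
    total_delta = current[0] - previous[0]
    idle_delta = current[1] - previous[1]
    if total_delta <= 0:
        return None
    return 1.0 - idle_delta / total_delta


# In-memory text stands in for /proc/stat, which keeps the test OS-independent.
first = read_cpu_times(io.StringIO("cpu  100 0 100 800 0 0 0 0 0 0\ncpu0 100 0 100 800 0 0 0 0 0 0\n"))
second = read_cpu_times(io.StringIO("cpu  150 0 150 900 0 0 0 0 0 0\n"))
usage = cpu_usage(first, second)
```

On a Linux machine the same function could be called with `open("/proc/stat")` instead of the in-memory stream.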

SergeyKleyman reviewed Jun 10, 2019

src/Elastic.Apm/Metrics/MetricsProvider/SystemTotalCpuProvider.cs (resolved)

Fix failing SystemCpu on Windows

6929bdd

The perf. counter API seems to return 0 for the 1. call, so we simply call it 2 times - the MetricsCollector can deal with this.
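A rate-style counter needs two raw readings before it can report anything, which is why a first call that returns 0 is plausible and why an extra "priming" call works around it. The sketch below shows that pattern in Python; it does not use the Windows performance counter API, and `RateCounter` and `PrimedCpuProvider` are hypothetical names for illustration only.

```python
class RateCounter:
    """Hypothetical counter that derives a rate from two raw (busy, total) readings."""

    def __init__(self, read_raw):
        self._read_raw = read_raw
        self._previous = None

    def next_value(self):
        current = self._read_raw()
        previous, self._previous = self._previous, current
        if previous is None:
            # Nothing to compare against yet, so the first call reports 0.
            return 0.0
        busy_delta = current[0] - previous[0]
        total_delta = current[1] - previous[1]
        return busy_delta / total_delta if total_delta > 0 else 0.0


class PrimedCpuProvider:
    """Discards the first reading up front so callers only see real values."""

    def __init__(self, counter):
        self._counter = counter
        self._counter.next_value()  # priming call; its 0.0 result is thrown away

    def get_sample(self):
        return self._counter.next_value()


# Fixed raw readings stand in for the operating system's counters.
raw = iter([(0, 0), (50, 100), (80, 200)])
unprimed = RateCounter(lambda: next(raw))
first_unprimed = unprimed.next_value()

raw_again = iter([(0, 0), (50, 100), (80, 200)])
provider = PrimedCpuProvider(RateCounter(lambda: next(raw_again)))
first_primed = provider.get_sample()
```

Here `first_unprimed` is the uninformative 0.0, while `first_primed` already reflects the difference between the first two raw readings.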

SergeyKleyman reviewed Jun 10, 2019

test/Elastic.Apm.Tests/MetricsTests.cs (outdated, resolved)

SergeyKleyman approved these changes Jun 10, 2019

SystemCPU on Windows: move 1. perf. counter call to .ctor

ab37547

gregkalapos merged commit ab37547 into elastic:master Jun 11, 2019

zube bot added [zube]: Done and removed [zube]: In Review labels Jun 11, 2019

gregkalapos removed the [zube]: Done label Jun 13, 2019

This was referenced Jun 13, 2019

Add net461 as a separate target for Elastic.Apm #277

Merged

system.cpu.total.norm.pct x-plat implementation #278

Closed

gregkalapos mentioned this pull request Jun 26, 2019

Send agent specific runtime metrics #321

Closed
