Make reports smaller #1201

Open
tomwilkie opened this issue Mar 23, 2016 · 14 comments
Assignees
Labels
chore Related to fix/refinement/improvement of end user or new/existing developer functionality
performance Excessive resource usage and latency; usually a bug or chore
Milestone

Comments

@tomwilkie
Contributor

I took a report from the service, and found it to be 6M uncompressed (500K compressed). This was slightly bigger than I was expecting, although let's assume this is made of 15*3=45 different reports, which still puts compressed probe -> app reports at about 10K each.

I was interested in seeing which topology was using the most space - assuming it to be endpoints. It was not:

-rw-r--r--@  1 twilkie  staff   6.0M 23 Mar 12:46 k8s_report.json
-rw-r--r--   1 twilkie  staff   1.2M 23 Mar 12:50 container.json
-rw-r--r--   1 twilkie  staff    31K 23 Mar 12:51 container_image.json
-rw-r--r--   1 twilkie  staff   1.7M 23 Mar 12:50 endpoints.json
-rw-r--r--   1 twilkie  staff    19K 23 Mar 12:52 host.json
-rw-r--r--   1 twilkie  staff    55K 23 Mar 12:52 pod.json
-rw-r--r--   1 twilkie  staff   2.6M 23 Mar 12:50 process.json
-rw-r--r--   1 twilkie  staff    39K 23 Mar 12:52 service.json

So, perhaps an easy win would be to not report on every process; for instance, we only show processes which are doing IO, so we could easily filter out processes in the probe whose PIDs don't appear in the endpoint topology. How this would affect things like the details panel is a different problem.

I also suspect the reason the process topology is so big is because we have metrics on it (and we don't have metrics on endpoints). Perhaps their representation could be improved?
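A minimal sketch of that PID filter, using hypothetical simplified types (the real Scope report and node structures differ):

```go
package main

import "fmt"

// Node is a hypothetical, simplified process node; the real Scope
// report structures are richer than this.
type Node struct {
	ID  string
	PID string
}

// filterIdleProcesses keeps only the process nodes whose PID appears in
// the endpoint topology -- i.e. the ones actually doing IO, which are
// the only ones currently shown anyway.
func filterIdleProcesses(processes []Node, endpointPIDs map[string]bool) []Node {
	var kept []Node
	for _, p := range processes {
		if endpointPIDs[p.PID] {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	procs := []Node{{ID: "proc-1", PID: "100"}, {ID: "proc-2", PID: "200"}}
	active := map[string]bool{"100": true} // PIDs seen in the endpoint topology
	fmt.Println(len(filterIdleProcesses(procs, active))) // prints 1
}
```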

@tomwilkie
Contributor Author

For instance, I suspect the way we encode times in json might be a little verbose, although (a) compression should be very effective on that and (b) we use msgpack.

@2opremio
Contributor

Related: #985

@2opremio
Contributor

This is getting urgent. After seeing the pressure put on gzip in #1454 and #1457, I measured the reports on the service and they are all around ~10MB. Note that this is msgpack, not json.

<app> DEBU: 2016/05/10 07:07:19.868129 Decoded report with uncompressed size: 11039439 bytes
<app> DEBU: 2016/05/10 07:07:20.514462 Decoded report with uncompressed size: 1356204 bytes
<app> DEBU: 2016/05/10 07:07:21.075427 Decoded report with uncompressed size: 8903471 bytes
<app> DEBU: 2016/05/10 07:07:22.460466 Decoded report with uncompressed size: 11021565 bytes
<app> DEBU: 2016/05/10 07:07:23.263057 Decoded report with uncompressed size: 8815125 bytes
<app> DEBU: 2016/05/10 07:07:23.487196 Decoded report with uncompressed size: 1362641 bytes
<app> DEBU: 2016/05/10 07:07:25.942304 Decoded report with uncompressed size: 10991935 bytes
<app> DEBU: 2016/05/10 07:07:26.518838 Decoded report with uncompressed size: 1355327 bytes
<app> DEBU: 2016/05/10 07:07:27.537417 Decoded report with uncompressed size: 8838048 bytes
<app> DEBU: 2016/05/10 07:07:30.083979 Decoded report with uncompressed size: 8840959 bytes
<app> DEBU: 2016/05/10 07:07:30.084173 Decoded report with uncompressed size: 1362380 bytes
<app> DEBU: 2016/05/10 07:07:31.110165 Decoded report with uncompressed size: 10992570 bytes
<app> DEBU: 2016/05/10 07:07:31.937460 Decoded report with uncompressed size: 10995374 bytes
<app> DEBU: 2016/05/10 07:07:32.497260 Decoded report with uncompressed size: 1366081 bytes
<app> DEBU: 2016/05/10 07:07:34.032572 Decoded report with uncompressed size: 8860366 bytes
<app> DEBU: 2016/05/10 07:07:34.658462 Decoded report with uncompressed size: 11069890 bytes
<app> DEBU: 2016/05/10 07:07:36.411694 Decoded report with uncompressed size: 1360041 bytes
<app> DEBU: 2016/05/10 07:07:36.411867 Decoded report with uncompressed size: 8875983 bytes
<app> DEBU: 2016/05/10 07:07:40.419795 Decoded report with uncompressed size: 8827937 bytes
<app> DEBU: 2016/05/10 07:07:40.420016 Decoded report with uncompressed size: 1358527 bytes
<app> DEBU: 2016/05/10 07:07:40.420170 Decoded report with uncompressed size: 10994623 bytes
<app> DEBU: 2016/05/10 07:07:42.605203 Decoded report with uncompressed size: 1374407 bytes
<app> DEBU: 2016/05/10 07:07:42.605459 Decoded report with uncompressed size: 8863273 bytes
<app> DEBU: 2016/05/10 07:07:42.606038 Decoded report with uncompressed size: 10993672 bytes
<app> DEBU: 2016/05/10 07:07:43.982567 Decoded report with uncompressed size: 11074538 bytes
<app> DEBU: 2016/05/10 07:07:44.130010 Decoded report with uncompressed size: 8887728 bytes
<app> DEBU: 2016/05/10 07:07:44.539015 Decoded report with uncompressed size: 1364226 bytes
<app> DEBU: 2016/05/10 07:07:47.459972 Decoded report with uncompressed size: 10991909 bytes
<app> DEBU: 2016/05/10 07:07:47.563402 Decoded report with uncompressed size: 8866257 bytes
<app> DEBU: 2016/05/10 07:07:47.592480 Decoded report with uncompressed size: 1362825 bytes
<app> DEBU: 2016/05/10 07:07:51.607983 Decoded report with uncompressed size: 10991917 bytes
<app> DEBU: 2016/05/10 07:07:51.608229 Decoded report with uncompressed size: 8862542 bytes
<app> DEBU: 2016/05/10 07:07:51.608356 Decoded report with uncompressed size: 1370380 bytes
<app> DEBU: 2016/05/10 07:07:53.149122 Decoded report with uncompressed size: 11076156 bytes
<app> DEBU: 2016/05/10 07:07:53.289902 Decoded report with uncompressed size: 8904769 bytes
<app> DEBU: 2016/05/10 07:07:53.868596 Decoded report with uncompressed size: 1373424 bytes
<app> DEBU: 2016/05/10 07:07:56.264958 Decoded report with uncompressed size: 10994070 bytes
<app> DEBU: 2016/05/10 07:07:56.546419 Decoded report with uncompressed size: 1368808 bytes
<app> DEBU: 2016/05/10 07:07:57.447718 Decoded report with uncompressed size: 8863152 bytes
<app> DEBU: 2016/05/10 07:08:00.677720 Decoded report with uncompressed size: 8862768 bytes
<app> DEBU: 2016/05/10 07:08:01.772031 Decoded report with uncompressed size: 1371314 bytes
<app> DEBU: 2016/05/10 07:08:01.772277 Decoded report with uncompressed size: 11040852 bytes
<app> DEBU: 2016/05/10 07:08:02.708167 Decoded report with uncompressed size: 10948433 bytes
<app> DEBU: 2016/05/10 07:08:02.708344 Decoded report with uncompressed size: 1376683 bytes
<app> DEBU: 2016/05/10 07:08:05.054480 Decoded report with uncompressed size: 11074448 bytes
<app> DEBU: 2016/05/10 07:08:06.058167 Decoded report with uncompressed size: 8879050 bytes
<app> DEBU: 2016/05/10 07:08:06.058380 Decoded report with uncompressed size: 1362445 bytes
<app> DEBU: 2016/05/10 07:08:07.000014 Decoded report with uncompressed size: 8984940 bytes
<app> DEBU: 2016/05/10 07:08:10.112212 Decoded report with uncompressed size: 8738654 bytes
<app> DEBU: 2016/05/10 07:08:11.323737 Decoded report with uncompressed size: 10992574 bytes
<app> DEBU: 2016/05/10 07:08:11.324003 Decoded report with uncompressed size: 1368511 bytes
<app> DEBU: 2016/05/10 07:08:12.347687 Decoded report with uncompressed size: 8872709 bytes
<app> DEBU: 2016/05/10 07:08:12.347873 Decoded report with uncompressed size: 1375379 bytes

@rade rade added performance Excessive resource usage and latency; usually a bug or chore tech-debt Unpleasantness that does (or may in future) affect development labels Jul 4, 2016
@rade rade modified the milestone: July2016 Jul 4, 2016
@rade
Member

rade commented Jul 18, 2016

Looking at a report (from dev admin scope, so it was running without proc probing), most of it is taken up by timestamps! iirc @tomwilkie mentioned this before.

e.g.

        "latest": {
          "conntracked": {
            "timestamp": "2016-07-18T20:54:46.155006786Z",
            "value": "true"
          },
          "addr": {
            "timestamp": "2016-07-18T20:54:46.155000127Z",
            "value": "172.20.0.122"
          },
          "port": {
            "timestamp": "2016-07-18T20:54:46.155000127Z",
            "value": "36536"
          }
        },

which would shrink to

        "latest": {
          "conntracked": "true",
          "addr": "172.20.0.122",
          "port": "36536"
        },

without timestamps.

Surely we don't need timestamps on nearly every individual json object - one timestamp per report should be fine, with the possible exception of some metrics (though these seem to carry separate "first" and "last" timestamps anyway).

Related: some timestamps are recorded with second precision, some with nanosecond precision.

@rade
Member

rade commented Jul 19, 2016

Looks like the source of all these timestamps is a) LatestMap, and b) Metric.

For LatestMap I wonder whether we could associate the timestamp with the map rather than the individual entries.

Metric is a timeseries, so the individual timestamps there are rather important. However, for many metrics the value is constant, in which case the samples could be replaced with just a single value. Combined with the existing first/last fields this is a lossless compression.
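A sketch of that lossless compression for constant metrics, with a hypothetical simplified Sample type (the real Metric type in Scope differs):

```go
package main

import "fmt"

// Sample is a hypothetical, simplified metric sample.
type Sample struct {
	Timestamp int64
	Value     float64
}

// compressConstant reports whether every sample carries the same value;
// if so, the series can be serialized as that single value plus the
// existing first/last timestamps -- losslessly.
func compressConstant(samples []Sample) (float64, bool) {
	if len(samples) == 0 {
		return 0, false
	}
	v := samples[0].Value
	for _, s := range samples[1:] {
		if s.Value != v {
			return 0, false
		}
	}
	return v, true
}

func main() {
	constant := []Sample{{1, 4}, {2, 4}, {3, 4}}
	v, ok := compressConstant(constant)
	fmt.Println(v, ok) // prints 4 true
}
```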

@rade
Member

rade commented Jul 19, 2016

For LatestMap I wonder whether we could associate the timestamp with the map rather than the individual entries.

That would make merging non-commutative when there are any entries where the value can change over time. Maintaining commutativity in such cases is of course the main reason of having any timestamps at all in the first place.

Now, in reality the majority of values are constant - they cannot change over time. So perhaps we should have an additional data structure, ConstMap where we stuff all such entries. The likes of container env entries and container labels, and endpoint addr/port.
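A sketch of such a ConstMap (hypothetical name and shape): because values never change, merge needs no per-entry timestamps and stays commutative.

```go
package main

import "fmt"

// ConstMap is a hypothetical timestamp-free map for values that cannot
// change over time (container labels, env entries, endpoint addr/port).
type ConstMap map[string]string

// Merge combines two ConstMaps. Since a key's value is constant, any
// collision carries the same value on both sides, so the result is the
// same whichever operand comes first -- merge is commutative.
func (m ConstMap) Merge(other ConstMap) ConstMap {
	out := make(ConstMap, len(m)+len(other))
	for k, v := range m {
		out[k] = v
	}
	for k, v := range other {
		out[k] = v
	}
	return out
}

func main() {
	a := ConstMap{"addr": "172.20.0.122"}
	b := ConstMap{"port": "36536"}
	fmt.Println(len(a.Merge(b)), len(b.Merge(a))) // prints 2 2
}
```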

@tomwilkie
Contributor Author

The reality is we can't really change this data structure anymore, so to move the timestamps to the map level would require us to add a new data structure (const map...).


@rade
Member

rade commented Jul 19, 2016

The reality is we can't really change this data structure anymore, so to move the timestamps to the map level would require us to add a new data structure (const map...).

Yep, that's what I suggested :)

@tomwilkie
Contributor Author

Indeed, I was agreeing.


@2opremio 2opremio self-assigned this Jul 19, 2016
@2opremio
Contributor

2opremio commented Jul 26, 2016

Another thing we could do is to make sure that empty fields are not serialized/deserialized. Here's an example of a serialized endpoint node:

{
  "id": ";172.20.0.149;47053",
  "topology": "endpoint",
  "counters": {},
  "sets": {},
  "adjacency": [
    ";172.20.0.207;10250"
  ],
  "edges": {
    ";172.20.0.207;10250": {}
  },
  "controls": {},
  "latest": {
    "addr": {
      "timestamp": "2016-07-25T17:20:06.092157908Z",
      "value": "172.20.0.149"
    },
    "copy_of": {
      "timestamp": "2016-07-25T17:20:06.092157908Z",
      "value": ";10.244.1.62;47053"
    },
    "port": {
      "timestamp": "2016-07-25T17:20:06.092157908Z",
      "value": "47053"
    },
    "conntracked": {
      "timestamp": "2016-07-25T17:20:06.089607683Z",
      "value": "true"
    }
  },
  "parents": {},
  "children": null
}

counters, sets, controls and parents are not used and yet occupy space in the serialized report.

For ugorji's msgpack serializer, we would need to check whether the json:"omitempty" tag does this.

@2opremio
Contributor

2opremio commented Jul 27, 2016

Also, the "addr" and "port" fields of the LatestMap in the Endpoint topology are redundant since they are also present in the node keys. We should be able to remove them completely.

EDIT: Done in #2581

@2opremio
Contributor

2opremio commented Jul 28, 2016

Another thing we could do is to make sure that empty fields are not serialized/deserialized.

Not immediately possible without using pointers instead of structs for each field :S See http://stackoverflow.com/questions/18088294/how-to-not-marshal-an-empty-struct-into-json-with-go

The codec library we are using has the same problem, see https://godoc.org/github.com/ugorji/go/codec#Encoder.Encode

The empty values (for omitempty option) are false, 0, any nil pointer or interface value, and any array, slice, map, or string of length zero.

@2opremio
Contributor

I've created a feature request upstream: ugorji/go#163

@2opremio
Contributor

Related: some timestamps are recorded with second precision, some with nanosecond precision.

@rade I've just realized that those are probably the Docker stat metrics which I believe only have second-precision (the serializer omits the zeros for the nanoseconds).

@2opremio 2opremio mentioned this issue Jul 30, 2016
3 tasks
@2opremio 2opremio modified the milestones: July2016, August2016 Aug 2, 2016
@rade rade modified the milestones: 0.18/1.0, October2016 Sep 15, 2016
@rade rade removed this from the 0.18/1.0 milestone Sep 15, 2016
@rade rade added chore Related to fix/refinement/improvement of end user or new/existing developer functionality and removed tech-debt Unpleasantness that does (or may in future) affect development labels Jan 11, 2017