Add new configuration option to limit the size of string attribute values #1484

Merged 9 commits on Aug 3, 2020

@iNikem iNikem commented Jul 30, 2020

Closes #1478

I have implemented a simpler version for now: limits are specified in characters, not bytes. I think it is not ideal, but it is probably an acceptable compromise between precision and complexity.
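
To illustrate the characters-versus-bytes trade-off mentioned above, here is a minimal sketch. `TruncateDemo`, `truncateToChars`, and `utf8Length` are hypothetical names for illustration, not the code in this PR's `StringUtils`. It assumes "characters" means Java `char` values (UTF-16 code units), so the encoded byte size of a truncated value still depends on its content.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical helper names, not the SDK's StringUtils.
final class TruncateDemo {

  // Keeps at most maxChars UTF-16 code units. Assumes maxChars >= 0.
  // Note: a cut at an arbitrary index can split a surrogate pair.
  static String truncateToChars(String value, int maxChars) {
    if (value == null || value.length() <= maxChars) {
      return value; // nothing to cut, no new String is created
    }
    return value.substring(0, maxChars);
  }

  // Size of the value once encoded as UTF-8.
  static int utf8Length(String value) {
    return value.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    String ascii = truncateToChars("abcdefghij", 4);
    // Ten copies of U+00E9, written with escapes to avoid source-encoding issues.
    String accented =
        truncateToChars("\u00e9\u00e9\u00e9\u00e9\u00e9\u00e9\u00e9\u00e9\u00e9\u00e9", 4);
    System.out.println(ascii.length() + " chars, " + utf8Length(ascii) + " UTF-8 bytes");
    System.out.println(accented.length() + " chars, " + utf8Length(accented) + " UTF-8 bytes");
  }
}
```

Both results are four characters long, but the second one takes twice as many bytes in UTF-8, which is the imprecision the description accepts in exchange for simpler code.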

codecov bot commented Jul 30, 2020

Codecov Report

Merging #1484 into master will decrease coverage by 0.16%.
The diff coverage is 75.75%.

@@             Coverage Diff              @@
##             master    #1484      +/-   ##
============================================
- Coverage     92.50%   92.33%   -0.17%     
- Complexity      938      951      +13     
============================================
  Files           117      117              
  Lines          3348     3380      +32     
  Branches        270      281      +11     
============================================
+ Hits           3097     3121      +24     
- Misses          170      172       +2     
- Partials         81       87       +6     
Impacted Files                                          Coverage Δ                 Complexity Δ
...io/opentelemetry/sdk/trace/config/TraceConfig.java   95.00% <62.50%> (-5.00%)   6.00 <2.00> (+2.00) ⬇️
...in/java/io/opentelemetry/internal/StringUtils.java   80.00% <76.19%> (-8.89%)   17.00 <9.00> (+9.00) ⬇️
...ntelemetry/sdk/trace/RecordEventsReadableSpan.java   94.31% <100.00%> (+0.05%)  78.00 <0.00> (+1.00)
...ava/io/opentelemetry/sdk/trace/SpanBuilderSdk.java   96.00% <100.00%> (+0.06%)  45.00 <0.00> (+1.00)

@@ -89,6 +93,7 @@
private static final int DEFAULT_SPAN_MAX_NUM_LINKS = 32;
private static final int DEFAULT_SPAN_MAX_NUM_ATTRIBUTES_PER_EVENT = 32;
private static final int DEFAULT_SPAN_MAX_NUM_ATTRIBUTES_PER_LINK = 32;
private static final int DEFAULT_KEY_SPAN_ATTRIBUTE_MAX_VALUE_LENGTH = 0;

Contributor

Is zero a good default? Shouldn't it be Integer.MAX_VALUE?

Contributor

Even better, make the default a null Integer and get rid of this magic number.

Contributor Author

All other methods here return int. Do you want this one to return Integer?

With the default value being Integer.MAX_VALUE, the code will look like if(traceConfig.getMaxLengthOfAttributeValues() < Integer.MAX_VALUE) truncate(), which looks strange to me.

Member

I also agree that 0 is strange as a default value. I would set it to a more common magic number, e.g. 255.

Contributor

How about if(traceConfig.getMaxLengthOfAttributeValues() != TraceConfig.DEFAULT_MAX_ATTRIBUTE_LENGTH) ? I think that reads great, makes it clear, and doesn't use '0' as a magic number.

Contributor

I think I prefer MAX_VALUE to the nullable, but I don't feel super strongly about it.

Contributor Author

Wait, that does not make sense. Currently I attempt to truncate values if the configured value differs from "unlimited", which is signalled by 0. Your proposal says "let's try to truncate if the configured value is not the default". But what if we change the default to, say, 2M? Then if(traceConfig.getMaxLengthOfAttributeValues() != TraceConfig.DEFAULT_MAX_ATTRIBUTE_LENGTH) will not work anymore. Comparing with the default is wrong, because the default may change. We have to compare with "unlimited", however we denote that. And I like traceConfig.getMaxLengthOfAttributeValues() > 0 more than traceConfig.getMaxLengthOfAttributeValues() < Integer.MAX_VALUE.

Contributor

I'd be happier with -1, rather than 0. 0 is actually a valid length, and -1 clearly is not. And, I'd rather have a constant defined for the "unset" value, rather than just hardcode the magic number in two places.

Contributor Author

@jkwatson PTAL

Member

👍 I like -1 for unlimited, I think that's reasonably conventional
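
The outcome of this thread (a named constant for "unlimited" with the value -1, and a check against that constant rather than against the default) could look roughly like the sketch below. The class and member names (`AttributeLimit`, `UNLIMITED`, `shouldTruncate`, `apply`) are assumptions made for illustration; the merged `TraceConfig` may name and structure things differently.

```java
// Illustrative only: hypothetical names, not the merged TraceConfig API.
final class AttributeLimit {

  // Sentinel for "no limit". -1 is never a valid length, whereas 0 is.
  static final int UNLIMITED = -1;

  private final int maxLength;

  AttributeLimit(int maxLength) {
    if (maxLength < UNLIMITED) {
      throw new IllegalArgumentException("maxLength must be >= 0 or UNLIMITED");
    }
    this.maxLength = maxLength;
  }

  // Compares against the sentinel, not against whatever the default happens to be,
  // so the check keeps working if the default is later changed to a real limit.
  boolean shouldTruncate() {
    return maxLength != UNLIMITED;
  }

  String apply(String value) {
    if (!shouldTruncate() || value == null || value.length() <= maxLength) {
      return value;
    }
    return value.substring(0, maxLength);
  }

  public static void main(String[] args) {
    System.out.println(new AttributeLimit(UNLIMITED).apply("hello"));
    System.out.println(new AttributeLimit(3).apply("hello"));
  }
}
```

With this shape, a limit of 0 remains expressible (every string value becomes empty), and the magic number lives in one place.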

iNikem commented Jul 31, 2020

Should I do any performance testing? If yes, then how? Do we have some testbed/benchmark?

jkwatson commented

> Should I do any performance testing? If yes, then how? Do we have some testbed/benchmark?

There are some benchmarks in the sdk/src/jmh directory, but I'm not sure if any of them would be relevant. Please add something there that would exercise this, if you have time.

iNikem commented Aug 2, 2020

I have added a benchmark, see SpanAttributeTruncateBenchmark.

Benchmark                                                                       (maxLength)   Mode  Cnt         Score        Error   Units
SpanAttributeTruncateBenchmark.longAttributes                                            10  thrpt   10      1842.242 ±     30.862  ops/ms
SpanAttributeTruncateBenchmark.longAttributes:·gc.alloc.rate                             10  thrpt   10      3079.287 ±     49.667  MB/sec
SpanAttributeTruncateBenchmark.longAttributes:·gc.alloc.rate.norm                        10  thrpt   10      2632.000 ±      0.001    B/op
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Eden_Space                    10  thrpt   10      3086.074 ±    175.322  MB/sec
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Eden_Space.norm               10  thrpt   10      2638.152 ±    159.422    B/op
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Old_Gen                       10  thrpt   10         0.002 ±      0.004  MB/sec
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Old_Gen.norm                  10  thrpt   10         0.002 ±      0.003    B/op
SpanAttributeTruncateBenchmark.longAttributes:·gc.count                                  10  thrpt   10       136.000               counts
SpanAttributeTruncateBenchmark.longAttributes:·gc.time                                   10  thrpt   10        84.000                   ms
SpanAttributeTruncateBenchmark.longAttributes                                       1000000  thrpt   10      2106.260 ±     37.616  ops/ms
SpanAttributeTruncateBenchmark.longAttributes:·gc.alloc.rate                        1000000  thrpt   10      2770.369 ±     51.017  MB/sec
SpanAttributeTruncateBenchmark.longAttributes:·gc.alloc.rate.norm                   1000000  thrpt   10      2072.000 ±      0.001    B/op
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Eden_Space               1000000  thrpt   10      2785.632 ±    195.567  MB/sec
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Eden_Space.norm          1000000  thrpt   10      2082.859 ±    118.751    B/op
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Old_Gen                  1000000  thrpt   10         0.002 ±      0.001  MB/sec
SpanAttributeTruncateBenchmark.longAttributes:·gc.churn.G1_Old_Gen.norm             1000000  thrpt   10         0.002 ±      0.001    B/op
SpanAttributeTruncateBenchmark.longAttributes:·gc.count                             1000000  thrpt   10       121.000               counts
SpanAttributeTruncateBenchmark.longAttributes:·gc.time                              1000000  thrpt   10        76.000                   ms
SpanAttributeTruncateBenchmark.shortAttributes                                           10  thrpt   10      2214.400 ±     63.958  ops/ms
SpanAttributeTruncateBenchmark.shortAttributes:·gc.alloc.rate                            10  thrpt   10      2687.871 ±     78.316  MB/sec
SpanAttributeTruncateBenchmark.shortAttributes:·gc.alloc.rate.norm                       10  thrpt   10      1912.000 ±      0.001    B/op
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Eden_Space                   10  thrpt   10      2694.988 ±    168.354  MB/sec
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Eden_Space.norm              10  thrpt   10      1916.998 ±    103.322    B/op
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Old_Gen                      10  thrpt   10         0.002 ±      0.002  MB/sec
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Old_Gen.norm                 10  thrpt   10         0.002 ±      0.002    B/op
SpanAttributeTruncateBenchmark.shortAttributes:·gc.count                                 10  thrpt   10       117.000               counts
SpanAttributeTruncateBenchmark.shortAttributes:·gc.time                                  10  thrpt   10        73.000                   ms
SpanAttributeTruncateBenchmark.shortAttributes                                      1000000  thrpt   10      2164.073 ±     50.902  ops/ms
SpanAttributeTruncateBenchmark.shortAttributes:·gc.alloc.rate                       1000000  thrpt   10      2847.796 ±     67.014  MB/sec
SpanAttributeTruncateBenchmark.shortAttributes:·gc.alloc.rate.norm                  1000000  thrpt   10      2072.000 ±      0.001    B/op
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Eden_Space              1000000  thrpt   10      2860.692 ±    187.860  MB/sec
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Eden_Space.norm         1000000  thrpt   10      2080.790 ±    100.616    B/op
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Old_Gen                 1000000  thrpt   10         0.003 ±      0.003  MB/sec
SpanAttributeTruncateBenchmark.shortAttributes:·gc.churn.G1_Old_Gen.norm            1000000  thrpt   10         0.002 ±      0.002    B/op
SpanAttributeTruncateBenchmark.shortAttributes:·gc.count                            1000000  thrpt   10       131.000               counts
SpanAttributeTruncateBenchmark.shortAttributes:·gc.time                             1000000  thrpt   10        81.000                   ms
SpanAttributeTruncateBenchmark.veryLongAttributes                                        10  thrpt   10      1828.598 ±     50.072  ops/ms
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.alloc.rate                         10  thrpt   10      3053.819 ±     81.810  MB/sec
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.alloc.rate.norm                    10  thrpt   10      2632.000 ±      0.001    B/op
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Eden_Space                10  thrpt   10      3073.256 ±    113.942  MB/sec
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Eden_Space.norm           10  thrpt   10      2649.243 ±    107.161    B/op
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Old_Gen                   10  thrpt   10         0.004 ±      0.004  MB/sec
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Old_Gen.norm              10  thrpt   10         0.003 ±      0.004    B/op
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.count                              10  thrpt   10       129.000               counts
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.time                               10  thrpt   10        82.000                   ms
SpanAttributeTruncateBenchmark.veryLongAttributes                                   1000000  thrpt   10         1.046 ±      0.029  ops/ms
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.alloc.rate                    1000000  thrpt   10      6646.052 ±    187.035  MB/sec
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.alloc.rate.norm               1000000  thrpt   10  10002632.475 ±      0.384    B/op
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Eden_Space           1000000  thrpt   10        11.213 ±      0.742  MB/sec
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Eden_Space.norm      1000000  thrpt   10     16878.576 ±   1147.622    B/op
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Old_Gen              1000000  thrpt   10      6968.556 ±    325.177  MB/sec
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.churn.G1_Old_Gen.norm         1000000  thrpt   10  10489054.338 ± 459834.403    B/op
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.count                         1000000  thrpt   10       157.000               counts
SpanAttributeTruncateBenchmark.veryLongAttributes:·gc.time                          1000000  thrpt   10       195.000                   ms

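Unsurprisingly, truncating strings takes time. Truncating very long strings to long strings takes a lot of time: veryLongAttributes with maxLength 1000000 runs at about 1 ops/ms and allocates about 10 MB per operation, versus roughly 1,800 to 2,200 ops/ms and about 1.9 to 2.6 KB per operation in every other case.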

jkwatson commented Aug 2, 2020

Looks like there are still some checkstyle issues to be resolved.

jkwatson commented Aug 3, 2020

Just curious...what JVM version did you run the benchmarks on? Do they change much with a different version?

@jkwatson jkwatson merged commit 70e8433 into open-telemetry:master Aug 3, 2020

iNikem commented Aug 3, 2020

  1. Haven't tried with different versions. If I remember, can do :)

Development

Successfully merging this pull request may close these issues.

Allow to configure maximum attribute value size
5 participants