Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Code health] Remove Unicode Text from Source files #2707

Merged
merged 7 commits into from
Jun 20, 2024

Conversation

perhapsmaple
Copy link
Contributor

@perhapsmaple perhapsmaple commented Jun 19, 2024

Fixes #2706

Changes

Removed Unicode characters from API headers and opentracing-shim/src/span_shim.cc.

Escaped UTF-8 characters from test file data:

./ext/test/http/url_parser_test.cc: c program text, Unicode text, UTF-8 text
./sdk/test/metrics/instrument_metadata_validator_test.cc: c program text, Unicode text, UTF-8 text

For significant contributions please make sure you have completed the following items:

  • CHANGELOG.md updated for non-trivial changes
  • Unit tests have been added
  • Changes in public API reviewed

Signed-off-by: perhapsmaple <140232061+perhapsmaple@users.noreply.github.com>
Signed-off-by: perhapsmaple <140232061+perhapsmaple@users.noreply.github.com>
Signed-off-by: perhapsmaple <140232061+perhapsmaple@users.noreply.github.com>
@perhapsmaple perhapsmaple requested a review from a team June 19, 2024 16:03
@marcalff
Copy link
Member

For UTF-8 character still present, like in:

./ext/test/http/url_parser_test.cc: c program text, Unicode text, UTF-8 text

The offending source code is:

      {"%C3%B6%C3%A0%C2%A7%C3%96abcd%C3%84", "öà§ÖabcdÄ"},

Instead of using the character ö in the expected string, which corresponds to C3B6:

U+00F6	ö	c3 b6	LATIN SMALL LETTER O WITH DIAERESIS

could the code be changed to use unicode escape sequences like \u00F6 instead, so all the code is UTF-8 clean.

Copy link

codecov bot commented Jun 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 87.67%. Comparing base (497eaf4) to head (0a8dc8d).
Report is 86 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2707      +/-   ##
==========================================
+ Coverage   87.12%   87.67%   +0.55%     
==========================================
  Files         200      190      -10     
  Lines        6109     5853     -256     
==========================================
- Hits         5322     5131     -191     
+ Misses        787      722      -65     
Files Coverage Δ
api/include/opentelemetry/baggage/baggage.h 97.30% <ø> (ø)
api/include/opentelemetry/context/context.h 100.00% <ø> (ø)
...pi/include/opentelemetry/context/runtime_context.h 97.50% <ø> (-0.09%) ⬇️

... and 105 files with indirect coverage changes

Copy link
Member

@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix.

See comments, to also replace UTF-8 characters present in test files.

This will affect the build when compiling with unit tests, so it is desirable to also clean test code, even if used only optionally.

Signed-off-by: perhapsmaple <140232061+perhapsmaple@users.noreply.github.com>
@marcalff
Copy link
Member

Instead of using the character ö in the expected string, which corresponds to C3B6:

U+00F6	ö	c3 b6	LATIN SMALL LETTER O WITH DIAERESIS

could the code be changed to use unicode escape sequences like \u00F6 instead, so all the code is UTF-8 clean.

... and of course my suggestion does not work for windows / msvc for some reason (CI failures), sorry about that.

From what I could find, windows may need a std::wstring and a L"\u00F6" literal, but since no code in opentelemetry-cpp uses wstring currently, this is likely to open a can of worms.

A possible work around is to code the UTF-8 data instead : "\xC3\xB6", please try.

If this still fails, don't waste time on it, and remove changes from the two test files, these will be fixed separately.

Thanks

Signed-off-by: perhapsmaple <140232061+perhapsmaple@users.noreply.github.com>
Signed-off-by: perhapsmaple <140232061+perhapsmaple@users.noreply.github.com>
Signed-off-by: perhapsmaple <140232061+perhapsmaple@users.noreply.github.com>
@marcalff marcalff changed the title Remove Unicode Text from Source files [Code health] Remove Unicode Text from Source files Jun 20, 2024
Copy link
Member

@marcalff marcalff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the fix.

@marcalff marcalff merged commit e1d9690 into open-telemetry:main Jun 20, 2024
51 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Source Files contain Unicode Text
3 participants