Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kaizen: add equals-ignore-case pattern #340

Merged
merged 5 commits into from
Jul 22, 2024
Merged

kaizen: add equals-ignore-case pattern #340

merged 5 commits into from
Jul 22, 2024

Conversation

timbray
Copy link
Owner

@timbray timbray commented Jul 20, 2024

addresses: #186

This adds support for the equals-ignore-case pattern. The automaton-building is a bit complex but reasonably well tested.

What is controversial is that this PR contains Quamina's first generated code - the case-mappings are pulled from the file CaseFolding.txt in the Unicode Character Database. The generated code is in case_folding.go. It is built by code_gen/build_casefolding_table.go which is in package main and has main()

The problem is that the CaseFolding.txt is regularly changed with Unicode releases. So build_casefolding_table checks to see if the case_folding.go file is older than 3 months and if so, rebuilds it. There is a Makefile to take care of building and running the code generator. So if you're doing a PR you should at some point type make and if you get a new case_folding.go then check it in with your PR.

There are a couple of problems. First, I've never done generated-Go code before and maybe there's a much better well-known approach. Second, I haven't figured out how to unit test the code generator. The coverage tools ignore it because it's not in the quamina package. Implicitly, it's tested by the tests in monocase_test.go and elsewhere having the desired results but I'm going to open an issue to figure out how to test properly.

addresses: #186
Signed-off-by: Tim Bray <tbray@textuality.com>
@codecov-commenter
Copy link

codecov-commenter commented Jul 20, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 97.50000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 96.59%. Comparing base (8da5067) to head (0a03480).

Files Patch % Lines
monocase.go 96.07% 1 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #340      +/-   ##
==========================================
+ Coverage   96.56%   96.59%   +0.03%     
==========================================
  Files          18       19       +1     
  Lines        1863     1940      +77     
==========================================
+ Hits         1799     1874      +75     
- Misses         36       37       +1     
- Partials       28       29       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@timbray
Copy link
Owner Author

timbray commented Jul 20, 2024

Hmm, surprised by the benchmark failures, I didn't think I changed anything that should affect an existing benchmark. Will investigate. Could be trailing fallout from the Q-numbers work.

@arnehormann
Copy link
Collaborator

@timbray Go itself (the x package is semi-official and comes close to the standard library) appears to do it this way:
https://cs.opensource.google/go/x/text/+/master:gen.go
with generated code like this: https://cs.opensource.google/go/x/text/+/master:cases/tables15.0.0.go

@timbray
Copy link
Owner Author

timbray commented Jul 20, 2024

This is annoying; the same problem with the same benchmark showed up on the Q numbers PR, and I accepted the numeric-match slowdown in the interests of correctness. So I changed the CI/CD to warn but not fail, and then after everything was clean I put the CI/CD back to fail mode, assuming that would reset the cached previous value. Apparently this did not happen; I have poked around and failed (so far) to figure out how/where the benchmark tool caches previous results.

Anyhow, for now, I'm going to reset the CI/CD again not to fail on slowdown.

timbray added 2 commits July 20, 2024 11:53
Signed-off-by: Tim Bray <tbray@textuality.com>
Signed-off-by: Tim Bray <tbray@textuality.com>
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Go Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite Current: 0a03480 Previous: 78e2ec8 Ratio
BenchmarkCityLots 6894 ns/op 830 B/op 34 allocs/op 5592 ns/op 773 B/op 31 allocs/op 1.23
BenchmarkCityLots - ns/op 6894 ns/op 5592 ns/op 1.23
Benchmark_JsonFlattner_Evaluate_ContextFields 1247 ns/op 96 B/op 8 allocs/op 726.2 ns/op 56 B/op 4 allocs/op 1.72
Benchmark_JsonFlattner_Evaluate_ContextFields - ns/op 1247 ns/op 726.2 ns/op 1.72
Benchmark_JsonFlattner_Evaluate_ContextFields - B/op 96 B/op 56 B/op 1.71
Benchmark_JsonFlattner_Evaluate_ContextFields - allocs/op 8 allocs/op 4 allocs/op 2

This comment was automatically generated by workflow using github-action-benchmark.

@timbray timbray merged commit 6d34146 into main Jul 22, 2024
7 checks passed
@timbray timbray deleted the ignore-case branch July 22, 2024 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants