Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat map keys matching the same struct field as duplicates. #492

Merged
merged 2 commits into from
Feb 19, 2024

Conversation

benluddy
Copy link
Contributor

@benluddy benluddy commented Feb 14, 2024

Description

This resolves the case where two map keys could match the same struct field without being treated as
duplicate map keys. The DupMapKey documentation has been updated accordingly.

A map key that matches a previously-matched field will no longer quietly fall back to a
case-insensitive match and is now always treated as a duplicate key. Case-sensitive matching of map
key to struct field name is now a map lookup to improve the quadratic worst-case:

                                                                   │ manykeys-before.txt │         manykeys-after.txt          │
                                                                   │       sec/op        │   sec/op     vs base                │
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field          24.137µ ± 4%   5.240µ ± 1%  -78.29% (p=0.000 n=10)

                                                                   │ manykeys-before.txt │        manykeys-after.txt         │
                                                                   │        B/op         │   B/op    vs base                 │
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field            112.0 ± 0%   0.0 ± 0%  -100.00% (p=0.000 n=10)

                                                                   │ manykeys-before.txt │         manykeys-after.txt          │
                                                                   │      allocs/op      │ allocs/op   vs base                 │
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field            1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)

This fix cleared the way to optimize the way duplicate keys are tracked, especially for inputs that
contain no unknown fields:

                                                                               │  before.txt  │              after.txt              │
                                                                               │    sec/op    │   sec/op     vs base                │
UnmarshalMapToStruct/default_options/all_known_fields                             642.7n ± 3%   735.4n ± 0%  +14.42% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_known_duplicate_fields                  1611.5n ± 1%   521.1n ± 1%  -67.66% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_fields                           1.731µ ± 3%   1.174µ ± 0%  -32.18% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_duplicate_fields                 1.726µ ± 2%   1.076µ ± 0%  -37.66% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_fields                              638.3n ± 1%   725.5n ± 1%  +13.66% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_duplicate_fields                    476.1n ± 1%   514.7n ± 0%   +8.11% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_fields                            448.1n ± 1%   398.8n ± 1%  -11.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_duplicate_fields                  447.9n ± 0%   399.9n ± 1%  -10.71% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_fields                           1825.5n ± 0%   726.0n ± 0%  -60.23% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_duplicate_fields                  744.3n ± 0%   440.8n ± 0%  -40.78% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_fields                          2.885µ ± 1%   3.329µ ± 0%  +15.37% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_duplicate_fields                844.8n ± 0%   738.7n ± 1%  -12.56% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_fields               1828.5n ± 0%   723.4n ± 0%  -60.44% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_duplicate_fields      744.3n ± 0%   436.4n ± 0%  -41.36% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_fields              643.4n ± 0%   435.3n ± 0%  -32.34% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_duplicate_fields    642.1n ± 0%   435.6n ± 1%  -32.15% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field                              5.327µ ± 0%
geomean                                                                           936.7n        748.3n       -29.34%

                                                                               │ before.txt  │                after.txt                 │
                                                                               │    B/op     │    B/op      vs base                     │
UnmarshalMapToStruct/default_options/all_known_fields                             16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_known_duplicate_fields                   16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_fields                           16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_duplicate_fields                 16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_fields                              16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_duplicate_fields                    24.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_fields                           24.000 ± 0%    8.000 ± 0%   -66.67% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_duplicate_fields                 24.000 ± 0%    8.000 ± 0%   -66.67% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_fields                            800.0 ± 0%      0.0 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_duplicate_fields                 648.00 ± 0%    40.00 ± 0%   -93.83% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_fields                          800.0 ± 0%   1285.0 ± 0%   +60.62% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_duplicate_fields                648.0 ± 0%    248.0 ± 0%   -61.73% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_fields                800.0 ± 0%      0.0 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_duplicate_fields     648.00 ± 0%    40.00 ± 0%   -93.83% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_fields             616.00 ± 0%    24.00 ± 0%   -96.10% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_duplicate_fields   616.00 ± 0%    24.00 ± 0%   -96.10% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field                              0.000 ± 0%
geomean                                                                           113.6                     ?                       ¹ ²
¹ summaries must be >0 to compute geomean
² ratios must be >0 to compute geomean

                                                                               │ before.txt │                after.txt                │
                                                                               │ allocs/op  │ allocs/op   vs base                     │
UnmarshalMapToStruct/default_options/all_known_fields                            1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_known_duplicate_fields                  1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_fields                          1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_duplicate_fields                1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_fields                             1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_duplicate_fields                   2.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_fields                           2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_duplicate_fields                 2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_fields                           15.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_duplicate_fields                 5.000 ± 0%   2.000 ± 0%   -60.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_fields                         15.00 ± 0%   17.00 ± 0%   +13.33% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_duplicate_fields               5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_fields               15.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_duplicate_fields     5.000 ± 0%   2.000 ± 0%   -60.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_fields             4.000 ± 0%   2.000 ± 0%   -50.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_duplicate_fields   4.000 ± 0%   2.000 ± 0%   -50.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field                            0.000 ± 0%
geomean                                                                          3.043                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

PR Was Proposed and Welcomed in Currently Open Issue

Checklist (for code PR only, ignore for docs PR)

  • Include unit tests that cover the new code
  • Pass all unit tests
  • Pass all 18 ci linters (golint, gosec, staticcheck, etc.)
  • Sign each commit with your real name and email.
    Last line of each commit message should be in this format:
    Signed-off-by: Firstname Lastname firstname.lastname@example.com
  • Certify the Developer's Certificate of Origin 1.1
    (see next section).

Certify the Developer's Certificate of Origin 1.1

  • By marking this item as completed, I certify
    the Developer Certificate of Origin 1.1.
Developer Certificate of Origin
Version 1.1

Copyright (C) 2004, 2006 The Linux Foundation and its contributors.
660 York Street, Suite 102,
San Francisco, CA 94110 USA

Everyone is permitted to copy and distribute verbatim copies of this
license document, but changing it is not allowed.

Developer's Certificate of Origin 1.1

By making a contribution to this project, I certify that:

(a) The contribution was created in whole or in part by me and I
    have the right to submit it under the open source license
    indicated in the file; or

(b) The contribution is based upon previous work that, to the best
    of my knowledge, is covered under an appropriate open source
    license and I have the right under that license to submit that
    work with modifications, whether created in whole or in part
    by me, under the same open source license (unless I am
    permitted to submit under a different license), as indicated
    in the file; or

(c) The contribution was provided directly to me by some other
    person who certified (a), (b) or (c) and I have not modified
    it.

(d) I understand and agree that this project and the contribution
    are public and that a record of the contribution (including all
    personal information I submit with it, including my sign-off) is
    maintained indefinitely and may be redistributed consistent with
    this project or the open source license(s) involved.

Signed-off-by: Ben Luddy <bluddy@redhat.com>
This resolves the case where two map keys could match the same struct field without being treated as
duplicate map keys. The DupMapKey documentation has been updated accordingly.

A map key that matches a previously-matched field will no longer quietly fall back to a
case-insensitive match and is now always treated as a duplicate key. Case-sensitive matching of map
key to struct field name is now a map lookup to improve the quadratic worst-case:

                                                                   │ manykeys-before.txt │         manykeys-after.txt          │
                                                                   │       sec/op        │   sec/op     vs base                │
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field          24.137µ ± 4%   5.240µ ± 1%  -78.29% (p=0.000 n=10)

                                                                   │ manykeys-before.txt │        manykeys-after.txt         │
                                                                   │        B/op         │   B/op    vs base                 │
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field            112.0 ± 0%   0.0 ± 0%  -100.00% (p=0.000 n=10)

                                                                   │ manykeys-before.txt │         manykeys-after.txt          │
                                                                   │      allocs/op      │ allocs/op   vs base                 │
UnmarshalMapToStruct/default_options/many_fields_one_key_per_field            1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)

This fix cleared the way to optimize the way duplicate keys are tracked, especially for inputs that
contain no unknown fields:

                                                                               │  before.txt  │              after.txt              │
                                                                               │    sec/op    │   sec/op     vs base                │
UnmarshalMapToStruct/default_options/all_known_fields                             642.7n ± 3%   735.4n ± 0%  +14.42% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_known_duplicate_fields                  1611.5n ± 1%   521.1n ± 1%  -67.66% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_fields                           1.731µ ± 3%   1.174µ ± 0%  -32.18% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_duplicate_fields                 1.726µ ± 2%   1.076µ ± 0%  -37.66% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_fields                              638.3n ± 1%   725.5n ± 1%  +13.66% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_duplicate_fields                    476.1n ± 1%   514.7n ± 0%   +8.11% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_fields                            448.1n ± 1%   398.8n ± 1%  -11.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_duplicate_fields                  447.9n ± 0%   399.9n ± 1%  -10.71% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_fields                           1825.5n ± 0%   726.0n ± 0%  -60.23% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_duplicate_fields                  744.3n ± 0%   440.8n ± 0%  -40.78% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_fields                          2.885µ ± 1%   3.329µ ± 0%  +15.37% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_duplicate_fields                844.8n ± 0%   738.7n ± 1%  -12.56% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_fields               1828.5n ± 0%   723.4n ± 0%  -60.44% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_duplicate_fields      744.3n ± 0%   436.4n ± 0%  -41.36% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_fields              643.4n ± 0%   435.3n ± 0%  -32.34% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_duplicate_fields    642.1n ± 0%   435.6n ± 1%  -32.15% (p=0.000 n=10)
geomean                                                                           936.7n        748.3n       -29.34%

                                                                               │ before.txt  │                after.txt                 │
                                                                               │    B/op     │    B/op      vs base                     │
UnmarshalMapToStruct/default_options/all_known_fields                             16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_known_duplicate_fields                   16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_fields                           16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_duplicate_fields                 16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_fields                              16.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_duplicate_fields                    24.00 ± 0%     0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_fields                           24.000 ± 0%    8.000 ± 0%   -66.67% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_duplicate_fields                 24.000 ± 0%    8.000 ± 0%   -66.67% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_fields                            800.0 ± 0%      0.0 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_duplicate_fields                 648.00 ± 0%    40.00 ± 0%   -93.83% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_fields                          800.0 ± 0%   1285.0 ± 0%   +60.62% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_duplicate_fields                648.0 ± 0%    248.0 ± 0%   -61.73% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_fields                800.0 ± 0%      0.0 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_duplicate_fields     648.00 ± 0%    40.00 ± 0%   -93.83% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_fields             616.00 ± 0%    24.00 ± 0%   -96.10% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_duplicate_fields   616.00 ± 0%    24.00 ± 0%   -96.10% (p=0.000 n=10)
geomean                                                                           113.6                     ?                       ¹ ²
¹ summaries must be >0 to compute geomean
² ratios must be >0 to compute geomean

                                                                               │ before.txt │                after.txt                │
                                                                               │ allocs/op  │ allocs/op   vs base                     │
UnmarshalMapToStruct/default_options/all_known_fields                            1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_known_duplicate_fields                  1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_fields                          1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/default_options/all_unknown_duplicate_fields                1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_fields                             1.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_known_duplicate_fields                   2.000 ± 0%   0.000 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_fields                           2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown/all_unknown_duplicate_fields                 2.000 ± 0%   1.000 ± 0%   -50.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_fields                           15.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_known_duplicate_fields                 5.000 ± 0%   2.000 ± 0%   -60.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_fields                         15.00 ± 0%   17.00 ± 0%   +13.33% (p=0.000 n=10)
UnmarshalMapToStruct/reject_duplicate/all_unknown_duplicate_fields               5.000 ± 0%   5.000 ± 0%         ~ (p=1.000 n=10) ¹
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_fields               15.00 ± 0%    0.00 ± 0%  -100.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_known_duplicate_fields     5.000 ± 0%   2.000 ± 0%   -60.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_fields             4.000 ± 0%   2.000 ± 0%   -50.00% (p=0.000 n=10)
UnmarshalMapToStruct/reject_unknown_and_duplicate/all_unknown_duplicate_fields   4.000 ± 0%   2.000 ± 0%   -50.00% (p=0.000 n=10)
geomean                                                                          3.043                    ?                       ² ³
¹ all samples are equal
² summaries must be >0 to compute geomean
³ ratios must be >0 to compute geomean

Signed-off-by: Ben Luddy <bluddy@redhat.com>
@benluddy benluddy marked this pull request as ready for review February 14, 2024 15:16
@fxamacker
Copy link
Owner

@benluddy Thanks for opening this PR! The benchmarks you shared look great!

I'll take a look this weekend! 🙏

Copy link
Owner

@fxamacker fxamacker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! 👍 Thanks for spotting this issue and contributing this PR to resolve it! Also appreciate the added tests and benchmarks too!

Happy Presidents' Day!

Comment on lines +1905 to +1908
// For structs with typical field counts, expect that this can be
// stack-allocated.
var a [maxStackFields]bool
foundFldIdx = a[:nfields]
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice way to reduce heap allocations! 👍

Comment on lines +2095 to +2097
if mapKeys == nil {
mapKeys = make(map[interface{}]struct{}, 1)
}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for optimizing and adding code comments! 👍

@fxamacker fxamacker merged commit afbafc4 into fxamacker:master Feb 19, 2024
17 checks passed
@fxamacker fxamacker added this to the v2.7.0 milestone Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants