-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Makes request headers lowercase in English locale. #29419
Conversation
Wait why isn’t @elasticmachine here?! |
We have disabled the auto-pings from @elasticmachine; due to having multiple PR CI jobs that can be triggered by @elasticmachine, it was leading to multiple pings from the bot reminding us on community PRs. These days, we are all in the right mindset on how to handle community PRs so we felt the value of the pings has diminished over time, and removing them was better than subjecting everyone to multiple pings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a comment. This does not constitute a full review.
} | ||
for (Header defaultHeader : defaultHeaders) { | ||
if (requestNames.contains(defaultHeader.getName()) == false) { | ||
if (!requestNames.contains(defaultHeader.getName().toLowerCase(Locale.ENGLISH))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like it or not, our style is to use == false
instead of !
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ll change it back then!
Pinging @elastic/es-core-infra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @adityasrini thanks for your PR. Could you please add a test that fails against current master, and is successful with your fix?
@javanna On it! |
} | ||
for (Header defaultHeader : defaultHeaders) { | ||
if (requestNames.contains(defaultHeader.getName()) == false) { | ||
if (requestNames.contains(defaultHeader.getName().toLowerCase(Locale.ENGLISH)) == false) { | ||
httpRequest.addHeader(defaultHeader); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adityasrini
Rather than doing the two String.toLowerCase(Locale.ENGLISH) which requires 2 changes, you should replace the new HashMap() with a new TreeMap(String.CASE_INSENSITIVE_ORDER).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't necessarily agree on this. We don't need a sorted map here, we just need to make sure that keys are case insensitive, which is one of the properties of treemap when used in this way, but we don't need all of its other properties which also affect that data structure that's internally used to store entries. Makes sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry my suggestion should have been to replace the new HashSet()
with new TreeSet(String.CASE_INSENSITIVE_ORDER)
at line 433 , and remove the 2x toLowerCase(Locale.ENGLISH)
additions, obviously mentioning TreeMap was nonsense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't necessarily agree with this either, we would be introducing a sorted data structure where we don't need ordering, but only case-insensitive lookups. Not sure it is a good trade-off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we would be introducing a sorted data structure where we don't need ordering, but only case-insensitive lookups.
The internal structure of treeset doesnt matter, any more than the internals of hashset matter.
Hashsets also sort/arrange their keys, it might not be alphabetically, but the entire buckets thing also has its own system sorting system when it allocates keys into a chain of buckets, but who cares.
All that matters is the *set allows us to determine if some key already exists in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe in this case it won't make a huge difference, but HashSet
is backed by a HashMap
while TreeSet
is back by a TreeMap
. TreeMap
is implemented using a red-black tree, which is very different compared to HashMap
, the reason being that the former will have to re-balance the tree to make it possible to iterate through entries in their natural ordering. We are using a set here though just to quickly check if it already contains something, and we never iterate through it. Conceptually, I still like more the two calls to lowercase than using a red-black tree just because treeset allows to make strings case-insensitive. Hopefully I explained what I meant. It would be interesting to measure which of the two solutions is faster, not sure it's worth it in this case given that this is probably not a hotspot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TreeMap is implemented using a red-black tree, which is very different compared to HashMap,
Today TreeMap is a red black tree, tomorrow it might change, it doesnt matter, all we care about is that it keeps the contract that allows case insensitive look ups of keys. We can be sure they are close enough to the same speed, the jdk collection guys know this important they arent going to make one 10x slower than the other.
A HashMap turns into a tree when it gets really full anyway, which basicaly means the jdk guys dont care why should you ?
The implementation notes comment in HashMap is a better description of HashMap's operation than I could write myself. The relevant parts for understanding the tree nodes and their ordering are:
This map usually acts as a binned (bucketed) hash table, but when bins get too large, they are transformed into bins of TreeNodes, each structured similarly to those in java.util.TreeMap. [...] Bins of TreeNodes may be traversed and used like any others, but additionally support faster lookup when overpopulated. [...]
which is very different compared to HashMap, the reason being that the former will have to re-balance the tree to make it possible to iterate through entries in their natural ordering
And a hashmap also has to rebalance when it hits some threshold. To solve that we simply call the ctor that takes initial size so in both case the rebalance never happens.
We are using a set here though just to quickly check if it already contains something, and we never iterate through it.
Thats right so why talk about something that never happens.
It would be interesting to measure which of the two solutions is faster, not sure it's worth it in this case given that this is probably not a hotspot
Except that enough of this is, because HM and TM are used everywhere and will be compiled by hotspot. If one is compiled so will be the other and vice versa.
Why ? This path takes 100s/100s of cycles vs billions for a complete request, it doesnt matter. Even if one is 2x or 10x sloewr the total request time will be basically the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to keeping a hash set. Moving to a tree set would make operations perform in O(log(size))
rather than constant time. Even if this set would usually be small so that it wouldn't matter, we have seen in the past that users are sometimes very creative when it comes to pushing the system to its boundaries.
If there is any lower casing done here, it should be with the root locale, not English. |
Hi all, I really apologize for not having got back to this in a bit! I’ve been swamped by another database/API related project! Rest assured that I haven’t abandoned this PR and will be working on test cases this weekend! |
Hey everyone I'm finding an error with most tests in However when I call |
hi @adityasrini thanks for raising this, I opened #30776 to remove endless waits and improve |
@javanna Thanks for raising that issue! I'll use the other integration test class. |
given that we have decided (see #30616) to remove support for default headers, as they are already supported by the apache http client which we use to perform requests, I am going to close this PR. Thanks anyway @adityasrini ! |
Converts requestHeader names to lowercase in the English Locale. Makes overriding default headers in RestClient possible. Converted the
requestNames.contains... == false
statement to!requestNames.contains...
. Closes #22623