Index time boost in multi_field ignored? #4108
Comments
Any comments? This is a real showstopper for us! I did a big rework (considering the size of our mappings) to get away from _all, because I needed a combined field like _all but analyzed differently (stemmed, shingled, regular...), only to find that multi_field is unusable for us because, unlike _all, it does not take the boost of each contributing field into account. Please let me know whether this is a bug that can be fixed, or whether it is not possible and I must go back to using _all :-( |
This is again related to the use of I would keep a variation with unique name so that the boost will be taken into account, as they will actually be different lucene fields. Otherwise you could just drop index time boosting and use a multi_match query against multiple fields, giving a different weight to each field. |
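A minimal sketch of the second suggestion above (dropping index-time boosts and weighting fields at query time with multi_match). The field names `title` and `content` and the helper `weighted_multi_match` are hypothetical; the `field^weight` notation is the standard query DSL way to boost a field inside `multi_match`.

```python
import json


def weighted_multi_match(text, weights):
    """Build a multi_match request body with a query-time weight per field.

    `weights` maps a (hypothetical) field name to its boost factor.
    """
    fields = [f"{name}^{weight}" for name, weight in weights.items()]
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# Example: matches in "title" count three times as much as matches in "content".
body = weighted_multi_match("quick fox", {"title": 3, "content": 1})
print(json.dumps(body, indent=2))
```

The resulting body can be sent to the `_search` endpoint; because the weights live in the query, they can be tuned without reindexing.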
Thank you @javanna, but in this case my intention is to combine data from multiple properties into a single field that acts like _all. _all does support boost based on which field contributes to it: "One of the nice features of the _all field is that it takes into account specific fields boost levels. Meaning that if a title field is boosted more than content, the title (part) in the _all field will mean more than the content (part) in the _all field." Boost can be applied to individual tokens, right? What I expected is that each individual property contributing to the shared (collapsed) multi_field would mark its content with its defined boost. |
I agree with @roytmana - the field boosts are retained when indexing into In fact, I'd say that this is the one place where field-level index time boosting has a purpose. |
Yeah I see your point guys, I agree, looking into it :) |
What's more, I expected analyzers and position offset gaps to be honored per contributing field, so that we have fine-grained control over how such a combined field gets put together. For example, I use phrase searches and want to make sure a phrase does not match across content from different contributing fields; I would use position_offset_gap for such fields. Or I want some fields to contribute stemmed content and a few others (say, people's names or some codes) unstemmed content, etc. This is what makes it so powerful. And thanks for looking into it, I was getting kind of desperate about this issue being "ignored". I banked a lot of my design on the power of multi_field. Now, when is it going to be fixed? :-) ha ha |
Are they not? I was pretty sure they were. If not, could you provide a recreation? |
Maybe they are working; I guess I was dramatizing a bit :-) after struggling with multi_field and related highlighting issues. (I feel the current naming of the multi_field primary field via index_name when using 'just_name' is very unintuitive, see #4123: it lumps all primary fields into one Lucene field, which in 99% of cases is not what I would expect.) I will test it later tonight or tomorrow and report back. |
I checked how the The reason why I said in the first place that it doesn't make sense to have more than one boost for the same field is that index time boosting is per field, using field norms, thus only one value per field. To work around this the But we should definitely take this into account in the discussion on #4099 regarding future improvements. |
Oh no! Back to using the _all field then :-( Oh well, I hope you will be able to pull a miracle out of the hat :-) |
@roytmana you can achieve what you are after (ie per-field index time boosts) by querying both your custom
Then if you wanted to give a slight boost to the
You could even use the
And this would probably be more efficient (and certainly more flexible) than using payloads to implement field-level index time boosts on custom |
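A rough sketch of the kind of request this comment describes, under stated assumptions: the custom combined field is called `my_all`, the boosted fields `title` and `summary` are hypothetical, and the helper names are made up for illustration. The main query matches on the combined field, `should` clauses add query-time boosts for the important fields, and an optional `rescore` section re-ranks only the top hits.

```python
import json


def combined_query(text, all_field="my_all", boosts=None):
    """Match on the combined field; boost selected fields via should clauses.

    `all_field` and the keys of `boosts` are hypothetical field names.
    """
    boosts = boosts or {}
    should = [
        {"match": {field: {"query": text, "boost": boost}}}
        for field, boost in sorted(boosts.items())
    ]
    return {
        "query": {
            "bool": {
                "must": [{"match": {all_field: text}}],
                "should": should,
            }
        }
    }


def with_rescore(body, text, field, window_size=50, weight=2.0):
    """Return a copy of `body` that re-ranks the top `window_size` hits per shard."""
    rescored = dict(body)
    rescored["rescore"] = {
        "window_size": window_size,
        "query": {
            "rescore_query": {"match": {field: text}},
            "rescore_query_weight": weight,
        },
    }
    return rescored


body = combined_query("quick fox", boosts={"title": 3.0, "summary": 1.5})
rescored = with_rescore(body, "quick fox", "title")
print(json.dumps(rescored, indent=2))
```

Since the boosts are applied at query time, they can be adjusted per request; the rescore window keeps the extra scoring work limited to the best candidates.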
Thank you @clintongormley. I did not think of re-score, but I did use the first approach. My issue is that I have highly structured data: over 100 fields, and that is just the beginning. Half of them are not very useful, but I can't afford not to include them in my search; I need to massively de-emphasize them or they will drown out the useful results. The other half provides a good search corpus, with some fields being more important than others. Out of this half, there is a handful of highly relevant fields that get a special boost. I guess I could go this route, with a should clause listing all the "important" fields (over 70 now) with various boosts, making sure they are analyzed the same as _all (or my _all-like multi_field), but I do not know if it will scale in terms of complexity as the number of my fields triples in the next version. I am also not sure whether it can cause performance issues. |
Closing this issue in favour of #4520, which will take care of custom |
I use multi_field to index the content of multiple properties into a single _all-like multi_field. Each contributing property may define a different boost when defining its instance of the multi_field. These boosts are, however, ignored when searching against such a multi_field, while they are honored when searching against _all.
A full recreation is here: https://gist.github.com/roytmana/7330956
It compares the same query against my multi_field all and ES _all.
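For readers without access to the gist, here is a minimal sketch of the kind of mapping being described. It is not the gist's content: the type name `doc`, the properties `title` and `content`, the shared sub-field name `all`, and the helper `all_like_multi_field` are all assumptions, and the legacy `multi_field` / `path: just_name` syntax is reproduced as best understood from mappings of that era.

```python
import json


def all_like_multi_field(name, boost):
    """Legacy-style multi_field whose extra sub-field feeds a shared "all" field.

    With path "just_name", every property's "all" sub-field is indexed under
    the same name, which is what makes it behave like a custom _all.
    """
    return {
        "type": "multi_field",
        "path": "just_name",
        "fields": {
            name: {"type": "string"},                   # the property itself
            "all": {"type": "string", "boost": boost},  # contribution to shared field
        },
    }


# Hypothetical mapping: title contributes to "all" with a higher boost than content.
mapping = {
    "doc": {
        "properties": {
            "title": all_like_multi_field("title", 3.0),
            "content": all_like_multi_field("content", 1.0),
        }
    }
}
print(json.dumps(mapping, indent=2))
```

The report is that a query against the shared `all` field scores title and content matches alike, whereas the same boosts set on the properties are reflected when querying `_all`.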