Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Supply a custom Sort.Order providing Elasticsearch specific parameters #1911

Closed
sothawo opened this issue Aug 26, 2021 · 17 comments · Fixed by #1955
Closed

Supply a custom Sort.Order providing Elasticsearch specific parameters #1911

sothawo opened this issue Aug 26, 2021 · 17 comments · Fixed by #1955
Assignees
Labels
type: enhancement A general enhancement

Comments

@sothawo
Copy link
Collaborator

sothawo commented Aug 26, 2021

Spring Data Elasticsearch should supply a Sort.Order derived class that can provide custom parameters like ignore_unmapped (see #1908) for an example where this is needed

@sothawo sothawo added the type: enhancement A general enhancement label Aug 26, 2021
@sothawo sothawo added this to To do in what's up next? Sep 10, 2021
@sothawo sothawo self-assigned this Oct 6, 2021
@sothawo sothawo moved this from To do to In progress in what's up next? Oct 6, 2021
what's up next? automation moved this from In progress to Done Oct 8, 2021
sothawo added a commit that referenced this issue Oct 8, 2021
@sothawo sothawo added this to the 4.3 RC1 (2021.1.0) milestone Oct 8, 2021
@alan-czajkowski
Copy link

can I see an example of a Sort.Order that sets ignore_unmapped to true?

@sothawo
Copy link
Collaborator Author

sothawo commented Mar 19, 2022

for example

Sort sort = Sort.by(new GeoDistanceOrder("property", new GeoPoint(47.5, 6.4)).withIgnoreUnmapped(false));

@alan-czajkowski
Copy link

@sothawo I'm trying to sort on a Java Date field (which appears to map to a long type), how would I do it?

@sothawo
Copy link
Collaborator Author

sothawo commented Mar 19, 2022

ignore_unmapped is only available for geo sorts (https://www.elastic.co/guide/en/elasticsearch/reference/8.1/sort-search-results.html#geo-sorting)

If a Date (please don't use java.utile.Date but the classes from the java.time package) is mapped to a long depends on how you define the mapping. You can set the unmapped type and the missing behaviour on Order

@alan-czajkowski
Copy link

@sothawo this is a Spring Data Elasticsearch problem, how do I achieve this with Sort/Sort.Order ?

@alan-czajkowski
Copy link

alan-czajkowski commented Mar 20, 2022

@sothawo is there some way to set unmapped_type using Spring Data ES annotation?
should this be a feature request for the org.springframework.data.elasticsearch.annotations.Setting annotation?

for a field like:
private LocalDateTime created;

it would be nice for Spring Data ES to translate something down into this ES code:
SortOrder sortOrder = SortBuilders.fieldSort("created").unmappedType("long").order();

@alan-czajkowski
Copy link

alan-czajkowski commented Mar 20, 2022

@sothawo I've also noticed that you have mentioned many times online to switch from java.util.Date to java.time.* but when I converted all of my java.util.Date instances to java.time.LocalDateTime I now get this error:
with Spring 5.3 and Spring Data ES 4.3

org.springframework.core.convert.ConverterNotFoundException: No converter found capable of converting from type [java.lang.Long] to type [java.time.LocalDateTime]

all of my java.util.Date instances worked natively with Spring Data ES, I did not have to define any explicit annotations on my date fields or implement any converters, but now none of my @Document entities work because of this new exception

@sothawo
Copy link
Collaborator Author

sothawo commented Mar 20, 2022

it would be nice for Spring Data ES to translate something down into this ES code:
SortOrder sortOrder = SortBuilders.fieldSort("created").unmappedType("long").order();

that was exactly the point of this issue. So now it is possible to do

import org.springframework.data.elasticsearch.core.query.Order;

Sort sort = Sort.by(new Order(Sort.Direction.ASC, "created").withUnmappedType("long"));

and use that sort in a Pageable as parameter or whatever.

How is this date field mapped in your index? Is it really unmapped? What does /<indexname>/_mappings show for that property? If you have data for that property in the index then it will be surely mapped, so adding an unmapped type to a sort has not effect. What is the problem that you have with the sorting?

As for the temporal types: java.util.Date is an instance in time in UTC time zone. A LocalDateTime does not have a timezone, so it cannot be instantiated from a long. The appropriate data type would be java.time.Instant

@alan-czajkowski
Copy link

alan-czajkowski commented Mar 20, 2022

@sothawo thanks for the help
I was doing:
Sort.by(new Sort.Order(Sort.Direction.ASC, "created"));
but apparently this is the wrong way to do it because I do not have access to withUnmappedType(...) with this method
instead I changed it to your recommendation:
Sort.by(new Order(Sort.Direction.ASC, "created").withUnmappedType("long"));

this is the problem i'm trying to solve:
when you first start up a Spring Boot app that uses Spring Data Elasticsearch, if there is no data populated inside of the indexes then there are no mappings for any fields, so if you perform a search query, with a sort, on an empty index (and consequently contains no mappings) then it throws the following error:

Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]

this error goes away as soon as one document is put into the index
but I need the search with sort to work on an empty index

@sothawo
Copy link
Collaborator Author

sothawo commented Mar 20, 2022

When you use a Spring Data Elasticsearch repository, then on application startup the repository checks if the index exists, and if not creates it with the mapping. If you do not use repositories you can do something like this:

@SpringBootApplication
public class SpringdataElasticTestApplication {

      @Autowired
      private ElasticsearchOperations operations;
  
      public static void main(String[] args) {
          SpringApplication.run(SpringdataElasticTestApplication.class, args);
      }

	@Autowired
	public void initializeIndex() {
		var indexOperations = operations.indexOps(Foo.class);
		if (!indexOperations.exists()) {
			indexOperations.createWithMapping();
		}
	}
}

@alan-czajkowski
Copy link

alan-czajkowski commented Mar 20, 2022

@sothawo I am using a Spring Data Elasticsearch repository, and I believe it is creating the indexes, but without mappings:

@Document(indexName = "broadcasters")
public class Broadcaster {

  @Id
  private String id;

  private Instant created;

  ...
}
public interface BroadcasterDao extends ElasticsearchRepository<Broadcaster, String> {

  ...
}

but I do not use any explicit @Field annotations

how do I ensure that an empty index is created with mappings?

@sothawo
Copy link
Collaborator Author

sothawo commented Mar 20, 2022

A property without a @Field annotation has the implicit type of FieldType.Auto and is not written to the mapping. Take your example of a timestamp. How should Spring Data Elasticsearch know in which format you'd like that to be mapped in Elasticsearch?

@alan-czajkowski
Copy link

alan-czajkowski commented Mar 20, 2022

@sothawo why Spring Data Elasticsearch cannot have default mappings?
example: if a field is of type java.util.Instant then automatically create a mapping to "long", and that only changes if the user decides to be explicit

there are many cases like this where it is natural for Spring Boot Elasticsearch to make assumptions on what the best mapping should be, and implement that mapping implicitly, until overridden explicitly

@alan-czajkowski
Copy link

alan-czajkowski commented Mar 20, 2022

@sothawo I changed all of my java.util.Date (that worked perfectly fine) to java.time.Instant and now I'm getting:

org.springframework.core.convert.ConverterNotFoundException: No converter found capable of converting from type [java.lang.Long] to type [java.time.Instant]

if java.util.Date works perfectly fine without any explicit annotations, then I think it is reasonable for java.time.Instant to work in the same way

do you agree?

@alan-czajkowski
Copy link

is this a deficiency inside Spring Core? do you think it is reasonable to do a feature request for this converter?

@sothawo
Copy link
Collaborator Author

sothawo commented Mar 20, 2022

Spring Boot makes assumptions what might be right, Spring Data does not.

Why should a long be the default instead of a basic_date_time or date_time? Often the users of Spring Data Elasticsearch have data ingested in the indices from other sources and timestamps often are written in forms like "2022-03-20T18:42:38.016+01:00"

The default behaviour of Elasticsearch when a field is not added to the mapping is auto mapping. The same is true for Spring Data Elasticsearch. If you do not specify the type, it's auto-mapped.

Btw, date/time/timezone mapping is a pretty difficult and complicated topic. Consider the following values (skipping seconds and msecs):

2022-03-20 18:47 CET - Germany
2022-03-20 10:47 PST - Los Angeles
2022-04-21 01:47 SGT - Singapore

mapped to an Instant they are all the same. If you store these as long, you will always get it back in UTC:

2022-03-20 17:47 UTC

You then can display it in the TZ you"re in, but what was the original value sent to Elasticsearch? So how dates are mapped into Elasticsearch is a pretty specific topic depending on the application, using default implicit values is an accident waiting to happen - and one of the reasons why I refrain from using using java.util.Date, it's just not specific enough.

It might have worked for you with the default mapping (what version of Spring Data Elasticsearch are you using btw?), but since 4.0 we are quite specific about the daste handling in Spring Data Elasticsearch .

And - when talking to the maintainers of Spring Data mongo | jpa or whatever module - just metioning dates/timezones will get them running away, crying out loud, despairing. I know, it's an awful topic.

@alan-czajkowski
Copy link

@sothawo I'm using Spring 5.3 and Spring Data ES 4.3

I, personally, think it is reasonable to make java.time.Instant behave like java.util.Date since they are very closely related and because, from an engineering stand-point, normalizing your date-time's to UTC is actually good practice, as the default behaviour ... of course, the default isn't always desired, but some default can be reasonable ... but since this is a "bigger" problem that falls under the Spring Data umbrella (not just Spring Data Elasticsearch) then I'm not about to fight for this since it's not a battle I have time for

  1. would this be a reasonable definition for my "created" field, given that I want all date-times stored in ES normalized to UTC:
  @Field(type = FieldType.Date, format = DateFormat.epoch_millis)
  private Instant created;
  1. do you recommend that every field have an explicit @Field annotation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: enhancement A general enhancement
Projects
No open projects
Development

Successfully merging a pull request may close this issue.

2 participants