Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement update by query [DATAES-873] #1446

Closed
spring-projects-issues opened this issue Jun 26, 2020 · 5 comments · Fixed by #1644
Closed

Implement update by query [DATAES-873] #1446

spring-projects-issues opened this issue Jun 26, 2020 · 5 comments · Fixed by #1644
Labels
in: core Issues in core support type: enhancement A general enhancement

Comments

@spring-projects-issues
Copy link

sothawo opened DATAES-873 and commented

Currently we only can update documents with a given id. Additionally we should add the update by query feature which allows to update multiple documents by issuing a query and script


Reference URL: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-update-by-query.html

@spring-projects-issues spring-projects-issues added type: enhancement A general enhancement in: core Issues in core support labels Dec 31, 2020
@ffaoudi ffaoudi mentioned this issue Jan 13, 2021
5 tasks
@sothawo
Copy link
Collaborator

sothawo commented Jan 14, 2021

Currently there are these methods defined:

UpdateResponse DocumentOperations.update(UpdateQuery updateQuery, IndexCoordinates index))

and

Mono<UpdateResponse> ReactiveDocumentOperations.update(UpdateQuery updateQuery, IndexCoordinates index);

The UpdateQuery object can only created with it's builder by providing the id of a document to update.

In order to add the additional possibility to do an update by query one solution would be:

  • Extend the UpdateQuery class so it also can take a Query.
  • UpdateQuery can either be constructed with an id or with a query. The first case would be the existing logic.
  • When a query is passed in, the implementations (ElasticsearchTemplate and ElasticsearchRestTemplate for non-reactive and ReactiveElasticsearchTemplate for reactive code) must construct the matching request for their clients

This solution would not need any change to the existing API, only the UpdateQuery class would get one more property and a new builder method, which is a non-breaking change.

Note: need to check if the return type still is ok

@ffaoudi
Copy link
Contributor

ffaoudi commented Jan 15, 2021

In RequestFactory when i try to construct the UpdateByQueryRequest from UpdateQuery there are some fields that it's not possible to set because they do not exist in UpdateQuery. So does that mean we need to add them ?

The missing fields are the commented setter in the following method

public UpdateByQueryRequest updateByQueryRequest(UpdateQuery query, IndexCoordinates index) {
		String indexName = index.getIndexName();
		final UpdateByQueryRequest updateByQueryRequest = new UpdateByQueryRequest(indexName);

		if (query.getScript() != null) {
			Map<String, Object> params = query.getParams();

			if (params == null) {
				params = new HashMap<>();
			}
			Script script = new Script(ScriptType.INLINE, query.getLang(), query.getScript(), params);
			updateByQueryRequest.setScript(script);
		}

		updateByQueryRequest
				//.setAbortOnVersionConflict()
				//.setBatchSize()
				.setIndicesOptions(query.getQuery().getIndicesOptions())
				//.setMaxDocs()
				//.setMaxRetries()
				//.setPipeline()
				.setQuery(getQuery(query.getQuery()))
				//.setRefresh()
				//.setRequestsPerSecond()
				.setRouting(query.getRouting())
				.setScroll(TimeValue.timeValueMillis(query.getQuery().getScrollTime().toMillis()))
				//.setShouldStoreResult()
				//.setSlices()
				.setTimeout(query.getTimeout())
				.setWaitForActiveShards(ActiveShardCount.parseString(query.getWaitForActiveShards()));

		return updateByQueryRequest;
	}

@sothawo
Copy link
Collaborator

sothawo commented Jan 15, 2021

yes, add what is needed to build the request

@ffaoudi
Copy link
Contributor

ffaoudi commented Jan 16, 2021

Regarding the response, it is more detailed according to the elasticsearch doc responseBody. Is it necessary to have a mirroring object of the elasticsearch response?

@sothawo
Copy link
Collaborator

sothawo commented Jan 17, 2021

as for the properties that are not yet in the UpdateQuery, please add what is needed.

the return type: yes, we should definitely mirror the returned properties instead of passing out original Elasticsearch classes.

One of the reasons is currently a topic: Elastic changes the license for Elasticsearch from the next version on (7.11). If Spring Data Elasticsearch encapsulates all the ES classes, our users only use the org.springframework.data classes. If Elasticsearch changes something, the we have adapt our library. If we leak out Elasticsearch classes, then the users of our library would have breaking changes.

@sothawo sothawo added this to the 4.2 M3 (2021.0.0) milestone Jan 23, 2021
sothawo pushed a commit that referenced this issue Jan 23, 2021
Original Pull Request #1446 
Closes #1644
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: core Issues in core support type: enhancement A general enhancement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants