Remove coupled resources from solr #285

Merged


Zharktas
Member

Coupled resources are a list of resources that link a CSW server URL and a UUID together. They are not needed in Solr because they are hardly searchable, and they might cause a Solr indexing error if a single harvested dataset contains many of them. This is based on the discussion in ckan/ckan#4825.
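For illustration only, the idea can be sketched as a plain function over the dictionary that gets sent to Solr. This is not the code merged in this PR: `drop_unindexed_fields` is a hypothetical helper name, and the sketch assumes the package dict is an ordinary flat `dict` keyed by field name.

```python
# Hypothetical helper, not the change merged in this PR.
# Assumes the dict about to be indexed is a plain flat dict of field -> value.
def drop_unindexed_fields(pkg_dict, keys=("coupled-resource",)):
    """Return a copy of pkg_dict without the given keys."""
    cleaned = dict(pkg_dict)
    for key in keys:
        # pop() with a default avoids a KeyError when the key is absent
        cleaned.pop(key, None)
    return cleaned


example = {
    "name": "harvested-dataset",
    "coupled-resource": '[{"href": ["..."], "uuid": ["..."]}]',
}
print(drop_unindexed_fields(example))
```

In a CKAN plugin, logic like this would typically be called from the `before_index()` hook mentioned later in this thread.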

@amercader amercader self-assigned this Jun 21, 2022
@amercader
Member

This looks good @Zharktas. Should we also pop the actual spatial field? It is a big GeoJSON blob that is not really needed in Solr either.

@FuhuXia
Contributor

FuhuXia commented Jun 22, 2022

@amercader
Spatial search with the Solr backend relies on the GeoJSON data in the spatial field. We should not pop it.
http://docs.ckan.org/projects/ckanext-spatial/en/latest/spatial-search.html

@amercader
Member

@FuhuXia the plugin uses the spatial field to do the necessary calculations and index the relevant Solr fields (e.g. bbox_area and maxx when using the solr backend, and spatial_geom when using the solr-spatial-field one), but it doesn't require the actual spatial field contents to be indexed in Solr.
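A minimal sketch of that "derive, then drop" pattern, under stated assumptions: the function name is hypothetical, the `minx`, `miny` and `maxy` field names are assumed alongside the `maxx` and `bbox_area` named above, and the bounding box is computed naively from the GeoJSON coordinates rather than with the geometry handling the plugin actually uses.

```python
import json


def _iter_positions(coords):
    """Yield every [x, y] position from a nested GeoJSON coordinates array."""
    if coords and isinstance(coords[0], (int, float)):
        yield coords
    else:
        for item in coords:
            yield from _iter_positions(item)


# Hypothetical function, not ckanext-spatial's implementation.
def index_bbox_and_drop_spatial(pkg_dict):
    """Derive bbox fields from the GeoJSON in 'spatial', then drop the blob."""
    cleaned = dict(pkg_dict)
    raw = cleaned.pop("spatial", None)  # the blob itself is not indexed
    if not raw:
        return cleaned
    geometry = json.loads(raw)
    positions = list(_iter_positions(geometry.get("coordinates", [])))
    if not positions:
        return cleaned
    xs = [p[0] for p in positions]
    ys = [p[1] for p in positions]
    # Field names other than maxx and bbox_area are assumptions
    cleaned["minx"], cleaned["maxx"] = min(xs), max(xs)
    cleaned["miny"], cleaned["maxy"] = min(ys), max(ys)
    cleaned["bbox_area"] = (max(xs) - min(xs)) * (max(ys) - min(ys))
    return cleaned


polygon = json.dumps({
    "type": "Polygon",
    "coordinates": [[[0, 0], [2, 0], [2, 3], [0, 3], [0, 0]]],
})
print(index_bbox_and_drop_spatial({"name": "example", "spatial": polygon}))
```

The derived fields stay in the document sent to Solr, so a bounding-box search can still work even though the original GeoJSON string is no longer indexed.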

@Zharktas
Member Author

@amercader done

@Zharktas
Member Author

I noticed that the fields are still indexed under extras_coupled-resource and extras_spatial, but since extras_ is of type text in the schema, indexing doesn't fail. Should those be removed as well, though, since they are not really useful data to search on?

@amercader
Member

@Zharktas sorry, I missed this. I'm not sure we can "remove" extras_* fields in the before_index() hook, as these are dynamic fields in Solr. However, if coupled-resource and spatial are removed before indexing, the extras_* variants won't be created by Solr, right?
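One way to check this in practice is to strip both the plain and the `extras_`-prefixed keys and see which ones were actually present. This is a sketch only: `remove_with_extras` is a hypothetical helper, and whether the `extras_*` keys already exist in the dict by the time `before_index()` runs is an open question in this thread, not something the sketch settles.

```python
# Hypothetical helper; whether extras_* keys are present in the dict
# at before_index() time is an assumption to verify, not an established fact.
def remove_with_extras(pkg_dict, names=("coupled-resource", "spatial")):
    """Drop each name and its extras_<name> variant; return (dict, removed keys)."""
    cleaned = dict(pkg_dict)
    removed = []
    for name in names:
        for key in (name, "extras_" + name):
            if key in cleaned:
                del cleaned[key]
                removed.append(key)
    return cleaned, removed


cleaned, removed = remove_with_extras({
    "name": "example",
    "spatial": "{...}",
    "extras_spatial": "{...}",
})
print(removed)
```

Logging the list of removed keys during a reindex would show whether the `extras_*` variants reach the hook at all.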

@amercader amercader merged commit cd6667d into ckan:master Jul 6, 2022