diff --git a/doc/release-notes/9276-doc-cvoc-index-in.md b/doc/release-notes/9276-doc-cvoc-index-in.md index 5c4dd4ca10f..78289201511 100644 --- a/doc/release-notes/9276-doc-cvoc-index-in.md +++ b/doc/release-notes/9276-doc-cvoc-index-in.md @@ -2,7 +2,17 @@ ### Updates on Support for External Vocabulary Services -#### Indexed field accuracy +Multiple extensions of the External Vocabulary mechanism have been added. These extensions allow interaction with services based on the Ontoportal software and are expected to be generally useful for other service types. -For more relevant indexing, you can now map external vocabulary values to a `managed-fields` of a [:CVocConf setting](https://guides.dataverse.org/en/6.3/installation/config.html#cvocconf) by adding the key `indexIn` in `retrieval-filtering`. -For more information, please check [GDCC/dataverse-external-vocab-support documentation](https://github.com/gdcc/dataverse-external-vocab-support/tree/main/docs). \ No newline at end of file +These changes include: + +#### Improved Indexing with Compound Fields + +When using an external vocabulary service with compound fields, you can now specify which field(s) will include additional indexed information, such as translations of an entry into other languages. This is done by adding the `indexIn` key in `retrieval-filtering`. (#10505) +For more information, please check [GDCC/dataverse-external-vocab-support documentation](https://github.com/gdcc/dataverse-external-vocab-support/tree/main/docs). + +#### Broader Support for Indexing Service Responses + +Indexing of the results from `retrieval-filtering` responses can now handle additional formats, including JSON Arrays of Strings and values from arbitrary keys within a JSON Object. 
(#10505) + +**** This documentation must be merged with 9276-allow-flexible-params-in-retrievaluri-cvoc.md (#10404) \ No newline at end of file diff --git a/doc/sphinx-guides/source/admin/metadatacustomization.rst b/doc/sphinx-guides/source/admin/metadatacustomization.rst index cac051ddb59..04453d45568 100644 --- a/doc/sphinx-guides/source/admin/metadatacustomization.rst +++ b/doc/sphinx-guides/source/admin/metadatacustomization.rst @@ -555,6 +555,8 @@ Great care must be taken when reloading a metadata block. Matching is done on fi The ability to reload metadata blocks means that SQL update scripts don't need to be written for these changes. See also the :doc:`/developers/sql-upgrade-scripts` section of the Developer Guide. +.. _using-external-vocabulary-services: + Using External Vocabulary Services ---------------------------------- @@ -580,9 +582,9 @@ In general, the external vocabulary support mechanism may be a better choice for The specifics of the user interface for entering/selecting a vocabulary term and how that term is then displayed are managed by third-party Javascripts. The initial Javascripts that have been created provide auto-completion, displaying a list of choices that match what the user has typed so far, but other interfaces, such as displaying a tree of options for a hierarchical vocabulary, are possible. Similarly, existing scripts do relatively simple things for displaying a term - showing the term's name in the appropriate language and providing a link to an external URL with more information, but more sophisticated displays are possible. -Scripts supporting use of vocabularies from services supporting the SKOMOS protocol (see https://skosmos.org) and retrieving ORCIDs (from https://orcid.org) are available https://github.com/gdcc/dataverse-external-vocab-support. (Custom scripts can also be used and community members are encouraged to share new scripts through the dataverse-external-vocab-support repository.) 
+Scripts supporting use of vocabularies from services supporting the SKOSMOS protocol (see https://skosmos.org), retrieving ORCIDs (from https://orcid.org), services based on the Ontoportal software (see https://ontoportal.org/), and using ROR (https://ror.org/) are available at https://github.com/gdcc/dataverse-external-vocab-support. (Custom scripts can also be used and community members are encouraged to share new scripts through the dataverse-external-vocab-support repository.) -Configuration involves specifying which fields are to be mapped, whether free-text entries are allowed, which vocabulary(ies) should be used, what languages those vocabulary(ies) are available in, and several service protocol and service instance specific parameters. +Configuration involves specifying which fields are to be mapped, which Solr field they should be indexed in, whether free-text entries are allowed, which vocabulary(ies) should be used, what languages those vocabulary(ies) are available in, and several service protocol and service instance specific parameters, including the ability to send HTTP headers on calls to the service. These are all defined in the :ref:`:CVocConf <:CVocConf>` setting as a JSON array. Details about the required elements as well as example JSON arrays are available at https://github.com/gdcc/dataverse-external-vocab-support, along with an example metadata block that can be used for testing. The scripts required can be hosted locally or retrieved dynamically from https://gdcc.github.io/ (similar to how dataverse-previewers work). 
diff --git a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java index bffb6e35035..26d5940b973 100644 --- a/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java +++ b/src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java @@ -328,7 +328,7 @@ public Map getCVocConf(boolean byTermUriField){ logger.warning("Ignoring External Vocabulary setting for non-existent child field: " + managedFields.getString(s)); } else { - logger.info("Found: " + dft.getName()); + logger.fine("Found: " + dft.getName()); } } } @@ -370,10 +370,10 @@ public void registerExternalVocabValues(DatasetField df) { /** * Retrieves indexable strings from a cached externalvocabularyvalue entry filtered through retrieval-filtering configuration. *

- * This method externalvocabularyvalue entries have been filtered and contains a single JsonObject. - * Is handled : Strings, Array of Objects with "lang" and ("value" or "content") keys, Object with Strings as value or Object with Array of Strings as value. - * The string, or the "value/content"s for each language are added to the set. - * This method can retrieve string values to be indexed in term-uri-field (parameter defined in CVOC configuration) or in "indexIn" field (optional parameter of retrieval-filtering defined in CVOC configuration). + * This method assumes externalvocabularyvalue entries have been filtered and that they contain a single JsonObject. + * Cases handled: a String, an Array of Strings, an Array of Objects with "value" or "content" keys, or an Object with one or more entries whose values are Strings or Arrays of Strings. + * The string(s), or the "value"/"content" entries for each language, are added to the set. + * Retrieved string values are indexed in the term-uri-field (parameter defined in CVOC configuration) by default, or in the field specified by an optional "indexIn" parameter in the retrieval-filtering defined in the CVOC configuration. *

* Any parsing error results in no entries (there can be unfiltered entries with * unknown structure - getting some strings from such an entry could give fairly