Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support choosing data dictionaries in dataset node form #4067

Merged
merged 21 commits into from
Apr 12, 2024
Merged

Conversation

dafeder
Copy link
Member

@dafeder dafeder commented Nov 21, 2023

Modify json_form_widget to allow selection of Data Dictionaries populated by the metastore.

Testing steps

  1. Build from branch
  2. Modify dataset.ui.schema distribution.describedBy to be:
"describedBy": {
  "ui:options": {
    "description": "URL to the data dictionary for the file found at the Download URL.",
      "widget": "list",
      "type": "select",
      "titleProperty": "title",
      "source": {
        "metastoreSchema": "data-dictionary",
        "returnValue": "url"
      }
    }
},
  1. Create a data dictionary. It can be empty:
POST https://dkan.ddev.site/api/1/metastore/schemas/data-dictionary/items
{
  "identifier": "c4a997e1-e1dc-5139-93b5-0a4e15b012d7",
  "data": {
    "title": "Data dictionary title",
    "fields": []
  }
}
  1. Try selecting the data dictionary on a new dataset form
  2. Check the metastore at https://dkan.ddev.site/api/1/metastore/schemas/dataset/items
  3. The describedBy field should be displaying a working URL for the data dictionary.

Note: We really need to write a spec for the ui schema, and possibly make some things about this "source" feature a bit more consistent.

Upgrade

This PR changes the data dictionary schema. We did something a little weird with data dictionary where we made a schema with "identifier" and "data" properties but also put the "title" field at the same level. This has caused problems because in different places the metastore expects everything not a dataset to have an identifier/data wrapper. (This whole pattern is problematic but outside the scope of this ticket to fix; we have been talking about refactoring this for a while).

If you have a /schema folder in your DKAN project outside the DKAN module directory, you'll need to update your data-dictionary.json file with the new schema. If you have existing data dictionary nodes, an update hook will migrate them to the new schema.

To test:

  1. Build a site on 2.x branch.
  2. Create a data dictionary with the old schema:
POST https://dkan.ddev.site/api/1/metastore/schemas/data-dictionary/items
{
  "identifier": "c4a997e1-e1dc-5139-93b5-0a4e15b012d7",
  "title": "Data dictionary title",
  "data": {
    "fields": []
  }
}
  1. Checkout choose-dict branch
  2. Run drush updb
  3. Make sure you get no errors, and https://dkan.ddev.site/api/1/metastore/schemas/data-dictionary/items shows your data dictionary now with the title under "data".

@dafeder dafeder marked this pull request as draft November 21, 2023 22:35
@dafeder dafeder marked this pull request as ready for review November 22, 2023 22:01
Copy link
Contributor

@paul-m paul-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, I'm not sure why the instructions tell me to change dataset.ui.schema, because it seems like the PR should do that.

Second: I had trouble getting a data dictionary into the system using the API, using the supplied JSON above:

HTTP/1.1 400 Bad Request
Cache-Control: must-revalidate, no-cache, private
Content-Language: en
Content-Type: application/json
Date: Sun, 26 Nov 2023 16:47:39 GMT
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Server: Apache/2.4.56 (Debian)
X-Content-Type-Options: nosniff
X-Frame-Options: SAMEORIGIN
X-Generator: Drupal 10 (https://www.drupal.org)
Transfer-Encoding: chunked

{
  "message": "JSON Schema validation failed.",
  "status": 400,
  "timestamp": "2023-11-26T16:47:39+00:00",
  "data": {
    "keyword": "required",
    "pointer": "",
    "message": "The attribute property 'title' is required."
  }
}

So I used the JSON from the docs. This succeeded, but then visiting the dataset add page led to this error:

The website encountered an unexpected error. Please try again later.

TypeError: Drupal\json_form_widget\WidgetRouter::metastoreOptionTitle(): Return value must be of type string, null returned in Drupal\json_form_widget\WidgetRouter->metastoreOptionTitle() (line 244 of /var/www/html/dkan/modules/json_form_widget/src/WidgetRouter.php).

Therefore I wasn't able to complete the QA steps.

@dafeder
Copy link
Member Author

dafeder commented Nov 28, 2023

@paul-m we don't want to change the default dataset.ui.json because this only makes sense if you're using data dictionaries in "reference" mode; this would be an option you would implement in your site-specific implementation of the UI schema. Obviously, this should be reflected in the docs, but I think in general the json form widget doesn't have docs so this seems like a follow-up ticket and we consider this an undocumented/experimental feature for now.

I'll try to reproduce the errors you're getting.

Copy link
Contributor

@paul-m paul-m left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran through all the QA steps and they went fine. The only concern I have is that step 6 says the describedBy URL should be a 'working' URL, but it's a dkan:// URI. I think that's probably what's meant by working, but just wanted to call it out. I'm not sure whether an API consumer will know what to do with a dkan:// schema.

Otherwise, totally fine, passes the review. @dafeder had mentioned not wanting to merge until some other reviews had occurred.

@paul-m
Copy link
Contributor

paul-m commented Dec 22, 2023

OK after much conversation I understand now that we're concerning ourselves with the distribution data dictionary and not the one for the dataset.

The distribution describedBy value is not a dkan:// scheme URI, so yay!

[..]
        "distribution": [
            {
                "@type": "dcat:Distribution",
                "format": "csv",
                "downloadURL": "https://demo.getdkan.org/sites/default/files/distribution/cedcd327-4e5d-43f9-8eb1-c11850fa7c55/Bike_Lane.csv",
                "describedBy": "https://dkan-core.ddev.site/api/1/metastore/schemas/data-dictionary/items/c4a997e1-e1dc-5139-93b5-0a4e15b012d7"
            }
        ]

Item 1: Following the instructions, I note that after altering the UI schema, but before adding a data dictionary, there's no way to add a URL instead of selecting one that's in metadata. That is, we can't just add a URL any more, and there are none available to select. This might be by design, but it might also be something for a follow-up.

Item 2: Also, after adding the data dictionary, while adding the dataset, there's no way to select NONE for the data dictionary. This seems a little more severe than the previous one, since we're allowing per-distribution dictionaries.

Item 3: The instructions say to use an empty data dictionary, which I realize is an edge case, but if you change the site's data dictionary settings to 'Distribution reference,' then you end up with a big error message when the DD is applied to the dataset:

% ddev drush cron
 [error]  TypeError: Drupal\metastore\DataDictionary\DataDictionaryDiscovery::getReferenceDictionaryId(): Return value must be of type ?string, none returned in Drupal\metastore\DataDictionary\DataDictionaryDiscovery->getReferenceDictionaryId() (line 104 of /var/www/html/dkan/modules/metastore/src/DataDictionary/DataDictionaryDiscovery.php) #0 /var/www/html/dkan/modules/metastore/src/DataDictionary/DataDictionaryDiscovery.php(74): Drupal\metastore\DataDictionary\DataDictionaryDiscovery->getReferenceDictionaryId('96e43b6b71d7e21...', 1703274569)

Whether these issues fall into the scope of the original ticket is up for discussion...

Comment on lines +403 to +416
$metadata = $data->getMetaData();

$title = $metadata->data->title;
$data->setTitle($title);

// If there is no uuid add one.
if (!isset($metadata->identifier)) {
$metadata->identifier = $data->getIdentifier();
}
// If one exists in the uuid it should be the same in the table.
else {
$data->setIdentifier($metadata->identifier);
}
$data->setMetadata($metadata);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks similar to what occurs in the function setNodeValuesFromMetadata. Line 383 in the function can be modified to look like this $title = $metadata->title ?? $metadata->name ?? $metadata->data->title; that way we are handling the schema change of the title for data_dictionaries. We can then remove these changes.

you must update your site's data dictionary schema after this update. Copy
modules/contrib/dkan/schema/collections/data-dictionary.json over you local
site version before attempting to read or write any data dictionaries.");
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line return required

}
return t("Updated $count dictionaries. If you have overridden DKAN's core schemas,
you must update your site's data dictionary schema after this update. Copy
modules/contrib/dkan/schema/collections/data-dictionary.json over you local
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From modules/contrib/dkan/schema/collections/data-dictionary.json over you local to modules/contrib/dkan/schema/collections/data-dictionary.json over to your local

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should add a revision row with a note of what changed?

Comment on lines +294 to +303
/**
* Save the "wrapped" node.
*
* Useful for some operations - usually recommended to use the metastore
* service's POST and PUT functions rather than saving the node directly.
*/
public function save() {
$this->node->save();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this being utilized anywhere in this PRs code? I'm not seeing it used but I could be wrong.

@kaise-lafrai
Copy link
Contributor

@paul-m we don't want to change the default dataset.ui.json because this only makes sense if you're using data dictionaries in "reference" mode; this would be an option you would implement in your site-specific implementation of the UI schema. Obviously, this should be reflected in the docs, but I think in general the json form widget doesn't have docs so this seems like a follow-up ticket and we consider this an undocumented/experimental feature for now.

I'll try to reproduce the errors you're getting.

Maybe we can add a form_alter that alters the dataset form only when the data dictionary settings are set to Distribution reference. If its set to sitewide then that can appear on the dataset form so the user knows that.

@janette janette merged commit 48ce39c into 2.x Apr 12, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants