Publish to CKAN Feature [Research/Discussion] #438

Open
pdelboca opened this issue Jun 24, 2024 · 1 comment
@pdelboca
Member

pdelboca commented Jun 24, 2024

Publishing to CKAN: Main Challenge

Publication to CKAN can be tricky since it is directly tied to the schema that each CKAN instance implements for its datasets and resources. As an example, here is the schema that opendata.swiss implements: https://ckan.opendata.swiss/api/3/action/scheming_dataset_schema_show?type=dataset.

In order to publish, we need to know what CKAN expects. If CKAN does not provide that information, we cannot publish, so **requirement number 1:** we should be able to access the particular CKAN instance's schema, or assume it is a vanilla implementation.

Since the goal of Open Data Editor is to be a tool for non-technical users, to properly implement this feature we need:

  • The CKAN instance to expose the schema (Like the opendata.swiss example)
  • With that schema, we should be able to generate a form for the user to fill in all the fields that the CKAN instance expects.
  • Once the form is filled, we should be able to publish to CKAN and properly report back errors.
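
As a rough sketch of the first two steps, assuming the instance exposes ckanext-scheming's `scheming_dataset_schema_show` action and wraps the result in the usual CKAN action envelope (`success`, `result`), the fields a form would need could be extracted like this. The helper names `fetch_schema` and `form_fields` are made up for illustration, and the keys read from each field (`field_name`, `label`, `required`, `preset`, `form_snippet`) are the ones commonly found in scheming schemas; a given instance may omit some of them.

```python
import json
from urllib.request import urlopen


def fetch_schema(base_url: str, dataset_type: str = "dataset") -> dict:
    """Fetch the scheming schema from a CKAN instance (network call, may fail)."""
    url = (
        f"{base_url.rstrip('/')}/api/3/action/"
        f"scheming_dataset_schema_show?type={dataset_type}"
    )
    with urlopen(url, timeout=10) as response:
        payload = json.load(response)
    if not payload.get("success"):
        raise RuntimeError("CKAN did not return a schema")
    return payload["result"]


def form_fields(schema: dict, section: str = "dataset_fields") -> list:
    """Reduce a scheming schema to the bits a form generator needs."""
    fields = []
    for field in schema.get(section, []):
        fields.append(
            {
                "name": field["field_name"],
                # Labels may be plain strings or per-language dicts; kept as-is.
                "label": field.get("label", field["field_name"]),
                "required": bool(field.get("required", False)),
                # Scheming describes rendering, not data types (assumed default: "text").
                "widget_hint": field.get("form_snippet") or field.get("preset") or "text",
            }
        )
    return fields
```

Calling `form_fields(schema, section="resource_fields")` would give the same summary for resources.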

We need better definitions

We need a proper definition of what it means to "Publish to CKAN":

  • Are we publishing datasets to CKAN?
  • Are we uploading a specific file as a new resource to an already created dataset?
  • Are we just re-uploading a specific file to an already existing CKAN resource?
  • Are we mapping datapackages to CKAN packages?
  • All of them?
  • Are we thinking of a UI form that simulates a CKAN upload form, or of a low-level API interaction for more technical users?

We will always have the schema issue, but going for simpler scopes (like just replacing a file in an already created CKAN dataset) would make it more manageable.
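
For the narrowest scope, replacing the file behind an existing resource, the request is small and mostly sidesteps the dataset schema. A minimal sketch, assuming CKAN's `resource_patch` action, token authentication through the `Authorization` header, and the third-party `requests` library for the multipart upload. `resource_patch_request` and `replace_resource_file` are hypothetical helper names, not part of any existing code:

```python
from pathlib import Path


def resource_patch_request(base_url: str, resource_id: str, api_token: str) -> dict:
    """Describe the HTTP request that replaces a resource's file."""
    return {
        "url": f"{base_url.rstrip('/')}/api/3/action/resource_patch",
        "headers": {"Authorization": api_token},
        "data": {"id": resource_id},
    }


def replace_resource_file(
    base_url: str, resource_id: str, api_token: str, path: Path
) -> dict:
    """Upload `path` as the new file of an existing resource (network call)."""
    import requests  # third-party; imported lazily so the sketch loads without it

    request = resource_patch_request(base_url, resource_id, api_token)
    with path.open("rb") as handle:
        response = requests.post(
            request["url"],
            headers=request["headers"],
            data=request["data"],
            files={"upload": handle},  # CKAN reads the file from the "upload" field
            timeout=60,
        )
    return response.json()
```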

Current Implementation gaps

The current implementation is based on https://github.com/frictionlessdata/frictionless-ckan-mapper, which provides a set of hard-coded fields to map between a vanilla CKAN instance and Frictionless. The current Frictionless Portals documentation does not mention how to handle custom schemas, so I'm assuming that it is not implemented. (Maybe @roll can provide some context here?)
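
To illustrate the gap, here is a toy version of a hard-coded mapping. It is not the code of frictionless-ckan-mapper: the mapping table and both helpers are invented for this example, and only show how fields required by a custom schema end up with no source.

```python
# Hypothetical, deliberately tiny mapping: Data Package key -> CKAN key.
PACKAGE_TO_CKAN = {
    "name": "name",
    "title": "title",
    "description": "notes",
    "version": "version",
}


def to_ckan(datapackage: dict) -> dict:
    """Map only the keys the hard-coded table knows about."""
    return {
        PACKAGE_TO_CKAN[key]: value
        for key, value in datapackage.items()
        if key in PACKAGE_TO_CKAN
    }


def missing_required(ckan_dataset: dict, dataset_fields: list) -> list:
    """List required schema fields that the mapped dataset does not fill."""
    return [
        field["field_name"]
        for field in dataset_fields
        if field.get("required") and not ckan_dataset.get(field["field_name"])
    ]
```

With a custom schema that requires, say, a `publisher` field, `missing_required(to_ckan(package), schema["dataset_fields"])` would report it, and something (a form, most likely) has to ask the user for the value.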

Even if it is implemented at the core of Frictionless, we will still need to work on the UI that will power the feature (or work on a UI that works for technical users only).

Exposing the schema

The most widely used extension in CKAN to customize the schema is ckanext-scheming. However, not all instances expose the endpoint to show the schema like Open Data Swiss does. Without knowing what CKAN expects, it is not possible to define a UI form. We might be able to work only with the fields that we know for sure CKAN expects, especially if our goal is a feature that just updates a CKAN resource.
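
One way to handle instances that do not expose the schema is to degrade to a small set of known fields. A sketch, assuming a few resource fields of a stock CKAN install (`name`, `description`, `format`); `VANILLA_RESOURCE_FIELDS` and `resource_form_fields` are hypothetical names:

```python
from typing import List, Optional, Tuple

# Assumption: fields a stock CKAN resource accepts without any schema extension.
VANILLA_RESOURCE_FIELDS = ["name", "description", "format"]


def resource_form_fields(schema: Optional[dict]) -> Tuple[List[str], bool]:
    """Return field names for the form and whether they came from the schema."""
    if schema and schema.get("resource_fields"):
        names = [field["field_name"] for field in schema["resource_fields"]]
        return names, True
    # No schema available: fall back to the fields we are confident about.
    return list(VANILLA_RESOURCE_FIELDS), False
```

The boolean lets the UI warn the user that the instance may expect more fields than the form shows.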

Dynamic Form

Building a dynamic form, while completely feasible, is not an easy task. The good thing is that the fields that ckanext-scheming exposes are limited in number, so there is a limited set of implementations to cover. There are some tools like https://uniforms.tools/docs/what-are-uniforms/ that create forms from schemas (even using MUI!), but I'm not sure how flexible they are for creating forms on the fly based on what ckanext-scheming returns.

It is worth pointing out that ckanext-scheming does not return the data type of a field, but rather which snippet should be used to render it in the form. So it will not tell us how the value is stored or handled.
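
Because the schema names a snippet rather than a data type, a dynamic form would need its own lookup from snippet to UI widget. The table below is an assumption for illustration: the snippet names are examples of what scheming schemas typically contain, and the widget names are made up.

```python
# Hypothetical lookup: scheming form snippet -> front-end widget name.
SNIPPET_TO_WIDGET = {
    "text.html": "TextField",
    "textarea.html": "MultilineTextField",
    "markdown.html": "MultilineTextField",
    "select.html": "Select",
    "date.html": "DatePicker",
}


def widget_for(field: dict) -> str:
    """Pick a widget for one schema field, defaulting to a plain text input."""
    snippet = field.get("form_snippet")
    if snippet in SNIPPET_TO_WIDGET:
        return SNIPPET_TO_WIDGET[snippet]
    if field.get("choices"):
        # A field that lists choices can be rendered as a select either way.
        return "Select"
    return "TextField"
```

Unknown or instance-specific snippets fall back to a text input, which keeps the form usable at the cost of a less helpful widget.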

ckanext-scheming provides a list of validators, but I assume it will be easier not to duplicate that validation in the front end.
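
Leaving validation to CKAN means the form has to show what the server rejects. A sketch, assuming the error envelope that CKAN's action API returns on failure (`success: false` with an `error` object whose keys are field names mapped to lists of messages, plus a `__type` entry); `field_errors` is a hypothetical helper:

```python
def field_errors(response: dict) -> dict:
    """Flatten a CKAN action API error into {field_name: message}."""
    if response.get("success"):
        return {}
    error = response.get("error") or {}
    messages = {}
    for key, value in error.items():
        if key.startswith("__"):  # skip metadata such as "__type"
            continue
        if isinstance(value, list):
            messages[key] = "; ".join(str(item) for item in value)
        else:
            messages[key] = str(value)
    return messages
```

The resulting dict can be attached to the matching form inputs, so users see CKAN's own messages next to the fields that caused them.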

@romicolman
Collaborator

Hey @pdelboca. If I understood correctly, publication in CKAN through the ODE seems more complicated than expected because the process is connected to the specific CKAN instance. Also, the best solution would be to work on the publication feature in stages, right? Can we start by publishing a file as a new resource when the dataset already exists?
