Missing metadata fields when exporting data package #49

Open
danfowler opened this issue Jul 21, 2016 · 5 comments

@danfowler

Without a schema on each resource, these are technically not Tabular Data Packages, even if all the resources are tabular.

This dataset, https://datahub.io/dataset/period-table-4716738971 (imported using "Import Data Package"), for example, exports the following datapackage.json:

{
  "description": "Example Data Package featuring the periodic table.", 
  "license": {
    "type": "ODC-PDDL-1.0", 
    "title": "ODC-PDDL-1.0"
  }, 
  "title": "Periodic Table", 
  "keywords": [
    "atom", 
    "chemistry", 
    "element"
  ], 
  "resources": [
    {
      "url": "https://datahub.io/dataset/aa320a4f-9ae4-4bc7-95c9-f3703bd5ceec/resource/85be2f85-17a0-4447-87df-e54b47477557/download/data.csv", 
      "title": "data", 
      "name": "data", 
      "format": "CSV"
    }
  ], 
  "name": "period-table-4716738971"
}
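
A minimal sketch of how the problem could be detected programmatically: it lists the resources in a descriptor that carry no `schema` key. The function name `resources_missing_schema` is hypothetical, and the descriptor is trimmed to the `resources` part of the export above.

```python
# Trimmed copy of the exported descriptor shown above (resources only).
EXPORTED = {
    "name": "period-table-4716738971",
    "resources": [
        {
            "url": "https://datahub.io/dataset/aa320a4f-9ae4-4bc7-95c9-f3703bd5ceec/resource/85be2f85-17a0-4447-87df-e54b47477557/download/data.csv",
            "title": "data",
            "name": "data",
            "format": "CSV",
        }
    ],
}


def resources_missing_schema(descriptor):
    """Return the names of resources that have no 'schema' key.

    Hypothetical helper for illustration; not part of any library.
    """
    return [
        resource.get("name", "<unnamed>")
        for resource in descriptor.get("resources", [])
        if "schema" not in resource
    ]


if __name__ == "__main__":
    print(resources_missing_schema(EXPORTED))
```

A non-empty result means the export cannot be treated as a Tabular Data Package.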
@danfowler
Author

Related comment:

https://datahub.io/dataset/rockhampton-regional-council-bus-stops#comment-2824345324

I downloaded the datapackage.json, but it only had high-level metadata and did not contain metadata for the columns in the dataset. I also expected that the resource would be downloaded as one package together with the metadata.

@EarlButterworth

My assumption was that, since it was a data package, it would be delivered on download as a data package, exactly as it had been uploaded, i.e. a single ZIP file containing the datapackage.json and the CSV. Instead, all that was delivered was a partial datapackage.json that did not contain the resource schema.

The aim is to bring simplicity to data publishing and consumption. Therefore a consistent user experience (UX) should be provided; i.e. what goes in is what comes out.

@danfowler
Author

Hi @EarlButterworth, I'm creating a new issue based on your comment:

#52

Thanks!

@Stephen-Gates

Agree with above. Testing using v1.0.0

Information lost includes:

package:

  • sources

resources:

  • profile
  • encoding
  • mediatype
  • licenses
  • schema
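
One way to produce a list like the one above is to compare the descriptor that was imported with the one that was exported. The sketch below is illustrative only: `lost_fields` is a hypothetical helper, resources are matched by `name`, and the sample descriptors are made up for the example rather than taken from a real dataset.

```python
def lost_fields(original, exported):
    """Report keys present in the original descriptor but absent from the export.

    Hypothetical helper; resources are matched by their 'name' key.
    """
    # Package-level keys, ignoring the resources list itself.
    package_lost = sorted(
        key for key in original if key != "resources" and key not in exported
    )

    exported_by_name = {
        resource.get("name"): resource for resource in exported.get("resources", [])
    }

    resources_lost = {}
    for resource in original.get("resources", []):
        counterpart = exported_by_name.get(resource.get("name"), {})
        missing = sorted(key for key in resource if key not in counterpart)
        if missing:
            resources_lost[resource.get("name")] = missing

    return {"package": package_lost, "resources": resources_lost}


if __name__ == "__main__":
    # Made-up descriptors, for illustration only.
    imported = {
        "name": "example",
        "sources": [{"title": "Example source"}],
        "resources": [
            {"name": "data", "encoding": "utf-8", "schema": {"fields": []}}
        ],
    }
    exported = {"name": "example", "resources": [{"name": "data"}]}
    print(lost_fields(imported, exported))
```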

@amercader changed the title on Feb 8, 2018:
  old: "Export Data Package" doesn't provide schemas with tabular data resources
  new: Missing metadata fields when exporting data package
@amercader added this to the MVP v1 milestone on Feb 8, 2018
@amercader
Member

amercader commented Feb 8, 2018

Implementation

We already discussed licenses in #62 and schemas are now fixed.

For the rest of the fields, modify the ckan-datapackage-tools converter to store them as extras (except mediatype, which maps to mimetype on resources). Use these extras on the way out to generate the DP descriptor.
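
A rough sketch of the round trip described above, not the actual ckan-datapackage-tools code. All function names are hypothetical. It assumes package-level extras are a list of `{"key": ..., "value": ...}` pairs with JSON-encoded string values, and that extra resource fields can be stored as additional keys on the resource dict.

```python
import json

# Resource fields kept as extras; mediatype is handled separately (maps to mimetype).
RESOURCE_EXTRA_FIELDS = ("profile", "encoding")
# Package fields kept as extras.
PACKAGE_EXTRA_FIELDS = ("sources",)


def resource_to_ckan(dp_resource):
    """Data Package resource -> CKAN-style resource dict (assumed layout)."""
    ckan_resource = {"name": dp_resource.get("name")}
    if "mediatype" in dp_resource:
        ckan_resource["mimetype"] = dp_resource["mediatype"]
    for field in RESOURCE_EXTRA_FIELDS:
        if field in dp_resource:
            ckan_resource[field] = dp_resource[field]
    return ckan_resource


def ckan_to_resource(ckan_resource):
    """CKAN-style resource dict -> Data Package resource (way out)."""
    dp_resource = {"name": ckan_resource.get("name")}
    if "mimetype" in ckan_resource:
        dp_resource["mediatype"] = ckan_resource["mimetype"]
    for field in RESOURCE_EXTRA_FIELDS:
        if field in ckan_resource:
            dp_resource[field] = ckan_resource[field]
    return dp_resource


def package_to_extras(dp_package):
    """Store package-level fields as JSON-encoded key/value extras."""
    return [
        {"key": field, "value": json.dumps(dp_package[field])}
        for field in PACKAGE_EXTRA_FIELDS
        if field in dp_package
    ]


def extras_to_package(extras):
    """Decode the extras written by package_to_extras."""
    return {
        extra["key"]: json.loads(extra["value"])
        for extra in extras
        if extra["key"] in PACKAGE_EXTRA_FIELDS
    }
```

JSON-encoding the values is one possible choice for structured fields such as sources, since extras values are assumed here to be plain strings.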

Estimate

0.5 day
