7844 codemeta schema #7877
I once had the idea to actually make the citation block pluggable and non-mandatory. I know this requires A LOT, but maybe it's a way to go if we don't want other architectural changes like abstracting the concept of sets. However, this seems beyond scope. Thanks for the pointer for the description issue, I'll fix that right away.
There is a list of programming languages in WikiData, containing ~1500 entries. (Via https://en.wikiversity.org/wiki/Research_in_programming_Wikidata/Programming_languages) There is also an extensive list of operating systems (not names alone) in WikiData with ~1100 entries. (Via https://en.wikiversity.org/wiki/Research_in_programming_Wikidata/Operating_systems) We might want to play with the OS query so that it selects only instances that are neither a subclass of another OS nor "based on" one, which would leave only the top-level ones.
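The "top level only" filter described above can be sketched in a few lines. This is only an illustration on made-up rows: the `OsEntry` shape, its field names, and the sample data are assumptions, not the actual WikiData result format, and the real selection would more likely happen inside the query itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OsEntry:
    """Hypothetical, simplified shape of one operating system row."""
    name: str
    # Names of other operating systems this entry is a subclass of (assumed field).
    subclass_of: list[str] = field(default_factory=list)
    # Names of other operating systems this entry is "based on" (assumed field).
    based_on: list[str] = field(default_factory=list)


def top_level_only(entries: list[OsEntry]) -> list[str]:
    """Keep entries that neither subclass nor build on another operating system."""
    return sorted(
        entry.name
        for entry in entries
        if not entry.subclass_of and not entry.based_on
    )


# Made-up sample rows, for illustration only.
SAMPLE = [
    OsEntry("Alpha OS"),
    OsEntry("Beta OS", based_on=["Alpha OS"]),
    OsEntry("Gamma OS", subclass_of=["Alpha OS"]),
    OsEntry("Delta OS"),
]

print(top_level_only(SAMPLE))
```

A list filtered like this would be much shorter than the full ~1100 entries, which might make it more practical as a picklist.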
I checked on the autocomplete/filtering support for controlled vocabulary fields in compound fields. Here's what I found:
I guess adding the After revisiting the schema, I see that the field |
@poikilotherm as Codemeta is close to version 3.0 (https://blog.datacite.org/codemeta-we-need-your-feedback/), And what is the timing for this pull request with regards to Codemeta 2.0 vs. Codemeta 3.0 (which is still a few months away)? |
@mfenner I think there is a high demand for these fields not only within the boundaries of the Dataverse community. I know that @sdruskat is also looking into this matter for his PhD thesis. Are you aware of any existing, reusable controlled vocabularies, preferably as RDF/SKOS/JSON-LD/sth. with a PID, we could reuse for a field like
I'm not so sure about this. Maybe it would be a good start to create a schema for 2.0 now and upgrade to 3.0 later on. It's a rather low hanging fruit. It might become necessary to introduce a migration method in Dataverse, but this seems like a good addition beyond the CodeMeta use case. |
… watermark helptext IQSS#7844
- Add missing displayOrder values
- Fix missing type for software requirements
- Avoid splitting up compound fields too much, otherwise data is not exportable to schema.org or CodeMeta JSON-LD without special handling (IQSS#7856)
- Tweak order
- Tweak descriptions and examples
- Fix whitespaces and line endings
@poikilotherm I couldn't get this tsv to load without making a few changes. I put them in a pull request for you to review and perhaps merge: poikilotherm#553 |
Thanks @pdurbin! Just today I picked up working on this again (not yet pushed). There's lots of stuff to be moved around, which will also incorporate your changes😉 |
fcc36d0 to 1e8567d
Chop chop here we go #9225 |
added to sprint Dec 15, 2022 |
Just a quick review. I haven't loaded up the block.
datasetfieldtype.softwareHelp.title=Software Help/Documentation
datasetfieldtype.softwareHelp.description=Link to help texts or documentation
datasetfieldtype.softwareHelp.watermark=e.g. https://user.github.io/project/docs
datasetfieldtype.readme.title=Readme
Isn't "README" a little more standard? (Instead of "Readme".) If others agree, we should change the tsv as well.
I agree the filename should be sth with README. But do we want an all caps field name in the UI?
Yes, I was trying to suggest all caps README in the UI. That's what you have in the description ("Link to the README of the project") and the watermark ("e.g. https://github.com/user/project/blob/main/README.md"), both of which appear in the UI, so it should probably be consistent, right?
It's weird, Codemeta itself has "link to software Readme file" as a description at https://codemeta.github.io/terms/ but before codemeta/codemeta@0818c31 it was all caps README:
- before: "A URL for the software README file"
- after: "link to software Readme file"
I played around with this locally and it's looking good!
I'm sending it to QA but I'll make a few observations:
- A lot of these fields would benefit from a picklist (programming languages, etc.) so I hope that we'll see a pull request to add some external controlled vocabularies.
- For the tooltips, there is some inconsistency of final periods being present or absent.
- It's weird that SVN is listed before Git, but that's because of CodeMeta and Schema.org.
- I'm slightly weirded out by the inconsistency between Readme (title) and README (tooltip and watermark).
- For some fields, it would be nice to have units (memory requirements, for example) but this is feedback to give upstream to CodeMeta and Schema.org, I imagine.
- I find "Target Product" to be a bit odd. Again, this is feedback to send upstream. I think the idea is that if, for example, you're creating a plugin for WordPress, you can put WordPress as the target product.
- There is a failing API test (FilesIT.test_008_ReplaceFileAlreadyDeleted) but I'm sure it has nothing to do with this metadata block, which isn't even loaded.
- It seems like Oliver would like more feedback earlier in the process. He posted about this at https://groups.google.com/g/dataverse-community/c/heNotzADbaQ/m/DJItrFjFBAAJ but in practice, developers like me don't take a serious look until the work (a PR in this case) makes it into a sprint. So maybe we could improve our process here.
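The two wording observations above (final periods and Readme vs. README) could be checked mechanically. Below is a minimal sketch, not part of this PR: it assumes the bundle is a plain `key=value` properties file, and the `readme.description` and `readme.watermark` key names are guessed from the naming pattern of the keys shown in the diff.

```python
from __future__ import annotations

import re


def parse_properties(text: str) -> dict[str, str]:
    """Parse simple key=value lines; skips blanks and # comments."""
    entries: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries[key.strip()] = value.strip()
    return entries


def missing_final_period(entries: dict[str, str]) -> list[str]:
    """Return the .description keys whose text does not end with a period."""
    return sorted(
        key for key, value in entries.items()
        if key.endswith(".description") and not value.endswith(".")
    )


def readme_spellings(entries: dict[str, str]) -> set[str]:
    """Collect every casing of the word 'readme' used across all values."""
    found: set[str] = set()
    for value in entries.values():
        found.update(re.findall(r"\breadme\b", value, flags=re.IGNORECASE))
    return found


# Sample built from the strings quoted in this review; the last two key names are assumed.
SAMPLE = """
datasetfieldtype.softwareHelp.title=Software Help/Documentation
datasetfieldtype.softwareHelp.description=Link to help texts or documentation
datasetfieldtype.readme.title=Readme
datasetfieldtype.readme.description=Link to the README of the project
datasetfieldtype.readme.watermark=e.g. https://github.com/user/project/blob/main/README.md
"""

entries = parse_properties(SAMPLE)
print(missing_final_period(entries))
print(readme_spellings(entries))
```

More than one spelling in the second result would point at the same Readme/README mismatch noted above.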
I was pinged a while back but thought I should reply now that I finally found the time to answer after the winter break.
I'm not sure what the schema.xml change was and how that's related to this being experimental. Is that what I gave my blessing to? Is the effect of the schema.xml change that this won't be a default metadatablock in future Dataverse installations? Does that mean that experimental, as it's been used for this and the workflow metadatablock, means that it'll be included in a release but the feature won't be turned on by default in Dataverse installations? I agree about more feedback earlier in the process (and @poikilotherm has been using many opportunities over the years to encourage feedback), and I'd like to add that I think it's important to plan, as early in the process as possible, for evaluating solutions after they've been merged, too, even more so if we're so uncertain about a solution that we label it experimental. |
@jggautier you probably missed the discussion but to sum up, only changes to non-experimental blocks should result in a change to schema.xml. That is, schema.xml contains fields for all the blocks that we ship. All these blocks are enabled by default and will "just work" because schema.xml has the fields already. I hope this helps. This whole experimental blocks concept is quite new, of course!
Ah thanks. That's how I understood it. Experimental metadatablocks shouldn't be enabled in installations by default when those installations use the version of the software that includes that experimental metadatablock. Those installations will need to take extra steps to enable it. It's just not clear to me how a metadatablock becomes not experimental. |
It hasn't happened yet! 😄 I hope we find out with CodeMeta! |
What this PR does / why we need it:
This is adding the CodeMeta Schema as a default out of the box schema for (new) installations.
This pull request is a first step. Please see the discussion points below for your review. We need to be careful about the scope of this first step to keep compatibility in mind. (There is no schema migration present in the Dataverse application, so when changing data types etc, we need to write SQL database migrations manually!)
TODOs
Which issue(s) this PR closes:
Closes #7844
Special notes for your reviewer:
- displayName for the block? (It currently is "Software Metadata (CodeMeta 2.0)")
- applicationCategory?
- ResearchApplication to this list and reach out to schema.org and CodeMeta people to push for adding it to the list? (Maybe Google, too?)
- subject field, which is very coarse anyway)
- *Requirements fields use integer values of byte? kilobyte? megabyte? (or similar for CPU) instead of arbitrary text values?
Suggestions on how to test this:
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
On dataset creation:
On dataset editing:
As JSON-LD export:
Is there a release notes update needed for this change?:
Additional documentation:
Tagging @doigl @atrisovic @4tikhonov @jggautier @djbrooke @pdurbin (I don't know the GH names of the other WG members)