dataset types (software, workflow, etc.) - initial support #10694
2024/07/18 - 6.4 proposal request from @poikilotherm
At tech hours today, I gave a demo of dataset types as of cfac9dc. Here's a screenshot where you can see a new facet called "Dataset Type" and a couple examples of "software" and "workflow": I pointed out that I've already completed a good amount of the tasks in the issue at #10517 with the exception of this one:
However, sending different info to DataCite should probably wait until we merge the following pull request by @qqmyers, to avoid merge conflicts and extra effort: In addition, based on feedback at tech hours, I plan to focus on the following:
We also discussed the following ideas, but I don't consider any of these blockers for moving this pull request forward:
I'll go ahead and mention members (I can find) of the old Software, Workflows & Containers Working Group (not already mentioned above) in case they'd like to see this update on software datasets: @atrisovic @doigl @kmika11 @4tikhonov
@pdurbin a while back we saw a Dataverse installation that customized those search cards to show the word "Dataset" when the item was a dataset. I wonder if that might be a better approach. Less work to figure out which icons to use for each type of object and maybe clearer for folks searching for data. Although then we have to think about internationalization.

World Agroforestry at https://data.worldagroforestry.org does this. And I think I've seen another installation that does this. Can't find it but it was a bit different visually.

Edit: For the sake of transparency I should say that I haven't really followed this proposal, so sorry if what I mentioned was already mentioned or isn't in scope 😬
Maybe someday but we're not confident about which field to use and we're not even sure if there is any interest in this because DDI usually represents data, not software or workflows.
Hey all. @scolapasta thought it would be good to have a call about the idea of icons or otherwise indicating on the search page what type of research object a user is looking at and suggested @qqmyers be on the call and that I check with @pdurbin. @qqmyers is out this week and after this week I'll be out until August 12.

@pdurbin, @scolapasta, @qqmyers and @dliburd, would you be interested in a call on or after August 12 about the idea of icons or otherwise indicating on the search page what type of research object a user is looking at? If so, I could send a poll to see what times work best.

I'm writing here so that there's a record of it that everyone else involved can see and so that it's close to more information about this effort.
Also populate a few dataset types in database, with "dataset" being the default. Add default type to existing datasets. Also APIs for managing dataset types.
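To illustrate the "APIs for managing dataset types" part, here is a minimal sketch of how a client might prepare a request that adds a new type. The endpoint path `/api/datasets/datasetTypes` and the `{"name": ...}` JSON body are assumptions, not confirmed by this thread; check the docs built for this PR before relying on them. The helper name `add_dataset_type_request` is hypothetical.

```python
import json
import urllib.request


def add_dataset_type_request(base_url, api_token, type_name):
    """Build (but do not send) a POST request that would add a dataset type.

    Assumptions: the endpoint path and the JSON body shape below are guesses
    for illustration only.
    """
    body = json.dumps({"name": type_name}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/datasets/datasetTypes",  # assumed path
        data=body,
        method="POST",
        headers={
            "X-Dataverse-key": api_token,  # API token header
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; building the request separately makes it easy to inspect the URL and body first.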
We bumped our db migration script to .2
Conflicts:
src/main/resources/db/migration/V6.3.0.1.sql
src/test/java/edu/harvard/iq/dataverse/api/UtilIT.java
Before this commit, the facet looked like this...

Dataset Type
(3) Dataset
(2) software
(1) workflow

... that is, "Dataset" was capitalized but "software" and "workflow" were not. This commit fixes this, making all types capitalized, and it makes the values translatable into other languages.

However, it does nothing to address some confusion that Search API users will feel. They'll get back the capitalized values but will need to pass in the lowercase version (in English) to narrow their search results.
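The Search API mismatch described above can be sketched as follows: a client takes the capitalized label it sees in the facet and maps it back to the lowercase English value before building the filter query. The Solr field name `datasetType`, the label mapping, and the helper name `build_search_url` are assumptions for illustration; only the three type names come from this thread.

```python
from urllib.parse import urlencode

# Hypothetical mapping from the displayed (capitalized, translatable) facet
# label to the lowercase English value that the filter query expects.
DISPLAY_TO_STORED = {
    "Dataset": "dataset",
    "Software": "software",
    "Workflow": "workflow",
}


def build_search_url(base_url, display_label, query="*"):
    """Return a Search API URL narrowed to one dataset type.

    Assumption: the facet is backed by a field called "datasetType".
    """
    # Fall back to lowercasing for labels that are not in the mapping.
    stored = DISPLAY_TO_STORED.get(display_label, display_label.lower())
    params = {"q": query, "fq": f"datasetType:{stored}"}
    return f"{base_url.rstrip('/')}/api/search?{urlencode(params)}"
```

For example, `build_search_url("https://demo.example.org", "Software")` filters on `datasetType:software` even though the facet displays "Software". A plain `.lower()` would not be enough for translated labels, which is why the sketch uses an explicit mapping.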
Also add upgrade instructions for Solr. Note that the change from "software" to "Software" should have been included in the last commit about capitalization.
It doesn't seem to me that a Feature Flag is being used for this feature. What am I missing?
It no longer is, but it originally was, so some convenience methods that were added to see flags are still here. That code could now be in a separate PR, but it seemed small enough to just review/let through in this one.
Also advocating putting some kind of indicator of Dataset Type in the UI - tag on the card and/or dataset page.
There are a couple of newly added unused imports in Dataverses.java
As discussed with @sekmiller I did add some tests in 673d775 to assert that the API properly returns the value from the bundle such as "Software" (capital S) rather than the value from the database such as "software" (lower case).

I suspect the facets not showing the correct values could have something to do with this issue: My hope is that after merging the fix for that issue... ... the JSF facets will show the correct values, the ones from the bundle.

We also talked a bit about how a visual indicator that you are looking at a dataset would be nice, but we'll defer this to a future pull request. (This was also mentioned in a previous comment, how we'd like to pull in Julian and Dwayne.)

Finally, we should probably simply close this issue: I'll check with @cmbz about this at standup.

@scolapasta and @qqmyers decided we didn't need this after all, so I removed it.
From a quick test (making a test branch, merging that PR into this one), I'm hopeful that it will be a good fix. I'm now seeing the correct values from the bundle in the facets.
📦 Pushed preview images as
🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.
Just to close the loop on the discussion above, I retested now that #10158 has been merged and as we suspected when you click the "Software" facet... ... it now correctly says "Dataset Type: Software" (uppercase) instead of "Dataset Type: software" (lowercase):
What this PR does / why we need it:
This PR provides initial support for dataset types (part of IQSS/dataverse-pm#307):
A good entry point for docs at https://dataverse-guide--10694.org.readthedocs.build/en/10694/user/dataset-management.html#dataset-types
This pull request also allows the status of feature flags to be listed via API. See https://dataverse-guide--10694.org.readthedocs.build/en/10694/api/native-api.html#list-all-feature-flags
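As a rough sketch of how a tester might check the feature flag listing from a script: the path `/api/admin/featureFlags` is an assumption based on the docs link above, and no particular response shape is assumed here (the parsed JSON is returned as-is). The function names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint path; confirm against the Native API guide linked above.
FEATURE_FLAGS_PATH = "/api/admin/featureFlags"


def feature_flags_url(base_url):
    """Return the URL that would list all feature flags."""
    return base_url.rstrip("/") + FEATURE_FLAGS_PATH


def fetch_feature_flags(base_url, opener=urllib.request.urlopen):
    """Fetch the listing and return the parsed JSON without interpreting it.

    The opener is injectable so the function can be exercised without a
    running installation.
    """
    with opener(feature_flags_url(base_url)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a local installation this might be called as `fetch_feature_flags("http://localhost:8080")`. Admin endpoints are often restricted to localhost, so the call may fail when run from another machine.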
Which issue(s) this PR closes:
Used to close, but we decided against it: Implement a global setting and/or API for dataset types #10518
Special notes for your reviewer:
I followed Proposal: Supporting Multiple Dataset Types in Dataverse the best I could but I was also influenced by discussions at tech hours.
The only failing test is Shellspec but it should be fixed by #10682
Suggestions on how to test this:
Make sure Jenkins is passing.
As of this writing (2024-07-31) Jenkins does not have the dataset types feature flag on. Turning this on would test the new feature. (The feature flag was removed.)
Test all APIs:
Test publishing to DataCite to ensure that the correct type is sent.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Yes, there is a new "Dataset Type" facet:
Also, when you publish a dataset of type software to DataCite, it will show as such in Fabrica. In the example below, look for "Software" next to the name (pyDataverse):
Is there a release notes update needed for this change?:
Yes, included.
Additional documentation:
Included, see especially: