Dataset file count in search results from API #6601

Closed
tainguyenbui opened this issue Feb 3, 2020 · 6 comments · Fixed by #6623

Comments

@tainguyenbui
Contributor

tainguyenbui commented Feb 3, 2020

Hi!

We have been recently working with search results and also retrieving dataset information.

Would it be possible to include the file count for a dataset in each dataset item returned in the search results?

We would find it very handy for our application.

Why do we need it?
When showing dataset information in our application, we want to display the number of files a dataset has as part of each search result. Currently, the only way to do this is to retrieve the dataset information separately and take the length of its files array.
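A minimal sketch of that workaround, for illustration only. It assumes the dataset response has already been fetched and parsed, and that the files array sits under `data.latestVersion.files`; that shape and the helper name `count_files` are assumptions, not something stated in this issue.

```python
from typing import Any, Mapping


def count_files(dataset_response: Mapping[str, Any]) -> int:
    """Count files in a parsed dataset response.

    Assumed shape (not confirmed in this issue):
    {"data": {"latestVersion": {"files": [...]}}}
    """
    files = (
        dataset_response.get("data", {})
        .get("latestVersion", {})
        .get("files", [])
    )
    return len(files)


# Hypothetical sample payload, hand-written for this example.
sample = {
    "data": {
        "latestVersion": {
            "files": [{"label": "a.csv"}, {"label": "b.csv"}],
        }
    }
}

print(count_files(sample))  # prints 2
```

The cost is one extra dataset request per search result item, which is what a count in the search response would avoid.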

Why would this be useful for other people?
Since people may be searching specifically for dataset files, it would be handy to see in the results whether a dataset contains any files.

Thanks! 🚀

@tainguyenbui tainguyenbui changed the title Dataset file count Dataset file count in search result items Feb 3, 2020
@mheppler
Contributor

mheppler commented Feb 3, 2020

Related to:

@djbrooke
Contributor

djbrooke commented Feb 3, 2020

Moving to ready as this seems clear enough.

@pdurbin
Member

pdurbin commented Feb 4, 2020

Makes sense. I think we should take the performance hit on the indexing side. That is, store the database id of each file in an array for the dataset version in question. That way, we don't have to go to the database at search/browse time.

@djbrooke djbrooke self-assigned this Feb 5, 2020
@djbrooke djbrooke changed the title Dataset file count in search result items Dataset file count in search results from API Feb 5, 2020
@djbrooke
Contributor

djbrooke commented Feb 5, 2020

  • We should just return a number here instead of a number and the files
  • This would require a Solr reindex and a Solr schema update

@tainguyenbui
Contributor Author

Completely agree with you @djbrooke, returning the files as well would be overkill and could increase the payload dramatically.

@pdurbin
Member

pdurbin commented Feb 6, 2020

In non-overkill style, I just made pull request #6623, which is small and doesn't involve reindexing or any Solr changes.

@pdurbin pdurbin removed their assignment Feb 6, 2020
kcondon added a commit that referenced this issue Feb 19, 2020
in Search API show fileCount for datasets #6601
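
For readers landing here, a rough sketch of how a client might read the new field once it is available. The field name `fileCount` comes from the commit message above; the surrounding response shape (`data.items`, `type`, `global_id`), the helper name `file_counts`, and the sample identifier are assumptions made for illustration, so check them against the actual Search API output.

```python
from typing import Any, Dict, Mapping, Optional


def file_counts(search_response: Mapping[str, Any]) -> Dict[str, Optional[int]]:
    """Map each dataset item in a parsed search response to its fileCount.

    Assumed shape (not confirmed in this issue):
    {"data": {"items": [{"type": "dataset", "global_id": ..., "fileCount": ...}]}}
    Items without a fileCount are mapped to None rather than 0.
    """
    counts: Dict[str, Optional[int]] = {}
    for item in search_response.get("data", {}).get("items", []):
        if item.get("type") != "dataset":
            continue  # skip non-dataset results such as files
        counts[item.get("global_id", "")] = item.get("fileCount")
    return counts


# Hypothetical sample payload; the identifier is a placeholder.
sample_search = {
    "data": {
        "items": [
            {"type": "dataset", "global_id": "doi:10.5072/FK2/EXAMPLE", "fileCount": 3},
            {"type": "file", "name": "a.csv"},
        ]
    }
}

print(file_counts(sample_search))  # prints {'doi:10.5072/FK2/EXAMPLE': 3}
```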