Code for searching for institutional dataset records from popular data repositories (e.g., Zenodo)

UGent-Data-Stewards/DatasetRepositorySearch

DatasetRepositorySearch

The Dataset_Repository_Search.py script searches commonly used general-purpose repositories for datasets and software associated with a specific institution. To search for your own institution, edit the variables in main() to match that institution. If a repository does not apply (e.g., your institution lacks an OSF Institutional account), the corresponding section can be left blank. The following variables should be modified:

• ror: The ROR ID, provided as the full URL rather than the bare identifier. The ROR ID for your institution can be looked up here: https://ror.org/

• osfID: If your institution has an OSF Institutional account, the osfID is the last part of the URL of the institution's OSF page.

• osf_ACCESS_TOKEN: OSF access tokens can be requested here: https://osf.io/settings/tokens

• zenodoSearchTerms, pangaeaSearchTerms, figshareSearchTerms, GBIFSearchTerms: Search terms specific to each repository. Suitable terms can be identified by trial and error and by consulting the search guide or manual of the respective repository.

• zenodo_ACCESS_TOKEN: Replace the Zenodo access token with one issued to you. Zenodo access tokens for applications can be requested here: https://zenodo.org/account/settings/applications/tokens/new/

• fileIdentifier: The name of your institution. It is used in the file names of all outputs.
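As an illustration of the list above, the following is a minimal sketch of how these settings might be collected and sanity-checked before a run. The variable names come from the list; everything else is an assumption: build_config() and check_config() are hypothetical helpers that are not part of Dataset_Repository_Search.py, all values are placeholders, and the actual script may expect the search terms in a different form (for example a list rather than a string).

```python
ROR_PREFIX = "https://ror.org/"


def build_config():
    """Return placeholder settings named after the variables edited in main()."""
    return {
        "ror": ROR_PREFIX + "your-ror-id",       # placeholder, not a real ROR ID
        "osfID": "",                              # left blank: no OSF account assumed
        "osf_ACCESS_TOKEN": "",                   # left blank for the same reason
        "zenodoSearchTerms": "Example University",    # placeholder search terms
        "pangaeaSearchTerms": "Example University",
        "figshareSearchTerms": "Example University",
        "GBIFSearchTerms": "Example University",
        "zenodo_ACCESS_TOKEN": "your-zenodo-token",   # placeholder token
        "fileIdentifier": "ExampleUniversity",        # used in output file names
    }


def check_config(config):
    """Return a list of human-readable problems found in the settings."""
    problems = []
    if not config["ror"].startswith(ROR_PREFIX):
        problems.append("ror should be the full ROR URL, not the bare ID")
    if not config["fileIdentifier"].strip():
        problems.append("fileIdentifier should be set to the institution name")
    if bool(config["osfID"]) != bool(config["osf_ACCESS_TOKEN"]):
        problems.append("osfID and osf_ACCESS_TOKEN should both be set or both be blank")
    return problems


if __name__ == "__main__":
    for problem in check_config(build_config()):
        print("Configuration problem:", problem)
```

Running the check before starting a search may catch a bare ROR ID or a half-filled OSF section early, before any requests are sent to the repositories.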

The code outputs one CSV file per data repository, each containing a list of DOIs. The lists from the individual repositories can then be analyzed and combined to remove duplicates. All dependencies are listed in the requirements.txt file. They can be installed by running the following command from the working directory (for example, in a Windows command prompt): pip install -r requirements.txt
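The combining step is left to the user. A minimal sketch of one way to do it is shown below, under stated assumptions: each output file is assumed to have a header row with a column named DOI (the real column name may differ and can be passed via the column argument), and normalize_doi() and combine_doi_files() are hypothetical helpers, not functions provided by this repository. DOIs are compared case-insensitively and without a leading resolver prefix, so that the same record found in two repositories counts once.

```python
import csv

# Prefixes that are stripped so the same DOI written in different forms matches.
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def normalize_doi(value):
    """Lowercase a DOI and strip a leading resolver prefix, if any."""
    doi = value.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi


def combine_doi_files(paths, column="DOI"):
    """Return unique DOIs from several CSV files, in first-seen order."""
    seen = set()
    combined = []
    for path in paths:
        with open(path, newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                doi = normalize_doi(row.get(column) or "")
                if doi and doi not in seen:
                    seen.add(doi)
                    combined.append(doi)
    return combined
```

The function would be called with the paths of the CSV files produced by a run, for example combine_doi_files(["zenodo.csv", "figshare.csv"]), where the file names here are placeholders rather than the names the script actually writes.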

The Swagger client must be unzipped into the same directory as the Python script.
