Allow researchers and policy makers to see how the presence and quality of links to data and software in publications are changing over time so that they can identify emergent behaviour.
This project started at the Springer Nature Hackday in November 2017 and continued at the Collaborations Workshop Hackday in March 2018.
The goal is to analyse a corpus of papers for citation links into repositories which may hold research data and/or software.
We are highly motivated to support the much-needed culture change regarding the recognition of open data and code sharing in the scientific community. Measuring how many papers in the academic literature cite their code and data is an important way to show evidence of the increasing recognition of research software and its developers.
The code-cite counter searches a corpus of literature (e.g. Europe PubMed Central) for particular terms (such as `github.com`, `doi.org/10.5281/zenodo` or `doi.org/10.6084/m9.figshare`) and shows how their prevalence is changing over time.
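As a rough illustration of the counting step, the sketch below tallies how many papers per year mention each search term. It is not the project's actual implementation: the record layout (`year` and `text` keys), the function name `count_mentions_by_year` and the sample records are all assumptions made for this example, and a real run would read records from a literature corpus instead of a hand-made list.

```python
from collections import Counter

# Search terms quoted in the text above.
TERMS = ("github.com", "doi.org/10.5281/zenodo", "doi.org/10.6084/m9.figshare")


def count_mentions_by_year(papers, terms=TERMS):
    """Return {term: Counter({year: n_papers})}.

    A paper counts once per term, however many times the term appears in it.
    """
    counts = {term: Counter() for term in terms}
    for paper in papers:
        text = paper["text"].lower()  # match case-insensitively
        for term in terms:
            if term in text:
                counts[term][paper["year"]] += 1
    return counts


# Hypothetical, hand-made records standing in for a real corpus.
SAMPLE_PAPERS = [
    {"year": 2016, "text": "Code is at https://github.com/example/repo."},
    {"year": 2017, "text": "Data: https://doi.org/10.5281/zenodo.0000000"},
    {"year": 2017, "text": "See GitHub.com/example/other and github.com/example/repo."},
    {"year": 2017, "text": "No links here."},
]

if __name__ == "__main__":
    for term, by_year in count_mentions_by_year(SAMPLE_PAPERS).items():
        print(term, dict(sorted(by_year.items())))
```

Counting papers rather than raw occurrences keeps a single paper with many links from dominating the yearly totals; dividing by the number of papers published each year would turn these counts into a prevalence.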
We also provide measures of stability and quality for this code by resolving links found in papers and evaluating metadata such as the existence of a `README` or `LICENSE` file.
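The metadata check can be sketched offline as two small steps: pull the owner and repository name out of a link, then look for a `README` or `LICENSE` in a list of top-level file names. The helper names (`parse_github_repo`, `has_file`, `quality_flags`) are hypothetical, and the file list is assumed to have been fetched already by whatever means the project uses; this example makes no network calls.

```python
import re

# Owner and repository segments of a GitHub link.
GITHUB_LINK = re.compile(
    r"github\.com/([A-Za-z0-9_.-]+)/([A-Za-z0-9_.-]+)", re.IGNORECASE
)


def parse_github_repo(link):
    """Return (owner, repo) from a GitHub link, or None if it does not match."""
    match = GITHUB_LINK.search(link)
    if match is None:
        return None
    owner, repo = match.group(1), match.group(2)
    # Drop sentence punctuation picked up from running text, then a ".git" suffix.
    repo = repo.rstrip(".")
    if repo.endswith(".git"):
        repo = repo[:-4]
    return owner, repo


def has_file(filenames, stem):
    """True if any file is named `stem`, with or without an extension."""
    stem = stem.lower()
    return any(name.lower().split(".")[0] == stem for name in filenames)


def quality_flags(filenames):
    """Summarise which basic metadata files are present."""
    return {
        "readme": has_file(filenames, "readme"),
        "license": has_file(filenames, "license"),
    }


if __name__ == "__main__":
    print(parse_github_repo("Code: https://github.com/example/repo."))
    print(quality_flags(["README.md", "setup.py"]))
```

This sketch only recognises the spelling `LICENSE`; variants such as `LICENCE` or `COPYING` would need extra stems.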
Finally, we provide (the beginnings of) a web interface so that users can run their own search queries against the published literature and against specific journals of interest. You can see the source code in its (separate) GitHub repository.
We would love for you to join us on this journey!
Check out the contributing guidelines or our list of issues to see how you can help.
Thank you to everyone who has contributed so far!
| Andrew Walker 💻 🤔 | Robin Long 💻 🤔 | Naomi Penfold 💻 📖 🤔 | Neil Chue Hong 💻 🤔 📢 | Martin O'Reilly 💻 🤔 |
|---|---|---|---|---|
| Alexander Struck 📖 🤔 | Matthew Upson 💻 🤔 🎨 | Isla Staden 💻 🤔 💬 | Kirstie Whitaker 📖 🤔 📢 | Shoaib Sufi 📖 🤔 📢 |
This project follows the all-contributors specification and this emoji key explains the different contributions. The order was determined by reverse numerical order of authors' ORCIDs.
The project was inspired by Yo Yehudi's Code is Science project and seeks to complement the work by that community by providing some numbers associated with the prevalence of code citations in the published literature.
Our efforts also complement published work by Park et al. (2017, doi:10.1007/s11192-017-2240-2).
The Code and Data Citation Counter is licensed under an MIT license and archived on Zenodo.
If you would like to cite the concept, please use this DOI: 10.5281/zenodo.1209095.
If you would like to reference a specific version of the software, please use the DOI associated with that version. The DOI is available in the release notes, and can also be found at the link above. The most recent release is version 0.2, which has DOI: 10.5281/zenodo.1209311.
There are also two files containing reference information (in `cff` and `codemeta` formats) within the repository, which should contain all the information you need to cite this repository.
Some scripts may require secrets that you don't want stored in this public GitHub repository (e.g. web service API keys). You can create a "secrets" folder in the top level of this repository to store these. This "secrets" folder and all its contents will be ignored by Git.
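One way a script might read a key from that folder is sketched below. The one-file-per-secret layout, the file name `api_key.txt` and the helper `load_secret` are assumptions made for illustration; the repository's own scripts may organise their secrets differently.

```python
import tempfile
from pathlib import Path


def load_secret(name, secrets_dir="secrets"):
    """Read secrets_dir/<name> and return its contents without surrounding whitespace."""
    path = Path(secrets_dir) / name
    if not path.is_file():
        raise FileNotFoundError(f"Expected a secret at {path}; create it locally.")
    return path.read_text(encoding="utf-8").strip()


if __name__ == "__main__":
    # Demonstrate with a throwaway folder so no real key is needed.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "api_key.txt").write_text("not-a-real-key\n", encoding="utf-8")
        print(load_secret("api_key.txt", secrets_dir=tmp))
```

Failing loudly when the file is missing makes it clear to a new contributor that the key has to be created locally rather than pulled from the repository.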