This repository contains the slide decks of my talks at the Technical University of Munich (24.10.22) and our text-as-data reading group (TADA.cool) (25.10.22). I gave a brief overview of the NLP development pipeline (📚 datasets -> 🤖 models -> 📊 evaluation) and presented several templates for standardizing basic steps in NLP research for transparency and accountability:
- The Model Openness Framework (White et al. 2024)
- Risk Cards (Derczynski et al. 2023)
- Data Statements (Bender/Friedman 2018)
- Datasheets (Gebru et al. 2021) --> possible extensions: Heger et al. (2022)
- Responsible Data Use Checklist (Rogers/Baldwin/Leins 2021)
- Model Cards (Mitchell et al. 2019) --> interactive model cards (Crisan et al. 2022)
- Experimental Results Checklist (Dodge et al. 2019)
- Benchmark Checklist (Reimers 2022)
- Benchmark Checklist for Reviewers (Dehghani et al. 2021, based on Gebru et al. 2021)
- Framework for Algorithmic Auditing (Raji et al. 2020)
- Dataset Development Lifecycle (Hutchinson et al. 2020)
- CheckList (Ribeiro et al. 2020)
- Preregistering NLP research (Van Miltenburg et al. 2021)
- Ethical Guidelines (Pistilli et al. 2023)
- Self-contained Artifacts (Arvan et al. 2022)