Official repository for the paper: "Probing LLMs for Joint Encoding of Linguistic Categories." Findings of EMNLP 2023.
https://arxiv.org/abs/2310.18696
Details such as Python and package versions can be found in the generated pyproject.toml and poetry.lock files.
We recommend using an environment manager such as conda. After setting up your environment with the correct Python version, install the required packages from the provided requirements.txt file:
pip install -r requirements.txt
This requirements.txt file is generated by running:
sh gen_pip_reqs.sh
.
├── data/ # Where data is kept
├── experiments/ # arrays of images
├── images/ # more individual images
├── lisa/ # SLURM jobs and configs
├── infoshare/
│ ├── datamodules/ # handle data loading, processing
│ ├── models/ # Model implementations
│ ├── run/
│ │ ├── test.py # run testing
│ │ ├── test_xlingual.py # run testing across languages
│ │ └── train.py # run training
│ ├── __init__.py
│ └── utils.py # general utils
├── notebooks/ # see notebooks/README.md
├── reports/ # LaTeX and more
├── README.md # you are here
├── lswsd_lemmas.txt # lemmas used for LSWSD
├── poetry.lock # dependencies metadata
├── pyproject.toml # project metadata
├── gen_pip_reqs.sh # script for generating requirements.txt
└── requirements.txt # required packages for pip
The above was generated with
tree . -L 3 --dirsfirst -I "*.eps|*.png|*.pdf|lightning_logs|*pycache*|backup"
followed by some manual edits.
If you use this code or find our work otherwise useful, please consider citing our paper:
@inproceedings{starace2023probing,
title={Probing LLMs for Joint Encoding of Linguistic Categories},
author={Starace, Giulio and Papakostas, Konstantinos and Choenni, Rochelle and Panagiotopoulos, Apostolos and Rosati, Matteo and Leidinger, Alina and Shutova, Ekaterina},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
pages={7158--7179},
year={2023}
}