Probing LLMs for Joint Encoding of Linguistic Categories

Official repository for the paper "Probing LLMs for Joint Encoding of Linguistic Categories", published in Findings of EMNLP 2023.

Requirements and Setup

The exact Python and package versions are recorded in the generated pyproject.toml and poetry.lock files.

We recommend using an environment manager such as conda. After setting up an environment with the correct Python version, install the required packages from the provided requirements.txt file:

pip install -r requirements.txt
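The setup steps above can be sketched as two small shell helpers. This is a minimal sketch under stated assumptions, not a script shipped with the repository: the function names `python_constraint` and `setup_env` and the environment name `infoshare-env` are hypothetical, and the first helper assumes pyproject.toml declares the interpreter in the Poetry style (a line of the form `python = "..."`).

```shell
# Hypothetical helper: print the Python constraint line from a Poetry-style
# pyproject.toml. Assumes a line of the form: python = "..."
# $1 = path to the file (defaults to ./pyproject.toml)
python_constraint() {
  grep -E '^python[[:space:]]*=' "${1:-pyproject.toml}" | head -n 1
}

# Hypothetical helper: create a conda environment and install the pinned packages.
# $1 = a Python version that satisfies the constraint printed above
# "infoshare-env" is an illustrative environment name, not one the repository defines.
setup_env() {
  conda create -y -n infoshare-env "python=$1" &&
    conda run -n infoshare-env pip install -r requirements.txt
}
```

From the repository root, one would run `python_constraint` first, then call `setup_env` with a version that satisfies the printed constraint. Using `conda run` avoids having to initialise `conda activate` inside a non-interactive shell.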

The requirements.txt file is generated by running the following:

sh gen_pip_reqs.sh

Repository contents

.
├── data/                            # Where data is kept
├── experiments/                     # arrays of images
├── images/                          # more individual images
├── lisa/                            # SLURM jobs and configs
├── infoshare/
│   ├── datamodules/                 # handle data loading, processing
│   ├── models/                      # Model implementations
│   ├── run/
│   │   ├── test.py                  # run testing
│   │   ├── test_xlingual.py         # run testing across languages
│   │   └── train.py                 # run training
│   ├── __init__.py
│   └── utils.py                     # general utils
├── notebooks/                       # see notebooks/README.md
├── reports/                         # LaTeX and more
├── README.md                        # you are here
├── lswsd_lemmas.txt                 # lemmas used for LSWSD
├── poetry.lock                      # dependencies metadata
├── pyproject.toml                   # project metadata
├── gen_pip_reqs.sh                  # script for generating requirements.txt
└── requirements.txt                 # required packages for pip

The above was generated with

tree . -L 3 --dirsfirst -I "*.eps|*.png|*.pdf|lightning_logs|*pycache*|backup"

followed by some manual edits.

Citation

If you use this code or find our work otherwise useful, please consider citing our paper:

@inproceedings{starace2023probing,
  title={Probing LLMs for Joint Encoding of Linguistic Categories},
  author={Starace, Giulio and Papakostas, Konstantinos and Choenni, Rochelle and Panagiotopoulos, Apostolos and Rosati, Matteo and Leidinger, Alina and Shutova, Ekaterina},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
  pages={7158--7179},
  year={2023}
}
