KihoPark/LLM_Categorical_Hierarchical_Representations

Categorical_Hierarchical

In our paper, we show that large language models represent categorical concepts as polytopes and hierarchical relations as orthogonality.

We confirm the theory on Gemma-2B and LLaMA-3-8B representations, and this repository provides the code for those experiments.
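As a minimal synthetic sketch of what "hierarchy as orthogonality" means (this runs on fake data and uses a plain mean-difference estimator, not the paper's estimator on whitened unembedding vectors), one can check that a parent-level feature direction is nearly orthogonal to the contrast between its children:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 128

# Hypothetical ground-truth directions: a parent feature ("animal") and a
# child contrast ("mammal vs. bird"), constructed orthogonal by design.
animal_dir = np.zeros(d); animal_dir[0] = 1.0
mammal_vs_bird = np.zeros(d); mammal_vs_bird[1] = 1.0

# Fake "word representations" for three categories, with Gaussian noise.
mammals = animal_dir + mammal_vs_bird + 0.1 * rng.normal(size=(30, d))
birds   = animal_dir - mammal_vs_bird + 0.1 * rng.normal(size=(30, d))
plants  = -animal_dir + 0.1 * rng.normal(size=(30, d))

# Estimate the parent feature and the child contrast from sample means.
animals = np.vstack([mammals, birds])
animal_hat = animals.mean(0) - plants.mean(0)
contrast_hat = mammals.mean(0) - birds.mean(0)

# Cosine similarity between the two estimated directions: near 0, i.e.
# the hierarchical relation shows up as (approximate) orthogonality.
cos = animal_hat @ contrast_hat / (
    np.linalg.norm(animal_hat) * np.linalg.norm(contrast_hat)
)
print(round(abs(cos), 3))
```

On real model representations the paper measures this orthogonality under the causal inner product rather than the raw Euclidean one (see `07_eval_gamma.ipynb` for the contrast).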

Data

animals.json and plants.json are sets of words generated by ChatGPT-4.

The WordNet hierarchy data for nouns and verbs are generated by get_wordnet_hypernym_gemma.ipynb and get_wordnet_hypernym_llama.ipynb.

Requirements

Running the code requires the Python packages transformers, networkx, scikit-learn, nltk, inflect, torch, numpy, seaborn, matplotlib, and tqdm (json ships with the standard library). A GPU is also helpful for running the experiments efficiently.
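For example, the installable dependencies can be set up with pip (the exact pinned versions are not specified in this repo):

```shell
pip install transformers networkx scikit-learn nltk inflect torch numpy seaborn matplotlib tqdm
```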

Experiments

  • 01_eval_noun.ipynb: We validate the existence of the vector representations for each feature in the WordNet noun hierarchy in Figure 3, examine the relationships between them in Figure 4, and evaluate the hierarchical orthogonality in Figures 5 and 10 (plus the evaluation of the mean vector as an estimator in Figure 11).
  • 02_eval_verb.ipynb: The same analysis for the WordNet verb hierarchy in Figures 13, 14, 15, and 16.
  • 03_eval_llama.ipynb: The same analysis for the WordNet noun hierarchy with the LLaMA-3 model in Figures 17, 18, 19, and 20.
  • 04_intervention.ipynb: We validate Definition 3 in Table 1.
  • 05_visualization.ipynb: We display the 2D plots in Figure 2 and the 3D plots in Figure 6.
  • 06_subgraph.ipynb: We show the zoomed-in tree and heatmaps in Figures 8 and 9.
  • 07_eval_gamma.ipynb: We show that the hierarchy is not encoded as orthogonality when the naive Euclidean inner product is used instead, in Figure 12.
