Large Language Models based on historical text could offer informative tools for behavioral science

by Michael E. W. Varnum, Nicolas Baumard, Mohammad Atari, Kurt Gray https://www.pnas.org/doi/10.1073/pnas.2407639121

Introduction

Historical Large Language Models (HLLMs)

Background:

Traditional behavioral science focuses on present-day studies
Limitation: no direct access to psychological data from past populations
Criticized as too parochial, with calls to expand beyond WEIRD (Western, Educated, Industrialized, Rich, Democratic) societies
Need to incorporate diverse historical data to build generalizable theories

Introducing HLLMs:

Generative models trained on historical corpora
Simulate responses of populations no longer living
Offer opportunities to gather historical psychological data

Advantages of HLLMs:

Create new opportunities in behavioral science research
Expand understanding of human nature beyond present-day societies
Provide insights into various historical attitudes and behaviors

Examples:

Comparing cooperative tendencies: Vikings vs Ancient Romans vs Early Modern Japanese
Exploring gender roles: Ancient Persians vs Medieval Europeans
Addressing questions that are otherwise hard to answer because the populations can no longer be surveyed

Benefits:

Escape from the temporal trap of studying only the present in behavioral science research
Diversify data beyond present-day participants
Enhance understanding of historical societies and their psychological aspects.

Timely Tool

Large Language Models (LLMs)

Massive neural networks trained on natural language data
Understand and generate natural language output
Predict probable words based on sequence of prior words
Fine-tuned through supervised learning for specific tasks
Transforming psychology and adjacent fields
- Simulate human responses across domains
- Replicate patterns in moral judgment, economic game behavior, cognitive biases, obedience experiments
- Used as substitutes for human subjects with caution
Limitations:
- Reflect the cultures on which they are trained
- Limited cross-cultural generalizability due to WEIRD sampling bias
- In principle, LLMs trained on different corpora could reflect the psychology of different cultural groups.
Previous techniques for inferring psychological tendencies from text data:
- Google Ngrams, newspapers, movie dialogue, etc.
- Limited to indirect proxies for psychological traits and tendencies
Proposed use of HLLMs (Historical Large Language Models) to venture beyond existing techniques.
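
To make the contrast with earlier text-based techniques concrete, here is a minimal sketch of an Ngram-style proxy: the relative frequency of one word per decade. The function name `relative_frequency_by_decade`, the whitespace tokenization, and the toy documents are all assumptions made for illustration; none of them come from the paper or from Google Ngrams itself.

```python
from collections import defaultdict

def relative_frequency_by_decade(documents, target):
    """Return {decade: share of tokens equal to `target`}.

    `documents` is an iterable of (year, text) pairs. Tokenization is a
    plain lowercase whitespace split, which is a simplifying assumption.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for year, text in documents:
        decade = (year // 10) * 10
        tokens = text.lower().split()
        totals[decade] += len(tokens)
        hits[decade] += tokens.count(target.lower())
    return {d: hits[d] / totals[d] for d in sorted(totals) if totals[d]}

# Toy, made-up documents; not real historical data.
toy_docs = [
    (1801, "duty and honour bind the family"),
    (1805, "duty to the parish comes first"),
    (1902, "each person may choose a path"),
    (1907, "duty matters less than choice"),
]
freq = relative_frequency_by_decade(toy_docs, "duty")
for decade, share in freq.items():
    print(decade, round(share, 3))
```

A frequency curve like this only shows how often a word appears. It cannot answer a questionnaire item or play an economic game, which is the gap HLLMs are proposed to fill.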

Careful Training

Historical Large Language Models (HLLMs)

Simulate responses of diverse past societies using modern psychological instruments and behavioral measures
Enable measurement of historical populations' responses without direct access to living individuals
Based on large corpora of historical text, including fiction, diaries, letters, scholarly texts
Reproduce psychological responses of historical populations (13)
Insight into thinking of populations no longer living
Study trends in psychological tendencies over longer time spans
Test historical generalizability of contemporary psychological phenomena
Complement qualitative approaches used by historians
Encouraging first steps for simulating historical samples for research (MonadGPT, XunziALLM)
Unclear how accurately these models reflect the true underlying mindset of past populations.

Building a Historical Large Language Model:

Acquire sizable amount of historical text from a specific time period
Convert text to machine-readable format
Encode the text as numeric vectors and feed them to a neural network architecture
Train the network to generate probability distributions over the next word
The trained network is an LLM specific to that period
Use chat interface to simulate participants and run psychological experiments.
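
The steps above can be sketched end to end with a toy stand-in. This is only an illustration under strong assumptions: a bigram count table replaces the neural network, the three-line corpus is invented, and the helper names (`tokenize`, `train_bigram`, `next_word_distribution`, `generate`) are hypothetical. A real HLLM would use a large neural architecture and a far larger corpus, and the chat-interface step is not covered here.

```python
import random
from collections import Counter, defaultdict

def tokenize(text):
    # Step 2 (machine-readable text): lowercase whitespace tokens.
    return text.lower().split()

def train_bigram(corpus):
    # Stand-in for steps 3-5: count which word follows which.
    counts = defaultdict(Counter)
    for line in corpus:
        tokens = ["<s>"] + tokenize(line) + ["</s>"]
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_distribution(counts, prev):
    # Step 4: probability distribution over the next word given `prev`.
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

def generate(counts, rng, max_len=12):
    # Sample one word at a time until the end marker or max_len words.
    word, out = "<s>", []
    for _ in range(max_len):
        dist = next_word_distribution(counts, word)
        word = rng.choices(list(dist), weights=list(dist.values()))[0]
        if word == "</s>":
            break
        out.append(word)
    return " ".join(out)

# Step 1: an invented toy corpus standing in for period text.
toy_corpus = [
    "the harvest was good this year",
    "the harvest was poor this year",
    "the king was good to the town",
]
model = train_bigram(toy_corpus)
dist = next_word_distribution(model, "was")
sample = generate(model, random.Random(0))
print(dist)
print(sample)
```

The point of the sketch is the shape of the pipeline: whatever the model says can only recombine what the period corpus contains, which is why corpus selection matters so much in the caveats that follow.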

Challenges and Caveats

Challenges and Caveats for Historical Large Language Models (HLLMs)

Acquiring Sufficient Training Data:

Smaller training corpora compared to current LLMs
Historical texts may not be representative of the population as a whole
Elites are overrepresented in historical text, potentially skewing results
Validation through other archival sources and traditional approaches needed
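
A small worked example shows how elite overrepresentation can skew an estimate. Every number below is invented for illustration, and the group labels, shares, and the `weighted_mean` helper are hypothetical. The sketch also assumes a separate estimate for non-elites is available, which is often exactly what historical corpora lack.

```python
def weighted_mean(group_means, weights):
    """Average of group means, weighted by each group's share."""
    total = sum(weights.values())
    return sum(group_means[g] * weights[g] for g in group_means) / total

# All values are made up; e.g. scores on a 1-7 attitude scale.
simulated_means = {"elite": 5.0, "non_elite": 3.0}
corpus_share = {"elite": 0.8, "non_elite": 0.2}      # who wrote the texts
population_share = {"elite": 0.1, "non_elite": 0.9}  # who actually lived

naive = weighted_mean(simulated_means, corpus_share)
adjusted = weighted_mean(simulated_means, population_share)
print(round(naive, 2), round(adjusted, 2))
```

With these made-up inputs, the corpus-weighted average sits well above the population-weighted one, because the group that wrote most of the text also scores higher in this example. The direction and size of any real bias would have to be established from other archival sources.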

Benchmarking:

Limited availability of benchmarks for HLLMs, because no human response data survive from the periods being modeled
Use of historical psychology, ethnographic data, and archaeological data to assess accuracy
Experts could fine-tune or adjust models using socioeconomic status effects observed in modern populations, to help offset elite overrepresentation

Generalizing from Historical Text:

Substantial challenges in creating representative samples of past populations
Importance of validating results through multiple sources and approaches

Potential Solutions and Future Directions:

Combining HLLMs with other research methods and approaches
Continued development of computational tools for working with historical data
Increased availability of larger, more diverse historical text databases.
