This is a first attempt at building a MoE language model trained only on Greek text. The implementation uses the Apple MLX framework.
The data used for training and validation is the Greek version of Wikipedia, consisting of the text content of 227k Wikipedia URLs.
Two transformer models were developed (a rough sketch of the two feed-forward variants follows this list):
- A MoE transformer with 10.5M parameters.
- A dense transformer with 10.5M parameters.
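To illustrate how the two variants differ, here is a minimal MLX sketch of a dense feed-forward block and a top-1 routed MoE block. The class names, layer sizes, and the naive run-all-experts dispatch are assumptions made for clarity, not the repo's actual code.

```python
import mlx.core as mx
import mlx.nn as nn


class DenseFFN(nn.Module):
    """Standard transformer feed-forward block."""

    def __init__(self, dims: int, hidden: int):
        super().__init__()
        self.up = nn.Linear(dims, hidden)
        self.down = nn.Linear(hidden, dims)

    def __call__(self, x):
        return self.down(nn.gelu(self.up(x)))


class MoEFFN(nn.Module):
    """Feed-forward block with several experts and hard top-1 routing."""

    def __init__(self, dims: int, hidden: int, num_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(dims, num_experts)
        self.experts = [DenseFFN(dims, hidden) for _ in range(num_experts)]

    def __call__(self, x):
        gates = mx.softmax(self.router(x), axis=-1)   # (..., num_experts)
        choice = mx.argmax(gates, axis=-1)            # expert index per token
        out = mx.zeros_like(x)
        # Naive dispatch: evaluate every expert and mask out unselected tokens.
        # Fine for a ~10M-parameter model; real implementations only run the
        # selected expert per token, which is where any speed-up would come from.
        for i, expert in enumerate(self.experts):
            mask = mx.expand_dims(choice == i, -1).astype(x.dtype)
            out = out + mask * gates[..., i : i + 1] * expert(x)
        return out
```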
Some more sophisticated n-gram models were also created and trained for comparison purposes (a minimal sketch follows the list):
- 2-gram up to 5-gram.
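For context, a count-based n-gram language model can be as small as the sketch below. This is a generic add-k-smoothed version written purely for illustration; the baselines in this repo may use different smoothing or backoff.

```python
from collections import Counter, defaultdict
import math


class NGramLM:
    """Count-based n-gram LM with add-k smoothing (illustrative only)."""

    def __init__(self, n: int, k: float = 0.01):
        self.n, self.k = n, k
        self.context_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, token_ids):
        self.vocab.update(token_ids)
        for i in range(len(token_ids) - self.n + 1):
            *context, target = token_ids[i : i + self.n]
            self.context_counts[tuple(context)][target] += 1

    def log_prob(self, context, target):
        counts = self.context_counts[tuple(context[-(self.n - 1):])]
        num = counts[target] + self.k
        den = sum(counts.values()) + self.k * len(self.vocab)
        return math.log(num / den)

    def perplexity(self, token_ids):
        logps = [
            self.log_prob(token_ids[max(0, i - self.n + 1):i], token_ids[i])
            for i in range(self.n - 1, len(token_ids))
        ]
        return math.exp(-sum(logps) / len(logps))
```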
As far as the tokenizer is concerned, the GPT-2 tokenizer from Hugging Face was used and retrained on our dataset, with the vocabulary size set to 5,000 tokens. No particular focus was given to the tokenizer training, but it is an essential part of language modeling and as important as the model itself.
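A retraining along these lines can be done with the Hugging Face `train_new_from_iterator` API. The corpus path and batching below are hypothetical; only the vocabulary size of 5,000 matches what was actually used.

```python
from transformers import AutoTokenizer

base = AutoTokenizer.from_pretrained("gpt2")


def corpus_iterator(path="data/el_wiki.txt", batch_size=1000):
    # Yield batches of raw Greek text lines (hypothetical file layout).
    with open(path, encoding="utf-8") as f:
        batch = []
        for line in f:
            batch.append(line)
            if len(batch) == batch_size:
                yield batch
                batch = []
        if batch:
            yield batch


# Learn a new 5,000-token BPE vocabulary on the Greek corpus, keeping the
# GPT-2 pre-tokenization and special-token setup.
greek_tokenizer = base.train_new_from_iterator(corpus_iterator(), vocab_size=5000)
greek_tokenizer.save_pretrained("tokenizer-el-5k")
```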
Results can be found in the benchmarks/results directory. The MoE transformer did not turn out to be superior to the dense one, which was expected, but it was not faster either, contrary to what is commonly claimed.
| Model | CE | PPL | Inference time (800 tokens) |
|---|---|---|---|
| MoE | 3.629 | 37.659 | ~47 seconds |
| Dense | 3.616 | 37.184 | ~32 seconds |
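For reference, the perplexity column is simply the exponential of the cross-entropy (measured in nats), which the numbers above are consistent with:

```python
import math

# PPL = exp(CE) when cross-entropy is measured in nats.
print(math.exp(3.629))  # ~37.7  (MoE)
print(math.exp(3.616))  # ~37.2  (Dense)
```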
The speed advantage of MoE architectures most probably comes from the networking tricks applied when training a model on GPU clusters, rather than from anything that helps on a single GPU.