PalestineLLM

PalestineLLM is a pro-Palestine project that uses language models to help people understand and explore issues related to Palestine. This repository contains datasets, text documents, and tools for training language models to perform better on Palestinian topics.

Dataset

The project includes various text documents and structured data related to Palestinian topics. The datasets are organized into different directories, such as decolonizepalestine.

Key Files and Folders

datasets: Contains text documents and QA pairs organized by source.
fine-tune: Includes materials for fine-tuning the language model.
concatenated_qr.jsonl: A JSON Lines file that consolidates question-answer pairs.
prompt_generator.js: A script for generating prompts.
QA.jsonl: Structured QA pairs extracted from various sources for training and evaluation purposes.
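
A JSON Lines file stores one JSON object per line, so the QA files listed above can be read line by line. The following is a minimal sketch of such a reader in Node.js, not code from this repository. The function name parseJsonl and the "question" and "answer" keys in the sample are assumptions for illustration; check the actual files for the real schema.

```javascript
// Parse JSON Lines text: one JSON object per non-empty line.
function parseJsonl(text) {
  const records = [];
  const lines = text.split(/\r?\n/);
  lines.forEach((line, index) => {
    const trimmed = line.trim();
    if (trimmed === "") return; // skip blank lines
    try {
      records.push(JSON.parse(trimmed));
    } catch (err) {
      // Report the 1-based line number of the offending line.
      throw new Error(`Invalid JSON on line ${index + 1}: ${err.message}`);
    }
  });
  return records;
}

// Hypothetical sample; the "question"/"answer" keys are assumed, not taken from the repo.
const sample =
  '{"question":"Q1","answer":"A1"}\n\n{"question":"Q2","answer":"A2"}\n';
const records = parseJsonl(sample);
console.log(records.length); // prints 2
```

Failing on the first malformed line, with its line number, makes it easier to spot a broken entry before the file is used for training.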

Installation

To set up the project, follow these steps:

1. Clone the repository:

git clone https://github.com/mlibre/PalestineLLM.git
cd PalestineLLM

2. Run Ollama_Unsloth on Google Colab.
3. Download and share the model.

Usage

To build the consolidated QA dataset, run:

node qa_contacter.js

This script will create a consolidated file (final_QA_dataset.jsonl) in the fine-tune directory.
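
As a rough illustration of what a consolidation step like this could look like, here is a sketch under stated assumptions. It is not the contents of qa_contacter.js: the function names findJsonlFiles and concatJsonl are hypothetical, and it assumes the inputs are .jsonl files somewhere under a single directory.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively collect every .jsonl file under a directory (hypothetical helper).
function findJsonlFiles(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...findJsonlFiles(fullPath));
    } else if (entry.isFile() && entry.name.endsWith(".jsonl")) {
      found.push(fullPath);
    }
  }
  return found.sort(); // stable order so repeated runs give the same output
}

// Merge all non-empty lines into one output file and return the line count.
function concatJsonl(inputDir, outputFile) {
  const lines = [];
  for (const file of findJsonlFiles(inputDir)) {
    for (const line of fs.readFileSync(file, "utf8").split(/\r?\n/)) {
      const trimmed = line.trim();
      if (trimmed === "") continue; // skip blank lines
      JSON.parse(trimmed); // throws early if a line is not valid JSON
      lines.push(trimmed);
    }
  }
  fs.mkdirSync(path.dirname(outputFile), { recursive: true });
  fs.writeFileSync(outputFile, lines.length ? lines.join("\n") + "\n" : "");
  return lines.length;
}
```

With these assumptions, a call such as concatJsonl("datasets", path.join("fine-tune", "final_QA_dataset.jsonl")) would gather every QA line under datasets into a single file. The output path should stay outside the input directory so that a second run does not read its own output.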

To fine-tune the model, open the Jupyter notebook from inside the fine-tune directory:

jupyter notebook Ollama_Unsloth_Llama3_jsonl.ipynb

Follow the instructions in the notebook to fine-tune the model.

How You Can Help

We encourage contributions from the community to make PalestineLLM even more impactful. Here's how you can help:

Share Your OpenAI o1 Data: If you have access to OpenAI's o1 or other cutting-edge models, you can generate high-quality question-answer (QA) pairs on Palestinian topics and contribute them to this repository. Copy the contents of the .prompt file and paste them into your AI tool; it should generate the data as JSON Lines.
Create and Share Custom GPTs: Consider building custom GPTs focused on Palestinian issues using OpenAI's tools. You can share them in OpenAI's GPT Store and with the community to expand the availability of specialized LLMs.
Fine-tune and Share PalestineLLM: You can help improve PalestineLLM by fine-tuning it on additional datasets, especially those related to Palestine. You can also work with larger or better base models, as I was only able to fine-tune Llama 3.1 8B. Once fine-tuned, feel free to share your model on Hugging Face or other platforms to make it accessible to others.
Star this Repository and Share It: By starring this repository, you can help spread the word about this project and encourage others to contribute and use it.
Machine Learning and AI Research: If you are a researcher or student working on machine learning or AI projects, share your ideas and help make PalestineLLM better.
