
♟️ 500xCompressor: Generalized Prompt Compression for Large Language Models

Contents

🚀 News · ✏️ Todo · ✨ Introduction

📚 Corpus · 🤗 Dataset · 🤗 Models

💾 Download · 📌 Citation · 🔖 License

Links

Project Page · Demo Page · Paper

 

🚀 News

  • [2024.08.06] The paper was uploaded to arXiv.
 
 
 

✏️ Todo

  • Datasets and models have been uploaded to Hugging Face but are not yet open to the public.
 
 
 

✨ Introduction

500xCompressor is a prompt compression method that can compress up to 500 natural-language tokens into as few as one special token. The compressed token can regenerate the original text or be used for question answering (QA).

500xCompressor was first pretrained on the Arxiv Corpus, then fine-tuned on the ArxivQA dataset, and finally evaluated on a range of strictly unseen, classical QA datasets.

500xCompressor has several features and advantages:

  • Small parameter overhead: only 0.3% extra parameters are added to the LLM
  • Zero-shot usage: the compressed tokens can be used by the original LLM without fine-tuning it
  • High compression ratio: from 6x to 480x
  • Generalization: can compress any unseen text and be used on unseen datasets in downstream tasks
  • Non-selective: all tokens in the prompt are compressed, rather than a chosen subset
  • Retained capability: 62.26-72.89% of the LLM's performance is retained compared to using non-compressed prompts

This research yields several insights:

  • Not all compressed tokens are utilized equally
  • KV values have significant advantages over embeddings for preserving information at high compression ratios (see the arithmetic sketch after this list)
  • Natural language prompts are highly compressible
  • Fine-grained, complex information can also be compressed and retrieved exactly
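
As a back-of-the-envelope illustration of the KV-versus-embedding insight (the configuration numbers below are the public LLaMA-3-8B architecture, not figures from the paper):

# Capacity of one compressed token: KV cache vs. a single embedding.
# LLaMA-3-8B (public config): 32 layers, hidden size 4096,
# grouped-query attention with 8 KV heads of dimension 128.
n_layers, hidden_size = 32, 4096
n_kv_heads, head_dim = 8, 128

embedding_values = hidden_size                    # 4,096 values per token
kv_values = 2 * n_layers * n_kv_heads * head_dim  # 65,536 values per token

print(kv_values // embedding_values)  # 16 -> the KV cache stores 16x more values per token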

Here is an example:

[example figure omitted]

📚 Arxiv Corpus

This is a collection of arXiv abstracts:

  • Train: 2,353,924 items, based on arXiv abstracts published before 07/2023
  • Validation: 3,000 items, based on arXiv abstracts published 01-04/2024
  • Test: 2,500 items, based on arXiv abstracts published 01-04/2024
 
 
 

🤗 ArxivQA Dataset

This is an extractive QA dataset built from the abstracts of arXiv papers (a loading sketch follows the list):

  • Train: 250,000 items, based on arXiv abstracts published before 07/2023
  • Validation: 1,000 items, based on arXiv abstracts published before 07/2023
  • Test: 1,000 items, based on arXiv abstracts published 01-04/2024
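
A minimal loading sketch for both resources, assuming they are published as standard Hugging Face datasets; the repository ids below are placeholders, since the exact hub paths are not stated here:

# placeholder hub ids -- replace with the real paths once they are public
from datasets import load_dataset

corpus = load_dataset("ZongqianLi/arxiv-corpus")  # hypothetical id
arxivqa = load_dataset("ZongqianLi/arxivqa")      # hypothetical id

print(corpus["train"][0])   # one abstract
print(arxivqa["test"][0])   # one context/question/answer item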
 
 
 

🤗 500xCompressor

Quick start for 500xCompressor (ours):

Note: use the same environment as the one provided in /env; otherwise the model may output blank content.

# command
python 500xCompressor_demo.py
# Example: 96 tokens compressed into 4 tokens

# input
context = """We show that every reciprocity sheaf gives rise to a cycle (pre)module in the sense of Rost over a perfect field. Over a perfect field of positive characteristic, we show that the first cohomology group of a logarithmic de Rham-Witt sheaf has a partial cycle module structure. As a consequence, we show that Kato complexes of logarithmic de Rham-Witt sheaves satisfy functoriality properties similar to Rost's cycle complexes."""
question = "Over what type of field do we show that Kato complexes satisfy functoriality properties?"

# output
Regeneration:
Predicted text: We show that every reciprocity sheaf gives rise to a cycle (pre)module in the sense of Rost over a perfect field. Over a perfect field of positive characteristic, we show that the first cohomology group of a logarithmic de Rham-Witt cycle module has a partial cycle structure. As a consequence, we show that Kato modules of logarithmic de Rham-Witt complexes satisfy functorial properties similar to Rost's cycle complexes.
QA:
Predicted text: perfect fields of positive characteristic
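
For reference, a minimal sketch of the inference flow the demo script presumably implements: a LoRA-adapted encoder packs the context into the KV cache of a few special tokens, and the frozen base model generates from that cache. In the sketch below the base model stands in for the real encoder so it runs end to end; the model id and the cache-slicing step are assumptions, not the released code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tok = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# 1) Compress: in the real system, the LoRA-adapted encoder processes the
#    context plus k special tokens and keeps only those k KV-cache slots;
#    here the full-context cache stands in for it.
context_ids = tok(context, return_tensors="pt").input_ids  # `context` from above
kv = model(context_ids, use_cache=True).past_key_values

# 2) Decode: the frozen model answers conditioned on the cached keys/values.
question_ids = tok(question, return_tensors="pt").input_ids  # `question` from above
attn = torch.ones(1, context_ids.shape[1] + question_ids.shape[1], dtype=torch.long)
answer_ids = model.generate(question_ids, attention_mask=attn,
                            past_key_values=kv, max_new_tokens=32)
print(tok.decode(answer_ids[0][question_ids.shape[1]:], skip_special_tokens=True))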
 

Quick start for ICAE (baseline):

Note: use the same environment as the one provided in /env; otherwise the model may output blank content.

# command
python ICAE_demo.py
# Example: 96 tokens compressed into 4 tokens

# input
context = """We show that every reciprocity sheaf gives rise to a cycle (pre)module in the sense of Rost over a perfect field. Over a perfect field of positive characteristic, we show that the first cohomology group of a logarithmic de Rham-Witt sheaf has a partial cycle module structure. As a consequence, we show that Kato complexes of logarithmic de Rham-Witt sheaves satisfy functoriality properties similar to Rost's cycle complexes."""
question = "Over what type of field do we show that Kato complexes satisfy functoriality properties?"

# output
Regeneration:
Predicted text: We show that every sheaf reciprocity gives rise to a cycle (pre)module over a Rost cycle. In the perfect field case, we show that over a positive characteristic field, the first logarithmic de Rham cohomology group of a Witt log-Witt cycle has a partial decomposition. As a consequence, we show that Kato's cycle complexes satisfy functoriality properties similar to Rost cycle complexes.
QA:
Predicted text: a perfect field of characteristic zero
 

Training process:

The compression model was first pretrained on the Arxiv Corpus to regenerate the original text from the compressed tokens, and then fine-tuned on the ArxivQA Dataset to answer questions based on the compressed tokens. A sketch of the two objectives follows.
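
A compact sketch of the two objectives, assuming a standard teacher-forced cross-entropy setup (tensor names are hypothetical; the released training code may differ):

import torch.nn.functional as F

def regeneration_loss(model, compressed_kv, original_ids):
    # Stage 1 (pretraining): score the original text autoregressively,
    # conditioned only on the compressed tokens' KV cache. (The very first
    # token is predicted from the last compressed slot and is skipped here.)
    logits = model(input_ids=original_ids, past_key_values=compressed_kv).logits
    return F.cross_entropy(logits[:, :-1].flatten(0, 1),
                           original_ids[:, 1:].flatten())

def qa_loss(model, compressed_kv, qa_ids, labels):
    # Stage 2 (fine-tuning): same conditioning, but `labels` masks the
    # question tokens with -100 so only answer tokens contribute to the loss.
    logits = model(input_ids=qa_ids, past_key_values=compressed_kv).logits
    return F.cross_entropy(logits[:, :-1].flatten(0, 1),
                           labels[:, 1:].flatten())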

 

Performance:

The compression models were evaluated on a range of strictly unseen, classical QA benchmarks.

The detailed results for ArxivQA:

[detailed results omitted]

Here is an example:

[example figure omitted]

💾 Download

The models are LoRA parameters for fine-tuning LLaMA-3-8B-Instruct. "Regeneration" means the compression model was pretrained to regenerate the original text from the compressed tokens. "QA" means the compression model was fine-tuned for extractive QA based on the compressed tokens. "500->X" means 500 tokens of original text are compressed into X special tokens.
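
Once the checkpoints are public, loading should follow the standard peft pattern; the adapter id below is a placeholder illustrating the naming scheme:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
# hypothetical adapter id: pick regeneration vs. QA and a 500->X setting
model = PeftModel.from_pretrained(base, "ZongqianLi/500xCompressor-qa-500to1")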

 
 
 

📌 Citation

@misc{li2024500xcompressorgeneralizedpromptcompression,
      title={500xCompressor: Generalized Prompt Compression for Large Language Models}, 
      author={Zongqian Li and Yixuan Su and Nigel Collier},
      year={2024},
      eprint={2408.03094},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2408.03094}, 
}
 
 
 

🔖 License

This project is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0) - see the LICENSE file for details.
