Update hf weight conversion script to llama 3 #551

Merged
merged 3 commits into from
Jul 9, 2024

Conversation

dongwang218

What does this PR do?

The convert_hf_weights_to_llama.py script converts a Hugging Face Llama checkpoint to the consolidated Llama format. This PR updates the script so that it can also handle Llama 3.
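
For readers unfamiliar with the two layouts, the renaming half of such a conversion can be sketched as below. This is a minimal illustration, not the code in convert_hf_weights_to_llama.py: the helper name `hf_key_to_llama` and the two lookup tables are hypothetical, and the sketch deliberately ignores the other work a real conversion has to do, such as undoing the rotary permutation that the Hugging Face layout applies to wq/wk and splitting tensors into model-parallel shards.

```python
import re

# Per-layer parameter suffixes: Hugging Face name -> consolidated Llama name.
_LAYER_SUFFIXES = {
    "self_attn.q_proj.weight": "attention.wq.weight",
    "self_attn.k_proj.weight": "attention.wk.weight",
    "self_attn.v_proj.weight": "attention.wv.weight",
    "self_attn.o_proj.weight": "attention.wo.weight",
    "mlp.gate_proj.weight": "feed_forward.w1.weight",
    "mlp.down_proj.weight": "feed_forward.w2.weight",
    "mlp.up_proj.weight": "feed_forward.w3.weight",
    "input_layernorm.weight": "attention_norm.weight",
    "post_attention_layernorm.weight": "ffn_norm.weight",
}

# Parameters that live outside the transformer layers.
_TOP_LEVEL = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "model.norm.weight": "norm.weight",
    "lm_head.weight": "output.weight",
}

_LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.(.+)$")


def hf_key_to_llama(key: str) -> str:
    """Map one Hugging Face parameter name to its consolidated name.

    Hypothetical helper for illustration only; it renames keys and does
    not touch the tensor values.
    """
    if key in _TOP_LEVEL:
        return _TOP_LEVEL[key]
    match = _LAYER_RE.match(key)
    if match is None:
        raise KeyError(f"unrecognized parameter name: {key}")
    layer, suffix = match.groups()
    if suffix not in _LAYER_SUFFIXES:
        raise KeyError(f"unrecognized layer parameter: {key}")
    return f"layers.{layer}.{_LAYER_SUFFIXES[suffix]}"


if __name__ == "__main__":
    print(hf_key_to_llama("model.layers.0.self_attn.k_proj.weight"))
```

The consolidated names produced here are the same ones that show up in the comparison logs in the testing section, for example layers.0.attention.wk.weight.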

Feature/Issue validation/testing

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • [X] Convert Llama3 70B
mkdir test70B; cp ~/workspace/github/llama3/Meta-Llama-3-70B/params.json test70B/
python -m llama_recipes.tools.convert_hf_weights_to_llama --model-path meta-llama/Meta-Llama-3-70B-Instruct --output-dir test70B --model-size 70B

python compare_llama_weights.py ./test70B ~/workspace/github/llama3/Meta-Llama-3-70B-Instruct
Comparing shards: 100%|████████████████| 1/1 [00:43<00:00, 43.96s/it]
Top 10 largest deltas:
  shard 6 layers.54.attention_norm.weight: 0.005859375 vs 0.1533203125
  shard 1 layers.31.attention_norm.weight: 0.00390625 vs 0.22265625
  shard 1 layers.53.attention_norm.weight: 0.00390625 vs 0.197265625
  • [X] Convert Llama3 8B
mkdir test8B; cp ~/workspace/github/llama3/Meta-Llama-3-8B/params.json test8B/
python -m llama_recipes.tools.convert_hf_weights_to_llama --model-path meta-llama/Meta-Llama-3-8B-Instruct --output-dir test8B --model-size 8B
python compare_llama_weights.py ./test8B/ ~/workspace/github/llama3/Meta-Llama-3-8B-Instruct
Top 10 largest deltas:
  shard 0 layers.0.attention.wk.weight: 0.0 vs 0.014404296875
  shard 0 layers.0.attention.wo.weight: 0.0 vs 0.005645751953125
  • [X] Converted 70B inference test
torchrun --nproc_per_node 8 example_chat_completion.py --ckpt_dir ~/workspace/github/llama-recipes/src/llama_recipes/utils/hf_llama_conversion/test70B --tokenizer_path Meta-Llama-3-70B-Instruct/tokenizer.model --max_seq_len 512 --max_batch_size 6

==================================
User: I am going to Paris, what should I see?

Assistant: Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.

User: What is so great about #1?

> Assistant: The Eiffel Tower! It's an iconic symbol of Paris and one of the most recognizable landmarks in the world. Here are some reasons why it's so great:

1. **Engineering marvel**: When it was built for the 1889 World's Fair, it was the tallest structure in the world (324 meters/1,063 feet) and a technological marvel. It's still an impressive feat of engineering today.
2. **Breathtaking views**: From the top, you can see the entire city of Paris, including many of its famous landmarks like the Seine River, Notre-Dame Cathedral, and the Arc de Triomphe.
3. **Romantic atmosphere**: The Eiffel Tower is often associated with romance and love. It's a popular spot for proposals, weddings, and romantic dinners.
4. **History**: The tower has played a significant role in French history, serving as a communication tower during World War I and a symbol of French resistance during World War II.
5. **Aesthetic appeal**: The tower's lattice-like design is both functional and beautiful, making it a work of art in its own right.
6. **Day and night experiences**: The tower is stunning during the day, but it's even more magical at night when it's illuminated with thousands of twinkling lights.
7. **Accessibility**: You can take the stairs or elevator to the top, making it accessible to people of all ages and abilities.
8. **
  • [X] Converted 8B inference test
torchrun --nproc_per_node 1 example_chat_completion.py --ckpt_dir ~/workspace/github/llama-recipes/src/llama_recipes/utils/hf_llama_conversion/test8B --tokenizer_path Meta-Llama-3-8B-Instruct/tokenizer.model --max_seq_len 512 --max_batch_size 6

==================================

User: I am going to Paris, what should I see?

Assistant: Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.

User: What is so great about #1?

> Assistant: The Eiffel Tower! It's one of the most iconic landmarks in the world, and for good reason. Here are some reasons why it's so great:

1. **Engineering marvel**: The Eiffel Tower was the tallest structure in the world when it was built for the 1889 World's Fair. Its innovative design and construction were a marvel of engineering at the time, and it remains an impressive feat to this day.
2. **Panoramic views**: The Eiffel Tower offers stunning 360-degree views of the City of Light from its observation decks on the first and second floors. On a clear day, you can see up to 59 kilometers (37 miles) in every direction.
3. **Romantic atmosphere**: The Eiffel Tower is often associated with romance, and it's easy to see why. The tower's iron latticework is beautifully lit up at night, creating a magical atmosphere that's perfect for couples.
4. **Historical significance**: The Eiffel Tower has played a significant role in French history, serving as a symbol of French culture and engineering prowess. It's also been the site of many historic events, including the signing of the Treaty of Versailles in 1919.
5. **Iconic status**: The Eiffel Tower is instantly recognizable, and its image is synonymous with Paris and France. It's a must-see attraction for anyone visiting the city.
6. **Unique architecture
  • [X] Convert Llama2 70B
mkdir llama2-70b; cp ~/llama2/llama-2-70b-chat/params.json llama2-70b/
python -m llama_recipes.tools.convert_hf_weights_to_llama --model-path meta-llama/Llama-2-70b-chat-hf --output-dir llama2-70b --model-size 70B
python compare_llama_weights.py llama2-70b ~/llama2/llama-2-70b-chat
Top 10 largest deltas:
  shard 0 layers.0.attention.wk.weight: 2.9802322387695312e-08 vs 0.01544189453125
  shard 0 layers.0.attention.wo.weight: 2.9802322387695312e-08 vs 0.005828857421875
  shard 0 layers.0.attention.wq.weight: 2.9802322387695312e-08 vs 0.006591796875
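
The "Top 10 largest deltas" reports above can be read with the following sketch in mind. It is an assumption about what such a comparison computes, not the code of compare_llama_weights.py: each line is taken to be the largest absolute element-wise difference for a tensor, followed by the largest absolute value in the reference tensor for scale. The function name `top_deltas` is hypothetical, and plain Python lists stand in for real tensors so the example runs without any dependencies.

```python
from typing import Dict, List, Sequence, Tuple


def top_deltas(
    converted: Dict[str, Sequence[float]],
    reference: Dict[str, Sequence[float]],
    k: int = 10,
) -> List[Tuple[str, float, float]]:
    """Return the k tensors with the largest max-abs difference.

    Each row is (name, max |converted - reference|, max |reference|).
    Hypothetical helper: flat lists of floats stand in for tensors.
    """
    if sorted(converted) != sorted(reference):
        raise ValueError("checkpoints do not contain the same tensor names")
    rows = []
    for name, ref_values in reference.items():
        new_values = converted[name]
        if len(new_values) != len(ref_values):
            raise ValueError(f"size mismatch for {name}")
        # Largest element-wise disagreement between the two checkpoints.
        delta = max((abs(a - b) for a, b in zip(new_values, ref_values)), default=0.0)
        # Largest reference magnitude, to put the delta in context.
        scale = max((abs(b) for b in ref_values), default=0.0)
        rows.append((name, delta, scale))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:k]


if __name__ == "__main__":
    reference = {
        "layers.0.attention.wk.weight": [0.5, -1.0],
        "layers.0.attention_norm.weight": [0.25, 0.125],
    }
    converted = {
        "layers.0.attention.wk.weight": [0.5, -1.0],
        "layers.0.attention_norm.weight": [0.25, 0.25],
    }
    print("Top 10 largest deltas:")
    for name, delta, scale in top_deltas(converted, reference):
        print(f"  {name}: {delta} vs {scale}")
```

Under that reading, a delta of 0.0 means the listed tensor matches the reference element for element.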

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [X] Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!


@mreso left a comment

Thanks for the contribution! Overall, LGTM but could we move the comparison tools and README from utils to tools?

Can you move these files to tools as well?


@mreso left a comment

Quickly did the move myself. Thanks for the contribution again!

@mreso mreso merged commit ed3136f into meta-llama:main Jul 9, 2024
3 checks passed
3 participants