Add token counts to compute performance #1801

Merged
merged 6 commits into from
Oct 23, 2024

Conversation

rasbt
Collaborator

@rasbt rasbt commented Oct 23, 2024

This PR updates the finetuning scripts to report throughput in tokens per second (tok/sec). For now, tok/sec is computed from the total token count including padding, which is how other tools report it. The raw token counts are reported as well, so the metric can be computed differently depending on a project's needs. Examples:

Batch size 1

| ------------------------------------------------------
| Token Counts
| - Input Tokens              :   1578
| - Tokens w/ Prompt          :   2674
| - Total Tokens (w/ Padding) :   2674
| -----------------------------------------------------
| Performance
| - Training Time             :  43.95 s
| - Tok/sec                   :  60.85 tok/s
| -----------------------------------------------------
| Memory Usage                                                                 
| - Memory Used               :  10.45 GB                                        
=======================================================

Batch size 2

| ------------------------------------------------------
| Token Counts
| - Input Tokens              :   1578
| - Tokens w/ Prompt          :   2674
| - Total Tokens (w/ Padding) :   3348
| -----------------------------------------------------
| Performance
| - Training Time             :  30.27 s
| - Tok/sec                   :  110.61 tok/s
| -----------------------------------------------------
| Memory Usage                                                                 
| - Memory Used               :  10.66 GB                                        
=======================================================
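As a rough illustration of how the reported counts relate to each other, the sketch below recomputes tok/sec from the figures in the two examples and shows why the padded total grows with batch size. The helper names `padded_token_count` and `tokens_per_second` are hypothetical and are not taken from the litgpt code. The sketch also assumes that each batch is padded to its longest sequence, which the PR description does not state.

```python
from typing import Sequence


def padded_token_count(batches: Sequence[Sequence[int]]) -> int:
    """Total tokens when each batch is padded to its longest sequence.

    Each inner sequence holds the unpadded lengths of the samples in one batch.
    Assumption: pad-to-longest-in-batch; the actual scripts may pad differently.
    """
    return sum(len(lengths) * max(lengths) for lengths in batches if lengths)


def tokens_per_second(num_tokens: int, seconds: float) -> float:
    """Throughput for a given token count and wall-clock training time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_tokens / seconds


# Hypothetical sample lengths (not from the PR) to show the padding effect.
sample_lengths = [10, 7, 4, 9]
raw_total = sum(sample_lengths)                                  # 30
padded_bs1 = padded_token_count([[n] for n in sample_lengths])   # 30, no padding needed
padded_bs2 = padded_token_count([[10, 7], [4, 9]])               # 2*10 + 2*9 = 38

# Figures copied from the two example reports above.
UNPADDED_TOKENS = 2674  # "Tokens w/ Prompt"
reports = [
    ("batch size 1", 2674, 43.95),
    ("batch size 2", 3348, 30.27),
]

for label, padded_tokens, train_seconds in reports:
    padded_rate = tokens_per_second(padded_tokens, train_seconds)
    unpadded_rate = tokens_per_second(UNPADDED_TOKENS, train_seconds)
    print(f"{label}: {padded_rate:.2f} tok/s with padding, "
          f"{unpadded_rate:.2f} tok/s without padding")
```

Because the training times in the reports are rounded to two decimals, the recomputed padded rates agree with the printed `Tok/sec` values only to within about 0.01 tok/s. Dividing `Tokens w/ Prompt` by the training time instead gives a throughput that excludes padding tokens, which may be the more useful number when comparing runs with different batch sizes.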

@rasbt rasbt requested a review from lantiga as a code owner October 23, 2024 15:41
Review comment on litgpt/finetune/lora.py (outdated, resolved)
@rasbt rasbt merged commit d741d26 into main Oct 23, 2024
8 of 9 checks passed
@rasbt rasbt deleted the token-counts branch October 23, 2024 18:36