
performance of countTokens #68

Closed

pczekaj opened this issue Dec 8, 2024 · 7 comments
pczekaj commented Dec 8, 2024

I'm comparing the performance of gpt-tokenizer 2.7.0 and tiktoken 1.0.17. On an Intel-based Mac with Node 22.11.0 I'm consistently getting worse times for gpt-tokenizer than for tiktoken. Am I doing something wrong, or is this expected?

(screenshot: Jest timing results from the IDE)

import { countTokens } from 'gpt-tokenizer';
import { encoding_for_model } from 'tiktoken';

const SAMPLE_TEXT = 'Occaecat est tempor incididunt voluptate exercitation irure quis aliqua sunt dolor. Anim nostrud incididunt eu aliquip quis culpa do incididunt eu. Magna qui dolor deserunt sit velit. Dolor anim laborum ut ad in et occaecat enim elit culpa commodo. Sit ut sit mollit adipisicing. Labore culpa do cillum proident incididunt et. Reprehenderit nisi excepteur culpa consectetur mollit consectetur laborum';

const LONG_MSG_REPEATS = 50000;
const EXPECTED_TOKENS = 86;

const gpt35Encoding = encoding_for_model('gpt-3.5-turbo');

describe('TokenizerService', () => {
  it('gpt-tokenizer short text', () => {
    const tokens = countTokens(SAMPLE_TEXT);
    expect(tokens).toBe(EXPECTED_TOKENS);
  });

  it('tiktoken short text', () => {
    const tokens = gpt35Encoding.encode(SAMPLE_TEXT).length;
    expect(tokens).toBe(EXPECTED_TOKENS);
  });

  it('gpt-tokenizer long text', () => {
    const tokens = countTokens(SAMPLE_TEXT.repeat(LONG_MSG_REPEATS));
    expect(tokens).toBe(EXPECTED_TOKENS * LONG_MSG_REPEATS);
  });

  it('tiktoken long text', () => {
    const tokens = gpt35Encoding.encode(SAMPLE_TEXT.repeat(LONG_MSG_REPEATS)).length;
    expect(tokens).toBe(EXPECTED_TOKENS * LONG_MSG_REPEATS);
  });
});
niieani (Owner) commented Dec 9, 2024

Hi @pczekaj.

I cannot reproduce this. When I benchmark it, even with your own sample text, tiktoken is 2x slower. I'm on Node v22.11.0 and using a MacBook Pro M1 Max.

(screenshot: benchmark results, gpt-tokenizer vs tiktoken)

When other samples are included in the benchmark (English, Chinese, French, code), gpt-tokenizer is even faster (3.5x faster than tiktoken).

(screenshot: benchmark results with mixed-language samples)

How are you running the benchmark? What tool are you using?
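
For reference, here is a minimal standalone micro-benchmark sketch (using the tinybench package purely for illustration; any benchmark runner that handles warm-up and averaging works) that keeps test-runner overhead out of the measured time:

// Run as an ES module (top-level await). tinybench is an assumption here,
// not something from this thread.
import { Bench } from 'tinybench';
import { countTokens } from 'gpt-tokenizer';
import { encoding_for_model } from 'tiktoken';

// Same kind of input as the test in the issue: a sample sentence repeated many times.
const text =
  'Occaecat est tempor incididunt voluptate exercitation irure quis aliqua sunt dolor. '.repeat(
    50000,
  );

const gpt35Encoding = encoding_for_model('gpt-3.5-turbo');

// Each task runs repeatedly for ~1 second and the runner reports averaged timings.
const bench = new Bench({ time: 1000 });

bench
  .add('gpt-tokenizer countTokens', () => countTokens(text))
  .add('tiktoken encode', () => gpt35Encoding.encode(text).length);

await bench.run();
console.table(bench.table());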

pczekaj (Author) commented Dec 9, 2024

@niieani I'm executing it as a Jest test without any dedicated benchmarking software; I don't do anything special like invoking the GC or warming up. The screenshot is from my IDE, but I get similar results when running it on the command line:

npm exec jest -t "TokenizerService"
 PASS  src/services/TokenizerService.test.ts (16.426 s)
  TokenizerService
    ✓ gpt-tokenizer short text (10 ms)
    ✓ tiktoken short text (7 ms)
    ✓ gpt-tokenizer long text (11440 ms)
    ✓ tiktoken long text (4099 ms)

Test Suites: 1 passed, 1 total
Tests:       4 passed, 4 total
Snapshots:   0 total
Time:        16.54 s
Ran all test suites matching /TokenizerService/i.

I'm only checking total execution time and don't track memory consumption; changing the order of the test cases didn't affect the timing.

niieani (Owner) commented Dec 9, 2024

Okay, I've tried it with SAMPLE_TEXT.repeat(LONG_MSG_REPEATS) instead of just SAMPLE_TEXT, and I do see about 25% slower execution times.

Got a couple of fixes and additional optimizations incoming... 💨

niieani closed this as completed in 15d13b1 on Dec 9, 2024
github-actions bot commented Dec 9, 2024

🎉 This issue has been resolved in version 2.8.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

niieani (Owner) commented Dec 9, 2024

Could you try again in 2.8.0 and let me know if it's any better?
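
Upgrading should just be a matter of bumping the dependency, e.g.:

npm install gpt-tokenizer@2.8.0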

pczekaj (Author) commented Dec 9, 2024

@niieani 2.8.0 is a lot faster than 2.7.0. Execution time went down from 11440 ms to just 615 ms, which is much faster than tiktoken. Thank you very much!

niieani (Owner) commented Dec 9, 2024

Perfect! Thanks for your feedback.

Best regards
