⚡️ Faster tokenizer of strings #5387

Merged 5 commits on Oct 31, 2024

Commits on Oct 31, 2024

  1. ⚡️ Faster tokenizer of strings

    Our `string` arbitrary starts its initialization by tokenizing known vulnerable strings into a set of units (chars). The idea behind this tokenization is to later produce these vulnerable strings among the entries generated by this arbitrary.
    
    The process is the following:
    - for each string known to be vulnerable, try to tokenize it with respect to the provided constraints on length and the unit arbitrary
    - for each tokenizable string, add it to the bucket of strings that may be generated
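
The two steps above can be sketched as follows. This is a minimal illustration, not fast-check's internal code: `tokenize`, `buildBucket`, `UnitPredicate`, `maxUnitLength` and the sample strings are hypothetical names chosen for the example, the unit arbitrary is modeled as a plain predicate telling whether a substring is a valid unit, and length constraints are left out.

```typescript
// Hypothetical stand-in for "can the unit arbitrary produce this substring?"
type UnitPredicate = (candidate: string) => boolean;

// Try to split `value` into consecutive units accepted by `isUnit`.
// Returns the list of units, or undefined when no split exists.
// `maxUnitLength` (an assumption of this sketch) bounds the size of one unit.
function tokenize(value: string, isUnit: UnitPredicate, maxUnitLength = 4): string[] | undefined {
  const n = value.length;
  // next[i] = end index of the first unit of a valid split of value.slice(i)
  const next: (number | undefined)[] = new Array(n + 1).fill(undefined);
  next[n] = n; // the empty suffix is trivially tokenizable
  for (let start = n - 1; start >= 0; --start) {
    const limit = Math.min(n, start + maxUnitLength);
    for (let end = start + 1; end <= limit; ++end) {
      if (next[end] !== undefined && isUnit(value.slice(start, end))) {
        next[start] = end;
        break;
      }
    }
  }
  if (next[0] === undefined) return undefined;
  const tokens: string[] = [];
  for (let i = 0; i < n; i = next[i] as number) {
    tokens.push(value.slice(i, next[i] as number));
  }
  return tokens;
}

// Keep only the strings that could be tokenized: they form the bucket.
function buildBucket(knownStrings: string[], isUnit: UnitPredicate): string[][] {
  const bucket: string[][] = [];
  for (const s of knownStrings) {
    const tokens = tokenize(s, isUnit);
    if (tokens !== undefined) bucket.push(tokens);
  }
  return bucket;
}

// Usage: units restricted to single lowercase ASCII letters.
const isLowercaseLetter: UnitPredicate = (c) => /^[a-z]$/.test(c);
const bucket = buildBucket(["constructor", "__proto__", "key"], isLowercaseLetter);
console.log(bucket.map((tokens) => tokens.join("")));
```

With this predicate, `"__proto__"` cannot be tokenized because `_` is not an accepted unit, so only the two other strings end up in the bucket.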
    
    The original tokenizer was able to abide by constraints on length. Computed tokens depended on the provided constraints on length and on the arbitrary being considered. But this flexibility had a runtime cost we don't want to pay anymore. The tokenizer will stop trying to optimize for length and will just tokenize for the requested arbitrary.
    dubzzz authored Oct 31, 2024
    80346f8
  2. e009720
  3. ce3521c
  4. fix lint

    dubzzz authored Oct 31, 2024
    87c33fc
  5. 76da3a8