Script to figure out model sequence length limits #1579
Can we make this a scenario that we run officially (kind of like a unit test) rather than a one-off script?
Sure, I could look into that. For now, since @yifanmai needs this (I think) to fix the canary runs, should we keep it as a script? I could then make it into a scenario in a different PR.
This can't be a scenario because scenarios cannot access the tokenizer.
@teetone could you also take a look at this?
scripts/compute_request_limits.py
# model_name, tokenizer_name, prefix and suffix are passed as arguments
parser = argparse.ArgumentParser()
parser.add_argument("--model_name", type=str, default="writer/palmyra-base")
Optional suggestion: helm-run and helm-summarize use hyphens for flags; you could consider doing that here for consistency. Note that argparse automatically converts hyphens in a flag name to underscores in the attribute name, so --model-name is still read as args.model_name.
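To illustrate the suggestion above, here is a minimal sketch of a hyphenated flag being read back through an underscored attribute. The --tokenizer-name flag and its default are assumptions made for illustration, not taken from the script under review.

```python
import argparse

parser = argparse.ArgumentParser()
# Hyphenated flag names, as suggested in the review comment.
parser.add_argument("--model-name", type=str, default="writer/palmyra-base")
# Assumed second flag and default, for illustration only.
parser.add_argument("--tokenizer-name", type=str, default=None)

# Parse an explicit argument list so the sketch runs without a command line.
args = parser.parse_args(["--model-name", "example/model"])

# argparse derives the attribute name by replacing "-" with "_".
print(args.model_name)
print(args.tokenizer_name)
```

Renaming the flags this way changes only what is typed on the command line; code that reads args.model_name keeps working unchanged.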
I'm not sure I understand this, so I will skip it for now.
    return lower_bound + max_prompt_length


def check_limits(
Is this just for double-checking? That is, if the measurement was performed correctly, should this method always report that the limits are correct?
Normally it should. It was just for me to check that my implementation was correct and that I did not have an off-by-one error in my binary search.
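The kind of check described here can be sketched as follows. This is not the PR's implementation: accepts is a hypothetical stand-in for "the model accepts a request of this length", and find_max_length and check_limit are illustrative names. The sketch assumes the predicate is monotonic (every length up to the limit is accepted, every longer one is rejected) and that length 0 is accepted.

```python
from typing import Callable


def find_max_length(accepts: Callable[[int], bool], upper_bound: int) -> int:
    """Binary search for the largest length in [0, upper_bound] that is accepted."""
    lo, hi = 0, upper_bound
    while lo < hi:
        # Round the midpoint up so that `lo = mid` always makes progress.
        mid = (lo + hi + 1) // 2
        if accepts(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo


def check_limit(accepts: Callable[[int], bool], limit: int) -> bool:
    """Guard against off-by-one errors: limit must pass and limit + 1 must fail."""
    return accepts(limit) and not accepts(limit + 1)


# Hypothetical model whose true limit is 2048 tokens.
fake_limit = 2048


def accepts(length: int) -> bool:
    return length <= fake_limit


found = find_max_length(accepts, upper_bound=10_000)
print(found, check_limit(accepts, found))
```

If the search is correct, the check is redundant and always passes; its value is that an off-by-one mistake in the midpoint or bound updates shows up immediately as a failed check.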
This script is used to find the limits required by the window service.