-
-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Speculative decoding] Add periodic log with time spent in proposal/scoring/verification #6963
Conversation
cadedaniel
commented
Jul 30, 2024
- Log values whenever rejection sampler produces metrics
- Measures average time spent proposing each token, time spent scoring, and time spent verifiying.
👋 Hi! Thank you for contributing to the vLLM project. Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge). To run full CI, you can do one of these:
🚀 |
class Timer: | ||
|
||
def __enter__(self): | ||
self.start_time = time.time() | ||
return self | ||
|
||
def __exit__(self, exc_type, exc_value, traceback): | ||
self.end_time = time.time() | ||
self.elapsed_time_s = self.end_time - self.start_time | ||
self.elapsed_time_ms = self.elapsed_time_s * 1000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put it in a more common space like utils?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
(average_time_per_proposal_tok_ms, scoring_time_ms, | ||
verification_time_ms) = stage_times | ||
logger.info( | ||
"SpecDecodeWorker stage times: " | ||
"average_time_per_proposal_tok_ms=%.02f " | ||
"scoring_time_ms=%.02f verification_time_ms=%.02f", | ||
average_time_per_proposal_tok_ms, scoring_time_ms, | ||
verification_time_ms) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we will print this log every step? Did I miss something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only when rejection sampler emits metrics, which is every 5s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QQ; should we enable it only disable_log_stats=False?
What is the motivation ? |
oh because normally many users don't want to emit this kind of logs (and I thought that's why we have the disable_log_stats as an argument) |
Hmm good point. Let me see if I can pipe down that argument. |
…coring/verification (vllm-project#6963)
…coring/verification (vllm-project#6963)
…coring/verification (vllm-project#6963) Signed-off-by: Alvant <alvasian@yandex.ru>
…coring/verification (vllm-project#6963)