Cleanup codebase: removed unnecessary code/logic #298

Qubitium · 2024-03-14T08:17:40Z

Fixed two issues in the router loop:

- Double sleep (this may cause performance loss under high concurrency).
- Protect against sleeping when GLOBAL_BACKEND_CONFIG.extend_dependency_time == 0.

Cleanup:

- Removed unused self.eos_token_id in model_rpc.
- Removed unused ReqState.lock.

EDIT:

Reverted after review: the removal of the redundant math for calculating the number of completion_tokens.

Qubitium · 2024-03-14T09:06:50Z

@merrymercy @hnyls2002 PR tested with no performance regression. Ready for review/merge.

Qubitium · 2024-03-18T09:09:18Z

Latest commit removed redundant math:

eea0efa

Since prompt_tokens = len(req.input_ids), the added term len(req.input_ids) - prompt_tokens is always 0.
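
As a quick check of that claim, here is a minimal sketch. The `Req` class and both helper functions are hypothetical stand-ins, and the "before" expression is an assumption reconstructed from the comment above, not a copy of the actual code in model_rpc.py.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Req:
    """Hypothetical stand-in for a request; only the fields used here."""
    input_ids: List[int] = field(default_factory=list)
    output_ids: List[int] = field(default_factory=list)


def completion_tokens_before(req: Req) -> int:
    # Assumed shape of the original expression (not copied from the repo).
    prompt_tokens = len(req.input_ids)
    return len(req.input_ids) + len(req.output_ids) - prompt_tokens


def completion_tokens_after(req: Req) -> int:
    # The term len(req.input_ids) - prompt_tokens cancels, leaving only this.
    return len(req.output_ids)


req = Req(input_ids=[1, 2, 3, 4], output_ids=[5, 6])
print(completion_tokens_before(req), completion_tokens_after(req))
```

Both functions return the same value for any request, which is why the extra term could be dropped without changing behavior.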

Qubitium · 2024-03-19T04:13:54Z

Latest commit: 9472363

Removed unused ReqState.lock

python/sglang/srt/managers/router/model_rpc.py

merrymercy · 2024-03-22T19:41:26Z

python/sglang/srt/managers/router/manager.py


-            await asyncio.sleep(0.0006)
+            if not slept:
+                await asyncio.sleep(0.0006)


Did you find that this gives better performance?

@hnyls2002 please also take a look

Qubitium · 2024-03-22

In my limited testing at low concurrency (1-2 concurrent requests, each with 1-10 batches), I did not find a performance difference.

However, the extra sleep is unnecessary. There is no need to sleep for another 0.6 ms (potentially causing a thread context switch) when the loop has already slept for 30 ms under the earlier condition (extend_dependency_time: float = 0.03).
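
The control flow under discussion can be sketched as follows. This is a simplified illustration, not the code in manager.py: `loop_iteration`, `wait_for_dependency`, `record_sleeps`, and the injectable `sleep` parameter are hypothetical names. Only the `slept` flag, the 0.0006 s sleep, and the 0.03 s default come from the thread.

```python
import asyncio

IDLE_SLEEP = 0.0006  # short sleep from the diff above


async def loop_iteration(wait_for_dependency: bool,
                         extend_dependency_time: float,
                         sleep=asyncio.sleep) -> None:
    """One simplified loop iteration that sleeps at most once."""
    slept = False
    # Guard against extend_dependency_time == 0 so that a zero-length
    # sleep is never counted as the iteration's sleep.
    if wait_for_dependency and extend_dependency_time > 0:
        await sleep(extend_dependency_time)
        slept = True
    # Only take the short sleep if the long one did not happen.
    if not slept:
        await sleep(IDLE_SLEEP)


async def record_sleeps(wait_for_dependency: bool,
                        extend_dependency_time: float) -> list:
    """Run one iteration with a fake sleep and return the requested durations."""
    calls = []

    async def fake_sleep(seconds: float) -> None:
        calls.append(seconds)

    await loop_iteration(wait_for_dependency, extend_dependency_time,
                         sleep=fake_sleep)
    return calls


print(asyncio.run(record_sleeps(True, 0.03)))
print(asyncio.run(record_sleeps(True, 0.0)))
```

Injecting the sleep function keeps the sketch testable without real delays. In each case exactly one sleep is requested per iteration.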

Qubitium changed the title ~~[WIP] Fix double sleep and sleeping on extend_dependency_time == 0~~ [WIP] Performance: eliminate double sleep in router loop Mar 14, 2024

Qubitium changed the title ~~[WIP] Performance: eliminate double sleep in router loop~~ [WIP] Eliminate double sleep in router loop Mar 14, 2024

Qubitium changed the title ~~[WIP] Eliminate double sleep in router loop~~ Eliminate double sleep in router loop Mar 14, 2024

Qubitium force-pushed the mod-forward-sleep branch from ca73c5f to e56ce1b Compare March 15, 2024 08:58

Qubitium changed the title ~~Eliminate double sleep in router loop~~ Remove double sleep and redundant math Mar 18, 2024

Qubitium changed the title ~~Remove double sleep and redundant math~~ Cleanup codebase: removed unnecessary code/logic Mar 19, 2024

merrymercy requested changes Mar 22, 2024

merrymercy self-assigned this Mar 22, 2024

Qubitium force-pushed the mod-forward-sleep branch from 9472363 to c68572f Compare March 23, 2024 00:40

merrymercy requested a review from hnyls2002 March 23, 2024 00:42

merrymercy assigned hnyls2002 and unassigned merrymercy Mar 23, 2024

Qubitium force-pushed the mod-forward-sleep branch from c68572f to ad6410b Compare March 23, 2024 00:43

Remove unused and double sleep in router

431cbb8

Qubitium force-pushed the mod-forward-sleep branch from ad6410b to 431cbb8 Compare March 23, 2024 00:44

merrymercy approved these changes Mar 23, 2024

merrymercy merged commit ce216c8 into sgl-project:main Mar 23, 2024

Qubitium deleted the mod-forward-sleep branch March 28, 2024 15:14
