Just curious: are the following imports in model_runner.py also being considered for removal in later stages?
```python
from vllm.config import DeviceConfig, LoadConfig
from vllm.config import ModelConfig as VllmModelConfig
from vllm.distributed import (
    get_tp_group,
    init_distributed_environment,
    initialize_model_parallel,
    set_custom_all_reduce,
)
from vllm.distributed.parallel_state import in_the_same_node_as
from vllm.model_executor.model_loader import get_model
from vllm.model_executor.models import ModelRegistry
```
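As context for the last import above, replacing `ModelRegistry` would mostly mean keeping a local mapping from architecture names to model classes. A minimal sketch under assumed names (`register_model` and `resolve_model_cls` are illustrative, not the project's actual API):

```python
# Hypothetical local model registry that could stand in for
# vllm.model_executor.models.ModelRegistry. All names here are
# illustrative assumptions, not the actual replacement API.
_MODEL_REGISTRY: dict[str, type] = {}


def register_model(arch: str):
    """Class decorator mapping an architecture name to a model class."""
    def wrapper(cls: type) -> type:
        _MODEL_REGISTRY[arch] = cls
        return cls
    return wrapper


def resolve_model_cls(arch: str) -> type:
    """Look up the model class for a checkpoint's architecture string."""
    try:
        return _MODEL_REGISTRY[arch]
    except KeyError:
        raise ValueError(f"Unsupported architecture: {arch}") from None
```

With this pattern, each model file registers itself at import time instead of being enumerated inside vLLM.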
Motivation
This is a tracker for removing vLLM dependencies from general model code (not considering quantization). These are our current imports from vLLM, and we want to remove all of them.
Tracker
- `CacheConfig`: [1/N] Remove CacheConfig import in all model files #1658
- `get_tensor_model_parallel_world_size`
- `ParallelLMHead`: Update vocab embedding deps and add TP switch #1856
- `VocabParallelEmbedding`: Update vocab embedding deps and add TP switch #1856
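For the `get_tensor_model_parallel_world_size` item, one possible shape for the replacement is a small local parallel-state module that model code imports instead of `vllm.distributed`. A minimal sketch, with assumed names that mirror the vLLM functions being removed (not the actual in-tree implementation):

```python
# Hypothetical local parallel-state shim replacing the vLLM import.
# The module-level state and function names are illustrative only.
_TP_WORLD_SIZE = 1  # default: no tensor parallelism


def initialize_model_parallel(tensor_parallel_size: int = 1) -> None:
    """Record the TP degree once at startup (e.g. from server args)."""
    global _TP_WORLD_SIZE
    _TP_WORLD_SIZE = tensor_parallel_size


def get_tensor_model_parallel_world_size() -> int:
    """Drop-in local replacement for the vLLM helper of the same name."""
    return _TP_WORLD_SIZE
```

Model files would then switch only their import path, keeping call sites such as the TP switch in the vocab embedding unchanged.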