DeepSpeed-MII is an open-source Python library developed by the DeepSpeed team at Microsoft. It is designed for low-latency, low-cost inference of large models.
Platform-specific instructions and scripts used for LLM-Inference-Bench.