Update docs: SCC24, fix broken redirect #1843

Merged
merged 212 commits on Sep 24, 2024
Changes from all commits
212 commits
22b063c
Support batch-size in llama2 run
arjunsuresh Feb 26, 2024
c9c5640
Merge branch 'mlcommons:master' into master
arjunsuresh Feb 26, 2024
18f521c
Merge branch 'mlcommons:master' into master
arjunsuresh Apr 11, 2024
5773906
Merge branch 'mlcommons:master' into master
arjunsuresh May 2, 2024
5fe4dfd
Add Rclone-Cloudflare download instructions to README.md
nathanw-mlc Feb 21, 2024
8a00168
Add Rclone-Cloudflare download instructiosn to README.md
nathanw-mlc Feb 21, 2024
1b69968
Minor wording edit to README.md
nathanw-mlc Feb 21, 2024
d30a0ca
Add Rclone-Cloudflare download instructions to README.md
nathanw-mlc Feb 21, 2024
48f8bbb
Add Rclone-GDrive download instructions to README.md
nathanw-mlc Feb 21, 2024
0e70449
Add new and old instructions to README.md
nathanw-mlc Feb 21, 2024
ef482c3
Tweak language in README.md
nathanw-mlc Feb 21, 2024
faa0134
Language tweak in README.md
nathanw-mlc Feb 21, 2024
c7945ac
Minor language tweak in README.md
nathanw-mlc Feb 21, 2024
949ff6b
Fix typo in README.md
nathanw-mlc Feb 23, 2024
22d7072
Count error when logging errors: submission_checker.py
arjunsuresh Mar 14, 2024
6f2f14e
Fixes #1648, restrict loadgen uncommitted error message to within the…
arjunsuresh Feb 28, 2024
3361249
Update test-rnnt.yml (#1688)
arjunsuresh May 2, 2024
b747899
Added docs init
arjunsuresh May 2, 2024
c023fc9
Merge branch 'mlcommons:master' into docs
arjunsuresh May 2, 2024
6fe12df
Fix benchmark URLs
arjunsuresh May 3, 2024
949f8f7
Fix links
arjunsuresh May 8, 2024
120aced
Add _full variation to run commands
arjunsuresh May 8, 2024
dccbe1e
Added script flow diagram
arjunsuresh May 13, 2024
6b36d66
Merge branch 'mlcommons:master' into master
arjunsuresh May 13, 2024
6436dea
Merge branch 'mlcommons:master' into docs
arjunsuresh May 13, 2024
c843a08
Added docker setup command for CM, extra run options
arjunsuresh May 17, 2024
664a37a
Added support for docker options in the docs
arjunsuresh May 17, 2024
a97fc5f
Added --quiet to the CM run_cmds in docs
arjunsuresh May 18, 2024
b150d6a
Merge branch 'master' into master
arjunsuresh May 21, 2024
18ff1a8
Fix the test query count for cm commands
arjunsuresh May 21, 2024
2e3dd93
Merge branch 'mlcommons:master' into master
arjunsuresh May 22, 2024
204dbbf
Merge branch 'mlcommons:master' into master
arjunsuresh May 28, 2024
ac6d20a
Support ctuning-cpp implementation
arjunsuresh May 31, 2024
6b2264b
Added commands for mobilenet models
arjunsuresh May 31, 2024
9b896a6
Docs cleanup
arjunsuresh May 31, 2024
ce5a0b0
Docs cleanup
arjunsuresh May 31, 2024
9ff02d7
Merge branch 'master' into master
arjunsuresh Jun 4, 2024
d58ba74
Merge branch 'mlcommons:master' into master
arjunsuresh Jun 5, 2024
69661bb
Fix merge conflicts
arjunsuresh Jun 5, 2024
1f3eacd
Added separate files for dataset and models in the docs
arjunsuresh Jun 5, 2024
63fcc60
Remove redundant tab in the docs
arjunsuresh Jun 5, 2024
5e49864
Fixes some WIP models in the docs
arjunsuresh Jun 5, 2024
a33c2a7
Use the official docs page for CM installation
arjunsuresh Jun 5, 2024
16a8009
Fix the deadlink in docs
arjunsuresh Jun 5, 2024
6ad8a0e
Fix indendation issue in docs
arjunsuresh Jun 6, 2024
cf0ca4f
Added dockerinfo for nvidia implementation
arjunsuresh Jun 6, 2024
4901320
Added run options for gptj
arjunsuresh Jun 6, 2024
c007322
Added execution environment tabs
anandhu-eng Jun 6, 2024
fbbc894
Merge pull request #5 from anandhu-eng/docs
arjunsuresh Jun 6, 2024
1fba83e
Cleanup of the docs
arjunsuresh Jun 6, 2024
ee35e73
Cleanup of the docs
arjunsuresh Jun 6, 2024
2d47e06
Reordered the sections of the docs page
arjunsuresh Jun 7, 2024
e43bb87
Removed an unnecessary heading in the docs
arjunsuresh Jun 7, 2024
87471ee
Fixes the commands for datacenter
arjunsuresh Jun 7, 2024
f40bc9c
Fix the build --sdist for loadgen
arjunsuresh Jun 16, 2024
36af6b4
Merge branch 'mlcommons:master' into docs
arjunsuresh Jun 16, 2024
9dc997f
Fixes #1761, llama2 and mixtral runtime error on CPU systems
arjunsuresh Jul 2, 2024
165f5f0
Merge branch 'master' into master
arjunsuresh Jul 2, 2024
0718769
Added mixtral to the benchmark list, improved benchmark docs
arjunsuresh Jul 3, 2024
387013c
Merge branch 'mlcommons:master' into master
arjunsuresh Jul 8, 2024
bc19ba1
Update docs for MLPerf inference v4.1
arjunsuresh Jul 8, 2024
1f9bc3b
Update docs for MLPerf inference v4.1
arjunsuresh Jul 8, 2024
8ff59e1
Fix typo
arjunsuresh Jul 8, 2024
7faa762
Gave direct link to implementation readmes
arjunsuresh Jul 9, 2024
b8573fe
Added tables detailing implementations
anandhu-eng Jul 9, 2024
d3cbc48
Merge pull request #6 from anandhu-eng/candd_readme_change
arjunsuresh Jul 9, 2024
35e3bbd
Update vision README.md, split the frameworks into separate rows
arjunsuresh Jul 9, 2024
3e7f86c
Update README.md
arjunsuresh Jul 9, 2024
d43165d
pointed links to specific frameworks
anandhu-eng Jul 9, 2024
1e3cc6c
pointed links to specific frameworks
anandhu-eng Jul 9, 2024
a35649e
Merge pull request #7 from anandhu-eng/docsUpdate
arjunsuresh Jul 9, 2024
291537e
Update Submission_Guidelines.md
arjunsuresh Jul 9, 2024
3ea811e
Update Submission_Guidelines.md
arjunsuresh Jul 9, 2024
ef51ae3
Update Submission_Guidelines.md
arjunsuresh Jul 9, 2024
4d36503
api support llama2
anandhu-eng Jul 16, 2024
641df17
Added request module and reduced max token len
anandhu-eng Jul 16, 2024
751b9fc
Merge branch 'mlcommons:master' into master
arjunsuresh Jul 16, 2024
6b1a6a9
Merge branch 'master' into llama2_api
arjunsuresh Jul 16, 2024
b1ef8f1
Merge pull request #8 from anandhu-eng/llama2_api
arjunsuresh Jul 16, 2024
99ee8b6
Fix for llama2 api server
arjunsuresh Jul 16, 2024
c736d33
Update SUT_API offline to work for OpenAI
mgoin Jul 16, 2024
280a294
Update SUT_API.py
mgoin Jul 16, 2024
4e4aff3
Merge pull request #9 from mgoin/patch-2
arjunsuresh Jul 16, 2024
8b4c88f
Minor fixes
arjunsuresh Jul 16, 2024
8464902
Fix json import in SUT_API.py
arjunsuresh Jul 16, 2024
b00755d
Fix llama2 token length
arjunsuresh Jul 16, 2024
802374b
Added model name verification with server
anandhu-eng Jul 17, 2024
44ae1d9
clean temp files
anandhu-eng Jul 17, 2024
fe3644e
support num_workers in LLAMA2 SUTs
arjunsuresh Jul 17, 2024
1ef8072
Remove batching from Offline SUT_API.py
mgoin Jul 17, 2024
c0dc52e
Update SUT_API.py
mgoin Jul 17, 2024
ce2e686
Merge pull request #11 from mgoin/patch-3
arjunsuresh Jul 17, 2024
7517a90
Minor fixes for llama2 API
arjunsuresh Jul 17, 2024
d3db567
Fix for llama2 API
arjunsuresh Jul 17, 2024
7607097
Merge branch 'mlcommons:master' into master
arjunsuresh Jul 17, 2024
93b5d64
Merge pull request #10 from anandhu-eng/vllm_enhancement
arjunsuresh Jul 17, 2024
b9ba3d7
Merge branch 'mlcommons:master' into master
arjunsuresh Jul 22, 2024
6d9f638
Merge branch 'mlcommons:master' into master
arjunsuresh Aug 1, 2024
bd60060
removed table of contents
anandhu-eng Aug 12, 2024
32c4702
enabled llama2-nvidia + vllm-NM : WIP
anandhu-eng Aug 12, 2024
cd823cc
enabled dlrm for intel
anandhu-eng Aug 12, 2024
0c94ea9
lower cased implementation
anandhu-eng Aug 12, 2024
5b0df15
added raw data input
anandhu-eng Aug 12, 2024
f89295c
corrected data download commands
anandhu-eng Aug 12, 2024
8e1eb75
renamed filename
anandhu-eng Aug 12, 2024
087dad9
changes for bert and vllm
anandhu-eng Aug 13, 2024
c1032c2
documentation to work on custom repo and branch
anandhu-eng Aug 13, 2024
e8cb2a8
benchmark index page update
anandhu-eng Aug 13, 2024
7e37072
enabled sdxl for nvidia and intel
anandhu-eng Aug 13, 2024
73ce4fd
Merge branch 'mlcommons:master' into docs
arjunsuresh Aug 13, 2024
0f816ee
Merge branch 'master' into cm_readme_inference_update
arjunsuresh Aug 13, 2024
509c2c5
Merge pull request #12 from anandhu-eng/cm_readme_inference_update
arjunsuresh Aug 13, 2024
98b945c
Merge pull request #13 from GATEOverflow/master
arjunsuresh Aug 13, 2024
283b39c
updated vllm server run cmd
anandhu-eng Aug 13, 2024
2c9b859
Merge changes from master branch of https://github.com/GATEOverflow/i…
anandhu-eng Aug 13, 2024
4f5cbcd
benchmark page information addition
anandhu-eng Aug 14, 2024
8e71518
fix indendation issue
anandhu-eng Aug 14, 2024
e9dcf17
Added submission categories
anandhu-eng Aug 14, 2024
4f56494
update submission page - generate submission with or w/o using CM for…
anandhu-eng Aug 14, 2024
f1135ea
Updated kits dataset documentation
anandhu-eng Aug 16, 2024
cb71cd1
Updated model parameters
anandhu-eng Aug 16, 2024
2016369
Merge branch 'mlcommons:master' into master
arjunsuresh Aug 16, 2024
608ad33
Merge branch 'master' into cm_readme_inference_update
arjunsuresh Aug 16, 2024
1579967
updation of information
anandhu-eng Aug 19, 2024
5ba36a9
updated non cm based benchmark
anandhu-eng Aug 19, 2024
4805612
Merge changes from GateOverflow
anandhu-eng Aug 19, 2024
38f8067
Merge pull request #14 from anandhu-eng/cm_readme_inference_update
arjunsuresh Aug 19, 2024
a461646
added info about hf password
anandhu-eng Aug 20, 2024
dd47e45
added links to model and access tokens
anandhu-eng Aug 20, 2024
a1d66d4
Updated reference results structuree tree
anandhu-eng Aug 20, 2024
c5ae6ed
submission docs cleanup
anandhu-eng Aug 20, 2024
9dc81e8
Merge branch 'master' into cm_readme_inference_update
arjunsuresh Aug 20, 2024
66a9f10
Merge pull request #15 from anandhu-eng/cm_readme_inference_update
arjunsuresh Aug 20, 2024
4940585
Merge branch 'master' into docs
arjunsuresh Aug 20, 2024
f4ba37d
Merge branch 'mlcommons:master' into master
arjunsuresh Aug 22, 2024
bc80f65
Some cleanups for benchmark info
arjunsuresh Aug 22, 2024
65e63db
Some cleanups for benchmark info
arjunsuresh Aug 22, 2024
ba9820d
Some cleanups for benchmark info
arjunsuresh Aug 22, 2024
9ea1d14
added generic stubs deepsparse
anandhu-eng Aug 22, 2024
5b6eb52
Merge branch 'master' into cm_readme_inference_update
anandhu-eng Aug 22, 2024
63888bc
Some cleanups for benchmark info
arjunsuresh Aug 22, 2024
b956c6d
Some cleanups for benchmark info
arjunsuresh Aug 22, 2024
2c49334
Some cleanups for benchmark info
arjunsuresh Aug 22, 2024
13db0f8
Some cleanups for benchmark info (FID and CLIP data added)
arjunsuresh Aug 22, 2024
4eefc94
typo fix for bert deepsparse framework
anandhu-eng Aug 23, 2024
f0dbe10
Merge branch 'master' into cm_readme_inference_update
arjunsuresh Aug 23, 2024
6017bcc
Merge pull request #16 from anandhu-eng/cm_readme_inference_update
arjunsuresh Aug 23, 2024
e6abadd
added min system requirements for models
anandhu-eng Aug 23, 2024
8db76b4
Merge branch 'master' into cm_readme_inference_update
anandhu-eng Aug 23, 2024
37674fa
Merge pull request #17 from anandhu-eng/cm_readme_inference_update
arjunsuresh Aug 23, 2024
d994a86
fixed code version
anandhu-eng Sep 3, 2024
fd8945c
changes for displaying reference and intel implementation tip
anandhu-eng Sep 3, 2024
8815065
added reference to installation page
anandhu-eng Sep 3, 2024
5c73d16
Merge pull request #18 from anandhu-eng/cm_readme_inference_update
arjunsuresh Sep 3, 2024
8085e8b
updated neural magic documentation
anandhu-eng Sep 3, 2024
d078534
Merge branch 'master' into cm_readme_inference_update
arjunsuresh Sep 3, 2024
edbaf90
Merge pull request #19 from anandhu-eng/cm_readme_inference_update
arjunsuresh Sep 3, 2024
99285f6
Merge pull request #20 from GATEOverflow/docs
arjunsuresh Sep 3, 2024
9dbd46f
Merge pull request #21 from GATEOverflow/master
arjunsuresh Sep 3, 2024
32cdf40
Merge branch 'master' into docs
arjunsuresh Sep 3, 2024
9a27105
Added links to the install page, redirect benchmarks page
arjunsuresh Sep 4, 2024
294f85e
Merge branch 'mlcommons:master' into docs
arjunsuresh Sep 5, 2024
069c2dd
Merge branch 'master' into docs
arjunsuresh Sep 5, 2024
b30e51a
Merge pull request #22 from GATEOverflow/docs
arjunsuresh Sep 5, 2024
8d76337
added tips about batch size and dataset for nvidia llama2
anandhu-eng Sep 10, 2024
4ac509f
Merge branch 'cm_readme_inference_update' of https://github.com/anand…
anandhu-eng Sep 10, 2024
fd58737
Merge branch 'master' into cm_readme_inference_update
anandhu-eng Sep 10, 2024
e199221
fix conditions logic
anandhu-eng Sep 10, 2024
b144936
Merge branch 'cm_readme_inference_update' of https://github.com/anand…
anandhu-eng Sep 10, 2024
439b150
modified tips and additional run cmds
anandhu-eng Sep 10, 2024
b8188c2
sentence corrections
anandhu-eng Sep 10, 2024
3369b3c
Merge pull request #23 from anandhu-eng/cm_readme_inference_update
arjunsuresh Sep 10, 2024
f0b9e7f
Minor fix for the documentation
arjunsuresh Sep 10, 2024
aba2ce8
fixed bug in deepsparse generic model stubs + styling
anandhu-eng Sep 17, 2024
67bf51a
Merge branch 'master' into cm_readme_inference_update
anandhu-eng Sep 17, 2024
33cad44
added more information to stubs
anandhu-eng Sep 17, 2024
8cea28a
Added SCC24 readme, support reproducibility in the docs
arjunsuresh Sep 18, 2024
ae8f9e6
Made clear the custom CM repo URL format
arjunsuresh Sep 18, 2024
a5c1627
Support conditional implementation, setup and run tips
arjunsuresh Sep 18, 2024
3e24bb9
Support rocm for sdxl
arjunsuresh Sep 19, 2024
8d6392d
Fix _short tag support
arjunsuresh Sep 19, 2024
0511c95
Fix install URL
arjunsuresh Sep 19, 2024
e8b2adc
Expose bfloat16 and float16 options for sdxl
arjunsuresh Sep 19, 2024
d7080cd
Expose download model to host option for sdxl
arjunsuresh Sep 19, 2024
c588fa4
Merge branch 'master' into cm_readme_inference_update
anandhu-eng Sep 20, 2024
c454ac0
Merge pull request #24 from anandhu-eng/cm_readme_inference_update
arjunsuresh Sep 20, 2024
9009382
IndySCC24 documentation added
arjunsuresh Sep 20, 2024
00c650f
Improve the SCC24 docs
arjunsuresh Sep 20, 2024
4c92e2a
Improve the support of short variation
arjunsuresh Sep 20, 2024
60d3a8a
Improved the indyscc24 documentation
arjunsuresh Sep 20, 2024
b2f95b2
Updated scc run commands
anandhu-eng Sep 23, 2024
84ba650
removed test_query_count option for scc
anandhu-eng Sep 23, 2024
213c605
Merge pull request #25 from anandhu-eng/scc
arjunsuresh Sep 23, 2024
6c23816
Remove scc24 in the main docs
arjunsuresh Sep 23, 2024
469b091
Remove scc24 in the main docs
arjunsuresh Sep 23, 2024
21d16ed
Fix docs: indendation issue on the submission page
arjunsuresh Sep 23, 2024
5d4a302
generalised code for skipping test query count
anandhu-eng Sep 24, 2024
fb152b6
Merge pull request #26 from anandhu-eng/branch_from+go
arjunsuresh Sep 24, 2024
21e7259
Fixes for SCC24 docs
arjunsuresh Sep 24, 2024
93649dd
Fix scenario text in main.py
arjunsuresh Sep 24, 2024
0cc5d7b
Fix links for scc24
arjunsuresh Sep 24, 2024
70f9a81
Fix links for scc24
arjunsuresh Sep 24, 2024
6f56438
Improve the general docs
arjunsuresh Sep 24, 2024
a46ebee
Fix links for scc24
arjunsuresh Sep 24, 2024
913ffd4
Use float16 in scc24 doc
arjunsuresh Sep 24, 2024
b21cf39
Improve scc24 docs
arjunsuresh Sep 24, 2024
2271866
Improve scc24 docs
arjunsuresh Sep 24, 2024
3c072e0
Use float16 in scc24 doc
arjunsuresh Sep 24, 2024
7b776b7
fixed command bug
anandhu-eng Sep 24, 2024
7b62b53
Merge pull request #27 from anandhu-eng/bugfix
arjunsuresh Sep 24, 2024
a2f6125
Merge branch 'master' into master
arjunsuresh Sep 24, 2024
594ab62
Merge branch 'master' into master
arjunsuresh Sep 24, 2024
2 changes: 2 additions & 0 deletions docs/benchmarks/image_classification/mobilenets.md
@@ -5,6 +5,8 @@ hide:

# Image Classification using Mobilenet models

Install CM following the [installation page](site:install).

Mobilenet models are not official MLPerf models, so they cannot be used for a Closed division MLPerf inference submission. However, since they can be run with the ImageNet dataset, they are allowed for Open division submissions. Only CPU runs are currently supported.

## TFLite Backend
2 changes: 2 additions & 0 deletions docs/benchmarks/image_classification/resnet50.md
@@ -3,8 +3,10 @@ hide:
- toc
---


# Image Classification using ResNet50


=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

1 change: 0 additions & 1 deletion docs/benchmarks/language/gpt-j.md
@@ -5,7 +5,6 @@ hide:

# Text Summarization using GPT-J


=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

3 changes: 1 addition & 2 deletions docs/benchmarks/language/llama2-70b.md
@@ -5,7 +5,6 @@ hide:

# Text Summarization using LLAMA2-70b


=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

@@ -25,4 +24,4 @@

{{ mlperf_inference_implementation_readme (4, "llama2-70b-99", "neuralmagic") }}

{{ mlperf_inference_implementation_readme (4, "llama2-70b-99.9", "neuralmagic") }}
{{ mlperf_inference_implementation_readme (4, "llama2-70b-99.9", "neuralmagic") }}
4 changes: 3 additions & 1 deletion docs/benchmarks/language/mixtral-8x7b.md
@@ -3,7 +3,9 @@ hide:
- toc
---

# Question Answering, Math, and Code Generation using Mixtral-8x7B

=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

{{ mlperf_inference_implementation_readme (4, "mixtral-8x7b", "reference") }}
{{ mlperf_inference_implementation_readme (4, "mixtral-8x7b", "reference") }}
48 changes: 48 additions & 0 deletions docs/benchmarks/language/reproducibility/indyscc24-bert.md
@@ -0,0 +1,48 @@
---
hide:
- toc
---

# Question Answering using Bert Large for IndySCC 2024

## Introduction

This guide is designed for the [IndySCC 2024](https://sc24.supercomputing.org/students/indyscc/) to walk participants through running and optimizing the [MLPerf Inference Benchmark](https://arxiv.org/abs/1911.02549) using [Bert Large](https://github.com/mlcommons/inference/tree/master/language/bert#supported-models) across various software and hardware configurations. The goal is to maximize system throughput (measured in samples per second) without compromising accuracy.

For a valid MLPerf inference submission, two types of runs are required: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric (higher is better). The official MLPerf inference benchmark for Bert Large requires processing a minimum of 10,833 samples in both performance and accuracy modes using the SQuAD v1.1 dataset. Setting up for Nvidia GPUs may take 2-3 hours but can be done offline. Your final output will be a tarball (`mlperf_submission.tar.gz`) containing MLPerf-compatible results, which you will submit to the SCC organizers for scoring.

## Scoring

In the SCC, your first objective will be to run a reference (unoptimized) Python implementation or a vendor-provided version (such as Nvidia's) of the MLPerf inference benchmark to secure a baseline score.

Once the initial run is successful, you'll have the opportunity to optimize the benchmark further by maximizing system utilization, applying quantization techniques, adjusting ML frameworks, experimenting with batch sizes, and more, all of which can earn you additional points.

Since vendor implementations of the MLPerf inference benchmark vary and are often limited to single-node benchmarking, teams will compete within their respective hardware categories (e.g., Nvidia GPUs, AMD GPUs). Points will be awarded based on the throughput achieved on your system.


!!! info
Both MLPerf and CM automation are evolving projects.
    If you encounter issues or have questions, please submit them [here](https://github.com/mlcommons/cm4mlops/issues).

## Artifacts to submit to the SCC committee

You will need to submit the following files:

* `mlperf_submission_short.tar.gz` - automatically generated file with validated MLPerf results.
* `mlperf_submission_short_summary.json` - automatically generated summary of MLPerf results.
* `mlperf_submission_short.run` - CM commands to run MLPerf BERT inference benchmark saved to this file.
* `mlperf_submission_short.tstamps` - execution timestamps before and after CM command saved to this file.
* `mlperf_submission_short.md` - description of your platform and some highlights of the MLPerf benchmark execution.
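
The `.run` and `.tstamps` files are plain text that you assemble yourself. One possible way to capture them is a small shell wrapper — a sketch only, not an official CM tool; the `log_cm_run` name and timestamp format below are our own assumptions:

```shell
# Hypothetical helper: append each benchmark command to the .run artifact
# and record UTC timestamps before and after execution in the .tstamps file.
log_cm_run() {
  echo "$*" >> mlperf_submission_short.run
  printf '%s START %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> mlperf_submission_short.tstamps
  "$@"
  rc=$?
  printf '%s END %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$*" >> mlperf_submission_short.tstamps
  return $rc
}

# Example: log_cm_run cm run script --tags=run-mlperf,inference,...
```

Running every benchmark invocation through a wrapper like this keeps both artifacts in sync automatically.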



=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

{{ mlperf_inference_implementation_readme (4, "bert-99", "reference", extra_variation_tags=",_short", scenarios=["Offline"],categories=["Edge"], setup_tips=False) }}

=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "bert-99", "nvidia", extra_variation_tags=",_short", scenarios=["Offline"],categories=["Edge"], setup_tips=False, implementation_tips=False) }}


1 change: 0 additions & 1 deletion docs/benchmarks/medical_imaging/3d-unet.md
@@ -5,7 +5,6 @@ hide:

# Medical Imaging using 3d-unet (KiTS 2019 kidney tumor segmentation task)


=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

4 changes: 1 addition & 3 deletions docs/benchmarks/recommendation/dlrm-v2.md
@@ -5,8 +5,6 @@ hide:

# Recommendation using DLRM v2


## Benchmark Implementations
=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

@@ -26,4 +24,4 @@

{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99", "intel") }}

{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99.9", "intel") }}
{{ mlperf_inference_implementation_readme (4, "dlrm-v2-99.9", "intel") }}
96 changes: 96 additions & 0 deletions docs/benchmarks/text_to_image/reproducibility/scc24.md
@@ -0,0 +1,96 @@
---
hide:
- toc
---

# Text-to-Image with Stable Diffusion for Student Cluster Competition 2024

## Introduction

This guide is designed for the [Student Cluster Competition 2024](https://sc24.supercomputing.org/students/student-cluster-competition/) to walk participants through running and optimizing the [MLPerf Inference Benchmark](https://arxiv.org/abs/1911.02549) using [Stable Diffusion XL 1.0](https://github.com/mlcommons/inference/tree/master/text_to_image#supported-models) across various software and hardware configurations. The goal is to maximize system throughput (measured in samples per second) without compromising accuracy. Since the model performs poorly on CPUs, it is essential to run it on GPUs.

For a valid MLPerf inference submission, two types of runs are required: a performance run and an accuracy run. In this competition, we focus on the `Offline` scenario, where throughput is the key metric (higher is better). The official MLPerf inference benchmark for Stable Diffusion XL requires processing a minimum of 5,000 samples in both performance and accuracy modes using the COCO 2014 dataset. However, for SCC we have reduced this requirement and provide two variants: the `scc-base` variant cuts the dataset down to 50 samples, making it possible to complete both performance and accuracy runs in approximately 5-10 minutes, while the `scc-main` variant uses 500 samples and fetches extra points compared to running just the base variant. Setting up for Nvidia GPUs may take 2-3 hours but can be done offline. Your final output will be a tarball (`mlperf_submission.tar.gz`) containing MLPerf-compatible results, which you will submit to the SCC organizers for scoring.

## Scoring

In the SCC, your first objective will be to run the `scc-base` variant of either the reference (unoptimized) Python implementation or a vendor-provided version (such as Nvidia's) of the MLPerf inference benchmark to secure a baseline score.

Once the initial run is successful, you'll have the opportunity to optimize the benchmark further by maximizing system utilization, applying quantization techniques, adjusting ML frameworks, experimenting with batch sizes, and more, all of which can earn you additional points.

Since vendor implementations of the MLPerf inference benchmark vary and are often limited to single-node benchmarking, teams will compete within their respective hardware categories (e.g., Nvidia GPUs, AMD GPUs). Points will be awarded based on the throughput achieved on your system.

Additionally, significant bonus points will be awarded if your team enhances an existing implementation, adds support for new hardware (such as an unsupported GPU), enables multi-node execution, or adds/extends scripts in the [cm4mlops repository](https://github.com/mlcommons/cm4mlops/tree/main/script) to support new devices, frameworks, implementations, etc. All improvements must be made publicly available under the Apache 2.0 license and submitted alongside your results to the SCC committee to earn these bonus points, contributing back to the MLPerf community.


!!! info
Both MLPerf and CM automation are evolving projects.
    If you encounter issues or have questions, please submit them [here](https://github.com/mlcommons/cm4mlops/issues).

## Artifacts to submit to the SCC committee

You will need to submit the following files:

* `mlperf_submission.run` - CM commands to run MLPerf inference benchmark saved to this file.
* `mlperf_submission.md` - description of your platform and some highlights of the MLPerf benchmark execution.
* `<Team Name>` - the name under which your results are pushed to the GitHub repository.


## SCC interview

You are encouraged to highlight and explain the MLPerf inference throughput obtained on your system
and to describe any improvements and extensions to this benchmark (such as adding a new hardware backend
or supporting multi-node execution) that are useful for the community and [MLCommons](https://mlcommons.org).

## Run Commands

=== "MLCommons-Python"
## MLPerf Reference Implementation in Python

{{ mlperf_inference_implementation_readme (4, "sdxl", "reference", extra_variation_tags=",_short,_scc24-base", devices=["ROCm", "CUDA"],scenarios=["Offline"],categories=["Datacenter"], setup_tips=False, skip_test_query_count=True, extra_input_string="--precision=float16") }}

=== "Nvidia"
## Nvidia MLPerf Implementation
{{ mlperf_inference_implementation_readme (4, "sdxl", "nvidia", extra_variation_tags=",_short,_scc24-base", scenarios=["Offline"],categories=["Datacenter"], setup_tips=False, implementation_tips=False, skip_test_query_count=True) }}

!!! info
Once the above run is successful, you can change `_scc24-base` to `_scc24-main` to run the main variant.

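
The exact run commands for your system are produced by the implementation tabs above. As an illustration only — every flag below is an assumption based on the CM `run-mlperf` interface and may differ from what the tabs generate — a reference-implementation `scc24-base` invocation could look like the following, recorded into the `mlperf_submission.run` artifact at the same time:

```shell
# Illustrative sketch: write the (assumed) CM command into the .run
# artifact required by the SCC committee. Flags are examples, not
# authoritative; copy the real command from the generated docs above.
cat > mlperf_submission.run <<'EOF'
cm run script --tags=run-mlperf,inference,_short,_scc24-base \
   --model=sdxl --implementation=reference --framework=pytorch \
   --category=datacenter --scenario=Offline --execution_mode=test \
   --device=cuda --precision=float16 --quiet
EOF
echo "recorded command in mlperf_submission.run"
```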
## Submission Commands

### Generate actual submission tree

```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--run_style=test \
--adr.submission-checker.tags=_short-run \
--quiet \
--submitter=<Team Name>
```

* Use `--hw_name="My system name"` to give a meaningful system name.
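
Before handing the tarball to the organizers, a quick integrity check is worthwhile. The helper below is a sketch (the `check_submission_tar` name is ours, not part of CM); it only verifies that the archive exists and is a readable gzip tarball:

```shell
# Sketch: confirm the generated archive is present and readable before
# submission. Pass the tarball path (defaults to submission.tar.gz, the
# name set via --env.CM_TAR_OUTFILE above).
check_submission_tar() {
  f="${1:-submission.tar.gz}"
  if [ -f "$f" ] && tar -tzf "$f" > /dev/null 2>&1; then
    echo "OK: $f is a readable gzip tarball"
  else
    echo "ERROR: $f missing or unreadable" >&2
    return 1
  fi
}
```

Usage: `check_submission_tar submission.tar.gz` after the generation command completes.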


### Push Results to GitHub

Fork the repository at [https://github.com/gateoverflow/cm4mlperf-inference](https://github.com/gateoverflow/cm4mlperf-inference).

Run the following command after **replacing `--repo_url` with your GitHub fork URL**.

```bash
cm run script --tags=push,github,mlperf,inference,submission \
--repo_url=https://github.com/gateoverflow/cm4mlperf-inference \
--repo_branch=mlperf-inference-results-scc24 \
--commit_message="Results on system <HW Name>" \
--quiet
```

Once uploaded, open a pull request against the origin repository. A GitHub Action will run there, and once it
finishes you can see your submitted results at [https://gateoverflow.github.io/cm4mlperf-inference](https://gateoverflow.github.io/cm4mlperf-inference).
26 changes: 13 additions & 13 deletions docs/install/index.md
@@ -8,24 +8,24 @@ We use MLCommons CM Automation framework to run MLPerf inference benchmarks.

CM needs `git`, `python3-pip` and `python3-venv` installed on your system. If any of these are absent, please follow the [official CM installation page](https://docs.mlcommons.org/ck/install) to install them. Once the dependencies are installed, do the following

## Activate a VENV for CM
## Activate a Virtual ENV for CM
This step is not mandatory, as CM can use a separate virtual environment of its own for MLPerf inference. However, recent `pip` releases require it; otherwise you will need the `--break-system-packages` flag while installing `cm4mlops`.

```bash
python3 -m venv cm
source cm/bin/activate
```
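
If the `venv` creation above fails, one of the prerequisites is likely missing. A minimal sketch for checking them (the `check_tool` helper is our own, not part of CM):

```shell
# Sketch: report which of the tools CM depends on are visible on PATH.
check_tool() {
  if command -v "$1" > /dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
  fi
}

for t in git python3 pip3; do
  check_tool "$t"
done
```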

## Install CM and pulls any needed repositories

```bash
pip install cm4mlops
```

## To work on custom GitHub repo and branch

```bash
pip install cmind && cm init --quiet --repo=mlcommons@cm4mlops --branch=mlperf-inference
```

Here, repo is in the format `githubUsername@githubRepo`.
=== "Use the default fork of CM MLOps repository"
```bash
pip install cm4mlops
```

=== "Use custom fork/branch of the CM MLOps repository"
```bash
pip install cmind && cm init --quiet --repo=mlcommons@cm4mlops --branch=mlperf-inference
```
Here, `repo` is in the format `githubUsername@githubRepo`.

Now you are ready to use the `cm` commands to run MLPerf inference as given in the [benchmarks](../index.md) page.
2 changes: 2 additions & 0 deletions docs/requirements.txt
@@ -2,3 +2,5 @@ mkdocs-material
swagger-markdown
mkdocs-macros-plugin
ruamel.yaml
mkdocs-redirects
mkdocs-site-urls
90 changes: 45 additions & 45 deletions docs/submission/index.md
@@ -60,63 +60,63 @@ Once all the results across all the models are ready you can use the following c
=== "Closed Edge"
### Closed Edge Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

=== "Closed Datacenter"
### Closed Datacenter Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=closed \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Open Edge"
### Open Edge Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=edge \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```
=== "Open Datacenter"
    ### Open Datacenter Submission
```bash
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
cm run script --tags=generate,inference,submission \
--clean \
--preprocess_submission=yes \
--run-checker \
--submitter=MLCommons \
--tar=yes \
--env.CM_TAR_OUTFILE=submission.tar.gz \
--division=open \
--category=datacenter \
--env.CM_DETERMINE_MEMORY_CONFIGURATION=yes \
--quiet
```

* Use `--hw_name="My system name"` to give a meaningful system name. Examples can be seen [here](https://github.com/mlcommons/inference_results_v3.0/tree/main/open/cTuning/systems)
@@ -134,7 +134,7 @@ If you are collecting results across multiple systems you can generate different
Run the following command after **replacing `--repo_url` with your GitHub repository URL**.

```bash
cm run script --tags=push,github,mlperf,inference,submission \
cm run script --tags=push,github,mlperf,inference,submission \
--repo_url=https://github.com/GATEOverflow/mlperf_inference_submissions_v4.1 \
--commit_message="Results on <HW name> added by <Name>" \
--quiet