Merge from GO #59

Merged 28 commits on May 31, 2024
Commits
a091782  Fix skip_if_env for empty env key (arjunsuresh, May 28, 2024)
c90c545  Fix skip_if_env for empty env key (arjunsuresh, May 28, 2024)
73d85b3  Merge branch 'GATEOverflow:mlperf-inference' into mlperf-inference (arjunsuresh, May 28, 2024)
8fdffad  Merge pull request #34 from arjunsuresh/mlperf-inference (arjunsuresh, May 28, 2024)
6656410  int8 datatype alias added for intel mlperf inference (arjunsuresh, May 29, 2024)
1a449a8  don't output mlperf inference power efficiency when none (arjunsuresh, May 29, 2024)
b581ad3  Merge branch 'GATEOverflow:mlperf-inference' into mlperf-inference (arjunsuresh, May 29, 2024)
56fb30b  Clean TMP variables from docker env in the run command (arjunsuresh, May 29, 2024)
486b27f  Merge pull request #35 from arjunsuresh/mlperf-inference (arjunsuresh, May 29, 2024)
dc14736  Typo fix (arjunsuresh, May 29, 2024)
cb5bfa7  Not execute postdeps for fake_runs (arjunsuresh, May 29, 2024)
3115bb2  Merge branch 'GATEOverflow:mlperf-inference' into mlperf-inference (arjunsuresh, May 29, 2024)
cc66235  Merge pull request #36 from arjunsuresh/mlperf-inference (arjunsuresh, May 29, 2024)
cfbcc12  Fixes the code version for Intel mlperf inference v3.1 (arjunsuresh, May 29, 2024)
f0d54ae  Fixes for intel mlperf inference bert (arjunsuresh, May 30, 2024)
c616358  Merge branch 'GATEOverflow:mlperf-inference' into mlperf-inference (arjunsuresh, May 30, 2024)
14f84d3  Merge pull request #37 from arjunsuresh/mlperf-inference (arjunsuresh, May 30, 2024)
a341ccc  Merge branch 'mlcommons:mlperf-inference' into mlperf-inference (arjunsuresh, May 30, 2024)
cacf36c  Added pytorch base image for reference implementation and cuda device (arjunsuresh, May 31, 2024)
b285782  Merge branch 'GATEOverflow:mlperf-inference' into mlperf-inference (arjunsuresh, May 31, 2024)
c48f383  Merge pull request #38 from arjunsuresh/mlperf-inference (arjunsuresh, May 31, 2024)
d474204  Added get,docker deps for cm docker script (arjunsuresh, May 31, 2024)
1b4cc18  Cleanup of docker install scripts (arjunsuresh, May 31, 2024)
9765b8a  Merge branch 'GATEOverflow:mlperf-inference' into mlperf-inference (arjunsuresh, May 31, 2024)
a8f4c0e  Docker run cleanups (arjunsuresh, May 31, 2024)
c67122a  Add nvidia,docker deps for docker run (arjunsuresh, May 31, 2024)
4ef87a4  Cleanups for mobilenet runs (arjunsuresh, May 31, 2024)
d8e8b7a  Merge pull request #39 from arjunsuresh/mlperf-inference (arjunsuresh, May 31, 2024)
42 changes: 21 additions & 21 deletions automation/script/module.py
@@ -1165,26 +1165,27 @@ def _run(self, i):



-        # Check chain of posthook dependencies on other CM scripts. We consider them same as postdeps when
-        # script is in cache
-        if verbose:
-            print (recursion_spaces + ' - Checking posthook dependencies on other CM scripts:')
+        if not fake_run:
+            # Check chain of posthook dependencies on other CM scripts. We consider them same as postdeps when
+            # script is in cache
+            if verbose:
+                print (recursion_spaces + ' - Checking posthook dependencies on other CM scripts:')

-        clean_env_keys_post_deps = meta.get('clean_env_keys_post_deps',[])
+            clean_env_keys_post_deps = meta.get('clean_env_keys_post_deps',[])

-        r = self._call_run_deps(posthook_deps, self.local_env_keys, clean_env_keys_post_deps, env, state, const, const_state, add_deps_recursive,
+            r = self._call_run_deps(posthook_deps, self.local_env_keys, clean_env_keys_post_deps, env, state, const, const_state, add_deps_recursive,
                 recursion_spaces + extra_recursion_spaces,
                 remembered_selections, variation_tags_string, found_cached, debug_script_tags, verbose, show_time, extra_recursion_spaces, run_state)
-        if r['return']>0: return r
+            if r['return']>0: return r

-        if verbose:
-            print (recursion_spaces + ' - Checking post dependencies on other CM scripts:')
+            if verbose:
+                print (recursion_spaces + ' - Checking post dependencies on other CM scripts:')

-        # Check chain of post dependencies on other CM scripts
-        r = self._call_run_deps(post_deps, self.local_env_keys, clean_env_keys_post_deps, env, state, const, const_state, add_deps_recursive,
+            # Check chain of post dependencies on other CM scripts
+            r = self._call_run_deps(post_deps, self.local_env_keys, clean_env_keys_post_deps, env, state, const, const_state, add_deps_recursive,
                 recursion_spaces + extra_recursion_spaces,
                 remembered_selections, variation_tags_string, found_cached, debug_script_tags, verbose, show_time, extra_recursion_spaces, run_state)
-        if r['return']>0: return r
+            if r['return']>0: return r
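
Note on the change above: with this guard, a fake run (fake_run set) skips the posthook/post dependency chains entirely. A minimal sketch of the resulting control flow, using simplified stand-in names for the CM internals:

    # Sketch (assumed simplification of module.py): post-phase deps only run for real runs.
    def run_post_phase(fake_run, posthook_deps, post_deps, run_deps):
        if fake_run:
            return {'return': 0}      # dry run: nothing to resolve
        for deps in (posthook_deps, post_deps):
            r = run_deps(deps)        # resolve each dependency chain in order
            if r['return'] > 0:
                return r              # propagate CM-style error dicts
        return {'return': 0}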



@@ -4319,19 +4320,18 @@ def enable_or_skip_script(meta, env):
     for key in meta:
         meta_key = [str(v).lower() for v in meta[key]]
         if key in env:
-            value = str(env[key]).lower()
-
+            value = str(env[key]).lower().strip()
             if set(meta_key) & set(["yes", "on", "true", "1"]):
                 # Any set value other than false is taken as set
-                if value not in ["no", "off", "false", "0"]:
+                if value not in ["no", "off", "false", "0", ""]:
                     continue
             elif set(meta_key) & set(["no", "off", "false", "0"]):
-                if value in ["no", "off", "false", "0"]:
+                if value in ["no", "off", "false", "0", ""]:
                     continue
             elif value in meta_key:
                 continue
         else:
-            if set(meta_key) & set(["no", "off", "false", "0"]):
+            if set(meta_key) & set(["no", "off", "false", "0", ""]):
                 # If key is missing in env, and if the expected value is False, consider it a match
                 continue
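
For context on the fix: an empty or whitespace-only env value is now treated like an unset/false value. A standalone sketch of the per-key matching rule (a simplified rewrite for illustration, not the CM API itself):

    TRUE_VALUES = {"yes", "on", "true", "1"}
    FALSE_VALUES = {"no", "off", "false", "0", ""}

    def key_matches(expected_values, raw_value):
        # expected_values: the list from enable_if_env/skip_if_env meta
        # raw_value: the env value, or None when the key is absent
        expected = set(str(v).lower() for v in expected_values)
        if raw_value is None:
            # a missing key only matches when the meta expects a false-ish value
            return bool(expected & FALSE_VALUES)
        value = str(raw_value).lower().strip()
        if expected & TRUE_VALUES:
            return value not in FALSE_VALUES   # any non-false value counts as set
        if expected & FALSE_VALUES:
            return value in FALSE_VALUES
        return value in expected

So an entry expecting 'yes' no longer treats an empty string as a match, and an entry expecting 'no' now also fires when the variable is set to '' or a bare space.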

@@ -4348,15 +4348,15 @@ def any_enable_or_skip_script(meta, env):
     for key in meta:
         found = False
         if key in env:
-            value = str(env[key]).lower()
+            value = str(env[key]).lower().strip()

             meta_key = [str(v).lower() for v in meta[key]]

             if set(meta_key) & set(["yes", "on", "true", "1"]):
-                if value not in ["no", "off", "false", "0"]:
+                if value not in ["no", "off", "false", "0", ""]:
                     found = True
-            elif set(meta_key) & set(["no", "off", "false", "0"]):
-                if value in ["no", "off", "false", "0"]:
+            elif set(meta_key) & set(["no", "off", "false", "0", ""]):
+                if value in ["no", "off", "false", "0", ""]:
                     found = True
             elif value in meta_key:
                 found = True
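The any_* variant differs only in aggregation: enable_or_skip_script requires every listed key to match, while any_enable_or_skip_script succeeds as soon as one key matches (this appears to back the *_if_any_env meta keys). In terms of the key_matches sketch above:

    # all-keys semantics (enable_if_env / skip_if_env)
    all(key_matches(meta[k], env.get(k)) for k in meta)
    # any-key semantics (enable_if_any_env / skip_if_any_env)
    any(key_matches(meta[k], env.get(k)) for k in meta)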
13 changes: 12 additions & 1 deletion automation/script/module_misc.py
@@ -1427,6 +1427,11 @@ def dockerfile(i):
                 continue
         '''

+        d_env = i_run_cmd_arc.get('env', {})
+        for key in list(d_env.keys()):
+            if key.startswith("CM_TMP_"):
+                del(d_env[key])
+
         # Check if need to update/map/mount inputs and env
         r = process_inputs({'run_cmd_arc': i_run_cmd_arc,
                             'docker_settings': docker_settings,
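
The CM_TMP_ cleanup keeps transient host-side variables out of the generated container environment. A quick illustration with hypothetical values:

    # list() snapshots the keys so entries can be deleted while iterating
    d_env = {'CM_TMP_CURRENT_SCRIPT_PATH': '/tmp/x', 'CM_MLPERF_DEVICE': 'cpu'}
    for key in list(d_env.keys()):
        if key.startswith("CM_TMP_"):
            del(d_env[key])
    print(d_env)   # {'CM_MLPERF_DEVICE': 'cpu'}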
@@ -1716,6 +1721,13 @@ def docker(i):
     if image_repo == '':
         image_repo = 'cknowledge'

+    # Host system needs to have docker
+    r = self_module.cmind.access({'action':'run',
+                                  'automation':'script',
+                                  'tags': "get,docker"})
+    if r['return'] > 0:
+        return r
+
     for artifact in sorted(lst, key = lambda x: x.meta.get('alias','')):

         meta = artifact.meta
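
With this dependency, the docker flow fails fast on hosts without Docker instead of erroring midway: access() runs the get,docker script (added in this PR, see script/get-docker below), which probes for the binary and falls back to an install script, and any non-zero return propagates out immediately. The standalone equivalent would be roughly 'cm run script --tags=get,docker' (assuming the standard CM CLI).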
@@ -1949,7 +1961,6 @@ def docker(i):
                            'docker_settings':docker_settings,
                            'docker_run_cmd_prefix':i.get('docker_run_cmd_prefix','')})
         if r['return']>0: return r
-
         run_cmd = r['run_cmd_string'] + ' ' + container_env_string + ' --docker_run_deps '

         env['CM_RUN_STATE_DOCKER'] = True
16 changes: 14 additions & 2 deletions script/app-mlperf-inference-intel/_cm.yaml
@@ -173,6 +173,7 @@ variations:
           inference-results
       version: v4.0
   v3.1:
+    group: version
     env:
       CM_MLPERF_INFERENCE_CODE_VERSION: "v3.1"
     adr:
@@ -184,6 +185,7 @@ variations:
           inference-results
       version: v3.1

+
   # Target devices
   cpu:
     group: device
@@ -615,6 +617,9 @@ variations:
     dataset-preprocessed:
       tags: _uint8,_rgb8

+  int8:
+    alias: uint8
+
   int4,gptj_:
     env:
       INTEL_GPTJ_INT4: 'yes'
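
The new int8 entry is a pure alias: selecting the _int8 variation resolves to the existing _uint8 variation and everything attached to it (including the _uint8,_rgb8 preprocessed-dataset tags above), so either datatype name should work on the command line.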
@@ -682,9 +687,16 @@ variations:
   default_env:
     CM_MLPERF_LOADGEN_BATCH_SIZE: 1

-  sapphire-rapids.24c,bert-99:
+  sapphire-rapids.24c,bert_:
     env:
       WORKERS_PER_PROC: 1
+  sapphire-rapids.112c,bert_,offline:
+    env:
+      WORKERS_PER_PROC: 4
+  sapphire-rapids.112c,bert_,server:
+    env:
+      WORKERS_PER_PROC: 8
+

 docker:
-  docker_real_run: False
+  real_run: False
3 changes: 2 additions & 1 deletion script/app-mlperf-inference-intel/run_bert_harness.sh
@@ -1,6 +1,7 @@
 #!/bin/bash

-THREADS_PER_INSTANCE=$(((4 * ${CM_HOST_CPU_THREADS_PER_CORE}) / ${CM_HOST_CPU_SOCKETS}))
+WORKERS_PER_PROC=${WORKERS_PER_PROC:-4}
+THREADS_PER_INSTANCE=$((( ${WORKERS_PER_PROC} * ${CM_HOST_CPU_THREADS_PER_CORE}) / ${CM_HOST_CPU_SOCKETS}))

 export LD_PRELOAD=${CONDA_PREFIX}/lib/libjemalloc.so
 export MALLOC_CONF="oversize_threshold:1,background_thread:true,percpu_arena:percpu,metadata_thp:always,dirty_decay_ms:9000000000,muzzy_decay_ms:9000000000";
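A worked example of the new formula (hypothetical host with 2 threads per core and 2 sockets): the default WORKERS_PER_PROC=4 gives THREADS_PER_INSTANCE = (4 * 2) / 2 = 4, while the sapphire-rapids.112c server config above sets WORKERS_PER_PROC: 8, giving (8 * 2) / 2 = 8. Previously the multiplier was hard-coded to 4.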
3 changes: 3 additions & 0 deletions script/app-mlperf-inference-mlcommons-python/_cm.yaml
@@ -31,6 +31,9 @@ default_env:
   CM_MLPERF_SUT_NAME_IMPLEMENTATION_PREFIX: reference
   CM_MLPERF_SUT_NAME_RUN_CONFIG_SUFFIX: ''

+docker:
+  real_run: False
+
 # Map script inputs to environment variables
 input_mapping:
   count: CM_MLPERF_LOADGEN_QUERY_COUNT
14 changes: 11 additions & 3 deletions script/app-mlperf-inference/_cm.yaml
@@ -252,7 +252,8 @@ variations:
       interactive: True
       extra_run_args: ' --runtime=nvidia --ulimit memlock=-1 --cap-add SYS_ADMIN --cap-add SYS_TIME --security-opt apparmor=unconfined --security-opt seccomp=unconfined'
       base_image: nvcr.io/nvidia/mlperf/mlperf-inference:mlpinf-v3.1-cuda12.2-cudnn8.9-x86_64-ubuntu20.04-l4-public
+      os: "ubuntu"
       os_version: "20.04"
       deps:
         - tags: get,mlperf,inference,nvidia,scratch,space
         - tags: get,nvidia-docker
@@ -318,8 +319,6 @@ variations:
       real_run: false
       run: true
     docker_input_mapping:
-      imagenet_path: IMAGENET_PATH
-      gptj_checkpoint_path: GPTJ_CHECKPOINT_PATH
       criteo_preprocessed_path: CRITEO_PREPROCESSED_PATH
       dlrm_data_path: DLRM_DATA_PATH
       intel_gptj_int8_model_path: CM_MLPERF_INFERENCE_INTEL_GPTJ_INT8_MODEL_PATH
@@ -899,9 +898,16 @@ variations:
     add_deps_recursive:
       mlperf-inference-implementation:
         tags: _cpu
+
+  cuda,reference:
+    docker:
+      base_image: nvcr.io/nvidia/pytorch:24.03-py3
+
   cuda:
     docker:
       all_gpus: 'yes'
+      deps:
+        - tags: get,nvidia-docker
     group:
       device
     env:
@@ -1151,6 +1157,8 @@ variations:
     nvidia-inference-server:
       version: r3.1
       tags: _ctuning
+    intel-harness:
+      tags: _v3.1
   default_env:
     CM_SKIP_SYS_UTILS: 'yes'
     CM_REGENERATE_MEASURE_FILES: 'yes'
64 changes: 64 additions & 0 deletions script/get-docker/customize.py
@@ -0,0 +1,64 @@
from cmind import utils
import os

def preprocess(i):

    os_info = i['os_info']

    env = i['env']

    automation = i['automation']

    recursion_spaces = i['recursion_spaces']

    file_name = 'docker.exe' if os_info['platform'] == 'windows' else 'docker'
    env['FILE_NAME'] = file_name

    if 'CM_DOCKER_BIN_WITH_PATH' not in env:
        r = i['automation'].find_artifact({'file_name': file_name,
                                           'env': env,
                                           'os_info':os_info,
                                           'default_path_env_key': 'PATH',
                                           'detect_version':True,
                                           'env_path_key':'CM_DOCKER_BIN_WITH_PATH',
                                           'run_script_input':i['run_script_input'],
                                           'recursion_spaces':recursion_spaces})
        if r['return'] >0 :
            if r['return'] == 16:
                run_file_name = "install"
                r = automation.run_native_script({'run_script_input':i['run_script_input'], 'env':env, 'script_name':run_file_name})
                if r['return'] >0: return r
            else:
                return r

    return {'return':0}

def detect_version(i):
    r = i['automation'].parse_version({'match_text': r'Docker version\s*([\d.]+)',
                                       'group_number': 1,
                                       'env_key':'CM_DOCKER_VERSION',
                                       'which_env':i['env']})
    if r['return'] >0: return r

    version = r['version']

    print (i['recursion_spaces'] + ' Detected version: {}'.format(version))
    return {'return':0, 'version':version}

def postprocess(i):
    env = i['env']

    r = detect_version(i)

    if r['return'] >0: return r

    version = r['version']
    found_file_path = env['CM_DOCKER_BIN_WITH_PATH']

    found_path = os.path.dirname(found_file_path)
    env['CM_DOCKER_INSTALLED_PATH'] = found_path
    env['+PATH'] = [ found_path ]

    env['CM_DOCKER_CACHE_TAGS'] = 'version-'+version

    return {'return':0, 'version': version}
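
The parse_version call matches the standard 'docker --version' banner that run.sh (below) captures into tmp-ver.out. A minimal demonstration of what the regex extracts (the banner string is a hypothetical example):

    import re

    banner = "Docker version 24.0.7, build afdd53b"
    m = re.search(r'Docker version\s*([\d.]+)', banner)
    print(m.group(1))   # -> 24.0.7

Also note the fallback in preprocess(): return code 16 from find_artifact (not found) triggers the OS-specific install script instead of failing outright.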
13 changes: 13 additions & 0 deletions script/get-docker/install-centos.sh
@@ -0,0 +1,13 @@
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin

cmd="sudo usermod -aG docker $USER"
echo "$cmd"
eval "$cmd"
test $? -eq 0 || exit $?

echo "Please relogin to the shell so that the new group is effective"
exit 1
#exec newgrp docker
#sudo su - $USER
@@ -35,6 +35,9 @@ cmd="sudo usermod -aG docker $USER"
 echo "$cmd"
 eval "$cmd"
 test $? -eq 0 || exit $?
+
+echo "Please relogin to the shell so that the new group is effective"
+exit 1
 #exec newgrp docker
 #sudo su - $USER

2 changes: 2 additions & 0 deletions script/get-docker/install.bat
@@ -0,0 +1,2 @@
echo "Please install docker to continue"
exit 1
2 changes: 2 additions & 0 deletions script/get-docker/install.sh
@@ -0,0 +1,2 @@
echo "Please install docker to continue"
exit 1
3 changes: 3 additions & 0 deletions script/get-docker/run.sh
@@ -0,0 +1,3 @@
#!/bin/bash
docker --version > tmp-ver.out
test $? -eq 0 || exit 1
2 changes: 1 addition & 1 deletion script/get-mlperf-inference-loadgen/run.sh
@@ -45,7 +45,7 @@ MLPERF_INFERENCE_PYTHON_SITE_BASE=${INSTALL_DIR}"/python"

 cd "${CM_MLPERF_INFERENCE_SOURCE}/loadgen"
 CFLAGS="-std=c++14 -O3" ${CM_PYTHON_BIN_WITH_PATH} setup.py bdist_wheel
-${CM_PYTHON_BIN_WITH_PATH} -m pip install --force-reinstall `ls dist/mlperf_loadgen-*cp3${PYTHON_MINOR_VERSION}*.whl` --target=${MLPERF_INFERENCE_PYTHON_SITE_BASE}
+${CM_PYTHON_BIN_WITH_PATH} -m pip install --force-reinstall `ls dist/mlperf_loadgen-*cp3${PYTHON_MINOR_VERSION}*.whl` --target="${MLPERF_INFERENCE_PYTHON_SITE_BASE}"
 if [ "${?}" != "0" ]; then exit 1; fi

 # Clean the built wheel
9 changes: 5 additions & 4 deletions script/get-mlperf-inference-utils/mlperf_utils.py
@@ -262,19 +262,20 @@ def get_result_table(results):
                 row.append(val)
             row.append("-")

+            val1 = results[model][scenario].get('TEST01')
+            val2 = results[model][scenario].get('TEST05')
+            val3 = results[model][scenario].get('TEST04')
+
             #if results[model][scenario].get('power','') != '':
             #    row.append(results[model][scenario]['power'])
             if results[model][scenario].get('power_efficiency','') != '':
                 val = str(results[model][scenario]['power_efficiency'])
                 if not results[model][scenario].get('power_valid', True):
                     val = "X "+val
                 row.append(val)
-            else:
+            elif val1 or val2 or val3: #Don't output unless there are any further column data
                 row.append(None)

-            val1 = results[model][scenario].get('TEST01')
-            val2 = results[model][scenario].get('TEST05')
-            val3 = results[model][scenario].get('TEST04')
             if val1:
                 row.append(val1)
             if val2:
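
The reordering above makes the placeholder conditional: None is appended in the power-efficiency column only when at least one compliance result (TEST01/TEST04/TEST05) will follow it, so rows with neither power data nor compliance results no longer end with an empty cell. Resulting row shapes (hypothetical values):

    # power efficiency present:           [..., "-", "123.4", "passed", ...]
    # no power, TEST01 present:           [..., "-", None, "passed"]   # None keeps columns aligned
    # no power, no compliance results:    [..., "-"]                   # no placeholder appended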