Removed mlperf.conf check in submission checker, removed equal issue … #1887

Merged
67 commits merged into master on Oct 31, 2024


arjunsuresh
Contributor

@arjunsuresh arjunsuresh commented Oct 23, 2024

…mode check in conf files

This PR also:

  1. Uses model-info.json in the submission measurements directory (MLPerf inference v4.1 postmortem item "Rename <system_desc_id>_<implementation_id>_<scenario>.json to model-info.json", policies#182)
  2. Removes the hardwired VERSION in loadgen
  3. Fixes #1855: "Improve the submission checker to safely exclude the invalid submissions and create a submission tarball of only valid submissions"
  4. Fixes the PyPI loadgen wheel issue: the mlperf.conf details are now embedded in the PyPI wheel file
  5. Adds a GitHub Actions test for MLPerf inference using the loadgen wheel downloaded from PyPI
  6. Adds a --skip-extra-accuracy-files-check option to the submission checker to skip checking the images folder for SDXL
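For readers unfamiliar with how such an opt-out flag is typically wired, here is a minimal sketch. It is not the code from this PR: only the flag name --skip-extra-accuracy-files-check and the "images folder for SDXL" detail come from the description above. The helper names, the EXTRA_ACCURACY_DIRS table, and the "sdxl" key are hypothetical and chosen purely for illustration.

```python
import argparse
import os

# Hypothetical table: extra directories expected next to the accuracy logs.
# The "sdxl" key and this layout are assumptions for illustration only.
EXTRA_ACCURACY_DIRS = {"sdxl": ["images"]}


def build_parser():
    """Build a tiny CLI parser exposing the opt-out flag named in the PR."""
    parser = argparse.ArgumentParser(
        description="Illustrative sketch, not the real submission checker CLI"
    )
    parser.add_argument("--input", required=True, help="submission directory")
    parser.add_argument(
        "--skip-extra-accuracy-files-check",
        action="store_true",
        help="do not require extra accuracy artifacts such as the images folder",
    )
    return parser


def missing_extra_accuracy_dirs(accuracy_dir, model, skip_check):
    """Return the expected extra directories that are absent.

    When skip_check is True the check is bypassed and nothing is reported.
    """
    if skip_check:
        return []
    expected = EXTRA_ACCURACY_DIRS.get(model, [])
    return [
        name
        for name in expected
        if not os.path.isdir(os.path.join(accuracy_dir, name))
    ]
```

With argparse, the dashes in the flag become underscores, so the parsed value is read as args.skip_extra_accuracy_files_check and can be passed straight through as skip_check.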

@arjunsuresh arjunsuresh requested a review from a team as a code owner October 23, 2024 12:57
github-actions bot commented Oct 23, 2024

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@arjunsuresh
Contributor Author

Test output

arjun@arjun-spr:~/inference/tools/submission$ python3 preprocess_submission.py --input=$HOME/inference_results_v4.1 --output=test --submitter=AMD
[2024-10-29 18:30:20,247 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,248 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,250 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,250 submission_checker.py:1385 INFO] Target latency: None, Latency: 919869139234, Scenario: Offline
[2024-10-29 18:30:20,252 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99.9/Server/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,253 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99.9/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,255 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99.9/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,255 submission_checker.py:1366 INFO] Target latency: 20000000000, Early Stopping Latency: 0, Scenario: Server
[2024-10-29 18:30:20,257 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99/Offline/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,258 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,259 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,259 submission_checker.py:1385 INFO] Target latency: None, Latency: 919869139234, Scenario: Offline
[2024-10-29 18:30:20,261 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99/Server/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,262 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,264 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-9374F/llama2-70b-99/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,264 submission_checker.py:1366 INFO] Target latency: 20000000000, Early Stopping Latency: 0, Scenario: Server
[2024-10-29 18:30:20,265 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,267 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,268 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,268 submission_checker.py:1385 INFO] Target latency: None, Latency: 2352997093546, Scenario: Offline
[2024-10-29 18:30:20,268 submission_checker.py:2752 ERROR] closed/AMD/compliance/1xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/TEST06/verify_accuracy.txt is missing in closed/AMD/compliance/1xMI300X_2xEPYC-9374F/llama2-70b-99.9/Offline/TEST06
[2024-10-29 18:30:20,285 preprocess_submission.py:282 WARNING] Offline scenario result is invalid for 1xMI300X_2xEPYC-9374F: llama2-70b-99.9 in closed division. Accuracy: True, Performance: True. Compliance: False. Moving llama2-70b-99.9 results to open...
[2024-10-29 18:30:20,287 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99/Offline/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,288 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,289 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,289 submission_checker.py:1385 INFO] Target latency: None, Latency: 2352997093546, Scenario: Offline
[2024-10-29 18:30:20,291 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99/Server/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,292 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,293 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/1xMI300X_2xEPYC-9374F/llama2-70b-99/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,294 submission_checker.py:1366 INFO] Target latency: 20000000000, Early Stopping Latency: 0, Scenario: Server
[2024-10-29 18:30:20,295 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99.9/Offline/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,297 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99.9/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,298 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99.9/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,298 submission_checker.py:1385 INFO] Target latency: None, Latency: 897168170178, Scenario: Offline
[2024-10-29 18:30:20,299 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99.9/Server/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,301 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99.9/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,302 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99.9/Server/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,302 submission_checker.py:1366 INFO] Target latency: 20000000000, Early Stopping Latency: 0, Scenario: Server
[2024-10-29 18:30:20,304 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99/Offline/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,305 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,306 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99/Offline/performance/run_1/mlperf_log_detail.txt.
[2024-10-29 18:30:20,306 submission_checker.py:1385 INFO] Target latency: None, Latency: 897168170178, Scenario: Offline
[2024-10-29 18:30:20,308 log_parser.py:59 INFO] Sucessfully loaded MLPerf log from closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99/Server/accuracy/mlperf_log_detail.txt.
[2024-10-29 18:30:20,308 preprocess_submission.py:176 WARNING] [Errno 2] No such file or directory: 'closed/AMD/results/8xMI300X_2xEPYC-TURIN/llama2-70b-99/Server/performance/run_1/mlperf_log_detail.txt'
[2024-10-29 18:30:20,308 preprocess_submission.py:261 WARNING] Server scenario result is invalid for 8xMI300X_2xEPYC-TURIN: llama2-70b-99 in closed and open divisions. Accuracy: True, Performance: False. Removing it...
[2024-10-29 18:30:20,321 preprocess_submission.py:284 WARNING] Server scenario result is invalid for 8xMI300X_2xEPYC-TURIN: llama2-70b-99 in closed division. Accuracy: True, Performance: False. Compliance: True. Moving other scenario results of llama2-70b-99 to open...
[2024-10-29 18:30:20,322 preprocess_submission.py:473 INFO] Division closed, submitter AMD, system 8xMI300X_2xEPYC-TURIN:                                             copying llama2-70b-99.9 results to llama2-70b-99
[2024-10-29 18:30:20,323 preprocess_submission.py:473 INFO] Division closed, submitter AMD, system 8xMI300X_2xEPYC-TURIN:                                             copying llama2-70b-99.9 results to llama2-70b-99
[2024-10-29 18:30:20,324 preprocess_submission.py:473 INFO] Division closed, submitter AMD, system 8xMI300X_2xEPYC-TURIN:                                             copying llama2-70b-99.9 results to llama2-70b-99
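Most of the output above is INFO noise; the lines that matter are the single ERROR and the WARNINGs. The following is a small sketch for filtering a saved copy of such a log. It assumes only the line layout visible above ("[timestamp file:line LEVEL] message"); the function names are hypothetical and are not part of the submission tools.

```python
import re
from collections import Counter

# Matches lines shaped like:
# [2024-10-29 18:30:20,268 submission_checker.py:2752 ERROR] message...
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<source>[\w.]+):(?P<lineno>\d+) "
    r"(?P<level>[A-Z]+)\] (?P<message>.*)$"
)


def parse_log(lines):
    """Parse log lines into dicts; lines that do not match (e.g. the shell prompt) are skipped."""
    records = []
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match:
            records.append(match.groupdict())
    return records


def problems(records):
    """Keep only WARNING and ERROR records."""
    return [r for r in records if r["level"] in ("WARNING", "ERROR")]


def count_levels(records):
    """Count records per log level."""
    return Counter(r["level"] for r in records)
```

Calling problems(parse_log(open(path))) on a saved log would list only the entries that explain why a result was moved to open or removed.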

@mrmhodak
Contributor

@pgmpablo157321 to take a look

Contributor

@pgmpablo157321 pgmpablo157321 left a comment

Tested the Loadgen build, the demo runs, and the submission checker. No issues, LGTM.

@arjunsuresh
Contributor Author

Thank you @pgmpablo157321 for checking.

@arjunsuresh arjunsuresh merged commit c8c1e61 into master Oct 31, 2024
17 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Oct 31, 2024