[CI] Refactor MIGraphX model testing with Jenkins credential access. #1671

stefankoncarevic · 2024-10-07T12:37:42Z

This commit implements server access via Jenkins credentials, replacing the need for the previously used model-testing.sh script. The new testing framework now supports five architectures: MI200, MI300, Navi2x, Navi3x, and Navi4x.

It supports performance (perf) and verification (verify) testing, with perf as the default. A dedicated log file is generated for each architecture, containing detailed information about model performance.

The testing framework runs each model at batch sizes 1, 32, and 64. Tests are scheduled to run once a week, providing regular updates on model performance metrics.
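As a rough illustration of the matrix described above, the sketch below enumerates one run per model and batch size, falling back to perf when no mode is given. The function name `build_test_matrix`, the placeholder model names, and the output format are hypothetical and are not taken from the Jenkinsfile in this PR.

```shell
#!/bin/bash
# Hypothetical sketch: enumerate the (model, batch size, mode) combinations
# described in the PR text. Names and output format are illustrative only.

# Batch sizes named in the PR description.
BATCH_SIZES=(1 32 64)

# Usage: build_test_matrix MODE MODEL...
# MODE is "perf" or "verify"; an empty MODE falls back to "perf",
# mirroring the default described in the PR.
build_test_matrix() {
    local mode="${1:-perf}"
    shift
    case "$mode" in
        perf|verify) ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    local model batch
    for model in "$@"; do
        for batch in "${BATCH_SIZES[@]}"; do
            echo "$model batch=$batch mode=$mode"
        done
    done
}
```

For example, `build_test_matrix perf modelA modelB` prints six lines, one per model and batch size.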

This update enhances the efficiency and maintainability of our model testing process, streamlining access and reporting for different architectures.

Currently, there are issues with mounting models on the Navi4x machine, and efforts are underway to resolve this. All other architectures are functioning correctly.

krzysz00

Overall note: we don't want this to be a shell script that manually manages Docker containers.

krzysz00 · 2024-10-07T14:51:11Z

mlir/utils/jenkins/Jenkinsfile.migraphxintegration

-"""
+    sh """ #!/bin/bash -x
+
+    if [ \$(docker ps -a -q -f name=migraphx) ]; then


... Jenkins should be managing the Docker starts/stops, not us

krzysz00 · 2024-10-07T14:51:30Z

mlir/utils/jenkins/Jenkinsfile.migraphxintegration

+        echo "sshfs is already installed."
+    fi
+
+    if ! command -v sshpass &> /dev/null


This should be in the Dockerfile

krzysz00 · 2024-10-07T14:52:58Z

mlir/utils/jenkins/Jenkinsfile.migraphxintegration

+    sshfs_port="22"
+
+    known_hosts_file="/tmp/known_hosts"
+    ssh-keyscan -p "$sshfs_port" "$sshfs_host" > "$known_hosts_file"


... I'd argue that supplying the public key via a Jenkins credential is a better idea than trusting a keyscan.

The Tuna folks like to use Jenkins environment variables for these things, so they can set them up in the job and keep them out of GitHub. IT may also complain, even just about visible hostnames.

Our codecov stage uses withCredentials to fetch a Jenkins credential into an environment variable.

My unpublished efforts to use an ssh tunnel for a Tuna database connection settled on sshagent (credentials: ['rocmlir-ixt-rack-15']) { ... ssh commands ... }

But note that ssh is weird on our CI nodes: the lockharts differ from the 10.216.64.100 hosts, which in turn differ from the rest.
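One way to act on the keyscan concern, sketched under assumptions: the server's known_hosts line is stored as a Jenkins credential and exposed to the shell step as an environment variable (called `PINNED_HOST_KEY` here, a hypothetical name), then written to a private known_hosts file that ssh is told to check strictly. The helper name `write_pinned_known_hosts` is also illustrative and does not appear in the PR.

```shell
#!/bin/bash
# Hypothetical sketch: pin the server's host key from a value injected by
# Jenkins instead of trusting whatever ssh-keyscan returns at run time.

# Usage: write_pinned_known_hosts ENTRY FILE
# ENTRY is a full known_hosts line ("host key-type base64-key").
write_pinned_known_hosts() {
    local entry="$1" file="$2"
    if [ -z "$entry" ] || [ -z "$file" ]; then
        echo "error: pinned host key and target file are required" >&2
        return 1
    fi
    # A newly created file gets owner-only permissions via the umask.
    ( umask 077 && printf '%s\n' "$entry" > "$file" )
}

# Example use inside the job (PINNED_HOST_KEY, sshfs_host and sshfs_port are
# assumed to be set by the surrounding pipeline):
#   write_pinned_known_hosts "$PINNED_HOST_KEY" /tmp/known_hosts
#   ssh -p "$sshfs_port" \
#       -o UserKnownHostsFile=/tmp/known_hosts \
#       -o StrictHostKeyChecking=yes "$sshfs_host" true
```

With `StrictHostKeyChecking=yes`, ssh refuses to connect when the server's key does not match the pinned entry, which is the property a run-time keyscan cannot provide.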

krzysz00 · 2024-10-07T14:53:29Z

mlir/utils/jenkins/Jenkinsfile.migraphxintegration

+    echo "int8:$int8"
+    echo "checkFor:$checkFor"
+
+    docker pull rocm/mlir-migraphx-ci:latest


This should be a stage{} in Jenkins etc.

codecov · 2024-10-07T16:06:21Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.79%. Comparing base (e454b5d) to head (d31ca83).
Report is 6 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #1671      +/-   ##
===========================================
- Coverage    77.84%   77.79%   -0.06%     
===========================================
  Files          100      100              
  Lines        27720    27732      +12     
  Branches      4027     4028       +1     
===========================================
- Hits         21578    21573       -5     
- Misses        4498     4522      +24     
+ Partials      1644     1637       -7

Flag	Coverage Δ
mfma	`77.79% <ø> (-0.06%)`	⬇️

Flags with carried forward coverage won't be shown.

stefankoncarevic requested review from manupak and pcf000 October 7, 2024 12:37

stefankoncarevic requested review from jerryyin and sjw36 as code owners October 7, 2024 12:37

krzysz00 reviewed Oct 7, 2024

stefankoncarevic force-pushed the migraphx-model-refactor branch from 13eab37 to 2661e23 Compare October 22, 2024 13:44

[CI] Refactor MIGraphX model testing with Jenkins credential access

d31ca83

stefankoncarevic force-pushed the migraphx-model-refactor branch from 2661e23 to d31ca83 Compare October 24, 2024 09:41
