[VC_1.11.0] Failed to parse SPECjvm metrics from results because index was outside the bounds of the array if run more than 2 times. #194

Closed
xuwangfe opened this issue Oct 24, 2023 · 17 comments

Comments

@xuwangfe

xuwangfe commented Oct 24, 2023

Describe the bug
VirtualClient fails to parse SPECjvm metrics from the results with an "index was outside the bounds of the array" error if the workload is run more than 2 times.

To Reproduce
Steps to reproduce the behavior:

  1. All VirtualClient versions have this issue (1.0-1.11).
  2. VirtualClient --profile=PERF-CPU-COREMARK.json --iterations=3 --parameters="CompilerVersion=11" --packages=https://virtualclient.blob.core.windows.net/packages
  3. System under test configuration (OS, VM size, architecture, etc.):
    Failed on Ubuntu 20.04, Ubuntu 22.04, and Red Hat 8.2; the failure rate is 100%.

Expected behavior
Parsing succeeds for multiple SPECjvm iterations.

Screenshots
See the attachment.

Additional context
Only the result of the first run could be parsed successfully.
It still fails after deleting the old VC version and setting up a new one.
Failed_Specjvm

@yangpanMS
Contributor

Hi @xuwangfe thanks for reporting, acknowledged, looking into it.

@yangpanMS
Contributor

To unblock you, please check whether there is anything matching vc/packages/specjvm2008/results/*.txt and delete it.

image

We delete the files after parsing, so I'm still trying to repro this.

@xuwangfe
Author

There is no *.txt file in the /home/VC_1.11/content/linux-x64/packages/specjvm.2008.0.0/results path.
All the test results are listed as SPECjvm2008.001, 002, 003...
Only the first result was parsed and has htmlsub/summary data; the other results only have the raw data.
We even tried deleting the whole VC_1.11 folder and setting up a new VC folder, but that failed as well.
image

@yangpanMS
Contributor

It would help me a lot if you could package the results folder so that I can debug right away. Otherwise, my run will finish in a couple of hours, so I will check tomorrow.

@xuwangfe
Author

Sorry, I can't send you the result data due to policy limitations.
It's OK for me to get your feedback tomorrow.
Thanks for your help.

@yangpanMS
Contributor

image
Unfortunately, I was not able to repro. I ran iterations=2 and it finished successfully, and I got 18 metrics with "ops/m" (9 per run). I understand that you can't share your results with us, but we still have a couple of ways to debug this.

  1. The culprit is likely in the SPECjvm2008.002.txt that VC fails to parse every time. Compare it with the example we use for our unit test:
    https://github.com/microsoft/VirtualClient/blob/main/src/VirtualClient/VirtualClient.Actions.UnitTests/Examples/SPECjvm/SPECjvm2008.012.txt . Check whether you got a complete result file, especially in this section:
    image
    The fact that you are not getting a .summary file might mean something went wrong in the java process.

  2. If you are open to it, you could scramble (but keep digits the same) the important numbers in SPECjvm2008.002.txt so that sharing it does not violate your policy. I can run it through our unit test and see if it fails.

  3. Another thing I would like to confirm is whether you completely deleted VC_1.11 and ran VirtualClient inside the new folder, because I'm seeing SPECjvm runs from 001 to 011. I would just like to confirm that those 11 runs happened after you deleted everything.
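
As a quick completeness check for the first suggestion, the scored lines in a text report can be counted. This is only a sketch: the function name `count_ops_metrics` is hypothetical, and it assumes nothing about the report layout beyond scored lines containing the literal string "ops/m" (the successful run above produced 9 such metrics per run).

```shell
# Hypothetical helper: count the lines in a SPECjvm text report that carry
# an "ops/m" score. Prints 0 for an empty or missing file instead of failing.
count_ops_metrics() {
    local report="$1"
    if [ ! -s "$report" ]; then
        # -s is false for missing files and for zero-byte files.
        echo 0
        return 0
    fi
    # grep -c prints the match count; it exits non-zero when the count is 0,
    # so "|| true" keeps the function's exit status clean.
    grep -c 'ops/m' "$report" || true
}
```

For example, `count_ops_metrics SPECjvm2008.002/SPECjvm2008.002.txt` prints 0 when the report is empty, which would point at the java process rather than at the parser.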

@xuwangfe
Author

xuwangfe commented Oct 25, 2023

  1. There are 4 items in total in the SPECjvm2008.002 folder: SPECjvm2008.002.txt, SPECjvm2008.002.html, SPECjvm2008.002.raw, and the images folder.
  2. Only the SPECjvm2008.002.raw file contains raw data; SPECjvm2008.002.html, SPECjvm2008.002.txt, and the images folder are all empty (the empty files show a size of 0 in the picture above).
  3. Since the SPECjvm2008.002.txt file is empty, there is no need to upload it. Would any other log files be useful?
  4. I confirm that we tried the delete experiments, including deleting the whole VC folder, deleting every specjvm.2008 folder from the packages path, and switching between VC versions, SUTs, and platforms. We also tried reinstalling the OS (Ubuntu 20.04, 22.04) and setting up a new OS environment (Red Hat 8.2), but all of them failed.

Could you install a new OS directly on the host and try again? The failure rate is 100% on my side.

@yangpanMS
Contributor

yangpanMS commented Oct 25, 2023

Thank you @xuwangfe. I did exactly step 4 with a fresh Ubuntu 20.04 VM. The fact that SPECjvm2008.002.txt is empty is the problem: our parser fails because the file is empty.
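
To spot such files quickly, something like the following could list every zero-byte text report under a results directory. It is a minimal sketch: the function name `find_empty_reports` is hypothetical, and the `SPECjvm2008.*.txt` pattern is taken from the file names mentioned in this thread.

```shell
# Hypothetical helper: list zero-byte SPECjvm text reports under a results
# directory, searching subfolders such as SPECjvm2008.002/ as well.
find_empty_reports() {
    local results_dir="$1"
    # -size 0 matches only files that contain no data at all.
    find "$results_dir" -type f -name 'SPECjvm2008.*.txt' -size 0
}
```

For example: `find_empty_reports /home/VC_1.11/content/linux-x64/packages/specjvm.2008.0.0/results`. Any path it prints is a report that cannot be parsed.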

If you can't share the log files, would you be open to running SPECjvm without VC to see what happens? The command matches the regex below; you need to adjust the GC thread count and the memory size you are using. VC uses 85% of total memory by default.
java -XX:ParallelGCThreads=[0-9]+ -XX:+UseParallelGC -XX:+UseAES -XX:+UseSHA -Xms[0-9]+m -Xmx[0-9]+m -jar SPECjvm2008.jar -ikv -ict compress crypto derby mpegaudio scimark serial sunflow
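
A rough way to fill in the placeholders is sketched below. The function name `build_specjvm_command` is hypothetical, setting the GC thread count to the number of logical cores is an assumption, and the exact rounding VC applies to the 85% figure is not stated here, so the computed heap size may differ slightly from what VC would use.

```shell
# Hypothetical helper: print a standalone SPECjvm command line.
#   $1  total memory in kB (for example MemTotal from /proc/meminfo)
#   $2  value for -XX:ParallelGCThreads (assumption: number of logical cores)
build_specjvm_command() {
    local total_kb="$1" threads="$2"
    # 85% of total memory, expressed in MB (integer arithmetic, rounds down).
    local heap_mb=$(( total_kb / 1024 * 85 / 100 ))
    echo "java -XX:ParallelGCThreads=${threads} -XX:+UseParallelGC -XX:+UseAES -XX:+UseSHA -Xms${heap_mb}m -Xmx${heap_mb}m -jar SPECjvm2008.jar -ikv -ict compress crypto derby mpegaudio scimark serial sunflow"
}

# Example on Linux (left commented out so nothing runs when the file is sourced):
# total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
# build_specjvm_command "$total_kb" "$(nproc)"
```

The printed command can then be run from the folder that contains SPECjvm2008.jar, using the java binary from the JDK package that VC installed.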

@xuwangfe
Author

xuwangfe commented Oct 26, 2023

It passed with the command below and generated the logs normally.
/home/VC_1.11/content/linux-x64/packages/microsoft-jdk-17.0.5/linux-x64/bin/java -XX:ParallelGCThreads=104 -XX:+UseParallelGC -XX:+UseAES -XX:+UseSHA -Xms328229m -Xmx328229m -jar SPECjvm2008.jar -ikv -ict compress crypto derby mpegaudio scimark serial sunflow
image

I will run it 3 more times tonight directly, without the VC framework, and check the next results (013/014/015).

@yangpanMS
Contributor

Sorry, I can't unblock you right now without being able to reproduce the issue. I still have one question, though: did you remove the SPECjvm2008.001-011 folders? It looks like your manual run started from SPECjvm2008.012 instead of SPECjvm2008.001, which makes me think you didn't delete the whole VC_1.11 folder when reinstalling. Please confirm that you removed the whole /specjvm.2008.0.0/results folder and then reran. Thank you.

@xuwangfe
Author

  1. SPECjvm with the VC framework still failed after I ran the java command directly:
    image
  2. I renamed VC_1.11 to VC_1.11_old and set up a fresh VC_1.11_new folder for the re-run; it failed as well.
    image
  3. I have uploaded the VC_1.11_new logs below; I hope they are useful to you.
    vc_1.11_new_nohup.txt
    metrics-20231027.zip
    SPECjvm2008.zip

@yangpanMS
Contributor

yangpanMS commented Oct 27, 2023

Hi xuwangfe, when you copy to VC_1.11_old or VC_1.11_new, you are also copying over the results folder, which contains the empty result log. So no matter how you copy, VC will still look at that empty result file first. You want that removed.

image

Please run this
sudo rm -r /home/VC_1.11/content/linux-x64/packages/specjvm.2008.0.0/results/

Let me know if you are open to a Zoom/Teams meeting where I help resolve this quickly.

@xuwangfe
Author

The steps I tried are listed below:

  1. ~# mv /home/VC_1.11/ /home/VC_1.11_old
  2. ~# mkdir -p /home/VC_1.11_new
  3. ~# cd /home/VC_1.11_new/
  4. ~# cp -rf /home/VC_1.11_old/virtualclient.1.11.0.zip /home/VC_1.11_new/virtualclient.1.11.0.zip // the 1.11 zip package downloaded from MSFT
  5. ~# unzip virtualclient.1.11.0.zip
  6. ~# chmod 777 * -R
  7. ~# echo "export PATH=$PATH:/home/VC_1.11_new/content/linux-x64/" >> /root/.bashrc
  8. ~# source /root/.bashrc
  9. ~# VirtualClient --profile=PERF-SPECJVM.json --iterations=2 --parameters="CompilerVersion=11" --packages=https://virtualclient.blob.core.windows.net/packages

I think this is equivalent to the rm.

Please feel free to set up a Teams meeting, thanks.

@yangpanMS
Contributor

I would be interested to see the output of "ls /home/VC_1.11_new/content/linux-x64/packages/specjvm.2008.0.0/results/" right after the installation; it should be empty after the unzip.
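
That check could also be scripted before each run. The snippet is a minimal sketch with a hypothetical function name, `results_dir_is_clean`; it treats a missing directory the same as an empty one.

```shell
# Hypothetical helper: succeed (exit status 0) when the results directory is
# missing or contains no entries, and fail (exit status 1) otherwise.
results_dir_is_clean() {
    local results_dir="$1"
    # A directory that does not exist has no stale results.
    [ ! -d "$results_dir" ] && return 0
    # ls -A lists every entry except "." and "..", including hidden ones.
    [ -z "$(ls -A "$results_dir")" ]
}
```

For example: `results_dir_is_clean /home/VC_1.11_new/content/linux-x64/packages/specjvm.2008.0.0/results && echo "clean"`.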

May I know your timezone? I will try to accommodate your working hours.

@xuwangfe
Author

Beijing time; 9:00 - 17:00 works for me.

@yangpanMS
Contributor

What's a good email to reach you? Or you could send it to virtualclient@microsoft.com if you prefer to send it privately.

@xuwangfe
Author

xuwangfe commented Nov 6, 2023

Thanks to YangPan for the help.
The issue was resolved after adding sudo to the command.
image

@xuwangfe xuwangfe closed this as completed Nov 6, 2023