Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter out malformed nvidia-smi process_name XML tag #1910

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

jfennick
Copy link
Contributor

No description provided.

@codecov
Copy link

codecov bot commented Sep 19, 2023

Codecov Report

Merging #1910 (239ce5a) into main (b0c8dda) will decrease coverage by 1.12%.
The diff coverage is 94.44%.

@@            Coverage Diff             @@
##             main    #1910      +/-   ##
==========================================
- Coverage   83.98%   82.86%   -1.12%     
==========================================
  Files          46       46              
  Lines        8190     8200      +10     
  Branches     2174     2138      -36     
==========================================
- Hits         6878     6795      -83     
- Misses        840      903      +63     
- Partials      472      502      +30     
Files Coverage Δ
cwltool/cuda.py 92.45% <94.44%> (+6.40%) ⬆️

... and 12 files with indirect coverage changes

@jfennick jfennick force-pushed the nvidia_smi_xml_fix branch 2 times, most recently from 67683da to 7945f7d Compare September 19, 2023 19:14
@jfennick jfennick changed the title Replace malformed nvidia-smi XML Filter out malformed nvidia-smi process_name XML tag Sep 19, 2023
@mr-c mr-c requested a review from tetron September 19, 2023 21:55
@jfennick jfennick force-pushed the nvidia_smi_xml_fix branch 7 times, most recently from 8c77a20 to 081818d Compare September 20, 2023 02:35
@jfennick
Copy link
Contributor Author

FYI, as I discuss in the block comment in the code, to fix the immediate problem the only change necessary is
nvidia-smi -q -x | grep -v process_name
If that's more likely to get merged sooner, then we can do that for now.

But in an attempt to prevent future problems, I have instead removed all tags except the tags that are actually used. Let me know which version you prefer.

@mr-c mr-c force-pushed the nvidia_smi_xml_fix branch 2 times, most recently from 7bb1989 to ebaeb74 Compare October 2, 2023 09:55
@mr-c mr-c enabled auto-merge (rebase) October 2, 2023 09:56
auto-merge was automatically disabled October 3, 2023 19:19

Head branch was pushed to by a user without write access

Copy link
Member

@mr-c mr-c left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2023-10-03T19:27:36.5351674Z cwltool/context.py:46: initilizer ==> initializer
2023-10-03T19:27:36.5352997Z tests/test_toolargparse.py:183: desription ==> description
2023-10-03T19:27:36.5354037Z tests/test_toolargparse.py:192: desription ==> description
2023-10-03T19:27:36.5354867Z tests/wf/schemadef-bug-1473.cwl:452: cylces ==> cycles
2023-10-03T19:27:36.5441419Z Probable typo foun. Run "make codespell-fix" to accept suggested fixes, or add the word to the ignore list in setup.cfg

@@ -8,23 +8,48 @@
from .utils import CWLObjectType


def cuda_device_count() -> str:
Copy link
Member

@mr-c mr-c Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def cuda_device_count() -> str:
def cuda_device_count() -> int:

and update the return as well

try:
out = subprocess.check_output(["nvidia-smi", "-q", "-x"]) # nosec
out = subprocess.check_output(cmd, shell=True) # nosec
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer that instead of using a subshell it captures the output and does the substring search in Python. For a simple substring search this really is as simple as something like:

out = subprocess.check_output(["nvidia-smi", "-q", "-x"])
if "cuda_version" not in out:
   raise ...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually instead of using minidom, to pull a single tag it would be easier to just use a regular expression to look for <cuda_version>.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants