Restrict transformers version until MPS issue is addressed #1039

Merged

Conversation

jmartin-tech
Collaborator

As of transformers 4.47.0, the `device` specified for some detectors that use Hugging Face models is not applied in some macOS contexts.

Specifically, when `cpu` is specified, `mps` code paths are still activated and raise exceptions because `mps` was not configured or initialized.
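
For anyone who wants to check whether a local environment falls in the affected range, here is a minimal sketch. It assumes that 4.47.0 is the first affected release (as described above) and that no upper bound is known yet. The helper names `parse_version`, `is_affected`, and `installed_transformers_affected` are hypothetical and are not part of garak or transformers.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Assumption: 4.47.0 is the first release showing the MPS behavior described above.
FIRST_AFFECTED = (4, 47, 0)


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse the leading 'major.minor[.patch]' of a version string into a tuple."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def is_affected(version: str) -> bool:
    """Return True when the version is at or above the first affected release."""
    return parse_version(version) >= FIRST_AFFECTED


def installed_transformers_affected() -> Optional[bool]:
    """Check the installed transformers package; None means it is not installed."""
    try:
        return is_affected(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("installed transformers affected:", installed_transformers_affected())
```

This only compares release numbers, so pre-release suffixes such as `.dev0` are ignored. The actual constraint applied by this PR lives in the project's dependency metadata and may be expressed differently.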

Verification

List the steps needed to make sure this thing works

  • Automation test complete

Example failure seen with transformers == 4.47.0:

% python -m pytest tests/test_attempt.py
============================================================================================= test session starts =============================================================================================
platform darwin -- Python 3.12.4, pytest-8.3.4, pluggy-1.5.0
rootdir: /Users/vagrant/Projects/nvidia/garak
configfile: pyproject.toml
plugins: cov-6.0.0, respx-0.21.1, pytest_httpserver-1.1.0, anyio-4.7.0, mock-3.14.0, requests-mock-1.12.1
collected 12 items

tests/test_attempt.py F...........                                                                                                                                                                      [100%]

================================================================================================== FAILURES ===================================================================================================
_________________________________________________________________________________________ test_attempt_sticky_params __________________________________________________________________________________________

capsys = <_pytest.capture.CaptureFixture object at 0x1030c6ba0>

    def test_attempt_sticky_params(capsys):

        cli.main(
            f"-m test.Blank -g 1 -p atkgen,dan.Dan_6_0 --report_prefix {PREFIX}".split()
        )
        report_path = _config.transient.data_dir / _config.reporting.report_dir
        reportlines = (
            open(report_path / f"{PREFIX}.report.jsonl", "r", encoding="utf-8")
            .read()
            .split("\n")
        )
        # Note: the line numbers below are based on respecting the `-g 1` options passed
>       complete_atkgen = json.loads(reportlines[3])  # status 2 for the one atkgen attempt
E       IndexError: list index out of range

tests/test_attempt.py:27: IndexError
-------------------------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------------------------
garak LLM vulnerability scanner v0.10.0.post1 ( https://github.com/NVIDIA/garak ) at 2024-12-09T07:33:48.160729
📜 logging to /Users/vagrant/.local/share/garak/garak.log
🦜 loading generator: Test: Blank
📜 reporting to /Users/vagrant/.local/share/garak/garak_runs/_garak_test_attempt_sticky_params.report.jsonl
🕵️  queue of probes: atkgen.Tox, dan.Dan_6_0
 detector load failed: toxicity.ToxicCommentModel, skipping >>
No detectors, nothing to do
---------------------------------------------------------------------------------------------- Captured log call ----------------------------------------------------------------------------------------------
WARNING  root:_plugins.py:419 Exception instantiating garak.detectors.toxicity.ToxicCommentModel: MPS backend out of memory (MPS allocated: 1.01 GB, other allocations: 16.00 KB, max allowed: 4.53 GB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
Traceback (most recent call last):
  File "/Users/vagrant/Projects/nvidia/garak/garak/_plugins.py", line 416, in load_plugin
    plugin_instance = klass(config_root=config_root)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/garak/detectors/base.py", line 122, in __init__
    self.detector = TextClassificationPipeline(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/transformers/pipelines/text_classification.py", line 85, in __init__
    super().__init__(**kwargs)
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/transformers/pipelines/base.py", line 926, in __init__
    self.model.to(self.device)
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/transformers/modeling_utils.py", line 3164, in to
    return super().to(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1340, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 900, in _apply
    module._apply(fn)
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 927, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "/Users/vagrant/Projects/nvidia/garak/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1326, in convert
    return t.to(
           ^^^^^
RuntimeError: MPS backend out of memory (MPS allocated: 1.01 GB, other allocations: 16.00 KB, max allowed: 4.53 GB). Tried to allocate 256 bytes on shared pool. Use PYTORCH_MPS_HIGH_WATERMARK_RATIO=0.0 to disable upper limit for memory allocations (may cause system failure).
ERROR    root:probewise.py:27  detector load failed: toxicity.ToxicCommentModel, skipping >>
WARNING  root:base.py:92 No detectors, nothing to do
ERROR    root:cli.py:620 No detectors, nothing to do
Traceback (most recent call last):
  File "/Users/vagrant/Projects/nvidia/garak/garak/cli.py", line 594, in main
    command.probewise_run(
  File "/Users/vagrant/Projects/nvidia/garak/garak/command.py", line 237, in probewise_run
    probewise_h.run(generator, probe_names, evaluator, buffs)
  File "/Users/vagrant/Projects/nvidia/garak/garak/harnesses/probewise.py", line 107, in run
    h.run(model, [probe], detectors, evaluator, announce_probe=False)
  File "/Users/vagrant/Projects/nvidia/garak/garak/harnesses/base.py", line 95, in run
    raise ValueError(msg)
ValueError: No detectors, nothing to do
=========================================================================================== short test summary info ===========================================================================================
FAILED tests/test_attempt.py::test_attempt_sticky_params - IndexError: list index out of range
======================================================================================== 1 failed, 11 passed in 2.74s =========================================================================================
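
Note that the reported `IndexError` is only a symptom: the run wrote fewer report lines than the test indexes into because the detector failed to load. A minimal sketch of a check that would surface this more directly is below. The helpers `load_report_entries` and `require_entries` are hypothetical and are not part of the garak test suite.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_report_entries(report_file: Path) -> List[Dict[str, Any]]:
    """Parse a JSONL report file into a list of dicts, skipping blank lines."""
    with open(report_file, "r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def require_entries(entries: List[Dict[str, Any]], minimum: int) -> List[Dict[str, Any]]:
    """Fail with a descriptive message when the report is shorter than expected."""
    if len(entries) < minimum:
        raise AssertionError(
            f"expected at least {minimum} report entries, found {len(entries)}; "
            "check the captured log for plugin load failures"
        )
    return entries
```

With a check like this, a short report fails with a message that points at the captured log instead of an out-of-range index.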

As of transformers 4.47.0, the `device` specified for some detectors
that use Hugging Face models is not applied in some macOS contexts.

Specifically, when `cpu` is specified, `mps` code paths are still activated
and raise exceptions because `mps` was not configured or initialized.

Signed-off-by: Jeffrey Martin <jemartin@nvidia.com>
@leondz
Collaborator

leondz commented Dec 9, 2024

Lgtm, thanks

@jmartin-tech jmartin-tech merged commit 0b837c1 into NVIDIA:main Dec 9, 2024
9 checks passed
@jmartin-tech jmartin-tech deleted the fix/restrict-transformers-version branch December 9, 2024 21:03
@github-actions github-actions bot locked and limited conversation to collaborators Dec 9, 2024