Support M1 GPU in FARMReader #2826

Closed
mathislucka opened this issue Jul 15, 2022 · 18 comments · Fixed by #3062

@mathislucka
Member

Is your feature request related to a problem? Please describe.
Since haystack v1.6 we have support for pytorch 1.12, which also means support for the M1 GPU. However, we currently initialize the device to be either cpu or cuda, depending on availability and on whether the user passes in the use_gpu=True parameter. For GPU use on the M1, pytorch actually uses the mps backend. See: https://pytorch.org/docs/stable/notes/mps.html

If we allowed users to pass the actual device into the FARMReader, this could make GPU training and inference on the M1 possible.
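
For reference, plain PyTorch selects the Apple Silicon GPU roughly like this (a minimal sketch based on the MPS notes linked above; the fallback logic here is only illustrative, not what haystack does today):

import torch

# Prefer the Apple Silicon GPU (mps backend) when available, otherwise fall back to cpu.
if torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Tensors and models then need to be moved to that device explicitly.
x = torch.ones(3, device=device)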

Describe the solution you'd like
Allow the user to pass devices=[<device>] into FARMReader.__init__ and use these devices in initialize_device_settings. We could make this non-breaking by making devices an optional argument to the reader init and the device initialization.
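
A sketch of how the proposed API could be used; the import path follows the haystack v1.x convention, and the exact parameter handling is an assumption here:

import torch
from haystack.nodes import FARMReader  # import path as of haystack v1.x

# Proposed: pass the device explicitly instead of relying on use_gpu.
reader = FARMReader(
    model_name_or_path="deepset/roberta-base-squad2",
    devices=[torch.device("mps")],  # Apple Silicon GPU via the mps backend
)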

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Additional context
Add any other context or screenshots about the feature request here.

@mathislucka
Member Author

It is actually already there :D

@mathislucka
Member Author

Reopening this, as the device is not used for the inferencer. See:

@mathislucka mathislucka reopened this Jul 15, 2022
@mathislucka
Member Author

Additionally, transformers does not currently support pytorch 1.12 (see huggingface/transformers#17971 (comment)). When the code in the inferencer is changed to pass on the mps device, an error is raised during prediction:

Inferencing Samples:   0%|          | 0/1 [00:00<?, ? Batches/s]
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/haystack/modeling/infer.py", line 520, in _get_predictions_and_aggregate
    logits = self.model.forward(**batch)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/haystack/modeling/model/adaptive_model.py", line 477, in forward
    output_tuple = self.language_model.forward(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/haystack/modeling/model/language_model.py", line 700, in forward
    output_tuple = self.model(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 841, in forward
    embedding_output = self.embeddings(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 105, in forward
    position_ids = create_position_ids_from_input_ids(input_ids, self.padding_idx, past_key_values_length)
  File "/opt/homebrew/Caskroom/miniforge/base/envs/fanal/lib/python3.9/site-packages/transformers/models/roberta/modeling_roberta.py", line 1574, in create_position_ids_from_input_ids
    incremental_indices = (torch.cumsum(mask, dim=1).type_as(mask) + past_key_values_length) * mask
NotImplementedError: The operator 'aten::cumsum.out' is not current implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
python-BaseException
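
As the error message suggests, a temporary workaround is to enable the CPU fallback for ops that are not yet implemented on MPS. The variable has to be set before torch is first imported, e.g.:

import os

# Temporary workaround from the error above: fall back to the CPU for ops that are
# not yet implemented on MPS. This must be set before torch is first imported.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

import torch  # noqa: E402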

@mathislucka
Member Author

Also see this for the current state of covered ops for the mps backend:

pytorch/pytorch#77764

@yli223

yli223 commented Jul 19, 2022

Hey,

Thanks for sharing this information! I am new to haystack and am wondering how to enable the GPU on a Mac Pro M1. I already have PyTorch set up with torch.backends.mps.is_available() = True, but I still don't know how to activate it. Can you provide a bit more information?

Best

@sjrl
Contributor

sjrl commented Jul 22, 2022

Hey, @yli223 we do not currently support the M1 GPU. We would need to implement the changes explained by @mathislucka above in Haystack. In addition, we also need to wait for HuggingFace transformers to support PyTorch 1.12, which is required for the M1 GPU to work (more info here: huggingface/transformers#17925).

@vblagoje
Member

vblagoje commented Aug 18, 2022

Update: the HF PR has been merged to main. Therefore, we can use this feature as soon as we support the HF v4.21.2 release (as soon as it gets released). Do we need to add the devices optional parameter anywhere else except infer.py, @mathislucka @sjrl?

@sjrl
Contributor

sjrl commented Aug 18, 2022

That's great! I would say that anywhere the user passes an option to initialize_device_settings, they should also have the option of passing a list of devices instead, similar to what is already done in this load function for the Inferencer:

if devices is None:
    devices, n_gpu = initialize_device_settings(use_cuda=gpu, multi_gpu=False)

where devices is of type

devices: Optional[List[torch.device]] = None,

So what is inconsistent at the moment is that the devices option is only supported in some places in Haystack. I think we should support it everywhere the user can pass in the use_gpu boolean.
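
A rough sketch of what that could look like in a component constructor (hypothetical component for illustration only; the initialize_device_settings signature is taken from the snippet above, and the import path is an assumption):

from typing import List, Optional

import torch
from haystack.modeling.utils import initialize_device_settings  # import path assumed


class SomeComponent:  # hypothetical component, for illustration only
    def __init__(self, use_gpu: bool = True, devices: Optional[List[torch.device]] = None):
        if devices is None:
            # Keep the existing deterministic device selection as the default.
            devices, n_gpu = initialize_device_settings(use_cuda=use_gpu, multi_gpu=False)
        self.devices = devices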

@vblagoje
Member

@sjrl, so what you are saying is that every function where we currently pass use_gpu, including the component constructors, should have devices as an optional argument. And second, we should make sure that the deterministic approach to device selection defined in initialize_device_settings is used in every case where we pass the devices parameter. Correct?

@sjrl
Contributor

sjrl commented Aug 18, 2022

so what you are saying is that every function where we currently pass use_gpu, including the component constructors, should have devices as an optional argument.

Yes I think this makes sense to help standardize how devices are specified in Haystack.

And second, we should make sure that the deterministic approach to device selection defined in initialize_device_settings is used in every case where we pass the devices parameter. Correct?

I'm not entirely sure what you mean here. Do you mean we should always use this statement everywhere we have added the devices optional parameter?

 if devices is None: 
     devices, n_gpu = initialize_device_settings(use_cuda=gpu, multi_gpu=False) 

@vblagoje
Member

Yes, it seems to already be used everywhere, but we should make sure that it does get used, in addition to making sure we provide the devices parameter.

@sjrl
Contributor

sjrl commented Aug 18, 2022

Yes, it seems to already be used everywhere, but we should make sure that it does get used, in addition to making sure we provide the devices parameter.

Yes I agree.

@vblagoje
Member

Update: although HF has recently added support for devices in pipelines, the main blocker for Haystack deployment on Apple Silicon M1/M2 remains the MPS implementation of the torch cumsum operator, which is used extensively in all HF models.

@vblagoje
Member

vblagoje commented Nov 28, 2022

However, seq2seq generative models still don't work (whenever GenerationMixin is used). The error is:

NotImplementedError: The operator 'aten::remainder.Tensor_out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

So now we have to wait for pytorch/pytorch#86806

@laike9m

laike9m commented Oct 10, 2023

Hi @vblagoje, the blocking issue has been fixed. May I ask what the current status of M1 GPU support is? At least the documentation doesn't mention Apple Silicon support, so I suppose it's still not supported:
https://docs.haystack.deepset.ai/docs/enabling-gpu-acceleration

@vblagoje
Member

@laike9m haven't tried it in a while, tbh. Having looked at pytorch/pytorch#86806, it seems like it should work now. Please try it out and let us know. If not, I'll get to this task next week or so.

@laike9m

laike9m commented Oct 10, 2023

Thanks. I can give it a try; where can I find the instructions to enable it? (Sorry, I'm pretty new to haystack.)

@lvdinergy

Still getting the error:
NotImplementedError: The operator 'aten::remainder.Tensor_out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on pytorch/pytorch#77764. As a temporary fix, you can set the environment variable PYTORCH_ENABLE_MPS_FALLBACK=1 to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.

Running macOS Sonoma 14.2.1 (23C71)

I have PyTorch 2.1.2
