
Optimize levenshtein_distance algorithm in peft_lora_seq2seq_accelerate_ds_zero3_offload.py #1527

Merged
BenjaminBossan merged 6 commits into huggingface:main from SUNGOD3:patch-1 on Mar 7, 2024

Conversation

SUNGOD3 (Contributor) commented Mar 4, 2024

This commit refines the levenshtein_distance algorithm implemented in peft_lora_seq2seq_accelerate_ds_zero3_offload.py to improve its space complexity from O(n^2) to O(n). Additionally, thorough testing has been conducted to ensure the correctness and reliability of the revised implementation.
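
For illustration, here is a minimal sketch of the rolling-row (state-compressed) formulation this PR describes; the function name and exact signature are assumptions for the sketch, not necessarily a verbatim copy of the script:

```python
# Minimal sketch of O(n)-space Levenshtein distance via a single rolling row.
# The name and signature are illustrative assumptions, not the script's code.
def levenshtein_distance(str1: str, str2: str) -> int:
    m, n = len(str1), len(str2)
    # At the start of iteration i, prev[j] holds the distance between
    # str1[:i-1] and str2[:j]; only one row of the DP table is kept in memory.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        # diag carries dp[i-1][j-1] before prev[j] is overwritten.
        diag, prev[0] = prev[0], i
        for j in range(1, n + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            diag, prev[j] = prev[j], min(
                prev[j] + 1,      # deletion
                prev[j - 1] + 1,  # insertion
                diag + cost,      # substitution (or match when cost == 0)
            )
    return prev[n]
```

Only a single row survives between iterations, so memory grows with len(str2) rather than len(str1) * len(str2), while the time complexity stays O(m * n).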

BenjaminBossan (Member) commented

Hi, thanks for providing a more efficient implementation. From what I can tell, this should have very marginal effects on this script's runtime, if any, given how little time is spent on this vs training the net. Do you think it's really worth it to make this update?

SUNGOD3 (Contributor, Author) commented Mar 4, 2024

Hi BenjaminBossan,
Thanks for your feedback. As you mentioned, the improvement to the levenshtein_distance algorithm has minimal impact on this script's overall runtime; in fact, I only stumbled upon this file while tracing a bug. However, in my experience, state compression in dynamic programming (DP) rarely has side effects: compared to the original code, the compressed version performs at least as well in both time and space in nearly all cases, and it hardly compromises readability. So, why not go for it?
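
For context, here is a sketch of the classic full-table formulation that state compression replaces; this is an assumed reconstruction in the same spirit as the original code, not a verbatim copy of the script:

```python
# Sketch of the classic O(m*n)-space Levenshtein DP, for contrast with the
# compressed version above; an assumed reconstruction, not the script's code.
def levenshtein_distance_full_table(str1: str, str2: str) -> int:
    m, n = len(str1), len(str2)
    # dp[i][j] = edit distance between str1[:i] and str2[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters of str1
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters of str2
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution / match
            )
    return dp[m][n]
```

Each row of dp reads only the row immediately above it, which is exactly why keeping a single rolling row is enough.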

BenjaminBossan (Member) commented

In general, I agree with your argument. From a maintainer's point of view, I'm just a bit reluctant because:

  1. Is there an edge case where this returns different results, leading to a user later opening an issue asking us why their results changed after updating PEFT?
  2. What was the intent when this function was originally added? Is it maybe a 1:1 copy from somewhere else that is purposefully written like this?

Those are not big issues, but then again the advantage is also minimal, which is why I'm hesitating.

SUNGOD3 (Contributor, Author) commented Mar 5, 2024

I understand; your hesitation is entirely reasonable. However, perhaps the following analysis might change your perspective?

Firstly, regarding the first point, the probability of an edge case producing a different result is exceedingly low: this is a straightforward implementation of a classic problem. Furthermore, with the input and output types unchanged, this update not only passed the 1146 test cases on LeetCode but also passed my own scripted tests (covering special characters, long strings, empty strings, and special arrangements).

[screenshot: passing test results]
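
For concreteness, here is a sketch of the kind of scripted differential testing described above; the test cases are hypothetical, and the two functions are assumed to be the sketches shown earlier in this thread:

```python
# Differential test sketch: both implementations must always agree.
# Assumes levenshtein_distance (rolling row) and levenshtein_distance_full_table
# (full matrix) as sketched earlier in this thread; cases are hypothetical.
import random
import string

def check(a: str, b: str) -> None:
    assert levenshtein_distance(a, b) == levenshtein_distance_full_table(a, b)

# Hand-picked edge cases: empty strings, special characters, repeats.
for a, b in [("", ""), ("", "abc"), ("abc", ""), ("a!@#", "a!#"),
             ("aaaa", "aaab"), ("kitten", "sitting")]:
    check(a, b)

# Randomized cases, including long strings with punctuation.
alphabet = string.ascii_letters + string.punctuation
for _ in range(1000):
    a = "".join(random.choices(alphabet, k=random.randint(0, 200)))
    b = "".join(random.choices(alphabet, k=random.randint(0, 200)))
    check(a, b)
```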

As for the second point, I've gained some understanding of the context in which this function is used: it serves as a simple classifier that predicts which class the model's output belongs to. The code was likely written by the original author, since the only similar implementation I could find is in the related example (examples/causal_language_modeling/peft_lora_clm_accelerate_ds_zero3_offload.py).
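
To make that usage concrete, a minimal sketch of the nearest-label pattern: pick the class whose name has the smallest edit distance to the model's raw text output. The labels below are illustrative, not taken from the script:

```python
# Sketch of edit-distance-based label matching; labels are illustrative.
# Assumes the levenshtein_distance function sketched earlier in this thread.
def classify(prediction: str, classes: list[str]) -> str:
    return min(classes, key=lambda c: levenshtein_distance(prediction, c))

print(classify("complain", ["complaint", "no complaint"]))  # -> "complaint"
```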

Additionally, it's worth mentioning that the impact of this optimization on runtime may not be negligible. It appears to matter little at the moment only because the test uses just a few classes (only 3) with very short names (under 10 characters). As the number and length of class names grow, this improvement could easily exceed 1 second of overall speedup, and it scales linearly with the number of classes and the output length. Moreover, as the tested model gets smaller or the number of training epochs decreases, its overall impact will become more apparent.

https://colab.research.google.com/drive/1JUznpk2OAGCu5ZVAddLtKdPKyGKMp2ed?usp=sharing
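
A sketch of the sort of micro-benchmark behind such a measurement; the class names, sizes, and counts below are hypothetical, not the linked notebook's code:

```python
# Micro-benchmark sketch: timing both implementations over many (output,
# class-name) pairs. Sizes and names are hypothetical; assumes the two
# implementations sketched earlier in this thread.
import timeit

classes = [f"class_name_{i:03d}" for i in range(50)]        # many, longer labels
outputs = ["class_name_0123 with some extra noise"] * 1000  # simulated outputs

def run(dist_fn):
    for out in outputs:
        for c in classes:
            dist_fn(out, c)

print("full table :", timeit.timeit(lambda: run(levenshtein_distance_full_table), number=1))
print("rolling row:", timeit.timeit(lambda: run(levenshtein_distance), number=1))
```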

HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BenjaminBossan (Member) commented

Given the extensive tests you ran, I think we can be fairly certain that my first concern is not valid, so I'm fine with going ahead. Could you please run make style on your code?

SUNGOD3 (Contributor, Author) commented Mar 6, 2024

It seems that there should be no problem after the last commit? (The unit test has also been updated.)
[screenshot, 2024-03-06 9:21 PM]

BenjaminBossan (Member) commented

Strange that it passes for you; CI shows these errors:

examples/causal_language_modeling/peft_lora_clm_accelerate_ds_zero3_offload.py:6:17: F401 [*] `numpy` imported but unused
examples/conditional_generation/peft_lora_seq2seq_accelerate_ds_zero3_offload.py:6:17: F401 [*] `numpy` imported but unused

SUNGOD3 (Contributor, Author) commented Mar 6, 2024

Maybe make quality is something I should run too?
BTW, I seem to get no errors after running make style repeatedly. Maybe this is a bug?

BenjaminBossan (Member) left a comment

Not sure what the issue with the style check could be; maybe different ruff versions. Anyway, this passes now and LGTM. Thanks for providing this small performance improvement and patiently explaining your solution.

BenjaminBossan merged commit 7e84dec into huggingface:main on Mar 7, 2024
14 checks passed
SUNGOD3 deleted the patch-1 branch on March 7, 2024 at 10:59
BenjaminBossan pushed a commit to BenjaminBossan/peft that referenced this pull request on Mar 14, 2024, carrying the same levenshtein_distance optimization and also updating peft_lora_clm_accelerate_ds_zero3_offload.py.