
Optimize levenshtein_distance algorithm in peft_lora_seq2seq_accelerate_ds_zero3_offload.py #1527

Merged
BenjaminBossan merged 6 commits into huggingface:main from SUNGOD3:patch-1 on Mar 7, 2024

Conversation

SUNGOD3 (Contributor) commented Mar 4, 2024

This commit refines the levenshtein_distance algorithm implemented in peft_lora_seq2seq_accelerate_ds_zero3_offload.py to improve its space complexity from O(n^2) to O(n). Additionally, thorough testing has been conducted to ensure the correctness and reliability of the revised implementation.
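
For illustration, here is a minimal sketch of the rolling-row (state-compressed) formulation this PR describes; the function name and exact signature are assumptions for the sketch, not necessarily a verbatim copy of the script:

```python
# Minimal sketch of O(n)-space Levenshtein distance via a single rolling row.
# The name and signature are illustrative assumptions, not the script's code.
def levenshtein_distance(str1: str, str2: str) -> int:
    m, n = len(str1), len(str2)
    # At the start of iteration i, prev[j] holds the distance between
    # str1[:i-1] and str2[:j]; only one row of the DP table is kept in memory.
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        # diag carries dp[i-1][j-1] before prev[j] is overwritten.
        diag, prev[0] = prev[0], i
        for j in range(1, n + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            diag, prev[j] = prev[j], min(
                prev[j] + 1,      # deletion
                prev[j - 1] + 1,  # insertion
                diag + cost,      # substitution (or match when cost == 0)
            )
    return prev[n]
```

Only a single row survives between iterations, so memory grows with len(str2) rather than len(str1) * len(str2), while the time complexity stays O(m * n).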

BenjaminBossan (Member) commented

Hi, thanks for providing a more efficient implementation. From what I can tell, this should have very marginal effects on this script's runtime, if any, given how little time is spent on this vs training the net. Do you think it's really worth it to make this update?

SUNGOD3 (Contributor, Author) commented Mar 4, 2024

Hi BenjaminBossan,
Thanks for your feedback. As you mentioned, the improvement to the levenshtein_distance algorithm has minimal impact on this script's overall runtime; in fact, I only stumbled upon this file while tracing a bug. However, in my experience, state compression in dynamic programming (DP) rarely has side effects: compared to the original code, the compressed version performs at least as well in both time and space in nearly all cases, and it hardly compromises readability. So, why not go for it?
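
For context, here is a sketch of the classic full-table formulation that state compression replaces; this is an assumed reconstruction in the same spirit as the original code, not a verbatim copy of the script:

```python
# Sketch of the classic O(m*n)-space Levenshtein DP, for contrast with the
# compressed version above; an assumed reconstruction, not the script's code.
def levenshtein_distance_full_table(str1: str, str2: str) -> int:
    m, n = len(str1), len(str2)
    # dp[i][j] = edit distance between str1[:i] and str2[:j].
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all i characters of str1
    for j in range(n + 1):
        dp[0][j] = j  # insert all j characters of str2
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if str1[i - 1] == str2[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,         # deletion
                dp[i][j - 1] + 1,         # insertion
                dp[i - 1][j - 1] + cost,  # substitution / match
            )
    return dp[m][n]
```

Each row of dp reads only the row immediately above it, which is exactly why keeping a single rolling row is enough.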

BenjaminBossan (Member) commented

In general, I agree with your argument. From a maintainer's point of view, I'm just a bit reluctant because:

  1. Is there an edge case where this returns different results, leading to a user later opening an issue asking us why their results changed after updating PEFT?
  2. What was the intent when this function was originally added? Is it maybe a 1:1 copy from somewhere else that is purposefully written like this?

Those are not big issues, but then again the advantage is also minimal, which is why I'm hesitating.

SUNGOD3 (Contributor, Author) commented Mar 5, 2024

I understand; your hesitation is entirely reasonable. However, perhaps the following analysis might change your perspective?

Firstly, regarding the first point, the probability of an edge case producing a different result is exceedingly low: this is a straightforward implementation of a classic problem. Furthermore, with the input and output types unchanged, this update not only passed the 1146 test cases on LeetCode but also passed my own scripted tests (covering special characters, long strings, empty strings, and special arrangements).

[screenshot: passing test results]
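
For concreteness, here is a sketch of the kind of scripted differential testing described above; the test cases are hypothetical, and the two functions are assumed to be the sketches shown earlier in this thread:

```python
# Differential test sketch: both implementations must always agree.
# Assumes levenshtein_distance (rolling row) and levenshtein_distance_full_table
# (full matrix) as sketched earlier in this thread; cases are hypothetical.
import random
import string

def check(a: str, b: str) -> None:
    assert levenshtein_distance(a, b) == levenshtein_distance_full_table(a, b)

# Hand-picked edge cases: empty strings, special characters, repeats.
for a, b in [("", ""), ("", "abc"), ("abc", ""), ("a!@#", "a!#"),
             ("aaaa", "aaab"), ("kitten", "sitting")]:
    check(a, b)

# Randomized cases, including long strings with punctuation.
alphabet = string.ascii_letters + string.punctuation
for _ in range(1000):
    a = "".join(random.choices(alphabet, k=random.randint(0, 200)))
    b = "".join(random.choices(alphabet, k=random.randint(0, 200)))
    check(a, b)
```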

As for the second point, I've gained some understanding of the context in which this function is used: it serves as a simple classifier that predicts which class the model's output belongs to. The code was likely written by the original author, since the only similar implementation I could find is in the related example (examples/causal_language_modeling/peft_lora_clm_accelerate_ds_zero3_offload.py).
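
To make that usage concrete, a minimal sketch of the nearest-label pattern: pick the class whose name has the smallest edit distance to the model's raw text output. The labels below are illustrative, not taken from the script:

```python
# Sketch of edit-distance-based label matching; labels are illustrative.
# Assumes the levenshtein_distance function sketched earlier in this thread.
def classify(prediction: str, classes: list[str]) -> str:
    return min(classes, key=lambda c: levenshtein_distance(prediction, c))

print(classify("complain", ["complaint", "no complaint"]))  # -> "complaint"
```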

Additionally, it's worth mentioning that the impact of this optimization on runtime may not be negligible. It appears to matter little at the moment only because the test uses just a few classes (only 3) with very short names (under 10 characters). As the number and length of class names grow, this improvement could easily exceed 1 second of overall speedup, and it scales linearly with the number of classes and the output length. Moreover, as the tested model gets smaller or the number of training epochs decreases, its overall impact will become more apparent.

https://colab.research.google.com/drive/1JUznpk2OAGCu5ZVAddLtKdPKyGKMp2ed?usp=sharing
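
A sketch of the sort of micro-benchmark behind such a measurement; the class names, sizes, and counts below are hypothetical, not the linked notebook's code:

```python
# Micro-benchmark sketch: timing both implementations over many (output,
# class-name) pairs. Sizes and names are hypothetical; assumes the two
# implementations sketched earlier in this thread.
import timeit

classes = [f"class_name_{i:03d}" for i in range(50)]        # many, longer labels
outputs = ["class_name_0123 with some extra noise"] * 1000  # simulated outputs

def run(dist_fn):
    for out in outputs:
        for c in classes:
            dist_fn(out, c)

print("full table :", timeit.timeit(lambda: run(levenshtein_distance_full_table), number=1))
print("rolling row:", timeit.timeit(lambda: run(levenshtein_distance), number=1))
```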

HuggingFaceDocBuilderDev commented

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

BenjaminBossan (Member) commented

Given the extensive tests you ran, I think we can be fairly certain that my first concern is not valid, so I'm fine with going ahead. Could you please run make style on your code?

SUNGOD3 (Contributor, Author) commented Mar 6, 2024

It seems that there should be no problem after the last commit? (The unit test has also been updated.)
[screenshot, 2024-03-06 9:21 PM]

BenjaminBossan (Member) commented

Strange that it passes for you; CI shows these errors:

examples/causal_language_modeling/peft_lora_clm_accelerate_ds_zero3_offload.py:6:17: F401 [*] `numpy` imported but unused
examples/conditional_generation/peft_lora_seq2seq_accelerate_ds_zero3_offload.py:6:17: F401 [*] `numpy` imported but unused

SUNGOD3 (Contributor, Author) commented Mar 6, 2024

Maybe make quality is something I should run too?
BTW, I seem to get no errors after running make style repeatedly. Maybe this is a bug?

BenjaminBossan (Member) left a comment

Not sure what the issue with the style check could be; maybe different ruff versions. Anyway, this passes now and LGTM. Thanks for providing this small performance improvement and patiently explaining your solution.

BenjaminBossan merged commit 7e84dec into huggingface:main on Mar 7, 2024
14 checks passed
SUNGOD3 deleted the patch-1 branch on March 7, 2024 at 10:59
BenjaminBossan pushed a commit to BenjaminBossan/peft that referenced this pull request on Mar 14, 2024, carrying the same levenshtein_distance optimization and also updating peft_lora_clm_accelerate_ds_zero3_offload.py.