Optimize levenshtein_distance algorithm in peft_lora_seq2seq_accelera… #1527
…te_ds_zero3_offload.py
This commit refines the levenshtein_distance algorithm implemented in peft_lora_seq2seq_accelerate_ds_zero3_offload.py to improve its space complexity from O(n^2) to O(n). Additionally, thorough testing has been conducted to ensure the correctness and reliability of the revised implementation.
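For readers unfamiliar with the space reduction described above, here is a minimal sketch of a two-row Levenshtein distance. It is not the code from this PR: the function name matches the one mentioned in the description, but the parameter names and the exact structure are assumptions made for illustration.

```python
def levenshtein_distance(str1: str, str2: str) -> int:
    """Edit distance computed with two rolling rows instead of a full matrix.

    Illustrative sketch only; the PR's actual implementation may differ.
    """
    if str1 == str2:
        return 0
    # Keep the shorter string along the row so each row has
    # min(len(str1), len(str2)) + 1 entries.
    if len(str1) < len(str2):
        str1, str2 = str2, str1

    # Row 0: distance from the empty prefix of str1 to each prefix of str2.
    previous = list(range(len(str2) + 1))
    for i, ch1 in enumerate(str1, start=1):
        # First column: distance from str1[:i] to the empty string.
        current = [i]
        for j, ch2 in enumerate(str2, start=1):
            cost = 0 if ch1 == ch2 else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch1
                    current[j - 1] + 1,      # insert ch2
                    previous[j - 1] + cost,  # substitute, or match at no cost
                )
            )
        previous = current
    return previous[-1]
```

Only the previous and current rows are kept, so memory grows with the length of the shorter string while the running time remains proportional to the product of the two lengths. For example, `levenshtein_distance("kitten", "sitting")` returns 3.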
Hi, thanks for providing a more efficient implementation. From what I can tell, this should have very marginal effects on this script's runtime, if any, given how little time is spent on this vs training the net. Do you think it's really worth it to make this update?
Hi BenjaminBossan,
In general, I agree with your argument. From a maintainer's point of view, I'm just a bit reluctant because:
Those are not big issues, but then again the advantage is also minimal, which is why I'm hesitating.
I understand; your hesitation is entirely reasonable. However, perhaps the following analysis might change your perspective?
Firstly, regarding the first point, the probability of an edge case producing a different result is exceedingly low, since the function solves a classic problem. Furthermore, with the input and output types unchanged, this update passed not only the 1146 test cases on LeetCode but also my own scripted tests (covering special characters, long strings, empty strings, and special arrangements).
As for the second point, I've gained some understanding of the context in which this function is used. It serves as a classification step that predicts which class the model's output belongs to. The source code might have been written by the original author themselves, as I could only find a similar implementation in the relevant examples (examples/causal_language_modeling/peft_lora_clm_accelerate_ds_zero3_offload.py).
Additionally, the impact of this optimization on runtime might not be negligible. It appears minimal at the moment only because testing used just 3 classes whose names were very short (fewer than 10 characters). In practice, as the number and length of classes increase, the overall speedup could easily exceed 1 second, and it scales linearly with the number of classes and the output length. Moreover, when the tested model becomes smaller or the number of training epochs decreases, the overall impact will gradually become apparent. https://colab.research.google.com/drive/1JUznpk2OAGCu5ZVAddLtKdPKyGKMp2ed?usp=sharing
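The classification use described in the comment above can be sketched roughly as follows: the generated text is compared against every class name and the nearest one wins. The helper names `edit_distance` and `closest_label` and the example labels are hypothetical and are not taken from the script.

```python
from typing import Sequence


def edit_distance(a: str, b: str) -> int:
    """Two-row Levenshtein distance (hypothetical helper for this sketch)."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def closest_label(prediction: str, labels: Sequence[str]) -> str:
    """Return the label with the smallest edit distance to the prediction.

    Ties resolve to the earliest label, because min() keeps the first minimum.
    """
    return min(labels, key=lambda label: edit_distance(prediction, label))


# Hypothetical class names, chosen only for illustration.
labels = ["positive", "negative", "neutral"]
print(closest_label("negitive", labels))
```

Each prediction triggers one distance computation per class, which is why the cost grows linearly with the number of classes, as the comment argues.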
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Given the extensive tests you ran, I think we can be fairly certain that my first concern is not valid, so I'm fine with going ahead. Could you please run
To pass make style.
Strange that it passes for you, CI shows this error:
Maybe make quality is something I should run too?
Not sure what the issue with the style check could be, maybe different ruff versions. Anyway, this passes now and LGTM. Thanks for providing this small performance improvement and for patiently explaining your solution.
This commit refines the levenshtein_distance algorithm implemented in peft_lora_seq2seq_accelerate_ds_zero3_offload.py to improve its space complexity from O(n^2) to O(n). Additionally, thorough testing has been conducted to ensure the correctness and reliability of the revised implementation. Also update peft_lora_clm_accelerate_ds_zero3_offload.py