optimize calls to number_of_columns() in SheetReader._iterate_columns() #25
Current coverage is 99.00% (diff: 100%)
@@             master    #25   diff @@
==========================================
  Files          31     31
  Lines        2517   2518     +1
  Methods         0      0
  Messages        0      0
  Branches        0      0
==========================================
+ Hits         2492   2493     +1
  Misses         25     25
  Partials        0      0
|
This can be costly for certain implementations e.g. pyexcel-xlsx (which internally uses openpyxl) where it is computed on-the-fly.
Your pull request is appreciated. However, I failed to see the logical difference, although the files were changed. At the moment, pyexcel-xlsx uses max_column to get the number of columns. Are you suggesting that a different function be used? To support columns computed on the fly, here is an example in ods where _iterate_columns was overridden. I think the interfaces in Sheet.py could be revised further to make them clearer. Do you have any suggestions? |
When In case |
As a comparison, in 0.2.x the commit advarisk/pyexcel-io@eb3bfbc87aee20ac72b39f3971650c4d34608914 feels very clear and logical (which is what I'm actually using in production). I don't think that the API with |
I see what you meant now. max_column is always computed, so your change will cache the value, which could improve performance. Let me do a bit of research on it. |
In case you're open to releasing |
I have created branch 0.2.x and will release it while I review all plugins and make the necessary changes in 0.3.0 and its plugins. |
Please evaluate the latest pyexcel-xlsx for the performance improvement. |
This can be costly for certain implementations e.g. pyexcel-xlsx (which internally uses openpyxl) where it is computed on-the-fly.
It isn't possible to make the fix in pyexcel-xlsx itself, because the number of columns can change when the sheet is modified via the API. One does have to make the (rather reasonable) assumption that the number of columns doesn't change during execution of to_array.
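To make the idea concrete, here is a minimal sketch of the optimization, not the actual pyexcel-io code. CountingSheet, to_array_naive and to_array_cached are hypothetical names invented for this illustration; only number_of_columns() and to_array are names taken from this thread. The sketch assumes, as stated above, that the column count does not change while iterating.

```python
class CountingSheet:
    """Hypothetical stand-in for a reader whose column count is computed on the fly."""

    def __init__(self, rows):
        self._rows = rows
        self.calls = 0  # how many times number_of_columns() has run

    def number_of_rows(self):
        return len(self._rows)

    def number_of_columns(self):
        self.calls += 1
        # Scans every row, mimicking an expensive on-the-fly computation.
        return max((len(r) for r in self._rows), default=0)

    def cell_value(self, row, column):
        r = self._rows[row]
        return r[column] if column < len(r) else ""


def to_array_naive(sheet):
    # Asks for the column count again for every row.
    for row in range(sheet.number_of_rows()):
        yield [sheet.cell_value(row, c)
               for c in range(sheet.number_of_columns())]


def to_array_cached(sheet):
    # Assumption: the column count does not change during iteration,
    # so it is computed once and reused for every row.
    columns = sheet.number_of_columns()
    for row in range(sheet.number_of_rows()):
        yield [sheet.cell_value(row, c) for c in range(columns)]
```

Both generators yield the same rows; the difference is that the naive version calls number_of_columns() once per row, while the cached version calls it once in total. The cached value is kept local to the iteration rather than stored on the sheet, so later modifications via the API are still picked up by the next call.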