Fix typo in tutorial_data_conversion.rst #69
Current coverage is 98.58% (diff: 100%)

@@             master    #69   diff @@
==========================================
  Files          61     61
  Lines        5601   5601
  Methods         0      0
  Messages        0      0
  Branches        0      0
==========================================
  Hits         5522   5522
  Misses         79     79
  Partials        0      0
Sure, will do. Stay tuned.
Does the section "What would happen if I save a multi sheet book into “csv” file" help clarify save_book_as? If it is not clear, could you elaborate a bit? I can write more.
Thanks. That helps. I'm still a bit confused, though. For my little project, I'm hoping to convert specific sheets from .xlsx/.xlsm files to CSV, so I can put the data under version control. If I can't select a specific source sheet that way, would I use:

    import pyexcel
    sheet = pyexcel.get_sheet(file_name="example.xlsx", sheet_name="Sheet2")
    sheet.save_as("example-sheet2.csv")

Or, if I want to save all sheets in an Excel file to individual CSV files:

    import pyexcel
    book = pyexcel.get_book(file_name="example.xlsx")
    sheets = book.to_dict()
    for key, item in sheets.items():
        sheet = { key: sheets[key] }
        sheet.save_as(key + ".csv")

Does that look reasonable?
The last line, "keywords – additional keywords can be found at pyexcel.get_book()", would take you to get_book()'s parameters :). For the 1st part of your code, you can try:
For the 2nd part of your code:
or:
Alternatively, you could use pyexcel-cli, which is not well documented so far, but you can use its help message to find your way around. For the 1st part of your code, you could do this:
For the 2nd part of your code, you could do the following instead:
For help with pyexcel-cli, you can do
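For the Python route (as opposed to pyexcel-cli), the two tasks discussed in this thread, exporting one named sheet and exporting every sheet to its own CSV, might be sketched as follows. This is only an illustration and not the snippet originally posted in the reply: the helper names and the `<workbook>-<sheet>.csv` naming scheme are assumptions, and it relies on pyexcel plus an xlsx reader plugin such as pyexcel-xlsx being installed.

```python
from pathlib import Path


def csv_name_for(workbook_path, sheet_name):
    """Build an output name such as 'example-Sheet2.csv'.

    The naming scheme is an assumption made for this sketch.
    """
    return f"{Path(workbook_path).stem}-{sheet_name}.csv"


def export_one_sheet(workbook_path, sheet_name):
    """Save a single named sheet of a workbook as CSV and return the file name."""
    # Imported here so that csv_name_for can be used without pyexcel installed.
    import pyexcel

    sheet = pyexcel.get_sheet(file_name=workbook_path, sheet_name=sheet_name)
    out_name = csv_name_for(workbook_path, sheet_name)
    sheet.save_as(out_name)
    return out_name


def export_all_sheets(workbook_path):
    """Save every sheet of a workbook to its own CSV file and return the names."""
    import pyexcel

    book = pyexcel.get_book(file_name=workbook_path)
    written = []
    for sheet in book:  # iterating a Book yields Sheet objects
        out_name = csv_name_for(workbook_path, sheet.name)
        sheet.save_as(out_name)
        written.append(out_name)
    return written
```

Note that the loop works on the Sheet objects of the book rather than on book.to_dict(): the dictionary form holds plain Python data, and a plain dict has no save_as method.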
Awesome, thanks so much. I'll try out those suggestions.
…ok_as. will find out later how to manage the same doc string in multiple places.
Sorry to turn this into a support ticket, but one more question: do you know if pyexcel offers a faster Excel-to-CSV conversion than
Is that kind of performance to be expected? Doing the conversion in Excel 2010 only takes a few seconds. (i7-5600U, 16 GB RAM, Python 3.4.4 64-bit)
To be honest, performance has not been evaluated systematically for the readers. So far, both pyexcel-xls and pyexcel-xlsx can read xlsx files. The latter may be faster, as it uses read-only mode. You can try switching libraries and see: pip uninstall one and pip install the other.
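To compare the readers after swapping plugins, a small timing harness can put a number on each run. The sketch below is only one way to do that: `time_call` and `convert_with_pyexcel` are hypothetical helper names, and the file names in the usage comment are placeholders.

```python
import time


def time_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


def convert_with_pyexcel(src, dest):
    """Convert the first sheet of src to dest using pyexcel.save_as."""
    # Imported here so that time_call can be used without pyexcel installed.
    import pyexcel

    pyexcel.save_as(file_name=src, dest_file_name=dest)


# Example usage (placeholder file names):
#   _, seconds = time_call(convert_with_pyexcel, "example.xlsx", "example.csv")
#   print(f"conversion took {seconds:.1f} s")
```

Running the same call once with pyexcel-xls installed and once with pyexcel-xlsx installed gives a rough like-for-like comparison on the same workbook.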
I haven't had time to take another look, but I was already using "pyexcel-xlsx", I think (I installed "pyexcel" and "pyexcel-xlsx" with pip, but only imported "pyexcel" in my script). Thanks a lot for the links to those other two libraries. I hadn't seen those before. Especially the description of xlsx2csv ("it is fast, and works for huge xlsx files") looks promising.
I am sorry to hear that. pyexcel-xlsx uses openpyxl in read-only mode, which is the best the 3rd party library can do. If you do not mind, could you please update me on the performance of xlsx2csv?
@rmzelle, could you please try pyexcel-xlsxr? I hope it matches the performance of xlsx2csv.
Are you dealing with some public dataset? I may do the performance benchmarking myself. |
No, sorry, it's a private database. |
By the way, could you add some example code showing exactly how to open a multi-sheet Excel workbook and save its sheets to CSV files? It's not really clear to me what the right incantation is for pyexcel.save_book_as. I'm currently using pyexcel.save_as(file_name=inputfilepath, dest_file_name=outputfilepath), but that only saves the first sheet.
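As a rough illustration of the save_book_as route: pyexcel.save_book_as takes the same file_name and dest_file_name keywords as save_as, and when the destination is a CSV file a multi-sheet book is expected to be split into one file per sheet. The `name__sheet__index.csv` naming assumed in the glob pattern below follows the documentation section mentioned earlier in this thread and should be checked against the installed version; `split_csv_pattern` and `workbook_to_csv_files` are hypothetical helper names.

```python
import glob
import os


def split_csv_pattern(dest_file_name):
    """Glob pattern for the per-sheet files pyexcel is assumed to write.

    Assumes the '<base>__<sheet name>__<index><ext>' naming scheme.
    """
    base, ext = os.path.splitext(dest_file_name)
    return f"{glob.escape(base)}__*__*{ext}"


def workbook_to_csv_files(input_path, output_path):
    """Save all sheets of a workbook as CSV and return the files found."""
    # Imported here so that split_csv_pattern works without pyexcel installed.
    import pyexcel

    pyexcel.save_book_as(file_name=input_path, dest_file_name=output_path)
    written = sorted(glob.glob(split_csv_pattern(output_path)))
    # A single-sheet workbook may be written to output_path itself.
    if os.path.exists(output_path):
        written.append(output_path)
    return written
```

Calling workbook_to_csv_files("example.xlsx", "example.csv") would then return whichever per-sheet CSV files were produced, which is a quick way to confirm the naming on a given setup.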