I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
Please do not modify this template :) and fill in all the required fields.
Dify version
0.11.2
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
When an Excel document with multiple sheets is uploaded and extracted, only the first sheet is read; the other sheets are ignored.
✔️ Expected Behavior
When an Excel document is loaded and parsed, all of its sheets should be extracted.
❌ Actual Behavior
Currently, document extraction reads only the first sheet, so Excel files with multiple sheets are not handled.
To extract and read all sheets from an Excel document with multiple sheets, you can use the ExcelExtractor class. This class is designed to handle both .xlsx and .xls file formats and iterates over all sheets in the Excel file. Here's a brief overview of how it works:
For .xlsx files:
wb = load_workbook(self._file_path, data_only=True)
for sheet_name in wb.sheetnames:
    sheet = wb[sheet_name]
    data = sheet.values
    try:
        cols = next(data)
    except StopIteration:
        continue
    df = pd.DataFrame(data, columns=cols)
    df.dropna(how="all", inplace=True)
    # Process each row in the DataFrame
For .xls files:
excel_file = pd.ExcelFile(self._file_path, engine="xlrd")
for excel_sheet_name in excel_file.sheet_names:
    df = excel_file.parse(sheet_name=excel_sheet_name)
    df.dropna(how="all", inplace=True)
    # Process each row in the DataFrame
This code iterates over each sheet in the workbook, reads the data into a DataFrame, and processes each row, allowing you to extract data from all sheets in the Excel document [1].
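The per-sheet logic described above can be sketched without pandas or openpyxl. This is only an illustration, not the ExcelExtractor implementation: it assumes each sheet is an iterable of row tuples (the shape that openpyxl's `sheet.values` yields), and the names `rows_from_sheets`, `workbook`, and the `_sheet` key are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

Row = Tuple[Any, ...]


def rows_from_sheets(sheets: Dict[str, Iterable[Row]]) -> List[Dict[str, Any]]:
    """Collect non-empty rows from every sheet, tagged with the sheet name.

    `sheets` is a hypothetical stand-in for a workbook: it maps each sheet
    name to an iterable of row tuples.
    """
    records: List[Dict[str, Any]] = []
    for sheet_name, values in sheets.items():
        data = iter(values)
        try:
            cols = next(data)  # the first row is treated as the header
        except StopIteration:
            continue  # empty sheet: skip it and keep going with the rest
        for row in data:
            if all(cell is None for cell in row):
                continue  # mirrors df.dropna(how="all")
            record = dict(zip(cols, row))
            record["_sheet"] = sheet_name  # hypothetical key, for illustration
            records.append(record)
    return records


# Stand-in data: two populated sheets and one empty sheet.
workbook = {
    "Sales": [("item", "qty"), ("apple", 3), (None, None), ("pear", 5)],
    "Empty": [],
    "Stock": [("item", "qty"), ("plum", 7)],
}
records = rows_from_sheets(workbook)
```

The point of the sketch is that an empty sheet triggers `continue` rather than ending the loop, so later sheets such as `Stock` are still processed.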
crazywoola changed the title from "When entering an excel document and extracting the document, if there are multiple sheets in excel, other sheets cannot be read, only the first sheet can be read" to "When entering an excel document and extracting the document, if there are multiple sheets in excel, other sheets cannot be read" on Jan 21, 2025
This feature is already supported as of PR #11215.
This should be a feature request rather than a bug report. If you are submitting a bug report, please make sure you list the necessary steps to reproduce the error and the expected behavior, instead of typing the same text three times.