Implement indexing of simple tables in Word files #1651

Merged
merged 3 commits into from
Sep 8, 2024

artmatsak
Contributor

Right now, tables in Word files are skipped entirely during indexing. This PR unwraps simple tables (no nested tables, no omitted cells) row by row so that their contents are included in the indexed text. It makes two assumptions:

  1. The first row of the table is the header row.
  2. Each column holds one attribute, and each row represents a separate entity.

Each unwrapped table row may look as follows:

No.: 2
Issue: The CD doesn’t play
Comments: The CD doesn’t start playback upon insertion into the drive. Furthermore, the drive LED doesn’t turn on.
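
The row-by-row unwrapping described above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function names `is_simple_table` and `unwrap_table` are hypothetical, and the table is assumed to have already been read from the Word file into a list of rows, each a list of cell strings. Detecting nested tables is not shown, since it depends on how the document is parsed.

```python
from typing import List, Optional


def is_simple_table(rows: List[List[str]]) -> bool:
    """Hypothetical check: a header row plus at least one data row,
    with every row having the same number of cells (no omitted cells)."""
    if len(rows) < 2:
        return False
    width = len(rows[0])
    return width > 0 and all(len(row) == width for row in rows)


def unwrap_table(rows: List[List[str]]) -> Optional[str]:
    """Turn each data row into "Header: value" lines.

    Returns None for tables that do not pass the simple-table check,
    so the caller can decide to skip them.
    """
    if not is_simple_table(rows):
        return None
    # Assumption 1: the first row is the header row.
    header = [cell.strip() for cell in rows[0]]
    blocks = []
    # Assumption 2: each remaining row is one separate entity.
    for row in rows[1:]:
        lines = [f"{name}: {cell.strip()}" for name, cell in zip(header, row)]
        blocks.append("\n".join(lines))
    # Separate entities with a blank line.
    return "\n\n".join(blocks)


table = [
    ["No.", "Issue", "Comments"],
    ["2", "The CD doesn’t play", "The CD doesn’t start playback upon insertion into the drive."],
]
print(unwrap_table(table))
```

Returning `None` for anything that is not a simple table keeps the sketch in line with the PR's scope: tables outside that scope are left out rather than unwrapped incorrectly.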

vercel bot commented Jun 17, 2024

@artmatsak is attempting to deploy a commit to the Danswer Team on Vercel.

A member of the Team first needs to authorize it.

@yuhongsun96 yuhongsun96 merged commit 51a13f5 into onyx-dot-app:main Sep 8, 2024
2 of 6 checks passed
rajivml pushed a commit to UiPath/danswer that referenced this pull request Oct 2, 2024

2 participants