[ENH] - Limit displayed document pills in header #224

peachkeel · 2023-12-02T02:07:09Z

Feature description

When there are a lot of documents underlying a chat, the chat header becomes unwieldy. It might be better to display the total number of documents underlying the chat in the header than all of their filenames.

Value and/or benefit

Removing the chat_document_pills greatly improves client-side performance when dealing with large corpora.

Anything else?

I've had to comment out the following lines of code to make my system performant:

ragna/ragna/_ui/central_view.py

Lines 596 to 623 in 3d8541f

    
    if (
        self.current_chat is not None
        and "metadata" in self.current_chat
        and "documents" in self.current_chat["metadata"]
    ):
        doc_names = [d["name"] for d in self.current_chat["metadata"]["documents"]]
        for doc_name in doc_names:
            pill = pn.pane.HTML(
                f"""<div class="chat_document_pill">{doc_name}</div>""",
                stylesheets=[
                    """
                    :host {
                        background-color: rgb(241,241,241);
                        margin-top: 15px;
                        margin-left: 5px;
                        margin-right: 5px;
                        padding: 5px 15px;
                        border-radius: 10px;
                        color:var(--accent-color);
                    }
                    """
                ],
            )
            chat_documents_pills.append(pill)


pmeier · 2023-12-04T12:50:55Z

> Remove chat_document_pills from header

That is not going to happen. Having the documents used for the chat visible by default is intended. Instead, what we should do in this case is truncate the list of visible documents. The full list is still visible when clicking the chat info button.

> When there are a lot of documents underlying a chat, the chat header becomes unwieldy. It might be better to display the total number of documents underlying the chat in the header than all of their filenames.

This is specific to your use case of having the whole corpus of documents active at once. In that case, I would even say the number of documents shouldn't be displayed at all, as it provides no value to the user.

It becomes more and more clear that we need to support this use case in general. Will open an issue about this soon.

It seems that you are using .doc documents converted to .txt in your corpus. Would it help if Ragna supported .doc / .docx out of the box? Any other formats that are needed? We have a long list in #202 (reply in thread) although I'm against adding support for everything listed there.

peachkeel · 2023-12-04T15:18:13Z

Your take on the situation sounds reasonable. My colleague, @Tengal-Teemo, plans to post some of his insights into the performance of Ragna's UI in the discussion section sometime this week. Those insights might be helpful in making sure the UI stays responsive across a variety of conditions.

As far as data connectors are concerned, .doc support would be great in general. For our specific use-cases, though, most of the document preprocessing is already standardized and done. Thus, we're mainly using Ragna to help prototype and do discovery in the middle of these preexisting workflows. ~~Honestly, ingesting text in JSONL format would probably be a nice feature from our perspective:~~

{"text": "..."}
{"text": "..."}
{"text": "..."}

See: https://github.com/leogao2/lm_dataformat
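For anyone unfamiliar with the format, here is a minimal sketch of reading such a file. It assumes plain, uncompressed JSONL with one JSON object per line and a `text` key, as in the lines above. The function name `iter_jsonl_texts` is hypothetical and not part of Ragna or lm_dataformat.

```python
import json
from typing import Iterable, Iterator


def iter_jsonl_texts(lines: Iterable[str], key: str = "text") -> Iterator[str]:
    """Yield the value stored under `key` for every non-empty JSONL line."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            # Tolerate blank lines, e.g. a trailing newline at the end of the file.
            continue
        record = json.loads(line)
        if key not in record:
            raise ValueError(f"line {line_number} has no {key!r} field")
        yield record[key]
```

Since an open text file iterates line by line, passing the file object directly (for example `iter_jsonl_texts(open("corpus.jsonl", encoding="utf-8"))`, with a hypothetical path) streams the documents without loading the whole corpus into memory.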

pmeier · 2023-12-04T15:33:58Z

> post some of his insights into the performance of Ragna's UI in the discussion section sometime this week. Those insights might be helpful in making sure the UI stays responsive across a variety of conditions.

Thanks a ton 🚀

> As far as data connectors are concerned, .doc support would be great in general.

I've opened #225.

> ingesting text in JSONL format would probably be a nice feature from our perspective

Could you open an issue for that? I'm not familiar with the format.

pmeier · 2023-12-04T15:41:11Z

We also need some handling for the popup:

Here we shouldn't truncate, but rather provide a scrollable view.
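As a rough illustration of the scrollable idea, the sketch below builds an HTML list whose height is capped and which scrolls vertically on overflow. It is not Ragna's popup code: the function name `scrollable_document_list` and the 200px default are assumptions made for the example.

```python
from html import escape
from typing import Sequence


def scrollable_document_list(doc_names: Sequence[str], max_height_px: int = 200) -> str:
    """Return an HTML list of all document names inside a scrollable box."""
    # Escape the names so that unusual filenames cannot inject markup.
    items = "\n".join(f"  <li>{escape(name)}</li>" for name in doc_names)
    return (
        f'<ul style="max-height: {max_height_px}px; overflow-y: auto;">\n'
        f"{items}\n"
        "</ul>"
    )
```

Every document stays in the markup, so nothing is truncated. The fixed `max-height` together with `overflow-y: auto` keeps the popup at a bounded size.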

pmeier · 2023-12-06T23:14:27Z

I've added a hard limit of 20 visible documents in #235. This doesn't solve any of the graphics issues raised here, but at least prevents performance hits when one is using a large number of documents.
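To make the idea concrete, here is a minimal sketch of capping the pill labels. It is not the code from #235: the names `MAX_VISIBLE_DOCUMENTS` and `visible_pill_labels` are hypothetical, and the trailing "+N more" summary label is an assumption added for illustration.

```python
from typing import List, Sequence

# Mirrors the limit of 20 mentioned above; the constant name is hypothetical.
MAX_VISIBLE_DOCUMENTS = 20


def visible_pill_labels(
    doc_names: Sequence[str], limit: int = MAX_VISIBLE_DOCUMENTS
) -> List[str]:
    """Return at most `limit` document names plus one summary label for the rest."""
    if limit < 0:
        raise ValueError("limit must be non-negative")
    shown = list(doc_names[:limit])
    hidden = len(doc_names) - len(shown)
    if hidden > 0:
        # One extra label stands in for all documents that are not displayed.
        shown.append(f"+{hidden} more")
    return shown
```

With a cap like this, the number of pill widgets created per chat is bounded by `limit + 1` regardless of corpus size, which is what keeps the header cheap to render.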

peachkeel added the type: enhancement 💅 New feature or request label Dec 2, 2023

pmeier added the area: web-ui 💻 label Dec 4, 2023

pmeier changed the title ~~[ENH] - Remove chat_document_pills from header~~ [ENH] - Limit displayed document pills in header Dec 4, 2023

pmeier mentioned this issue Dec 4, 2023

[ENH] - Add support for .doc / .docx #225

Closed

pmeier mentioned this issue Dec 6, 2023

set hard limit of visible document pills #235

Merged
