Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IndexError occurs when markdown chat message that gets streamed out has incomplete markdown URLs. #6520

Closed
1 task
ericmjl opened this issue Mar 18, 2024 · 4 comments · Fixed by #6535
Closed
1 task
Labels
type: bug Something isn't correct or isn't working
Milestone

Comments

@ericmjl
Copy link

ericmjl commented Mar 18, 2024

ALL software version info

(this library, plus any other relevant software, e.g. bokeh, python, notebook, OS, browser, etc)

Libraries:

panel                     1.3.8              pyhd8ed1ab_0    conda-forge
python                    3.11.8          hdf0ec26_0_cpython    conda-forge
mdurl                     0.1.0              pyhd8ed1ab_0    conda-forge

Description of expected behavior and the observed behavior

I am encountering an issue when trying to render content inside a holoviz chat message. Specifically, I'm streaming out markdown-renderable text, and the error occurs when the text includes a colon character. The expected behavior is that the chat message renders the markdown text without errors, including links formatted in markdown.

Complete, minimal, self-contained example code that reproduces the issue

import panel as pn

string = "hey, this is a [URL](https://www.google.com). Can you see it?"

async def generate_text(msg, user, instance):
    m = ""
    for s in string:
        m += s
        yield m

chat = pn.chat.ChatInterface(callback=generate_text, callback_exception="verbose")
chat.servable()

Stack traceback and/or browser JavaScript console output

The error message is as follows:

# error message goes here between backticks
Traceback (most recent call last):
  File "/Users/ericmjl/anaconda/envs/llamabot/lib/python3.11/site-packages/panel/chat/feed.py", line 526, in _prepare_response
    await asyncio.gather(
... (truncated for brevity) ...
  File "/Users/ericmjl/anaconda/envs/llamabot/lib/python3.11/site-packages/mdurl/_parse.py", line 204, in parse
    if rest[host_end - 1] == ":":
IndexError: string index out of range

Screenshots or screencasts of the bug in action

image image
  • I may be interested in making a pull request to address this
@ahuang11
Copy link
Contributor

ahuang11 commented Mar 18, 2024

Can reproduce; I think the right behavior would be silencing these errors from markup if possible?

  File "/Users/ahuang/repos/panel/panel/pane/markup.py", line 451, in _process_param_change
    return super()._process_param_change(params)
  File "/Users/ahuang/repos/panel/panel/pane/base.py", line 520, in _process_param_change
    params.update(self._transform_object(params.pop('object')))
  File "/Users/ahuang/repos/panel/panel/pane/markup.py", line 445, in _transform_object
    ).render(obj)

Or maybe submitting an issue upstream to mdurl?


  File "/Users/ahuang/miniconda3/envs/panel/lib/python3.10/site-packages/mdurl/_parse.py", line 204, in parse
    if rest[host_end - 1] == ":":

@ericmjl
Copy link
Author

ericmjl commented Mar 18, 2024

@ahuang11 thank you for commenting! I think there's possibly a good reason for the behavior in mdurl. Using ChatGPT helped me to accelerate my understanding of the code:

The line `if rest[host_end - 1] == ":"` in the provided source code plays a specific role in parsing and handling URLs. Let's break it down to understand its purpose:
  1. Context of the Code: This line is within the function parse of the MutableURL class, which is tasked with parsing a URL string into its constituent components, like protocol, hostname, port, path, query string, etc.

  2. Variable rest: The rest variable contains the remaining part of the URL string that is yet to be processed. Initially, rest is the entire URL, but as the code progresses, it's updated to reflect the unprocessed part.

  3. Variable host_end: The host_end variable marks the end of the host portion within the rest string. It's determined by searching for characters that typically do not appear in hostnames, such as "/", "?", ";", "#".

  4. The Specific Line: The line if rest[host_end - 1] == ":" is checking if the character immediately before the host_end position in rest is a colon (":"). This is significant for two main reasons:

    • Port Separator: In a URL, a colon immediately before the end of the hostname often signifies the start of the port number. For example, in "http://example.com:80/path", the colon after "example.com" separates the hostname from the port number "80".
    • Edge Case Handling: Checking the character before host_end helps handle edge cases where the URL might end with a colon without specifying a port, or where the hostname itself might end with a colon for some reason.
  5. Purpose of the Check: If this condition is true, it likely means that the URL includes a port number, or at least a colon indicating where a port number would be. The subsequent code is expected to handle extracting the port number (if any) and updating the hostname and port attributes of the MutableURL object accordingly.

  6. Overall Role in URL Parsing: This line is a part of the logic that ensures the accurate parsing of the hostname and port from a given URL, which is crucial for correctly interpreting and utilizing URLs in network communications and web applications.

By meticulously examining each part of the URL and correctly interpreting characters like ":", the parser can effectively distinguish between different components of the URL, ensuring that applications using this code can correctly access and manipulate URL data.

If we were to silence the errors within Panel, should there be a try/except block? Just trying to explore what the solution space might look like first, though.

@ahuang11
Copy link
Contributor

I'm not exactly sure how to catch this error specifically since it's just a generic IndexError.

I'm leaning more towards raising this issue upstream to either markdownit or mdurl?

Would like others' feedback though.

@philippjfr
Copy link
Member

Hmm, I agree with raising an issue upstream but I don't think there's any condition where I want an actual error to be raised to the user when markdown parsing fails. I'd say we do want to catch the error on our side and turn it into a warning, and then perhaps fall back to an alternate markdown parser or simply render the output as empty?!

@philippjfr philippjfr added this to the v1.4.0 milestone Mar 20, 2024
@philippjfr philippjfr added the type: bug Something isn't correct or isn't working label Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something isn't correct or isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants