Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide urllib.parse.ParseResult to plugins with URL triggers #2413

Closed
SnoopJ opened this issue Feb 19, 2023 · 5 comments
Closed

Provide urllib.parse.ParseResult to plugins with URL triggers #2413

SnoopJ opened this issue Feb 19, 2023 · 5 comments
Labels
Declined Requests that will not be implemented for technical or project direction reasons Feature

Comments

@SnoopJ
Copy link
Contributor

SnoopJ commented Feb 19, 2023

Requested Feature

Sopel should take better advantage of urllib.parse when responding to URLs and passing them off to plugins for handling. See discussion in #2412.

What I have in mind is something along the lines of the following, and hopefully this is in the direction that @Exirel and @dgw were thinking.

Suppose the Trigger provided to the handler had new behavior available if the handler call was caused by a URL, Trigger.parse_url() -> urllib.parse.ParseResult. In this hypothetical, the code in the wikipedia plugin that caused us all to start thinking about this might look something like:

@plugin.url(r'https?:\/\/([a-z]+(?:\.m)?\.wikipedia\.org)\/wiki\/((?!File\:)[^ #]+)#?([^ ]*)')
@plugin.output_prefix(PLUGIN_OUTPUT_PREFIX)
def mw_info(bot, trigger):
    """Retrieves and outputs a snippet of the linked page."""                                                                                                         
    url = trigger.parse_url()
    article = url.path.lstrip("/wiki")
    section = url.fragment

    do_the_rest(article, section)

But this is something I put together in less than an hour and definitely needs the eyes of the experts on it. In particular, I'm not sure how I feel about the method call, and don't know what's missing from this interface for messages that might contain multiple URLs or other things I'm not accounting for. I opted for a method here because it defers the work until the user actually wants the parsed result, but it's entirely possible that this needs to be implemented in some other manner.

Problems Solved

https://xkcd.com/1313/

Alternatives

No response

Notes

No response

@SnoopJ SnoopJ added Feature Needs Triage Issues that need to be reviewed and categorized labels Feb 19, 2023
@dgw
Copy link
Member

dgw commented Feb 19, 2023

You can implement it as a property, still deferring the work until the handler code asks for it (and caching the result once the property is accessed once, if that seems useful enough) but without making it a method call from plugin authors' perspective.

don't know what's missing from this interface for messages that might contain multiple URLs

Nothing is missing: Each URL is a separate trigger for @url callables, like each occurrence of the pattern in a line for @find rules.

Thing is, @url was intended to have some special behavior before. @url decorated callables can have a third parameter called match which once upon a time contained only data about the URL match (without the rest of the line). That's been deprecated, and it's now functionally the same as trigger.

I don't know that the extra API complexity of this idea (introducing a new nullable property on Trigger, or adding a different optional third argument only for URL callables) is worth it, when it's very easy for plugin authors to just take the matched URL and urllib.parse.urlparse() it themselves with one additional line of code vs. your example (one import; the assignment would just use the stdlib function instead of a Trigger property). Would be interested in @Exirel's take, though.

@Exirel
Copy link
Contributor

Exirel commented Feb 19, 2023

Before, trigger was for the whole line, but didn't match the regex from the @url decorator, and there was a match that was using the regex. Now, trigger uses the regex from the @url decorator, while having the whole message.

I don't really understand why would the trigger expose a ParseUrl object, because Sopel doesn't know what the plugin wants to do with the URLs it catches. In the case of Wikipedia, that was very useful. When I implemented the lichess plugin, there was no need for any urlparse. I simply constructed my regex pattern with (named) groups and used them directly.

From my perspective, it's up to the plugin author to take advantage of urlparse when they need to do that explicitly.

@Exirel
Copy link
Contributor

Exirel commented Feb 19, 2023

Note: we could, however, have a better tutorial explaining that and showing the way.

@SnoopJ
Copy link
Contributor Author

SnoopJ commented Feb 19, 2023

Giving it some more thought in light of two dissents, it's not such a big deal for the end-user to import urlparse() if they need it, especially since the proposed API requires an explicit opt-in anyway, and many plugins will not care about having this kind of structural information. Thanks for the sanity-check, closing.

@SnoopJ SnoopJ closed this as not planned Won't fix, can't repro, duplicate, stale Feb 19, 2023
@Exirel
Copy link
Contributor

Exirel commented Feb 19, 2023

I'll keep this issue as an invitation to do better tutorials, so it's a positive outcome from my perspective. ❤️

@Exirel Exirel added Declined Requests that will not be implemented for technical or project direction reasons and removed Needs Triage Issues that need to be reviewed and categorized labels Feb 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Declined Requests that will not be implemented for technical or project direction reasons Feature
Projects
None yet
Development

No branches or pull requests

3 participants