Background
Following the acceptance of #27, developers can now use URL patterns to declare which Page Objects apply to which URLs (reference code).
Problem
For large code bases, there might be hundreds of Page Objects, which in turn could result in hundreds of OverrideRule instances created using the @handle_urls decorator.
This could be unwieldy, especially when they're spread out across multiple subpackages and submodules within a Page Object project. A project could also use Page Objects from external packages, leading to even deeper roots.
Moreover, overlapping rules (e.g. POs improving on older POs) could add another layer of complexity. It should be immediately clear which PO would be executed according to URL pattern and priority.
Idea
There should be some sort of collection of utility functions that could interact with the List[OverrideRule] from the registry. Suppose that we have:
```python
from web_poet import rule_match

# Explore which OverrideRules match a given URL.
rule_match.find(rules, url="https://example.com/product/electronics?id=123")
# Returns: [OverrideRule_1, OverrideRule_2, OverrideRule_3, OverrideRule_4]

# It could also narrow down the search.
rule_match.find(rules, url="https://example.com/product/electronics?id=123", overridden=ProductPage)
# Returns: [OverrideRule_2, OverrideRule_4]

# Finding the rules for a given set of criteria could result in multiple OverrideRules.
# This could be POs improving on older POs which could also improve on other POs.
# However, what we would ultimately want is the final rule that has the highest priority.
rule_match.final(rules, url="https://example.com/product/electronics?id=123", overridden=ProductPage)
# Returns: OverrideRule_2
```
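To make the proposed behaviour concrete, here is a minimal, self-contained sketch of how `find()` and `final()` might work. It is only an illustration under stated assumptions: `Rule` is a simplified stand-in for `OverrideRule` with hypothetical field names, matching uses stdlib glob patterns against host plus path rather than web-poet's real URL matching, and the default priority of 500 is assumed.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Rule:
    """Simplified stand-in for OverrideRule (field names are assumptions)."""
    name: str
    pattern: str      # glob matched against "host/path", e.g. "example.com/product/*"
    use: type         # the Page Object that would be executed
    instead_of: type  # the Page Object being overridden
    priority: int = 500  # assumed default


def find(rules: List[Rule], url: str, overridden: Optional[type] = None) -> List[Rule]:
    """Return every rule whose pattern matches the URL, in input order."""
    parts = urlsplit(url)
    target = parts.netloc + parts.path  # the query string is ignored in this sketch
    return [
        rule for rule in rules
        if fnmatchcase(target, rule.pattern)
        and (overridden is None or rule.instead_of is overridden)
    ]


def final(rules: List[Rule], url: str, overridden: Optional[type] = None) -> Optional[Rule]:
    """Return the matching rule with the highest priority, or None if nothing matches."""
    return max(find(rules, url, overridden), key=lambda rule: rule.priority, default=None)


# Hypothetical Page Objects and rules mirroring the example above.
class ProductPage: ...
class ListingPage: ...
class BetterProductPage(ProductPage): ...
class BestProductPage(ProductPage): ...
class BetterListingPage(ListingPage): ...


rules = [
    Rule("OverrideRule_1", "example.com/*", BetterListingPage, ListingPage),
    Rule("OverrideRule_2", "example.com/product/*", BestProductPage, ProductPage, priority=600),
    Rule("OverrideRule_3", "example.com/product/*", BetterListingPage, ListingPage),
    Rule("OverrideRule_4", "example.com/*", BetterProductPage, ProductPage),
]
url = "https://example.com/product/electronics?id=123"

all_matches = find(rules, url)
product_matches = find(rules, url, overridden=ProductPage)
winner = final(rules, url, overridden=ProductPage)
```

Because both helpers are plain functions over a list, they would work the same whether the rules come from a registry or from anywhere else.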
This could help in creating test suites for projects that use other Page Object projects.
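As a rough idea of what such a test could look like, the sketch below pins down which Page Object is expected to win for a handful of URLs once rules from an external package and the project itself are combined. Everything here is hypothetical: the `Rule` tuple, the `winning_page_object()` helper, the rule lists, and the Page Object names are stand-ins, and the glob matching is a simplification of real URL pattern matching.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional, Sequence
from urllib.parse import urlsplit


class Rule(NamedTuple):
    """Simplified stand-in for OverrideRule; the field names are assumptions."""
    pattern: str     # glob matched against "host/path"
    use: str         # name of the Page Object that would run
    instead_of: str  # name of the Page Object being overridden
    priority: int


def winning_page_object(rules: Sequence[Rule], url: str, overridden: str) -> Optional[str]:
    """Name of the Page Object selected by the highest-priority matching rule."""
    parts = urlsplit(url)
    target = parts.netloc + parts.path
    matches = [
        rule for rule in rules
        if rule.instead_of == overridden and fnmatchcase(target, rule.pattern)
    ]
    best = max(matches, key=lambda rule: rule.priority, default=None)
    return best.use if best is not None else None


# Rules shipped by a hypothetical external package, plus this project's own.
EXTERNAL_RULES = [Rule("example.com/*", "ExternalProductPage", "ProductPage", 500)]
PROJECT_RULES = [Rule("example.com/product/*", "OurProductPage", "ProductPage", 600)]

# (URL, Page Object we expect to be executed in place of ProductPage)
EXPECTATIONS = [
    ("https://example.com/product/electronics?id=123", "OurProductPage"),
    ("https://example.com/about", "ExternalProductPage"),
    ("https://unrelated.example.org/product/1", None),
]


def test_expected_page_objects():
    """Fail loudly if a rule change alters which Page Object wins for a URL."""
    combined = EXTERNAL_RULES + PROJECT_RULES
    for url, expected in EXPECTATIONS:
        assert winning_page_object(combined, url, "ProductPage") == expected, url
```

A table of expectations like this would flag an upstream package adding or reprioritising a rule that silently changes which Page Object handles a URL.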
Other Notes:
rule_match.find() is quite similar to how the PageObjectRegistry.search_overrides() method behaves (reference).
Refactoring it into a function (instead of a method) could cover use cases where the List[OverrideRule] is not created by the default_registry (or some custom registry). For example, the rules could come from a manually maintained configuration file that contains the full List[OverrideRule].
In any case, the rule_match.find() explored above takes an actual URL rather than a Pattern (which is what PageObjectRegistry.search_overrides() works with).
I think that's a good idea, but it would probably make sense to wait until a real-world use case pops up. Then we can think about how to help solve it.
For the stated issue, I wonder if an opt-in setting in scrapy-poet that enables logging a debug message indicating which page object is used for any given URL and requested output, and why, could do the trick.
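A minimal sketch of what such a debug message could contain, using only Python's standard `logging` module. The `explain_choice()` helper, the logger name, and the `enabled` flag are hypothetical; the flag merely stands in for whatever opt-in setting scrapy-poet might expose, and the candidates are assumed to have already been matched against the URL.

```python
import logging
from typing import List, Optional, Tuple

logger = logging.getLogger("page_object_resolution")  # hypothetical logger name

# (page object name, URL pattern that matched, priority)
Candidate = Tuple[str, str, int]


def explain_choice(url: str, requested: str, candidates: List[Candidate],
                   enabled: bool = False) -> Optional[str]:
    """Pick the highest-priority candidate and, if enabled, log why it was chosen."""
    chosen = max(candidates, key=lambda candidate: candidate[2], default=None)
    if enabled:
        if chosen is None:
            logger.debug("%s: no rule matched for %s", url, requested)
        else:
            name, pattern, priority = chosen
            logger.debug(
                "%s: using %s for %s (pattern %r, priority %d, %d candidate(s))",
                url, name, requested, pattern, priority, len(candidates),
            )
    return chosen[0] if chosen is not None else None


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    explain_choice(
        "https://example.com/product/electronics?id=123",
        "ProductPage",
        [("BetterProductPage", "example.com/*", 500),
         ("BestProductPage", "example.com/product/*", 600)],
        enabled=True,
    )
```

Logging the matched pattern and priority alongside the chosen Page Object is what would answer the "and why" part of the suggestion.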