-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split out a LinkCollector class from PackageFinder #6910
Conversation
f9120d3
to
b9a24b2
Compare
b9a24b2
to
d2af421
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've not looked at the tests yet and haven't 100% reviewed this code. This PR is just slightly too big to review comfortably for me. Or... I'm just tired after writing 2 exams today. 🙃
Anyway, posting my review eagerly because shorter feedback loops are better.
src/pip/_internal/index.py
Outdated
# (1) links from file locations, | ||
# (2) links from find_links, and | ||
# (3) a dict mapping HTML page url to links from that page. | ||
CollectedLinks = Tuple[List[Link], List[Link], Dict[str, List[Link]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A good follow up might be to convert this to a TypedDict, so that this value has some hint in the key names, about what the different values are for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or a namedtuple, you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for namedtuple (and I prefer it over typeddict)
@pradyunsg Thanks! I can try breaking this up into two or three PR's so the number of lines will be smaller in each. Many of the lines are due to moving methods with no change in content, which gives the appearance of more happening than there really is. |
Okay, I created PR #6913 to do separately, which will make this PR a lot smaller afterwards. |
Move some PackageFinder methods (preparation for PR #6910)
Thanks for breaking this up! Even doing separated commits would've worked and this works well too. 🙃
Yea, it's difficult to "see" what's a functionally equivalent change vs what's a moved method. |
No problem!
I agree! I'm always more than happy to break commits and/or PR's up for you or others if it helps make it easier to review. So don't ever hesitate to ask.. |
Hello! I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the |
d2af421
to
263ce4b
Compare
adccd7c
to
96fd8c1
Compare
Okay, this PR is ready to review again. The main hold-up was that I wanted to add at least one non-trivial unit test of the 130 lines of this PR are for that unit test, and another 50 lines are for updating the I also addressed the previous review comments. |
Also, with this PR, |
e65a3e5
to
2464559
Compare
I just did some commit history hygiene so it's easier for people to see what's happening in the commit that adds the |
2464559
to
6e2ad4e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just 2 minor comments.
tests/unit/test_index.py
Outdated
given names, a link whose URL has a base name matching that name. | ||
""" | ||
for name in names: | ||
for link in links: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would do something like
assert any(
link.url.endswith(name) for link in links
), message_from_exception_below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn’t use assert here, Pytest can emit preplexing errors for asserts in functions. Personally I’d write a custom assertion, but Chris’s version would work well enough for me.
Maybe a middle of the road alternative would be satisfactory:
for name in names:
if not any(link.url.endswith(name) for link in links):
raise RuntimeError(...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pytest can emit perplexing errors for asserts in functions
Do you have an example? I've used asserts in helper functions, classes, and fixtures and haven't noticed any issues so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback, both of you. I've updated the PR. I didn't feel strongly either way here, so I just went with @chrahunt's original suggestion. I do agree with @uranusjr that pytest often includes extra info that looks confusing and isn't especially helpful, e.g. the following in this case:
E assert False
E + where False = any(<generator object check_links_include.<locals>.<genexpr> at 0x106edb518>)
but because we're including a custom error message, we can just ignore those other portions and just look at the custom message.
2. Constructs a ``CandidateEvaluator`` object and uses that to determine | ||
the best candidate. It does this by calling the ``CandidateEvaluator`` | ||
class's ``compute_best_candidate()`` method on the return value of | ||
``find_all_candidates()``. This corresponds to steps 4-5 of the Overview. | ||
|
||
|
||
.. _link-collector-class: | ||
|
||
The ``LinkCollector`` class |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this discussion was had before, but would it be better to keep these kinds of docs in the code itself? Then we could just include it here in any number of ways and I think it would be more likely to be kept up to date.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look! I don't really have a preference for how this type of documentation is done. The reason I added it in the first place is that there was a request from a number of people for me to document the various classes in a separate rst
file, so I did that. And I was just updating it here. (I will note that one advantage of doing it directly in the rst
is that it lets you use the various reStructuredText formatting, links, etc, whereas having that in a docstring would be a bit weird. I'm also not sure the extent to which it would carry over.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original discussion was here: #6787
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, it would look kind of strange to have to the rst in the docstring, given that we don't currently do that anywhere else.
6e2ad4e
to
ca4fc9e
Compare
This PR splits out from the
PackageFinder
class a newLinkCollector
class responsible solely for gathering all the links, but without doing any filtering of them. This reduces thePackageFinder
class down to ~360 lines and makes it so the rest ofPackageFinder
doesn't have a dependency onPipSession
or making networking requests (just filtering and sorting strings).The code in this PR should be logically equivalent to what was there before (so in particular no behavior change).
This PR makes progress e.g. on issue #6430 because the
LinkCollector
class's maincollect_links()
method provides the "output all possible links" functionality discussed in that issue. The PR also updates theindex.py
architecture section created in PR #6787 to reflect the new class.A follow-up PR can move
LinkCollector
and theindex.py
functions it depends on to a separate module.