Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle sites that use absolute links and sites that require the final slash in the URL #121

Merged
merged 5 commits into from
Sep 29, 2023

Conversation

jikamens
Copy link
Collaborator

@jikamens jikamens commented Sep 3, 2023

No description provided.

Some web sites will return 404 if you fetch a directory without the
final slash. For example, https://archive.mozilla.org/pub/ works,
https://archive.mozilla.org/pub does not. We need to do two things to
accommodate this:

* When processing the root URL of the filesystem, instead of stripping
  off the final slash, just set the offset to ignore it.
* In the link structure, store the actual URL tail of the link
  separately from its name, final slash and all if there is one, and
  append that instead of the name when constructing the URL for curl.
@jikamens
Copy link
Collaborator Author

jikamens commented Sep 3, 2023

lol I see somebody else already encountered this #115. I guess I should have checked for PRs first. Anyway I'll leave it to you to decide which of these PRs to prefer. ;-)

@jikamens
Copy link
Collaborator Author

jikamens commented Sep 3, 2023

I will say that I think my version is more robust and better documented.

On some sites, the link to each subfolder is an absolute link rather
than a relative one. To accommodate this, convert the links from
absolute to relative before storing them in the link table.
I believe an appropriate expectation is that if the user enables
debugging with a command-line flag, then that should also enable
messagse designated as debug messages in the code to be printed.
Some sites put unencoded characters in their href attributes that
really should be encoded, most notably spaces. Curl won't accept a URL
with a space in it, and perhaps other such characters as well. Address
this by properly encoding characters in URLs before feeding them to
Curl.
@jikamens
Copy link
Collaborator Author

jikamens commented Sep 3, 2023

OK, there's another bug fix for you in the last commit, plus a couple ancillary commits that aren't strictly speaking necessary for the bug fixes but I think are a good idea. 🤷

@jcharaoui
Copy link
Collaborator

I would suggest to open separate merge requests for the different proposed changes. This PR is getting a little out of hand.

@jikamens
Copy link
Collaborator Author

I am happy to do that if someone is actually going to look at the changes but considering that I opened the PR weeks ago and there has been no response I don't know if that is going to happen.
If your comment is intended to convey, "I have commit rights to this repository and if you open separate PRs for these changes then I will look at them," then I will do that.
If you are just making a suggestion as an observer then I think I will hold off on doing the extra work to split the PR until I actually here from someone who has the access to review the changes and is willing to consider merging them.

@jikamens
Copy link
Collaborator Author

Also, most of the changes in the PR before the autotools changes I pushed today are intended for essentially the same purpose, i.e., getting this tool to work on sites where it doesn't currently, and therefore I think it's reasonable to review them together.

@fangfufu
Copy link
Owner

I am happy to do that if someone is actually going to look at the changes but considering that I opened the PR weeks ago and there has been no response I don't know if that is going to happen.

Sorry I had been a bit busy with my personal life. I just hadn't got time to look at it.

@fangfufu
Copy link
Owner

fangfufu commented Sep 28, 2023

@jikamens, I agree with what @jcharaoui said.

Also, you seem to be very good at software engineering. I would be super grateful if you write some sort of test system for this. If you write a test system to prove that all your changes work, I will approve your pull request.

Right now I just have to manually test things. This really isn't ideal. I started this project before I became a professional software engineer. This sort of testing wouldn't fly in my company...

I am happy to approve your autotool related changes, if they were in a different pull request.

install-sh Outdated
#!/bin/sh
# install - install a program, script, or datafile

scriptversion=2020-11-14.01; # UTC
Copy link
Owner

@fangfufu fangfufu Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this from? Why is the scriptversion nearly 3 years old?

missing Outdated
@@ -0,0 +1,207 @@
#! /bin/sh
# Common wrapper for a few potentially missing GNU programs.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this?

@jikamens
Copy link
Collaborator Author

Hi @fangfufu,

A few things:

  1. I did not mean to imply any sort of judgment in my comment above about your not responding to the PR. I sincerely apologize if it came across that way. I understand that people have lives outside of their open-source projects and may not be able to respond immediately to PRs. It's just that from the outside there's no way to distinguish between that and someone who has decided to stop working on a project, and I am not inclined to spend additional time working on a project that isn't being actively maintained.

  2. Having said that, I don't have capacity to build a unit-test system from scratch for someone else's project. If there were existing unit tests in this project then I would be happy to add tests for my changes, but I think asking a project contributor to create a unit-test system from scratch as the price of accepting their bug-fixes into your project is a bit much. The fact of the matter is that my changes clearly fix bugs -- I even indicated in the commits the sites you could use to test that the bug exist without my changes and go away with them -- and that I submitted them to the project for your benefit, not mine. I can always just keep using my fork without anyone else benefiting from my changes. 🤷 That's your call.

  3. I just removed my most recent commit from the PR because you and @jcharaoui are correct that it muddies the waters and I don't want it to be a blocker to getting the real bug fixes in. I modified the build to use autotools because that makes it easier to build and package, so if you're interested in that change I can submit a separate PR for it, but I think it would be better to leave that aside for the time being and focus on the commits that are actual bug fixes.

@fangfufu
Copy link
Owner

@jikamens , I am interested in the autotool changes. Please submit a separate commit for it.

@fangfufu
Copy link
Owner

@jikamens

but I think asking a project contributor to create a unit-test system from scratch as the price of accepting their bug-fixes into your project is a bit much.

I didn't imply in any way that you have to create unit-tests to get these accepted. I was just saying that it would be a good contribution to do. I might just create an issue for it.

@fangfufu fangfufu merged commit 7363ada into fangfufu:master Sep 29, 2023
@fangfufu
Copy link
Owner

@jikamens Created pinned issue #122 for the unit test framework.

@fangfufu
Copy link
Owner

@jikamens

I modified the build to use autotools because that makes it easier to build and package, so if you're interested in that change I can submit a separate PR for it, but I think it would be better to leave that aside for the time being and focus on the commits that are actual bug fixes.

Autotool changes can be done simultaneously as your other bug fixes. Once again, thanks for your contribution. I think autotool changes would be good for this project.

@jikamens
Copy link
Collaborator Author

See #123.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants