roffit: fix special characters and broken links #37
I don't see any actual roffit changes here? Does it only add tests? How can it then solve any problems?
Also, this has to be possible to test with much less data added. 23K of data is too excessive.
I approve of extending the tests to test for more problems, but I think this PR goes about it the wrong way. A first take would be to extend the existing test case with more text for the problems you have identified. Not adding thousands of lines of new files. roffit is not curl specific, it converts man pages to html so it has no knowledge of or special handling for curl related issues.
Reviewed with comments on Makefile, fixBrokenLink, fixSpecialCharacters, and roffit files.
Sorry, but this PR does not fix roffit as described. It just fixes two minor static HTML texts.
That's fine. Thank you for considering it. Adding two bash scripts was the only way I was able to resolve issue 36. Learning more about perl is on my todo list, so if the issue is still up in several months I may try again. Regardless, this has been a very enjoyable and rewarding process!
Separate scripts cannot solve this issue.
Ok. Closing this for now and will reopen when able to implement in roffit.
This is totally unacceptable code. Make it clean pure perl and do not generate temp files.
lines 259 to 285 - perl changes
roffit
@@ -256,6 +256,33 @@ sub linkfile {
# convert https://, http:// and ftp:// URLs to <a href> links
$field =~ s/(^|\W)((https|http|ftp):\/\/[a-z0-9\-._~%:\/?\#\[\]\@!\$&'()*+,;=]+)/$1<a href=\"$2\">$2<\/a>/gi;

# remove punctuation characters at end of url
$field =~ s/(href=\")([^\"]*)(")/$1$2/g;
This does not remove punctuation. What is this for?
" seemed to be a troublemaker, so it had to be removed first for the following lines to work.
roffit
@@ -256,6 +256,33 @@ sub linkfile {
# convert https://, http:// and ftp:// URLs to <a href> links
$field =~ s/(^|\W)((https|http|ftp):\/\/[a-z0-9\-._~%:\/?\#\[\]\@!\$&'()*+,;=]+)/$1<a href=\"$2\">$2<\/a>/gi;

# remove punctuation characters at end of url
$field =~ s/(href=\")([^\"]*)(")/$1$2/g;
$field =~ s/(href=\"(http:\/\/|https:\/\/|ftp:\/\/)\")*([,.;()*])*(\">)/$1$4/g;
Shouldn't this rather correct the pattern on line 257 to not include punctuation as terminating symbols?
Tried with variations of pattern. Either my corrections were flawed or the URLs need to be built first.
> Shouldn't this rather correct the pattern on line 257 to not include punctuation as terminating symbols?

Kept getting unexpected results like <a href="https://curl">https://curl</a>.se/, or inline punctuation being removed.
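For illustration, here is a minimal sketch of what the reviewer's suggestion could look like: a single pattern that still allows punctuation inside a URL but requires the last matched character to be non-punctuation, so a trailing comma or period stays outside the link. `link_urls` is a hypothetical helper, not a roffit function, and the set of characters allowed to end a URL is an assumption chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of roffit: wrap URLs in <a href> while
# leaving trailing punctuation outside the link.
sub link_urls {
    my ($field) = @_;
    # First class: the characters the existing pattern accepts (greedy).
    # Second class: the final character, which must not be . , ; : ! ? ' ( ) * [ ]
    # (assumption: a URL never needs to end in one of these).
    $field =~ s{(^|\W)((?:https?|ftp)://[a-z0-9\-._~%:/?\#\[\]\@!\$&'()*+,;=]*[a-z0-9\-_~%/\#\@\$&=+])}
               {$1<a href="$2">$2</a>}gi;
    return $field;
}

print link_urls("See https://curl.se/, or ftp://example.com/file.txt."), "\n";
# prints: See <a href="https://curl.se/">https://curl.se/</a>, or <a href="ftp://example.com/file.txt">ftp://example.com/file.txt</a>.
```

Because the first class is greedy and only the final character is restricted, punctuation in the middle of a URL is kept. The trade-off is that a URL that legitimately ends in a character such as `)` would lose it.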
roffit
# add anchor and anchor links to special character options
my $specialcharacters = "`~!@\$%^*()-_=+{};:\'\\|,.?";
$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)([$specialcharacters]|[\[\]])(,|<\/span>)/$1\"-$4\"$2$3$4/g;
$field =~ s/(<a class=\"emphasis\") href=\"#-\">-([$specialcharacters]|[\[\]])(,|<\/a>)/$1 href=\"#-$2\">-$2/g;
I think these changes should rather be done as updates to text2name() and do_encode() instead of adding this extra stage of edits.
Unable to achieve this in the subroutines: either the anchor links were removed, all anchors (name attribute) were removed, or there was no effect.
> I think these changes should rather be done as updates to text2name() and do_encode() instead of adding this extra stage of edits.

With text2name and do_encode, new things break. Some examples:
- name="socks5h://" (should be name="socks5h")
- name="AUNDERSCORE" (should be name="A_UNDERSCORE")
- name=""--any-option" (2 "" at start).
roffit
$field =~ s/(<a name=)\"-\"(><\/a>)(<span class=\"nroffip\">-)($htmlentity)(,|<\/span>)/$1\"-$4\"$2$3$4/g;
$field =~ s/#/hash/ | $field =~ s/&/ampersand/ | $field =~ s/'/single-quote/ |
$field =~ s/</less-than/ | $field =~ s/>/greater-than/;
}
I think these ones could also be included into that change.
Created subroutine field_anchor to handle repeating blocks. Required $field and $htmlentity to be defined globally.
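As a rough sketch of how the character-to-name replacements quoted above could be written without chaining separate substitutions, the mapping can live in one lookup table applied by a single substitution. `%entity_name` and `anchor_name` are hypothetical names for this example, not roffit code, and the sketch assumes the input holds the raw characters rather than HTML entities.

```perl
use strict;
use warnings;

# Hypothetical lookup table; the names mirror the substitutions in the diff.
my %entity_name = (
    '#' => 'hash',
    '&' => 'ampersand',
    "'" => 'single-quote',
    '<' => 'less-than',
    '>' => 'greater-than',
);

# Hypothetical helper, not part of roffit: replace every listed character
# with its spelled-out name so the result is usable as an anchor name.
sub anchor_name {
    my ($text) = @_;
    $text =~ s/([#&'<>])/$entity_name{$1}/g;
    return $text;
}

print anchor_name('-#'), "\n";   # prints "-hash"
```

Taking the text as an argument and returning the result also avoids the need for globally defined variables.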
testpage.1

.IP "-?, --special-char"
Options with special characters will be included in anchors.
Such as this option with special character \fI\-?, \-\-special\-char\fP.
I don't see any code handling this new option. Plus, I don't think this needs to be managed by an option - I think it makes sense to improve the general handling of anchors and links so it should probably be used by default.
For the test build. Moved into the description. Anyway, the existing options are misnamed (--mumbo and --jumbo), which gave me the impression that they were only for build test purposes. That does not seem like an issue though (and may be a good find for someone who has never contributed on GitHub).
Regarding some suggested edits: Unable to achieve fix with
Quick takeaway (where - (1) should be => (2) becomes):
For details see: collapsed section
Detailed report (explains above):
Since curl.1 is a good use case for thoroughly testing a roffit
Below are sample outcomes from two instances of build, and then results
results in:
results in:
Several variations of the above edits were done with similar (if not the same) outcome. So in conclusion; either my approaches were flawed, or I need to study the roffit
When "diff" ran using -- 2. with current pull roffit --
Edits
Pull Summary
Tested this with files matching issues this pull request resolves and with repos that have an extensive amount of files roffit can convert to html. Made two supporting repos with test results. They are:
The two tests used more or less the same procedure, and focused on:
The two support repos have full details of the procedure, but in short the steps were: