-
Notifications
You must be signed in to change notification settings - Fork 243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[fix] text.wrap now returns non-empty list for one character strings #400
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking the time to contribute!
This looks reasonable, but can you clarify ... if it was returning the wrong result for one character strings before doesn't that imply it was returning the wrong result all along for all lengths because it was failing to evaluate the last position? Or does the nxt
shenanigans mean that isn't true? If the latter, isn't this now looping one too many times for other cases? Perhaps a do ... until
loop be more appropriate here.
Also how can we test this so that it is demonstrated to work as expected and we don't break it again?
@alerque sorry for yesterday's incomplete PR I added more content :
|
tests/test-text.lua
Outdated
@@ -46,18 +46,19 @@ batteries. That is because | |||
the goal of Lua is to | |||
produce a lean expressive | |||
language that will be | |||
used on all sorts of machines, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
length of used on all sorts of
= 20
should have jumped instead of putting machines,
on same line
This thing has more problems, and I think the proposed solution is just shifting them (didn't test). Small test; local ww = require("pl.text").wrap
local t = [[
hello "world" 'this' -is- a bb ccc dddd test... but wouldn't it pass??? final. word-that-can-be-broken
]]
for _, v in ipairs(ww(t,3)) do print(v) end Output (without the changes in this PR):
|
this does remind me of this: #141 |
We can change the separator but %w and %W clearly are not ideal solutions |
I agree with @Tieske that there are lots of problems and this is just shifting them, but I'm inclined to think that shifting them has some benefit. As long as we have a naive implementation the closer it works to what it says on the tin the better. On the other hand I have lots of experience with typesetting and wrapping text context is such a hopelessly complex task I do think this naive implementation needs a disclaimer. In fact it should have been called If we wanted to actually fix this we would have to hook into Unicode data –perhaps by using ICU– to actually determine what were word characters and where breaks are allowed in a locale dependent way. There really isn't any way to fake this and get it right more than a small percentage of the time. Given that we don't want to introduce external dependencies, I'm not sure there is anything for it but to just keep faking it or deprecate this entirely. Which brings us back to "we might as well fake it as best we can". |
One advantage on splitting on space characters only is that the result is usually closer to what the user might expect. |
I rebased and force-pushed your branch @Sygmei Added the test from a previous comment as well. |
The only problem left imo is that the function takes the |
I did not change this behaviour since it was clearly documented : |
I read that in a way that words longer than the limit would extend beyond it (they would not be forcefully broken). But that's different from the current behaviour of a minimum length. Though that might as well be a different pr. Opinions @alerque ? |
Added a commit that will honour the width given, and adds an option to forcefully break words longer than width. It was harder than anticipated actually, it has soo many edge cases... added tests for all of them |
## 1.12.0 (2022-Jan-10) - deprecate: module `pl.text` the contents have moved to `pl.stringx` (removal later) [#407](lunarmodules/Penlight#407) - deprecate: module `pl.xml`, please switch to a more specialized library (removal later) [#409](lunarmodules/Penlight#409) - feat: `utils.npairs` added. An iterator with a range that honours the `n` field [#387](lunarmodules/Penlight#387) - fix: `xml.maptags` would hang if it encountered text-nodes [#396](lunarmodules/Penlight#396) - fix: `text.dedent` didn't handle declining indents nor empty lines [#402](lunarmodules/Penlight#402) - fix: `dir.getfiles`, `dir.getdirectories`, and `dir.getallfiles` now have the directory optional, as was already documented [#405](lunarmodules/Penlight#405) - feat: `array2d.default_range` now also takes a spreadsheet range, which means also other functions now take a range. [#404](lunarmodules/Penlight#404) - fix: `lapp` enums allow [patterns magic characters](https://www.lua.org/pil/20.2.html) [#393](lunarmodules/Penlight#393) - fix: `text.wrap` and `text.fill` numerous fixes for handling whitespace, accented characters, honouring width, etc. [#400](lunarmodules/Penlight#400) - feat: `text.wrap` and `text.fill` have a new parameter to forcefully break words longer than the width given. [#400](lunarmodules/Penlight#400) - fix: `stringx.expandtabs` could error out on Lua 5.3+ [#406](lunarmodules/Penlight#406) - fix: `pl` the module would not properly forward the `newindex` metamethod on the global table. [#395](lunarmodules/Penlight#395) - feat: `utils.enum` added to create enums and prevent magic strings [#408](lunarmodules/Penlight#408) - change: `xml.new` added some sanity checks on input [#397](lunarmodules/Penlight#397) - added: `xml.xml_escape` and `xml.xml_unescape` functions (previously private) [#397](lunarmodules/Penlight#397) - feat: `xml.tostring` now also takes numeric indents (previously only strings) [#397](lunarmodules/Penlight#397) - fix: `xml.walk` now detects recursion (errors out) [#397](lunarmodules/Penlight#397) - fix: `xml.clone` now detects recursion (errors out) [#397](lunarmodules/Penlight#397) - fix: `xml.compare` now detects recursion (errors out) [#397](lunarmodules/Penlight#397) - fix: `xml.compare` text compares now work [#397](lunarmodules/Penlight#397) - fix: `xml.compare` attribute order compares now only compare if both inputs provide an order [#397](lunarmodules/Penlight#397) - fix: `xml.compare` child comparisons failing now report proper error [#397](lunarmodules/Penlight#397)
text.wrap
returns empty list for one-character strings, this PR fixes this by using<=
instead of<
for the iterator