truncate filter and HTML special characters [rt.cpan.org #95707] #176

Closed
atoomic opened this issue Oct 5, 2018 · 1 comment


atoomic commented Oct 5, 2018

Migrated from rt.cpan.org#95707 (status was 'open')

Requestors:

From https://www.google.com/accounts/o8/id?id=AItOawk8OBCmUaR8VF4IJkvuFJCdAPwqR_w5xEk on 2014-05-16 10:00:02:

In Manual/Filters.pod it gives the example of truncate(26, '&hellip;')
This gives the impression that this is a suitable way of truncating a string and inserting an ellipsis character in HTML source. This is misleading, or even broken: the truncate filter counts the replacement text as 8 characters long rather than 1, so it removes too many characters when it truncates.
So it would be better to find a less misleading example. Perhaps truncate(26, '[truncated]') would do. It would also be a good idea to give a correct example of how to handle the HTML ellipsis case. Trying to work this out reveals various bits of missing functionality in Template Toolkit.
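
The miscount described above can be illustrated with a small Python model. This is only a sketch: it assumes the filter reserves `len(suffix)` characters of the limit for the replacement text, as the report describes. The `truncate` function and the `sample` string are hypothetical and are not Template Toolkit's own code.

```python
def truncate(text: str, limit: int, suffix: str = "...") -> str:
    """Model of a truncate filter where the suffix counts toward the limit."""
    if len(text) <= limit:
        return text
    # Assumption: a suffix longer than the limit is itself cut down to fit.
    if limit < len(suffix):
        suffix = suffix[:limit]
    return text[: limit - len(suffix)] + suffix


sample = "abcdefghijklmnopqrstuvwxyz0123456789"  # 36 characters

# The entity is counted as 8 characters, so only 18 characters of text survive.
with_entity = truncate(sample, 26, "&hellip;")

# A literal ellipsis is counted as 1 character, so 25 characters survive.
with_literal = truncate(sample, 26, "\u2026")

print(with_entity)
print(with_literal)
```

Once a browser renders the entity as a single glyph, the first result shows 19 visible characters instead of the requested 26.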

In many cases the correct behaviour will be to use a literal ellipsis and chain truncate and html filters.

[% somevalue | truncate( 10, "…" ) | html %]

The problem here is how to reliably insert the ellipsis character ( "…" ) in a template without using UTF-8 or wide-character source, as Template Toolkit has no way to escape a hex code. I want to be able to do
something like:
[% somevalue | truncate( 10, "\u2026" ) | html %]
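
The order of operations requested here (truncate the raw text with a one-character ellipsis, then HTML-encode the result) can be sketched in Python. This is an illustration under assumptions, not Template Toolkit behaviour: `truncate` is a hypothetical stand-in for the filter, `html.escape` is the standard-library function, and the final `xmlcharrefreplace` step is one possible way to keep the output pure ASCII.

```python
import html


def truncate(text: str, limit: int, suffix: str = "\u2026") -> str:
    """Hypothetical stand-in for the filter; the suffix counts toward the limit."""
    if len(text) <= limit:
        return text
    return text[: max(limit - len(suffix), 0)] + suffix


raw = "Fish & chips <b>today</b>"

# Step 1: truncate the unencoded text, so every character counts as one.
shortened = truncate(raw, 10, "\u2026")

# Step 2: HTML-encode afterwards; "&" becomes "&amp;", the ellipsis is untouched.
escaped = html.escape(shortened)

# Optional step 3: replace non-ASCII characters with numeric character references.
ascii_safe = escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(escaped)
print(ascii_safe)  # Fish &amp; ch&#8230;
```

Because the encoding happens last, the truncation can never cut an entity in half.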

Even better would be to be able to use a symbolic name.

Note that truncating something that is already HTML encoded is not a good idea, as the truncation may cut into an &-encoded character and will miscount the length anyway if there are any &-encoded characters. Maybe an html_truncate filter is required that allows for & encoding of the string and replacement text. (Either <elements> should be regarded as 0 length or their presence treated as an error.)
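
One possible shape for such an html_truncate filter is sketched below in Python. Everything here is an assumption made for illustration: the function name, its signature, and the choice to treat markup elements as an error (the second of the two options mentioned above). It decodes the entities, truncates by visible characters, and re-encodes the kept text.

```python
import html
import re

# Assumption: any "<...>" run in already-encoded text is a markup element.
TAG_RE = re.compile(r"<[^>]*>")


def html_truncate(encoded: str, limit: int, suffix: str = "&hellip;") -> str:
    """Truncate HTML-encoded text, counting each entity as one character."""
    if TAG_RE.search(encoded):
        raise ValueError("markup elements are not supported in this sketch")
    text = html.unescape(encoded)   # "&amp;" -> "&", counted as 1 character
    tail = html.unescape(suffix)    # "&hellip;" -> one ellipsis character
    if len(text) <= limit:
        return encoded
    kept = text[: max(limit - len(tail), 0)]
    # Re-encode the kept text and append the suffix in its encoded form.
    return html.escape(kept) + suffix


print(html_truncate("Fish &amp; chips &amp; peas", 10))
```

Here the result is 10 visible characters long once rendered, and no entity is split.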

[It may actually be that HTML truncation is all better done client side in JavaScript or CSS.]

Even if you decide not to provide any additional functionality, please replace your misleading/broken example.

From ozcoder@gmail.com on 2016-02-08 07:29:35:

Hi,

On Fri May 16 06:00:02 2014, https://www.google.com/accounts/o8/id?id=AItOawk8OBCmUaR8VF4IJkvuFJCdAPwqR_w5xEk wrote:
> In Manual/Filters.pod it gives the example of truncate(26, '&hellip;')
> This gives the impression that this is a suitable way of truncating a
> string and inserting an ellipsis character in HTML source. This is
> misleading or even broken as the truncate filter will think the
> replacement text is 8 characters long rather than 1 so will remove too
> many characters if it decides to truncate.
> So it would be better to find a less misleading example. Perhaps
> truncate(26, '[truncated]') would do. But then would also be a good
> idea to give a correct example of how to handle the HTML ellipsis case
> as well. Trying to work out this indicates various bits of missing
> functionality in Template toolkit.
> 
> In many cases the correct behaviour will be to use a literal ellipsis
> and chain truncate and html filters.
> 
> [% somevalue | truncate( 10, "…" ) | html %]
> 
> The problem here is how to reliably insert the ellipsis character (
> "…" ) as template toolkit without using UTF-8 or wide character source
> as Template toolkit has no way to escape a hex code. Want to be able
> to do
> something like:
> [% somevalue | truncate( 10, "\u2026" ) | html %]
> 
> Even better would be to be able to use a symbolic name.
> 
> Note that truncating something that is already HTML encoded is not a
> good idea as the truncation may cut into a & encoded character, and
> will miscount the length anyway if there is any & encoded characters.
> Maybe an html_truncate filter is required that allows for &encoding of
> the string and replacement text. (Either <elements> should be regarded
> as 0 length or their presence treated as an error.)
> 
> [It may actually be that HTML truncation is all better done client
> side in JavaScript or CSS.]
> 
> Even if you decide not to provide any additional functionality, please
> replace your misleading/broken example.


I have created a pull request that goes halfway to fixing this.
See https://github.com/abw/Template2/pull/56

Gordon

atoomic commented Oct 5, 2018

dup, already moved to #56
