perlop: clarify \U, \L, \F behaviour #19999

Closed


mfontani
Contributor

... as they do not stop "at \E or end of string" but are also
stopped by another \U, \L or \F.

See also:
https://www.nntp.perl.org/group/perl.perl5.porters/2022/07/msg264490.html

... which were missing from the test suite.
Ensure the current behaviour - whereby a \E is not needed
to end a \L or \U "chunk" - is tested.
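A minimal sketch of what such a test could look like, using Test::More. The strings and test descriptions here are illustrative assumptions, not the assertions actually added by this pull request:

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical cases, written for illustration only.
# If the modifiers stacked, the first result would be "ABCDEF".
is("\Uabc\LDEF", "ABCdef", '\L ends a preceding \U without a \E');
is("\LABC\Udef", "abcDEF", '\U ends a preceding \L without a \E');
is("\Uabc\Edef", "ABCdef", '\E still ends \U as documented');

done_testing();
```

The first case is the one the old wording of perlop did not cover: no \E appears, yet the \U stops where the \L begins.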
@mfontani force-pushed the mf/202207_clarify_backslash_l_u_behaviour branch from 09b86ff to 7f00929 on July 28, 2022 12:14
@khwilliamson
Contributor

People may find #11145 instructive

@mfontani
Contributor Author

@khwilliamson my goal here is to "at least" document (and have tests for) the current non-stacking behaviour, so that it's clearer that's what's happening / isn't as nebulous as before.

It's hopefully a simple change to merge.

@bram-perl

Looking at some tickets: this documentation change was also suggested in #19670,
but a reply (from @khwilliamson) was "It's worse then that".
Unfortunately, the reply didn't specify in what way it is worse, or in what way documenting it like that would be incorrect :(

@khwilliamson
Contributor

But, I added a follow-up immediately after to that reply that did give concrete examples:

Here are some tickets #8848 #11145 #18981

@iabyn
Contributor

iabyn commented Aug 4, 2022 via email

@mfontani
Contributor Author

mfontani commented Aug 4, 2022

Sounds good, closing this then.

@bram-perl

Having documented some (or all?) of the quirks and caveats of case modifiers [1], I am in a better position to comment.

The text is mostly correct, but based on the current behavior it is missing these caveats:

  1. \U\lfoo is transformed into \l\Ufoo;
  2. \L\ufoo is transformed into \u\Lfoo;
  3. A \E which follows immediately after a \U, \L, \F, \Q, \u, \l causes both symbols to be completely ignored. Examples:
    • \Ufoo\L\Ebar is parsed as \Ufoobar -> There no longer is a \L or \E!
    • applying 1. and 2.: \Ufoo\L\u\Ebar is first transformed into \Ufoo\u\L\Ebar and then into \Ufoo\ubar -> Again there is no longer a \L or \E
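The three caveats above can be checked with a short script. This is a sketch that assumes the behaviour described in this comment; the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Caveat 1: \U\l is reordered to \l\U, so the result is
# lcfirst(uc("foo")) rather than an all-upper-case string.
my $caveat1 = "\U\lfoo";

# Caveat 2: \L\u is reordered to \u\L, i.e. ucfirst(lc("FOO")).
my $caveat2 = "\L\uFOO";

# Caveat 3: the \L is immediately followed by \E, so both are dropped
# and the earlier \U keeps applying to "bar".
my $caveat3 = "\Ufoo\L\Ebar";

print "$caveat1\n";   # fOO
print "$caveat2\n";   # Foo
print "$caveat3\n";   # FOOBAR
```

Comparing each string against the explicit `lcfirst`, `ucfirst`, `uc` and `lc` calls is a convenient way to confirm which rewrite the parser applied.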

So if, after applying 1., 2. and 3., there are still two occurrences of \U, \L or \F in the string, then the second occurrence terminates the first.

Also note: a \L also ends a previous \L, i.e. \LFOO\LBAR\EBAZ is equal to foobarBAZ (same applies for \U and \F)
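As a rough illustration of that last note, the interpolated string can be compared with the same result built from explicit `lc` calls. The variable names below are assumptions made for this sketch:

```perl
use strict;
use warnings;

# The second \L ends the first one, and the single \E then ends the second,
# so "BAZ" is left untouched.
my $interpolated = "\LFOO\LBAR\EBAZ";

# The same result written with explicit function calls, which avoids
# relying on how the modifiers interact.
my $explicit = lc("FOO") . lc("BAR") . "BAZ";

print "$interpolated\n";                                   # foobarBAZ
print $interpolated eq $explicit ? "same\n" : "differ\n";  # same
```

When the intended scope of a case change matters, spelling it out with `lc`, `uc` or `fc` sidesteps these interactions entirely.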

Footnotes

  1. see Interaction of case-modifiers (\U, \L, \u, \l, \F, \Q, \E) in double quoted strings #20042
