Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ReferenceEscaperLexer. #733

Closed
wants to merge 2 commits into from
Closed

Conversation

ndyakov
Copy link

@ndyakov ndyakov commented May 10, 2017

Escape emails ending with a digit.
Related to #724.

Also @theofidry Keep in mind that the old Escaper would have escaped nested parameters such as @user@user1.

Escape emails ending with a digit.
@@ -44,8 +44,7 @@ public function __construct(LexerInterface $decoratedLexer)
*/
public function lex(string $value): array
{
$escapedValue = preg_replace('/(\\p{L})@/', '$1\\@', $value);

$escapedValue = preg_replace('/((\s|^)[a-zA-Z0-9_.+-]+?[A-Za-z0-9])@/', '$1\\@', $value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

an email could have a non ASCII character, so a-zA-Z is wrong as à for example won't work, hence p{L}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@theofidry something like
((\s|^)[^@]?(\p{L}|\p{N}|[+-_.])+?(\p{L}|\p{N}))@ should work just fine.
I will test this later.

@@ -66,5 +66,7 @@ public function provideValues()
yield 'string with a reference with members' => ['bar @foo baz'];

yield 'reference in a middle of a word' => ['email@example', 'email\\@example'];

yield 'email ending with a digit' => ['email1@example', 'email1\\@example'];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given the issues we had, feels like we should add a few more tests with exotic emails (with non ASCII characters for example)

@theofidry
Copy link
Member

I think that's fine, it looks like more of and edge case that can very well wait #712

@theofidry
Copy link
Member

No worries, but it looks like this causes a slight parsing issue

@@ -44,7 +44,7 @@ public function __construct(LexerInterface $decoratedLexer)
*/
public function lex(string $value): array
{
$escapedValue = preg_replace('/((\s|^)[a-zA-Z0-9_.+-]+?[A-Za-z0-9])@/', '$1\\@', $value);
$escapedValue = preg_replace('/((\s|^)[^@]?(\p{L}|\p{N}|[+-_.])+?(\p{L}|\p{N}))@/', '$1\\@', $value);
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the regex was written without the optional [^@] it would work fine but only emails with at least 2 characters in the name would be matched.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could be ok with that, but I think we need way more tests regarding emails to be able to tell if it's alright or not. Also correct me if I'm wrong but it's about auto-escaping emails, so for tricky edge cases one could always fallback to manual escaping e.g. a\@b.c

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@theofidry Agreed as far as writing more tests goes. I would prefer no auto-escaping than "sometimes works" auto-escaping. You have the final say, this looks like it could create problems with time, so maybe remove auto-escaping with the next major release ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think auto-escaping is fine. Most people use regular emails or faker generated emails, it's ok for the 1-2 people using complete alien emails to not benefit from it

@theofidry
Copy link
Member

@ndyakov any updates? No pressure if you don't have time, it's just to know if I can count on it to be included in the next RC release or if this will wait later

@ndyakov
Copy link
Author

ndyakov commented Jul 2, 2017 via email

@theofidry
Copy link
Member

That's ok :)

@theofidry
Copy link
Member

Is this still relevant? As mentioned in #724 this now seems to be working fine

@ndyakov
Copy link
Author

ndyakov commented Feb 20, 2018

@theofidry since then we have moved away from emails ending with numbers for our fixtures, but I can test this if it is working tomorrow. How do you feel about closing this one and probably open a new one with just few more tests regarding the numbers in emails?

@theofidry
Copy link
Member

Fair enough :)

@theofidry theofidry closed this Feb 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants