Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

\IntlDateFormatter constructor throws an exception for unknown locales since 8.1.25 #12561

Closed
mpdude opened this issue Oct 30, 2023 · 35 comments
Closed

Comments

@mpdude
Copy link

mpdude commented Oct 30, 2023

Description

Since PHP 8.1.25 and due to the changes in #12282, the following code throws an IntlException:

<?php
 new IntlDateFormatter(
        'xx',
        IntlDateFormatter::FULL,
        IntlDateFormatter::FULL,
        null,
        null,
        'w'
    );

There may be many reasons why the locale value is invalid – it might be taken from the LANG environment variable and be set to C, it might be parsed from request parameters (the reason behind #12533), or it might be a locale name that is valid on one system but is not in the list of generated/supported locales on another system.

While I agree that ideally, all these conditions should be checked and caught beforehand, and I also agree that fixing #12282 without a sensible locale to assume might be difficult, my point is that turning this condition into an exception in a bugfix release is a breaking change that should be reverted.

PHP Version

PHP 8.1.25

Operating System

No response

@mpdude
Copy link
Author

mpdude commented Oct 30, 2023

@devnexen I'd appreciate to hear your thoughts on this

@devnexen
Copy link
Member

What can eventually be done is for 8.1/8.2 just returning a failure instead but keeping the exception otherwise.

@mpdude
Copy link
Author

mpdude commented Oct 30, 2023

Should Locale::setDefault() behave the same way?

@Seldaek
Copy link

Seldaek commented Oct 30, 2023

We just hit this issue as well in a test case where we used new \IntlDateFormatter(Locale::getDefault(), ...) and due to a previous
state resetting thing we have between tests we called Locale::setDefault with 'C' as locale. So we ended up passing C as the locale, and this throws.

php -r 'Locale::setDefault("C"); var_dump(Locale::getDefault()); new \IntlDateFormatter(Locale::getDefault());'
Command line code:1:
string(1) "C"
PHP Fatal error:  Uncaught IntlException: datefmt_create: invalid locale: U_ILLEGAL_ARGUMENT_ERROR in Command line code:1

So yes IMO setDefault should definitely throw as well to make this more obvious where the failure comes from.

And I'd say maybe 'C' should be accepted as locale, as it is a valid one for setlocale and the system overall, but I'm not sure if that's really sensible or not in the IntlDateFormatter context.

@drbyte
Copy link
Contributor

drbyte commented Oct 31, 2023

@Seldaek wrote:

And I'd say maybe 'C' should be accepted as locale, as it is a valid one for setlocale and the system overall

I agree.

According to https://learn.microsoft.com/en-us/cpp/c-runtime-library/locale-names-languages-and-country-region-strings?view=msvc-170 it would seem that 'C' is a valid default locale which signifies that the host system hasn't yet defined a locale.
I don't think that warrants fatally throwing an exception.

Perhaps the IntlDateFormatter needs to define fallback logic instead?

@hormus
Copy link

hormus commented Oct 31, 2023

For all PHP version.
The locale class
The null or empty string obtains the "root" locale. The "root" locale is equivalent to "en_US_POSIX" in CLDR.

<?php
$coll    = collator_create( 'C' );
$res_val = collator_get_locale( $coll, Locale::VALID_LOCALE );
$res_act = collator_get_locale( $coll, Locale::ACTUAL_LOCALE );
printf( "Valid locale name: %s\nActual locale name: %s\n",
         $res_val, $res_act );
?>

Expected result:

Valid locale name: root
Actual locale name: root
<?php

Locale::setDefault('');
$a = new IntlDateFormatter('', IntlDateFormatter::FULL, IntlDateFormatter::FULL);
var_dump(Locale::getDefault(), $a->getLocale(ULOC_VALID_LOCALE));
?>

Expected result:

string(11) "en_US_POSIX"
string(11) "en_US_POSIX"

Convert native ICU from "C" to "en_US_POSIX"

<?php
Locale::canonicalize(Locale::getDefault());

https://github.com/php/php-src/blob/master/ext/intl/locale/locale_methods.c#L798-L805 get_icu_value_src_php function

@mpdude
Copy link
Author

mpdude commented Nov 1, 2023

If there is a root locale, maybe the IntlDateFormatter should fall back to that when the given locale is invalid?

@Girgias
Copy link
Member

Girgias commented Nov 2, 2023

Fixed by #12568

@Girgias Girgias closed this as completed Nov 2, 2023
@derrabus
Copy link
Contributor

derrabus commented Nov 2, 2023

Just to clarify: The fix is to revert the BC break on 8.1/8.2 and postpone it to 8.3?

@hormus
Copy link

hormus commented Nov 2, 2023

PHP 8.1.26, 8.2.13, 8.3.0 with this fix by @Girgias #12568 Fatal error: Uncaught Error:

Previously with PHP 8.1.25, 8.2.12 Fatal error: Uncaught IntlException:

@marcos-guerrero
Copy link

Hello! Sorry, but I don't understand very well about the fix and the postpone...

Are The next releases of PHP 8.1/8.2 going to return to the previous behaviour (not throwing exception)?
Thanks for the clarification!

@adriangalbincea
Copy link

Hey, there is still an error, this time it changed. the old error disappeared but appeared a new one.

ice_screenshot_20231124-103900

@hormus
Copy link

hormus commented Nov 24, 2023

https://github.com/Icinga/icingaweb2/blob/main/modules/monitoring/application/views/scripts/partials/event-history.phtml#L164
Your error IntlDateFormatter->format() use "C" (PHP code setlocale(LC_TIME, 0) unsupported locale from IntlDateFormatter).
The locale C is en_US_POSIX from ICU format locale IDs.

@adriangalbincea
Copy link

https://github.com/Icinga/icingaweb2/blob/main/modules/monitoring/application/views/scripts/partials/event-history.phtml#L164 Your error IntlDateFormatter->format() use "C" (PHP code setlocale(LC_TIME, 0) unsopported locale from IntlDateFormatter). From "ICU" locale C is en_US.POSIX

Thanks for the reply. This means I need to wait for the devs from icinga to update the modules.

@hormus
Copy link

hormus commented Nov 24, 2023

Yes https://github.com/Icinga/icingaweb2/blob/main/modules/monitoring/application/views/scripts/partials/event-history.phtml#L33C50-L33C57 not good

$dateFormatter = new IntlDateFormatter(setlocale(LC_TIME, 0), IntlDateFormatter::FULL, IntlDateFormatter::NONE);

This produce en_US_POSIX.
This example use IntlDateFormatter::getLocale

<?php

// setLocale(LC_ALL, 'C');
$var = setlocale(LC_TIME, 0); // Maybe standards POSIX default locale is C
if('C' == $var) {
$var = Locale::canonicalize('C');
}
var_dump($var); // en_US_POSIX
$dateFormatter = new IntlDateFormatter($var, IntlDateFormatter::FULL, IntlDateFormatter::NONE);
var_dump($dateFormatter->getLocale(ULOC_VALID_LOCALE)); // en_US_POSIX

?>

Thank @adriangalbincea, I updated the previous message now it indicates error instead of exception.
#12561 (comment)

@mpdude
Copy link
Author

mpdude commented Nov 25, 2023

This change will be a constant source of confusion and surprises for a lot of people in the years to come :-)

ArthurHoaro added a commit to ArthurHoaro/Shaarli that referenced this issue Nov 25, 2023
PHP 8.2.13 and 8.1.26 introduced another error preventing the CI to pass
with default POSIX locale. We now have a wrapper to get the locale and
replace C with a valid locale.

@see php/php-src#12561

Fixes shaarli#2029
@nilmerg
Copy link

nilmerg commented Nov 30, 2023

Your error IntlDateFormatter->format() use "C" (PHP code setlocale(LC_TIME, 0) unsupported locale from IntlDateFormatter).
The locale C is en_US_POSIX from ICU format locale IDs.

icinga-dev here. If I understand this correctly, php accepts C as (and uses this as default) locale but intl does not. This seems also be proven by the script below:

<?php

echo "initial locale: ";
$initial = setlocale(LC_TIME, 0); 
echo "$initial\n";

echo "setting invalid locale: ";
setlocale(LC_TIME, 'invalid');

echo setlocale(LC_TIME, 0) . "\n";

echo "setting non-existing locale: ";
$nonExisting = 'en_US.UTF-8';
setlocale(LC_TIME, $nonExisting);

echo setlocale(LC_TIME, 0) . "\n";

echo "setting existing locale: ";
$existing = 'de_DE.UTF-8';
setlocale(LC_TIME, $existing);

echo setlocale(LC_TIME, 0) . "\n";

echo "now onto intl...\n";


function intltest($locale) {
echo "how does intl handle $locale?\n";
$dateFormatter = new IntlDateFormatter($locale, IntlDateFormatter::FULL, IntlDateFormatter::NONE);
try {
	$dateFormatter->format(0);
	echo "well\n";
} catch (Throwable $e) {
        echo "not well\n";
}
}

echo "checking the default\n";
intltest($initial);
echo "checking a non-existing but valid locale\n";
intltest($nonExisting);
echo "checking an existing and valid locale\n";
intltest($existing);

echo "checking en_US_POSIX\n";
intltest('en_US_POSIX');
echo "but how does setlocale?\n";
echo (setlocale(LC_TIME, 'en_US_POSIX') ?: "not well") . "\n";


echo "out of curiosity.. what does Locale::canonicalize('C') do?\n";
echo "this: " . Locale::canonicalize('C') . "\n";

initial locale: C
setting invalid locale: C
setting non-existing locale: C
setting existing locale: de_DE.UTF-8
now onto intl...
checking the default
how does intl handle C?
not well
checking a non-existing but valid locale
how does intl handle en_US.UTF-8?
well
checking an existing and valid locale
how does intl handle de_DE.UTF-8?
well
checking en_US_POSIX
how does intl handle en_US_POSIX?
well
but how does setlocale?
not well
out of curiosity.. what does Locale::canonicalize('C') do?
this: c

But the thing is, why does PHP support something and intl does not? I mean, shouldn't both be interoperable?

I can accept that intl supports more than the builtin php functions, given the fact that it's a extension module, but I think it should also accept what php accepts, especially the default C!

So what you recommend is guarding every usage of e.g. IntlDateFormatter against it this way?:

$var = setlocale(LC_TIME, 0); // Maybe standards POSIX default locale is C
if('C' == $var) {
    $var = Locale::canonicalize('C');
}
$dateFormatter = new IntlDateFormatter($var, IntlDateFormatter::FULL, IntlDateFormatter::NONE);

Though, for me, Locale::canonicalize does not seem to help here (the last part of my script's output), so I'd have to hardcode en_US_POSIX instead. And again, for every usage? Really?

I'm baffled at the moment, to say the least. But please enlighten me about this behavior. I'd willingly accept this with a plausible explanation.

@hormus
Copy link

hormus commented Nov 30, 2023

Use local IDs in ICU format. "Standards posix" C is supported by the locale of your operating system if necessary, if true en_US_POSIX is the ICU locale id format supported.

<?php

error_reporting(PHP_INT_MAX);
ini_set("intl.error_level", E_WARNING);
if(setlocale(LC_ALL, 'C')) {
echo 'Set';
} // Return false on Error

$LocaleDefault1 = Locale::getDefault();
$LocaleDefault2 = 'C';
if('C' == $LocaleDefault1) {
$LocaleDefault1 = strtr(Locale::canonicalize('C'), array('C' => 'en_US'));
Locale::setDefault($LocaleDefault1);
if(!$LocaleDefault2) {
$LocaleDefault2 = $LocaleDefault1;
}
}
if('C' == $LocaleDefault2) {
$LocaleDefault2 = strtr(Locale::canonicalize('C'), array('C' => 'en_US'));
}
$LocaleDefault3 = null;
try {
$dateFormatter = new IntlDateFormatter($LocaleDefault2, IntlDateFormatter::FULL, IntlDateFormatter::NONE);
$LocaleDefault3 = $dateFormatter->getLocale(ULOC_VALID_LOCALE);
} catch(IntlException $e) {
}
$b = intl_get_error_code();
var_dump($b, $LocaleDefault3, $LocaleDefault1);

?>

It's not valid ICU format locale IDS

Locale::setDefault('C');
var_dump(Locale::getDefault()); // C

ICU format locale ids

IntlDateFormatter::getLocale(ULOC_VALID_LOCALE)

@nilmerg Not using a POSIX operating system?

@Girgias
Copy link
Member

Girgias commented Dec 1, 2023

But the thing is, why does PHP support something and intl does not? I mean, shouldn't both be interoperable?

I can accept that intl supports more than the builtin php functions, given the fact that it's a extension module, but I think it should also accept what php accepts, especially the default C!

Because setlocale() has existed way before ICU and ext/intl, and C is the locale that PHP uses by default, as PHP is written in C and thus requires a POSIX locale format.

Though, for me, Locale::canonicalize does not seem to help here (the last part of my script's output), so I'd have to hardcode en_US_POSIX instead. And again, for every usage? Really?

I'm baffled at the moment, to say the least. But please enlighten me about this behavior. I'd willingly accept this with a plausible explanation.

ext/intl is a wrapper around ICU, if ICU doesn't support this, then yes you need to do this.

I don't know what to say other than setlocale() and Intl are not the same thing, nor deal with the same stuff, so I am really confused as to why people assume this to work.

@nilmerg
Copy link

nilmerg commented Dec 12, 2023

Okay. I accept the reasoning for this change being about conformity and standardization.

new IntlDateFormatter(setlocale(LC_TIME, 0)) is not guaranteed to work and should be replaced by new IntlDateFormatter(Locale::getDefault()). (Assuming 'C' hasn't been passed onto Locale::setDefault(). Although, it's still questionable to me that it accepts it, being from the same package)

Though, it's quite unfortunate that this change is included in bugfix releases.

It broke with 8.1.25 and 8.2.12 and has yet to be fixed as 8.1.26 and 8.2.13 are still affected.

That 8.3.0 includes it, is probably fine. Though, the error differs from what has been documented.

I suspect this is already being tracked somewhere, (didn't find anything though) so I summed this up for clarity only. Even if just for myself..

@Girgias
Copy link
Member

Girgias commented Dec 12, 2023

Tracking in #12943

@adriangalbincea
Copy link

Okay. I accept the reasoning for this change being about conformity and standardization.

new IntlDateFormatter(setlocale(LC_TIME, 0)) is not guaranteed to work and should be replaced by new IntlDateFormatter(Locale::getDefault()). (Assuming 'C' hasn't been passed onto Locale::setDefault(). Although, it's still questionable to me that it accepts it, being from the same package)

Though, it's quite unfortunate that this change is included in bugfix releases.

It broke with 8.1.25 and 8.2.12 and has yet to be fixed as 8.1.26 and 8.2.13 are still affected.

That 8.3.0 includes it, is probably fine. Though, the error differs from what has been documented.

I suspect this is already being tracked somewhere, (didn't find anything though) so I summed this up for clarity only. Even if just for myself..

Hi @nilmerg , do you have an eta for the icingaweb fix? History is still broken. Maybe you have a temp patch we can use?

@hormus
Copy link

hormus commented Jan 4, 2024

@adriangalbincea read #12943 (comment) this accept C but not C.UTF-8

@adriangalbincea
Copy link

@adriangalbincea read #12943 (comment) this accept C but not C.UTF-8

Hi @hormus thanks for your message. I am a regular use of Icinga Monitoring and to understand, it is something the maintainer has to fix. Am I right?

@nilmerg
Copy link

nilmerg commented Jan 8, 2024

@adriangalbincea If you're on php 8.1.x or 8.2.x you'd just have to wait for 8.1.28 or 8.2.14, respectively. (At least, I hope both will include the fix. Please correct me if I'm wrong)

@adriangalbincea
Copy link

@adriangalbincea If you're on php 8.1.x or 8.2.x you'd just have to wait for 8.1.28 or 8.2.14, respectively. (At least, I hope both will include the fix. Please correct me if I'm wrong)

Hi @nilmerg I am using php 8.2.14 and the issue is still there. Maybe in a future release?

@nielsdos
Copy link
Member

nielsdos commented Jan 8, 2024

8.2.15 will include a fix.
8.1 will not get the fix because it is out of support for bugfixes, only security fixes are applied to 8.1 now

@adriangalbincea
Copy link

8.2.15 will include a fix. 8.1 will not get the fix because it is out of support for bugfixes, only security fixes are applied to 8.1 now

That is good news. Fingers crossed. Thanks for the update.

@nilmerg
Copy link

nilmerg commented Jan 9, 2024

@nielsdos This is unfortunate. It broke with 8.1.25, one day before active support ended. In a patch release. Such a thing happens of course, but shouldn't. Please reconsider.

@nielsdos
Copy link
Member

nielsdos commented Jan 9, 2024

@nielsdos This is unfortunate. It broke with 8.1.25, one day before active support ended. In a patch release. Such a thing happens of course, but shouldn't. Please reconsider.

Yes it is unfortunate.
I don't have the power to reconsider this, this is something that needs to go through the release managers and the internals mailing list.

@adriangalbincea
Copy link

8.2.15 will include a fix. 8.1 will not get the fix because it is out of support for bugfixes, only security fixes are applied to 8.1 now

Hi, I changed php version to 8.3.1 but the issue is still there. Is there a fix planned with 8.3 too?

@xabbuh
Copy link

xabbuh commented Jan 10, 2024

Hi, I changed php version to 8.3.1 but the issue is still there.

This looks expected: https://www.php.net/manual/en/migration83.other-changes.php#migration83.other-changes.functions.intl

@devnexen
Copy link
Member

8.2.15 will include a fix. 8.1 will not get the fix because it is out of support for bugfixes, only security fixes are applied to 8.1 now

Hi, I changed php version to 8.3.1 but the issue is still there. Is there a fix planned with 8.3 too?

Would it be possible for you to give a quick try to the future 8.3.2 version ? it is a release candidate, just to test locally.

@adriangalbincea
Copy link

Hi, I changed php version to 8.3.1 but the issue is still there.

This looks expected: https://www.php.net/manual/en/migration83.other-changes.php#migration83.other-changes.functions.intl

Thanks. I just did a workaround as I need the history of events...
/usr/share/icingaweb2/modules/monitoring/application/views/scripts/partials/event-history.phtml

I replaced
$dateFormatter = new IntlDateFormatter(setlocale(LC_TIME, 0), IntlDateFormatter::FULL, IntlDateFormatter::NONE);
with
$dateFormatter = new IntlDateFormatter(setlocale(LC_TIME, ['en_GB', 'en_GB.UTF-8', 'en-GB']), IntlDateFormatter::FULL, IntlDateFormatter::NONE);
Hope that it helps someone.

@adriangalbincea
Copy link

8.2.15 will include a fix. 8.1 will not get the fix because it is out of support for bugfixes, only security fixes are applied to 8.1 now

Hi, I changed php version to 8.3.1 but the issue is still there. Is there a fix planned with 8.3 too?

Would it be possible for you to give a quick try to the future 8.3.2 version ? it is a release candidate, just to test locally.

I would love to help but I am not that advanced to install RC on top of the current one for testing...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests