Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add is_list function #4886

Closed
wants to merge 1 commit into from
Closed

Add is_list function #4886

wants to merge 1 commit into from

Conversation

duskwuff
Copy link
Contributor

@duskwuff duskwuff commented Nov 4, 2019

This function tests if an array contains only sequential integer keys. While this isn't an official type, this usage is consistent with the community usage of "list" as an annotation type, cf. https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists.

This functionality can be polyfilled, but this implementation should be much faster, as it can check for a packed array.

This function tests if an array contains only sequential integer keys. While
this isn't an official type, this usage is consistent with the community usage
of "list" as an annotation type, cf.
https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists
zend_string *str_idx;

ZEND_PARSE_PARAMETERS_START(1, 1)
Z_PARAM_ZVAL(arg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could use Z_PARAM_ARRAY_HT() here, to get arrval without further ado.

Copy link
Contributor Author

@duskwuff duskwuff Nov 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would make the function throw an error when passed a non-array argument:

Fatal error: Uncaught TypeError: is_list() expects parameter 1 to be array, null given

My intention was to make is_list behave like the other is_ type-checking functions, which quietly return false when passed a type different from the one they're checking for.

Copy link
Member

@kocsismate kocsismate Nov 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duskwuff AFAIK, you could also use the following if you want to achieve an is_* alike behaviour:

ZEND_PARSE_PARAMS_START_EX(ZEND_PARSE_PARAMS_QUIET, 1, 1)
// ...

(Although I would also prefer constraining the parameter type to array)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@duskwuff, ah, I see. Thanks.

test_is_list("mixed keys", [0 => 0, "a" => 1]);
test_is_list("ordered keys", [0 => 0, 1 => 1]);
test_is_list("shuffled keys", [1 => 0, 0 => 1]);
test_is_list("skipped keys", [0 => 0, 2 => 2]);
Copy link

@rybakit rybakit Nov 4, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also add a test to cover arrays with negative keys, e.g.: [-1 => 1, 0 => 2]

@cmb69
Copy link
Member

cmb69 commented Nov 7, 2019

It might be useful to bring this topic to the internals mailing (internals@lists.php.net) list for discussion.

@nikic nikic added the Feature label Nov 7, 2019
@nikic
Copy link
Member

nikic commented Nov 7, 2019

Most likely this is going to need a small RFC, but at least an internals discussion would be good.

One thing to consider would be whether this function should be is_list() (allowing non-array params) or array_is_list() (allowing only arrays).

Generally I do think that this is a reasonable addition, especially as we already have existing behavior (json) making this distinction, and because an internal implementation can be more efficient than any userland implementation by using the packed array information.

@TysonAndre
Copy link
Contributor

I wonder if iterating over buckets in reverse would be more efficient for rejecting arrays that weren't list, here and in ext/json

E.g. if you had an array with thousands of elements, but a few were unset in the middle of the array, then the key of the last bucket (that wasn't unset) wouldn't be zend_hash_num_elements - 1

@sshymko
Copy link

sshymko commented Nov 8, 2019

Is function is_list() meant to replace unreliable validations array_key_exists(0, $array)?

IMO, it's a rare use case. I've only seen it 2-3 times in 15 years and only in library/framework code.

Maybe something opposite like is_assoc_array($array) would be a more useful alternative?

@muglug
Copy link
Contributor

muglug commented Nov 8, 2019

Is function is_list() meant to replace workaround-ish validations array_key_exists(0, $array)?

The workaround/polyfill would be $array === array_values($array).

The value of a separate is_list is that it would benefit PHP static analysis tools that support the list type (currently just in docblocks) as a subtype of array.

One drawback of this is that it could make it more difficult to introduce an actual list type in the future, if that was desired – Hack has a similar vec type distinct from array.

@muglug
Copy link
Contributor

muglug commented Nov 8, 2019

Another approach might be to have a flag on the object representing array data to signify that it's a list. Certain operations would remove that flag:

$arr = [3, 4, 5]; // $arr created with list flag
$arr[] = 7; // list flag still on
$arr[0] = 12; // list flag still on
$arr[5] = 2; // list flag removed from $arr

$a = array_merge([2, 4, 5], [1, 3, 5]); // list flag set on $a
$b = array_filter($a); // no list flag on $b

I haven't ever touched PHP internals, but I'd be happy to explore this more.

@sshymko
Copy link

sshymko commented Nov 8, 2019

Expression $array === array_values($array) is reliable and concise. I'd rather stick with it instead of introducing the concept of "lists" that does not exist in PHP.

@TysonAndre
Copy link
Contributor

TysonAndre commented Nov 9, 2019

Another approach might be to have a flag on the object representing array data to signify that it's a list. Certain operations would remove that flag:

That would mean that the way this behaves inside a function (e.g. for a parameter) depends on what the code outside the function did to the parameter before calling it, which would be unintuitive (e.g. your array_filter example). I'd be against saying something wasn't a list if it didn't have that flag.


The workaround/polyfill would be $array === array_values($array).

There's small edge cases to that (e.g. if the array is infinitely recursive due to array references, this will crash), or due to NAN

php > $x = ['key' => 2, NAN]; unset($x['key']); var_export($x === array_values($x));
false
php > var_export($x);
array (
  0 => NAN,
)

A real polyfill handling all edge cases would be $x === [] || array_keys($x) === range(0, count($x) - 1)


I can think of a few use cases for the slight performance improvement from a native is_list:

  • User-land serialization implementations such as JSON-like formats distinguishing between associative and list data, custom protocols, etc.
  • User-land functionality for dumping PHP values (e.g. more compact formats than var_export)
  • API validation of parameters (e.g. functions that expect to be called via RPC with a real list, not an associative array)
  • Slightly faster running time of static analyzers for checking if an array in the code being analyzed can be converted to a list

@Majkl578
Copy link
Contributor

Majkl578 commented Nov 9, 2019

Since PHP has no concept of lists, not even as a pseudo-type, how do you plan to support object collections? E.g. Doctrine's Collection which implements ArrayAccess could also represent a list.

Generally I think adding such functions not related to builtin types and with this generic names is a very bad idea. is_array_list() would be at least explicit about what type it tests.

@duskwuff
Copy link
Contributor Author

duskwuff commented Nov 9, 2019

@cmb69, @nikic: What's the best way for me to start whatever process would be needed to move this forward? Just send an email to the list?

@sshymko: The primary goal of this function is, as @muglug suggests, to aid in static analysis. While the behavior of this function is broadly similar to $array === array_values($array), it's not subject to some of the gotchas mentioned, and the performance should be significantly better, especially for large list-like arrays.

@muglug: That flag already exists -- it's referenced in my patch as HT_IS_PACKED(arrval). However, I've made a deliberate decision to not expose that distinction, as it's an implementation detail of the runtime and can get cleared in some surprising ways, a couple of which I explicitly trigger in unit tests.

@Majkl578: That's outside the scope of what this function is intended to accomplish. Consider that is_array() returns false for all objects, even ones which implement ArrayAccess.

@cmb69
Copy link
Member

cmb69 commented Nov 9, 2019

@duskwuff, yes, just write a mail to internals@ please. :)

@Majkl578
Copy link
Contributor

Majkl578 commented Nov 9, 2019

@duskwuff It's not out of scope, those functions you talk about test built-in types. List is not a built in type and doesn't fall into that category.

@TysonAndre
Copy link
Contributor

TysonAndre commented Nov 9, 2019

It's not out of scope, those functions you talk about test built-in types. List is not a built in type and doesn't fall into that category.

There are already is_*() functions that don't correspond to a built in type

@Majkl578
Copy link
Contributor

Majkl578 commented Nov 9, 2019

@TysonAndre is_numeric and is_scalar test well known pseudo-types, unions of native types. And while is_countable is neither it has correct non-ambiguous behaviour: " Verify that the contents of a variable is an array or an object implementing Countable" (countables are a well-known concept since PHP 5 and also represent a union type of array|Countable).

There has never been a list in PHP though.

@muglug
Copy link
Contributor

muglug commented Nov 9, 2019

There has never been a list in PHP though.

True, but the concept of a list is very widespread in computer science and that type is available in most other languages. Moreover, the native PHP array type is routinely used with the assumption its contents represent a list:

function takesList(array $arr) {
  if ($arr) {
    doSomethingWith($arr[0]);
  }
}

@Majkl578
Copy link
Contributor

Majkl578 commented Nov 9, 2019

@muglug Yes, the same way object collections are used, but which is not supported by this PR.

@M1ke
Copy link

M1ke commented Nov 11, 2019

is_numeric and is_scalar test well known pseudo-types, unions of native types.

I'd argue that is_numeric doesn't test a union of native types, as it will return for floats & ints, but also for certain values of a string. That seems to be what is_list would do - it returns for certain values of an array.

I do prefer the above comment by nikic that it be named array_is_list which makes this more explicit and would TypeError if not passed an array, but I can see arguments to keep it wider as in future maybe it would allow something that implements an interface, similar to is_countable.

@hikari-no-yume
Copy link
Contributor

There is a new discussion on internals about the idea of a “list” type which mean more attention on this PR again. https://externals.io/message/109760

@TysonAndre
Copy link
Contributor

I forgot about this PR, but this has been brought up again in https://externals.io/message/111744

@duskwuff - did you have any interest in creating an RFC for is_list(mixed $value): bool, or objections for others to use this implementation as the basis for an RFC?

@@ -982,6 +982,10 @@ ZEND_BEGIN_ARG_WITH_RETURN_TYPE_INFO(arginfo_is_array, _IS_BOOL, 0)
ZEND_ARG_INFO(0, var)
ZEND_END_ARG_INFO()

ZEND_BEGIN_ARG_WITH_RETURN_TYPE_INFO(arginfo_is_list, _IS_BOOL, 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

php has changed to put stubs in basic_functions.stub.php - the basic_functions_arginfo.h file is automatically regenerated from this during make

@duskwuff
Copy link
Contributor Author

duskwuff commented Sep 1, 2020

@TysonAndre I'd almost forgotten about this!

I don't have the time to draft an RFC personally, but I'd be delighted if someone else wanted to take that up.

One extra bit of discussion material, though: is_callable() is another is_* function which tests whether its argument fits a particular pattern of values. It's even a pseudotype -- a function can be declared as returning callable -- but not a full type; gettype($callable) will return either "string" or "array". That's exactly the sort of place I see list fitting into the language.

TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Sep 3, 2020
This function tests if an array contains only sequential integer keys. While
this isn't an official type, this usage is consistent with the community usage
of "list" as an annotation type, cf.
https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists

Rebased version of php#4886

- Use .stub.php files
- Add opcache constant evaluation when argument is a constant

Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
Co-Authored-By: Dusk <dusk@woofle.net>
@Girgias
Copy link
Member

Girgias commented Nov 4, 2020

This looks to have gone stale and has a more recent PR open for the same feature #6070, therefore closing this one.

@Girgias Girgias closed this Nov 4, 2020
TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Dec 12, 2020
This function tests if an array contains only sequential integer keys. While
this isn't an official type, this usage is consistent with the community usage
of "list" as an annotation type, cf.
https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists

Rebased version of php#4886

- Use .stub.php files
- Add opcache constant evaluation when argument is a constant

Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
Co-Authored-By: Dusk <dusk@woofle.net>
TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Dec 19, 2020
This function tests if an array contains only sequential integer keys. While
this isn't an official type, this usage is consistent with the community usage
of "list" as an annotation type, cf.
https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists

Rebased version of php#4886

- Use .stub.php files
- Add opcache constant evaluation when argument is a constant

Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
Co-Authored-By: Dusk <dusk@woofle.net>
TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Jan 6, 2021
This function tests if an array contains only sequential integer keys. While
list isn't an official type, this usage is consistent with the community usage
of "list" as an annotation type, cf.
https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists

Rebased and modified version of php#4886

- Use .stub.php files
- Add opcache constant evaluation when argument is a constant
- Change from is_list(mixed $value) to array_is_list(array $array)

RFC: https://wiki.php.net/rfc/is_list

Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
Co-Authored-By: Dusk <dusk@woofle.net>
TysonAndre added a commit to TysonAndre/php-src that referenced this pull request Jan 18, 2021
This function tests if an array contains only sequential integer keys. While
list isn't an official type, this usage is consistent with the community usage
of "list" as an annotation type, cf.
https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists

Rebased and modified version of php#4886

- Use .stub.php files
- Add opcache constant evaluation when argument is a constant
- Change from is_list(mixed $value) to array_is_list(array $array)

RFC: https://wiki.php.net/rfc/is_list

Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
Co-Authored-By: Dusk <dusk@woofle.net>
php-pulls pushed a commit that referenced this pull request Jan 20, 2021
This function tests if an array contains only sequential integer keys. While
list isn't an official type, this usage is consistent with the community usage
of "list" as an annotation type, cf.
https://psalm.dev/docs/annotating_code/type_syntax/array_types/#lists

Rebased and modified version of #4886

- Use .stub.php files
- Add opcache constant evaluation when argument is a constant
- Change from is_list(mixed $value) to array_is_list(array $array)

RFC: https://wiki.php.net/rfc/is_list

Co-Authored-By: Tyson Andre <tysonandre775@hotmail.com>
Co-Authored-By: Dusk <dusk@woofle.net>

Closes GH-6070
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.