Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

can't create a set of sets #19156

Open
jhh67 opened this issue Feb 2, 2022 · 5 comments
Open

can't create a set of sets #19156

jhh67 opened this issue Feb 2, 2022 · 5 comments

Comments

@jhh67
Copy link
Contributor

jhh67 commented Feb 2, 2022

Summary of Problem

I tried to create a set of sets of strings and got the following error. The type of the elements in the inner sets doesn't appear to matter.

$CHPL_HOME/modules/standard/Set.chpl:277: In method '_addElem':
$CHPL_HOME/modules/standard/Set.chpl:290: error: unresolved call '_ddata(chpl_TableEntry(string,nothing)).hash()'
note: this candidate did not match: owned.hash()
$CHPL_HOME/modules/internal/ChapelHashtable.chpl:220: note: because method call receiver with type '_ddata(chpl_TableEntry(string,nothing))'
note: is passed to formal 'this: owned'
$CHPL_HOME/modules/standard/Set.chpl:290: note: other candidates are:
note:   shared.hash()
$CHPL_HOME/modules/internal/ChapelHashing.chpl:107: note:   (borrowed object?).hash()
note: and 55 other candidates, use --print-all-candidates to see them
  $CHPL_HOME/modules/standard/Set.chpl:251: called as (set(set(string,false),false))._addElem(elem: set(string,false)) from method 'init='
  foo.chpl:1: called as (set(set(string,false),false)).init=(other: set(set(string,false),false))
note: generic instantiations are underlined in the above callstack

Steps to Reproduce

Source Code:

use Set;

var foo: set(set(string));

Compile command:

chpl --no-devel foo.chpl

Execution command:

NA

Associated Future Test(s):

Configuration Information

  • Output of chpl --version:
chpl version 1.26.0 pre-release (27f3d59788)
Copyright 2020-2022 Hewlett Packard Enterprise Development LP
Copyright 2004-2019 Cray Inc.
(See LICENSE file for more details)
  • Output of $CHPL_HOME/util/printchplenv --anonymize:
CHPL_TARGET_PLATFORM: darwin
CHPL_TARGET_COMPILER: clang
CHPL_TARGET_ARCH: x86_64
CHPL_TARGET_CPU: native
CHPL_LOCALE_MODEL: flat
CHPL_COMM: none *
CHPL_TASKS: qthreads
CHPL_LAUNCHER: none
CHPL_TIMERS: generic
CHPL_UNWIND: none
CHPL_MEM: jemalloc
CHPL_ATOMICS: cstdlib
CHPL_GMP: none *
CHPL_HWLOC: bundled
CHPL_RE2: bundled
CHPL_LLVM: none *
CHPL_AUX_FILESYS: none
  • Back-end compiler and version, e.g. gcc --version or clang --version:
Apple clang version 12.0.0 (clang-1200.0.32.29)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
  • (For Cray systems only) Output of module list:
@ben-albrecht
Copy link
Member

ben-albrecht commented Feb 2, 2022

In case it's not clear, the fix to this would be to add a set.hash() implementation to the Set module, similar to what we've recently done with bigint in #18562, for example. I wouldn't be surprised if there are other non-hashable types lurking in our standard library that need attention in a similar manner.

Just to demonstrate, the following compiles:

use Set;

// demonstrative purposes only, don't use at home
proc set.hash() {
  return 1;
}

var foo: set(set(string));

@aconsroe-hpe
Copy link
Contributor

One challenge with supporting a hash on a set is that they are mutable, so modifying a set would change its hash (unless you're using a trivial hash function) and that messes with the parent set's invariant. (Sidebar: I've never come across a hashtable impl that handles a value whose hash changes, has anyone else?)

We might be able to at least issue a better error message if we do a reflection and check if the key type has a hash method and issue a compilerError if not

@bradcray
Copy link
Member

bradcray commented Feb 8, 2022

One challenge with supporting a hash on a set is that they are mutable, so modifying a set would change its hash (unless you're using a trivial hash function)

In general when we're hashing things, we make a copy of the value being hashed, don't we? (and then don't give away refs to it easily?). I.e., if I do:

var x = 45;
mySet.add(x);
x = 33;

it's not as though I'm changing the integer value stored in mySet, nor would I expect to be able to look it up using the value 33 rather than 45. To me sets seem the similar, where I'm considering the value of a set to be "all the values that it describes" not its identity as an object. Similarly today we (almost) support hashing on arrays like [1, 2, 3] being different from [4, 5, 6] which in turn seems like an extension of hashing on a 3-field record or a 3-tuple of ints. Or have I got something wrong here?

@aconsroe-hpe
Copy link
Contributor

My brain was not in Chapel mode, this must be a worry I've brought from other languages. I see now that add takes the key by in and these give const ref. Only reaminaing consideration is choosing a good hash function which is order independent so hash({1, 2}) == hash({2, 1})

@bradcray
Copy link
Member

bradcray commented Feb 8, 2022

Only reaminaing consideration is choosing a good hash function which is order independent so hash({1, 2}) == hash({2, 1})

That's a good point. At first, I was thinking "our set only has one implementation, so should always see keys in a specific order", but that failed to take into account what can happen with collisions, deletions and insertions in different orders, etc.

Spitballing: We could sort the keys and then hash, but that would be $$ for large sets. We could take some sort of fingerprint of the set like the min, max, and number of values as a poor hash and then rely on == to disambiguate which would be correct but potentially result in more collisions. We could sample a larger number of values to improve that fingerprint.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants