Add sub/superset checking methods to Hash #7500

Sija · 2019-03-03T03:18:28Z

Adds sub/superset checking methods to the Hash class (matching the ones in Set):

Hash#subset_of?(other : Hash)
Hash#proper_subset_of?(other : Hash)
Hash#superset_of?(other : Hash)
Hash#proper_superset_of?(other : Hash)

spec/std/hash_spec.cr

Sija · 2019-03-06T14:43:37Z

Is it GTG / anything else to do here?

src/hash.cr

spec/std/hash_spec.cr

straight-shoota · 2019-03-12T23:05:03Z

I have some doubts about this change. These methods (<,<=,>,>=) are usually used as comparison operators. Is it wise to overload them with a completely different meaning?
In math, set relationships are expressed with ⊂ and ⊃, whereas < and > are comparison operators.

Set relationship methods are also relevant for other collection types besides Hash. Maybe even every Enumerable? I could very well picture this for Array, but the thing is that that's also a Comparable. And we can't use the comparison operators simultaneously for both.

Sija · 2019-04-01T17:39:33Z

I have some doubts about this change. These methods (<,<=,>,>=) are usually used as comparison operators. Is it wise to overload them with a completely different meaning?
In math, set relationships are expressed with ⊂ and ⊃, whereas < and > are comparison operators.

Why are you doubting? ;) These operators are usually overloaded based on the context, thus I see nuffin' wrong in using them according to the most common scenario - they're being overloaded for Class to check for ancestry for instance (same as in Ruby btw). And I doubt we'd like to use ⊂ / ⊃ as method names, we ain't writing math here. Having those available in specs is also a boon I'd say.

Set relationship methods are also relevant for other collection types besides Hash. Maybe even every Enumerable? I could very well picture this for Array, but the thing is that that's also a Comparable. And we can't use the comparison operators simultaneously for both.

They're most needed IMO in places where you'd expect overlapping set of values, but only for types holding distinct value sets, like Set or Hash.

straight-shoota · 2019-04-02T08:21:38Z

I don't think the usefulness of {sub,super}set is limited to types which exclude duplicate values by design. You might as well use an Array to hold distinct values, which is totally fine if the algorithm ensures there won't be duplicates. That would likely be much more performant than using a Set.

Sija · 2019-04-07T12:50:16Z

You might as well use an Array to hold distinct values, which is totally fine if the algorithm ensures there won't be duplicates. That would likely be much more performant than using a Set.

I don't get it, Array is not designed to hold unique values, Set OTOH is, with this exact use-case in mind.

straight-shoota · 2019-04-07T13:48:33Z

Set ensures by its design that there can't be duplicate values. Array doesn't have this inherent property, but it can nevertheless be used to store distinct values. The uniqueness guarantee can be provided by how it's used. A very simple example for this would be arr << elem unless array.index?(elem).
Using an Array for this can be preferable over Set for example for performance or other characteristics such as sortability.

Sija · 2019-04-08T10:27:30Z

Array doesn't have this inherent property, but it can nevertheless be used to store distinct values. The uniqueness guarantee can be provided by how it's used. A very simple example for this would be arr << elem unless array.index?(elem).

Yeah, but why would you do that given that there's Set?

Using an Array for this can be preferable over Set for example for performance or other characteristics such as sortability.

I'd say it's pretty weird to use Array over Set for the exact use-case Set is designed for. If performance is sub-par it should be improved if anything. Sortability for the time being is the same, since Set uses Hash under-the-hood - which is ordered.

straight-shoota · 2019-04-08T13:10:36Z

A Set is per definition unordered and shouldn't be treated as ordered. Anyway, it's always insertion ordered. An array can be sorted.

But I don't think this discussion about unrelated details leads us much further.

IMHO we simply shouldn't apply different semantics to comparison operators on different collection types.
Types like Set or Hash (or think of a potential OrderedSet for example) could as well be comparable and provide sub/superset operations at the same time.

straight-shoota · 2019-11-27T11:24:15Z

We should just assign predicate names #superset?, #proper_superset?, #subset?, #proper_subset? which are already used by Set.

Sija · 2019-11-27T11:29:17Z

@straight-shoota I'd like to have ability to use it in specs with .should be > {"foo" => "bar"} notation, which would be impossible if we'd choose to make 'em a regular methods.

straight-shoota · 2019-11-27T14:09:00Z

That's a valid point. But I wouldn't want to risk conflicts in API methods just for improving spec usage.

Anyway, I've been thinking about improving the expectations API to allow making a matcher from arbitrary methods, not just comparators. This would be helpful for many other use cases as well. For example Set{1, 2, 3}.should &.superset?(Set{1, 2}) could work. But it's really tricky and I'm not sure if there can even be an easy-to-use generic solution.

asterite · 2019-11-27T15:15:11Z

It seems this is actually pretty easy to implement:

require "spec"

module Spec
  module ObjectExtensions
    def should_be(file = __FILE__, line = __LINE__)
      value = yield self
      unless value
        fail("Assertion failed", file, line)
      end
    end
  end
end

describe Array do
  it "is empty" do
    a = [1, 2, 3]
    a.should_be &.empty?
  end
end

https://play.crystal-lang.org/#/r/830q

I think a.should_be &.empty? is a bit more readable than a.should &.empty?, but both could work. Then we should have a.should_not_be &.empty?.

The only problem I see with this is that it can be confused with a.should be (note the space). I don't think there's a way to implement it with a space.

Another problem is that we can't print a friendly error message, like "expected to be empty", but I don't think that's a big problem because the line where the error happened is still shown so one can deduce what the failure is.

An advantage of using should is that one could theoretically do a.should { |obj| some_assertion_using obj } but I would expect this new feature to be used mostly, or only, with straight bang methods.

Thoughts?

This could also be discussed in a separate issue, but I think this would be fantastic to have. Writing a.should_be &.empty? is so much easier and readable than a.empty?.should be_true.

straight-shoota · 2019-11-27T15:25:01Z

I've experimented with that used an expectation type wrapping a proc. It works, but it's not as nice. Didn't figure that the block could just be executed directly and you don't need to worry about proc generics 🤦‍♂

should_be could technically be a macro and capture the block's source for display in the failure message.

asterite · 2019-11-27T15:30:00Z

I've experimented with that used an expectation type wrapping a proc. It works, but it's not as nice. Didn't figure that the block could just be executed directly and you don't need to worry about proc generics 🤦‍♂

I started writing it like that too, and later realized it could be executed directly! 😅

should_be could technically be a macro and capture the block's source for display in the failure message.

I'm not sure how, macros don't work as instance methods. Meaning that a.should_be can never me a macro call, only should_be or Foo.should_be.

Well, we could think about what an instance macro is, but it's an entirely different, bigger subject.

asterite · 2019-11-27T15:48:42Z

I wanted to implement this but in some spec files we have:

File.exists?(some_file).should be_true

With this new feature it would be written like this:

File.should_be &.exists?(some_file)

Shorter, but uglier.

Maybe we could name it assert? So we'd have:

a.assert &.empty?
File.assert &.exists?(...)
a.assert &.includes?(3)
ch.assert &.closed?
reader.assert &.has_next?
'a'.assert &.lowercase?

I think everything reads fine with assert.

asterite · 2019-11-27T16:00:19Z

Then also add .refute. Plus the confusion with .should be is gone.

straight-shoota · 2019-11-27T16:07:23Z

Let's discuss this in a dedicated issue.

Sija · 2020-11-22T17:21:56Z

I'd like to revisit this PR, as its main purpose is to add support for the sub/superset checking to Hash - either via overloading equality operators or dedicated methods. fwiw, Ruby uses equality operators for this.

Could we settle on some course of action?

I see couple of options:

Use equality operators (i.e. leave the PR as-is)
Same as above + do the same for Set
Use #sub/superset? and #proper_sub/superset?
Drop this PR altogether

Any thoughts?

straight-shoota · 2020-11-22T22:07:10Z

I still wouldn't want to compromise API integrity, so number 3.

Sija · 2020-12-03T13:59:38Z

@straight-shoota I've renamed the methods and now they're 1:1 with the ones found in Set.

straight-shoota · 2020-12-03T14:53:04Z

I just realized these methods apply in exactly the opposite direction to what I might expect.
I would probably read a predicate method subset?(b) as "is b a subset? (of the receiver)", i.e. b ⊂ a But instead a.subset?(b) returns true if a ⊂ b.

These semantics are not new. They are the same as in Set and based on Ruby's. I suspect they have it that way to have the same operand relation as with the comparison operators. But it's really confusing. When reading the code, you can't beyond doubt recognize which way it is.

So maybe we should rather introduce these methods here under a differnt, less ambiguous name. And rename those in Set.
Adding a simple _of suffix would fix this. a.subset_of?(b) clearly means a ⊂ b.

Sija · 2020-12-03T15:27:03Z

@straight-shoota Makes sense but I'd start with renaming (and deprecating) methods in Set in a separate PR, followed by changes in here.

straight-shoota · 2020-12-03T15:40:44Z

I've just opened a separate issue for this: #10020 And ideally we'd agree on that, then change this PR + add one for Set.

src/hash.cr

Sija · 2021-02-12T22:24:47Z

Can this get a 2nd approval?

Sija · 2021-04-10T10:31:31Z

🏓 /cc @sdogruyol

sdogruyol

Thank you @Sija 🙏

yxhuvud reviewed Mar 3, 2019

View reviewed changes

spec/std/hash_spec.cr Outdated Show resolved Hide resolved

straight-shoota mentioned this pull request Mar 3, 2019

Add Hash#includes?(other : Hash) overload #7499

Closed

Sija force-pushed the feature/hash-sub-superset-operators branch from 9f035f3 to 79e465b Compare March 5, 2019 18:02

straight-shoota reviewed Mar 6, 2019

View reviewed changes

src/hash.cr Outdated Show resolved Hide resolved

src/hash.cr Outdated Show resolved Hide resolved

src/hash.cr Outdated Show resolved Hide resolved

spec/std/hash_spec.cr Show resolved Hide resolved

Sija force-pushed the feature/hash-sub-superset-operators branch 2 times, most recently from aa4beb4 to de59f20 Compare March 6, 2019 20:46

straight-shoota added kind:feature topic:stdlib:collection labels Mar 12, 2019

asterite mentioned this pull request Nov 27, 2019

[RFC] Spec: add a way to check that a method's value is truthy or falsey #8527

Open

Sija force-pushed the feature/hash-sub-superset-operators branch from de59f20 to 955fae9 Compare December 3, 2020 13:58

Sija changed the title ~~Implement Hash#{<,<=,>,>=} operators for sub/superset checking~~ Add sub/superset checking methods to Hash Dec 3, 2020

straight-shoota mentioned this pull request Dec 3, 2020

Rename {super,sub} set methods to clearly indicate relation of operands #10020

Closed

Sija added 3 commits January 18, 2021 13:17

Implement Hash#{<,<=,>,>=} operators for sub/superset checking

0b14a48

Change the API to use formal naming instead of operator overloads

81dc116

Align method names with changes from crystal-lang#10020

465e548

Sija force-pushed the feature/hash-sub-superset-operators branch from 955fae9 to 465e548 Compare January 18, 2021 12:20

Sija requested a review from straight-shoota January 18, 2021 12:21

straight-shoota requested changes Jan 18, 2021

View reviewed changes

src/hash.cr Outdated Show resolved Hide resolved

Sija requested a review from straight-shoota January 18, 2021 13:33

Add suggestion from code review

5d8c8c6

Sija force-pushed the feature/hash-sub-superset-operators branch from 398b611 to 5d8c8c6 Compare January 18, 2021 13:33

Add specs for edge case with both nil values

cd23add

straight-shoota approved these changes Jan 18, 2021

View reviewed changes

sdogruyol approved these changes Apr 11, 2021

View reviewed changes

straight-shoota added this to the 1.1.0 milestone Apr 13, 2021

straight-shoota merged commit 9105f58 into crystal-lang:master Apr 16, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add sub/superset checking methods to Hash #7500

Add sub/superset checking methods to Hash #7500

Sija commented Mar 3, 2019 •

edited

Loading

Sija commented Mar 6, 2019

straight-shoota commented Mar 12, 2019

Sija commented Apr 1, 2019

straight-shoota commented Apr 2, 2019

Sija commented Apr 7, 2019

straight-shoota commented Apr 7, 2019

Sija commented Apr 8, 2019

straight-shoota commented Apr 8, 2019

straight-shoota commented Nov 27, 2019

Sija commented Nov 27, 2019 •

edited

Loading

straight-shoota commented Nov 27, 2019

asterite commented Nov 27, 2019

straight-shoota commented Nov 27, 2019

asterite commented Nov 27, 2019

asterite commented Nov 27, 2019

asterite commented Nov 27, 2019

straight-shoota commented Nov 27, 2019

Sija commented Nov 22, 2020

straight-shoota commented Nov 22, 2020

Sija commented Dec 3, 2020

straight-shoota commented Dec 3, 2020

Sija commented Dec 3, 2020

straight-shoota commented Dec 3, 2020

Sija commented Feb 12, 2021

Sija commented Apr 10, 2021

sdogruyol left a comment

Add sub/superset checking methods to Hash #7500

Add sub/superset checking methods to Hash #7500

Conversation

Sija commented Mar 3, 2019 • edited Loading

Sija commented Mar 6, 2019

straight-shoota commented Mar 12, 2019

Sija commented Apr 1, 2019

straight-shoota commented Apr 2, 2019

Sija commented Apr 7, 2019

straight-shoota commented Apr 7, 2019

Sija commented Apr 8, 2019

straight-shoota commented Apr 8, 2019

straight-shoota commented Nov 27, 2019

Sija commented Nov 27, 2019 • edited Loading

straight-shoota commented Nov 27, 2019

asterite commented Nov 27, 2019

straight-shoota commented Nov 27, 2019

asterite commented Nov 27, 2019

asterite commented Nov 27, 2019

asterite commented Nov 27, 2019

straight-shoota commented Nov 27, 2019

Sija commented Nov 22, 2020

straight-shoota commented Nov 22, 2020

Sija commented Dec 3, 2020

straight-shoota commented Dec 3, 2020

Sija commented Dec 3, 2020

straight-shoota commented Dec 3, 2020

Sija commented Feb 12, 2021

Sija commented Apr 10, 2021

sdogruyol left a comment

Choose a reason for hiding this comment

Sija commented Mar 3, 2019 •

edited

Loading

Sija commented Nov 27, 2019 •

edited

Loading