Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add concatenation operations to Slice #10157

Closed
watzon opened this issue Dec 30, 2020 · 6 comments · Fixed by #12081
Closed

Add concatenation operations to Slice #10157

watzon opened this issue Dec 30, 2020 · 6 comments · Fixed by #12081

Comments

@watzon
Copy link
Contributor

watzon commented Dec 30, 2020

This is a reopening of the conversation from #5826 to see if minds may have changed. Currently it's not possible to concatenate slices, with the rationale being "we have IO for that". However there are some cases where that could add several lines to a function and actually make it harder to understand what's going on. Take this example:

def calc_key(auth_key, msg_key, client = nil)
  x = client ? 0 : 8
  
  key1 = IO::Memory.new
  key1.write(msg_key)
  key1.write(auth_key[x..(x + 36)])

  key2 = IO::Memory.new
  key2.write(auth_key[(x + 40)..(x + 76)])
  key2.write(msg_key)

  sha256a = @sha256.digest(key1.to_slice)
  sha256b = @sha256.digest(key2.to_slice)

  aes_key = IO::Memory.new(32)
  aes_iv = IO::Memory.new(32)

  aes_key.write(sha_256a[0..8])
  aes_key.write(sha_256b[8..24])
  aes_key.write(sha_256a[24..32])

  aes_iv.write(sha_256b[0..8])
  aes_iv.write(sha_256a[8..24])
  aes_iv.write(sha_256b[24..32])

  { aes_key.to_slice, aes_iv.to_slice }
end

I know it's a long example, but it's a real one. And a common one when it comes to dealing with cryptography. With slice concatenation it becomes so much simpler:

def self.calc_key(auth_key, msg_key, client = nil)
  x = client ? 0 : 8
  sha256a = @sha256.digest(msg_key + auth_key[x..(x + 36)])
  sha256b = @sha256.digest(auth_key[(x + 40)..(x + 76)] + msg_key)

  aes_key = sha_256a[0..8] + sha_256b[8..24] + sha_256a[24..32]
  aes_iv = sha_256b[0..8] + sha_256a[8..24] + sha_256b[24..32]

  { aes_key, aes_iv }
end

I feel like the second example is a bit easier to follow, and since a Slice(T)#+ operation would likely be using memcpy operations under the hood I'd imagine it would be faster too (though I could definitely be wrong). This would also be super trivial to add.

Yes, it returns a new Slice, just like Array(T)#+ returns a new Array. For more performance when concatenating it would also be nice to have a Slice(T)#concat method which would do basically the same thing, but only create one intermediary slice. The + method could then just use that.

@watzon
Copy link
Contributor Author

watzon commented Dec 30, 2020

Here's an example of Slice.concat and Slice#+ being used. The docs should probably come with a disclaimer that if you're adding more than 2 slices it would be more efficient to use concat since + has to make an intermediary slice for each operation.

https://carc.in/#/r/a6x9

@j8r
Copy link
Contributor

j8r commented Dec 30, 2020

@watzon your implementation does not work for unions:

a = Bytes[1, 2, 3, 4]
b = Slice[5, 6, 7, 8]
c = Bytes[9, 10, 11, 12]

puts Slice(Int32 | UInt8).concat(a, b, c)
puts a + b

As @Blacksmoke16 suggested, Slice.join can be a better name, like File.join, and avoid confusion with the instance methods #concat (used in Array, Set, Deque and CSV).

@watzon
Copy link
Contributor Author

watzon commented Dec 30, 2020

Slice.join would be fine, but should unions be supported? Idk how that would work.

@j8r
Copy link
Contributor

j8r commented Dec 30, 2020

It is possible, because it works for Array, see https://github.com/crystal-lang/crystal/blob/669db89/src/array.cr#L328.

@straight-shoota
Copy link
Member

straight-shoota commented Dec 30, 2020

Slice#+(other : Slice) and Slice.concat sound good 👍 Using a type union should be fine.

@HertzDevil
Copy link
Contributor

HertzDevil commented May 29, 2021

Speaking of #+, I am not a big fan of the existing #+(offset : Int) method; it alludes to pointer-like semantics when self[offset..] expresses the intent more clearly. In fact I initially thought that method would keep the slice's size.

IMO #concat(*others : Slice) is slightly more idiomatic than .concat(*slices : Slice). But I'm not sure if #concat is the best name for this because all other #concat methods in the standard library mutate self.

If we define additionally Slice.concat(slices : Indexable(Slice)) then we could also use it to optimize Indexable(Slice)#sum. (It cannot be done on Enumerable because the returned slice's size must be precomputed prior to iteration.) But what should happen to a Slice(Slice) in that case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants