Serialization and deserialization of hash-based collections should not re-use hashCode #1600

scabug · 2008-12-24T12:26:37Z

As explained in ticket #1387, the hashCode for tuples and case classes depends on the VM (they use getClass().hashCode()), so serialization and deserialization of immutable.HashSet breaks (probably other hash-based collections too, but I have not verified this).

Although I think it's worth fixing tuples and case classes, this issue is about changing hash-based collections so that serialization and deserialization work correctly even if the hashCode relies on some VM-specific computation.

The easiest way to do that is to follow java.util.HashMap: serialize the objects into an array and then insert them one by one into the Map on deserialization. The hashCodes are then recomputed on deserialization, so the problem goes away.
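To make the java.util.HashMap behaviour concrete, here is a minimal Java sketch. The names `RehashOnReadDemo`, `Key` and `vmSalt` are hypothetical and not part of any patch; `vmSalt` is only a stand-in for a VM-specific value such as getClass().hashCode(), changed between write and read to imitate a second VM.

```java
import java.io.*;
import java.util.HashMap;

final class RehashOnReadDemo {
    // Assumption for illustration: stands in for a value that differs between VMs.
    static int vmSalt = 17;

    static final class Key implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        Key(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).name.equals(name);
        }
        // The hash depends on vmSalt, so it changes when the "VM" changes.
        @Override public int hashCode() { return name.hashCode() * 31 + vmSalt; }
    }

    static byte[] serialize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.toByteArray();
    }

    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    /** Returns true if lookups still succeed after the salt changed between write and read. */
    static boolean lookupSurvivesSaltChange() {
        try {
            vmSalt = 17;
            HashMap<Key, Integer> map = new HashMap<>();
            map.put(new Key("a"), 1);
            map.put(new Key("b"), 2);
            byte[] data = serialize(map);

            vmSalt = 12345; // imitate deserializing in a different VM
            @SuppressWarnings("unchecked")
            HashMap<Key, Integer> copy = (HashMap<Key, Integer>) deserialize(data);

            // HashMap re-inserts every entry while reading, so the new hashes are used.
            return Integer.valueOf(1).equals(copy.get(new Key("a")))
                && Integer.valueOf(2).equals(copy.get(new Key("b")));
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupSurvivesSaltChange());
    }
}
```

A collection that wrote its bucket layout to the stream as-is would look for the keys in the wrong buckets after the salt change, which is the failure this ticket describes.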

scabug · 2008-12-24T12:26:37Z

Imported From: https://issues.scala-lang.org/browse/SI-1600?orig=1
Reporter: @ijuma
Attachments:

t1600.diff (created on Nov 30, 2009 7:04:50 PM UTC, 20566 bytes)

scabug · 2009-01-04T23:09:51Z

@odersky said:
I don't want to fix this myself. Any other takers?

scabug · 2009-01-05T01:44:32Z

@ijuma said:
I could supply a patch, but there are two questions:

What are the backwards compatibility requirements?
Is it worth doing this now, or better to do it for the new collections in 2.8.0?

scabug · 2009-01-05T06:24:07Z

@odersky said:
I think it's better to wait for 2.8.0, which will have looser backwards-compatibility requirements as well. I'd happily integrate a patch then!

scabug · 2009-11-29T06:49:19Z

@ijuma said:
I committed this to a github branch:

http://github.com/ijuma/scala/commit/b7b0459ae872701a9486d7692b76a353b02cbfbc

I will also attach a patch.

Commit comment follows for convenience:

Fix ticket #1600: Serialization and deserialization of hash-based collections should not re-use hashCode.

Implementation and tests are included for immutable.HashMap, mutable.HashMap, mutable.LinkedHashMap, immutable.HashSet, mutable.HashSet, immutable.LinkedHashSet.

The basic idea is that we store the size, load factor and elements during serialization and we rebuild the collection on deserialization. Note that this is not compatible with the previous serialization format. All @SerialVersionUIDs have been reset to 1.
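For readers who want the shape of this without opening the diff, the following is a rough Java sketch of the same idea, not the code from the patch. `RebuildingHashSet`, `RebuildDemo`, `SaltedKey` and `vmSalt` are hypothetical names; the bucket table is transient, writeObject stores only the size, load factor and elements, and readObject rebuilds the table by re-adding each element.

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical chained hash set; only size, load factor and elements are serialized. */
final class RebuildingHashSet<E> implements Serializable {
    private static final long serialVersionUID = 1L;

    // The bucket layout depends on hashCode, so it is never written to the stream.
    private transient List<E>[] table;
    private transient int size;
    private transient float loadFactor;

    RebuildingHashSet(int capacity, float loadFactor) { init(capacity, loadFactor); }

    @SuppressWarnings({"unchecked", "rawtypes"})
    private void init(int capacity, float loadFactor) {
        this.table = (List<E>[]) new List[Math.max(1, capacity)];
        this.loadFactor = loadFactor;
        this.size = 0;
    }

    private int indexOf(Object e) { return Math.floorMod(e.hashCode(), table.length); }

    int size() { return size; }

    boolean contains(Object e) {
        List<E> bucket = table[indexOf(e)];
        return bucket != null && bucket.contains(e);
    }

    boolean add(E e) {
        if (contains(e)) return false;
        if (size + 1 > table.length * loadFactor) grow();
        insert(e);
        return true;
    }

    private void insert(E e) {
        int i = indexOf(e);
        if (table[i] == null) table[i] = new ArrayList<>();
        table[i].add(e);
        size++;
    }

    private void grow() {
        List<E>[] old = table;
        init(old.length * 2, loadFactor); // resets size; insert() counts it back up
        for (List<E> bucket : old)
            if (bucket != null) for (E e : bucket) insert(e);
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject(); // no non-transient fields at present
        out.writeInt(size);
        out.writeFloat(loadFactor);
        for (List<E> bucket : table)
            if (bucket != null) for (E e : bucket) out.writeObject(e);
    }

    @SuppressWarnings("unchecked")
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        int n = in.readInt();
        float lf = in.readFloat();
        init((int) (n / lf) + 1, lf);
        // Re-adding recomputes every hashCode in the reading VM.
        for (int i = 0; i < n; i++) add((E) in.readObject());
    }
}

final class RebuildDemo {
    static int vmSalt = 17; // assumption: stand-in for a VM-specific hash component

    static final class SaltedKey implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        SaltedKey(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof SaltedKey && ((SaltedKey) o).name.equals(name);
        }
        @Override public int hashCode() { return name.hashCode() * 31 + vmSalt; }
    }

    static boolean roundTripAfterSaltChange() {
        try {
            vmSalt = 17;
            RebuildingHashSet<SaltedKey> set = new RebuildingHashSet<>(16, 0.75f);
            set.add(new SaltedKey("a"));
            set.add(new SaltedKey("b"));
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) { out.writeObject(set); }
            vmSalt = 12345; // imitate reading in a different VM
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                RebuildingHashSet<SaltedKey> copy = (RebuildingHashSet<SaltedKey>) in.readObject();
                return copy.size() == 2
                    && copy.contains(new SaltedKey("a")) && copy.contains(new SaltedKey("b"));
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripAfterSaltChange());
    }
}
```

Because iteration order follows the rebuilt bucket layout, the element order of the copy can differ from the original.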

I'm not particularly happy about the way _loadFactor is used in some cases, but that seemed like the lesser evil given the current interface.

WeakHashMap is not Serializable and it would not make sense to make it so. TreeHashMap has not been reintegrated yet. OpenHashMap has not been updated; I think that collection is flawed in many ways and should either be removed or reimplemented. I'll file a separate ticket for that.

Some tests in serialization.check were updated as the toString order after serialization can be different in some cases.

ant all.clean && ant dist finished successfully.

scabug · 2009-11-29T06:50:35Z

@ijuma said:
Sorry, the github link should be:

http://github.com/ijuma/scala/commit/5627e4b40a6cc1f086c6a43c38a6bb2da2f86ddd

scabug · 2009-11-30T19:04:50Z

@ijuma said:
Implementation and tests for this issue.

scabug · 2009-11-30T19:06:07Z

@ijuma said:
Rebased the patch and github branch and improved documentation:

http://github.com/ijuma/scala/commit/733eec09d25c0580ef65a32945b053da6c2d87e8

Assigning to scala_reviewer as suggested by Toni.

scabug · 2009-11-30T19:56:01Z

@adriaanm said:
Thanks for the patch!

scabug · 2009-12-01T13:31:04Z

@adriaanm said:
Paul, could you please shepherd this?

scabug · 2009-12-01T21:56:14Z

@ijuma said:
Paul committed this in r19964. Thanks, closing.

scabug closed this as completed May 18, 2011

scabug added the enhancement label Apr 6, 2017

scabug assigned paulp Apr 6, 2017

scabug mentioned this issue Apr 7, 2017

HashSet/HashMap should override readObject/writeObject to properly restore hashtables during deserialization. #2690

Closed

lrytz mentioned this issue Apr 5, 2018

immutable.{HashMap, ChampHashMap} should use a serialization proxy scala/collection-strawman#524

Closed
