
Unify & optimise serialization in v0.7 #141

Closed · wants to merge 3 commits

Conversation

@nskybytskyi commented Sep 21, 2021

You are probably wondering how I got here. Background: earlier today I opened #140, where @survived suggested either waiting a couple of months or opening a PR. The former is not a viable option for us, hence here comes the latter.

Resolves #138 and #140 in v0.7:

  • unify & optimise serialization of BigInt;
  • optimise serialization of Secp256k1Point (compressed form).

The main question here is whether you consider the serialization format part of the public API or not. I am very tempted to say that it is not, and that the only publicly guaranteed invariants are Deserialize(Serialize(Point)) = Point and Serialize(Deserialize(String)) = String.
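The two invariants above can be sketched as a quick roundtrip check. The hex codec and the fixed-size point below are stand-ins for illustration, not curv's actual `Secp256k1Point` or its real encoding:

```rust
// Illustrative roundtrip check for the two proposed invariants.
// A "point" is modelled as a 33-byte array (like a compressed secp256k1
// point); the codec is plain hex, purely as a stand-in.
fn serialize(point: &[u8; 33]) -> String {
    point.iter().map(|b| format!("{:02x}", b)).collect()
}

fn deserialize(s: &str) -> Option<[u8; 33]> {
    if s.len() != 66 || !s.chars().all(|c| c.is_ascii_hexdigit()) {
        return None; // reject malformed input instead of panicking
    }
    let mut out = [0u8; 33];
    for (i, chunk) in s.as_bytes().chunks(2).enumerate() {
        out[i] = u8::from_str_radix(std::str::from_utf8(chunk).unwrap(), 16).unwrap();
    }
    Some(out)
}

fn main() {
    let point = [2u8; 33]; // hypothetical compressed point
    // Invariant 1: Deserialize(Serialize(Point)) = Point
    assert_eq!(deserialize(&serialize(&point)), Some(point));
    // Invariant 2: Serialize(Deserialize(String)) = String
    let s = serialize(&point);
    assert_eq!(serialize(&deserialize(&s).unwrap()), s);
    println!("roundtrip ok");
}
```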

However, my opinion is clearly biased, and there are different ways to look at this. For example, one may argue that if they have a file on disk produced by the former version of Serialize, they won't be able to Deserialize it with the current version of Deserialize, which is not ideal, to say the least.

However, #139 appears to be a bug-fix either way, so the situation above can already occur in the current setting (when using different BigInt backends), so I assume it has to be addressed somehow...

@survived (Contributor) left a comment

@Sky-Nik, thanks for your contribution! I have a few suggestions that should make your code panic-less, and I also found opportunities to optimise performance and communication size even further!


```diff
-        Ok(Secp256k1Point::from_coor(&bx, &by))
+        let bp = BigInt::from_hex(&p).map_err(E::Error::custom)?;
+        Ok(Secp256k1Point::from_bytes(&bp.to_bytes()[1..33]).unwrap())
```

Malformed input coming from an untrusted source may cause this method to panic. A Deserialize implementation must validate its input and return an error if it's invalid. You need to:

  • Ensure that bp.to_bytes() is at least 33 bytes long (otherwise &bp.to_bytes()[1..33] may panic)
  • Properly handle the error returned by Secp256k1Point::from_bytes

Something like this should work:

Suggested change

```diff
-Ok(Secp256k1Point::from_bytes(&bp.to_bytes()[1..33]).unwrap())
+let bp = bp.to_bytes();
+if bp.len() < 33 {
+    return Err(E::Error::invalid_length(bp.len(), &"33 bytes"));
+}
+Secp256k1Point::from_bytes(&bp[1..33])
+    .map_err(|_e| E::Error::custom("invalid point"))
```

@survived commented Sep 22, 2021

Nit (not important): I think BigInt can be removed from this function by using hex::decode; this would save a couple of allocations. You can even remove all allocations in this function by using hex::decode_to_slice. That would optimise deserialization.
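A minimal std-only sketch of what `hex::decode_to_slice` from the `hex` crate does: decode directly into a caller-provided buffer, with no heap allocation during the decode. The function name `decode_hex_to_slice` here is illustrative, not curv's or the hex crate's actual API:

```rust
// Std-only stand-in for hex::decode_to_slice: writes decoded bytes into
// a caller-supplied slice and returns an error (never panics) on bad input.
fn decode_hex_to_slice(src: &str, dst: &mut [u8]) -> Result<(), String> {
    if src.len() != dst.len() * 2 {
        return Err(format!("expected {} hex chars, got {}", dst.len() * 2, src.len()));
    }
    for (i, chunk) in src.as_bytes().chunks(2).enumerate() {
        let pair = std::str::from_utf8(chunk).map_err(|e| e.to_string())?;
        dst[i] = u8::from_str_radix(pair, 16).map_err(|e| e.to_string())?;
    }
    Ok(())
}

fn main() {
    // A compressed secp256k1 point is 33 bytes, i.e. 66 hex characters.
    let hex_point = "02".repeat(33);
    let mut buf = [0u8; 33]; // stack buffer: the decode itself allocates nothing
    decode_hex_to_slice(&hex_point, &mut buf).unwrap();
    assert_eq!(buf[0], 0x02);
    // Wrong length yields an error instead of a panic.
    assert!(decode_hex_to_slice("ab", &mut buf).is_err());
    println!("decoded without allocating");
}
```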


```diff
-        Ok(Secp256k1Point::from_coor(&bx, &by))
+        let bp = BigInt::from_hex(p).map_err(V::Error::custom)?;
+        Ok(Secp256k1Point::from_bytes(&bp.to_bytes()[1..33]).unwrap())
```

The same as in visit_map:

  • Ensure that bp.to_bytes() is large enough
  • Propagate the error returned by Secp256k1Point::from_bytes
  • Consider using hex::decode/hex::decode_to_slice over BigInt::from_hex

```diff
-        state.serialize_field("x", &self.x_coor().unwrap().to_hex())?;
-        state.serialize_field("y", &self.y_coor().unwrap().to_hex())?;
+        let mut state = serializer.serialize_struct("Secp256k1Point", 1)?;
+        state.serialize_field("p", &self.bytes_compressed_to_big_int().to_hex())?;
```
@survived commented Sep 22, 2021

Nit:

  1. Since the serialized Secp256k1Point structure consists of only one field, maybe just serialize it as a string rather than a struct? I.e. a simple serializer.serialize_str(...) can be used here.
  2. In curv v0.8 we serialize a point as raw bytes, not as hex-encoded bytes. This is less readable, but highly efficient for binary data formats like bincode. Simply put, it halves the size of a serialized point. Do you prefer the hex format?
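A back-of-envelope check of point 2 above: hex encoding exactly doubles a compressed point's 33 bytes, so switching to raw bytes halves the serialized size under a binary format like bincode. Pure arithmetic, no serde involved; the 33-byte array is a hypothetical compressed point:

```rust
// Compare the size of a 33-byte compressed point serialized as raw bytes
// versus as a hex string.
fn main() {
    let compressed_point = [0x02u8; 33]; // hypothetical compressed secp256k1 point
    let hex: String = compressed_point.iter().map(|b| format!("{:02x}", b)).collect();
    assert_eq!(compressed_point.len(), 33); // raw bytes
    assert_eq!(hex.len(), 66);              // hex form: exactly twice as large
    println!("raw = {} bytes, hex = {} bytes", compressed_point.len(), hex.len());
    // prints "raw = 33 bytes, hex = 66 bytes"
}
```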

@nskybytskyi (Author) replied:

  1. True, it is shorter and simpler than my idea of a newtype struct.
  2. As a matter of fact, I prefer raw bytes as well. I'll change the code soon.

@survived (Contributor) commented:

> The main question here is whether you consider the serialization format part of the public API or not

Indeed, we should freeze the serialization format at some point in the future. It's not expected that bumping curv from v0.7.1 to v0.7.2 will invalidate all serialized points/bigints. However, we do not currently follow any policy regarding the serialization format, so it should be considered unstable.

 - panic-less implementation with input validation;
 - slightly less verbose and more efficient serialization.
@nskybytskyi nskybytskyi marked this pull request as draft October 6, 2021 14:11
@survived (Contributor) left a comment

LGTM!

@nskybytskyi (Author) replied:

> LGTM!

I added only the basic requested changes yesterday and still want to change from hex to raw bytes, so please don't merge it in yet. Thank you for your patience. 🙏🏼

@survived commented Oct 7, 2021

Sure, take your time! I won't merge until the PR is marked as ready for review. I just noticed that there are no unwraps anymore. Converting to raw bytes is an important optimisation as well!

@survived commented:
The BigInt deserialization implementation has been improved in the latest curv; it was not compatible with serde_json. See PR #145. Points/scalars deserialization has changed as well: #143.

@nskybytskyi nskybytskyi closed this by deleting the head repository Dec 14, 2022