issue with re-declaring custom types #5

mimoo · 2022-01-03T16:51:11Z

Sometimes in OCaml, we can use the same custom type in different context.

For example, we use the CamlPointer custom type in different modules:

    decl_module!(w, env, "FieldVectors", {
        decl_module!(w, env, "Fp", {
            decl_type!(w, env, CamlPointer<T1> => "t");
            decl_type_alias!(w, env, "elt" => CamlFp);

            decl_func!(w, env, caml_pasta_fp_vector_create => "create");
            decl_func!(w, env, caml_pasta_fp_vector_length => "length");
            decl_func!(w, env, caml_pasta_fp_vector_emplace_back => "emplace_back");
            decl_func!(w, env, caml_pasta_fp_vector_get => "get");
        });

        decl_module!(w, env, "Fq", {
            decl_type!(w, env, CamlPointer<T1> => "t");
            decl_type_alias!(w, env, "elt" => CamlFq);

            decl_func!(w, env, caml_pasta_fq_vector_create => "create");
            decl_func!(w, env, caml_pasta_fq_vector_length => "length");
            decl_func!(w, env, caml_pasta_fq_vector_emplace_back => "emplace_back");
            decl_func!(w, env, caml_pasta_fq_vector_get => "get");
        });
    });

converts to

module FieldVectors = struct 

  module Fp = struct 
    type t
    type elt = Foundations.Fp.t
    external create : unit -> t = "caml_pasta_fp_vector_create"
    external length : t -> int = "caml_pasta_fp_vector_length"
    external emplace_back : t -> Foundations.Fp.t -> unit = "caml_pasta_fp_vector_emplace_back"
    external get : t -> int -> Foundations.Fp.t = "caml_pasta_fp_vector_get"
  end


  module Fq = struct 
    type t
    type elt = Foundations.Fq.t
    external create : unit -> t = "caml_pasta_fq_vector_create"
    external length : t -> int = "caml_pasta_fq_vector_length"
    external emplace_back : t -> Foundations.Fq.t -> unit = "caml_pasta_fq_vector_emplace_back"
    external get : t -> int -> Foundations.Fq.t = "caml_pasta_fq_vector_get"
  end

end

Redeclaring a type is usually not allowed in ocaml-gen, but it is permitted for custom types. The problem is that, the next time a type or a function from another module uses the CamlPointer type, it will be renamed as FieldVectors.Fq.t which is the last place it was declared. While not technically true, it is not great.

Ideally, we would want the generation code to panic when this happens.

It is not clear to me that there's a solution though, because I don't see how to differentiate a legitimate use-case from a non-legitimate use-case.
We could, technically, track all the declarations of the same custom type and give them a different nickname, but I don't think this it is less error-prone.

@imeckler proposed to maybe track the different declaration of a custom type by using the different concrete types used. I think this might be a good idea. For this to work, I think we'd only have to return a different value for unique_id() based on the concrete type used, and then declare the custom types with their concrete values only:

    decl_module!(w, env, "FieldVectors", {
        decl_module!(w, env, "Fp", {
-            decl_type!(w, env, CamlPointer<T1> => "t");
+           decl_type!(w, env, CamlPointer<Vec<Fp>> => "t");
            decl_type_alias!(w, env, "elt" => CamlFp);

but I might be missing some edge-cases... What if several modules want to use the same more-generic caml_pointer?

Actually, it's a bigger problem than what I originally thought. If you have a function that takes different arguments that are all essentially the same generic custom type:

fn thing(g1: CamlPointer<T1>, g2: Camlpointer<T2>, g3: CamlPointer<T3>)

The current solution will produce wrong bindings: all arguments will be of the type of the latest instance of

decl_type!(w, env, CamlPointer<T1> => "t");

Imagine that it is within the module T1, the generated OCaml will look something like that:

external thing : T1.t -> T1.t -> T1.t -> unit = "thing"

This will likely not work. For example, the type CamlPointer<T2> could be re-declared as T2.t and only creatable as a type T2.t.

What to do?

Solution 1: Prevent re-declaring custom types

The first solution now is to backtrack and prevent multiple usage of decl_type even in case where the type is a custom type. This is not great as you won't be able to distinguish between different flavors of CamlPointer.

Solution 2: Distinguish the same custom types when it is instantiated with different concrete parameters

Going with @imeckler 's idea, maybe instead of declaring custom types like so:

decl_type!(w, env, CamlPointer<T1> => "t");

where T1 is our fake generic type, we could declare it like so:

decl_type!(w, env, CamlPointer<SomeConcreteType> => "t");

or even like so:

decl_type!(w, env, CamlPointer<SomeConcreteType<T1>> => "t");

Even so, it is not a perfect solution. You might have situation where you want to split a non-generic custom type, or a generic custom type with the same concrete type (so really the same type), in different types on the OCaml side. I would argue that if you really want to do this, this distinction should also exist in Rust via wrapper types or something.

In the rest of this post I'll talk about solution 2.

Implementing solution 2

To make solution 2 work, we'd need to return a different unique_id based on the concrete type, only for custom types. Maybe we can do this by calculating a unique_id as the wrapping_multiplication/wrapping_addition/XOR of all cascading unique_ids. For example in CamlPointer we could generate the unique_id of that type as CamlPointer::unique_id() XOR SomeConcreteType::unique_id()

What's unique_id?

It's a unique id derived from the filename and line of code where a struct is defined. This means that a generic struct will always have the same unique_id no matter what concrete type it is instantiated with.

It is used in two places:

when declaring new types, it checks via the unique_id that it hasn't been declared before (and it panics if it was)
```
new_type(&mut self, ty: u128, name: &'static str)
```
when generating OCaml bindings, we need to rewrite types that have been renamed or relocated within some modules, for that we track types via their unique_id
```
get_type(&self, ty: u128, name: &str) -> String
```

The change to unique_id

A custom type needs to return a unique_id that combines its own unique_id with the unique_ids of its type parameters. That doesn't sound too complicated, but there's one issue: its type parameters have to do the same with their own type parameters. For example, the following are two different types:

CamlPointer<SomeType<This>>

and

CamlPointer<SomeType<AndThat>>

two considerations:

SomeType::unique_id() is called at runtime (when CamlPointer<SomeType<...>>::unique_id() is called)
SomeType needs to understand that it needs to do the same thing with its own type parameters. But this is only in the case where SomeType is a concrete type of a custom type. For this we can simply modify unique_id() to take an argument telling it that we want the result to use the type parameters own unique_id. This is getting confusing.

This won't work

Actually, this won't work, because when we generate that code (in the CustomType macro), we don't know yet if a type parameter is a concrete type or not. And even if it is a concrete type, it might not have a "unique_id".

So instead... we might want to just generate a unique_id out of the strings/idents that are contained in the type parameter s of our custom type.
But even this is not possible...

Backtracking

Wait. This is something that we might actually want to distinguish in Rust as well.
Maybe the solution is not in ocaml-gen, but in how we define functions that use different semantical CamlPointers. Maybe we should use different custom types for the different CamlPointer and call it a day...

Damn it this sounds like a fine solution.

re-reading this, I'm not convinced that it won't work... working on this in https://github.com/o1-labs/proof-systems/tree/mimoo/ocaml_custom

OK so trying to find a solution, I hit one dead end that's also not really a dead end but hear me out.

If I want to distinguish different instantiations of the same generic custom type in Rust, I can either:

use the unique_id() solution, but then the generic type parameters require a bound T: OcamlDesc. This is annoying because we might these might be foreign types.
use the idents of the concrete type parameters. I'm not sure if this would lead to issues, for example if some types are aliased or if two types share the same name. Although perhaps the full path X::type might help us distinguish the latter case, but I'm not 100% sure about that.

So I propose that we go with the previous solution I proposed, which is to just not have this feature and force people to write distinct types in Rust. If someone is interested to take a stab at it, we'd need some tests to make sure that solution 2 (which seems to be the right solution to pursue) doesn't have weird edge-cases.

The text was updated successfully, but these errors were encountered:

mimoo mentioned this issue Jan 3, 2022

[ocaml-gen] issue with re-declaring custom types o1-labs/proof-systems#172

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

issue with re-declaring custom types #5

issue with re-declaring custom types #5

mimoo commented Jan 3, 2022

issue with re-declaring custom types #5

issue with re-declaring custom types #5

Comments

mimoo commented Jan 3, 2022

What to do?

Solution 1: Prevent re-declaring custom types

Solution 2: Distinguish the same custom types when it is instantiated with different concrete parameters

Implementing solution 2

What's unique_id?

The change to unique_id

This won't work

Backtracking