-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
protoc-gen-go: support generating structs with non-pointer message fields #1225
Comments
We consistently represent proto messages as a pointer to a Go struct. A It is not, in general, safe to copy a proto message by value. Messages often contain a mutex, which cannot be copied. In the latest version of the code generator, messages embed a Given that messages shouldn't be shallow copied, there's no safe way to append a message to a I understand the motivation for this request, but the choice to represent messages as pointers was made long ago and is not something that can be easily undone.
This is not true as of |
I acknowledge that the path to a resolution here is not straightforward or obvious, but that doesn't mean we shouldn't try or that the decision was correct. This is a show-stopping design flaw in the ecosystem as far as I am concerned. I hear your points and don't contest any of them. I understand your hesitation to drive forward what will undoubtedly be a large effort. I've created this issue in order to collaborate to move the community forward onto the decidedly superior library you and many other have worked hard to create. I am regularly very sad about the fact that I cannot leverage this great work. However, the performance loss due to the allocations would be unacceptable. At this point the message the go community is hearing is that protobuf and grpc by extension does not care about supporting idiomatic performance sensitive applications. That seems somewhat farfetched given the common discussion about grpc's efficiency. I've been through the new APIs and think that from a usability perspective, roughly all of the gogoproto extensions can be worked around (with varying degrees of pain) other than this one. Composing structs to avoid allocation is a critical language feature. Perhaps it will take years to resolve this. Perhaps we won't be satisfied with proposals without generics which is certainly going to be years away. I, for one, will be very sad if we just let this sit there and the serious go community continue to ignore the obviously superior APIs and continues to be left behind by the ecosystem which will continue to evolve around the new APIs. Thanks for taking the time to reply. I understand the way in which challenging and unwarranted requests on open-source projects can feel taxing in light of all of the other things going on. Coming back to technical details, your point regarding shallow copying is well taken. I would like the revisit the decision to embed mutexes inside of generated message structs. I'm not familiar with where that is done. Philosophically I find it surprising given that the messages represent just plain old data. What use case demanded synchronization below rather than above the message? Your comment about the struct not implementing If these mutexes were not embedded inside of message implementations, what would prevent shallow copying or generally impede forward progress of a design in this direction? |
A Discussions about permitting them to be allocated as a single chunk, rather than piecemeal can sometimes be made, but unfortunately, protobuf code cannot practicably figure out ahead of time just how many repeated message of a field will actually appear when decoding, especially given the extremely diverse set of possible values for that message’s length from a single byte, up to 1024 bytes or more. If we’re talking here about “performance sensitive applications” then we all already know that even a small slice growth can cause an extremely bad performance drop:
The use of mutexes in the messages is principally for record keeping used in reflection, and in fast-path encoding implementations. Extracting all of that functionality seems infeasible at this point, and would likely cause worse performance regressions than the purposed gains from using embedded objects rather than pointers. |
Most proposals to generate message fields as simply Several thoughts:
|
In this is something very important: the motivation for this is the idea that the reason why many (most?) major go programs which use grpc/protobuf shun the official library is performance. Others in other forums have raised the point that regardless of the allocation concerns, the reflection-based decoding is dramatically less efficient than code-generated decoding. I’ve become convinced that we need some benchmarks on the topic. If indeed there’s more efficiency lost to the reflection based [de]serialization than the allocations, then that’s the conversation to be having first. On some level at the heart of this issue is a question as to whether the maintainers care that major projects deem this library to be unusable inefficient. Furthermore, the question is whether there is will to collaborate with the community to understand and rectify the problem in order to eliminate the widespread and growing fragmentation. In response to @dsnet
|
Several thoughts:
|
Huh. It just occurred to me something that I was always kind of already aware of, but had not really considered directly as a design principle involved here: protobufs kind of expect and direct you towards using flatter messages, rather than deeply nested messages. Just because of the overhead of embedding a small message in both time, allocations of memory, and compaction of binary output. But a lot of people are coming from entirely different (but also equally valid) design principles that say that hierarchies are good, and add form and context to a data structure. So, I think this might be one of those cases where someone is asking for a feature, when what they really want is efficient handling of their use case, and the feature would just be one way to accomplish that end goal. I think approaching this situation as “some programs, and design principles are going to have highly nested protobufs, what can we do to not make that suck?” Might get us somewhere faster than focusing heavily on the intricate details and bogged down on specifics of “how can we get embedded structs working?” I mean, maybe there just isn’t a better answer than embedded structs. But being able to scope the use case properly might get us to an answer more quickly. |
There's something nice in this sentiment but I'm not sure how much it applies. I certainly respect your point regarding not getting lost in the technical specifics but rather focusing on the problem at hand. However, my driving motivation here is desire to move https://github.com/cockroachdb/cockroach to the new v2 APIs and my perceived primary blocker is the efficiency loss. I don't think advising large projects to rework their messages is the answer. Go, as a langauge, prefers being explicit, and, as I see it, exploiting composition to represent that explicit nature. Dealing with parallel slices in go is laborious at best. If I take your claim to the extreme, maybe what you're getting at is that the idea of using structures which directly represent the layout of the protocol buffer as a go struct is unadvisable and we should be talking about something even more extreme. Perhaps a proposal direction that might interest you would be some sort of declarative syntax for mapping protocol buffer messages to go structs which have a dramatically different layout. That feels far fetched. I guess what I'd say is that while I appreciate the sentiments of the message on a high level, I don't know how to act on them or use them to guide this conversation towards a positive outcome. |
I understand your intended motivation: move cockroachdb to the new v2 API. But the original V1 api, and the V2 api both do not support deeply nested messages well at all. (And neither does the C++ or Java or python(?) implementations) Which is probably what drove cockroachdb to use gogoprotobuf in the first place. I am not proposing that cockroachdb rework all their messages to flatten them. That is obviously unreasonable, intractable, and hostile. But approaching a topic with, “I must have this feature” rather than “this is the goal I want to achieve” is often going to focus people on the why and how that specific feature can or cannot work, rather than focusing on your actual goal. You and I may not see a viable alternate solution here, but were not the entire set of all people open to this project. By framing this as the only solution acceptable, we have largely only ever been focusing on the narrow example use cases, where the actual issues at had do not manifest. Then proposed So, we have up until now only ever been touching on the most absurdly basic surface level of the feature, rather than on the deeper issue: “when I work with my messages it has to allocate 100s, or 1000s of submessages”. So, instead of addressing the problem of the performance of highly nested messages, we’ve focused instead on why slices and maps really need to use pointer values, rather than anything that actually matters to you or others. Now, I know a lot about Go and I know a lot about protobufs, but I’m not The One True Expert that can say definitely what needs to be done in order to solve your actual problem. What I am absolutely able to do is poke holes in the feature and the minimal repros demonstrating the feature and why they’re unnecessary at the scale suggested, because that what engineer do. Being able to see the problem at its deeper levels: many projects are using highly nested messages, and switching to this package will cause unreasonable performance loss, is an entirely different argument than “protobuf and grpc by extension does not care about supporting idiomatic performance sensitive applications.” Google is either using this package already extensively internally, or using a copy of this code and yet they’re still writing performance critical applications. So when you say things like that, we’re at an absolute loss of how to take it, because… like… but… we’re doing that… so what’s the difference? Without the key lynch pin of “many projects are using highly nested protobuf messages” many of us are not even going to understand what the problem is, because we simply are not ever getting anywhere near that situation. So… all that said. Let’s look at a way forward for these projects with an open-mind to all possible solutions rather than laser focusing on one purposed solution that many of us may not have even ever seen the merit of, because we have never been anywhere near the problem issue in the first place. |
I really don't see any possibility of us changing to or adding a representation that uses value semantics for messages. The design decision that messages have reference semantics is, so far as I know, pervasive across all protobuf implementations. It's certainly pervasive within the Go implementation. |
Okay. I acknowledge that my approach in this issue has been myopically focused on the issue of value types in my more general attempt to move cockroach, and ideally other large, existing projects towards the V2 API. I view the current schism over implementations as a major blocker to adoption of the new API and thus to investment in its ecosystem by the sizable number of contributors working on these large projects. Is there appetite for an issue to more generally discuss and focus on the v1 vs. v2 API schism and the properties of There was an alternative considered, which is to facilitate better object pooling, which I'll now investigate in more depth. #1192 does not directly talk about pooling and maybe I'm wrong to read in the idea that pooling is related there. I'll propose, however, that having a sync.Pool for each message type, at least each message type in use, would likely help with the allocation overhead at the cost of tracking lifecycles and walking messages to return their members to the pool. I anticipate #280 is also going to be a part of allaying fears of moving to this library from |
The PR is trying to migrate the api package from gogo to google.golang.org/protobuf because the grpc-gateway runtime has been upgraded to use protobuf-go v2 API. It's impossible to use gogo with new version grpc-gateway and we can't re-generate all the proto codes by protobuf-go because the raftpb uses gogoproto.nullable to use non-pointer for message fileds, which protobuf-go doesn't support [1]. Fortunately, both `api/etcdserverpb/rpc.proto` and `server/etcdserver/api/v3electionpb/v3election.proto` doesn't use the `gogoproto.nullable` features to message filed. For the migration, we can just re-generate code for the following the protos as the approach: * api/**/*.proto * server/lease/leasepb/*.proto * server/etcdserver/api/snap/snappb/*.proto * server/etcdserver/api/v3election/v3electionpb/*.proto * server/etcdserver/api/v3lock/v3lockpb/*.proto Highlight the change for proto: * Use `protoc-gen-go-vtproto` to generate Marshal/Unmarshal/Size code for `gogoproto.{marshaler,unmarshaler,size,}_all` interface, instead of using proto package; * There is no way to support `gogoproto.goproto_enum_prefix_all`. So in this patch, I add `*.pb.ext.go` to add the const alias without prefix manually. Maybe we can build a proto binary as generator in the future; * Both `api/etcdserverpb/etcdserver.proto` and `server/etcdserver/api/snap/snappb/snap.proto` use `gogoproto.nullable = true` for builtin type. In this patch, I upgrade the proto syntax from `2` to `3` with `optional` features. We can re-generate the optional fileds as pointer by `--experimental_allow_proto3_optional`. Since it's syntax upgrade, the marshal methods is also changed. For instance, in proto3 syntax, if the bool is empty, the marshaler won't encode it [2]. It's compatible. Highlight the change for grpc-gateway: * The grpc-gateway/third_party/googleapis has been removed. We need to download the https://github.com/googleapis/googleapis.git to get the target proto. * The protobuf-go won't generate json tag for `oneOf` filed [3]. We have to use `sed` in `scripts/genproto.sh` to append the `tag` before upstream accepts the change [4]. * Both `Hash` and `HashKV` shares the one URL `/v3/maintenance/hash`. New grpc-gateway openapi can detect the error. So in this patch, we should use `/v3/maintenance/hashkv` for `HashKV`. * New grpc-gateway always renders default value in json payload. In this patch, we use set the `EmitUnpopulated: false`. * The error message of grpc-gateway has been changed. In v1.16.0, the error message is rendered by [6]. In v2.7.0, the `error` string has been removed [7]. [1]: <golang/protobuf#1225> [2]: <https://github.com/protocolbuffers/protobuf-go/blob/fc47fdd3d3fca5283fa9428ac94cf730236e4ca3/internal/impl/codec_gen.go#L74> [3]: <https://github.com/protocolbuffers/protobuf-go/blob/fc47fdd3d3fca5283fa9428ac94cf730236e4ca3/cmd/protoc-gen-go/internal_gengo/main.go#L814> [4]: <golang/protobuf#140> [5]: <https://github.com/grpc-ecosystem/grpc-gateway/blob/40ee80cbcc0751e0c943d4abc99ca4bd196bec3b/docs/docs/development/grpc-gateway_v2_migration_guide.md?plain=1#L92> [6]: <https://github.com/grpc-ecosystem/grpc-gateway/blob/v1.16.0/internal/errors.proto#L9> [7]: <https://github.com/googleapis/go-genproto/blob/e85fd2cbaebc/googleapis/rpc/status/status.pb.go#L46> Signed-off-by: Wei Fu <fuweid89@gmail.com>
The PR is trying to migrate the api package from gogo to google.golang.org/protobuf because the grpc-gateway runtime has been upgraded to use protobuf-go v2 API. It's impossible to use gogo with new version grpc-gateway and we can't re-generate all the proto codes by protobuf-go because the raftpb uses gogoproto.nullable to use non-pointer for message fileds, which protobuf-go doesn't support [1]. Fortunately, both `api/etcdserverpb/rpc.proto` and `server/etcdserver/api/v3electionpb/v3election.proto` doesn't use the `gogoproto.nullable` features to message filed. For the migration, we can just re-generate code for the following the protos as the approach: * api/**/*.proto * server/lease/leasepb/*.proto * server/etcdserver/api/snap/snappb/*.proto * server/etcdserver/api/v3election/v3electionpb/*.proto * server/etcdserver/api/v3lock/v3lockpb/*.proto Highlight the change for proto: * Use `protoc-gen-go-vtproto` to generate Marshal/Unmarshal/Size code for `gogoproto.{marshaler,unmarshaler,size,}_all` interface, instead of using proto package; * There is no way to support `gogoproto.goproto_enum_prefix_all`. So in this patch, I add `*.pb.ext.go` to add the const alias without prefix manually. Maybe we can build a proto binary as generator in the future; * Both `api/etcdserverpb/etcdserver.proto` and `server/etcdserver/api/snap/snappb/snap.proto` use `gogoproto.nullable = true` for builtin type. In this patch, I upgrade the proto syntax from `2` to `3` with `optional` features. We can re-generate the optional fileds as pointer by `--experimental_allow_proto3_optional`. Since it's syntax upgrade, the marshal methods is also changed. For instance, in proto3 syntax, if the bool is empty, the marshaler won't encode it [2]. It's compatible. Highlight the change for grpc-gateway: * The grpc-gateway/third_party/googleapis has been removed. We need to download the https://github.com/googleapis/googleapis.git to get the target proto. * The protobuf-go won't generate json tag for `oneOf` filed [3]. We have to use `sed` in `scripts/genproto.sh` to append the `tag` before upstream accepts the change [4]. * Both `Hash` and `HashKV` shares the one URL `/v3/maintenance/hash`. New grpc-gateway openapi can detect the error. So in this patch, we should use `/v3/maintenance/hashkv` for `HashKV`. * New grpc-gateway always renders default value in json payload. In this patch, we use set the `EmitUnpopulated: false`. * The error message of grpc-gateway has been changed. In v1.16.0, the error message is rendered by [6]. In v2.7.0, the `error` string has been removed [7]. [1]: <golang/protobuf#1225> [2]: <https://github.com/protocolbuffers/protobuf-go/blob/fc47fdd3d3fca5283fa9428ac94cf730236e4ca3/internal/impl/codec_gen.go#L74> [3]: <https://github.com/protocolbuffers/protobuf-go/blob/fc47fdd3d3fca5283fa9428ac94cf730236e4ca3/cmd/protoc-gen-go/internal_gengo/main.go#L814> [4]: <golang/protobuf#140> [5]: <https://github.com/grpc-ecosystem/grpc-gateway/blob/40ee80cbcc0751e0c943d4abc99ca4bd196bec3b/docs/docs/development/grpc-gateway_v2_migration_guide.md?plain=1#L92> [6]: <https://github.com/grpc-ecosystem/grpc-gateway/blob/v1.16.0/internal/errors.proto#L9> [7]: <https://github.com/googleapis/go-genproto/blob/e85fd2cbaebc/googleapis/rpc/status/status.pb.go#L46> Signed-off-by: Wei Fu <fuweid89@gmail.com>
Is your feature request related to a problem? Please describe.
An important property of go is the ability to compose structs in order to reduce the number of allocation and improve access locality. This is at odds with languages like Java which (until project Valhalla) required that each instance live independently on the heap. Java's runtime specialization and JIT compilation helps it to mitigate that overhead. Protobuf fields (as of proto3, at least) can be omitted in serializations, a very important property for forwards compatibility. In order to support the optional nature of fields,
protoc-gen-go
generates fields as pointers. This (combined with storing unknown fields) allows a message to round-trip through deserialization back to serialization without losing information.The language guide indicates specific zero values for types other than messages which permits the compiler to use non-pointer types for fields of those types.
Consider the following contrived example:
This will generate the below struct:
This issue considers two different cases in which the protoc-gen-go compiler generates struct pointer types as opposed to struct value types in generated structs: message fields, collection fields which utilize messages (repeated, map). The above generated struct is unfortunate for two reasons.
The first is that the
Name
field has a pointer. As discussed above, this, today, is important so that upon marshaling the library can know whether to serialize a zero-value message for the field or to not serialize the message at all. In the solutions section we'll discuss how we can store information as to whether this field is populated elsewhere in thePerson
struct.The second is that the
Friends
field is a[]*Person
rather than[]Person
. The motivation for this decision is not clear to me. It is illegal for an entry in that slice to be nil for the purposes of serialization. Callingproto.Marshal
with a nil entry will result in an errorproto: repeated field Friends has nil element
.Describe the solution you'd like
The proposal here is for messages and fields to have an option, which for now we'll call
(go.nullable)
which can be set to false to indicate that pointers should not be generated. Forrepeated
andmap
fields this should be trivial to implement and does not seem to break any semantics. For regular message fields, we'll need to have some mechanism to (a) determine at serialization time whether a field is zero-value but exists or should not be serialized at all (b) to mark a zero-valued field as not existing when creating a message in memory.For (a) I propose that we augment the unexported state of messages to include a bitmask to indicate whether a field is missing. At serialization time, if the library detects that a message has a zero value, it will consult this bitmask. The previous sentence is intentionally imprecise. In terms of implementation details, in the new
google.golang.org/protobuf
world, we have some flexibility to control the implementation of protoreflect.Message. What is critical is that when a "root" message exposes its fields through theRange
method, that it provideValue
implementations which return the proper value forIsValid
. This can be done by constructing the relevantMessage
values by telling them whether they are valid.For (b) I propose generating a method for each message field which is generated as a non-pointer to set its validity. Note that the validity only applies if the value of the message is zero. This will mean that in this case,
IsValid
will be more expensive as it will need to determine whether all fields of a given message are zero before consulting its validity bit.Describe alternatives you've considered
One discussion related to this has been to allow the protobuf library to pool allocations rather than to try to eliminate those allocations: #1192
Additional context
This topic has been discussed at least twice before. In #1142 the discussion hinged upon internals to the implementation rather than the more abstract problem. That discussion went nowhere. I re-raised this issue in the context of #52 (comment) which was the incorrect place to do so.
Another important thing to note is that the demand for this functionality is not theoretical. This is deemed as the most critical improvement of the now defunct https://github.com/gogo/protobuf/ library by engineers on https://github.com/cockroachdb/cockroach. Cockroach seems to not be the only project which is struggling to move off of that deprecated and unsupported fork (firecracker-microvm/firecracker-containerd#452).
The text was updated successfully, but these errors were encountered: