Unvendor grpc/protobuf #90

Open
h-vetinari opened this issue Jan 10, 2023 · 20 comments
Comments

@h-vetinari
Member

h-vetinari commented Jan 10, 2023

I didn't realize that ray is building grpc as a vendored project. This would be a pretty obvious candidate for effing something up when using a newer grpcio.

The biggest problem with this is how hard it is (for me at least) to tell bazel to use "foreign" libraries.

Originally posted by @h-vetinari in #87 (comment)

@ngam

ngam commented Feb 9, 2023

Noticed the following in the logs (on main, passing builds, https://github.com/conda-forge/ray-packages-feedstock/runs/10863439621). Pretty interesting stuff!

2023-01-24T21:20:07.3483964Z INFO: Analyzed 2 targets (153 packages loaded, 21326 targets configured).
2023-01-24T21:20:07.3519661Z INFO: Found 2 targets...
2023-01-24T21:20:07.4415336Z [0 / 9] [Prepa] BazelWorkspaceStatusAction stable-status.txt
2023-01-24T21:20:16.3821713Z [14 / 1,961] Compiling src/google/protobuf/compiler/cpp/cpp_field.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:20:27.6915280Z [21 / 1,961] Compiling src/google/protobuf/compiler/cpp/cpp_message.cc; 6s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:20:37.1167440Z INFO: From Compiling src/google/protobuf/message_lite.cc:
2023-01-24T21:20:37.1191266Z In file included from /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/string.h:638,
2023-01-24T21:20:37.1192104Z                  from external/com_google_protobuf/src/google/protobuf/stubs/port.h:39,
2023-01-24T21:20:37.1192619Z                  from external/com_google_protobuf/src/google/protobuf/stubs/common.h:48,
2023-01-24T21:20:37.1193112Z                  from external/com_google_protobuf/src/google/protobuf/message_lite.h:45,
2023-01-24T21:20:37.1193612Z                  from external/com_google_protobuf/src/google/protobuf/message_lite.cc:36:
2023-01-24T21:20:37.1194141Z In function 'void* memcpy(void*, const void*, size_t)',
2023-01-24T21:20:37.1194953Z     inlined from 'uint8_t* google::protobuf::io::EpsCopyOutputStream::WriteRaw(const void*, int, uint8_t*)' at external/com_google_protobuf/src/google/protobuf/io/coded_stream.h:706:16,
2023-01-24T21:20:37.1196208Z     inlined from 'virtual uint8_t* google::protobuf::internal::ImplicitWeakMessage::_InternalSerialize(uint8_t*, google::protobuf::io::EpsCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/implicit_weak_message.h:84:28,
2023-01-24T21:20:37.1197520Z     inlined from 'bool google::protobuf::MessageLite::SerializePartialToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/message_lite.cc:412:30:
2023-01-24T21:20:37.1199102Z /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/bits/string3.h:51:33: warning: 'void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)' specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
2023-01-24T21:20:37.1200145Z    51 |   return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
2023-01-24T21:20:37.1200619Z       |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-01-24T21:20:40.0947346Z [37 / 1,961] Compiling src/google/protobuf/compiler/csharp/csharp_reflection_class.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:20:53.8654923Z [55 / 1,961] Compiling src/google/protobuf/compiler/java/java_map_field.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:21:09.8994672Z [74 / 1,961] Compiling src/google/protobuf/compiler/plugin.pb.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:21:27.9441159Z [95 / 1,961] Compiling src/google/protobuf/io/printer.cc; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:21:48.7495496Z [111 / 1,961] Compiling src/google/protobuf/compiler/objectivec/objectivec_primitive_field.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:22:13.6192243Z [138 / 1,961] Compiling src/google/protobuf/extension_set.cc; 2s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:22:41.9635614Z [186 / 2,150] Compiling src/compiler/node_generator.cc [for host]; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:23:14.0442354Z [231 / 2,150] Compiling src/google/protobuf/generated_message_tctable_lite.cc [for host]; 1s processwrapper-sandbox ... (2 actions running)
2023-01-24T21:23:50.5721139Z [268 / 2,150] Compiling src/google/protobuf/any.pb.cc [for host]; 0s processwrapper-sandbox ... (2 actions running)
2023-01-24T21:24:32.5789497Z [310 / 2,150] Compiling src/google/protobuf/compiler/java/java_primitive_field.cc [for host]; 1s processwrapper-sandbox ... (2 actions running)
2023-01-24T21:24:58.1737035Z INFO: From Compiling src/google/protobuf/message_lite.cc [for host]:
2023-01-24T21:24:58.1780300Z In file included from /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/string.h:638,
2023-01-24T21:24:58.1781514Z                  from external/com_google_protobuf/src/google/protobuf/stubs/port.h:39,
2023-01-24T21:24:58.1782102Z                  from external/com_google_protobuf/src/google/protobuf/stubs/common.h:48,
2023-01-24T21:24:58.1791404Z                  from external/com_google_protobuf/src/google/protobuf/message_lite.h:45,
2023-01-24T21:24:58.1792308Z                  from external/com_google_protobuf/src/google/protobuf/message_lite.cc:36:
2023-01-24T21:24:58.1793098Z In function 'void* memcpy(void*, const void*, size_t)',
2023-01-24T21:24:58.1794152Z     inlined from 'uint8_t* google::protobuf::io::EpsCopyOutputStream::WriteRaw(const void*, int, uint8_t*)' at external/com_google_protobuf/src/google/protobuf/io/coded_stream.h:706:16,
2023-01-24T21:24:58.1795623Z     inlined from 'virtual uint8_t* google::protobuf::internal::ImplicitWeakMessage::_InternalSerialize(uint8_t*, google::protobuf::io::EpsCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/implicit_weak_message.h:84:28,
2023-01-24T21:24:58.1797153Z     inlined from 'bool google::protobuf::MessageLite::SerializePartialToZeroCopyStream(google::protobuf::io::ZeroCopyOutputStream*) const' at external/com_google_protobuf/src/google/protobuf/message_lite.cc:412:30:
2023-01-24T21:24:58.1798928Z /home/conda/feedstock_root/build_artifacts/ray-packages_1674594838659/_build_env/bin/../x86_64-conda-linux-gnu/sysroot/usr/include/bits/string3.h:51:33: warning: 'void* __builtin___memcpy_chk(void*, const void*, long unsigned int, long unsigned int)' specified size between 18446744071562067968 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
2023-01-24T21:24:58.1800554Z    51 |   return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest));
2023-01-24T21:24:58.1801107Z       |          ~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-01-24T21:25:20.9009284Z [416 / 2,281] Compiling src/compiler/csharp_generator.cc [for host]; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:25:22.1662427Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/reflection/v1alpha/reflection.grpc.pb.h:
2023-01-24T21:25:22.1669560Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:51.6341538Z INFO: From Generating Descriptor Set proto_library @com_github_cncf_udpa//xds/type/v3:pkg:
2023-01-24T21:25:51.6347480Z xds/type/v3/typed_struct.proto:10:1: warning: Import validate/validate.proto is unused.
2023-01-24T21:25:52.6425750Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/channelz/channelz.grpc.pb.h:
2023-01-24T21:25:52.6442802Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.7080488Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/percent.grpc.pb.h:
2023-01-24T21:25:52.7082213Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.7434268Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/base.grpc.pb.h:
2023-01-24T21:25:52.7444334Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.7844053Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/config_dump.grpc.pb.h:
2023-01-24T21:25:52.7858832Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:25:52.8146701Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/testing/xds/v3/csds.grpc.pb.h:
2023-01-24T21:25:52.8195413Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:26:17.3034890Z [719 / 2,496] Compiling src/idl_gen_rust.cpp [for host]; 0s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:27:24.9172630Z [1,843 / 3,534] Compiling python/ray/_raylet.cpp; 52s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:28:39.2553777Z [1,954 / 3,534] Compiling src/google/protobuf/wire_format.cc; 3s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:30:05.1828782Z [2,045 / 3,534] Compiling src/ray/common/bundle_spec.cc; 8s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:31:43.7234689Z [2,136 / 3,534] Compiling src/cpp/server/server_cc.cc; 4s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:33:36.9333443Z [2,285 / 3,534] Compiling src/ray/raylet/node_manager.cc; 13s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:35:49.1076702Z [2,396 / 3,534] Compiling src/ray/core_worker/core_worker_process.cc; 9s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:38:19.0661531Z [2,533 / 3,534] Compiling src/ray/raylet/scheduling/policy/bundle_scheduling_policy.cc; 12s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:41:11.3956382Z [2,723 / 3,534] Compiling src/ray/raylet/agent_manager.cc; 16s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:44:29.9164559Z [2,954 / 3,534] Compiling src/core/ext/filters/client_channel/lb_policy/grpclb/grpclb.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:48:19.1225621Z [3,208 / 3,534] Compiling src/ray/core_worker/transport/direct_task_transport.cc; 15s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:52:42.6329482Z [3,508 / 3,534] Compiling src/core/lib/iomgr/tcp_posix.cc; 1s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T21:52:54.5383055Z INFO: From Action external/com_github_grpc_grpc/src/proto/grpc/health/v1/health.grpc.pb.h:
2023-01-24T21:52:54.5384340Z bazel-out/k8-opt/bin/external/com_github_grpc_grpc/external/com_github_grpc_grpc: warning: directory does not exist.
2023-01-24T21:57:45.8430389Z [3,778 / 3,792] Compiling src/ray/gcs/gcs_server/gcs_actor_manager.cc; 5s processwrapper-sandbox ... (2 actions, 1 running)
2023-01-24T22:03:36.1203303Z [4,049 / 4,054] [Prepa] Linking cpp/libray_api.lo
2023-01-24T22:03:39.2502551Z INFO: Elapsed time: 2670.604s, Critical Path: 223.47s

@mattip
Contributor

mattip commented Feb 9, 2023

Ahh, hang on, that is the warning that is failing the aarch64 builds in #92. So it was there all the time and the difference is a -Werror or so?
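
One way to check this would be to pull the diagnostic flags out of the passing and the failing log and compare them. Below is a minimal Python sketch, assuming GCC-style diagnostics that end with the controlling flag in brackets; the helper name `count_diagnostics` and the shortened sample lines are made up for illustration and are not real build output.

```python
import re
from collections import Counter

# GCC appends the controlling flag in brackets at the end of a diagnostic,
# e.g. "warning: ... [-Wstringop-overflow=]"; when the warning is promoted
# by -Werror it is reported as "error: ... [-Werror=stringop-overflow=]".
DIAG_RE = re.compile(r"\b(warning|error): .*\[(-W[^\]]+)\]\s*$")


def count_diagnostics(log_text):
    """Count (severity, flag) pairs in a build log (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DIAG_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Shortened sample lines in the style of the log above (illustrative only).
sample_log = """\
string3.h:51:33: warning: '__builtin___memcpy_chk' specified size exceeds maximum object size [-Wstringop-overflow=]
typed_struct.proto:10:1: warning: Import validate/validate.proto is unused.
string3.h:51:33: error: '__builtin___memcpy_chk' specified size exceeds maximum object size [-Werror=stringop-overflow=]
"""

for (severity, flag), n in sorted(count_diagnostics(sample_log).items()):
    print(f"{severity:8} {flag:32} {n}")
```

If the same flag shows up as `warning` in one log and as `error` with a `-Werror=` prefix in the other, that would point at the compiler flags rather than at the code.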

@mattip
Contributor

mattip commented Feb 9, 2023

Do we have any bazel experts around who could either remove the build altogether or figure out how to ignore that error?

@ngam

ngam commented Feb 10, 2023

> Do we have any bazel experts around who could either remove the build altogether or figure out how to ignore that error?

You mean in conda-forge or upstream? We will try to adapt this build to make it work (with bazel). We have a specific toolchain that we likely have to use https://github.com/conda-forge/bazel-toolchain-feedstock (an example of using this toolchain successfully is jaxlib, and the tensorflow build relies on a similarly modified toolchain)

@ngam

ngam commented Feb 10, 2023

Do you have specific needs, @mattip? I am planning to attempt fixing this in the coming weeks, but I can also try to make some effort sooner.

@ngam

ngam commented Feb 10, 2023

Our bazel expert is @xhochy who may not be free these days (we miss you if you see this!)

@mattip
Contributor

mattip commented Feb 13, 2023

It seems tensorflow has a whole scheme to allow using system libraries. Is this build deps the parallel in ray? How would that look for a local grpc?

@ngam

ngam commented Feb 13, 2023

The main issue for me is whether or not we will have to do a lot of deep patching to get this to work. I am not that familiar with the build setup of ray yet

@mattip
Contributor

mattip commented Feb 14, 2023

We use this sort of thing in jaxlib

That passes TF_SYSTEM_LIBS down to tensorflow, which has a whole scheme to allow using system libraries. This mechanism does not exist so far in ray.

@h-vetinari h-vetinari mentioned this issue Feb 25, 2023
3 tasks
@cread

cread commented May 11, 2023

Is anyone still working on this? Having such an old version pinned here is starting to cause some problems for us.

@ngam

ngam commented May 12, 2023

> Is anyone still working on this? Having such an old version pinned here is starting to cause some problems for us.

Not that I'm aware of. Please feel free to have a go and tag people in this issue so that we can keep track and help if we can.

@mattip
Contributor

mattip commented May 12, 2023

Ray 2.4.0 pins to <1.49 like upstream ray on darwin. Would changing to exactly the upstream pinning (<1.51.3 on non-darwin) help your use-case?

@cread

cread commented May 15, 2023

> Ray 2.4.0 pins to <1.49 like upstream ray on darwin. Would changing to exactly the upstream pinning (<1.51.3 on non-darwin) help your use-case?

Yes, this would help a lot actually.

@h-vetinari
Member Author

Good news: dealing with external deps in bazel might finallyyyyyyy be getting easier: conda-forge/tensorflow-feedstock#332

@mattip
Contributor

mattip commented Sep 10, 2023

It requires bazel 6, which does not seem to work. See ray-project/ray#31504

@h-vetinari
Member Author

Yeah, but compatibility with modern bazel is mostly just a question of time. The important update here IMO is the new capabilities that'll allow us to finally improve the (un)vendoring situation here.

@anyscalesam

anyscalesam commented Jan 12, 2024

@mattip is there anything preventing an update of ray to 2.9.0? That should bring the grpcio version to 1.59.

EDIT: we should guard at <1.59 not pin it.
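
To spell out the difference between a guard and a pin, here is a small Python sketch. It assumes plain dotted integer versions and uses hypothetical helper names; real resolvers follow fuller version rules, and the candidate versions are only examples.

```python
def parse_version(text):
    """Turn '1.58.0' into (1, 58, 0); only handles plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def satisfies_guard(version, upper_bound):
    """Upper-bound guard: any release strictly below the bound is accepted."""
    return parse_version(version) < parse_version(upper_bound)


def satisfies_pin(version, pinned):
    """Exact pin: only the one named release is accepted."""
    return parse_version(version) == parse_version(pinned)


# Example candidate releases (illustrative, not a list of what is published).
candidates = ["1.51.3", "1.58.0", "1.59.0", "1.60.0"]

guarded = [v for v in candidates if satisfies_guard(v, "1.59")]
pinned = [v for v in candidates if satisfies_pin(v, "1.59.0")]

print("guard <1.59 accepts:", guarded)
print("pin ==1.59.0 accepts:", pinned)
```

The guard leaves the solver free to pick any older compatible build, while the pin forces every environment onto a single release.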

@mattip
Contributor

mattip commented Oct 31, 2024

I found this article about using native libraries in bazel builds. It seems we could add some patches to replace the grpc and protobuf builds with the conda-provided ones?

@mattip
Contributor

mattip commented Oct 31, 2024

I think we could try to use these from conda: libabseil (instead of @com_google_absl/*), 'grpc' (instead of @com_github_grpc_grpc/*), and 'protobuf' (instead of @com_google_protobuf), all from upstream BUILD.bazel.
