The grpc gem does not work yet at runtime (it installs fine) #2247

Open
wildmaples opened this issue Feb 3, 2021 · 21 comments

Comments

@wildmaples
Contributor

wildmaples commented Feb 3, 2021

Summary by @eregon:

The grpc gem should install fine.
On macOS you might need truffleruby-dev.

At runtime, require 'grpc' works on Linux.
Using GRPC functionality does not work yet in general.


Requiring the grpc gem causes an Invalid ElementType of Vector failure. This is affecting the storefront-renderer repository's use of require "semian/grpc".

How to reproduce

irb(main):001:0> require 'grpc'

Stacktrace

The issue below is resolved by grpc/grpc#24632, but grpc has not merged that PR yet.

truffleruby: an internal exception escaped out of the interpreter,
please report it to https://github.com/oracle/truffleruby/issues.

Invalid ElementType of Vector: VariableBitWidthType (java.lang.AssertionError)
	from com.oracle.truffle.llvm.runtime.types.VectorType.setElementType(VectorType.java:80)
	from com.oracle.truffle.llvm.parser.listeners.Types.setType(Types.java:246)
	from com.oracle.truffle.llvm.parser.listeners.Types.record(Types.java:171)
	from com.oracle.truffle.llvm.parser.scanner.LLVMScanner.passRecordToParser(LLVMScanner.java:434)
	from com.oracle.truffle.llvm.parser.scanner.LLVMScanner.unabbreviatedRecord(LLVMScanner.java:450)
	from com.oracle.truffle.llvm.parser.scanner.LLVMScanner.scanToOffset(LLVMScanner.java:164)
	from com.oracle.truffle.llvm.parser.scanner.LLVMScanner.scanToEnd(LLVMScanner.java:143)
	from com.oracle.truffle.llvm.parser.scanner.LLVMScanner.parseBitcode(LLVMScanner.java:102)
	from com.oracle.truffle.llvm.ParserDriver.parseBinary(ParserDriver.java:340)
	from com.oracle.truffle.llvm.ParserDriver.parseLibraryWithSource(ParserDriver.java:398)
	from com.oracle.truffle.llvm.ParserDriver.parseWithDependencies(ParserDriver.java:144)
	from com.oracle.truffle.llvm.ParserDriver.parseWithDependencies(ParserDriver.java:130)
	from com.oracle.truffle.llvm.ParserDriver.parse(ParserDriver.java:101)
	from com.oracle.truffle.llvm.DefaultLoader.load(DefaultLoader.java:45)
	from com.oracle.truffle.llvm.runtime.LLVMLanguage.parse(LLVMLanguage.java:480)
	from com.oracle.truffle.api.TruffleLanguage$ParsingRequest.parse(TruffleLanguage.java:848)
	from com.oracle.truffle.api.TruffleLanguage.parse(TruffleLanguage.java:1502)
	from com.oracle.truffle.api.LanguageAccessor$LanguageImpl.parse(LanguageAccessor.java:311)
	from com.oracle.truffle.polyglot.PolyglotSourceCache.parseImpl(PolyglotSourceCache.java:94)
	from com.oracle.truffle.polyglot.PolyglotSourceCache.access$300(PolyglotSourceCache.java:56)
	from com.oracle.truffle.polyglot.PolyglotSourceCache$WeakCache.lookup(PolyglotSourceCache.java:223)
	from com.oracle.truffle.polyglot.PolyglotSourceCache.parseCached(PolyglotSourceCache.java:80)
	from com.oracle.truffle.polyglot.PolyglotLanguageContext.parseCached(PolyglotLanguageContext.java:371)
	from com.oracle.truffle.polyglot.EngineAccessor$EngineImpl.parseForLanguage(EngineAccessor.java:242)
	from com.oracle.truffle.api.TruffleLanguage$Env.parseInternal(TruffleLanguage.java:2471)
	from org.truffleruby.language.loader.FeatureLoader.loadCExtLibrary(FeatureLoader.java:484)
	from org.truffleruby.language.loader.RequireNode.requireCExtension(RequireNode.java:268)
	from org.truffleruby.language.loader.RequireNode.parseAndCall(RequireNode.java:211)
	from org.truffleruby.language.loader.RequireNode.doRequire(RequireNode.java:196)
	from org.truffleruby.language.loader.RequireNode.requireConsideringAutoload(RequireNode.java:125)
	from org.truffleruby.language.loader.RequireNode.lambda$requireWithMetrics$0(RequireNode.java:78)
	from org.truffleruby.debug.MetricsProfiler.callWithMetrics(MetricsProfiler.java:40)
	from org.truffleruby.language.loader.RequireNode.requireWithMetrics(RequireNode.java:75)
	from org.truffleruby.language.loader.RequireNode.require(RequireNode.java:67)
	from org.truffleruby.language.loader.RequireNodeGen.executeRequire(RequireNodeGen.java:30)
	from org.truffleruby.core.kernel.KernelNodes$LoadFeatureNode.loadFeature(KernelNodes.java:343)
	from org.truffleruby.core.kernel.KernelNodesFactory$LoadFeatureNodeFactory$LoadFeatureNodeGen.execute(KernelNodesFactory.java:726)
	from org.truffleruby.language.control.IfElseNode.execute(IfElseNode.java:43)
	from org.truffleruby.language.control.IfElseNode.execute(IfElseNode.java:45)
	from org.truffleruby.language.control.SequenceNode.execute(SequenceNode.java:36)
	from org.truffleruby.language.arguments.CheckArityNode.execute(CheckArityNode.java:41)
	from org.truffleruby.language.control.SequenceNode.execute(SequenceNode.java:36)
	from org.truffleruby.language.methods.CatchForMethodNode.execute(CatchForMethodNode.java:42)
	from org.truffleruby.language.methods.ExceptionTranslatingNode.execute(ExceptionTranslatingNode.java:33)
	from org.truffleruby.language.RubyRootNode.execute(RubyRootNode.java:61)
	from org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:591)
/Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/rubygems/core_ext/kernel_require.rb:72:in `gem_original_require'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/rubygems/core_ext/kernel_require.rb:72:in `require'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/gems/gems/grpc-1.35.0/src/ruby/lib/grpc/grpc.rb:22:in `<top (required)>'
	from <internal:core> core/kernel.rb:293:in `require_relative'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/gems/gems/grpc-1.35.0/src/ruby/lib/grpc.rb:19:in `<top (required)>'
	from <internal:core> core/kernel.rb:234:in `gem_original_require'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/rubygems/core_ext/kernel_require.rb:168:in `require'
	from (irb):1:in `irb_binding'
	from <internal:core> core/kernel.rb:325:in `eval'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb/workspace.rb:114:in `evaluate'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb/context.rb:459:in `evaluate'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb.rb:541:in `block (2 levels) in eval_input'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb.rb:704:in `signal_status'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb.rb:538:in `block in eval_input'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb/ruby-lex.rb:166:in `block (2 levels) in each_top_level_statement'
	from <internal:core> core/kernel.rb:437:in `loop'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb/ruby-lex.rb:151:in `block in each_top_level_statement'
	from <internal:core> core/throw_catch.rb:36:in `catch'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb/ruby-lex.rb:150:in `each_top_level_statement'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb.rb:537:in `eval_input'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb.rb:472:in `block in run'
	from <internal:core> core/throw_catch.rb:36:in `catch'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb.rb:471:in `run'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/mri/irb.rb:400:in `start'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/lib/gems/gems/irb-1.2.6/exe/irb:11:in `<top (required)>'
	from <internal:core> core/kernel.rb:400:in `load'
	from <internal:core> core/kernel.rb:400:in `load'
	from /Users/mapleong/src/github.com/Shopify/truffleruby/mxbuild/truffleruby-jvm-ce/jre/languages/ruby/bin/irb:42:in `<main>'

As discussed in our call, this seems to be caused by a C language feature that Sulong doesn't support.

Issue about compiling/installing grpc: #1982

General internal issue about grpc: GR-23874

@eregon eregon self-assigned this Feb 3, 2021
@jeshan

jeshan commented Feb 3, 2021

I have a different scenario that leads to the same stacktrace, involving only Sulong, not TruffleRuby.
I compiled Ruby MRI 2.7.2 with the bundled clang 10 in the latest Graal, and when I run ruby or ruby --version I see something like this:
[image: screenshot not preserved]

I have provided a gist showing a Dockerfile to reproduce the issue (files copied into the image were obtained from their official release page)

https://gist.github.com/jeshan/6c70fd0b94b6e521a54b6c71454ebd4c/revisions

The gist v1 is based on the official ruby docker image and v2 shows my changes that reproduce the issue.

@eregon
Member

eregon commented Feb 4, 2021

It's GR-23843 internally.

@eregon
Member

eregon commented Feb 4, 2021

@jeshan Please create a separate issue at https://github.com/oracle/graal; I'd like to focus this one on getting grpc to load on TruffleRuby.

I dug into this some time ago with @norswap.
I can reproduce on Linux with:

$ gem i grpc
Fetching google-protobuf-3.14.0.gem
Fetching googleapis-common-protos-types-1.0.6.gem
Fetching grpc-1.35.0.gem
Building native extensions. This could take a while...
Successfully installed google-protobuf-3.14.0
Successfully installed googleapis-common-protos-types-1.0.6
Building native extensions. This could take a while...
Successfully installed grpc-1.35.0
3 gems installed

$ ruby -rgrpc -e0
Invalid ElementType of Vector: VariableBitWidthType (java.lang.AssertionError)
	from com.oracle.truffle.llvm.runtime.types.VectorType.setElementType(VectorType.java:80)
	from com.oracle.truffle.llvm.parser.listeners.Types.setType(Types.java:246)
	from com.oracle.truffle.llvm.parser.listeners.Types.record(Types.java:171)
	from com.oracle.truffle.llvm.parser.scanner.LLVMScanner.passRecordToParser(LLVMScanner.java:434)
...

The error comes from a vendored version of boringssl being included by default.
The PR at https://github.com/grpc/grpc/pull/24632/files#diff-fc6f1e850a88ea978d6788c2b825d7feb1dfc2d22e572638ec9ad5061595d245R71 actually changes the extconf.rb to not include boringssl on TruffleRuby.
It's generally not a good idea to run SSL libraries and constant-time functions on top of a JIT like Sulong.
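
To illustrate the idea of skipping the vendored boringssl on TruffleRuby, here is a minimal sketch of the kind of engine check a build script could make. This is not the code from grpc/grpc#24632: the helper name embed_boringssl? and the EMBED_OPENSSL override variable are assumptions made for illustration; only RUBY_ENGINE is a standard Ruby constant.

```ruby
# Hypothetical helper, NOT the code from grpc/grpc#24632.
# Decides whether a build script should compile the vendored BoringSSL.
def embed_boringssl?(engine = RUBY_ENGINE, override = ENV['EMBED_OPENSSL'])
  # An explicit override wins (the variable name is an assumption here).
  return override == 'true' unless override.nil? || override.empty?

  # Otherwise skip the vendored copy when running on TruffleRuby.
  engine != 'truffleruby'
end

puts "engine=#{RUBY_ENGINE} embed_boringssl=#{embed_boringssl?}"
```

An extconf.rb would call something like this once, near the top, and only add the boringssl sources to the build when it returns true.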

With that PR:

git clone https://github.com/norswap/grpc.git
cd grpc
git submodule update --init 
git checkout truffleruby-build-compat

truffleruby -v # make sure TruffleRuby is in PATH
bundle install
bundle exec rake build
gem uni grpc
gem i -V pkg/grpc-*.dev.gem

$ ruby -rgrpc -e 'p GRPC'
GRPC # works to require it

If we try the examples:
$ cd examples/ruby
$ ruby greeter_server.rb
/home/eregon/.rubies/truffleruby-dev/lib/truffle/truffle/cext.rb:1201:in `__allocate__': TruffleRuby doesn't have a case for the com.oracle.truffle.llvm.runtime.nodes.cast.LLVMToVectorNodeFactory$LLVMSignedCastToI64VectorNodeGen node with values of type com.oracle.truffle.llvm.runtime.vector.LLVMPointerVector (TypeError)
	from com.oracle.truffle.llvm.runtime.nodes.cast.LLVMToVectorNodeFactory$LLVMSignedCastToI64VectorNodeGen.executeAndSpecialize(LLVMToVectorNodeFactory.java:575)
	from com.oracle.truffle.llvm.runtime.nodes.cast.LLVMToVectorNodeFactory$LLVMSignedCastToI64VectorNodeGen.executeGeneric(LLVMToVectorNodeFactory.java:535)
	from com.oracle.truffle.llvm.runtime.nodes.op.LLVMVectorArithmeticNodeGen.executeGeneric(LLVMVectorArithmeticNodeGen.java:37)
	from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWriteVectorNodeGen.execute(LLVMWriteNodeFactory.java:835)
	from com.oracle.truffle.llvm.runtime.nodes.base.LLVMBasicBlockNode$InitializedBlockNode.execute(LLVMBasicBlockNode.java:161)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.doDispatch(LLVMDispatchBasicBlockNode.java:97)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNodeGen.executeGeneric(LLVMDispatchBasicBlockNodeGen.java:20)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNode.doRun(LLVMFunctionRootNode.java:85)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNodeGen.executeGeneric(LLVMFunctionRootNodeGen.java:21)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:88)
	from org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:592)
	from /home/eregon/.rubies/truffleruby-dev/lib/gems/gems/grpc-1.34.0.dev/src/ruby/lib/grpc/generic/rpc_server.rb:234:in `initialize'
	from greeter_server.rb:39:in `main'
	from greeter_server.rb:48:in `<main>'

I have a fix for that one, I'll try to merge it to Sulong.

@eregon
Member

eregon commented Feb 4, 2021

With that fix, this happens:

$ ruby greeter_server.rb |& c++filt 
/home/eregon/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/gems/gems/grpc-1.34.0.dev/src/core/lib/iomgr/exec_ctx.h:223:in `grpc_iomgr_init': \
External LLVMFunction TLS init function for grpc_core::ExecCtx::exec_ctx_ cannot be found. (com.oracle.truffle.llvm.runtime.except.LLVMLinkerException) (RuntimeError)
Translated to internal error
	from /home/eregon/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/gems/gems/grpc-1.34.0.dev/src/core/lib/surface/init.cc:149:in `grpc_init'
	from /home/eregon/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/gems/gems/grpc-1.34.0.dev/src/ruby/ext/grpc/rb_grpc.c:285:in `grpc_ruby_init'
	from /home/eregon/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/gems/gems/grpc-1.34.0.dev/src/ruby/ext/grpc/rb_server.c:131:in `grpc_rb_server_alloc'
	from /home/eregon/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/truffle/truffle/cext.rb:1202:in `__allocate__'
	from /home/eregon/code/truffleruby-ws/truffleruby/mxbuild/truffleruby-jvm/jre/languages/ruby/lib/gems/gems/grpc-1.34.0.dev/src/ruby/lib/grpc/generic/rpc_server.rb:234:in `initialize'
	from greeter_server.rb:39:in `main'
	from greeter_server.rb:48:in `<main>'

I think this TLS means Thread Local Storage (not TLS related to SSL).
I'll file a Sulong issue for that (GR-29187).

Most likely related to https://github.com/grpc/grpc/blob/4dc84aea46396cde21d13813efcf8ca3b2fda692/src/core/lib/iomgr/exec_ctx.h#L254
I'd guess this variant is used: https://github.com/grpc/grpc/blob/4dc84aea46396cde21d13813efcf8ca3b2fda692/src/core/lib/gpr/tls_stdcpp.h
out of:
https://github.com/grpc/grpc/blob/master/src/core/lib/gpr/tls.h
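
For readers unfamiliar with thread-local storage, the following Ruby sketch shows the concept only: each thread sees its own value under the same key. It is an analogy, not how grpc implements exec_ctx_ (that is a C++ thread-local variable), and the :exec_ctx key name is just borrowed from the error message for illustration.

```ruby
# Analogy only: Ruby thread variables, not grpc's C++ thread_local.
Thread.current.thread_variable_set(:exec_ctx, "main")

worker_values = Thread.new do
  # A fresh thread starts with no value stored under the key.
  before = Thread.current.thread_variable_get(:exec_ctx)
  Thread.current.thread_variable_set(:exec_ctx, "worker")
  [before, Thread.current.thread_variable_get(:exec_ctx)]
end.value

# The main thread's value is unaffected by the worker's write.
main_value = Thread.current.thread_variable_get(:exec_ctx)
puts "worker saw #{worker_values.inspect}, main still has #{main_value.inspect}"
```

In native code the compiler and linker cooperate to set up such per-thread variables, and a TLS init function is part of that setup, which is why a missing one surfaces as a linker error rather than anything SSL-related.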

In any case, require 'grpc' works with grpc/grpc#24632, so I think we need to get the grpc maintainers to merge it.

@jeshan

jeshan commented Feb 4, 2021

I've moved my comment to the separate Sulong issue just mentioned.

@eregon
Member

eregon commented Feb 8, 2021

TruffleRuby doesn't have a case for the LLVMSignedCastToI64VectorNodeGen node with values of type com.oracle.truffle.llvm.runtime.vector.LLVMPointerVector (TypeError)
I have a fix for that one, I'll try to merge it to Sulong.

That's now fixed in oracle/graal@9d139ce.

@Palez
Contributor

Palez commented Feb 9, 2021

Hello, we were able to diagnose the issue in Sulong. Basically, the missing symbol is an undefined external weak symbol, which is something we didn't support before. I will fix that now so it'll be supported, and I'll try to get a PR in soon.

@wildmaples
Contributor Author

Are there any updates on this issue? @eregon @Palez

@Palez
Contributor

Palez commented Feb 22, 2021

The PR for this particular issue has been created and is in the process of being reviewed and merged. However, there is another issue with the grpc gem regarding pthread, and I'm working on a fix for that as well.

@eregon
Member

eregon commented Mar 1, 2021

The external weak symbol issue is fixed in oracle/graal@1ddc1c2, and I'm updating the graal import to pick up that fix.

There is another issue with pthread_{g,s}etname_np in Sulong that @Palez is investigating.

@eregon
Member

eregon commented Mar 15, 2021

Related: recent grpc/google-protobuf need WeakMap to support primitives (#2267) which is now fixed.

@eregon
Member

eregon commented Mar 15, 2021

Trying the examples today on Linux, I get:

$ ruby -v greeter_server.rb
truffleruby 21.1.0-dev-fac7597c, like ruby 2.7.2, GraalVM CE Native [x86_64-linux]
java.lang.UnsupportedOperationException: Thread[default-executo,5,main] was not registered
	at org.truffleruby.language.SafepointManager.leaveThread(SafepointManager.java:94)
	at org.truffleruby.core.thread.ThreadManager.leaveAndEnter(ThreadManager.java:468)
	at org.truffleruby.core.fiber.FiberManager.killOtherFibers(FiberManager.java:331)
	at org.truffleruby.core.fiber.FiberManager.shutdown(FiberManager.java:359)
	at org.truffleruby.core.thread.ThreadManager.cleanup(ThreadManager.java:412)
	at org.truffleruby.RubyLanguage.disposeThread(RubyLanguage.java:435)
	at org.truffleruby.RubyLanguage.disposeThread(RubyLanguage.java:107)
	at com.oracle.truffle.api.LanguageAccessor$LanguageImpl.disposeThread(LanguageAccessor.java:351)
	at com.oracle.truffle.polyglot.PolyglotLanguageContext.leaveThread(PolyglotLanguageContext.java:447)
	at com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:146)
	at com.oracle.truffle.polyglot.HostToGuestRootNode.execute(HostToGuestRootNode.java:119)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:603)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:574)java.lang.UnsupportedOperationException: Thread[resolver-execut,5,main] was not registered

	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:524)	at org.truffleruby.language.SafepointManager.leaveThread(SafepointManager.java:94)

	at com.oracle.svm.truffle.api.SubstrateOptimizedCallTarget.invokeCallBoundary(SubstrateOptimizedCallTarget.java:121)	at org.truffleruby.core.thread.ThreadManager.leaveAndEnter(ThreadManager.java:468)

	at org.truffleruby.core.fiber.FiberManager.killOtherFibers(FiberManager.java:331)
	at com.oracle.svm.truffle.api.SubstrateOptimizedCallTargetInstalledCode.doInvoke(SubstrateOptimizedCallTargetInstalledCode.java:164)
	at com.oracle.svm.truffle.api.SubstrateOptimizedCallTarget.doInvoke(SubstrateOptimizedCallTarget.java:104)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:453)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:434)
	at org.truffleruby.core.fiber.FiberManager.shutdown(FiberManager.java:359)
	at com.oracle.truffle.polyglot.PolyglotThread.run(PolyglotThread.java:83)
	at org.truffleruby.core.thread.ThreadManager.cleanup(ThreadManager.java:412)
	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:526)
	at org.truffleruby.RubyLanguage.disposeThread(RubyLanguage.java:435)
	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:192)
Caused by: Attached Guest Language Frames (1)
	at org.truffleruby.RubyLanguage.disposeThread(RubyLanguage.java:107)
	at com.oracle.truffle.api.LanguageAccessor$LanguageImpl.disposeThread(LanguageAccessor.java:351)
	at com.oracle.truffle.polyglot.PolyglotLanguageContext.leaveThread(PolyglotLanguageContext.java:447)
	at com.oracle.truffle.polyglot.PolyglotThread$ThreadSpawnRootNode.executeImpl(PolyglotThread.java:146)
	at com.oracle.truffle.polyglot.HostToGuestRootNode.execute(HostToGuestRootNode.java:119)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:603)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.profiledPERoot(OptimizedCallTarget.java:574)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:524)
	at com.oracle.svm.truffle.api.SubstrateOptimizedCallTarget.invokeCallBoundary(SubstrateOptimizedCallTarget.java:121)
	at com.oracle.svm.truffle.api.SubstrateOptimizedCallTargetInstalledCode.doInvoke(SubstrateOptimizedCallTargetInstalledCode.java:164)
	at com.oracle.svm.truffle.api.SubstrateOptimizedCallTarget.doInvoke(SubstrateOptimizedCallTarget.java:104)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.callIndirect(OptimizedCallTarget.java:453)
	at org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.call(OptimizedCallTarget.java:434)
	at com.oracle.truffle.polyglot.PolyglotThread.run(PolyglotThread.java:83)
	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:526)
	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:192)
Caused by: Attached Guest Language Frames (1)
#<Thread:0x6e8@/home/eregon/.rubies/truffleruby-dev/lib/truffle/truffle/cext.rb:1605 run> terminated with exception:
Traceback (most recent call last):
	from /home/eregon/.rubies/truffleruby-dev/lib/truffle/truffle/cext.rb:1606:in `block in rb_thread_create'
	from /home/eregon/.rubies/truffleruby-dev/lib/gems/gems/grpc-1.36.0.dev/src/ruby/ext/grpc/rb_event_thread.c:122:in `grpc_rb_event_thread'
	from call.c:149:in `rb_thread_call_without_gvl'
	from /home/eregon/.rubies/truffleruby-dev/lib/truffle/truffle/cext.rb:1615:in `rb_thread_call_without_gvl'
/home/eregon/.rubies/truffleruby-dev/lib/truffle/truffle/cext.rb:1615:in `block in rb_thread_call_without_gvl': TruffleRuby doesn't have a case for the com.oracle.truffle.llvm.runtime.nodes.asm.syscall.LLVMAMD64SyscallFutexNodeGen node with values of type com.oracle.truffle.llvm.runtime.pointer.LLVMPointerImpl java.lang.Long=128 java.lang.Long=0 com.oracle.truffle.llvm.runtime.pointer.LLVMPointerImpl java.lang.Long=0 java.lang.Long=0 (TypeError)
	from com.oracle.truffle.llvm.runtime.nodes.asm.syscall.LLVMAMD64SyscallFutexNodeGen.executeAndSpecialize(LLVMAMD64SyscallFutexNodeGen.java:90)
	from com.oracle.truffle.llvm.runtime.nodes.asm.syscall.LLVMAMD64SyscallFutexNodeGen.execute(LLVMAMD64SyscallFutexNodeGen.java:54)
	from com.oracle.truffle.llvm.runtime.nodes.asm.syscall.LLVMSyscallNode.cachedSyscall(LLVMSyscallNode.java:66)
	from com.oracle.truffle.llvm.runtime.nodes.asm.syscall.LLVMSyscallNodeGen.executeAndSpecialize(LLVMSyscallNodeGen.java:175)
	from com.oracle.truffle.llvm.runtime.nodes.asm.syscall.LLVMSyscallNodeGen.executeGeneric(LLVMSyscallNodeGen.java:84)
	from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWriteI64NodeGen.execute_generic1(LLVMWriteNodeFactory.java:365)
	from com.oracle.truffle.llvm.runtime.nodes.vars.LLVMWriteNodeFactory$LLVMWriteI64NodeGen.execute(LLVMWriteNodeFactory.java:346)
	from com.oracle.truffle.llvm.runtime.nodes.base.LLVMBasicBlockNode$InitializedBlockNode.execute(LLVMBasicBlockNode.java:161)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMLoopDispatchNode.executeRepeatingWithValue(LLVMLoopDispatchNode.java:105)
	from org.graalvm.compiler.truffle.runtime.OptimizedOSRLoopNode.profilingLoop(OptimizedOSRLoopNode.java:165)
	from org.graalvm.compiler.truffle.runtime.OptimizedOSRLoopNode.execute(OptimizedOSRLoopNode.java:123)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMLoopNode$LLVMLoopNodeImpl.loop(LLVMLoopNode.java:80)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMLoopNodeFactory$LLVMLoopNodeImplNodeGen.executeLoop(LLVMLoopNodeFactory.java:22)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNode.doDispatch(LLVMDispatchBasicBlockNode.java:164)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMDispatchBasicBlockNodeGen.executeGeneric(LLVMDispatchBasicBlockNodeGen.java:20)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNode.doRun(LLVMFunctionRootNode.java:85)
	from com.oracle.truffle.llvm.runtime.nodes.control.LLVMFunctionRootNodeGen.executeGeneric(LLVMFunctionRootNodeGen.java:21)
	from com.oracle.truffle.llvm.runtime.nodes.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:91)
	from org.graalvm.compiler.truffle.runtime.OptimizedCallTarget.executeRootNode(OptimizedCallTarget.java:603)

The first error might be solved by adopting Truffle safepoints, though I'm not sure.

The second error comes from Sulong; that node seems to need some extra specializations.

@eregon
Member

eregon commented Apr 16, 2021

With branch https://github.com/eregon/grpc/tree/truffleruby-debug which has a couple workarounds,
and the latest truffleruby-dev, it works for one message ("Greeting: Hello world") and then the client hangs while exiting as it can't interrupt some native call. The server sometimes segfaults in an i64 write (GR-30218).

@eregon
Member

eregon commented Oct 8, 2021

New PR for grpc, cleaned up and rebased on latest grpc: grpc/grpc#27660

@eregon
Member

eregon commented Jun 28, 2022

The PR to support building the grpc gem on TruffleRuby has been merged.
So now it's about getting the grpc gem to work at runtime, which I'll track in this issue.

@eregon
Member

eregon commented Aug 22, 2022

https://github.com/cookpad/grpc_kit seems a possible alternative to the grpc gem.
It's written in Ruby and uses the google-protobuf gem (which works fine on TruffleRuby).
I tried it, and both the helloworld and routeguide examples work on TruffleRuby!

The test suite also passes, except for 6 failures which are kwargs-related and also happen on CRuby 3.0.3 and one extra failure which is an easy fix:

  1) GrpcKit::Session::IO#send_event write data to inner io object
     Failure/Error: bytes = @io.write_nonblock(data, exception: false)
     
     ArgumentError:
       wrong number of arguments (given 2, expected 1)
     # /home/eregon/.rubies/truffleruby-dev/lib/truffle/stringio.rb:103:in `write_nonblock'
     # ./lib/grpc_kit/session/io.rb:39:in `send_event'
     # ./spec/grpc_kit/session/io_spec.rb:39:in `block (3 levels) in <top (required)>'
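
The failing call shape can be reproduced outside grpc_kit with a few lines. This is a minimal sketch, assuming only the standard stringio library; as far as I know CRuby's StringIO#write_nonblock accepts the exception: option and returns the number of bytes written, whereas the TruffleRuby build in the trace above raised ArgumentError on the same call.

```ruby
require 'stringio'

io = StringIO.new
# Same call shape as lib/grpc_kit/session/io.rb:39 in the trace above.
bytes = io.write_nonblock("hello", exception: false)
puts "wrote #{bytes} bytes: #{io.string.inspect}"
```

Running this on both implementations is a quick way to check whether a given TruffleRuby build already accepts the keyword argument.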

graalvmbot pushed a commit that referenced this issue Aug 23, 2022
@eregon
Member

eregon commented Aug 24, 2022

TruffleRuby is now in grpc_kit's CI and passes all tests.

I'm not sure how much grpc_kit (and maybe griffin) can replace grpc in practice though, it would be useful if grpc users could comment on that.

@HoneyryderChuck
Contributor

FWIW httpx also ships with a grpc plugin which has been successfully tested against truffleruby for quite a while. It's probably the closest to pure ruby (grpc_kit uses ds9 for http2 parsing, which uses C extensions and nghttp2, last time I checked).

Both have very fringe communities and usage, IME. The grpc gem has a much bigger community of users, and has codegen capabilities for gRPC service definitions which none of the alternatives can match (all of them can codegen protobufs). Truffleruby should probably ensure compatibility with it, for adoption's sake.

@eregon
Member

eregon commented Aug 25, 2022

Thanks for the context. grpc seems hard to get working because it's a huge amount of rather messy C++/C code (which notably uses reflection with dlsym() and has multiple implementations of locks to give an idea) and running all that on Sulong is proving challenging.

We might be able to run some parts on Sulong and some parts natively, but that would likely need build system changes in grpc and some help from the grpc maintainers, and the grpc Ruby maintainers seem overall not very responsive (it typically takes months to merge PRs).

Hence I am exploring other, lighter-weight options, and I've heard multiple companies share similar concerns about the grpc gem when using it with CRuby.
There seems to be a general feeling that grpc/grpc is heavy and hard to maintain, and not only for Ruby. For example, see what https://buf.build/blog/connect-a-better-grpc says about it ("If you're frustrated by the complexity and instability of today's gRPC libraries").
There might be some way in the future to tell Sulong to execute some parts/functions natively, that might help.

@ollym

ollym commented Aug 26, 2022

@eregon we're also trying to evaluate truffleruby in production and have hit grpc stumbling block #2697, which we need for development as many of our devs work on M1 Macs. This now seems blocked by oracle/graal#4726, which sounds equally difficult to work around. We're forced to use the grpc gem because of https://github.com/googleapis/ruby-spanner-activerecord, which we experimented with, but we found Spanner to be disappointing vs Postgres. Once Google AlloyDB is GA we'll use that, drop Spanner/grpc, and be unblocked to try truffleruby again.

All that said, I imagine grpc for all its problems pushes a lot of boundaries for Truffleruby that other gems may also face and making it work now will fix compatibility also for a number of other gems you haven't yet come across. Assuming your goal is still to be MRI compliant :)

Either way, we're trying to make this work. Also delighted @HoneyryderChuck to hear HTTPX supports truffleruby, we're big fans of your work and use it exclusively in our app.

@eregon
Member

eregon commented Aug 26, 2022

Right, there are quite a few gems depending on grpc: https://rubygems.org/gems/grpc/reverse_dependencies
We'll look into how feasible it is to run some part natively and some part on Sulong.

I imagine grpc for all its problems pushes a lot of boundaries for Truffleruby that other gems may also face

Actually no, it's literally dozens of complicated fixes which seem to be needed only for grpc (e.g., supporting direct usages of the futex syscall on Linux). I'm aware of no other popular native extension having similar problems on TruffleRuby. Native extensions don't usually include their own "operating system" like their own SSL implementation, custom locks, network stack/layer, etc, like grpc does.

Thank you for the feedback. I'll try to get oracle/graal#4726 prioritized and we'll keep looking at how we can support the grpc gem.
