Merge branch 'trpl_embedding' into rollup
Showing 2 changed files with 354 additions and 0 deletions.
@@ -0,0 +1,353 @@
% Rust Inside Other Languages

For our third project, we’re going to choose something that shows off one of
Rust’s greatest strengths: a lack of a substantial runtime.

As organizations grow, they increasingly rely on a multitude of programming
languages. Different programming languages have different strengths and
weaknesses, and a polyglot stack lets you use a particular language where
its strengths make sense, and use a different language where it’s weak.

A very common area where many programming languages are weak is in runtime
performance of programs. Often, using a language that is slower, but offers
greater programmer productivity, is a worthwhile trade-off. To help mitigate
this, these languages provide a way to write some of your system in C, and
then call the C code as though it were written in the higher-level language.
This is called a ‘foreign function interface’, often shortened to ‘FFI’.

Rust has support for FFI in both directions: it can call into C code easily,
but crucially, it can also be called _into_ as easily as C. Combined with
Rust’s lack of a garbage collector and low runtime requirements, this makes
Rust a great candidate to embed inside of other languages when you need
some extra oomph.
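
To make the first direction concrete, here is a minimal sketch of Rust calling
into C. It assumes the platform’s C standard library, and its `abs` function,
is linked in, which is the usual case for a standard Rust build. The wrapper
name `c_abs` is hypothetical and only exists for this illustration.

```rust
use std::os::raw::c_int;

// Declare the C function we want to call. `abs` lives in the C standard
// library. Note: the 2024 edition spells this block `unsafe extern "C"`.
extern "C" {
    fn abs(input: c_int) -> c_int;
}

/// Safe wrapper (hypothetical name) around the foreign `abs`.
pub fn c_abs(n: c_int) -> c_int {
    // Calling a foreign function is `unsafe`: the compiler cannot check
    // that the declaration above matches the real C signature.
    unsafe { abs(n) }
}
```

The rest of this chapter goes the other way: other languages calling into Rust.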

There is a whole [chapter devoted to FFI][ffi] and its specifics elsewhere in
the book, but in this chapter, we’ll examine this particular use-case of FFI,
with three examples, in Ruby, Python, and JavaScript.

[ffi]: ffi.html

# The problem

There are many different projects we could choose here, but we’re going to
pick an example where Rust has a clear advantage over many other languages:
numeric computing and threading.

First, many languages, for the sake of consistency, place numbers on the heap,
rather than on the stack. Especially in languages that focus on object-oriented
programming and use garbage collection, heap allocation is the default. Sometimes
optimizations can stack allocate particular numbers, but rather than relying
on an optimizer to do its job, we may want to ensure that we’re always using
primitive number types rather than some sort of object type.
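
For contrast, here is a rough sketch of the two styles in Rust. `Box<u64>`
stands in for the heap-allocated number objects described above; it is an
analogy, not a model of any particular language’s runtime, and both function
names are hypothetical.

```rust
/// Sums 1..=n using a plain `u64`, which lives on the stack or in a
/// register and needs no heap allocation.
pub fn sum_primitive(n: u64) -> u64 {
    let mut total: u64 = 0;
    for i in 1..=n {
        total += i;
    }
    total
}

/// Same arithmetic, but every intermediate value is placed in a fresh
/// heap allocation, loosely imitating "every number is an object".
pub fn sum_boxed(n: u64) -> u64 {
    let mut total: Box<u64> = Box::new(0);
    for i in 1..=n {
        // Allocate a new box for the new value; the old one is freed.
        total = Box::new(*total + i);
    }
    *total
}
```

Both functions return the same answer. The difference is only in how much
allocation work happens along the way, and in Rust the primitive version is
what you get by default.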

Second, many languages have a ‘global interpreter lock’, which limits
concurrency in many situations. This is done in the name of safety, which is
a positive effect, but it limits the amount of work that can be done at the
same time, which is a big negative.
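
As a loose analogy only (real interpreters implement their locks differently,
and typically release them periodically), the effect resembles every thread
having to hold one shared lock while it works. In this sketch the function
name, its parameters, and the single `Mutex` are all assumptions made for
illustration:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Counts in several threads, but forces each thread to hold one shared
/// lock for its entire loop, so only one thread makes progress at a time.
pub fn count_under_global_lock(threads: u32, iterations: u64) -> u64 {
    let counter = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // Taking the lock here serializes the threads.
                let mut total = counter.lock().unwrap();
                for _ in 0..iterations {
                    *total += 1;
                }
            })
        })
        .collect();

    for h in handles {
        h.join().expect("Could not join a thread!");
    }

    let total = *counter.lock().unwrap();
    total
}
```

The result is correct, but the threads effectively run one after another,
which is the cost the paragraph above describes.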

To emphasize these two aspects, we’re going to create a little project that
leans on both of them heavily. Since the focus of the example is embedding
Rust into other languages, rather than the problem itself, we’ll just use a
toy example:

> Start ten threads. Inside each thread, count from one to five million. After
> all ten threads are finished, print out ‘done!’.

I chose five million based on my particular computer. Here’s an example of this
code in Ruby:

```ruby
threads = []

10.times do
  threads << Thread.new do
    count = 0

    5_000_000.times do
      count += 1
    end
  end
end

threads.each { |t| t.join }
puts "done!"
```

Try running this example, and choose a number that runs for a few seconds.
Depending on your computer’s hardware, you may have to increase or decrease the
number.

On my system, running this program takes `2.156` seconds. And, if I use some
sort of process monitoring tool, like `top`, I can see that it only uses one
core on my machine. That’s the GIL kicking in.

While it’s true that this is a synthetic program, one can imagine many problems
that are similar to this in the real world. For our purposes, spinning up some
busy threads represents some sort of parallel, expensive computation.

# A Rust library

Let’s rewrite this problem in Rust. First, let’s make a new project with
Cargo:

```bash
$ cargo new embed
$ cd embed
```

This program is fairly easy to write in Rust:

```rust
use std::thread;

fn process() {
    let handles: Vec<_> = (0..10).map(|_| {
        thread::spawn(|| {
            let mut _x = 0;
            for _ in (0..5_000_000) {
                _x += 1
            }
        })
    }).collect();

    for h in handles {
        h.join().expect("Could not join a thread!");
    }
}
```

Some of this should look familiar from previous examples. We spin up ten
threads, collecting them into a `handles` vector. Inside of each thread, we
loop five million times, and add one to `_x` each time. Why the underscore?
Well, if we remove it and compile:

```bash
$ cargo build
Compiling embed v0.1.0 (file:///home/steve/src/embed)
src/lib.rs:3:1: 16:2 warning: function is never used: `process`, #[warn(dead_code)] on by default
src/lib.rs:3 fn process() {
src/lib.rs:4     let handles: Vec<_> = (0..10).map(|_| {
src/lib.rs:5         thread::spawn(|| {
src/lib.rs:6             let mut x = 0;
src/lib.rs:7             for _ in (0..5_000_000) {
src/lib.rs:8                 x += 1
...
src/lib.rs:6:17: 6:22 warning: variable `x` is assigned to, but never used, #[warn(unused_variables)] on by default
src/lib.rs:6             let mut x = 0;
                             ^~~~~
```

That first warning is because we are building a library. If we had a test
for this function, the warning would go away. But for now, it’s never
called.

The second is related to `x` versus `_x`. Because we never actually _do_
anything with `x`, we get a warning about it. In our case, that’s perfectly
okay, as we’re just trying to waste CPU cycles. Prefixing `x` with the
underscore removes the warning.

Finally, we join on each thread.
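
If you would rather keep the name `x`, another way to avoid the second warning
is to actually use the value. The sketch below is a variation for
illustration, not the function this chapter goes on to export:
`process_counted` is a hypothetical name, and it assumes that returning the
grand total is acceptable for your purposes.

```rust
use std::thread;

/// Like `process`, but each thread returns its counter and the caller
/// adds the results together, so the counter is no longer "unused".
pub fn process_counted() -> u64 {
    let handles: Vec<_> = (0..10)
        .map(|_| {
            thread::spawn(|| {
                let mut x: u64 = 0;
                for _ in 0..5_000_000 {
                    x += 1;
                }
                x // the closure's return value becomes the join result
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("Could not join a thread!"))
        .sum()
}
```

`join()` hands back whatever the thread’s closure returned, which is what
makes the final `sum()` possible.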

Right now, however, this is a Rust library, and it doesn’t expose anything
that’s callable from C. If we tried to hook this up to another language right
now, it wouldn’t work. We only need to make two small changes to fix this,
though. The first is to modify the beginning of our code:

```rust,ignore
#[no_mangle]
pub extern fn process() {
```

We have to add a new attribute, `no_mangle`. When you create a Rust library,
the compiler changes the name of the function in the compiled output. The
reasons for this are outside the scope of this tutorial, but in order for other
languages to know how to call the function, we need to not do that. This
attribute turns that behavior off.

The other part of this change is the `pub extern`. The `pub` means that this
function should be callable from outside of this module, and the `extern` says
that it should be able to be called from C. That’s it! Not a whole lot of change.
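
The same two ingredients work for functions that take arguments and return a
value, provided the types have an obvious C equivalent. The sketch below is a
variation under stated assumptions, not part of this chapter’s library:
`count_to` is a hypothetical name, `u64` is assumed to correspond to C’s
`uint64_t`, and the ABI is spelled out as `extern "C"`, which is also what a
bare `extern` means.

```rust
/// Counts from zero up to `limit` and returns the final count.
///
/// `#[no_mangle]` keeps the symbol name as `count_to` in the compiled
/// library. On the 2024 edition the attribute is written
/// `#[unsafe(no_mangle)]`.
#[no_mangle]
pub extern "C" fn count_to(limit: u64) -> u64 {
    let mut x: u64 = 0;
    for _ in 0..limit {
        x += 1;
    }
    x
}
```

A caller in another language would then need to declare one 64-bit unsigned
argument and a 64-bit unsigned return type when binding this function.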

The second thing we need to do is to change a setting in our `Cargo.toml`. Add
this at the bottom:

```toml
[lib]
name = "embed"
crate-type = ["dylib"]
```

This tells Rust that we want to compile our library into a standard dynamic
library. By default, Rust compiles into an ‘rlib’, a Rust-specific format.

Let’s build the project now:

```bash
$ cargo build --release
Compiling embed v0.1.0 (file:///home/steve/src/embed)
```

We’ve chosen `cargo build --release`, which builds with optimizations on. We
want this to be as fast as possible! You can find the output of the library in
`target/release`:

```bash
$ ls target/release/
build deps examples libembed.so native
```

That `libembed.so` is our ‘shared object’ library. We can use this file
just like any shared object library written in C! As an aside, this may be
`embed.dll` or `libembed.dylib`, depending on the platform.
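
The timings quoted in this chapter are wall-clock figures from the author’s
machine. If you want to measure the Rust side in isolation on your own
hardware, a small helper along these lines is one option. This is only a
sketch: `time_it` is a hypothetical name and is not part of the library we
just built.

```rust
use std::time::{Duration, Instant};

/// Runs `work` once and returns how long it took (wall-clock time).
pub fn time_it<F: FnOnce()>(work: F) -> Duration {
    let start = Instant::now();
    work();
    start.elapsed()
}
```

Inside the same crate, `time_it(|| process())` would time the function from
this chapter. Treat very small numbers with care: in a release build the
optimizer may remove a loop whose result is never used.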

Now that we’ve got our Rust library built, let’s use it from Ruby.

# Ruby

Open up an `embed.rb` file inside of our project, and do this:

```ruby
require 'ffi'

module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
  attach_function :process, [], :void
end

Hello.process

puts "done!"
```

Before we can run this, we need to install the `ffi` gem:

```bash
$ gem install ffi # this may need sudo
Fetching: ffi-1.9.8.gem (100%)
Building native extensions. This could take a while...
Successfully installed ffi-1.9.8
Parsing documentation for ffi-1.9.8
Installing ri documentation for ffi-1.9.8
Done installing documentation for ffi after 0 seconds
1 gem installed
```

And finally, we can try running it:

```bash
$ ruby embed.rb
done!
$
```

Whoa, that was fast! On my system, this took `0.086` seconds, rather than
the two seconds the pure Ruby version took. Let’s break down this Ruby
code:

```ruby
require 'ffi'
```

We first need to require the `ffi` gem. This lets us interface with our
Rust library like a C library.

```ruby
module Hello
  extend FFI::Library
  ffi_lib 'target/release/libembed.so'
```

The `ffi` gem’s authors recommend using a module to scope the functions
we’ll import from the shared library. Inside, we `extend` the necessary
`FFI::Library` module, and then call `ffi_lib` to load up our shared
object library. We just pass it the path where our library is stored,
which, as we saw before, is `target/release/libembed.so`.

```ruby
  attach_function :process, [], :void
```

The `attach_function` method is provided by the FFI gem. It’s what
connects our `process()` function in Rust to a Ruby function of the
same name. Since `process()` takes no arguments, the second parameter
is an empty array, and since it returns nothing, we pass `:void` as
the final argument.

```ruby
Hello.process
```

This is the actual call into Rust. The combination of our `module`
and the call to `attach_function` sets this all up. It looks like
a Ruby function, but is actually Rust!

```ruby
puts "done!"
```

Finally, as per our project’s requirements, we print out `done!`.

That’s it! As we’ve seen, bridging between the two languages is really easy,
and buys us a lot of performance.

Next, let’s try Python!

# Python

Create an `embed.py` file in this directory, and put this in it:

```python
from ctypes import cdll
lib = cdll.LoadLibrary("target/release/libembed.so")
lib.process()
print("done!")
```

Even easier! We use `cdll` from the `ctypes` module. A quick call
to `LoadLibrary` later, and we can call `process()`.

On my system, this takes `0.017` seconds. Speedy!

# Node.js

Node isn’t a language, but it’s currently the dominant implementation of
server-side JavaScript.

In order to do FFI with Node, we first need to install the library:

```bash
$ npm install ffi
```

After that installs, we can use it:

```javascript
var ffi = require('ffi');
var lib = ffi.Library('target/release/libembed', {
  'process': [ 'void', [] ]
});
lib.process();
console.log("done!");
```

It looks more like the Ruby example than the Python example. We use
the `ffi` module to get access to `ffi.Library()`, which loads up
our shared object. We need to annotate the return type and argument
types of the function, which are `'void'` for the return type, and an
empty array to signify no arguments. From there, we just call it and
print `done!`.

On my system, this takes a quick `0.092` seconds.

# Conclusion

As you can see, the basics of doing this are _very_ easy. Of course,
there's a lot more that we could do here. Check out the [FFI][ffi]
chapter for more details.