Manually linking Rust binaries to support out-of-tree LLVM passes

Figure: Checking some results in Ghidra

Recently a project came up that had some fairly unique requirements. First off, it’s a solid candidate for Rust. Second, because of some, erm, strange deployment situations… we needed a toolchain that supported arbitrary out-of-tree LLVM passes. So this work grew out of a viability test. Suffice it to say, I am not a Rust expert by any means and there might even be a better way of doing this. That being said, it’s also a fun tour through LLVM internals and other systems concepts. I tried to write this in a way that captures the exploratory process, so even if you aren’t familiar with some of these lower-level concepts you can come along for the ride and hopefully get something out of it anyway.

If you’re not familiar with it, LLVM is a massive project that’s been around for 20 years for building modular compiler infrastructure. It does a lot, so I’m going to refer you to the wiki and project page if you want to know more. What you need to know for this article is that LLVM is split into frontends and backends. Frontends (like rustc) generate LLVM IR (intermediate representation) so that a backend can generate machine code. Along the way you’ll also see LLVM bitcode (a binary encoding of the same IR) and object code. Passes are functional units that can read IR and potentially mutate it. Think code transformation and optimization. In fact, most of the compiler’s heavy lifting occurs in the form of passes transforming and optimizing IR before writing out the results.

So how do we add a custom pass to run on a Rust binary when cargo and rustc do all the magic for us?

Bill of Materials

This article was written with the following versions in mind:

Rust stable-x86_64-apple-darwin
rustc 1.33.0 (2aa4c46cf 2019-02-28) (installed via rustup)
llvm: stable 8.0.0 (installed via Homebrew)

A simple binary

Let’s take the following Rust program (in main.rs):

It does nothing fancy; it literally just says hello.
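The listing itself is not reproduced in this copy, so here is a minimal stand-in. It is an assumption on my part, chosen only to match the "Hello!" output and the `_print` symbol that show up later; the `greeting` helper is a hypothetical name, not something from the original program.

```rust
// main.rs: a stand-in for the program described above (assumed, not the original listing).

/// The text the program prints; matches the "Hello!" output shown later on.
fn greeting() -> &'static str {
    "Hello!"
}

fn main() {
    // println! expands to a call into std::io::_print, which is one of the
    // symbols the linker will ask for once we start linking by hand.
    println!("{}", greeting());
}
```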

Normally we would compile this with:

rustc main.rs

But since we’re using the standard rustup toolchain, it doesn’t know about our out-of-tree LLVM passes held in a shared object file elsewhere. So instead of compiling all the way to a finished program, we’re going to pass an extra flag to get LLVM IR instead of a binary:

rustc --emit=llvm-ir main.rs

Now we’ve got a new file named main.ll, which is main in LLVM’s textual IR format. If you aren’t used to seeing IR you can look through this file and see that it’s sort of like fancy assembly. For now, let’s take it the rest of the way to being a complete program:

LLVM_HOME=/usr/local/Cellar/llvm/8.0.0/

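# Run the IR through the LLVM assembler to generate bitcode. Generates main.bc
$LLVM_HOME/bin/llvm-as main.ll

# opt is the key addition. It takes in IR or BC, runs it through the loaded passes, and writes out the mutated BC.
# Note: -load only loads the plugin; also add the command-line flag your pass registers so that it actually runs.
$LLVM_HOME/bin/opt -load ~/PATH_TO_PASS.dylib -o main.bc main.bc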

# Use LLVM's static compiler to transform the bitcode into object code. Generates main.o  
$LLVM_HOME/bin/llc -filetype=obj main.bc

# Lastly, run the object code through clang so we can complete the linking phase and have a complete binary. (Except this won't work yet...)  
$LLVM_HOME/bin/clang -m64 main.o

Linking?!

If we try to run that last command as-is, we get:

+ /usr/local/Cellar/llvm/8.0.0//bin/clang -m64 main.o  
Undefined symbols for architecture x86_64:  
  "std::io::stdio::_print::hdec9324a4622df1e", referenced from:  
      main::main::hfe98083a4c87500f in main.o  
  "std::rt::lang_start_internal::h3dc68cf5532522d7", referenced from:  
      std::rt::lang_start::h149f34af029e1c5f in main.o  
  "_rust_eh_personality", referenced from:  
      Dwarf Exception Unwind Info (__eh_frame) in main.o  
ld: symbol(s) not found for architecture x86_64  
clang-8: error: linker command failed with exit code 1 (use -v to see invocation)

Which makes sense. The linker has no idea where to locate those symbols because we haven’t told it. I’m not going to go too far into the weeds on this, but you can use nm to print the symbol table of various object files. Here’s the output for main.o. The capital U in the output means ‘undefined’, and linking won’t complete until that symbol is found.

$ nm main.o | grep _rust  
                 U _rust_eh_personality  
$ nm main.o | grep _print  
                 U __ZN3std2io5stdio6_print17hdec9324a4622df1eE
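Notice that nm shows the raw mangled name while the linker error printed a readable path. As a sketch of how the two relate, here is a tiny decoder for the legacy `_ZN<len><name>...E` scheme visible above. It is deliberately incomplete (it ignores escape sequences such as `$LT$`), and `demangle_legacy` is a hypothetical helper name, not part of any toolchain.

```rust
/// Decode a legacy-mangled Rust symbol (`_ZN<len><ident>...E`) into a `::` path.
/// Returns None for anything that does not follow that shape.
fn demangle_legacy(symbol: &str) -> Option<String> {
    // Mach-O prepends one extra underscore to every symbol, so accept both spellings.
    let body = symbol
        .strip_prefix("__ZN")
        .or_else(|| symbol.strip_prefix("_ZN"))?
        .strip_suffix('E')?;

    let mut parts = Vec::new();
    let mut rest = body;
    while !rest.is_empty() {
        // Each path segment is a decimal length followed by that many bytes.
        let digits = rest.chars().take_while(|c| c.is_ascii_digit()).count();
        let len: usize = rest[..digits].parse().ok()?;
        let end = digits.checked_add(len)?;
        parts.push(rest.get(digits..end)?);
        rest = &rest[end..];
    }

    if parts.is_empty() {
        None
    } else {
        Some(parts.join("::"))
    }
}

fn main() {
    let mangled = "__ZN3std2io5stdio6_print17hdec9324a4622df1eE";
    match demangle_legacy(mangled) {
        Some(path) => println!("{} -> {}", mangled, path),
        None => println!("{} is not a legacy-mangled Rust symbol", mangled),
    }
}
```

Symbols like `_rust_eh_personality` are not mangled at all, which is why they fall through to `None` here.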

Let’s try one answer that sort of relies on cheating …

Locating our missing symbols

So where are these symbols? They’re sitting inside a file /Users/chris/.rustup/toolchains/stable-x86_64-apple-darwin/lib/libstd-d4fbe66ddea5f3ce.dylib.

$ nm ~/.rustup/toolchains/stable-x86_64-apple-darwin/lib/libstd-d4fbe66ddea5f3ce.dylib | grep _rust_eh_personality

0000000000033960 T _rust_eh_personality

This time, unlike an undefined ‘U’, we have a ‘T’, meaning the symbol is concretely defined in the text (code) section of this object. So let’s go back to the build directory and link against that file.

clang -m64 main.o -L /Users/chris/.rustup/toolchains/stable-x86_64-apple-darwin/lib/ -lstd-d4fbe66ddea5f3ce

If you aren’t familiar with clang’s arguments:

-L tells the linker to look in a particular directory for libraries. On macOS shared libraries are dylib files, on Linux so files, and on Windows dll files.
-l is the name of the library we’re asking to be linked (without the lib prefix or the file extension).
-m64 is just saying we’d like this compiled as 64-bit output.
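To make the -l naming rule concrete, here is a small sketch of how a library name maps to the shared library file name. `shared_library_file` is a hypothetical helper; the prefix and suffix constants come from Rust's `std::env::consts` and describe whatever platform the sketch is compiled on.

```rust
use std::env::consts::{DLL_PREFIX, DLL_SUFFIX};

/// File name of the shared library behind `-l<name>`, for a given platform
/// prefix and suffix (for example "lib" and ".dylib" on macOS).
fn shared_library_file(name: &str, prefix: &str, suffix: &str) -> String {
    format!("{}{}{}", prefix, name, suffix)
}

fn main() {
    // The name passed to -l in the clang command above.
    let name = "std-d4fbe66ddea5f3ce";
    println!("{}", shared_library_file(name, DLL_PREFIX, DLL_SUFFIX));
}
```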

Indeed, this will link! But, womp womp, it doesn’t run as-is:

$ ./a.out  
dyld: Library not loaded: @rpath/libstd-d4fbe66ddea5f3ce.dylib  
  Referenced from: /Users/chris/Development/rust_play/simple/./a.out  
  Reason: image not found  
Abort trap: 6

That’s because we told the linker to link it dynamically. That directory isn’t on the system’s normal shared library search path, so when the program starts, the dynamic loader can’t locate Rust’s libstd. We can help it out by using the LD_LIBRARY_PATH trick, but that’s cumbersome and, had we just built it with rustc, it wouldn’t be needed.

$ LD_LIBRARY_PATH=/Users/chris/.rustup/toolchains/stable-x86_64-apple-darwin/lib/ ./a.out  
Hello!

So what’s happening?

Reverse engineering what rustc is doing

Let’s take a different tack. How do Rust’s static libraries work? They’re actually just archive (ar) files called rlibs. If we look in the same directory where we found the dylib for libstd, we can find files with the same name but the rlib extension instead. Theoretically, if we provide all these rlibs (because I’m lazy and not optimizing imports), maybe the linker can resolve all these symbols statically and we can end up with a binary that works the way we expect.

$LLVM_HOME/bin/clang -m64 *.o $(find /Users/chris/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib -name '*rlib')

Essentially the spray-and-pray approach. On my system this currently adds another 84 rlibs. Note that these entries are passed to clang as plain file paths, with no prefix (no -l). Here’s the shortened linker output this time around:

"___rust_alloc", referenced from:  
      (omitted)  
      ...  
  "___rust_alloc_zeroed", referenced from:  
      (omitted)   
      ...      
  "___rust_dealloc", referenced from:  
      (omitted)  
      ...  
  "___rust_realloc", referenced from:  
      (omitted)  
      ...

If we try to link the program this way, we do find all the symbols the linker complained about the first time around, but now we have new symbols it can’t find. This also makes sense: given the way linking works, we have to be able to resolve transitive dependencies too. So we found the symbols our program needs… but now we need the symbols those symbols need…

Luckily though, we’re down to just a few and they all sound like memory handling functions. Where are they defined? If we use nm on all the rlibs in the directory, they’re undefined everywhere. If we look in the dylib files though, they’re there!

$ nm /Users/chris/.rustup/toolchains/stable-x86_64-apple-darwin/lib/rustlib/x86_64-apple-darwin/lib/libstd-d4fbe66ddea5f3ce.dylib | grep __rust_alloc  
00000000000334e0 T ___rust_alloc  
0000000000033510 T ___rust_alloc_zeroed
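The check I was doing by eye (is this symbol defined anywhere, or only referenced?) comes down to reading nm's type column. A rough sketch of that, assuming the three-column layout seen in the outputs so far; `NmEntry`, `parse_nm_line`, and `is_defined` are hypothetical names.

```rust
/// One line of `nm` output: optional address, one-letter type code, symbol name.
#[derive(Debug, PartialEq)]
struct NmEntry {
    address: Option<u64>,
    kind: char,
    name: String,
}

fn parse_nm_line(line: &str) -> Option<NmEntry> {
    let fields: Vec<&str> = line.split_whitespace().collect();
    match fields.as_slice() {
        // Undefined symbols have no address column.
        [kind, name] => Some(NmEntry {
            address: None,
            kind: kind.chars().next()?,
            name: String::from(*name),
        }),
        // Defined symbols carry a hex address; anything that is not hex becomes None.
        [addr, kind, name] => Some(NmEntry {
            address: u64::from_str_radix(addr, 16).ok(),
            kind: kind.chars().next()?,
            name: String::from(*name),
        }),
        _ => None,
    }
}

/// True when some line lists `symbol` with a type other than `U` (undefined).
fn is_defined(nm_output: &str, symbol: &str) -> bool {
    nm_output
        .lines()
        .filter_map(parse_nm_line)
        .any(|entry| entry.name == symbol && entry.kind != 'U')
}

fn main() {
    let dylib = "00000000000334e0 T ___rust_alloc\n0000000000033510 T ___rust_alloc_zeroed\n";
    let object = "                 U ___rust_alloc\n";
    println!("defined in dylib:  {}", is_defined(dylib, "___rust_alloc"));
    println!("defined in object: {}", is_defined(object, "___rust_alloc"));
}
```

Feeding it the output of nm for each rlib in turn would be one way to automate the search.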

So rustc must be doing something special. I had to do some spelunking around rustc’s code… and it turns out the magic is hidden inside librustc_codegen_llvm and librustc_codegen_ssa. They use LLVM’s code generation capability to create an allocator shim that handles the situation where you want a standalone binary. So let’s find a way to get rustc to generate the allocator shim and we can borrow it :-).

The magic turns out to be:

rustc -C save-temps --emit=llvm-ir main.rs

Turns out that normally rustc wants to be a good citizen and clean up temp files, except that we really need the temp files because they include the allocator shim! This incantation will generate an extra bitcode file in the working directory that indeed has all the allocator symbols (names will vary, and you’ll also note that there isn’t a concrete offset yet because this isn’t object code):

$ nm main.4s37gsrti678ik8u.rcgu.bc  
                 U ___rdl_alloc  
                 U ___rdl_alloc_zeroed  
                 U ___rdl_dealloc  
                 U ___rdl_realloc  
---------------- T ___rust_alloc  
---------------- T ___rust_alloc_zeroed  
---------------- T ___rust_dealloc  
---------------- T ___rust_realloc
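The shape of that output (the `__rust_*` entry points are defined, the `__rdl_*` ones they lean on are not) suggests the shim is a thin forwarding layer. The same idea can be sketched from the source side with Rust's `GlobalAlloc` trait. This is only an analogy under my own assumptions, not the code rustc generates; `Forwarding` and `allocations_during` are hypothetical names.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Number of allocations that passed through the forwarding layer.
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

/// Does no allocating itself: it counts the request and hands it to the
/// system allocator, the way a shim hands requests to the real implementation.
struct Forwarding;

unsafe impl GlobalAlloc for Forwarding {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: Forwarding = Forwarding;

/// Run `work` and report how many allocations were forwarded meanwhile.
fn allocations_during<F: FnOnce()>(work: F) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    work();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}

fn main() {
    let count = allocations_during(|| {
        let boxed = Box::new(42u64);
        // Keep the optimizer from removing the allocation entirely.
        black_box(&boxed);
    });
    println!("heap allocations while boxing one value: {}", count);
}
```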

Here’s the full build script

This indeed will work as expected!

$ ./a.out  
Hello!

Phew.
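For reference, here is the sequence of steps the walkthrough has accumulated, written out as a command plan. This is my reconstruction under assumptions, not the original build script: the `shim.o` output name, the `build_plan` and `run` helpers, and the idea of compiling the shim with llc are all hypothetical, and the pass path is the same placeholder used earlier. Every flag is one that already appears above, apart from llc's `-o`.

```rust
use std::process::Command;

/// One external command: the program followed by its arguments.
type Step = Vec<String>;

/// Build the list of commands; nothing is executed here.
fn build_plan(llvm_home: &str, pass_lib: &str, shim_bc: &str, rlibs: &[String]) -> Vec<Step> {
    let tool = |name: &str| format!("{}/bin/{}", llvm_home, name);
    let s = |text: &str| text.to_string();

    // Final link: our object, the allocator shim object, then every rlib.
    let mut link = vec![tool("clang"), s("-m64"), s("main.o"), s("shim.o")];
    link.extend(rlibs.iter().cloned());

    vec![
        // Emit IR and keep the temp files, which include the allocator shim.
        vec![s("rustc"), s("-C"), s("save-temps"), s("--emit=llvm-ir"), s("main.rs")],
        vec![tool("llvm-as"), s("main.ll")],
        // Add the flag your pass registers here; -load alone only loads the plugin.
        vec![tool("opt"), s("-load"), s(pass_lib), s("-o"), s("main.bc"), s("main.bc")],
        vec![tool("llc"), s("-filetype=obj"), s("main.bc")],
        // Assumed step: turn the shim bitcode into an object file as well.
        vec![tool("llc"), s("-filetype=obj"), s("-o"), s("shim.o"), s(shim_bc)],
        link,
    ]
}

/// Execute one step and report whether it exited successfully.
fn run(step: &Step) -> std::io::Result<bool> {
    let status = Command::new(&step[0]).args(&step[1..]).status()?;
    Ok(status.success())
}

fn main() -> std::io::Result<()> {
    let plan = build_plan(
        "/usr/local/Cellar/llvm/8.0.0",
        "PATH_TO_PASS.dylib",            // placeholder, as in the article
        "main.4s37gsrti678ik8u.rcgu.bc", // the shim's name will vary
        &[],                             // fill in with the rlib paths
    );
    // Print the plan by default; only execute when asked to with --run.
    let execute = std::env::args().any(|arg| arg == "--run");
    for step in &plan {
        println!("{}", step.join(" "));
        if execute && !run(step)? {
            eprintln!("step failed, stopping");
            break;
        }
    }
    Ok(())
}
```

Printing the plan by default keeps the sketch harmless on a machine without the toolchain installed.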

A more complex binary

For the sake of completeness let’s add a slightly more complicated case where we use cargo and have multiple Rust-level dependencies.

Here’s a program with some dependencies on logging and periodic tasks.

Instead of calling rustc directly we add some extra arguments to cargo to make sure it passes flags along to rustc.

cargo rustc --verbose -- -v -C save-temps --emit=llvm-ir

Cargo puts everything of interest in target/debug/deps including the allocator shim we discovered above. This directory will also include rlib copies of all the dependency crates which makes linking a snap.
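Gathering those rlibs is the same job the find command did earlier. A sketch of it in Rust, assuming cargo's default target/debug/deps layout; `collect_rlibs` is a hypothetical helper that matches on the file extension, which is roughly what `-name '*rlib'` does.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every file with the `rlib` extension under `dir`.
fn collect_rlibs(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            found.extend(collect_rlibs(&path)?);
        } else if path.extension().and_then(|ext| ext.to_str()) == Some("rlib") {
            found.push(path);
        }
    }
    // Sort so the resulting link line is stable from run to run.
    found.sort();
    Ok(found)
}

fn main() {
    // Directory to scan: first argument, or cargo's default debug deps directory.
    let dir = std::env::args()
        .nth(1)
        .unwrap_or_else(|| String::from("target/debug/deps"));
    match collect_rlibs(Path::new(&dir)) {
        Ok(rlibs) => {
            for rlib in rlibs {
                println!("{}", rlib.display());
            }
        }
        Err(err) => eprintln!("could not scan {}: {}", dir, err),
    }
}
```

The printed paths can be appended to the clang link line as-is, since rlibs are passed as plain files rather than with -l.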

Here’s the slightly modified build script that will let you build a more complicated binary:

With final output:

$ ./complicated_app  
2019-04-13 15:25:14 INFO  [complicated] Starting Agent!  
2019-04-13 15:25:15 INFO  [complicated] Periodic task

And with that you should be able to build just about any Rust binary with custom LLVM passes without having to rebuild them directly into Rust’s LLVM toolchain!