provides versioned backends for program units symbol tables
TL;DR; follow mainstream OCaml and make dynamic loading sound and safe
(not again, as it never was before).

The Problem
===========

The long story. Before OCaml 4.08, the Dynlink module was unsound:
linking the same module twice corrupted the GC roots table and led to
segmentation faults. To prevent this, we tracked the loaded units.
However, we couldn't reliably track units that were linked into the
program directly by the OCaml linker. We relied on `findlib.dynload`,
and on the corresponding functionality in dune, to tell us which units
were used to link the host program. Unfortunately, `findlib.dynload`
records this information in terms of findlib packages, not
compilation units. Therefore, to resolve a package name to the
corresponding compilation unit names we needed a working findlib
system, i.e., the META files for all packages that are statically
linked into the binary had to be present in the hard-coded locations.
In the normal mode of operation, when the packages that were used to
build bap are present in the file system, this posed no problems.
However, when bap together with its plugins was packed into a Debian
package and distributed to other machines, no META files were
available. To enable binary distributions, we developed an ocamlbuild
plugin that resolved package names to compilation unit names at
compile time, so that bap (and other tools built from our main source
repository) didn't depend on the runtime presence of META files.
However, when a host program is compiled with dune, or with any other
build system that doesn't map package names to unit names and store
them in the host file predicates (read: anywhere outside of bap), the
program will fail at runtime once it is packaged and distributed to
other hosts.
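
The unit tracking described above can be illustrated with a minimal
sketch. It is not the code from this commit: the names `units` and
`load_once` are hypothetical, only the standard library is used, and a
stub function stands in for the real dynamic loader.

```ocaml
(* Hypothetical sketch: a table of already linked compilation units
   guards the loader, so that no unit is ever linked twice. *)
let units : (string, [ `In_core | `Requested_by of string ]) Hashtbl.t =
  Hashtbl.create 16

(* [load_once ~load ~reason name] calls [load name] only if [name] has
   not been recorded yet; it returns [true] if the unit was loaded. *)
let load_once ~load ~reason name =
  if Hashtbl.mem units name then false
  else begin
    load name;
    Hashtbl.replace units name reason;
    true
  end

let () =
  (* Units that are linked statically into the host are registered first. *)
  Hashtbl.replace units "core_kernel" `In_core;
  let loaded = ref [] in
  let load name = loaded := name :: !loaded in (* stub for the real loader *)
  let reason = `Requested_by "demo" in
  assert (not (load_once ~load ~reason "core_kernel"));
  assert (load_once ~load ~reason "my_plugin");
  assert (not (load_once ~load ~reason "my_plugin"));
  assert (!loaded = [ "my_plugin" ])
```

The hard part that the text describes is the first step: filling the
table with the units of the host program, which is exactly where the
package-name to unit-name mapping is needed.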

The Solution
============

Since the bug is fixed in 4.08, this is no longer a big problem. There
is still some impedance mismatch between the names of the libraries
(cmxs or cma) that we load and the names of the units that comprise
them. We therefore can't know beforehand whether a library that we
load is already linked into the main program, because we can only
query Dynlink for the names of the compilation units, not for the
names of the libraries that were used during the linking procedure.
We address this problem by inspecting the error code: if it is
`Dynlink.Module_already_loaded _`, then instead of failing we record
the library that loads this module in our repository.
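
The error-handling idea can be sketched in isolation. The sketch below
is an assumption-laden stand-in, not the committed code: the `error`
type only mimics the one relevant constructor of `Dynlink.error`, the
table is a plain `Hashtbl`, and the unit names in the usage part are
made up.

```ocaml
(* Stand-in for the one [Dynlink.error] constructor that matters here;
   the real type has many more cases. *)
type error =
  | Module_already_loaded of string
  | Other of string

type reason = [ `In_core | `Provided_by of string | `Requested_by of string ]

(* Units known to be linked, together with the reason they are there. *)
let units : (string, reason) Hashtbl.t = Hashtbl.create 16

(* An "already loaded" error means that the unit is part of the host
   program: record it and report success. Other errors are passed on. *)
let handle_error name reason = function
  | Module_already_loaded _ ->
    Hashtbl.replace units name reason;
    Ok ()
  | Other msg -> Error msg

let () =
  assert (handle_error "bap_main" (`Provided_by "bap-main")
            (Module_already_loaded "Bap_main") = Ok ());
  assert (Hashtbl.find_opt units "bap_main" = Some (`Provided_by "bap-main"));
  assert (handle_error "broken" `In_core (Other "no such file")
          = Error "no such file")
```

Recording the unit on this path keeps the table consistent with what
the runtime already knows, so later lookups do not try to load the
same library again.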

The only problem with this solution is that it is probably too early
for us to drop support for OCaml 4.07. Therefore, we decided to
provide two backends. The fallback solution still uses the old findlib
approach, but we made it a little more robust to minimize the
debugging time in case it fails. We now check whether we have static
information about the compilation units that comprise the host
program, and if we don't, we ensure that we have a working ocamlfind
and META files. If not, we terminate the program with a more or less
comprehensible message. If we have a modern compiler, then we just use
the Dynlink module (which is guarded with ppx_optcomp).
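
The version rule behind the two backends can be shown with a small
runtime sketch. Note the assumption: the commit makes this choice at
compile time with ppx_optcomp, whereas the hypothetical helper
`backend_for` below only mirrors the same "older than 4.08" test on a
version string.

```ocaml
(* Hypothetical helper: choose a backend name from a compiler version
   string such as [Sys.ocaml_version]. Unparsable input falls back to
   the conservative findlib backend. *)
let backend_for version =
  match Scanf.sscanf version "%d.%d" (fun major minor -> (major, minor)) with
  | (major, minor) when (major, minor) < (4, 8) -> "findlib"
  | _ -> "dynlink"
  | exception _ -> "findlib"

let () =
  assert (backend_for "4.07.1" = "findlib");
  assert (backend_for "4.08.0" = "dynlink");
  assert (backend_for "4.09.0+rc1" = "dynlink");
  assert (backend_for "not-a-version" = "findlib");
  Printf.printf "this compiler would use: %s\n" (backend_for Sys.ocaml_version)
```

Doing the selection at compile time, as the commit does, has the
advantage that code referring to the newer Dynlink API is never
compiled on 4.07.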
ivg committed Jan 27, 2020
1 parent c2f799e commit 56d6507
Showing 11 changed files with 132 additions and 53 deletions.
10 changes: 5 additions & 5 deletions .travis.yml
@@ -37,9 +37,9 @@ cache:
 - $HOME/save_opam

 env:
-- OCAML_VERSION=4.07 WITH_BUILD_CACHE=true
+- OCAML_VERSION=4.07
 - OCAML_VERSION=4.08
-- OCAML_VERSION=4.09
+- OCAML_VERSION=4.09 WITH_BUILD_CACHE=true

 stage: Compile
 script: bash -ex .travis_install.sh
@@ -48,13 +48,13 @@ jobs:
 include:
 - stage: Unit tests, checks and bil tests
 env:
-- OCAML_VERSION=4.07 WITH_BUILD_CACHE=true
+- OCAML_VERSION=4.09 WITH_BUILD_CACHE=true
 script: bash -ex .run_travis_tests.sh unit_tests
 - stage: Unit tests, checks and bil tests
 env:
-- OCAML_VERSION=4.07 WITH_BUILD_CACHE=true
+- OCAML_VERSION=4.09 WITH_BUILD_CACHE=true
 script: bash -ex .run_travis_tests.sh checks
 - stage: Unit tests, checks and bil tests
 env:
-- OCAML_VERSION=4.07 WITH_BUILD_CACHE=true
+- OCAML_VERSION=4.09 WITH_BUILD_CACHE=true
 script: bash -ex .run_travis_tests.sh veri
15 changes: 3 additions & 12 deletions lib/bap_plugins/bap_plugins.ml
@@ -3,7 +3,7 @@ open Bap_bundle.Std
 open Bap_future.Std
 open Or_error.Monad_infix

-module Units = Bap_plugins_config.Units
+module Units = Bap_plugins_units
 module Filename = Caml.Filename

 module Plugin = struct
@@ -39,14 +39,6 @@ module Plugin = struct
 let setup_dynamic_loader loader =
 load := loader

-(* this function requires working Findlib infrastructure. *)
-let unit_of_package pkg =
-let preds = Findlib.recorded_predicates () in
-try
-Findlib.package_property preds pkg "archive" |>
-Filename.chop_extension
-with _ -> pkg (* fails if the infrastructure is broken *)

 let init = lazy (Units.init ())

 let of_path path =
@@ -79,19 +71,18 @@ module Plugin = struct
 try Some (find_library_exn name) with _ -> None

 let load_unit ?(don't_register=false) ~reason ~name pkg : unit or_error =
-let open Format in
 try
 notify (`Linking name);
 !load pkg;
 if not (don't_register)
 then Units.record name reason;
 Ok ()
 with
-| Dynlink.Error err ->
-Or_error.error_string (Dynlink.error_message err)
+| Dynlink.Error err -> Units.handle_error name reason err
 | exn -> Or_error.of_exn exn



 let is_debugging () =
 try String.(Sys.getenv "BAP_DEBUG" <> "0") with Caml.Not_found -> false

2 changes: 0 additions & 2 deletions lib/bap_plugins/bap_plugins_config.ml.ab
@@ -1,3 +1 @@
 let plugindir = "$plugindir"
-
-module Units = Bap_plugins_units_$plugins_backend
26 changes: 26 additions & 0 deletions lib/bap_plugins/bap_plugins_units.ml
@@ -0,0 +1,26 @@
+open Core_kernel
+
+open Bap_plugins_units_intf
+
+[%%if ocaml_version < (4,08,0)]
+include Bap_plugins_units_fallback
+[%%else]
+let name = "dynlink"
+
+let units : reason String.Table.t = String.Table.create ()
+
+let copy_units_from_dynlink () =
+Dynlink.all_units () |>
+List.iter ~f:(fun unit -> Hashtbl.add_exn units unit `In_core)
+
+let init () = copy_units_from_dynlink ()
+let list () = Hashtbl.keys units
+let record name reason = Hashtbl.add_exn units name reason
+let lookup = Hashtbl.find units
+let handle_error name reason = function
+| Dynlink.Module_already_loaded _ ->
+Hashtbl.set units name reason;
+Ok ()
+| other ->
+Or_error.error_string (Dynlink.error_message other)
+[%%endif]
14 changes: 1 addition & 13 deletions lib/bap_plugins/bap_plugins_units.mli
@@ -1,13 +1 @@
-type reason = [
-| `In_core
-| `Provided_by of string
-| `Requested_by of string
-]
-
-
-module type S = sig
-val init : unit Lazy.t
-
-val record : string -> reason -> unit
-val lookup : string -> reason option
-end
+include Bap_plugins_units_intf.S
@@ -2,19 +2,32 @@ open Bap_plugins_units_intf

open Core_kernel


let name = "findlib"
let units : reason String.Table.t = String.Table.create ()

let has_meta_information pkg =
try ignore (Findlib.package_meta_file pkg : string); true
with Findlib.No_such_package _ -> false

let report_missing_meta pkg =
failwithf
"Package %s was used to build the host program, but its meta \
information is not available at runtime. To be able to use \
plugins without installed META files, either update the version \
of OCaml to 4.08 or newer or provide them at the compilation \
time via the `used_MOD` predicates for each `MOD` linked into the
host binary." pkg ()


(* this function requires working Findlib infrastructure. *)
let unit_of_package ~findlib_is_required pkg =
if findlib_is_required && not (has_meta_information pkg)
then report_missing_meta pkg;
let preds = Findlib.recorded_predicates () in
Format.eprintf "Resolving package: %s@\n%!" pkg;
try
Findlib.package_property preds pkg "archive" |>
Filename.chop_extension
with exn ->
Format.eprintf "Skipping package: %s\n%!" pkg;
pkg
with _ -> pkg


let string_of_reason = function
@@ -28,22 +41,23 @@ let extract_units_from_predicates () =
| None -> ()
| Some lib -> Hashtbl.set units ~key:lib ~data:`In_core)


let init () =
if not (Hashtbl.is_empty units)
then failwith "the plugin system is already initialized";
extract_units_from_predicates ();
let findlib_is_required = Hashtbl.is_empty units in
let extract_units_from_packages ~findlib_is_required =
Findlib.(recorded_packages Record_core) |> List.iter ~f:(fun pkg ->
Hashtbl.set units
(unit_of_package ~findlib_is_required pkg)
`In_core)

let init () =
if not (Hashtbl.is_empty units)
then failwith "the plugin system is already initialized";
extract_units_from_predicates ();
extract_units_from_packages ~findlib_is_required:(Hashtbl.is_empty units)

let record name reason =
Format.eprintf "recording %s as it %s@\n%!" name (string_of_reason reason);
Hashtbl.add_exn units name reason

let lookup = Hashtbl.find units

let list () = Hashtbl.keys units

let handle_error _ _ err = Or_error.error_string @@ Dynlink.error_message err
68 changes: 68 additions & 0 deletions lib/bap_plugins/bap_plugins_units_intf.ml
@@ -1,3 +1,39 @@
(** Internal module. *)

(** Interface for units table.

We need to maintain a set of compilation units that comprise the
program, first of all to prevent double-linking (which is possible
before 4.08 and will lead to a segfault at best) and second to
track plugin dependencies that we need and that are not
satisfied. Even after 4.08, the OCaml loader won't let us link an
already linked unit (which is not a problem, as it gives us
access to the list of loaded units).

For now we have an implementation that supports older versions of
OCaml. It relies either on the presence of the META files that
describe installed packages (we need to map package names, which
are recorded in the core file by findlib.dynload, to the unit
names), or on the presence of the `used_<UNIT>` predicates, which
are recorded by the bap build system (or any other build system
that is capable of passing predicates to ocamlfind, e.g., _not_ dune).

If META files are not available and units are not recorded in the
host file via the predicates, then we bail out with an error. This
could happen, for example, when the host program is built with
dune (or with some other build system that doesn't record units in
the predicates) and then the toolchain is erased or otherwise made
unavailable, e.g., when the program together with plugins is
packed into a Debian package. To be clear, nothing wrong will
happen with the BAP framework itself.

The other implementation, which is available for OCaml versions
4.08 and above, is totally safe and relies purely on facilities
provided by the language runtime.
*)

open Core_kernel

type reason = [
| `In_core
| `Provided_by of string
@@ -6,8 +42,40 @@ type reason = [


module type S = sig


(** the name of the selected backend.
Currently, it should be [findlib] or [dynlink], and is
selected at configuration time via `./configure --plugins-backend`.
*)
val name : string

(** initializes the unit system.
May fail if the selected backend is unable to provide safe
operation.
Could be only called once per program run.
*)
val init : unit -> unit


(** [list ()] enumerates all currently linked modules. *)
val list : unit -> string list


(** [record name reason] records unit [name] as well as the
reason, why it is linked.
pre: a unit with such name is not linked.
*)
val record : string -> reason -> unit


(** [lookup name] checks if a unit with the given [name] is linked,
and returns a reason why it was linked. *)
val lookup : string -> reason option

val handle_error : string -> reason -> Dynlink.error -> unit Or_error.t
end
Empty file.
7 changes: 3 additions & 4 deletions oasis/plugins
@@ -8,8 +8,7 @@ Library bap_plugins
 FindLibName: bap-plugins
 Modules: Bap_plugins
 InternalModules: Bap_plugins_config,
-Bap_plugins_units_modern,
-Bap_plugins_units_findlib,
+Bap_plugins_units,
+Bap_plugins_units_fallback,
 Bap_plugins_units_intf
-BuildDepends: core_kernel, dynlink, fileutils, findlib, bap-bundle, bap-future,
-uri, ppx_jane
+BuildDepends: core_kernel, dynlink, fileutils, findlib, bap-bundle, bap-future, ppx_jane
5 changes: 0 additions & 5 deletions oasis/plugins.setup.ml.in

This file was deleted.
