Guide
- Installation
- Basic usage
- Types
- Scoping
- Functions
- String Patterns
- Calling Wolfram Kernel and Inline C++
- Compilation Errors
- Working with C++
- Options to CompileToBinary
- External BLAS and LAPACK
- Caveats
- You can find the latest releases of MathCompile on the Releases page.
- Install the package by executing the command using PacletInstall.
To load and test the package, execute
<<MathCompile`
CompileToCode[Function[{Typed[x, Integer]}, x + 2]]
It should output a C++ function as a string.
- Set up a C++ compiler (see Prerequisites for C++ Compiler).
A Wolfram Language function that can be compiled should be a function with zero or more formal parameters:
Function[{<arg1>, <arg2>, ...}, <body>]
Each argument of the function should have its type specified by Typed[<name>, <type>]. The body of the function is an expression consisting of:
- compilable functions or constants,
- function arguments or local variables, and
- integral, floating-point, or string literals, such as -3.1 or "a string".
For example, the following function is effectively the native Plus function for two integers:
Function[{Typed[x, Integer], Typed[y, Integer]}, x + y]
The function CompileToCode compiles a Wolfram Language function to C++ code, e.g.
CompileToCode[Function[{Typed[x, Integer]}, x + 1]]
evaluates to a string:
auto main_function(const int64_t& v11) {
return wl::val(wl::plus(WL_PASS(v11), int64_t(1)));
}
The function CompileToBinary compiles a function to binary and loads it into the current session as a library function (it requires a C++ compiler), e.g.
addone = CompileToBinary[Function[{Typed[x, Integer]}, x + 1]]
The compiled functions can be called just like normal functions when arguments of correct types are supplied, e.g.
addone[5] (* gives 6 *)
If a compiled function needs to be recompiled, it should be unloaded first by LibraryFunctionUnload.
The following table shows the types supported by MathCompile, and how the types are represented:
Type | Width | Type specifier |
---|---|---|
Boolean | | "Boolean" |
Integer | 8-bit | "Integer8", "UnsignedInteger8" |
Integer | 16-bit | "Integer16", "UnsignedInteger16" |
Integer | 32-bit | "Integer32", "UnsignedInteger32" |
Integer | 64-bit | "Integer64", "UnsignedInteger64" |
Real | single precision | "Real32" |
Real | double precision | "Real64" |
Complex | single precision | "ComplexReal32" |
Complex | double precision | "ComplexReal64" |
String | | "String" |
Array | | {<Type>, <Rank>} |
Four commonly used types (Integer, Real, Complex, and String) are aliases of "Integer64", "Real64", "ComplexReal64", and "String", respectively.
The function Typed is used to specify the type of function arguments. For example,
Function[{Typed[x, {Integer, 2}]}, ...]
denotes a function taking a matrix (rank-2 array) of integers named x as its argument.
When an arithmetic function is applied to two numbers with distinct types, one or both of them are promoted to a different type. For addition, subtraction, and multiplication, a common type is defined for each combination of types in the table below; the numbers are converted to the common type before the operation is applied. For division, the common type of two integral types is defined to be double instead.
Types | Common type |
---|---|
integral of different widths | the wider integral |
integral of different signedness | the unsigned integral |
integral and floating-point | the floating-point |
floating-point of different widths | the wider floating-point |
integral and complex | the complex |
floating-point T and complex<T> | complex<T> |
float and complex<double> | complex<double> |
double and complex<float> | complex<double> |
complex<double> and complex<float> | complex<double> |
Other functions that implicitly depend on arithmetic operations also follow the rules of common types. For example, applying LinearSolve to two integral matrices yields a solution matrix of double-precision numbers.
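Several rows of the table above coincide with the usual arithmetic conversions of C++. The sketch below is an analogy written in plain standard C++, not MathCompile's implementation; the names `common_t` and `divide` are hypothetical and introduced only for illustration. It checks four rows of the table at compile time and shows integer division promoted to double.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical alias: the type both operands are converted to before
// addition, subtraction, or multiplication.
template <typename A, typename B>
using common_t = std::common_type_t<A, B>;

// integral of different widths -> the wider integral
static_assert(std::is_same<common_t<std::int32_t, std::int64_t>, std::int64_t>::value, "");
// integral of different signedness -> the unsigned integral
static_assert(std::is_same<common_t<std::int64_t, std::uint64_t>, std::uint64_t>::value, "");
// integral and floating-point -> the floating-point
static_assert(std::is_same<common_t<std::int64_t, double>, double>::value, "");
// floating-point of different widths -> the wider floating-point
static_assert(std::is_same<common_t<float, double>, double>::value, "");

// Hypothetical helper: division of two integers converts both to double first.
inline double divide(std::int64_t a, std::int64_t b) {
    return static_cast<double>(a) / static_cast<double>(b);
}
```

Under this sketch, `divide(7, 2)` yields 3.5 rather than the truncated 3 that native C++ integer division would give.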
Wolfram Language has two scoping mechanisms, lexical and dynamic scoping, of which MathCompile supports only lexical scoping using Module. A module creates a scope; local variables can be introduced within the scope, and they are invalidated at the end of the scope.
Note that a local variable must be initialized before use. It is recommended that local variables be initialized at the beginning of a module, either by value or using Typed with a type specifier:
Module[{x = 2}, ...] (* OK, x is integer 2 *)
Module[{y = Typed[{Real, 1}]}, ...] (* OK, y is an empty list of reals *)
Module[{z}, Sin[z]] (* error, z is used before initialization *)
The initialization of local variables can be delayed, but it must be one of the compound expressions in the module. A common cause of the delay is that the initialization of one variable depends on another. For example,
Module[{x = 3, y}, y = x^2; ...] (* OK, y is initialized *)
Module[{x = 3, y}, Sin[y = x^2]; ...] (* error, not a proper initialization *)
Delayed initialization also applies to multiple assignment. For example,
Module[{lu, p, c}, {lu, p, c} = LUDecomposition[{{1, 1}, {1, 0}}];]
The functions nested in the body of the compiled function do not need to have the types of their arguments specified. For example, the two snippets below are equivalent:
Module[{f = Function[{Typed[x, Real]}, x^2]}, f[5.5]] (* OK *)
Module[{f = #^2 &}, f[5.5]] (* OK *)
When compiled, the functions can be used as values in most cases:
Module[{f, g}, f = #^2 &; g = f; ...] (* OK *)
They can also be returned by divergent code paths; the code compiles when the types from both paths are consistent:
Module[{f = If[Pi > 3, # + 1 &, Sin[#] &]}, f[1.6]] (* OK *)
Module[{f = If[Pi > 3, # + 1 &, Sin[#] &]}, f[1]] (* error *)
Slot can be compiled, except for #0.
(#1 + #2 &)[3, 4, 5] (* OK; 5 is ignored *)
(#1 + #2 &) @@ Range[5] (* OK; 3, 4, 5 are ignored *)
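The way extra arguments are dropped can be pictured with a C++14 generic lambda whose trailing parameter pack is left unnamed. This is only a sketch in standard C++; `add_first_two` and `demo_extra_arguments` are hypothetical names, and the code is not claimed to be what MathCompile generates for Slot.

```cpp
#include <cstdint>

// Analogue of (#1 + #2 &): the first two arguments are used,
// and any further arguments are accepted but ignored.
const auto add_first_two = [](auto&& a, auto&& b, auto&&...) {
    return a + b;
};

// Analogue of (#1 + #2 &)[3, 4, 5]: the third argument has no effect.
inline std::int64_t demo_extra_arguments() {
    return add_first_two(std::int64_t(3), std::int64_t(4), std::int64_t(5));
}
```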
SlotSequence is designed to be used as the argument of variadic functions, such as List, Plus, and BitXor. Some native functions, such as Range and ArcTan, accept different numbers of arguments, but SlotSequence cannot be used with them. For example,
ArcTan[##1] & (* error *)
ArcTan[#1] & (* OK *)
ArcTan[#1, #2] & (* OK *)
Recursive functions can be defined in two steps:
- declare the type of the function in the form of {<argument types>} → <return type>;
- set the definition of the function.
For example, factorial takes an integer and returns an integer, so it can be defined as
Module[{f = Typed[{Integer} → Integer]},
f = If[# == 1, 1, # f[# - 1]] &;
f[10] (* gives 3628800 *)
]
As a guideline, a recursive function is favored in these conditions:
- the algorithm is naturally recursive and a recursive definition greatly reduces complexity;
- the depth of recursion is relatively small, especially when arrays are defined inside the function, so that stack overflow is unlikely to occur.
Otherwise, employ an iterative definition of the function instead.
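The declare-then-define pattern and its iterative alternative have a close analogue in standard C++. The sketch below is illustrative only; the function names are hypothetical, and it does not show the code MathCompile emits. Both functions assume n is at least 1, like the Wolfram Language definition of factorial given earlier.

```cpp
#include <cstdint>
#include <functional>

// Recursive form: the type of f is declared first (Integer -> Integer),
// then the definition that refers to f itself is assigned.
inline std::int64_t factorial_recursive(std::int64_t n) {
    std::function<std::int64_t(std::int64_t)> f;   // step 1: declare the type
    f = [&f](std::int64_t k) -> std::int64_t {     // step 2: set the definition
        return k == 1 ? 1 : k * f(k - 1);
    };
    return f(n);
}

// Iterative form: no call stack growth, preferable for deep recursion.
inline std::int64_t factorial_iterative(std::int64_t n) {
    std::int64_t result = 1;
    for (std::int64_t k = 2; k <= n; ++k) result *= k;
    return result;
}
```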
String patterns are the only places where pattern matching functions are supported. There are a few restrictions on string patterns and the related functions:
- Regular expressions cannot be mixed with patterns;
- RuleDelayed has the same meaning as Rule;
- The right-hand side of a Rule can only be a string pattern;
- Functions that return a list of strings or integers can only take one string as their first argument;
- Overlaps->True and Overlaps->False are the only options supported.
Internally, a string pattern needs to be compiled to a regular expression object before being used. In order to avoid compiling the same string pattern multiple times, you can explicitly create a compiled string expression by assigning it to a variable. For example,
Module[{p = "a"~~_~~"b"}, ...]
compiles the pattern p once, and it can be used multiple times in the Module.
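The benefit of compiling a pattern once can be sketched with `std::regex` from the C++ standard library. This is a stand-in chosen to keep the example self-contained: MathCompile itself relies on PCRE2, and `count_matches` is a hypothetical name. The regular expression `a.b` only approximates "a"~~_~~"b", because `.` does not match line terminators in the default ECMAScript grammar.

```cpp
#include <regex>
#include <string>
#include <vector>

// Counts the inputs that match the pattern as a whole string.
// The regex object is constructed once and reused for every input.
inline int count_matches(const std::vector<std::string>& inputs) {
    static const std::regex pattern("a.b");  // compiled on first call only
    int count = 0;
    for (const auto& s : inputs) {
        if (std::regex_match(s, pattern)) ++count;
    }
    return count;
}
```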
Functions that need to be executed in Wolfram Language are represented by Extern[<function>,<return type>].
For example,
CompileToBinary@Function[{Typed[x, {Real, 1}]},
Extern[AiryAi, Typed[{Real, 1}]][x]
]
calls the function AiryAi in the Wolfram Kernel, and the return type is Typed[{Real, 1}].
Another usage of this functionality is to pass variables to the function after being compiled and loaded:
f = CompileToBinary@Function[{Typed[x, Real]},
x + Extern[gety, Typed[Real]][]
];
gety[] := 5;
f[10] (* returns 15 *)
gety[] := 8;
f[10] (* returns 18 *)
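The behavior above, where a later redefinition of gety changes the result of an already compiled function, resembles a callback that is looked up on every call. The following standard C++ sketch is an analogy only: `gety` here is an ordinary `std::function` standing in for the kernel-side definition, and `compiled_f` is a hypothetical name. It does not show how MathCompile communicates with the kernel.

```cpp
#include <functional>

// Hypothetical stand-in for the kernel-side definition gety[] := 5.
static std::function<double()> gety = [] { return 5.0; };

// Analogue of the compiled function: gety is invoked at call time,
// so replacing it later changes the result without recompiling.
inline double compiled_f(double x) {
    return x + gety();
}
```

Reassigning `gety` to return 8.0 makes `compiled_f(10.0)` go from 15 to 18, mirroring the two calls to f[10].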
Within a function to be compiled, a piece of C++ is represented by CXX["<C++ code>"].
For example,
CompileToCode@Function[{Typed[x, Real]}, CXX["std::trunc"][x]]
calls the function std::trunc in C++.
To access variables, wrap the names of the variables in pairs of backticks. For example,
CompileToCode@Function[{Typed[n, Integer]},
Module[{x = 1}, CXX["`x` += 5"]; x]
]
compiles to
auto main_function(const int64_t& v38) {
return wl::val([&] {
auto v37 = wl::val(int64_t(1));
v37 += 5;
return wl::val(v37);
}());
}
If the inlined C++ code depends on external header files or libraries, you can specify them in the options "Includes" and "Libraries" when calling CompileToBinary.
The compilation errors are divided into two categories:
- syntactic and semantic errors, issued in Wolfram Language → C++ stage;
- overload resolution and type errors, issued in C++ → binary stage.
CompileToCode[Function[{}, Table[i + j]]];
syntax::bad: Table[Plus[i,j]] does not have a correct syntax for Table.
Explanation: Functions with iterators such as Table and Sum require a certain syntax, including how to specify the iterators.
CompileToCode[Function[{}, Range[5] += 1]];
syntax::bad: AddTo[Range[5],1] does not have a correct syntax for AddTo.
Explanation: A function that changes any of its arguments requires that argument to be modifiable, but Range[5] is not.
CompileToCode[Function[{}, Module[{x}, x + 1]]];
semantics::noinit: Variable x is declared but not initialized.
semantics::badref: Variable x is referenced before initialization.
Explanation: A variable must be initialized in order to be used. In this case, an initialization x = 0 will fix the issue.
CompileToCode[Function[{x}, Dot[x, x]]];
codegen::notype: One or more arguments of the main function is declared without types.
Explanation: The argument types of the main function must be specified. In this case, if the input to this function is a list of real numbers, declare {Typed[x, {Real, 1}]} instead of {x}.
Errors in the C++ → binary stage are given by the C++ compiler, and they are formatted and forwarded by MathCompile. Be aware that the error message is always accurate, but the location of the error is estimated and can be inaccurate.
CompileToBinary[Function[{}, Module[{x = 5}, x = Sin[x]]]];
cxx::error:
...[List[],Module[List[Set[x,5]],Set[x,Sin[x]]]]...
∧
The type of the source cannot be converted to that of the target.
Explanation: Each variable can only have one type. Sin[x] gives a real number, and it cannot be assigned back to x.
CompileToBinary[Function[{}, Range[10][[;; 2.5]]]];
cxx::error:
..unction[List[],Part[Range[10],Span[1,2.5]]]...
∧
The arguments should be integers.
Explanation: Span specifications can only be integers.
CompileToBinary[Function[{}, Sin[1, 2, 3]]];
cxx::error:
...Function[List[],Sin[1,2,3]]...
∧
no matching function for call to 'sin(int64_t, int64_t, int64_t)'
Explanation: Each function has its requirements for arguments (Sin can only take one numerical argument). When a function is called with inappropriate argument types, the compiler is unable to resolve it.
CompileToBinary[Function[{}, Nest[List, {1}, 3]]];
cxx::error:
...Function[List[],Nest[List,List[1],3]]...
∧
The type should be consistent when the function is applied repeatedly.
Explanation: The return type of a function is determined at compile time. Applying List repeatedly to a variable does not give a consistent type.
Since MathCompile implements all supported functions in C++, the generated code can be used with other C++ code without the Wolfram runtime libraries or a Mathematica installation.
The implementation of functions comes with MathCompile as a header-only library, which is located in MathCompile/IncludeFiles. You need to download the source code of MathCompile or clone the repository and provide that to the C++ compiler later.
First, we transform a function that calculates the sum of the first n positive integers squared.
CompileToCode@Function[{Typed[n, Integer]}, Sum[i^2, {i, n}]]
gives the function in C++ as a string:
auto main_function(const int64_t& v38) {
return wl::val(wl::clause_sum([&](auto&& v37, auto&&...) {
return wl::val(wl::power(WL_PASS(v37), wl::const_int<2>{}));
},wl::var_iterator(WL_PASS(v38))));
}
Then we wrap it with some code handling input and output:
#include <cstdlib>
#include <iostream>
#include "math_compile.h"
auto main_function(const int64_t& v12) {
return wl::val(wl::clause_sum([&](auto&& v11, auto&&...) {
return wl::val(wl::power(WL_PASS(v11), wl::const_int<2>{}));
},wl::var_iterator(WL_PASS(v12))));
}
int main(int argc, char* argv[]) {
if (argc <= 1) // no argument provided
return 1;
int64_t n = std::atoi(argv[1]); // read the argument
auto result = main_function(n); // call the function
std::cout << result << '\n'; // print the result
return 0;
}
Now we compile the source file above, called source.cpp, using GCC.
$ g++ -std=c++1z -I<path-to-MathCompile>/IncludeFiles -o example source.cpp
where <path-to-MathCompile> should be replaced by the path to the package, and -std=c++1z or an equivalent flag is necessary to select the C++17 standard.
Finally, we run the program:
$ ./example 10
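To sanity-check the program's output, the same sum can be computed with a plain loop that does not depend on the MathCompile headers. The function `sum_of_squares` below is a hypothetical reference implementation written for this purpose. For n = 10 the sum of squares is 385, which is the value the compiled example would be expected to print.

```cpp
#include <cstdint>

// Reference implementation of Sum[i^2, {i, n}] using a plain loop.
inline std::int64_t sum_of_squares(std::int64_t n) {
    std::int64_t total = 0;
    for (std::int64_t i = 1; i <= n; ++i) {
        total += i * i;
    }
    return total;
}
```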
Each compilable type in Wolfram Language corresponds to a type in C++. They are summarized below.
Wolfram Language | C++ | Comment |
---|---|---|
type of Null | wl::void_type | an empty class |
"Boolean" | wl::boolean | convertible from and to bool |
"Integer<n>" | int<n>_t | n can be 8, 16, 32, or 64 |
"UnsignedInteger<n>" | uint<n>_t | n can be 8, 16, 32, or 64 |
"Real64" | double | |
"Real32" | float | |
"ComplexReal64" | wl::complex<double> | same as std::complex<double> |
"ComplexReal32" | wl::complex<float> | same as std::complex<float> |
"String" | wl::string | UTF-8 string |
{<Type>, <Rank>} | wl::ndarray<Type, Rank> | multi-dimensional array |
wl::ndarray is a multi-dimensional array type that corresponds to PackedArray and NumericArray in Wolfram Language.
wl::ndarray<Type, Rank> can be constructed from its dimensions (typed std::array<size_t, Rank>) and one of the following:
- a value (fill the array with this value);
- an initializer list (fill the array with the contents of the list);
- a pair of iterators (fill the array with the contents of [iter1, iter2));
- an rvalue std::vector (fill the array with the contents of the vector).
Arrays x1 through x4 below are initialized by these four methods respectively.
wl::ndarray<int, 1> x1({5}, 12);
wl::ndarray<int, 2> x2({3, 2}, {1, 2, 3, 4, 5, 6});
std::vector<int> vec{1, 2, 3, 4};
wl::ndarray<int, 2> x3({2, 2}, std::begin(vec), std::end(vec));
wl::ndarray<int, 2> x4({2, 2}, std::move(vec));
Member functions that are commonly used are as follows.
Function signature | Purpose |
---|---|
std::array<size_t, Rank> dims() const | dimensions of the array |
size_t size() const | flattened size of the array |
Type* data() | pointer to the first element |
const Type* data() const | const pointer to the first element |
void copy_to(Iterator iter) const | copy the elements to iter |
void copy_from(Iterator iter) | copy the elements from iter |
wl::part can be used to access elements in an array efficiently. wl::part uses 1-based indexing by default:
auto list = wl::range(5); // {1, 2, 3, 4, 5}
int& x = wl::part(list, 3); // x equals 3
x = -3; // changes 3 to -3 in list
wl::part can take multiple indices to extract elements from multi-dimensional arrays:
wl::part(a, 2, 4, -1); // means a[[2, 4, -1]] in Wolfram Language
0-based indexing is specified by wl::cidx, e.g.
int x = wl::part(wl::range(5), wl::cidx(3)); // x equals 4
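The relationship between Part-style indices and 0-based offsets can be written out explicitly. The helper below, `to_offset`, is hypothetical and follows the Wolfram Language convention that positive indices count from the front starting at 1 and negative indices count from the back. It is a sketch of the index arithmetic, not the implementation of wl::part.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

// Converts a Part-style index (1-based, negative counts from the end)
// into a 0-based offset for a dimension of the given size.
inline std::size_t to_offset(std::int64_t index, std::size_t size) {
    const auto n = static_cast<std::int64_t>(size);
    if (index >= 1 && index <= n) {
        return static_cast<std::size_t>(index - 1);   // 1 -> 0, n -> n - 1
    }
    if (index <= -1 && index >= -n) {
        return static_cast<std::size_t>(n + index);   // -1 -> n - 1, -n -> 0
    }
    throw std::out_of_range("index is zero or outside the array");
}
```

For a list of five elements, index 3 maps to offset 2 and index -1 maps to offset 4.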
Functions that involve string patterns, e.g. StringMatchQ, StringReplace, etc., depend on the PCRE2 library. The MathCompile package includes precompiled PCRE2 for several platforms in MathCompile/LibraryResources. If these functions are used in C++ source code, the library must be provided during linking.
The following options can be given to CompileToBinary:
Option name | Default value | Meaning |
---|---|---|
"TargetDirectory" | Automatic | output directory of the compiled library |
"WorkingDirectory" | Automatic | directory for temporary files |
"Debug" | False | passes debug flags to the C++ compiler |
"MonitorAbort" | True | allows interruption by Abort[] |
"Defines" | {} | preprocessor definitions |
"CompileOptions" | "" | options passed to the C++ compiler |
"LinkerOptions" | "" | options passed to the linker |
"Includes" | {} | additional include files |
"IncludeDirectories" | {} | directories for header file lookup |
"Libraries" | {} | library dependencies |
"LibraryDirectories" | {} | directories for library lookup |
MathCompile supports linear algebra functions based on Eigen. Alternatively, you can link external BLAS and LAPACK libraries with a compiled function to replace Eigen.
For now, you need to define the corresponding macros to tell MathCompile to use external libraries: define WL_USE_CBLAS for a BLAS library, and define WL_USE_LAPACKE for a LAPACK library. In the following example, Intel MKL is linked and the code is compiled by the Microsoft Visual C++ compiler, where <path-to-mkl> is the installation path of MKL:
f = CompileToBinary[Function[{}, Inverse@RandomReal[1., {5, 5}]],
"Defines" -> {"WL_USE_LAPACKE", "WL_USE_CBLAS"},
"IncludeDirectories" -> "<path-to-mkl>/include",
"Includes" -> "mkl.h",
"LibraryDirectories" -> "<path-to-mkl>/lib/intel64_win",
"Libraries" -> {"mkl_intel_lp64","mkl_sequential","mkl_core"}
]
Example
f = CompileToBinary[Function[{}, Table[Range[i], {i, 5}]]]
Attempting to create a ragged array causes a run-time error.
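Why this example fails can be seen by building the rows explicitly: row i has i elements, so the rows cannot form a rectangular array. The sketch below uses nested `std::vector` in standard C++; `ranges_up_to` and `is_rectangular` are hypothetical helpers, and the code does not reproduce the check MathCompile performs at run time.

```cpp
#include <vector>

// Builds the rows of Table[Range[i], {i, n}]: row i is {1, 2, ..., i}.
inline std::vector<std::vector<int>> ranges_up_to(int n) {
    std::vector<std::vector<int>> rows;
    for (int i = 1; i <= n; ++i) {
        std::vector<int> row;
        for (int k = 1; k <= i; ++k) row.push_back(k);
        rows.push_back(row);
    }
    return rows;
}

// Returns true when all rows have the same length, i.e. the rows
// could be stored as a rank-2 array.
inline bool is_rectangular(const std::vector<std::vector<int>>& rows) {
    for (const auto& row : rows) {
        if (row.size() != rows.front().size()) return false;
    }
    return true;
}
```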
Example
f = CompileToBinary[Function[{Typed[x, Integer]}, x + 1]];
f[10]
evaluates to 11, but the behavior of f[2^63 - 1] is undefined.
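Signed integer overflow is undefined behavior in C++, which is consistent with the caveat above. When C++ code has to guard against it, the operand can be checked before the addition. The function `checked_increment` below is a hypothetical sketch in standard C++ and is not part of MathCompile.

```cpp
#include <cstdint>
#include <limits>

// Stores x + 1 in out and returns true when the result is representable;
// returns false and leaves out untouched when x is the largest int64_t.
inline bool checked_increment(std::int64_t x, std::int64_t& out) {
    if (x == std::numeric_limits<std::int64_t>::max()) {
        return false;  // x + 1 would overflow
    }
    out = x + 1;
    return true;
}
```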
Example
f = CompileToBinary[Function[{}, Module[{y = 0}, {y += 5, y *= 5}; y]]]
The order of evaluating y += 5 and y *= 5 is not specified; f[] may give either 5 or 25.
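The two possible results correspond to the two possible orders. Written as separate statements, each order is fixed and gives exactly one of the values. The sketch below spells this out in standard C++ with hypothetical function names; it illustrates the arithmetic, not the generated code.

```cpp
#include <cstdint>

// y += 5 runs first, then y *= 5: (0 + 5) * 5
inline std::int64_t add_then_multiply() {
    std::int64_t y = 0;
    y += 5;
    y *= 5;
    return y;
}

// y *= 5 runs first, then y += 5: (0 * 5) + 5
inline std::int64_t multiply_then_add() {
    std::int64_t y = 0;
    y *= 5;
    y += 5;
    return y;
}
```

Separating the two updates into their own statements removes the ambiguity, because statements are executed in order.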
Example
Module[{g, x = 3}, g = Module[{y = 4}, x + y &]; g[]]
In the definition of the function g, the local variable y is referenced, but by the time g[] is called, y has been invalidated. A simple fix to this issue is to pull y out of the local scope:
Module[{g, x = 3, y = 4}, g = x + y &; g[]] (* good *)
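The same hazard exists for C++ lambdas that capture local variables by reference, as the `[&]` captures in the generated code shown earlier do. The sketch below is an analogy in standard C++ with a hypothetical function `make_adder`: copying the values into the closure keeps it valid after the enclosing scope ends. Whether MathCompile offers such a copy is not stated in this guide, so moving the variable to the outer scope remains the documented fix.

```cpp
#include <cstdint>
#include <functional>

// Returns a closure that owns copies of x and y, so it remains valid
// after this function returns. Capturing with [&] instead would leave
// dangling references to the locals x and y.
inline std::function<std::int64_t()> make_adder(std::int64_t x) {
    std::int64_t y = 4;
    return [x, y] { return x + y; };
}
```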