Feature Highlight: Dataflow engine

Motivation

Minerva's design goal is to give users more flexibility while preserving runtime efficiency. To that end, Minerva provides a numpy-like NArray interface so that users can write (ideally) any algorithm they wish. These NArray operators map directly to efficient CPU and GPU kernels, which meets the basic speed requirement, as many other tools do. But this is far from perfect: the structure of an algorithm often contains a lot of parallelism that a tool can exploit to speed it up further.

Example of Parallelism

Consider the back propagation of a multi-layer perceptron below (written with Minerva's owl package; the complete example is in mnist_mlp.py).

# bp
s2 = self.w2.trans() * s3
# grad
gw2 = s3 * a2.trans() / num_samples
gb2 = s3.sum(1) / num_samples

The line s2 = self.w2.trans() * s3 computes the error of the hidden layer by backpropagating from the classifier layer. The lines gw2 = s3 * a2.trans() / num_samples and gb2 = s3.sum(1) / num_samples compute the gradients of the weight and the bias, respectively.

Note that the data dependencies among those NArrays are as follows:

s2 -> {w2, s3}
gw2 -> {s3, a2}
gb2 -> {s3}

Therefore, the computations of s2, gw2 and gb2 are independent of one another and can be executed in parallel without any risk of data races.
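The independence argument above can be checked mechanically. The following is a minimal sketch in plain Python, not Minerva's actual engine: the read and write sets are copied by hand from the dependency list, and the `OPS` table and the `independent` helper are hypothetical names introduced only for illustration. Two statements are treated as safe to run concurrently when neither writes a value that the other reads or writes.

```python
from itertools import combinations

# Read and write sets for the three statements of the example.
# This table is written by hand for illustration; a dataflow engine
# would be expected to derive it from the operators as they are issued.
OPS = {
    "s2":  {"reads": {"w2", "s3"}, "writes": {"s2"}},
    "gw2": {"reads": {"s3", "a2"}, "writes": {"gw2"}},
    "gb2": {"reads": {"s3"},       "writes": {"gb2"}},
}

def independent(a, b):
    """Return True if statements a and b do not conflict.

    A conflict exists when one statement writes a value the other
    reads, or when both write the same value.
    """
    ra, wa = OPS[a]["reads"], OPS[a]["writes"]
    rb, wb = OPS[b]["reads"], OPS[b]["writes"]
    return not (wa & rb or wb & ra or wa & wb)

# Every pair of statements that could be scheduled at the same time.
parallel_pairs = [(a, b) for a, b in combinations(OPS, 2) if independent(a, b)]
print(parallel_pairs)
```

All three statements only read the shared input s3 and each writes a distinct result, so every pair passes the check. Sharing a read-only input does not create a dependency.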

UNDER CONSTRUCTION
