
GSoC 2017 proposal

Prasun Anand edited this page Sep 17, 2017 · 54 revisions

Contact information

Name : Prasun Anand

Email : prasunanand.bitsp@gmail.com

Github Username: prasunanand

Blog : www.prasunanand.com

Phone :

Why do you like Ruby, and why do you want to work on SciRuby?

As a web developer, I was introduced to Ruby via Ruby on Rails, and the beauty and productivity of the language soon got me hooked. The elegance of the language, its wonderful ecosystem, and its very helpful community have made it my favorite language for all tasks, from simple scripts to complex web apps.

SciRuby brings out the best of both worlds: the productivity of Ruby and a whole new world of scientific tools for the Ruby community. I hope to see Ruby go beyond being just a language for web development, and I feel SciRuby is an excellent platform for that.

What do you like about science and why? What area do you like best?

Science is about how things work. It addresses a fundamental human need: curiosity. Science provides an opportunity for reasoning and creativity to merge into one.

I like Mathematics, especially Calculus and Algebra.

Describe your experience with the following: Ruby, C, C++, other languages.

C is the language through which I was introduced to Computer Programming in my first year of college.

Programming started as a hobby when I wanted to develop some cool websites. I taught myself HTML, CSS, JS, and PHP. I interned at Biotech Park, Lucknow, India, where I set up a standalone BLAST server and developed a CRUD-based application to study gene-based biomarkers.

After that, I learnt Python, Django, and AngularJS to develop a basic prototype of a discussion platform at Awaremonk, where I was interning in the summer after my third year of college.

While developing the new site for edu.biojs.net, I had to learn Ruby to work with the existing codebase. Gradually, I learnt Ruby on Rails to develop websites.

While contributing to the css-chassis project for the jQuery Foundation, I dived deep into JavaScript and learnt Node.js.

I worked on “Port NMatrix to JRuby” for GSoC 2016. During this period, I learnt a lot about Java, Ruby, C, and C++ in depth.

After GSoC 2016, I became even more interested in scientific computing and number crunching. I explored GPU computation technologies like OpenCL and Nvidia CUDA. Heterogeneous parallel computing is extremely interesting to me, as it enhances the speed of programs handling Big Data.

I came across some great libraries, such as ArrayFire, that have made GPU computing very easy through the APIs they provide. GPU computing libraries for Ruby, however, are next to non-existent. I took up the task of creating an ArrayFire wrapper for Ruby and succeeded in implementing a few of its functionalities.

Very recently, I got a chance to explore D (which claims to be a better C as a systems programming language). I worked on faster_lmm_d, a GWAS tool used for the analysis of mixed models, under the mentorship of Pjotr Prins. D lacks a mature linear algebra library; thanks to my GSoC 2016 project, I could easily build a linear algebra core from scratch. I intend to release it separately in the future as a library called “Dmatrix”.

Working on faster_lmm_d, I came to appreciate compiler design. I learnt about Profile Guided Optimization by getting my hands dirty with how code executes and how program compilation on LLVM can be improved by profiling it against the most common use cases and using JIT compilation for maximum efficiency. I created a D library, gperftools_d, and published it as a DUB package; it is a D binding for Google Performance Tools and helps in profiling a binary executable.

As Knuth said, "Premature optimization is the root of all evil." Experience with profiling and Profile Guided Optimization will also give me an edge on this project: after mid-summer, I will be concerned with how ArrayFire can be used to handle real-life data. This project will have a great impact on the state of scientific computing in Ruby.

Describe your educational background (school, degree plan, major, past degrees, research area, publications, etc.).

Education:

I am a fifth-year student at BITS Pilani KK Birla Goa Campus, Goa, India pursuing a dual degree in M.Sc. in Biological Sciences and B.E. in Chemical Engineering. I am inclined more towards Biological Sciences.

Research Area:

I am interested in bioinformatics. It has a vast set of problems that need to be solved with cross-disciplinary knowledge of Biology, Maths, and Computer Science. I find it interesting to process large amounts of data where the availability of processing resources is a constraint. I also like how visualization tools help in making sense of large amounts of data. Since last year, I have been exploring High Performance Computing for fast algorithms and for processing Big Data. Hacking the ArrayFire codebase has been an interesting journey, and I still enjoy it.

I am very keen to explore the field of compiler design. I am inspired by the work on LLVM, TruffleRuby, JRuby, and MRI. My goal is to hack the MRI garbage collector to handle large numbers of objects. The MRI garbage collector evolved a lot as Rails grew; now it is high time for the Ruby garbage collector to evolve to cater to the needs of number crunching.

Talks and Presentations:

  1. Speaker at Ruby Conference India, 2017 to talk about Scientific Computing on JRuby.

    Slides| Video

  2. Speaker at Ruby DevRoom at FOSDEM, 2017 to talk about Scientific Computing on JRuby. The trip to Brussels, Belgium was sponsored by Emerging Technology Trust.

    Slides| Video

Have you offered any pull requests for SciRuby or contributed in other ways? Please provide links, if possible. Past contributions are required, and must be in the form of code. Documentation contributions are also beneficial.

Work Experience:

Open-source Contributions

  1. BioJavascript:
  • Created the growth curve for the stats page of BioJS. The website is hosted at stats.biojs.net.
  • Worked with stockchart.js.
  • Merged PR : 1
  • Developed the new UI of the tutorial website of BioJS. Worked with Jekyll and Bootstrap.
  • Merged PR : 2
  2. jQuery Foundation:
  • Worked on the CSS-Chassis project. Added Semantic UI and Kendo UI to the performance testing.
  • Merged PRs : 3, 4
  3. SciRuby
  • Port NMatrix to JRuby: Google Summer of Code 2016
  • Merged PR: 1
  4. ArrayFire
  5. Faster_lmm_d
  • A fast LMM for Genome-Wide Association Studies. Since D currently doesn’t have a mature linear algebra core, I wrote a lot of linear algebra code myself; it will be released as a separate library in the near future.
  • Worked on Profile Guided Optimization for LLVM.
  • I created a D library, gperftools_d, and published it as a DUB package. gperftools_d is a D binding for Google Performance Tools. It can be used to profile D code to obtain CPU profiles, stack traces, heap allocation profiles, etc., and print the output as really nice graphs.

Gems Published

Mitab Parser : A ruby parser for PSI-MITab files. Link

What other commitments do you have this summer aside from GSoC? What obstacles do you foresee this summer as far as contributing the full forty hours per week during the GSoC period?

None.

I do not think any obstacles will arise that would prevent me from committing forty hours per week over the summer. If any unexpected obstacles do arise, I will do my best to overcome them, as I am quite excited to work on this project and it will take priority over everything else.

Are you planning any fun vacations this summer?

No

How many classes are you taking this summer?

0

Do you have any other employment this summer?

No

Please talk a bit about any past GSoC projects in which you have participated. If you've done GSoC before, how could we reach your mentor(s)?

GSoC 2016 with Ruby Science Foundation

Please propose a project you would like to work on. Successful proposals will require advanced planning, communication with the project administrators and mentors, and likely a great deal of research on specific methods for achieving your project goals (e.g., what algorithms will you use? What frameworks?). A good place to start is the Ideas Page. You should also consider lurking on our IRC channel (#sciruby on FreeNode). Participation in listserv discussions is strongly recommended.

Project Proposal

Title

Creating the fastest math libraries for Ruby by using the GPU through OpenCL and ArrayFire.

Abstract

Few people realise it, but even modest computers today, including mobile phones, have powerful GPUs. And these GPUs can be used serially and in parallel to CPUs, potentially delivering awesome performance.

In this project I want to make it possible to combine the beauty of Ruby with transparent GPU processing so that software developers can easily use that power when available, and farm out computations transparently to GPU and CPU. This will work both on client computers and on servers that make use of NVIDIA Tesla and Intel Xeon Phi solutions.

Background

One problem with GPUs is that they differ across platforms. Recent developments in libraries such as OpenCL and ArrayFire make it easier to abstract away the underlying architecture. The ArrayFire gem I intend to develop will support JRuby as well as standard Ruby. Initially I will be exposing ArrayFire's linear algebra functionality to Ruby.

OpenCL is an open, low-level API for sending code to GPUs and is now commonly available in software distributions. The ArrayFire accelerated computing library is a free, general-purpose, open-source library that simplifies the process of developing software by abstracting common routines, targeting parallel and massively-parallel architectures including CPUs, GPUs, and other hardware acceleration devices. ArrayFire works on devices from low-powered mobile phones to high-powered GPU-enabled supercomputers, including CPUs from all major vendors (Intel, AMD, ARM), GPUs from the dominant manufacturers (NVIDIA, AMD, and Qualcomm), as well as a variety of other accelerator devices on Windows, Mac, and Linux. ArrayFire is also increasingly available in software distributions.

Currently, the Ruby community has these projects that bind OpenCL/CUDA to Ruby:

  1. opencl-ruby: https://github.com/Nanosim-LIG/opencl-ruby
  2. Barracuda: https://github.com/lsegal/barracuda
  3. sgc-ruby-cuda: https://github.com/xman/sgc-ruby-cuda

Developer activity and support for these projects is mixed at best, and they are tough to use: they involve writing kernels and require a lot of effort for buffer/RAM optimisation.

ArrayFire-Python, PyOpenCL, and PyCUDA are the main competitors. ArrayFire-Python is rapidly gaining popularity due to its high-level approach, abstracting away kernel writing and manual optimisation. PyOpenCL is difficult to use but still popular because it has been around for a long time. PyCUDA is limited to NVIDIA hardware.

I have started working on an ArrayFire gem with bindings for Ruby and JRuby. I think it is crucial that both Rubies have access to the functionality. It will be suitable for any application that requires numeric processing. What excites me particularly is that it will create endless possibilities for scientific computation. My gem will bring with it the large collection of functionality already provided by ArrayFire algorithms for linear algebra, data science, machine learning, computer vision, computational fluid dynamics, bioinformatics, etc. Performance increases can already be achieved for simple summing and sorting of arrays.

Comparing OpenCL and ArrayFire

A typical scenario: a Ruby programmer needs to add two arrays using the GPU.

The OpenCL way:

He starts coding a host application in C or C++ which he will bind to Ruby. Every host application requires five data structures: cl_device_id, cl_kernel, cl_program, cl_command_queue, and cl_context. (A tutorial to add two arrays in OpenCL.)

He needs to determine the devices (GPUs) available to him on which the addition will execute. Next he writes a program in an add.cl file, which is loaded into the host application. The program add.cl is compiled into kernels: functions that will execute on the device. The host application must create a queue that makes sure the kernels execute in the right order determined by the context.

If everything goes well, he can add arrays. But when the array exceeds 800 elements, the program just leads to a segmentation fault. He can't believe the program consumed all the memory, as he has 4 GB of GPU RAM at his disposal. Now he starts working on optimization.

After spending long hours, he gives up and starts looking for OpenCL math libraries. Unfortunately, he finds that he doesn't have any options in Ruby; he must write his own bindings to a GPU math library.

The ArrayFire way:

Once the ArrayFire gem is ready, the Rubyist can simply add two arrays in a single line of code without worrying about segmentation faults.

require 'arrayfire'

a = ArrayFire::Af_Array.new 2, [2, 2], [1, 2, 3, 4]
b = ArrayFire::Af_Array.new 2, [2, 2], [4, 12, 1, 0]

c = a + b

puts c

ArrayFire brings a great advance over OpenCL because:

  1. It abstracts away from the difficult task of writing kernels for multiple architectures; handling memory management, and performing tuning and optimisation.
  2. It is optimised both for common NVIDIA and AMD GPUs.
  3. It has great support and a lot of community developer activity.
  4. Just In Time compilation makes it faster than most of the other GPGPU libraries.
  5. When a user runs ArrayFire for computations, it first checks for the presence of CUDA, then falls back to OpenCL, and then to the CPU, which makes it possible to harvest the maximum GFLOPS from the machine.

Device Support

This project is about providing the next step: making the ArrayFire gem provide NMatrix operations. There exists a port of ArrayFire to the JVM, which I will use for JRuby. Its bindings, however, are incomplete, so I will also need to add bindings to this library. As in my GSoC 2016 project, I will use the Ruby LMM library to measure performance.

Matrix Size

ArrayFire allows matrix operations exceeding GPU RAM. It contains a number of critical matrix operations, which we will apply to different-sized matrices in real-life scenarios. It has bindings to clBLAS and clLAPACK. The design of clBLAS and clLAPACK ensures that the best decomposition block sizes are generated for the problem from the kernel database file. ArrayFire provides an edge because the routines provided by clBLAS and clLAPACK are JITted, which improves startup time and memory usage.

NVIDIA and AMD GPUs

NVIDIA, with its proprietary CUDA libraries, also supports OpenCL. AMD is a great supporter of OpenCL as well.

I read in a blog that NVIDIA GPUs provide more GFLOPS than AMD GPUs, but I haven't benchmarked this myself, as I don't have access to an AMD GPU.

After developing the ArrayFire gem, I hope we can generalize these operations to other parts of the Ruby ecosystem and give Ruby programs a free performance boost.

Technical Description

Dtypes:

For the GSoC period I will work only on doubles (float64).

Stypes:

For the GSoC period I will work on dense matrices.

Dimensions:

ArrayFire targets matrices of at most 4 dimensions for the sets of problems it aims to solve. I am mostly concerned with 2-dimensional matrices for the GSoC period.

Ruby Runtimes:

I target the following Ruby runtimes for ArrayFire-rb:

  1. MRI
  2. JRuby

I am targeting these two runtimes because MRI has been around for a long time and most Ruby users start with MRI, while JRuby has been picking up due to the better concurrency paradigm it offers and the absence of a GIL. This article summarises the performance of MRI, JRuby, and TruffleRuby: JRuby performs better than MRI in most cases. With JRuby switching to GraalVM in place of the traditional JVM, even better performance can be expected. TruffleRuby is still in the development phase.

Implementation:

The implementation of ArrayFire-rb is highly inspired by NMatrix and NArray. The architecture of ArrayFire-rb can be represented by Fig. 1 and categorised into three parts:

ArrayFire Architecture

Fig. 1

  1. Ruby frontend
  2. MRI backend
  3. JRuby backend

MRI backend

Af_Array will use a C extension to store the dimensions and elements of the array. As with NMatrix, C++ will mostly be used to write the MRI backend: ArrayFire is implemented in C++, and it is easy to use C++ for the backend once name-mangling issues are taken care of.

If a templated global C++ function needs to be accessed from C (e.g., for either the API or the Ruby interface), the following convention is utilized (borrowed from NMatrix):

  • The C++ function is placed in a namespace (e.g., namespace arf { }) or is declared static if possible. The C function receives the prefix arf_, e.g., arf_multiply() (this function also happens to be static).

Note: ArrayFire uses the namespace af, so care must be taken while binding to the Ruby frontend (ruby_nmatrix.c).

  • C macros are capitalized and generally have the prefix ARF_, as with ARF_DTYPE().
  • C functions (and macros, for consistency) are placed within extern "C" { } blocks to turn off C++ mangling.
  • C macros (in extern blocks) may represent C++ constants (which are always defined in namespace af {} or a child thereof).
  • I will define ArrayFire-specific allocation macros that mirror the Ruby ones (borrowed from NMatrix). ARF_ALLOC(type), ARF_CONSERVATIVE, and ARF_FREE(type) would be used to manage memory. More information can be found here.
#include <ruby.h>

typedef struct AF_STRUCT
{
  size_t ndims;
  size_t count;
  size_t* dimension;
  double* array;
} afstruct;

static VALUE ArrayFire, Blas;

void Init_arrayfire() {
  ArrayFire = rb_define_module("ArrayFire");
  Blas = rb_define_class_under(ArrayFire, "BLAS", rb_cObject);
  rb_define_singleton_method(Blas, "matmul", (METHOD)arf_matmul, 2);
}

static VALUE arf_matmul(VALUE self, VALUE left_val, VALUE right_val) {
  afstruct* left;
  afstruct* right;
  afstruct* result = ALLOC(afstruct);

  Data_Get_Struct(left_val, afstruct, left);
  Data_Get_Struct(right_val, afstruct, right);

  result->ndims = left->ndims;
  /* Heap-allocate the dimensions: a stack array here would dangle
     once this function returns. */
  size_t* dimension = ALLOC_N(size_t, 2);
  dimension[0] = left->dimension[0];
  dimension[1] = right->dimension[1];
  result->dimension = dimension;
  result->count = dimension[0] * dimension[1];
  arf::matmul(result, left, right);

  return Data_Wrap_Struct(CLASS_OF(left_val), NULL, arf_free, result);
}


#include <arrayfire.h>

namespace arf {
  using namespace af;

  static void matmul(afstruct *result, afstruct *left, afstruct *right)
  {
    array l = array(left->dimension[0], left->dimension[1], left->array);
    array r = array(right->dimension[0], right->dimension[1], right->array);
    array res = matmul(l, r);
    result->array = res.host<double>();
  }
}

extern "C" {
  #include "arrayfire.c"
}

JRuby Backend:

For JRuby, I need to extend ArrayFire-Java and create a wrapper for it in JRuby.

Create JNI bindings in ArrayFire-Java for matrix multiplication, determinant calculation, Cholesky factorization, QR factorization, LU decomposition, and solve.

The ArrayFire-Java wrapper for matrix multiplication is done; the wrapper for determinant and Cholesky factorization is in progress.

Repo Link: https://github.com/arrayfire/arrayfire-java

Place libaf.so in the Load path.

require 'ext/vendor/ArrayFire.jar'

class Af_Array
  attr_accessor :dims, :elements

  def matmul(other)
    Blas.matmul(self.elements, other.elements)
  end
end

The JRuby backend will need to ensure:

  1. Copying doesn't take place.
  2. Coercion of values doesn't take place.
  3. Chaining to Java methods for speed.

Ruby Frontend:

Once the core is implemented, the Ruby frontend will be used to enhance the capabilities of the existing methods. #each_with_indices will be used to implement #rank and #pretty_print. Similarly, shortcuts for matrix generation will be implemented, for example #zeros, #ones, #diagonal, etc.
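As an illustration, such shortcuts can live entirely in the Ruby frontend as thin wrappers over the core constructor. A minimal sketch, assuming an Af_Array initializer taking ndims, shape, and a flat element Array as in the examples above (the stand-in constructor and method bodies here are hypothetical, not the final gem API):

```ruby
module ArrayFire
  class Af_Array
    attr_reader :ndims, :shape, :elements

    # Minimal stand-in for the C-backed constructor, for illustration only.
    def initialize(ndims, shape, elements)
      @ndims, @shape, @elements = ndims, shape, elements
    end

    def self.zeros(rows, cols)
      new(2, [rows, cols], Array.new(rows * cols, 0.0))
    end

    def self.ones(rows, cols)
      new(2, [rows, cols], Array.new(rows * cols, 1.0))
    end

    # Square matrix with the given entries on the main diagonal.
    def self.diagonal(entries)
      n = entries.size
      elements = Array.new(n * n, 0.0)
      entries.each_with_index { |e, i| elements[i * n + i] = e }
      new(2, [n, n], elements)
    end
  end
end
```

Because these methods only build a flat element Array and delegate to the core constructor, they need no C or Java code of their own, which is the point of keeping them in the frontend.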

Benchmarks:

The sample code I wrote for benchmarking and generating the plots below can be found here:

https://github.com/prasunanand/ArrayFire-rb-benchmark

Matrix addition and matrix multiplication have been benchmarked on both Ruby and JRuby. I was able to benchmark determinant and LU decomposition on MRI only.

RAM consumed is up to 3 GB for 1e7 elements on both Ruby and JRuby. I ran some benchmark tests on arrays where ArrayFire was the fastest of all.

Matrix Addition Fig. 2

Matrix Multiplication Fig. 3

Matrix Determinant Fig. 4

Matrix LU Decomposition Fig. 5

Interfacing with NMatrix

ArrayFire would be a standalone library but I would also work on making it easy to interface it with NMatrix.

static void matmul(afstruct *result, afstruct *left, afstruct *right)
 {
   array l = array(left->dimension[0], left->dimension[1], left->array);
   array r = array(right->dimension[0], right->dimension[1], right->array);
   array res = matmul(l,r);
   result->array = res.host<double>(); //copying elements from GPU to CPU.
 }

res.host<double>() would be used to interface ArrayFire with NMatrix. This could also be used to interface ArrayFire with NArray and ActiveRecord in Rails, but that is out of scope for GSoC.

TDD

I am currently using RSpec to develop the test suite for ArrayFire-rb.

An example test for Af_Array#+

require 'spec_helper'

describe ArrayFire::Af_Array do
  context '#addition' do
    let(:a) { ArrayFire::Af_Array.new 2, [2, 2], [1, 2, 3, 4] }
    let(:b) { ArrayFire::Af_Array.new 2, [2, 2], [1, 2, 3, 4] }
    let(:c) { ArrayFire::Af_Array.new 2, [2, 2], [2, 4, 6, 8] }

    subject { c }

    it { expect(a + b).to eq c }
    it { expect(c.ndims).to eq a.ndims }
    it { expect(c.dimension).to eq a.dimension }
  end
end

The test suite will consist of creation_spec, device_spec, BLAS_spec, LAPACK_spec, math_spec, accessors_spec, statistics_spec, and shortcuts_spec to cover all the ArrayFire functionality for linear algebra.

Targeted Functionality for End-User

require 'arrayfire'

# Initialization
a = ArrayFire::Af_Array.identity(3, 3)
pp a
# prints =>
# [
#   [ 1 , 0 , 0 ]
#   [ 0 , 1 , 0 ]
#   [ 0 , 0 , 1 ]
# ]

# Indexing
puts a[2,1]
#prints =>
# 0

# Matrix addition
b = a + a
pp b
# prints =>
# [
#   [ 2 , 0 , 0 ]
#   [ 0 , 2 , 0 ]
#   [ 0 , 0 , 2 ]
# ]

# LAPACK feature: determinant calculation
puts ArrayFire::LAPACK.det(a)
# prints=>
# 1.0

# BLAS Feature: matrix multiplication
c = ArrayFire::BLAS.matmul(a, b)
pp c
# prints =>
# [
#   [ 2 , 0 , 0 ]
#   [ 0 , 2 , 0 ]
#   [ 0 , 0 , 2 ]
# ]

# LAPACK feature: solve system of equations
lhs = ArrayFire::Af_Array.new(2, [3,3], [ 4, 5, 2, 3, 2, 1, -5 , 2, 12 ])
rhs = ArrayFire::Af_Array.new(2, [3,1], [18, 15, 19] )
sol = ArrayFire::LAPACK.solve(lhs, rhs)
pp sol
# prints =>
# [
#   [ 5 ]
#   [-2 ]
#   [ 4 ]
# ]

Please provide a specific timeline for your project from application period until pencils-down. What benchmarks will you set for yourself? The greater the detail on this question and the previous, the better.

Timeline

Community bonding period

I have been active in SciRuby since last year. I would use this period to become more familiar, on the SciRuby Slack channel, with what Ruby users expect a GPU library to look like. I would also be interacting more with the ArrayFire team.

Coding Period

MRI n-Dimensional APIs

  • Spec

Tests will be added to creation_spec, math_spec, and accessors_spec to cover the MRI tests for n-dimensional matrices.

Week 1 (May 30 - June 6)

  1. Strides and Enumerators

Strides and enumerators are necessary so that the gem can also be used as an independent library.

When storing an n-dimensional array with more than 2 dimensions, we need to calculate the stride along each dimension. This helps in accessing the corresponding elements and the elements along a particular dimension/rank. Thus #[] and #[]= will be implemented to get and set the elements of an array.
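The stride computation and indexing described above can be sketched in pure Ruby. This is illustrative only, assuming row-major flat storage (the real gem would do this in C; the class and attribute names below mirror the proposal's structs but are hypothetical):

```ruby
# Minimal sketch: row-major strides and flat indexing for an
# n-dimensional array stored in a flat Ruby Array.
class Af_Array
  attr_reader :dimension, :array

  def initialize(dimension, elements)
    @dimension = dimension
    @array = elements
  end

  # Row-major strides: the last dimension varies fastest.
  def strides
    strides = Array.new(@dimension.size, 1)
    (@dimension.size - 2).downto(0) do |i|
      strides[i] = strides[i + 1] * @dimension[i + 1]
    end
    strides
  end

  def [](*indices)
    @array[flat_index(indices)]
  end

  def []=(*indices, value)
    @array[flat_index(indices)] = value
  end

  private

  # Dot product of the index tuple with the strides gives the flat offset.
  def flat_index(indices)
    indices.zip(strides).sum { |idx, stride| idx * stride }
  end
end
```

For a [2, 3] matrix the strides are [3, 1], so element [1, 2] maps to flat offset 1 * 3 + 2 = 5.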

The beauty and simplicity of the ArrayFire library will depend greatly on methods like #row, #column, and #rank to easily access elements. These will depend on implementing #each and #each_with_indices, which map to specific elements and yield according to the blocks passed. For MRI, this can easily be implemented by borrowing the relevant code from NMatrix-MRI.

#each_with_indices can be implemented as link.
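A minimal pure-Ruby sketch of such an enumerator over flat row-major storage (the free-standing helper below is hypothetical, for illustration; the gem would implement it as a method on Af_Array):

```ruby
# Yields each element together with its n-dimensional indices,
# walking the flat storage in order; dims and elements are plain Arrays.
def each_with_indices(elements, dims)
  return enum_for(:each_with_indices, elements, dims) unless block_given?
  elements.each_with_index do |elem, flat|
    indices = []
    remaining = flat
    # Peel off one coordinate per dimension, last dimension fastest.
    dims.reverse.each do |dim|
      indices.unshift(remaining % dim)
      remaining /= dim
    end
    yield elem, *indices
  end
end
```

Methods like #row and #column then reduce to filtering this enumeration on one fixed index.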

  2. Helper Functions

Implement #pretty_print and #to_a, just like NMatrix.

Week 2 (June 7 - June 13 )

  1. Mathematical Functions

Implement the following functions using ArrayFire C++ APIs.

  1. Arithmetic operations: +,-, *, /, >>, <<
  2. Exponential and logarithmic functions: exp, log, expm1, log1p, etc.
  3. Hyperbolic functions: sinh, cosh, tanh, etc.
  4. Logical operations: &&, ||, |, &, <, >, <=, >=,==, !
  5. Numeric functions: floor, round, min, max, etc.
  6. Trigonometric functions: sin, cos, tan, etc.

Af_Array#+ has been implemented as:

static VALUE elementwise_op(arf::ewop_t op, VALUE left_val, VALUE right_val) {
  afstruct* left;
  afstruct* right;
  afstruct* result = ALLOC(afstruct);

  Data_Get_Struct(left_val, afstruct, left);
  Data_Get_Struct(right_val, afstruct, right);

  result->ndims = left->ndims;
  result->dimension = left->dimension;
  result->count = left->count;
  arf::add(result, left, right);

  return Data_Wrap_Struct(CLASS_OF(left_val), NULL, arf_free, result);
}

static void add(afstruct *result, afstruct *left, afstruct *right)
{
  array l = array(left->dimension[0], left->dimension[1], left->array);
  array r = array(right->dimension[0], right->dimension[1], right->array);
  array res = operator+(l, r);
  result->array = res.host<double>();
}

JRuby 2-Dimensional APIs

ArrayFire for JRuby will depend on ArrayFire.jar, which will be compiled and packaged from ArrayFire-Java.

  • Spec

Since JRuby will be introduced in this period, I need to make sure that the tests in creation_spec, BLAS_spec, and LAPACK_spec pass on JRuby.

Week 3 (June 14 - June 20)

  1. Creation of Af_Array

An Af_Array can easily be implemented to store the elements of an n-dimensional Ruby Array. Currently ArrayFire-Java uses the Array class instead of Af_Array to store the matrix, which results in warnings when importing the Array class.

Creating an Af_Array is as simple as:

require_relative './ext/ArrayFire.jar'
java_import 'com.arrayfire.Array'

class Af_Array
  def initialize(dims, elements)
    @dims = dims
    @elements = Array.new(elements)  # the imported com.arrayfire.Array
  end
end
  2. BLAS functionalities

Implement the BLAS functionalities #matMul and #matMulT to compute the product of one matrix with another, with or without transpose.

class BLAS
  def self.matMul(lhs, rhs)
    # check dimensions
    raise(ShapeError, "Cannot multiply matrices with incompatible dimensions") if lhs.dims[1] != rhs.dims[0]
    result = create_dummy_af_array
    result.s = Blas.matmul(lhs.elements, rhs.elements)
    result
  end
end

Week 4 (June 21 - June 26)

  1. LAPACK functionalities

Implement LAPACK functionalities, determinants, LU Factorization, Cholesky Factorization, QR factorization, Singular Vector Decomposition.

class LAPACK
  def self.solve(lhs, rhs)
    raise(ShapeError, "Must be called on a square matrix") unless lhs.ndims == 2 && lhs.shape[0] == lhs.shape[1]
    raise(ShapeError, "Number of rows of rhs must equal number of columns of lhs") if lhs.shape[1] != rhs.shape[0]
    result = create_dummy_af_array
    result.s = Lapack.solve(lhs.elements, rhs.elements)
    result
  end
end

First Evaluation (June 26 - 30, 2017)

Deliverables

  1. Linear Algebra Support for N-dimensional matrices using ArrayFire on MRI
  2. Linear Algebra Support for 2-dimensional matrices using ArrayFire on JRuby

JRuby n-dimensional APIs

  • Spec

The goal would be to make the tests for creation_spec, math_spec, accessors_spec pass on JRuby.

Week 5 (June 27 - July 3)

  1. Strides and Enumerators

For the enumerators, I will use Java code to prevent issues with speed and errors from type coercion. Strides can be calculated just as in NMatrix for JRuby.

  2. Helper Functions

These will be implemented in the Ruby frontend and are already covered by the MRI work.

Week 6 (July 4 - July 11)

  1. Mathematical Functions

Implement the following functions using ArrayFire Java.

  1. Arithmetic operations: +,-, *, /, >>, <<
  2. Exponential and logarithmic functions: exp, log, expm1, log1p, etc.
  3. Hyperbolic functions: sinh, cosh, tanh, etc.
  4. Logical operations: &&, ||, |, &, <, >, <=, >=,==, !
  5. Numeric functions: floor, round, min, max, etc.
  6. Trigonometric functions: sin, cos, tan, etc.

Af_Array#+ would be implemented as:

class Af_Array
  def +(rha)
    result = Arith.add(@elements, rha.elements)
    Af_Array.new(@shape, result)
  end
end

Interfacing ArrayFire with NMatrix and NArray

Week 7 - Week 8 ( July 12 - July 26)

This is going to be tricky, as we need to implement an OpenCL backend for Ruby/JRuby.

Copying data from GPU RAM to CPU RAM will be done using res.host<double>(), as explained in the Technical Description section.

Copying data from an NMatrix object to an Af_Array will be implemented in C, as it is a lot faster than Ruby code. Similarly, Java methods will be used to copy data in the case of JRuby.

The nmatrix/lib/opencl directory will contain all the NMatrix wrappers that call the ArrayFire-rb equivalents. The frontend decides whether to call the BLAS and LAPACK (CPU) routines or the ArrayFire methods.

For example, if I need to configure NMatrix#det to use ArrayFire, I just need to add the following code to lib/nmatrix/nmatrix-opencl.rb:

class NMatrix
  def det
    af = Af_Array.new(@shape,@s)
    af.det
  end
end

The final outcome should be that any program that uses NMatrix can switch to the GPU by adding:

require 'nmatrix/nmatrix-opencl'

Next, add changes to the NMatrix tests. All the tests for NMatrix should pass.

Test the mixed_models (LMM) gem with NMatrix's OpenCL backend

  • Spec

When NMatrix is ready with an OpenCL backend, I will get all the tests for mixed_models to pass. NMatrix is a dependency of the mixed_models gem; I have to make just a few changes to get mixed_models running on OpenCL.


Second Evaluation (July 24 - 28, 2017)

Deliverables

  1. Linear Algebra Support for N-dimensional matrices using ArrayFire on JRuby
  2. NMatrix integrated with ArrayFire with maximum test coverage for dense stype and double dtype.

Implement Statistics and Reduction APIs for vectors

  • Spec

Add tests to statistics_spec and math_spec. NMatrix will be used as a reference for the specifications to ensure full compatibility.

Week 9 (July 27 - August 3)

  • Implementing Basic Statistics functions:
  1. corrcoef
  2. cov
  3. mean
  4. median
  5. stdev
  6. var
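To validate the GPU results in the specs, plain-Ruby reference implementations can serve as the expected values. A hedged sketch of such references (the module name is hypothetical; sample variance with n-1 is an assumption here and would be matched to whatever ArrayFire's defaults turn out to be):

```ruby
# Pure-Ruby reference implementations the statistics_spec examples
# could compare GPU results against.
module StatsReference
  module_function

  def mean(a)
    a.sum(0.0) / a.size
  end

  # Sample variance (divides by n - 1); an assumption, see lead-in.
  def var(a)
    m = mean(a)
    a.sum(0.0) { |x| (x - m)**2 } / (a.size - 1)
  end

  def stdev(a)
    Math.sqrt(var(a))
  end

  def median(a)
    s = a.sort
    mid = s.size / 2
    s.size.odd? ? s[mid] : (s[mid - 1] + s[mid]) / 2.0
  end
end
```

In the specs, each GPU result would be compared against these with a floating-point tolerance (be_within in RSpec) rather than exact equality.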

Week 10 (August 3 - August 11 )

  • Searching and Sorting

ArrayFire provides Sorting functionalities using the following methods:

  1. sort
  2. sortByKey
  3. sortIndex
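As an illustration of the sortIndex semantics, the wrapper returns both the sorted values and the original position of each element. A pure-Ruby sketch of the intended behavior (the helper name is hypothetical, not the gem's final API):

```ruby
# Returns [sorted_values, original_indices], mirroring ArrayFire's
# sort-with-indices output for a 1-D array.
def sort_index(values)
  # Order the positions by the value they point at.
  indices = (0...values.size).sort_by { |i| values[i] }
  [indices.map { |i| values[i] }, indices]
end
```

sortByKey is analogous, except the ordering comes from a separate key array and is applied to the value array.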

Optimization and Shortcut methods

Week 11 - Week 12 (August 12 - August 26)

This period will be used to optimize the functionality of ArrayFire by benchmarking it.

Creating shortcut methods is an optional deliverable, as I will keep this time as a buffer period. I will work on implementing shortcut methods for ArrayFire, like NMatrix's, relying on the Ruby frontend only.

Documentation and Tutorials

Week 13 ( August 26 - August 30 )

  • Release docs for the ArrayFire gem.
  • Write blog posts explaining how to utilize the ArrayFire gem for High Performance Computing. This will involve writing iruby notebooks.

Final Evaluation

Deliverables

  1. ArrayFire gem for Ruby supported on MRI as well as JRuby.
  2. NMatrix integrated with ArrayFire.

Challenges

During my previous GSoC, I faced a lot of challenges. To begin with, I was not very good at C and C++, and it took me a long time to understand the existing NMatrix codebase. For NMatrix-JRuby, I had to find solutions on my own by experimenting, as there was very little information on StackOverflow or online forums about how other projects do it. By the end of GSoC, I had become good at hacking other codebases. When I started working on the ArrayFire MRI implementation, I became even more familiar with the NMatrix-MRI codebase.

For GSoC 2017, I have done intensive groundwork. GSoC 2017 will become extremely challenging after mid-summer, when I begin working on optimization and profiling. Also, this year I would be working with both MRI and JRuby together.

The project would involve working with five languages: C, C++, Java, Ruby, and OpenCL (in terms of understanding kernel code for optimization). I would be contributing code to four codebases: arrayfire-java, arrayfire-rb, nmatrix, and mixed_models. I would also be working on many optimization problems.

This summer would indeed be more challenging than last year's. However, I am well-prepared to face it.

Future Work

After the grant period, I intend to add more functionality to the ArrayFire gem by adding wrappers for Signal Processing and Statistics. In the future, Ruby will have its own rich machine-learning libraries, like scikit-learn for Python. The ArrayFire gem would contribute significantly to the performance of such libraries.

What is one long-term vision for something you'd like scientific software to be able to do? Think big picture, not necessarily realistic in the short term.

Scientific software should be smart. Currently, we aim for it to be efficient; however, it should also be able to suggest what the end user needs to do next.

What are your hobbies, aside from coding? Tell us a little about yourself that isn't reflected in the rest of your application. What do you want to do with your life (if you have any idea)?

I was brought up in a typical Indian home where studying was the thing that all kids did. When I entered university, I continued to do the same. I discovered coding there. Coding became my hobby, and gradually, over the years, it became my passion. Whenever I am tired or stressed out from coding, I turn to movies and TV shows.

Regarding what I want to do with my life: over the last year I have come to appreciate science as a future career prospect and as the field where I want to end up working. I am highly interested in applying for a Ph.D. program where I can improve and use my skills to work on complex problems and optimize existing solutions. If not, open source will always be there for me to show my skills and creativity.

What else do you think we should have asked but didn't? Propose a question of your own and answer it here.

Why should you contribute to open-source software?

I think open-source software development is about helping the community with high-quality software, engineered with the best ideas and methodologies we can come up with. The open-source community continuously comes up with new ideas and tries to make things better. It is also about how one can become a better programmer: contributing to open-source software teaches me to write better code and to learn the best coding practices and the latest software stacks.

I feel contributing to open-source adds a lot to my own skills and resume. For example, anyone can use a framework for software development; but it’s challenging to develop one. It is also exciting when you successfully develop a framework, and the community appreciates it.

Bonus question: One aim of the Ruby Science Foundation (SciRuby) is to increase diversity in open source science software development. How do we get more women interested in open source software development and science? How do we get more people from underrepresented groups involved?

  • The Ruby Science Foundation could run outreach programs for underrepresented groups by setting up free 48-hour workshops.
  • We could work with locally well-known content creators in specific localities to raise awareness of SciRuby.
  • We could organise women-centric coding marathons throughout the year.
  • We could have SciRuby representatives at women's universities to spark interest there.
  • We could tie up with schools and institutions in marginalised areas to raise awareness about coding in general and SciRuby in particular.
  • We could create immersive, interactive video tutorials and collaborate with educational websites like Codecademy, Udemy, Coursera, etc., to capture the interest of young minds from diverse groups at a formative age.