- Changed the return type from int to int64.
- Added an interface for use with a pre-allocated output array (good for MATLAB).
- Changed the inner loop of get_stream from a while loop to a for loop, which gained some performance. The code in the previous version reads more like the paper.