Google Summer of Code 2013 Ideas
Hey! We're still in the process of submitting SciRuby as an organization to Google. It's great that you're checking us out so early. If you're interested in contributing to SciRuby or any of its subprojects and/or want to know more about it, please contact us.
Feel free to reach us by joining #sciruby on chat.freenode.net or via our mailing list.
You don't need to know a lot about Ruby before proposing a project: depending on how much you already know, it'll be pretty easy to learn enough to be able to contribute. However, you'll need some familiarity with scientific computation. If you don't have any, take a look at "Numerical Recipes in C", which you'll probably find in your university's library.
In any case, if you feel your skills aren't enough for some project, please ask us on our IRC channel (see contact section above) and we can help you.
NMatrix is SciRuby's numerical matrix core, implementing dense matrices as well as two types of sparse matrix (linked-list-based and Yale/CSR). NMatrix is a fairly new but well-established project which has received Summer-of-Code-like grants from both Brighter Planet and the Ruby Association (in other words, from Matz, who created Ruby). Those who contribute to NMatrix will likely become authors of a jointly published, peer-reviewed science article on the library. Additionally, NMatrix is a good place to gain practical C and C++ experience while also working to improve Ruby.
- Mentors: John Woods (@mohawkjohn)
- ATLAS Functionality. NMatrix has many but not all ATLAS (cBLAS) and LAPACK functions exposed. We would like to see a consistent interface which makes sense in Ruby. We also want to be able to design and implement several `NMatrix` methods which depend upon ATLAS, cBLAS, and cLAPACK functions.
- Rational Functionality. NMatrix includes some rational number capability, but support is lacking in areas where ATLAS functions are required, since ATLAS does not have a rational type. Rational-specific equivalents of ATLAS functions are needed. Along the way it may be possible to also implement some integer-specific ATLAS function equivalents.
- List matrix element-wise operations. Element-wise operations work for Yale and dense matrices, but not yet for list matrices.
- Slicing support. @flipback began implementing slicing support, but it is incomplete. Slicing needs to be integrated into a major portion of NMatrix methods.
- Basic matrix math functionality. Specifically, exponentials and square roots, matrix decomposition/factorization, calculation of norms, tensor products, principal component analysis (PCA).
- Statistical functions for matrices and vectors. Statsample needs to support NMatrix, accepting/returning matrices and vectors as well as single values.
- Sparse improvements. The "new" Yale matrices used by NMatrix, which store diagonals (zero and non-zero) separately from non-diagonal non-zeros, are inefficient for matrices that are taller than they are wide. One way to address the problem would be to introduce an alternate "old" Yale storage. Another would be to allow matrices to be stored and operated on transposed. The goal, overall, is to be able to produce efficient Yale/sparse vectors regardless of the vectors' orientation.
- extconf improvements. See Ruby-core improvements below. NMatrix uses `mkmf` for compilation of its C and C++ code, as well as linking ATLAS, LAPACK, and BLAS. But `mkmf` is difficult to use, and leads to compilation and linking problems, not just in NMatrix but elsewhere as well, particularly when working on multiple platforms (Linux, Mac, Windows, etc.). It'd be better to have a custom `extconf.rb`-related library for NMatrix to use for linking highly specialized C libraries like ATLAS. A successful implementation of this project would significantly reduce barriers for NMatrix adoption (e.g., by eliminating compiling and linking difficulties).
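For the sparse-improvements idea in the list above, the "old" Yale (CSR) layout can be sketched in plain Ruby. This is only an illustration of the storage scheme, not NMatrix's API or its C++ implementation; the class name `CsrMatrix` and its methods are hypothetical.

```ruby
# Minimal "old" Yale (CSR) container. Illustrative only; CsrMatrix and its
# methods are hypothetical names, not part of NMatrix.
class CsrMatrix
  attr_reader :rows, :cols, :values, :col_indices, :row_ptr

  # Build from a dense array of row arrays, dropping zero entries.
  def self.from_dense(dense)
    values = []
    col_indices = []
    row_ptr = [0]
    dense.each do |row|
      row.each_with_index do |x, j|
        next if x == 0
        values << x
        col_indices << j
      end
      # One pointer is appended per row, even when the row is empty.
      row_ptr << values.size
    end
    cols = dense.empty? ? 0 : dense.first.size
    new(dense.size, cols, values, col_indices, row_ptr)
  end

  def initialize(rows, cols, values, col_indices, row_ptr)
    @rows = rows
    @cols = cols
    @values = values
    @col_indices = col_indices
    @row_ptr = row_ptr
  end

  # Compute y = A * x, touching only the stored non-zeros.
  def multiply_vector(x)
    raise ArgumentError, 'dimension mismatch' unless x.size == cols
    Array.new(rows) do |i|
      (row_ptr[i]...row_ptr[i + 1]).sum { |k| values[k] * x[col_indices[k]] }
    end
  end
end

m = CsrMatrix.from_dense([[1, 0, 2], [0, 0, 0], [0, 3, 0], [4, 0, 0]])
p m.row_ptr
p m.multiply_vector([1, 1, 1])
```

Note that `row_ptr` always holds one entry per row plus one, so the orientation of a vector affects how much index storage it needs; that trade-off is what the project idea asks contributors to address.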
- Mentors: John Woods (@mohawkjohn)
- Ruby-core projects, particularly `mkmf`, require a good understanding of C as well as Ruby. C++ experience would also be beneficial, but is not mandatory.
- `mkmf` is the library Ruby uses, typically in `extconf.rb` in gems or other libraries (including NMatrix), for linking C and C++ extensions. It lacks documentation. Most people currently figure it out by trial-and-error. A successful `mkmf`-related project would accomplish one or both of the following goals:
  - Provide complete documentation and examples for `mkmf`, drawing from current Ruby extensions as well as supposing hypothetical extensions.
  - Propose and implement an update to `mkmf`, which improves Ruby extension compilation and linking. Such a project would be extremely popular in the broader Ruby community.
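As a starting point for the kind of example the documentation goal asks for, here is a minimal `extconf.rb` sketch. It assumes an extension that links against a CBLAS library; the extension name `example_ext` is hypothetical, and the library and header names (`cblas`, `cblas.h`) vary between platforms and ATLAS installations.

```ruby
# extconf.rb -- illustrative sketch only. "example_ext" is a hypothetical
# extension name; library and header names differ between platforms.
require 'mkmf'

# Accept --with-atlas-dir, --with-atlas-include and --with-atlas-lib
# on the command line so users can point at a non-standard install.
dir_config('atlas')

# Stop early with a readable message if a dependency is missing.
abort 'cblas.h not found' unless have_header('cblas.h')
abort 'libcblas not found' unless have_library('cblas', 'cblas_dgemm')

# Write the Makefile that builds the extension.
create_makefile('example_ext')
```

Running `ruby extconf.rb` followed by `make` would then build the extension, provided a C compiler and the CBLAS headers are available.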
- Mentors: Carlos Agarie (@agarie), Claudio Bustos (@clbustos)
- It needs to be re-designed
- Must use NMatrix internally for speed and to use its IO module for numeric and integer data.
- Based on Pandas: http://pandas.pydata.org/pandas-docs/dev/
- Talk to BioRuby folks about what's necessary in this format.
- Mentors: Carlos Agarie (@agarie), John Woods (@mohawkjohn)
- http://d3js.org/
- Ruby D3. Rubyvis, our current visualization tool, is a Ruby port of Protovis. Protovis was recently supplanted by D3. We would like to produce a Ruby port of D3.
- Rubyvis/D3 JavaScript helpers for Rails. Rubyvis is pure Ruby code, but Protovis and D3 are JavaScript. It would be nice to be able to write Rubyvis code which can either render SVGs directly or produce JavaScript code that can render SVGs in a web browser. The goal is to provide interactive scientific tools for Ruby on Rails.
- Mentors: Max Makarochkin (@mac-r), Carlos Agarie (@agarie)
- It's a modular DSL (Domain Specific Language) for predictive analysis, i.e. you can use it to build diverse kinds of classifiers and systems based on machine learning. If you want to learn about Data Science, this is a very good project to work on.
- https://github.com/ajaila/ajaila/
- We must describe what tasks someone could undertake by choosing Ajaila as their project.
- Ruby, data analysis, data science.
- Mentors: Claudio Bustos (@clbustos)
- Remove support for Ruby version < 1.9.3
- Create more specs (with current version of RSpec)
- Improve the documentation based on RDoc and the current style used in NMatrix
- Create modules for Generalized Linear Models (GLM) and Time Series Analysis
- Mentors: Claudio Bustos (@clbustos)
- Remove support for Ruby version < 1.9.3
- Add more minimization methods
- Create more specs (with current version of RSpec)
- Maybe 'optimization', as it'd allow more general algorithms or something?
- http://docs.scipy.org/doc/scipy/reference/optimize.html
- Improve the documentation based on RDoc and the current style used in NMatrix
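As an illustration of the kind of one-dimensional method that "more minimization methods" could cover, here is a golden-section search in plain Ruby. It is a sketch under the assumption that the function is unimodal on the given interval; the method name and keyword arguments are hypothetical and do not reflect the existing minimization gem's API.

```ruby
# Golden-section search for the minimum of a unimodal function on [a, b].
# Hypothetical helper for illustration; returns the midpoint of the final
# bracket once it is narrower than tol (or after max_iter iterations).
def golden_section_minimize(a, b, tol: 1e-8, max_iter: 200)
  inv_phi = (Math.sqrt(5) - 1) / 2.0 # 1/phi, roughly 0.618
  c = b - inv_phi * (b - a)
  d = a + inv_phi * (b - a)
  fc = yield(c)
  fd = yield(d)
  max_iter.times do
    break if (b - a).abs < tol
    if fc < fd
      # Minimum lies in [a, d]; the old c becomes the new upper interior point.
      b, d, fd = d, c, fc
      c = b - inv_phi * (b - a)
      fc = yield(c)
    else
      # Minimum lies in [c, b]; the old d becomes the new lower interior point.
      a, c, fc = c, d, fd
      d = a + inv_phi * (b - a)
      fd = yield(d)
    end
  end
  (a + b) / 2.0
end

x_min = golden_section_minimize(0.0, 5.0) { |t| (t - 2.0)**2 }
puts x_min
```

Each iteration reuses one of the two previous function values, so only one new evaluation is needed per step.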
- Mentors: Claudio Bustos (@clbustos)
- Improve its API
- Remove support for Ruby version < 1.9.3
- Create more specs (with current version of RSpec)
- Add more integration methods: be more explicit about each method's imprecisions
- Add support for ODEs
- http://docs.scipy.org/doc/scipy/reference/integrate.html
- Improve the documentation based on RDoc and the current style used in NMatrix
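One way to be "more explicit about each method's imprecisions" is to return an error estimate alongside the result. The sketch below does this for composite Simpson's rule by comparing two step sizes. The method names are hypothetical and are not the current integration gem's API, and the error estimate is only a heuristic that assumes a smooth integrand.

```ruby
# Composite Simpson's rule on [a, b] with n subintervals (n must be even).
# Hypothetical helper for illustration.
def simpson(a, b, n)
  raise ArgumentError, 'n must be a positive even integer' unless n.positive? && n.even?
  h = (b - a) / n.to_f
  sum = yield(a) + yield(b)
  (1...n).each do |i|
    # Interior points alternate between weights 4 (odd index) and 2 (even).
    sum += (i.odd? ? 4 : 2) * yield(a + i * h)
  end
  sum * h / 3.0
end

# Returns [estimate, error_estimate]. For smooth integrands Simpson's error
# shrinks by about 16x when n doubles, so |S_2n - S_n| / 15 is a common
# heuristic estimate of the error in S_2n.
def simpson_with_error(a, b, n = 16, &f)
  coarse = simpson(a, b, n, &f)
  fine = simpson(a, b, 2 * n, &f)
  [fine, (fine - coarse).abs / 15.0]
end

value, err = simpson_with_error(0.0, Math::PI) { |x| Math.sin(x) }
puts "integral ~ #{value}, estimated error ~ #{err}"
```

Returning the estimate as a second value lets callers decide whether to refine the grid instead of trusting a single number.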
- Mentors: Claudio Bustos (@clbustos)
- Update to current versions of JRuby and MRI.
- Remove support for Ruby version < 1.9.3
- Create more specs (with current version of RSpec)
- Use a more modular approach to each distribution, i.e. Strategy pattern
- Improve the documentation based on RDoc and the current style used in NMatrix
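The Strategy-pattern idea in the list above could look roughly like the following. This is one possible reading of the proposal, not the distribution gem's current design; every module, class, and method name here is hypothetical.

```ruby
# Each strategy object implements the same small interface (here just pdf).
# All names are hypothetical and chosen only for this sketch.
module SketchDistributions
  class Normal
    def initialize(mean: 0.0, sd: 1.0)
      @mean = mean
      @sd = sd
    end

    def pdf(x)
      z = (x - @mean) / @sd
      Math.exp(-0.5 * z * z) / (@sd * Math.sqrt(2 * Math::PI))
    end
  end

  class Exponential
    def initialize(rate: 1.0)
      @rate = rate
    end

    def pdf(x)
      x < 0 ? 0.0 : @rate * Math.exp(-@rate * x)
    end
  end

  # Context object: delegates to whichever strategy it was given, so
  # calling code never depends on a concrete distribution class.
  class Density
    def initialize(strategy)
      @strategy = strategy
    end

    def pdf(x)
      @strategy.pdf(x)
    end

    # Evaluate the density at several points.
    def table(xs)
      xs.map { |x| [x, pdf(x)] }
    end
  end
end

normal = SketchDistributions::Density.new(SketchDistributions::Normal.new)
puts normal.pdf(0.0)
```

Adding a new distribution, or a faster backend for an existing one, then means adding one class with the same interface rather than editing shared code.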
- Related: issue #5.
Create a website for guides on how to use SciRuby and its subprojects, possibly with other Ruby packages.
- Mentors: Carlos Agarie (@agarie)
- Must be generated with a single rake command
- Use RailsGuides as an inspiration https://github.com/rails/rails/tree/master/guides
- Must be "easy to navigate on"
- Show my own "Create an NMatrix" as a starting point?
- Web development, Ruby, Jekyll, Rake, user interface, user experience
- Related: issue #24.
- Mentors: Carlos Agarie (@agarie)
- Base it on the Rails one: http://api.rubyonrails.org/
- https://github.com/tenderlove/horo
- https://github.com/rdoc/hanna-nouveau
- Technologies: Ruby, RDoc, parsing, how to package gems, documentation generators, user interface, user experience, web design
- Integrate it into sciruby.com's workflow: Ruby, Rake, automation
- Mentors:
- From their site: "LEMON stands for Library for Efficient Modeling and Optimization in Networks. It is a C++ template library providing efficient implementations of common data structures and algorithms with focus on combinatorial optimization tasks connected mainly with graphs and networks."
- http://lemon.cs.elte.hu/pub/doc/1.2.3/annotated.html
- This would be a great chance to learn more about Ruby's C API.
- This library would allow us to create probabilistic graphical models with SciRuby (using statsample and distribution).
- Ruby, C API, wrapping external libraries, C/C++, graph theory
- Mentors:
- From their site: "Waffles seeks to be the world's most comprehensive collection of command-line tools for machine learning and data mining. Our native tools have minimal dependencies (no interpreter, VM, or runtime environment is necessary), and build cross-platform. If you have a useful data mining tool that meets these criteria, we want it in Waffles."
- There are various projects that can be created out of this:
  - Wrap the command line interface (much like mini_magick does for imagemagick).
  - Create native bindings that link in with NMatrix or Sciruby::Dataset.
  - Create an FFI interface.
- Technologies/skills: Ruby, C/C++, wrapping external libraries, machine learning, statistics, command-line tools
- Mentors:
- Verify the compatibility between JRuby and SciRuby's subprojects
- It'd allow us to take advantage of JRuby's multithreaded environment!
- Research whether pure Ruby libraries (D3, Rubyvis, minimization) can benefit from using Java "native" extensions, and whether the extra complexity pays off.