
Google Summer of Code 2013 Ideas

agarie edited this page Apr 1, 2013 · 43 revisions

Hey! We're still in the process of submitting SciRuby as an organization to Google. It's great that you're checking us out so early. If you're interested in contributing to SciRuby or any of its subprojects and/or want to know more about it, please contact us.

Contact

Feel free to reach us by joining #sciruby on chat.freenode.net or via our mailing list.

Instructions for students

You don't need to know a lot about Ruby before proposing a project: depending on how much you already know, it'll be pretty easy to learn enough to be able to contribute. However, you'll need some familiarity with scientific computation. If you don't have any, take a look at "Numerical Recipes in C", which you'll probably find in your university's library.

In any case, if you feel your skills aren't enough for some project, please ask us on our IRC channel (see contact section above) and we can help you.

Project ideas

Create a website for guides on how to use SciRuby and its subprojects, possibly with other Ruby packages.

  • Mentors: Carlos Agarie (@agarie)
  • Must be generated with a single rake command
  • Use RailsGuides as an inspiration https://github.com/rails/rails/tree/master/guides
  • Must be easy to navigate
  • Show my own "Create an NMatrix" as a starting point?
  • Web development, Ruby, Jekyll, Rake, user interface, user experience
  • Related: issue #24.
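
The "single rake command" requirement could look roughly like the sketch below. It is only an illustration: the `guides/` and `site/` directories, the `build_guides` helper and the `guides` task name are all hypothetical, and plain ERB stands in for Jekyll so that the example stays self-contained.

```ruby
require "rake"      # provides the `task` and `desc` DSL used at the bottom
require "erb"
require "fileutils"

# Minimal page template (assumption: a real site would use a proper layout
# and a Markdown renderer instead of dumping the text into <pre>).
GUIDE_TEMPLATE = ERB.new(<<~HTML)
  <html><head><title><%= title %></title></head>
  <body><pre><%= ERB::Util.html_escape(body) %></pre></body></html>
HTML

# Turns every *.md file in source_dir into an HTML page in output_dir
# and returns the list of files written.
def build_guides(source_dir, output_dir)
  FileUtils.mkdir_p(output_dir)
  Dir.glob(File.join(source_dir, "*.md")).sort.map do |path|
    title = File.basename(path, ".md")
    body = File.read(path)
    target = File.join(output_dir, "#{title}.html")
    File.write(target, GUIDE_TEMPLATE.result(binding))
    target
  end
end

desc "Build every guide into site/ (hypothetical task name)"
task :guides do
  build_guides("guides", "site")
end
```

With this in a Rakefile, `rake guides` would be the one command needed to regenerate the whole site.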

Create a new RDoc template optimized for our purposes

NMatrix

  • Mentors: John Woods (@mohawkjohn)
  • ATLAS Functionality. NMatrix has many but not all ATLAS (cBLAS) and LAPACK functions exposed. We would like to see a consistent interface which makes sense in Ruby. We also want to be able to design and implement several NMatrix methods which depend upon ATLAS, cBLAS, and cLAPACK functions.
  • Rational Functionality. NMatrix includes some rational number capability, but support is lacking in areas where ATLAS functions are required, since ATLAS does not have a rational type. Rational-specific equivalents of ATLAS functions are needed. Along the way it may be possible to also implement some integer-specific ATLAS function equivalents.
  • List matrix element-wise operations. Element-wise operations work for Yale and dense matrices, but not yet for list matrices.
  • Slicing support. @flipback began implementing slicing support, but it is incomplete. Slicing needs to be integrated into a major portion of NMatrix methods.
  • Basic matrix math functionality. Specifically, exponentials and square roots, matrix decomposition/factorization, calculation of norms, tensor products, principal component analysis (PCA).
  • Statistical functions for matrices and vectors. Statsample needs to support NMatrix, accepting/returning matrices and vectors as well as single values.
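
To give a feel for the "calculation of norms" item, here is a small sketch of three common matrix norms. It deliberately uses plain nested arrays rather than NMatrix, so the module and method names are assumptions for illustration and not part of any existing API.

```ruby
# Plain nested arrays (one inner array per row) stand in for a matrix here.
module MatrixNorms
  module_function

  # Frobenius norm: square root of the sum of squared magnitudes.
  def frobenius(rows)
    Math.sqrt(rows.flatten.sum { |x| x.abs**2 })
  end

  # 1-norm: largest absolute column sum.
  def one_norm(rows)
    rows.transpose.map { |column| column.sum(&:abs) }.max
  end

  # Infinity norm: largest absolute row sum.
  def infinity_norm(rows)
    rows.map { |row| row.sum(&:abs) }.max
  end
end

matrix = [[1, -2], [3, 4]]
MatrixNorms.one_norm(matrix)       # column sums are 4 and 6
MatrixNorms.infinity_norm(matrix)  # row sums are 3 and 7
```

A real implementation would have to work across NMatrix's storage types (dense, list, Yale) and dtypes, which is where most of the project effort would go.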

SciRuby::Dataset

  • Mentors: Carlos Agarie (@agarie), Claudio Bustos (@clbustos)
  • It needs to be re-designed
  • Must use NMatrix internally for speed and to use its IO module for numeric and integer data.
  • Based on Pandas: http://pandas.pydata.org/pandas-docs/dev/
  • Talk to BioRuby folks about what's necessary in this format.

Create a visualization package based on D3

  • Mentors: Carlos Agarie (@agarie)
  • http://d3js.org/
  • Ruby D3. Rubyvis, our current visualization tool, is a Ruby port of Protovis. Protovis was recently supplanted by D3. We would like to produce a Ruby port of D3.
  • Rubyvis/D3 JavaScript helpers for Rails. Rubyvis is pure Ruby code, but Protovis and D3 are JavaScript. It would be nice to be able to write Rubyvis code which can either render SVGs directly or produce JavaScript code that can render SVGs in a web browser. The goal is to provide interactive scientific tools for Ruby on Rails.
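
As a rough idea of what "rendering SVGs directly" from Ruby means, the sketch below builds a bar chart as an SVG string. It is not Rubyvis or D3 code; the `bar_chart_svg` helper and its keyword arguments are made up for this example.

```ruby
# Builds a minimal SVG bar chart from non-negative numbers.
# Bars are scaled so that the largest value fills the full height.
def bar_chart_svg(values, bar_width: 20, gap: 5, height: 100)
  raise ArgumentError, "values must not be empty" if values.empty?

  max = values.max.to_f
  width = values.length * (bar_width + gap) - gap
  bars = values.each_with_index.map do |value, index|
    bar_height = (value / max * height).round(2)
    x = index * (bar_width + gap)
    y = (height - bar_height).round(2)  # SVG's y axis points downwards
    %(<rect x="#{x}" y="#{y}" width="#{bar_width}" height="#{bar_height}"/>)
  end
  %(<svg xmlns="http://www.w3.org/2000/svg" width="#{width}" height="#{height}">#{bars.join}</svg>)
end

svg = bar_chart_svg([1, 2, 4])
```

A Rails helper could return such a string for static output, or emit equivalent JavaScript when the chart needs to be interactive in the browser.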

Ajaila

  • Mentors: Max Makarochkin (@mac-r), Carlos Agarie (@agarie)
  • It's a modular DSL (Domain Specific Language) for predictive analysis, i.e. you can use it to build diverse kinds of classifiers and systems based on machine learning. If you want to learn about Data Science, this is a very good project to work on.
  • https://github.com/ajaila/ajaila/
  • To do: describe the tasks a student could undertake by choosing Ajaila as their project.
  • Ruby, data analysis, data science.

Statsample

  • Mentors: Claudio Bustos (@clbustos)
  • Remove support for Ruby version < 1.9.3
  • Create more specs (with current version of RSpec)
  • Improve the documentation based on RDoc and the current style used in NMatrix
  • Create modules for Generalized Linear Models (GLM) and Time Series Analysis

Minimization

  • Mentors: Claudio Bustos (@clbustos)
  • Remove support for Ruby version < 1.9.3
  • Add more minimization methods
  • Create more specs (with current version of RSpec)
  • Maybe rename the project to 'optimization', since that would allow for more general algorithms?
  • http://docs.scipy.org/doc/scipy/reference/optimize.html
  • Improve the documentation based on RDoc and the current style used in NMatrix
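
As an example of the kind of method that could be added, here is a sketch of multidimensional gradient descent with a numerically estimated gradient. The function names and keyword arguments are hypothetical and do not reflect the minimization gem's current API.

```ruby
# Central-difference estimate of the gradient of the block at `point`.
def numerical_gradient(point, step: 1e-6)
  point.each_index.map do |i|
    forward = point.dup
    backward = point.dup
    forward[i] += step
    backward[i] -= step
    (yield(forward) - yield(backward)) / (2 * step)
  end
end

# Fixed-step gradient descent: stops when the gradient norm drops below
# `tolerance` or after `max_iterations` steps, whichever comes first.
def gradient_descent(start, learning_rate: 0.1, tolerance: 1e-8,
                     max_iterations: 10_000, &objective)
  point = start.map(&:to_f)
  max_iterations.times do
    gradient = numerical_gradient(point, &objective)
    break if Math.sqrt(gradient.sum { |g| g * g }) < tolerance

    point = point.zip(gradient).map { |x, g| x - learning_rate * g }
  end
  point
end

# Minimum of (x - 1)^2 + (y + 2)^2 is at (1, -2).
minimum = gradient_descent([0, 0]) { |(x, y)| (x - 1)**2 + (y + 2)**2 }
```

A fixed learning rate is the simplest possible choice; a production method would more likely use a line search or an adaptive step.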

Integration

  • Mentors: Claudio Bustos (@clbustos)
  • Improve its API
  • Remove support for Ruby version < 1.9.3
  • Create more specs (with current version of RSpec)
  • Add more integration methods, and be more explicit about each method's accuracy and error behavior
  • Add support for ODEs
  • http://docs.scipy.org/doc/scipy/reference/integrate.html
  • Improve the documentation based on RDoc and the current style used in NMatrix
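
One way to be explicit about a method's error is to return an estimate along with the result. The sketch below compares the trapezoid rule with Simpson's rule and derives a Richardson-style error estimate from two Simpson runs. All names are hypothetical and are not the integration gem's API.

```ruby
# Composite trapezoid rule; error shrinks like h^2 for smooth integrands.
def trapezoid(a, b, intervals)
  h = (b - a).to_f / intervals
  interior = (1...intervals).sum { |i| yield(a + i * h) }
  h * ((yield(a) + yield(b)) / 2.0 + interior)
end

# Composite Simpson's rule; error shrinks like h^4 for smooth integrands.
def simpson(a, b, intervals)
  raise ArgumentError, "intervals must be even" unless intervals.even?

  h = (b - a).to_f / intervals
  odd = (1...intervals).step(2).sum { |i| yield(a + i * h) }
  even = (2...intervals).step(2).sum { |i| yield(a + i * h) }
  h / 3.0 * (yield(a) + 4 * odd + 2 * even + yield(b))
end

# Returns [value, error_estimate]. Halving h cuts Simpson's error by
# roughly 16x, so the difference between the two runs divided by 15
# approximates the error of the finer one.
def simpson_with_error_estimate(a, b, intervals, &f)
  coarse = simpson(a, b, intervals, &f)
  fine = simpson(a, b, 2 * intervals, &f)
  [fine, (fine - coarse).abs / 15.0]
end

value, estimate = simpson_with_error_estimate(0, Math::PI, 10) { |x| Math.sin(x) }
```

The exact value of this integral is 2, so the returned estimate can be checked against the true error when experimenting.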

Distribution

  • Mentors: Claudio Bustos (@clbustos)
  • Update to current versions of JRuby and MRI.
  • Remove support for Ruby version < 1.9.3
  • Create more specs (with current version of RSpec)
  • Use a more modular approach to each distribution, i.e. Strategy pattern
  • Improve the documentation based on RDoc and the current style used in NMatrix
  • Related: issue #5.
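
The Strategy pattern idea could be sketched as follows: each backend is an interchangeable object answering the same methods, and the distribution picks the first one that is usable. The module and class names here are invented for illustration, and `NativeStub` is only a placeholder for a native-backed engine.

```ruby
module NormalStrategies
  # Pure-Ruby strategy: always available, relies only on Math.erf.
  module PureRuby
    module_function

    def available?
      true
    end

    def pdf(x, mean, sd)
      z = (x - mean) / sd
      Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math::PI))
    end

    def cdf(x, mean, sd)
      0.5 * (1 + Math.erf((x - mean) / (sd * Math.sqrt(2))))
    end
  end

  # Hypothetical native-backed strategy; reports itself as unavailable
  # here, so the pure-Ruby one is selected instead.
  module NativeStub
    module_function

    def available?
      false
    end
  end
end

class NormalDistribution
  attr_reader :strategy

  def initialize(mean: 0.0, sd: 1.0,
                 strategies: [NormalStrategies::NativeStub, NormalStrategies::PureRuby])
    @mean = mean.to_f
    @sd = sd.to_f
    @strategy = strategies.find(&:available?)
    raise ArgumentError, "no usable strategy" if @strategy.nil?
  end

  def pdf(x)
    @strategy.pdf(x, @mean, @sd)
  end

  def cdf(x)
    @strategy.cdf(x, @mean, @sd)
  end
end

standard_normal = NormalDistribution.new
standard_normal.cdf(0)  # => 0.5
```

Adding a new backend then means writing one more strategy object, without touching the distribution class itself.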

Create a Ruby wrapper for LEMON

  • Mentors:
  • From their site: "LEMON stands for Library for Efficient Modeling and Optimization in Networks. It is a C++ template library providing efficient implementations of common data structures and algorithms with focus on combinatorial optimization tasks connected mainly with graphs and networks."
  • http://lemon.cs.elte.hu/pub/doc/1.2.3/annotated.html
  • This would be a great chance to learn more about Ruby's C API.
  • This library would allow us to create probabilistic graphical models with SciRuby (using statsample and distribution).
  • Ruby, C API, wrapping external libraries, C/C++, graph theory

Create a Ruby wrapper for Waffles

  • Mentors:
  • From their site: "Waffles seeks to be the world's most comprehensive collection of command-line tools for machine learning and data mining. Our native tools have minimal dependencies (no interpreter, VM, or runtime environment is necessary), and build cross-platform. If you have a useful data mining tool that meets these criteria, we want it in Waffles."
  • There are various projects that can be created out of this:
      - Wrap the command-line interface (much like mini_magick does for ImageMagick).
      - Create native bindings that link in with NMatrix or SciRuby::Dataset.
      - Create an FFI interface.
  • Technologies/skills: Ruby, C/C++, wrapping external libraries, machine learning, statistics, command-line tools
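
The first option, wrapping the command-line interface, might start from something like the sketch below. It is a generic wrapper around an external executable using Ruby's standard Open3 library; the `CommandLineTool` class is hypothetical, and no actual Waffles command names or flags are assumed.

```ruby
require "open3"

# Thin wrapper around an external command-line tool.
class CommandLineTool
  class CommandFailed < StandardError; end

  def initialize(executable)
    @executable = executable
  end

  # Runs the executable with the given arguments and returns its standard
  # output. Arguments are passed as a list, so they are not interpreted
  # by a shell. Raises CommandFailed on a non-zero exit status.
  def run(*arguments)
    stdout, stderr, status = Open3.capture3(@executable, *arguments.map(&:to_s))
    unless status.success?
      raise CommandFailed, "#{@executable} exited with #{status.exitstatus}: #{stderr.strip}"
    end
    stdout
  end
end

# Demonstration with the Ruby interpreter itself as the wrapped tool.
ruby = CommandLineTool.new(RbConfig.ruby)
ruby.run("-e", "print 6 * 7")  # => "42"
```

A Waffles wrapper would build on this by adding one Ruby method per tool and parsing the output into Ruby objects. Note that `Open3.capture3` raises `Errno::ENOENT` if the executable cannot be found.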

JRuby

  • Mentors:
  • Verify the compatibility between JRuby and SciRuby's subprojects
  • It'd allow us to take advantage of JRuby's multithreaded environment!
  • Research whether pure-Ruby libraries (D3, Rubyvis, minimization) can benefit from using Java "native" extensions, and whether the extra complexity pays off.