Ondřej Moravčík edited this page Mar 14, 2015 · 16 revisions

There are three ways to configure ruby-spark and Spark.

By environment variable

SPARK_RUBY_SERIALIZER="oj" bin/ruby-spark pry

By configuration

This must be done before Spark is started.

Spark.config do
  set_app_name "RubySpark"
  set_master "local[*]"
  set "spark.ruby.serializer", "oj"
end
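
To make the ordering concrete, here is a minimal sketch of configuring first and starting afterwards. It assumes the ruby-spark gem is installed and that `Spark.start` and `Spark.sc` behave as described in the project README (neither call appears on this page). The helper name `start_configured_spark` is hypothetical and only wraps the steps so the order is explicit.

```ruby
# Sketch only: assumes the ruby-spark gem and its Spark/Java setup are installed.
# `start_configured_spark` is a hypothetical helper name, not part of ruby-spark.
def start_configured_spark
  require "ruby-spark"

  # Step 1: set the configuration while Spark is not yet running.
  Spark.config do
    set_app_name "RubySpark"
    set_master "local[*]"
    set "spark.ruby.serializer", "oj"
  end

  # Step 2: start Spark only after the configuration is in place
  # (assumed API, taken from the ruby-spark README).
  Spark.start

  # Step 3: return the context so the caller can create RDDs with it.
  Spark.sc
end
```

Calling `sc = start_configured_spark` would then give a context that uses the "oj" serializer by default, provided the assumptions above hold.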

When uploading data

$sc.parallelize(1..10, 3, serializer: "oj")