This repository has been archived by the owner on Jan 2, 2023. It is now read-only.

Commit
Add support for LazyOutputFormat in the DSL
The `output` DSL method now takes a `:lazy` option, which configures LazyOutputFormat.
iconara committed Oct 16, 2015
1 parent 5243f40 commit 429d978
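
The effect of the `:lazy` option described in the commit message can be illustrated without Hadoop. The following is a minimal pure-Ruby sketch of the idea only: an output file is created on the first write rather than when the writer is set up, so a task that emits nothing leaves no part file behind. `LazyWriter` and `run_lazy_demo` are hypothetical names invented for this illustration; they are not part of Rubydoop or Hadoop and do not reflect how LazyOutputFormat is implemented.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a record writer (assumption, not a Rubydoop
# or Hadoop class). It defers opening its file until the first write.
class LazyWriter
  def initialize(path)
    @path = path
    @io = nil
  end

  # Open the file on first use, then append a tab-separated record.
  def write(key, value)
    @io ||= File.open(@path, 'w')
    @io.puts("#{key}\t#{value}")
  end

  # Closing a writer that never wrote anything creates no file.
  def close
    @io.close if @io
  end
end

# Runs two writers in a temporary directory and returns the names of
# the files that exist afterwards.
def run_lazy_demo
  Dir.mktmpdir do |dir|
    silent = LazyWriter.new(File.join(dir, 'part-r-00000'))
    silent.close # never wrote, so no file appears

    chatty = LazyWriter.new(File.join(dir, 'part-r-00001'))
    chatty.write('e', 128)
    chatty.close

    (Dir.entries(dir) - %w[. ..]).sort
  end
end
```

Calling `run_lazy_demo` lists only the file belonging to the writer that emitted a record. This mirrors what the integration spec in this commit checks for the real job: the reducer emits nothing, so no `part-r-*` files are expected.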
Showing 4 changed files with 30 additions and 0 deletions.
3 changes: 3 additions & 0 deletions lib/rubydoop/dsl.rb
@@ -139,6 +139,9 @@ def output(dir=nil, options={})
        end
        format.set_output_path(@job, Hadoop::Fs::Path.new(@output_dir))
        @job.set_output_format_class(format)
        if options[:lazy]
          Hadoop::Mapreduce::Lib::Output::LazyOutputFormat.set_output_format_class(@job, format)
        end
      end
      @output_dir
    end
7 changes: 7 additions & 0 deletions spec/integration/hadoop_system_spec.rb
@@ -155,5 +155,12 @@ def isolated_run(dir, cmd)
        expect(uniques['e']).to eq 128
      end
    end

    context 'the lazy output job' do
      it 'produces no output files' do
        expect(File.exist?('data/output/lazy_output/_SUCCESS')).to be_truthy
        expect(Dir['data/output/lazy_output/part-r-*']).to be_empty
      end
    end
  end
end
8 changes: 8 additions & 0 deletions spec/resources/test_project/lib/lazy_output.rb
@@ -0,0 +1,8 @@
# encoding: utf-8

module LazyOutput
  class Reducer
    def reduce(key, values, context)
    end
  end
end
12 changes: 12 additions & 0 deletions spec/resources/test_project/lib/test_project.rb
@@ -6,6 +6,7 @@

require 'word_count'
require 'uniques'
require 'lazy_output'


Rubydoop.configure do |input_path, output_path|
@@ -46,6 +47,17 @@
    output_key Hadoop::Io::Text
    output_value Hadoop::Io::Text
  end

  job 'lazy_output' do
    input input_path
    output "#{output_path}/lazy_output", lazy: true

    mapper WordCount::Mapper
    reducer LazyOutput::Reducer

    output_key Hadoop::Io::Text
    output_value Hadoop::Io::IntWritable
  end
end

cc = Rubydoop::ConfigurationDefinition.new
