Parallel Transform
Kiba Pro ParallelTransform provides an easy way to process a group of rows at the same time using a pool of threads.
In its current state, it is intended to accelerate ETL transforms that perform IO operations, such as HTTP requests, by running them on multiple threads instead of a single one.
Currently tested against MRI Ruby 2.4-2.7. It has not yet been strictly tested against JRuby or TruffleRuby, but it will likely work equally well there (if it does not, get in touch!).
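To illustrate why threads help with IO-bound transforms even on MRI, here is a minimal sketch using only plain Ruby threads. It is not Kiba Pro code: `slow_fetch` is a hypothetical stand-in for an HTTP call, with `sleep` simulating network latency. MRI releases its global lock while a thread waits on IO (or sleeps), so the waits can overlap.

```ruby
# Hypothetical stand-in for an IO-bound call; sleep simulates an HTTP round trip.
def slow_fetch(id)
  sleep 0.1
  { id: id, extra_data: "payload-#{id}" }
end

# Monotonic clock, suitable for measuring elapsed time.
def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

ids = (1..5).to_a

# One call after the other: total time is roughly the sum of all waits.
started = monotonic_now
sequential_rows = ids.map { |id| slow_fetch(id) }
sequential_seconds = monotonic_now - started

# One thread per call: the waits overlap, so total time is close to a single wait.
started = monotonic_now
threaded_rows = ids.map { |id| Thread.new { slow_fetch(id) } }.map(&:value)
threaded_seconds = monotonic_now - started

puts format("sequential: %.2fs, threaded: %.2fs", sequential_seconds, threaded_seconds)
```

Both runs produce the same rows in the same order; only the elapsed time differs. CPU-bound work would not see a comparable gain on MRI, which is consistent with the IO-oriented scope described above.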
Requirements: add concurrent-ruby to your Gemfile.
require 'kiba-pro/transforms/parallel_transform'

job = Kiba.parse do
  extend Kiba::Pro::Transforms::ParallelTransform::DSLExtension

  # SNIP

  parallel_transform(concurrency: 10) do |r|
    extra_data = get_extra_json_hash_from_http!(r.fetch(:extra_data_url))
    r.merge(extra_data: extra_data)
  end

  # SNIP
end
The parallel_transform call is actually a shortcut for:
transform Kiba::Pro::Transforms::ParallelTransform,
  concurrency: 10,
  on_row: -> (r) { ... transform code ... }
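The long form shows that the transform is an ordinary class receiving `concurrency:` and `on_row:` keyword arguments. This page does not describe the internals of ParallelTransform, so the following is only a rough sketch of how a class with that shape could work, using plain threads instead of concurrent-ruby. `ThreadedBatchTransform` is a hypothetical name, and the batching strategy is an assumption, not the Kiba Pro implementation.

```ruby
# Hypothetical illustration only, NOT the Kiba Pro implementation.
class ThreadedBatchTransform
  def initialize(concurrency:, on_row:)
    @concurrency = concurrency
    @on_row = on_row
    @buffer = []
  end

  # Buffer incoming rows; once `concurrency` rows are waiting,
  # process them together and yield the results.
  def process(row, &block)
    @buffer << row
    flush(&block) if @buffer.size >= @concurrency
    nil # results are yielded, so no row is returned directly
  end

  # Process whatever is still buffered at the end of the run.
  def close(&block)
    flush(&block)
  end

  private

  def flush
    threads = @buffer.map { |row| Thread.new { @on_row.call(row) } }
    @buffer = []
    # Thread#value waits for each thread; iterating in order keeps row order.
    threads.each { |thread| yield thread.value }
  end
end

# Drive the class by hand, the way a job runner would call it.
transform = ThreadedBatchTransform.new(
  concurrency: 2,
  on_row: ->(r) { r.merge(doubled: r.fetch(:n) * 2) }
)

output = []
[{ n: 1 }, { n: 2 }, { n: 3 }].each do |row|
  transform.process(row) { |out| output << out }
end
transform.close { |out| output << out }
```

The lambda passed as `on_row:` plays the same role as the block given to parallel_transform: it receives a row and returns the transformed row.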