This repository has been archived by the owner on Jun 9, 2023. It is now read-only.

Merge pull request embulk#23 from trocco-io/add_column_type_numeric
[Ruby version][ADD] NUMERIC support
HokutoMorita authored Jul 20, 2022
2 parents be2ad7c + 3f863c9 commit 61fade9
Showing 5 changed files with 44 additions and 4 deletions.
25 changes: 25 additions & 0 deletions .github/workflows/gem-push.yml
@@ -0,0 +1,25 @@
+name: Ruby Gem
+
+on:
+  workflow_dispatch:
+  push:
+    tags:
+      - 'v*'
+
+jobs:
+  build:
+    name: Build + Publish
+    runs-on: ubuntu-latest
+    permissions:
+      packages: write
+      contents: read
+    steps:
+      - uses: actions/checkout@v2
+      - name: Set up Ruby 2.7
+        uses: ruby/setup-ruby@v1
+        with:
+          ruby-version: 2.7
+      - name: push gem
+        uses: trocco-io/push-gem-to-gpr-action@v1
+        with:
+          github-token: "${{ secrets.GITHUB_TOKEN }}"
2 changes: 1 addition & 1 deletion Gemfile
@@ -1,7 +1,7 @@
source 'https://rubygems.org/'

gemspec
-gem 'embulk'
+gem 'embulk', '< 0.10'
gem 'liquid', '= 4.0.0' # the version included in embulk.jar
gem 'embulk-parser-none'
gem 'embulk-parser-jsonl'
4 changes: 3 additions & 1 deletion README.md
@@ -307,18 +307,20 @@ Column options are used to aid guessing BigQuery schema, or to define conversion

- **column_options**: advanced: an array of options for columns
- **name**: column name
-- **type**: BigQuery type such as `BOOLEAN`, `INTEGER`, `FLOAT`, `STRING`, `TIMESTAMP`, `DATETIME`, `DATE`, and `RECORD`. See belows for supported conversion type.
+- **type**: BigQuery type such as `BOOLEAN`, `INTEGER`, `FLOAT`, `STRING`, `TIMESTAMP`, `DATETIME`, `DATE`, `RECORD`, and `NUMERIC`. See belows for supported conversion type.
- boolean: `BOOLEAN`, `STRING` (default: `BOOLEAN`)
- long: `BOOLEAN`, `INTEGER`, `FLOAT`, `STRING`, `TIMESTAMP` (default: `INTEGER`)
- double: `INTEGER`, `FLOAT`, `STRING`, `TIMESTAMP` (default: `FLOAT`)
- string: `BOOLEAN`, `INTEGER`, `FLOAT`, `STRING`, `TIMESTAMP`, `DATETIME`, `DATE`, `RECORD` (default: `STRING`)
- timestamp: `INTEGER`, `FLOAT`, `STRING`, `TIMESTAMP`, `DATETIME`, `DATE` (default: `TIMESTAMP`)
- json: `STRING`, `RECORD` (default: `STRING`)
+- numeric: `STRING`
- **mode**: BigQuery mode such as `NULLABLE`, `REQUIRED`, and `REPEATED` (string, default: `NULLABLE`)
- **fields**: Describes the nested schema fields if the type property is set to RECORD. Please note that this is **required** for `RECORD` column.
- **timestamp_format**: timestamp format to convert into/from `timestamp` (string, default is `default_timestamp_format`)
- **timezone**: timezone to convert into/from `timestamp`, `date` (string, default is `default_timezone`).
- **description**: description for the column.
+- **scale**: optional, [scale](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types?hl=ja#decimal_types) for numeric column (long, default is 9).
- **default_timestamp_format**: default timestamp format for column_options (string, default is "%Y-%m-%d %H:%M:%S.%6N")
- **default_timezone**: default timezone for column_options (string, default is "UTC")

2 changes: 1 addition & 1 deletion embulk-output-bigquery.gemspec
@@ -1,6 +1,6 @@
Gem::Specification.new do |spec|
spec.name = "embulk-output-bigquery"
-spec.version = "0.6.13"
+spec.version = "0.7.0"
spec.authors = ["Satoshi Akama", "Naotoshi Seo"]
spec.summary = "Google BigQuery output plugin for Embulk"
spec.description = "Embulk plugin that insert records to Google BigQuery."
15 changes: 14 additions & 1 deletion lib/embulk/output/bigquery/value_converter_factory.rb
@@ -1,6 +1,7 @@
require 'time'
require 'time_with_zone'
require 'json'
+require 'bigdecimal'
require_relative 'helper'

module Embulk
@@ -14,6 +15,7 @@ class TypeCastError < StandardError; end

DEFAULT_TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%6N" # BigQuery timestamp format
DEFAULT_TIMEZONE = "UTC"
+DEFAULT_SCALE = 9

# @param [Hash] task
# @option task [String] default_timestamp_format
@@ -29,13 +31,15 @@ def self.create_converters(task, schema)
column_name = column[:name]
embulk_type = column[:type]
column_option = column_options_map[column_name] || {}
+scale = column_option['scale'] || DEFAULT_SCALE
self.new(
embulk_type, column_option['type'],
timestamp_format: column_option['timestamp_format'],
timezone: column_option['timezone'],
strict: column_option['strict'],
default_timestamp_format: default_timestamp_format,
default_timezone: default_timezone,
+scale: scale
).create_converter
end
end
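
The `scale` lookup added in this hunk can be tried in isolation. The following is a minimal sketch, not the plugin's code: the `column_options` entries and the `resolve_scale` helper are hypothetical, and building `column_options_map` keyed by column name is an assumption, since that part of the file is not shown in the diff.

```ruby
DEFAULT_SCALE = 9 # same default value the diff introduces

# Hypothetical column_options, shaped like the plugin config after parsing.
column_options = [
  { 'name' => 'price',  'type' => 'NUMERIC', 'scale' => 2 },
  { 'name' => 'amount', 'type' => 'NUMERIC' },
]

# Assumption: the map is keyed by column name.
column_options_map = column_options.map { |opt| [opt['name'], opt] }.to_h

# Hypothetical helper mirroring `column_option['scale'] || DEFAULT_SCALE`.
def resolve_scale(column_options_map, column_name)
  column_option = column_options_map[column_name] || {}
  column_option['scale'] || DEFAULT_SCALE
end

puts resolve_scale(column_options_map, 'price')   # explicit scale from the option
puts resolve_scale(column_options_map, 'amount')  # falls back to DEFAULT_SCALE
```

A column with no entry in `column_options` also falls back to the default, because the lookup substitutes an empty hash.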
@@ -46,7 +50,8 @@ def initialize(
embulk_type, type = nil,
timestamp_format: nil, timezone: nil, strict: nil,
default_timestamp_format: DEFAULT_TIMESTAMP_FORMAT,
-default_timezone: DEFAULT_TIMEZONE
+default_timezone: DEFAULT_TIMEZONE,
+scale: DEFAULT_SCALE
)
@embulk_type = embulk_type
@type = (type || Helper.bq_type_from_embulk_type(embulk_type)).upcase
@@ -55,6 +60,7 @@ def initialize(
@timezone = timezone || default_timezone
@zone_offset = TimeWithZone.zone_offset(@timezone)
@strict = strict.nil? ? true : strict
+@scale = scale
end

def create_converter
@@ -231,6 +237,13 @@ def string_converter
JSON.parse(val)
end
}
+when 'NUMERIC'
+Proc.new {|val|
+next nil if val.nil?
+with_typecast_error(val) do |val|
+BigDecimal(val).round(@scale, BigDecimal::ROUND_CEILING)
+end
+}
else
raise NotSupportedType, "cannot take column type #{type} for string column"
end
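
To see what the new `NUMERIC` branch does to a string value, the conversion can be run outside the plugin. This is a sketch under stated assumptions: `to_numeric` is a hypothetical stand-in for the body of the `Proc` and omits the `with_typecast_error` wrapper, whose definition is not part of this diff.

```ruby
require 'bigdecimal'

DEFAULT_SCALE = 9 # same default value the diff introduces

# Hypothetical helper mirroring the NUMERIC Proc body (without with_typecast_error).
def to_numeric(val, scale = DEFAULT_SCALE)
  return nil if val.nil?
  # ROUND_CEILING rounds toward positive infinity, not to the nearest digit.
  BigDecimal(val).round(scale, BigDecimal::ROUND_CEILING)
end

puts to_numeric("1.2345678901").to_s("F")   # positive value is rounded up at the 9th decimal
puts to_numeric("-1.2345678901").to_s("F")  # negative value is rounded toward zero
puts to_numeric("1.001", 2).to_s("F")       # scale 2, as if set via the `scale` column option
```

Because the mode is `ROUND_CEILING`, any non-zero digits beyond the scale push a positive value up by one unit in the last kept place, while a negative value is effectively truncated. A string that is not a valid number makes `BigDecimal()` raise `ArgumentError`.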
