Commit
* Add Proto BigDiffy example
* Update READMEs
* Clean up some irrelevant changes
* Change to Example Proto record, fix SBT settings
* Scalastyle fixes
* Add protoBufSettings to other projects
* Add proto settings to CLI project
1 parent d75eda4, commit 91bb160
Showing 9 changed files with 100 additions and 23 deletions.
@@ -1,9 +1,15 @@
 Examples
 =======
 
-These example cover different use cases for generating Avro or TableRow data with Ratatool and Scalacheck.
+## Scalacheck
+These examples cover different use cases for generating Avro or TableRow data with Ratatool and Scalacheck.
 The constraints are based on arbitrary criteria defined for [Avro](https://github.com/spotify/ratatool/blob/master/ratatool-examples/src/main/avro/schema.avsc)
 and [BigQuery](https://github.com/spotify/ratatool/blob/master/ratatool-examples/src/main/resources/schema.json)
 which should mirror some real life use cases of generating data where some fields have expected values
 or behaviour. It is recommended to do some reading on ScalaCheck and how Generators work before digging into
-these examples. Some resources are provided [here](https://github.com/spotify/ratatool/wiki/Generators).
+these examples. Some resources are provided [here](https://github.com/spotify/ratatool/wiki/Generators).
+
+## Diffy
+Contains an example of using BigDiffy with Protobuf programmatically, as this is not currently supported
+in the CLI. This should serve as a reasonable workaround for users to build their own specific pipelines
+until a more generic version can be made.
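Note on the Scalacheck section of the README above: for readers who have not yet worked with ScalaCheck generators, the following is a minimal sketch of what "some fields have expected values" can look like in practice. It uses only plain ScalaCheck (`org.scalacheck.Gen`), which must be on the classpath, and does not call any Ratatool API. The `Track` case class, its fields, and the chosen constraints are hypothetical and are not taken from the Avro or BigQuery schemas linked in the README.

```scala
import org.scalacheck.Gen

// Hypothetical record type used only for illustration.
// It is NOT one of the schemas referenced by the README.
final case class Track(id: String, durationMs: Long, market: String)

object ConstrainedGenSketch {
  // IDs are always 8 alphanumeric characters.
  val idGen: Gen[String] = Gen.listOfN(8, Gen.alphaNumChar).map(_.mkString)

  // Durations are bounded (inclusive) to a plausible range in milliseconds.
  val durationGen: Gen[Long] = Gen.choose(1000L, 600000L)

  // Markets are drawn from a fixed set of expected values.
  val marketGen: Gen[String] = Gen.oneOf("SE", "US", "GB")

  // Generators compose with a for-comprehension.
  val trackGen: Gen[Track] = for {
    id <- idGen
    duration <- durationGen
    market <- marketGen
  } yield Track(id, duration, market)

  def main(args: Array[String]): Unit = {
    // `sample` returns an Option because a generator may fail to produce a value.
    Gen.listOfN(3, trackGen).sample.foreach(_.foreach(println))
  }
}
```

The same composition pattern (small per-field generators combined in a for-comprehension) is what the linked wiki page on Generators builds on; how it maps onto the Ratatool example schemas is left to the examples themselves.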
@@ -0,0 +1,9 @@
+syntax = "proto2";
+
+option java_package = "com.spotify.ratatool.examples.proto";
+option optimize_for = SPEED;
+
+message ExampleRecord {
+  required string string_field = 1;
+  required int64 int64_field = 2;
+}
50 changes: 50 additions & 0 deletions
...examples/src/main/scala/com/spotify/ratatool/examples/diffy/ProtobufBigDiffyExample.scala
@@ -0,0 +1,50 @@
+/*
+ * Copyright 2018 Spotify AB.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package com.spotify.ratatool.examples.diffy
+
+import java.net.URI
+
+import com.spotify.ratatool.GcsConfiguration
+import com.spotify.ratatool.diffy.{BigDiffy, ProtoBufDiffy}
+import com.spotify.ratatool.examples.proto.Schemas.ExampleRecord
+import org.apache.hadoop.fs.{FileSystem, Path}
+import com.spotify.scio._
+
+object ProtobufBigDiffyExample {
+  def recordKeyFn(t: ExampleRecord): String = {
+    t.getStringField
+  }
+
+  def main(cmdlineArgs: Array[String]): Unit = {
+    val (sc, args) = ContextAndArgs(cmdlineArgs)
+
+    val (lhs, rhs, output, header, ignore, unordered) =
+      (args("lhs"), args("rhs"), args("output"),
+        args.boolean("with-header", false), args.list("ignore").toSet,
+        args.list("unordered").toSet)
+
+    val fs = FileSystem.get(new URI(rhs), GcsConfiguration.get())
+    val path = fs.globStatus(new Path(rhs)).head.getPath
+    val diffy = new ProtoBufDiffy[ExampleRecord](ignore, unordered)
+    val result = BigDiffy.diffProtoBuf[ExampleRecord](sc, lhs, rhs, recordKeyFn, diffy)
+
+    BigDiffy.saveStats(result, output, header)
+
+    sc.close().waitUntilDone()
+  }
+}
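Note on the example above: it keys each record with `recordKeyFn`, then compares the left-hand and right-hand inputs while skipping the fields listed in `ignore`. As a rough illustration of that idea only, here is a small in-memory sketch using nothing but the Scala standard library. It is not Ratatool's implementation and does not run on Scio. `Rec`, `KeyResult`, `fieldDiff`, and `diff` are hypothetical names introduced here, and `Rec` is merely a stand-in for the generated `ExampleRecord` class.

```scala
// Hypothetical stand-in for the generated ExampleRecord (string_field, int64_field).
final case class Rec(stringField: String, int64Field: Long)

// Hypothetical per-key outcome of a comparison.
sealed trait KeyResult
case object Same extends KeyResult
final case class Different(fields: Set[String]) extends KeyResult
case object MissingLhs extends KeyResult // key only present on the right-hand side
case object MissingRhs extends KeyResult // key only present on the left-hand side

object KeyedDiffSketch {
  // Names of the fields that differ between two records, minus ignored ones.
  def fieldDiff(l: Rec, r: Rec, ignore: Set[String]): Set[String] = {
    val candidates = Seq(
      "string_field" -> (l.stringField != r.stringField),
      "int64_field" -> (l.int64Field != r.int64Field))
    candidates.collect { case (name, true) if !ignore(name) => name }.toSet
  }

  // Key both sides, then classify every key seen on either side.
  // If a key occurs more than once on one side, the last record wins.
  def diff(lhs: Seq[Rec], rhs: Seq[Rec], keyFn: Rec => String,
           ignore: Set[String]): Map[String, KeyResult] = {
    val lhsByKey = lhs.map(x => keyFn(x) -> x).toMap
    val rhsByKey = rhs.map(x => keyFn(x) -> x).toMap
    (lhsByKey.keySet ++ rhsByKey.keySet).map { k =>
      val res: KeyResult = (lhsByKey.get(k), rhsByKey.get(k)) match {
        case (Some(a), Some(b)) =>
          val d = fieldDiff(a, b, ignore)
          if (d.isEmpty) Same else Different(d)
        case (Some(_), None) => MissingRhs
        case (None, Some(_)) => MissingLhs
        case (None, None) => Same // unreachable: k comes from one of the key sets
      }
      k -> res
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val lhs = Seq(Rec("a", 1L), Rec("b", 2L), Rec("c", 3L))
    val rhs = Seq(Rec("a", 1L), Rec("b", 20L), Rec("d", 4L))
    // Key by string_field, mirroring recordKeyFn in the example above.
    diff(lhs, rhs, _.stringField, Set.empty).toSeq.sortBy(_._1).foreach(println)
  }
}
```

Passing `Set("int64_field")` as `ignore` makes key `b` compare as equal in this sketch, which mirrors the role the `ignore` argument plays in the committed example.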