Improve TypedParquetTuple #1302 #1303

Merged
merged 3 commits into from
May 30, 2015

Conversation

JiJiTang
Contributor

No description provided.

@JiJiTang
Contributor Author

Hi @johnynek, here's my commit with macros to generate Parquet read/write support instances. I'm wondering whether we should create a module like "scalding-parquet-macro" to hold all the macros and use them from "TypedParquet". Thank you very much for reviewing it, and please let me know your thoughts.

object ParquetInputOutputFormat {
  val READ_SUPPORT_INSTANCE = "parquet.read.support.instance"
  val WRITE_SUPPORT_INSTANCE = "parquet.write.support.instance"
  val deserialize = (GZippedBase64String(_)) andThen Bijection.bytes2GZippedBase64.inverse andThen KryoInjection.invert
Collaborator

Let's make one Injection and use that:

something like:

val injection: Injection[Any, String] = KryoInjection.andThen(Injection.connect[Array[Byte], GZippedBase64String, String])
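
To illustrate why a single composed value is attractive, here is a minimal standalone sketch. It does not use the Bijection library: `MiniInjection` is a hypothetical stand-in for `Injection`, Java serialization stands in for `KryoInjection`, and plain Base64 stands in for the `GZippedBase64String` step. All of these are assumptions made only so the example runs without dependencies.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.Base64
import scala.util.Try

// Hypothetical stand-in for an Injection: `apply` always succeeds,
// `invert` may fail on input that `apply` could not have produced.
trait MiniInjection[A, B] { self =>
  def apply(a: A): B
  def invert(b: B): Try[A]

  // Compose two injections; inversion runs the stages in reverse order.
  def andThen[C](next: MiniInjection[B, C]): MiniInjection[A, C] =
    new MiniInjection[A, C] {
      def apply(a: A): C = next(self(a))
      def invert(c: C): Try[A] = next.invert(c).flatMap(self.invert)
    }
}

object InjectionSketch {
  // Assumption: Java serialization in place of Kryo.
  val serialize: MiniInjection[Any, Array[Byte]] = new MiniInjection[Any, Array[Byte]] {
    def apply(a: Any): Array[Byte] = {
      val bytes = new ByteArrayOutputStream()
      val out = new ObjectOutputStream(bytes)
      out.writeObject(a.asInstanceOf[AnyRef])
      out.close()
      bytes.toByteArray
    }
    def invert(b: Array[Byte]): Try[Any] = Try {
      val in = new ObjectInputStream(new ByteArrayInputStream(b))
      try in.readObject() finally in.close()
    }
  }

  // Assumption: plain Base64 in place of gzipped Base64.
  val base64: MiniInjection[Array[Byte], String] = new MiniInjection[Array[Byte], String] {
    def apply(b: Array[Byte]): String = Base64.getEncoder.encodeToString(b)
    def invert(s: String): Try[Array[Byte]] = Try(Base64.getDecoder.decode(s))
  }

  // One composed value serves both directions, so the encode and decode
  // paths cannot drift apart.
  val injection: MiniInjection[Any, String] = serialize.andThen(base64)
}
```

With this shape, writing a value into a configuration string is `InjectionSketch.injection(value)` and reading it back is `InjectionSketch.injection.invert(string)`, which returns a `Try` rather than throwing on malformed input.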

Contributor Author

Fixed, that's how Injection should be used :P

@johnynek
Collaborator

Thanks a ton! I really appreciate how much work this is. I think we can make it super easy to read and write parquet: just as easy as TypedTsv[(Int, Int, String)]("mydata") is, except with all the benefits of Parquet.

@JiJiTang
Contributor Author

Hi @johnynek, thank you very much for your kind words, your code review, and your good advice. I am so glad to contribute to Scalding. Please check my latest commit.

@johnynek
Collaborator

Looks great! Thanks for making the changes so quickly. Looks very close to merging.

@johnynek
Collaborator

Closes #1302

 * Refactor
 * Add example in README
@JiJiTang
Contributor Author

Hi @johnynek, thanks a lot. I've just added another commit. Please let me know if there are any problems before it's good to go.

@johnynek
Collaborator

Looks good to me!

Will merge when green. Thanks a lot.

Next up: macros to create Filters from Scala code:

Filter[SomeClass] {
  _.y == "foo"
}
// returns:
// FilterApi.eq(binaryColumn("y"), Binary.fromString("foo"))
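
As a rough illustration of the intended mapping, the sketch below hand-writes the expansion instead of implementing a macro, and checks that it agrees with the Scala lambda on sample records. `BinaryColumn`, `Pred`, `Eq`, `eval`, and `columns` are hypothetical stand-ins, not parquet's `FilterApi` types, and the fields of `SomeClass` are assumed for the example.

```scala
object FilterSketch {
  // Hypothetical stand-ins for parquet's column and predicate types.
  final case class BinaryColumn(name: String)
  sealed trait Pred
  final case class Eq(column: BinaryColumn, value: String) extends Pred

  // Assumed record type; only the field `y` matters here.
  final case class SomeClass(x: Int, y: String)

  // The Scala-level predicate a user would write.
  val scalaPredicate: SomeClass => Boolean = _.y == "foo"

  // Hand-written version of what a macro might emit for that lambda:
  // the field name `y` becomes the column name and "foo" the value.
  val expanded: Pred = Eq(BinaryColumn("y"), "foo")

  // Evaluate a predicate against a record viewed as column -> value.
  def eval(p: Pred, row: Map[String, String]): Boolean = p match {
    case Eq(col, v) => row.get(col.name).contains(v)
  }

  // View a record as a map of column names to string values.
  def columns(r: SomeClass): Map[String, String] =
    Map("x" -> r.x.toString, "y" -> r.y)
}
```

The point of generating the predicate as data rather than keeping the lambda is that a data representation can be handed to the storage layer, whereas an opaque function can only be applied after records are read.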

johnynek added a commit that referenced this pull request May 30, 2015
johnynek merged commit 7af2fae into twitter:develop May 30, 2015
@JiJiTang
Contributor Author

JiJiTang commented Jun 1, 2015

Hi @johnynek, thank you very much for merging the PR. In parquet-mr/parquet-scala, there's a DSL for defining filters more fluently. And good idea: macro-generated filters would be even smoother.

@ianoc ianoc mentioned this pull request Aug 10, 2015