This repository has been archived by the owner on Dec 20, 2018. It is now read-only.

Support for logical datatypes like Decimal type #121

Open
wants to merge 6 commits into
base: master
from

Conversation

cpbhagtani

Avro doesn't support very large numbers directly. It supports them through logical types: you declare the value as a string and specify the field's actual data type with logicalType. The following is an example of a decimal type:

{"type": "record",
"name": "test",
"fields" : [
{"name": "a","type": "string",
  "logicalType": "decimal",
  "precision": 4,
  "scale": 2
},
{"name": "b", "type": "string"}
]}

This pull request adds support for reading such data types. For now I have added only the decimal type, but more logical types can be added. In a DataFrame, the field appears as type decimal(4,2).
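To make the mapping concrete, here is a minimal Python sketch of how a reader might interpret such a field. It is not the PR's Scala code: the helper names `spark_type_for` and `parse_decimal` are hypothetical, and the sketch assumes, as in the example above, that `logicalType`, `precision`, and `scale` sit on the field itself.

```python
import json
from decimal import Decimal

# The example schema from the description above.
SCHEMA = json.loads("""
{"type": "record", "name": "test", "fields": [
  {"name": "a", "type": "string",
   "logicalType": "decimal", "precision": 4, "scale": 2},
  {"name": "b", "type": "string"}
]}
""")


def spark_type_for(field):
    """Return a Spark-style type name for one field (hypothetical helper)."""
    if field.get("logicalType") == "decimal":
        return "decimal({},{})".format(field["precision"], field.get("scale", 0))
    return field["type"]


def parse_decimal(text, precision, scale):
    """Parse a string-encoded decimal and check it fits (precision, scale)."""
    value = Decimal(text)
    # Rescale to exactly `scale` fractional digits, e.g. scale=2 -> 0.01.
    quantized = value.quantize(Decimal((0, (1,), -scale)))
    if quantized != value:
        raise ValueError("{} has more than {} fractional digits".format(text, scale))
    if len(quantized.as_tuple().digits) > precision:
        raise ValueError("{} exceeds precision {}".format(text, precision))
    return quantized


for field in SCHEMA["fields"]:
    print(field["name"], "->", spark_type_for(field))
print(parse_decimal("12.34", 4, 2))
```

Values that do not fit, such as "123.45" for decimal(4,2), raise a ValueError in this sketch; a real reader might instead return null or fail the job, depending on its error-handling policy.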

@codecov-io

Current coverage is 94.02%

Merging #121 into master will increase coverage by +0.57% as of bb290a5

@@            master    #121   diff @@
======================================
  Files            6       6       
  Stmts          275     318    +43
  Branches        45      60    +15
  Methods          0       0       
======================================
+ Hit            257     299    +42
  Partial          0       0       
- Missed          18      19     +1

Review entire Coverage Diff as of bb290a5

Powered by Codecov. Updated on successful CI builds.

@ghost

ghost commented Oct 26, 2016

In addition to adding support for DateType, this fix is important to be able to read and write logical types in spark-bigquery.

To fully incorporate logical types, the logicalType attribute should be set for every logical type in convertTypeToAvro() when building the Avro schema.
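As a rough illustration of that suggestion, the following Python sketch attaches a logicalType attribute while building a schema. It is not the library's convertTypeToAvro(): `avro_type_for` and `build_record` are hypothetical helpers, and the base types follow the Avro specification (int for date, bytes for decimal) rather than whatever this PR emits.

```python
import json
import re

# Matches Spark-style decimal type names such as "decimal(4,2)".
_DECIMAL = re.compile(r"decimal\((\d+),\s*(\d+)\)")


def avro_type_for(spark_type):
    """Return an Avro type (str or dict) for a Spark-style type name."""
    match = _DECIMAL.fullmatch(spark_type)
    if match:
        precision, scale = int(match.group(1)), int(match.group(2))
        return {"type": "bytes", "logicalType": "decimal",
                "precision": precision, "scale": scale}
    if spark_type == "date":
        return {"type": "int", "logicalType": "date"}
    # Plain types such as "string" pass through unchanged.
    return spark_type


def build_record(name, columns):
    """Build a record schema from (column name, Spark-style type) pairs."""
    return {"type": "record", "name": name,
            "fields": [{"name": col, "type": avro_type_for(typ)}
                       for col, typ in columns]}


schema = build_record("test", [("a", "decimal(4,2)"), ("d", "date"), ("b", "string")])
print(json.dumps(schema, indent=2))
```

Keeping the logical-type attributes inside the type object, rather than on the field, is what lets a reader round-trip the schema without losing the decimal or date annotation.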

@progrexor

@cpbhagtani
Do you have any plans to add support for bytes/decimal types soon?

i.e. to support this schema:

{
  "type" : "record",
  "name" : "avro_decimal1",
  "namespace" : "default",
  "fields" : [ {
    "name" : "dec_col1",
    "type" : [ "null", {
      "type" : "bytes",             <-----  bytes type instead of string type
      "logicalType" : "decimal",    <-----
      "precision" : 38,
      "scale" : 35
    } ],
    "default" : null
  } ]
}
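For readers unfamiliar with the bytes encoding in this schema: the Avro specification stores a decimal as the two's-complement, big-endian representation of its unscaled integer value, with the scale taken from the schema. The Python sketch below shows that encoding under those rules; the function names are hypothetical and this is not code from the PR.

```python
from decimal import Decimal


def decode_bytes_decimal(raw, scale):
    """Decode Avro decimal bytes into an exact Decimal."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    sign = 1 if unscaled < 0 else 0
    digits = tuple(int(d) for d in str(abs(unscaled)))
    # Building from a tuple is exact and ignores the context precision,
    # which matters for precisions above Python's default of 28 digits.
    return Decimal((sign, digits, -scale))


def encode_bytes_decimal(value, scale):
    """Encode a finite Decimal whose exponent already equals -scale."""
    sign, digits, exponent = value.as_tuple()
    if exponent != -scale:
        raise ValueError("value must have exactly {} fractional digits".format(scale))
    unscaled = int("".join(str(d) for d in digits))
    if sign:
        unscaled = -unscaled
    # One extra bit for the sign; this may use one byte more than the
    # minimum for some negative values, which is still a valid encoding.
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


raw = encode_bytes_decimal(Decimal("12.34"), 2)
print(raw.hex(), decode_bytes_decimal(raw, 2))
```

The same decoding applies to the fixed type; only the byte length is constant there.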

@cpbhagtani
Author

@progrexor No, currently we send decimals as the string type with a logical type.

@karthikkadiyam

karthikkadiyam commented Mar 30, 2017

@cpbhagtani

Are there any plans to update this code to be compatible with the latest version of spark-avro?

That is, to resolve the branch conflicts so that it supports spark-avro 3.2.0.

@hsn10

hsn10 commented Apr 2, 2017

I am also interested in this issue. Have you settled on the approach in this pull request as the correct fix, so that the only remaining problem is that it does not currently merge cleanly?

@alind-billore

alind-billore commented Apr 5, 2017

Hi,

Sorry for asking the same question again.
Do you have any plans to add support for bytes/decimal types soon?

Thanks,
Alind

@cpbhagtani
Author

@karthikkadiyam I will make my PR compatible with 3.2.0 soon

@cpbhagtani
Author

@hsn10 My PR is compatible with branch 2.0. I will try to make it compatible with 3.0.

@cpbhagtani
Author

@alind-billore Decimal is already supported in my PR through the Avro logical type. The bytes type is not supported.

@alind-billore

@cpbhagtani Thanks a lot for your reply ! :)

@karthikkadiyam

@cpbhagtani Any update on when we can expect this PR to be compatible with 3.2? I tried it myself but ran into some runtime exceptions and couldn't fix them.

@gengliangwang
Contributor

From the Avro spec:

A decimal logical type annotates Avro bytes or fixed types

@cpbhagtani Can you update this PR? I will follow it.
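The rule quoted from the spec can be checked mechanically. The Python sketch below is one way to do so; `is_valid_decimal` is a hypothetical helper, and the precision and scale bounds (precision greater than zero, scale between zero and precision, and a size-dependent precision limit for fixed) are taken from the Avro specification's description of the decimal logical type.

```python
def is_valid_decimal(type_schema):
    """Return True if a type schema is a well-formed Avro decimal."""
    if type_schema.get("logicalType") != "decimal":
        return False
    base = type_schema.get("type")
    if base not in ("bytes", "fixed"):
        # A decimal annotating "string" is not allowed by the spec.
        return False
    precision = type_schema.get("precision")
    scale = type_schema.get("scale", 0)  # scale is optional and defaults to 0
    if not isinstance(precision, int) or precision <= 0:
        return False
    if not isinstance(scale, int) or not 0 <= scale <= precision:
        return False
    if base == "fixed":
        size = type_schema.get("size")
        if not isinstance(size, int) or size <= 0:
            return False
        # floor(log10(2^(8*size - 1) - 1)) digits fit in `size` bytes.
        max_precision = len(str(2 ** (8 * size - 1) - 1)) - 1
        return precision <= max_precision
    return True


print(is_valid_decimal({"type": "bytes", "logicalType": "decimal",
                        "precision": 38, "scale": 35}))
print(is_valid_decimal({"type": "string", "logicalType": "decimal",
                        "precision": 4, "scale": 2}))
```

Under this check the bytes-based schema requested earlier in the thread passes, while the string-based form used by this PR does not, which is the mismatch the quote points at.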

@lhoss

lhoss commented Dec 7, 2018

This should also link to the following related, more recent PR (even though it is closed): #291
(That PR was closed because Spark 2.4.0 provides Avro logicalTypes support, but it can still be useful for people who are bound to an older Spark version.)


8 participants