ENH: avro to/from serialization #11752

jreback · 2015-12-03T15:01:26Z

discussed in #3525

shiny new fast version of avro might be interesting: vericast/cyavro#1
by @mariusvniekerk

jreback · 2015-12-03T15:01:35Z

cc @wesm

wesm · 2015-12-04T02:33:10Z

any performance numbers on cyavro vs pyavroc vs fastavro?

VelizarVESSELINOV · 2016-03-25T04:08:07Z

👍

manugarri · 2017-08-29T09:23:11Z

👍

Khrol · 2017-10-04T12:41:40Z

👍

manugarri · 2017-10-04T15:07:44Z

Any updates on this? I am willing to put in time to implement this, but I would need some pointers, especially regarding:

Would we cast pandas dtypes to Avro types automatically, or would we enforce a user-defined schema? I believe the pandas approach would be the former, but it would be good (and easier) to allow a specific schema. Casting the types would require some assumptions (Avro supports a set of primitives that might not match pandas').
Would read_avro support headerless data (i.e. streaming data without the header)? It would make sense to allow this as long as the user provides the schema.
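
To make the first question concrete, here is a minimal sketch of what automatic dtype casting with an optional user-supplied schema could look like. Nothing here is an existing pandas API: `DTYPE_TO_AVRO`, `infer_avro_schema` and `resolve_schema` are hypothetical names, and the mapping table is only one possible set of assumptions (for example, that `object` columns hold strings, and that `datetime64[ns]` can be truncated to Avro's `timestamp-micros` logical type). It works on dtype names only, so it runs without pandas installed.

```python
import json

# Assumed mapping from pandas/NumPy dtype names to Avro primitive types.
# Illustrative only; not taken from pandas or from any Avro library.
DTYPE_TO_AVRO = {
    "bool": "boolean",
    "int8": "int",
    "int16": "int",
    "int32": "int",
    "int64": "long",
    "float32": "float",
    "float64": "double",
    "object": "string",  # assumption: object columns contain only str values
}


def infer_avro_schema(dtypes, name="Row", nullable=True):
    """Build an Avro record schema (as a dict) from {column name: dtype name}."""
    fields = []
    for column, dtype in dtypes.items():
        if dtype.startswith("datetime64"):
            # Assumption: nanosecond timestamps are stored at microsecond precision.
            avro_type = {"type": "long", "logicalType": "timestamp-micros"}
        elif dtype in DTYPE_TO_AVRO:
            avro_type = DTYPE_TO_AVRO[dtype]
        else:
            # No safe guess (e.g. uint64, category): the user must supply a schema.
            raise TypeError(f"no Avro mapping assumed for dtype {dtype!r} (column {column!r})")
        if nullable:
            # A union with "null" lets missing values be written.
            avro_type = ["null", avro_type]
        fields.append({"name": column, "type": avro_type})
    return {"type": "record", "name": name, "fields": fields}


def resolve_schema(dtypes, user_schema=None):
    """Prefer a user-defined schema; fall back to inference from dtypes."""
    return user_schema if user_schema is not None else infer_avro_schema(dtypes)


dtypes = {"id": "int64", "score": "float64", "label": "object", "ts": "datetime64[ns]"}
schema = resolve_schema(dtypes)
print(json.dumps(schema, indent=2))
```

With a real DataFrame, the `dtypes` dict could be obtained with `df.dtypes.astype(str).to_dict()`. The `TypeError` branch is where the "assumptions" mentioned above become visible: any dtype without an obvious Avro counterpart forces the user back to an explicit schema.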

wesm · 2017-10-04T15:14:53Z

One possible route for this is https://issues.apache.org/jira/browse/ARROW-1209 -- @mariusvniekerk has started working on this, and it would be easier to accommodate Avro's types in Arrow and deal with the marshaling to/from pandas DataFrame at a central location in Arrow-land.

jbrockmendel · 2019-09-12T01:10:23Z

Is this a live topic or is it subsumed by more recent arrow ecosystem developments?

wesm · 2019-09-12T03:45:41Z

Arrow-based Avro read/write isn't shipping yet; @emkornfield is interested in this as it relates to various Google services (among other uses). Hard to predict the timeline for that, though.

emkornfield · 2019-09-12T05:10:01Z

Yes, I'm hoping to make some progress on this towards the end of this month/early next month.

jbrockmendel · 2019-12-22T16:56:05Z

Closing and adding to tracker issue #30407 for IO format requests; can re-open if interest is expressed.

jreback added the IO Data and Compat labels Dec 3, 2015

jreback added this to the Someday milestone Dec 3, 2015

jreback mentioned this issue Dec 3, 2015

Add a backend for apache Avro blaze/odo#370

Open

dargueta mentioned this issue Mar 12, 2019

Fix incorrect type inference, Python compatibility issues, support compression ynqa/pandavro#12

Merged

datapythonista mentioned this issue Sep 12, 2019

DEPR: Move rarely used I/O connectors to third party modules #28409

Closed

jbrockmendel mentioned this issue Dec 22, 2019

ENH: Requested IO readers/writers #30407

Open

13 tasks

jbrockmendel closed this as completed Dec 22, 2019

jreback mentioned this issue Jan 25, 2022

ENH: In-built functions/API or utility to handle avro files #45614

Closed
