create a community supported set of typical converters for read_csv #1180

timmie · 2012-05-02T17:30:01Z

Unfortunately, almost no input data file uses ISO format; most use arbitrary column layouts and date formats.

Following
https://groups.google.com/forum/?fromgroups#!topic/pydata/pZjQMX_avmY

and

I would suggest adding:

https://github.com/pydata/pandas/tree/master/pandas/tseries/converters.py

There, users could contribute and share typical converters for parsing the dates and times in their input files into pandas objects.

timmie · 2012-05-07T20:19:22Z

OK, I assume that this is not core Pandas stuff.

So we could have a separate pandas.contrib package.

There users could upload their converters along with a sample data file.

The converters may then be documented in proper docstrings.

I did this for scikits.timeseries.tsfromtxt, and it worked quite well.
The only problem is how to assign meaningful names to the converter functions.

What do you think?

changhiskhan · 2012-05-07T20:31:24Z

I think this is a great idea. We hope to make an announcement about this once v0.8 is released and the API is stable.

In the meantime, would you be interested in taking the lead and creating pandas/io/converters.py with some docs and a few sample converters? Further feedback on the converter API/interface would be greatly appreciated.

timmie · 2012-05-07T21:25:54Z

Yes, sure.

But I'd rather wait until the multi-column datetime functionality is there:
#1186
#1174

timmie · 2012-05-08T19:16:22Z

Here is an example (still for tsfromtxt):

def dc_h_0to23_cols(year, month, day, hour):
    """Column-separated datetime with hours counted 0-23.

    .. csv-table:: Hourly Values: 0-23
           :header: "YYYY", "MM", "DD", "HH:MM", "value"
           :delim: ;

           2004;2;1;00:00;0
           2004;2;1;01:00;0
           [...];[...];[...];[...];[...]
           2004;2;1;22:00;0
           2004;2;1;23:00;0

    Notes
    -----
    Assumed datecols::

        datecols = (0, 1, 2, 3)

    """
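
The docstring above describes the input layout but the function has no body. A minimal sketch of what such a converter might do follows, using only the standard library so that it does not depend on any particular tsfromtxt or read_csv converter interface. The names `parse_h_0to23_cols` and `read_sample` are hypothetical, and the sample data is the docstring's table with the elided `[...]` row left out.

```python
import csv
import io
from datetime import datetime

# Sample rows copied from the docstring above (the "[...]" row is omitted).
SAMPLE = """\
YYYY;MM;DD;HH:MM;value
2004;2;1;00:00;0
2004;2;1;01:00;0
2004;2;1;22:00;0
2004;2;1;23:00;0
"""


def parse_h_0to23_cols(year, month, day, hour):
    """Combine separate year, month, day and 'HH:MM' strings into a datetime.

    Hypothetical helper: assumes hours are counted 0-23, as in the sample.
    """
    hh, mm = hour.split(":")
    return datetime(int(year), int(month), int(day), int(hh), int(mm))


def read_sample(text, datecols=(0, 1, 2, 3)):
    """Return (timestamp, value) pairs from semicolon-delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    next(reader)  # skip the header row
    rows = []
    for fields in reader:
        # Pick the date columns in the order the converter expects.
        stamp = parse_h_0to23_cols(*(fields[i] for i in datecols))
        rows.append((stamp, float(fields[-1])))
    return rows


rows = read_sample(SAMPLE)
```

Keeping the converter a plain function of strings means it can be tested against its sample file on its own, independently of whichever reader ends up calling it.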

ghost · 2014-01-01T03:40:06Z

It seems to me like this will either stagnate or grow into a melange of tailored
solutions to the 1001 weird data problems found in the wild that most people won't see.

I don't think users will look for these recipes when they encounter these problems
in their own data. They'll either hack a collection of helpers to suit the data they
work with or just solve the problem with a once-off. There's no general pattern here
to grow into a coherent collection of solutions.

The idea of a pandas.contrib is interesting in itself, but there is no clear conception of that project
yet. We'll wait for that consensus to materialize.

Closing.

timmie · 2014-01-01T22:22:41Z

@y-p
I understand that you want to close this very stalled PR.
But I don't think the proposal has been understood:
What if we add an example file for each converter template?
I think it could be a useful resource...

timmie mentioned this issue Jun 1, 2012

accompany each release with a document: entry points for help / input #1370

Closed

ghost closed this as completed Jan 1, 2014

ghost mentioned this issue Jan 1, 2014

now sectionwise: date_converter: delta / time #4632

Closed

dacoex mentioned this issue Mar 12, 2015

add a io module pvlib/pvlib-python#29

Closed

This issue was closed.
