Support parsing thousands separators in floating point data #2594

wesm · 2012-12-24T17:30:22Z

It seems that read_csv handles the decimal sign and the thousands separator correctly when each is used on its own, but not when both appear in the same value.
Reopen the issue?

Example

from io import StringIO

import pandas as pd

data = """A;B;C
0;0,11;0,11
1.000;1000,11;1.000,11
20.000;20000,22;20.000,22
300.000;300000,33;300.000,33
4.000.000;4000000,44;4.000.000,44
5.000.000.000;5000000000,55;5.000.000.000,55"""

# Column C combines the thousands separator '.' with the decimal sign ','
df = pd.read_csv(StringIO(data), sep=';', thousands='.', decimal=',')
print(df.dtypes)
print(df)

Results in

A      int64
B    float64
C     object
            A             B                 C
0           0  1.100000e-01              0,11
1        1000  1.000110e+03          1.000,11
2       20000  2.000022e+04         20.000,22
3      300000  3.000003e+05        300.000,33
4     4000000  4.000000e+06      4.000.000,44
5  5000000000  5.000000e+09  5.000.000.000,55


matthias-ollig · 2013-07-31T12:38:00Z

I wrote a converter that removes the thousand separator and tried to use that in combination with the dtype argument of read_csv without success. What does work though, is removing and casting at the same time:

# In your case you would replace the dot instead, as that is your thousands separator.
rem_thousand_sep_and_cast_to_float = lambda x: float(x.replace(",", ""))

You can then use that function to convert the desired columns with the converters argument of read_csv. Let me know if that works for you.

Used in an example:

df = pd.read_csv("my.csv", sep=",", thousands=",",
                 converters={"a": rem_thousand_sep_and_cast_to_float,
                             "b": rem_thousand_sep_and_cast_to_float})
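
For the data in the original report, where the dot is the thousands separator and the comma is the decimal sign, the same converter idea might look like the sketch below. The name parse_european_float is hypothetical (it is not part of pandas), and the function assumes every value is a well-formed number in that format.

```python
def parse_european_float(text):
    """Convert a string such as '1.000,11' to the float 1000.11.

    Hypothetical helper: assumes '.' is the thousands separator
    and ',' is the decimal sign.
    """
    # Drop the thousands separators first, then turn the decimal comma into a dot.
    cleaned = text.strip().replace(".", "").replace(",", ".")
    return float(cleaned)


print(parse_european_float("0,11"))
print(parse_european_float("4.000.000,44"))
```

A function like this could be passed per column through the converters argument of read_csv, for example converters={"C": parse_european_float}.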

jreback · 2013-07-31T14:52:36Z

does this not work? http://pandas.pydata.org/pandas-docs/dev/io.html#thousand-separators

hayd · 2013-08-26T19:00:12Z

fixed by #4598
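
A quick way to check the behavior on a shortened version of the data from the report, assuming a pandas release that includes the fix; this is only a sketch of how one might verify it, not a test taken from the pull request:

```python
from io import StringIO

import pandas as pd

data = """A;B;C
0;0,11;0,11
1.000;1000,11;1.000,11
20.000;20000,22;20.000,22"""

# With the fix in place, column C should be parsed as float64
# rather than being left as strings (object dtype).
df = pd.read_csv(StringIO(data), sep=";", thousands=".", decimal=",")
print(df.dtypes)
print(df)
```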

hayd mentioned this issue Jul 31, 2013

read csv thousands separator #4322

Closed

guyrt mentioned this issue Aug 18, 2013

csv_import: Thousands separator works in floating point numbers #4598

Merged

hayd closed this as completed Aug 26, 2013
