It seems that number parsing works when a value uses only the decimal sign or only the thousands separator, but not when a value combines both.
Reopen the issue?
Example
from io import StringIO

import pandas as pd

# Semicolon-separated data: "." is the thousands separator, "," the decimal sign
data = """A;B;C
0;0,11;0,11
1.000;1000,11;1.000,11
20.000;20000,22;20.000,22
300.000;300000,33;300.000,33
4.000.000;4000000,44;4.000.000,44
5.000.000.000;5000000000,55;5.000.000.000,55"""
df = pd.read_csv(StringIO(data), sep=';', thousands='.', decimal=',')
print(df.dtypes)
print(df)
Results in
A      int64
B    float64
C     object
            A             B                 C
0           0  1.100000e-01              0,11
1        1000  1.000110e+03          1.000,11
2       20000  2.000022e+04         20.000,22
3      300000  3.000003e+05        300.000,33
4     4000000  4.000000e+06      4.000.000,44
5  5000000000  5.000000e+09  5.000.000.000,55
I wrote a converter that removes the thousand separator and tried to use that in combination with the dtype argument of read_csv without success. What does work though, is removing and casting at the same time:
# in your case you want to replace the dot as that is your thousand separator
rem_thousand_sep_and_cast_to_float = lambda x: float(x.replace(",", ""))
You can then use that function to convert the desired columns with the converters argument of read_csv. Let me know if that works for you.
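For the data in the original report, the converter approach might look roughly like the sketch below. The helper name `to_float` is made up for illustration, and the sketch assumes that "." is always the thousands separator and "," always the decimal sign in column C. Only column C goes through the converter, since A and B already parsed correctly in the example above.

```python
from io import StringIO

import pandas as pd

data = """A;B;C
0;0,11;0,11
1.000;1000,11;1.000,11
4.000.000;4000000,44;4.000.000,44"""


def to_float(text):
    # Hypothetical helper (not part of pandas): remove the "." thousands
    # separators, then turn the "," decimal sign into "." before casting.
    return float(text.replace(".", "").replace(",", "."))


# Column C is handled by the converter; A and B still rely on the
# thousands/decimal arguments, as in the original call.
df = pd.read_csv(
    StringIO(data),
    sep=";",
    thousands=".",
    decimal=",",
    converters={"C": to_float},
)
print(df.dtypes)
print(df)
```

The converter receives each cell as a string, so the separators have to be stripped before the cast, which is why removing and casting happen in the same function.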
xref #584