Google cloud storage csv file write with pandas empty lines #176

Closed
P00L opened this issue Sep 13, 2019 · 11 comments

Comments

@P00L

P00L commented Sep 13, 2019

Hi,
There seems to be a difference between writing a CSV file locally and writing it to Google Cloud Storage:

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [1, 2]})
df.to_csv('filename.csv')
df.to_csv('gs://BUCKET_NAME/filename.csv')

filename.csv

,a,b
0,1,1
1,2,2

gs://BUCKET_NAME/filename.csv

,a,b

0,1,1

1,2,2

The file written to gs://BUCKET_NAME/filename.csv seems to get an extra empty line after each row.

python version 3.6.4
gcsfs version 0.3.0
pandas version 0.25.1
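
To check whether blank lines like the ones above are really doubled carriage returns, it can help to look at the raw bytes rather than at how a viewer renders them. The following is a rough diagnostic sketch; the helper name `count_line_endings` and the sample bytes are hypothetical and not part of the original report, and lone `\r` endings are not counted.

```python
def count_line_endings(data: bytes) -> dict:
    """Count the line-ending styles found in raw file bytes."""
    # "\r\r\n" is the pattern that many viewers show as an extra blank line.
    doubled = data.count(b"\r\r\n")
    # Every "\r\r\n" also contains one "\r\n", so subtract those.
    crlf = data.count(b"\r\n") - doubled
    # Every "\r\n" (doubled or not) contains one "\n", so subtract those too.
    lf = data.count(b"\n") - crlf - doubled
    return {"cr_cr_lf": doubled, "cr_lf": crlf, "lf": lf}


# Hypothetical sample bytes, not taken from the actual bucket.
suspect = b",a,b\r\r\n0,1,1\r\r\n1,2,2\r\r\n"
expected = b",a,b\r\n0,1,1\r\n1,2,2\r\n"

suspect_counts = count_line_endings(suspect)
expected_counts = count_line_endings(expected)
```

For a real file, the bytes could come from `open("filename.csv", "rb").read()` for the local copy, or from downloading the object and reading it in binary mode.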

@TomAugspurger
Contributor

What platform are you on?

@P00L
Author

P00L commented Sep 13, 2019

What platform are you on?

I'm using a local Windows environment.

@TomAugspurger
Contributor

TomAugspurger commented Sep 13, 2019 via email

@martindurant
Member

Wasn't there an issue with text mode and line endings? I thought we had dealt with that, so perhaps getting latest fsspec will fix.
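
A minimal sketch of how text-mode newline translation could produce doubled endings, assuming (this is not confirmed in the thread) that rows already ending in `\r\n` pass through a text layer that translates `\n` to `\r\n` a second time. The helper name `write_text` is hypothetical, and `io.TextIOWrapper` with `newline="\r\n"` only stands in for Windows text mode so the effect can be reproduced on any platform.

```python
import io


def write_text(lines, newline):
    """Write lines through a text wrapper and return the raw bytes produced."""
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    for line in lines:
        wrapper.write(line)
    wrapper.flush()
    wrapper.detach()  # keep `raw` open after the wrapper is discarded
    return raw.getvalue()


rows = [",a,b", "0,1,1", "1,2,2"]
crlf_rows = [row + "\r\n" for row in rows]

# newline="\r\n" translates every "\n" to "\r\n", so "\r\n" becomes "\r\r\n".
doubled = write_text(crlf_rows, newline="\r\n")

# newline="" disables translation, so the rows are written unchanged.
clean = write_text(crlf_rows, newline="")

# Splitting the doubled output shows an empty line after each row.
doubled_lines = doubled.decode("utf-8").splitlines()
```

If this is the mechanism at work, it would explain why the same code behaves differently on Windows than on a Linux host such as App Engine, where text mode does not rewrite `\n`.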

@TomAugspurger
Contributor

TomAugspurger commented Sep 13, 2019 via email

@P00L
Author

P00L commented Sep 16, 2019

Hmm yeah that's what I guessed. Possibly something with universal line endings. Do you have time to check whether the issue is in pandas or gcsfs?


I'll try to run the test this week. I also tested creating the file on Google App Engine, and it was written correctly without empty lines.

@P00L
Author

P00L commented Sep 17, 2019

Hmm yeah that's what I guessed. Possibly something with universal line endings. Do you have time to check whether the issue is in pandas or gcsfs?


I tried writing a file line by line using only gcsfs, and it worked correctly without empty lines.

@TomAugspurger
Contributor

I think pandas-dev/pandas#21406 is the relevant pandas issue.

What happens with

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [1, 2]})
df.to_csv('filename.csv')
df.to_csv('gs://BUCKET_NAME/filename.csv', newline="\n")

@P00L
Author

P00L commented Oct 2, 2019

I think pandas-dev/pandas#21406 is the relevant pandas issue.

What happens with

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [1, 2]})
df.to_csv('filename.csv')
df.to_csv('gs://BUCKET_NAME/filename.csv', newline="\n")

Sorry for the delay.
I tried the following code with line_terminator instead of newline, and both files were written correctly without empty lines.

import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [1, 2]})
df.to_csv('filename.csv')
df.to_csv('gs://BUCKET_NAME/filename.csv', line_terminator="\n")
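
For newer pandas releases, note that `line_terminator` was, as far as I know, renamed to `lineterminator` in pandas 1.5 and the old spelling was removed in 2.0. The sketch below picks whichever keyword the installed version accepts; the helper name `to_csv_lf` is hypothetical, and the example writes to a string rather than to a bucket so it can run without credentials.

```python
import inspect

import pandas as pd


def to_csv_lf(df, path_or_buf=None, **kwargs):
    """Call DataFrame.to_csv with "\\n" line endings on old and new pandas."""
    params = inspect.signature(pd.DataFrame.to_csv).parameters
    # Newer pandas spells the keyword `lineterminator`; pandas 0.25.1 uses `line_terminator`.
    key = "lineterminator" if "lineterminator" in params else "line_terminator"
    kwargs[key] = "\n"
    return df.to_csv(path_or_buf, **kwargs)


df = pd.DataFrame({"a": [1, 2], "b": [1, 2]})

# With no path, to_csv returns the CSV text, which makes the endings easy to inspect.
csv_text = to_csv_lf(df)
```

Passing a path such as `'gs://BUCKET_NAME/filename.csv'` as the second argument should behave like the call above, provided gcsfs is installed and authenticated.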

@martindurant
Member

OK to close, then?

@P00L
Author

P00L commented Oct 2, 2019

Yes, thanks

@P00L P00L closed this as completed Oct 2, 2019
@P00L P00L changed the title from "Google clous storage csv file write with pandas empty lines" to "Google cloud storage csv file write with pandas empty lines" Oct 3, 2019