read_csv with iterator=True does not seem to work as expected without chunksize #3967
Comments
Well, I'm writing a test and waiting for some comments about this. Edit: Aarghhh. Force Damien G. |
Wouldn't making empty DataFrames falsey also solve this? Edit: No, it wouldn't! |
I was thinking you could do:
but that is probably semantically different. |
I believe this was fixed in #3406 |
@jreback This still causes an infinite loop (isn't this the issue?):
|
Update.
>>> reader = pd.read_csv('SampleData.csv', iterator=True, engine='python')
>>> reader.chunksize is None
True
>>> for row in reader:
...     print(row)
A B C
foo 1 2 3
bar 4 5 6
baz 7 8 9
No infinite loop here but returns the full
>>> reader = pd.read_csv('SampleData.csv', chunksize=1, engine='python')
>>> for row in reader:
...     print(row)
A B C
foo 1 2 3
bar 4 5 6
A B C
baz 7 8 9
loop in
For the C engine:
>>> reader = pd.read_csv('SampleData.csv', iterator=True, engine='c')
>>> reader.chunksize is None
True
>>> for row in reader:
...     print(row)
Empty DataFrame
Columns: [Region, Rep, Item, Units, Unit Cost, Total]
Index: []
Empty DataFrame
Columns: [Region, Rep, Item, Units, Unit Cost, Total]
Index: []
... infinitely.
I get a segmentation fault when I try to pass the |
Leave it all in this issue; it's all related. |
OK. The SegFault is my bad (inconsistency between my source and install dirs and Take a look at garaud@b0d8903 Works well with Damien G. |
if you specify |
OK, @jreback, you were faster than me! It sounds good to me. Set the Anyway, the chunksize bug with the Python engine was fixed. Good job and thanks! Damien G. |
Probably. I think this was set up to allow differing chunk sizes, via pretty easy, I think, to make a default chunksize if it's not set (and iterator is True), then still allow get_chunk to work (which will just override). If you think that makes sense, please make an issue (and a PR!) |
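For what it's worth, here is a minimal sketch of that idea, just to make the proposed behaviour concrete. It is not pandas code: `ToyReader`, its `DEFAULT_CHUNKSIZE` of 1, and the list-of-strings "file" are all hypothetical stand-ins. The only things taken from the thread are the names `iterator`, `chunksize` and `get_chunk`, and the rule that an explicit size passed to `get_chunk` overrides the default.

```python
from typing import Iterator, List, Optional, Sequence


class ToyReader:
    """Hypothetical stand-in for a chunked reader (not the pandas class)."""

    DEFAULT_CHUNKSIZE = 1  # assumed default, only for illustration

    def __init__(self, rows: Sequence[str], iterator: bool = False,
                 chunksize: Optional[int] = None) -> None:
        self._rows = list(rows)
        self._pos = 0
        # Proposal: fall back to a default when iterator=True and
        # no chunksize was given, so plain iteration always terminates.
        if iterator and chunksize is None:
            chunksize = self.DEFAULT_CHUNKSIZE
        self.chunksize = chunksize

    def get_chunk(self, size: Optional[int] = None) -> List[str]:
        """Return the next rows; an explicit size overrides self.chunksize."""
        if size is None:
            size = self.chunksize
        if self._pos >= len(self._rows):
            raise StopIteration
        # With no size at all, read everything that is left.
        end = len(self._rows) if size is None else self._pos + size
        chunk = self._rows[self._pos:end]
        self._pos += len(chunk)
        return chunk

    def __iter__(self) -> Iterator[List[str]]:
        # Stop cleanly once get_chunk reports that the data is exhausted.
        while True:
            try:
                chunk = self.get_chunk()
            except StopIteration:
                return
            yield chunk


reader = ToyReader(["foo", "bar", "baz"], iterator=True)
first = reader.get_chunk(2)   # explicit size overrides the default
rest = list(reader)           # remaining rows, one default-sized chunk each
```

In this toy version, iterating never yields empty chunks forever, because `get_chunk` signals exhaustion and `__iter__` stops on it. Whether 1 is the right default for the real reader is a separate question.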
Hi there,
I tested:
I thought that I could do:
since reader is iterable. Unfortunately, it calls the generator TextFileReader.__iter__:
where self.chunksize is None. Maybe set self.chunksize to 1 when it's not defined and iterator=True is passed. I'll propose a patch as soon as possible, today or tomorrow.
Best regards,
Damien G.