Reading from file streams. #8

Open
jonex opened this issue Nov 17, 2018 · 7 comments
Labels
enhancement New feature or request requires changes in Readstat waiting for changes in the C library Readstat

Comments

jonex commented Nov 17, 2018

It would be really nice if it was possible to provide a file stream instead of a file path.

One use case is to allow reading from a zip/bz2/etc compressed file, this saves disk space for large files.
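Until stream input exists, one possible workaround with a path-only reader is to decompress into a temporary file and pass that path along. The sketch below assumes a bz2-compressed input; `read_compressed` and its `reader`, `opener`, and `suffix` parameters are hypothetical names for illustration, not part of pyreadstat. It still needs disk space for the duration of the read, so it is only a stopgap for the feature requested here.

```python
import bz2
import os
import shutil
import tempfile


def read_compressed(path, reader, opener=bz2.open, suffix=""):
    """Decompress `path` into a temporary file and pass that file's path to `reader`.

    `reader` is any callable that accepts a file path (an assumption of this
    sketch). The temporary copy is deleted afterwards, so only the
    compressed file stays on disk permanently.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as tmp, opener(path, "rb") as src:
            # Copy in chunks so the whole file never has to fit in memory.
            shutil.copyfileobj(src, tmp)
        return reader(tmp_path)
    finally:
        # Always clean up the decompressed copy, even if the reader fails.
        os.remove(tmp_path)
```

With pyreadstat this could be called as `df, meta = read_compressed("survey.sav.bz2", pyreadstat.read_sav, suffix=".sav")`, where the file name is made up; passing `gzip.open` or `lzma.open` as `opener` should cover other compression formats.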

@ofajardo
Collaborator

hey! amazing code! and a very nice feature. Thanks a lot for contributing.

I would be very willing to take your changes to the pyreadstat cython code (anything under the subfolder pyreadstat).

I am however going to be more conservative with changes to the Readstat code (anything under the subfolder src). For me it is highly important to keep using the original Readstat code. The reason is that this code is used by folks in R and Julia, and I am sure at some point they will push changes that benefit us, and I want to be able to take them without any extra effort. If we start changing the Readstat code over here, we will slowly diverge from the original, and at some point it will be difficult to merge. I think this works in the other direction as well: nice features like yours should be made available in the original Readstat so that the other platforms can benefit from them too. If everybody contributes to the same pot, things will move faster than if everybody works on their own fork.

For that reason, I am going to suggest that you submit a PR to Readstat with your changes to their code. Once it's approved, I'll take your changes to pyreadstat immediately. I think this is the best approach.

Alternatively, one could think of ways to make your changes work without changing anything in Readstat. The result would however not be as elegant as the current solution, so I would encourage you to take the first path.

I hope it sounds reasonable to you!

Otto

@jonathon-love
Contributor

> I am however going to be more conservative with changes to the Readstat code

you should use a git submodule!

@ofajardo
Collaborator

thanks for the suggestion @jonathon-love, yes, that would make my intentions clearer. However, it would make life more difficult for the less git-aware user who wants to quickly clone and run setup.py or do a pip install from the git repo directly.
I'll think about it for the future.

@jonathon-love
Contributor

yeah, that is true (and a very good thing to be thinking about). i have to confess to getting myself all confused with git submodules semi-regularly.

@ofajardo
Collaborator

Ouch! Somebody here advised me not to get into submodules as they are confusing ... Now, hearing you, I guess he was right.

@ofajardo
Collaborator

Closing this one after more than one month of inactivity. Feel free to re-open it if necessary.

@ofajardo ofajardo added requires changes in Readstat waiting for changes in the C library Readstat enhancement New feature or request labels Jun 3, 2020
@ofajardo
Collaborator

ofajardo commented Jun 3, 2020

PR in Readstat: WizardMac/ReadStat#179

@ofajardo ofajardo reopened this Jun 3, 2020
3 participants