vaex expression to pandas.Series support #456

Merged
merged 2 commits into from
Nov 21, 2019

JovanVeljanoski
Member

No description provided.

def to_pandas_series(self):
    """Return a pandas.Series representation of the expression.

    Note: a memory copy is created.
Member

I don't think that is true. If the column is non-virtual, we just pass the ndarray, and I doubt pandas will copy that.

Member Author

Well, if a vaex DataFrame has, say, 2 billion rows and is reading them from an HDF5 file, I think that the moment you call to_pandas_series it will materialize them in memory, right?

Member

No. Columns in vaex are (usually) memory mapped, so each one is a regular numpy array pointing to a piece of memory-mapped data. There is no 'materializing': a memory-mapped array and an in-memory array behave the same until you start accessing the data. Passing the numpy array around does not cause any copying, so no copy is made as long as pandas does not copy it (I think it does not when you pass it to a Series, but you can check).
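
One way to run the check suggested above is to ask numpy whether the Series and the original array share a buffer. This is only a minimal sketch, not code from this PR: the helper name `series_shares_buffer` is made up for illustration, and a plain in-memory array stands in for a vaex column. `copy` is passed explicitly because the default of the `pandas.Series` constructor for ndarray input can differ between pandas versions.

```python
import numpy as np
import pandas as pd


def series_shares_buffer(values: np.ndarray, copy: bool) -> bool:
    """Return True if a Series built from `values` reuses the same memory.

    Hypothetical helper for illustration; it is not part of vaex or pandas.
    """
    series = pd.Series(values, copy=copy)
    # For a plain numeric Series, to_numpy() hands back the underlying data
    # without copying, so it can be compared against the input buffer.
    return bool(np.shares_memory(values, series.to_numpy()))


# Stand-in for a non-virtual column (assumption: a regular float64 ndarray).
data = np.arange(1_000_000, dtype=np.float64)

print("copy=False shares memory:", series_shares_buffer(data, copy=False))
print("copy=True shares memory:", series_shares_buffer(data, copy=True))
```

If the first check reports shared memory for the way to_pandas_series builds its Series, the "a memory copy is created" note in the docstring would not hold for non-virtual columns.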

@maartenbreddels maartenbreddels merged commit c0f0f13 into master Nov 21, 2019
@JovanVeljanoski JovanVeljanoski deleted the to_series branch January 10, 2020 01:17
2 participants