[data] to_pandas failed on datasets returned by from_spark #32967

Closed
kira-lin opened this issue Mar 2, 2023 · 0 comments · Fixed by #32968
Labels
bug: Something that is supposed to be working; but isn't
data: Ray Data-related issues
P2: Important issue, but not time-critical

Comments

@kira-lin
Contributor

kira-lin commented Mar 2, 2023

What happened + What you expected to happen

The following code

df = spark.range(100)
ds = ray.data.from_spark(df)
ds.to_pandas()

fails with a type-check error. The blocks saved by from_spark are typed as bytes, although their contents are actually in Arrow format.

Versions / Dependencies

ray 2.1.0

Reproduction script

# Assumes `import ray` and an existing SparkSession named `spark`.
df = spark.range(100)
ds = ray.data.from_spark(df)
ds.to_pandas()  # fails the block type check

Issue Severity

High: It blocks me from completing my task.

kira-lin added the bug and triage (Needs triage, eg: priority, bug/not-bug, and owning component) labels on Mar 2, 2023
amogkam added the P2 label and removed the triage label on Mar 3, 2023
richardliaw changed the title from "[Ray Data] to_pandas failed on datasets returned by from_spark" to "[data] to_pandas failed on datasets returned by from_spark" on Mar 21, 2023
richardliaw added the data label on Mar 21, 2023
3 participants