Crash with parallel cumulative_eval #18884

Closed
2 tasks done
jtanx opened this issue Sep 24, 2024 · 2 comments · Fixed by #18885
Labels: accepted (Ready for implementation), bug (Something isn't working), P-medium (Priority: medium), python (Related to Python Polars)

Comments

jtanx commented Sep 24, 2024

Checks

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest version of Polars.

Reproducible example

import polars as pl

df = pl.LazyFrame({"A": range(1, 10000)}, strict=False)
df.select(pl.col("A").cumulative_eval(pl.element().tail(1), parallel=True)).collect()

Log output

Segmentation fault (core dumped)

Issue description

This is a somewhat contrived example, but it crashes with what is possibly a stack overflow when parallel=True. It runs fine with parallel=False, or when the input dataframe is smaller.

Expected behavior

Does not crash and computes a result.
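
For reference, the result one would expect here can be modelled in plain Python. This is only a sketch of the semantics as described in the Polars documentation for cumulative_eval (evaluate the expression over a window that grows by one element per step), not Polars's implementation. The names `cumulative_eval_model` and `tail_one` are hypothetical and made up for illustration.

```python
from typing import Callable, List, Sequence


def cumulative_eval_model(
    values: Sequence[int], expr: Callable[[Sequence[int]], int]
) -> List[int]:
    """Apply `expr` to each growing prefix: values[:1], values[:2], ..."""
    return [expr(values[:end]) for end in range(1, len(values) + 1)]


def tail_one(window: Sequence[int]) -> int:
    # Stand-in for pl.element().tail(1): the last element of the window.
    return window[-1]


values = list(range(1, 10000))  # same data as the reproducible example
result = cumulative_eval_model(values, tail_one)
```

Under this model the last element of each prefix is simply the current row, so the expected output is the input column unchanged.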

Installed versions

--------Version info---------
Polars:              1.8.1
Index type:          UInt32
Platform:            Linux-6.8.0-45-generic-x86_64-with-glibc2.39
Python:              3.12.3 (main, Sep 11 2024, 14:17:37) [GCC 13.2.0]

----Optional dependencies----
adbc_driver_manager  <not installed>
altair               <not installed>
cloudpickle          <not installed>
connectorx           <not installed>
deltalake            <not installed>
fastexcel            <not installed>
fsspec               <not installed>
gevent               <not installed>
great_tables         <not installed>
matplotlib           3.9.2
nest_asyncio         1.6.0
numpy                2.1.1
openpyxl             <not installed>
pandas               <not installed>
pyarrow              <not installed>
pydantic             <not installed>
pyiceberg            <not installed>
sqlalchemy           <not installed>
torch                <not installed>
xlsx2csv             <not installed>
xlsxwriter           <not installed>

jtanx added the bug, needs triage, and python labels on Sep 24, 2024

coastalwhite (Collaborator) commented Sep 24, 2024

This does not seem to be a regression of a recent version. I can reproduce on 1.3.0 as well.

coastalwhite added the accepted and P-high labels and removed the needs triage label on Sep 24, 2024
github-project-automation bot moved this to Ready in Backlog on Sep 24, 2024

ritchie46 (Member) commented Sep 24, 2024

This is a stack overflow in rayon.
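
For readers unfamiliar with this failure class: a stack overflow occurs when nested calls need more stack than the thread has available. The Python sketch below is only an analogy. Python raises a catchable RecursionError at a configured depth instead of segfaulting, and the sketch says nothing about where in rayon the nesting actually happens. All function names are hypothetical.

```python
import sys


def count_down_recursive(n: int) -> int:
    # Each pending call keeps a frame alive until the innermost call returns,
    # so the stack depth grows linearly with n.
    if n == 0:
        return 0
    return 1 + count_down_recursive(n - 1)


def count_down_iterative(n: int) -> int:
    # Same result, but the stack depth stays constant regardless of n.
    steps = 0
    while n > 0:
        n -= 1
        steps += 1
    return steps


def exceeds_stack(n: int) -> bool:
    # Report whether the recursive version runs out of stack for this n.
    try:
        count_down_recursive(n)
    except RecursionError:
        return True
    return False


too_deep = sys.getrecursionlimit() + 100
```

Calling `exceeds_stack(too_deep)` reports the failure while `count_down_iterative(too_deep)` completes, which mirrors the observation in the report that the crash depends on input size.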

ritchie46 added the P-medium label and removed the P-high label on Sep 24, 2024
ritchie46 self-assigned this on Sep 24, 2024
github-project-automation bot moved this from Ready to Done in Backlog on Sep 24, 2024
3 participants