modin_df.reset_index fails with ValueError: cannot insert colname, already exists, but pandas works #4208

Closed
prutskov opened this issue Feb 16, 2022 · 0 comments · Fixed by #4209
Assignees
Labels
bug 🦗 Something isn't working

Comments

@prutskov
Contributor

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 20.04
  • Modin version (modin.__version__): b702759
  • Python version: 3.8.11
  • Code we can use to reproduce:
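# Swap the two imports to compare against plain pandas.
# import pandas as pd
import modin.pandas as pd

# The index and the only column initially share the name "col0".
df = pd.DataFrame({"col0": [0, 1, 2, 3]}, index=pd.Index([11, 22, 33, 44], name="col0"))

# Rename the column so it no longer collides with the index name.
df.columns = ["col0_1"]
# pandas succeeds here; Modin raises ValueError.
df = df.reset_index()

print(df)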

Describe the problem

Modin raises an error where pandas does not. The issue was observed in the Santander workload. A possible cause is that the new column names are not propagated correctly to the lowest levels of the architecture.
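To illustrate that hypothesis, here is a minimal sketch in plain pandas (no Modin involved). It assumes, as suggested above, that the frame reaching `reset_index` at the lowest level still carries the old column label `col0`; the helper `build_frame` and the variable names are made up for this example. Under that assumption the same ValueError appears, while a frame with the renamed column resets cleanly.

```python
import pandas as pd


def build_frame():
    # Hypothetical helper: same data as the reproducer, index named "col0".
    return pd.DataFrame(
        {"col0": [0, 1, 2, 3]},
        index=pd.Index([11, 22, 33, 44], name="col0"),
    )


# Case 1: the rename never reached this frame (assumed stale state).
stale = build_frame()
try:
    stale.reset_index()
    stale_error = None
except ValueError as exc:
    # The index name collides with the existing column label.
    stale_error = str(exc)

# Case 2: the rename is applied before reset_index, as in the reproducer.
renamed = build_frame()
renamed.columns = ["col0_1"]
result = renamed.reset_index()

print(stale_error)
print(list(result.columns))
```

If the sketch matches what happens inside Modin, the fix would be to make sure the updated labels reach the partitions before `reset_index` runs on them; this is only a reading of the traceback, not a confirmed diagnosis.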

Source code / logs

Modin output:

During handling of the above exception, another exception occurred:

ray::apply_list_of_funcs() (pid=2800927, ip=10.241.129.69)
  File "/localdisk/aprutsko/modin/modin/core/execution/ray/implementations/pandas_on_ray/partitioning/partition.py", line 417, in apply_list_of_funcs
    partition = func(partition.copy(), *args, **kwargs)
  File "/localdisk/aprutsko/modin/modin/core/dataframe/pandas/dataframe/dataframe.py", line 827, in from_labels_executor
    return df.reset_index()
  File "/localdisk/aprutsko/miniconda/envs/modin/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/localdisk/aprutsko/miniconda/envs/modin/lib/python3.8/site-packages/pandas/core/frame.py", line 5841, in reset_index
    new_obj.insert(0, name, level_values)
  File "/localdisk/aprutsko/miniconda/envs/modin/lib/python3.8/site-packages/pandas/core/frame.py", line 4442, in insert
    raise ValueError(f"cannot insert {column}, already exists")
ValueError: cannot insert col0, already exists

Pandas output:

   col0  col0_1
0    11       0
1    22       1
2    33       2
3    44       3
prutskov added the bug 🦗 Something isn't working label Feb 16, 2022
prutskov self-assigned this Feb 16, 2022
prutskov added a commit to prutskov/modin that referenced this issue Feb 21, 2022
….from_labels`

Signed-off-by: Alexey Prutskov <alexey.prutskov@intel.com>
devin-petersohn added a commit that referenced this issue Feb 28, 2022
…#4209)

Co-authored-by: Devin Petersohn <devin-petersohn@users.noreply.github.com>
Signed-off-by: Alexey Prutskov <alexey.prutskov@intel.com>
vnlitvinov pushed a commit that referenced this issue Mar 17, 2022
…#4209)

Co-authored-by: Devin Petersohn <devin-petersohn@users.noreply.github.com>
Signed-off-by: Alexey Prutskov <alexey.prutskov@intel.com>