Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't copy dataframes or use inplace=True #305

Merged
merged 5 commits into from
Jan 31, 2025
Merged

Conversation

siddharth-krishna
Copy link
Collaborator

Pandas is moving away from the inplace=True argument in its next version, and recommends against using it as most methods make copies of the underlying data anyway, and using inplace is more likely to lead to bugs:
pandas-dev/pandas#16529
pandas-dev/pandas#51466

We also have a lot of code that looks like:

df = table.dataframe.copy()
df.foobar(..., inplace=True)
...
table = replace(table, dataframe=df)

which is better written as the following, as most pandas methods foobar return a new dataframe with the results of the operation:

df = table.dataframe.foobar(...)
...
table = replace(table, dataframe=df)

It is not necessary to copy dataframes unless in very special cases, for example, you are adding a column to the dataframe:

df = model.topology.copy()  # This copy is needed, otherwise the next line adds a column to `model.topology`
df['new_col'] = df['old_col'] * 2

@siddharth-krishna
Copy link
Collaborator Author

I benchmarked it on my laptop with --seq and saw a 7% increase in performance!

2025-01-31 09:47:26.945 | INFO     : Total runtime: 194.55s (main: 210.89s) (__main__:MainThread:pid-17764 "/Users/sid/code/times-excel-reader/utils/run_benchmarks.py:401")
2025-01-31 09:47:26.945 | INFO     : Change in runtime (negative == faster): -16.34s (-7.7%) (__main__:MainThread:pid-17764 "/Users/sid/code/times-excel-reader/utils/run_benchmarks.py:402")

Copy link

github-actions bot commented Jan 31, 2025

Regression test results on commit cd392a0

         Benchmark    Time (s)                 GDX Diff     Accuracy       Correct    Additional
------------------  ----------  -----------------------  -----------  ------------  ------------
     DemoS_001-all   3.3   3.9            OK         OK  100.0 100.0    118    118      3      3
     DemoS_002-all   3.8   4.3            OK         OK  100.0 100.0    344    344      3      3
     DemoS_003-all   4.0   4.7            OK         OK  100.0 100.0    633    633      6      6
         DemoS_004   4.1   4.8            OK         OK  100.0 100.0    662    662     12     12
        DemoS_004a   4.3   4.7            OK         OK  100.0 100.0    665    665     12     12
DemoS_004a-ie-test   4.1   5.4            OK         OK  100.0 100.0    667    667     12     12
        DemoS_004b   4.2   4.6            OK         OK  100.0 100.0    665    665     12     12
     DemoS_004-all   4.2   5.0            OK         OK  100.0 100.0    667    667     12     12
     DemoS_005-all   4.8   5.8            13         13   99.7  99.7   1160   1160     12     12
     DemoS_006-all   5.3   6.4            13         13   99.7  99.7   1258   1258     12     12
     DemoS_007-all   6.6   7.3            13         13   99.8  99.8   2155   2155     12     12
  DemoS_007-all-1r   5.4   6.0            11         11   99.8  99.8   1179   1179     12     12
     DemoS_008-all   8.5   8.6            13         13   99.9  99.9   5333   5333     18     18
     DemoS_009-all   9.1   9.1            30         32   99.9  99.9   5807   5807     29     29
     DemoS_010-all   9.2   9.2           826        822   88.4  88.4   6139   6139     29     29
     DemoS_011-all   9.4   9.2           858        856   88.0  88.0   6150   6150     29     29
     DemoS_012-all   9.6   9.5           905        905   88.2  88.2   6306   6306     53     53
  DemoS_special-t1   5.5   5.2  Error runn… Error runn…   92.2  92.2   2108   2108     50     50
      TIMES-IE-all  27.0  26.0          3131       3131   97.5  97.5  42460  42460   2826   2826
      TIMES-IE-NoM  24.1  23.4           363        363   99.3  99.3  40528  40528    632    632
      TIMES-IE-MCB  25.1  23.4          1106       1106   97.6  97.6  42067  42067    632    632
      TIMES-NZ-KEA  33.2  30.9          3932       3932   98.3  98.3  76416  76416    458    458
      TIMES-NZ-TUI  25.6  23.8          3966       3966   98.3  98.3  76042  76042    474    474
 (__main__:MainThread:pid-5951 "/home/runner/work/xl2times/xl2times/xl2times/utils/run_benchmarks.py:377")
2025-01-31 12:08:05.814 | INFO     : Total runtime: 241.29s (main: 240.43s) (__main__:MainThread:pid-5951 "/home/runner/work/xl2times/xl2times/xl2times/utils/run_benchmarks.py:401")
2025-01-31 12:08:05.814 | INFO     : Change in runtime (negative == faster): +0.86s (+0.4%) (__main__:MainThread:pid-5951 "/home/runner/work/xl2times/xl2times/xl2times/utils/run_benchmarks.py:402")
2025-01-31 12:08:05.815 | INFO     : Change in correct rows (higher == better): +0 (+0.0%) (__main__:MainThread:pid-5951 "/home/runner/work/xl2times/xl2times/xl2times/utils/run_benchmarks.py:409")
2025-01-31 12:08:05.815 | INFO     : Change in additional rows: +0 (+0.0%) (__main__:MainThread:pid-5951 "/home/runner/work/xl2times/xl2times/xl2times/utils/run_benchmarks.py:416")
2025-01-31 12:08:05.815 | SUCCESS  : No regressions. You're awesome! (__main__:MainThread:pid-5951 "/home/runner/work/xl2times/xl2times/xl2times/utils/run_benchmarks.py:430")

This comment will be updated when new commits are added to the PR.

Copy link
Member

@olejandro olejandro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @siddharth-krishna! Tremendous work!

@olejandro olejandro merged commit f1c16c0 into main Jan 31, 2025
2 checks passed
@olejandro olejandro deleted the sid/no-copy-inplace branch January 31, 2025 15:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants