Add inplace option to DataFrame.update() #10730

dov · 2015-08-03T07:28:59Z

Currently DataFrame.update() modifies the DataFrame in place. I feel this is not symmetric with how most other methods work, e.g. set_index(), which returns a new DataFrame. It is probably too late to change this default behaviour (is it?), but if at the very least an inplace option were added (with default value True to support the current behaviour), it would be easier to chain a sequence of operations that includes update().
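For readers unfamiliar with the behaviour described above, here is a minimal sketch. The frames `df` and `other` and their values are made up for illustration; the point is only that update() mutates its caller and returns None, which is why it cannot sit in the middle of a method chain.

```python
import pandas as pd

# Illustrative data, not taken from the issue.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10.0, 20.0, 30.0]})
other = pd.DataFrame({"b": [99.0]}, index=[1])

# update() aligns `other` on index and columns, overwrites matching
# cells of `df` in place, and returns None.
result = df.update(other)

print(result)          # None
print(df.loc[1, "b"])  # 99.0
```

Because `result` is None, something like `df.update(other).to_csv(...)` raises an AttributeError rather than chaining.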


jreback · 2015-08-03T10:31:43Z

Can you show an example of your use case? .update has a very limited use case; often .combine/.combine_first/.assign can substitute.

As for the default, that ship has sailed. It would be better to introduce a .updated method that defaults to returning a new object (and has an inplace parameter), and to deprecate .update. We would take a PR for that.
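Until something like that exists, one possible workaround is a small helper used with DataFrame.pipe. The function name `updated` below is hypothetical (it borrows the name suggested above but is not a pandas API), and the data is invented for illustration.

```python
import pandas as pd


def updated(df: pd.DataFrame, other: pd.DataFrame, **kwargs) -> pd.DataFrame:
    """Hypothetical helper: return an updated copy and leave `df` untouched."""
    out = df.copy()
    out.update(other, **kwargs)  # the in-place update only touches the copy
    return out


# Illustrative data, not taken from the issue.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0]}, index=["x", "y", "z"])
patch = pd.DataFrame({"a": [20.0]}, index=["y"])

# pipe() passes the frame as the first argument, so the helper can sit
# inside a chain; to_csv() with no path returns the CSV text as a string.
csv_text = df.pipe(updated, patch).to_csv()
print(csv_text)
```

The extra copy costs memory on large frames, which is the usual trade-off of avoiding in-place mutation.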

dov · 2015-08-03T11:02:29Z

I first encountered update as an answer to my stackoverflow question:

http://stackoverflow.com/questions/31769754/looking-up-multiple-values-from-a-pandas-dataframe/31779537#31779537

regarding how to do a "parallel lookup" of all rows of one DataFrame by the values of another DataFrame.

In that example it might be useful, directly after the call to update(), to write the resulting DataFrame to a CSV file, for instance.

Safrone · 2019-04-24T15:37:46Z

I had a similar feeling. Is there a way to do this in a more functional way?

giuliobeseghi · 2020-01-27T23:10:06Z

> Can you show an example of your use case? .update has a very limited use case; often .combine/.combine_first/.assign can substitute.
>
> As for the default, that ship has sailed. It would be better to introduce a .updated method that defaults to returning a new object (and has an inplace parameter), and to deprecate .update. We would take a PR for that.

My use case is:

# combine three datetime series referring to periods that partially overlap
overall_forecast = long_term_forecast.update(medium_term_forecast).update(short_term_forecast)

I can't use loc because the index of a series is not necessarily contained in another

Liam3851 · 2020-01-28T00:38:42Z

@giuliobeseghi If I understand correctly, it sounds like your use case would be solved by using combine_first instead of update?

Combine Series values, choosing the calling Series's values
first. Result index will be the union of the two indexes

Parameters
----------
other : Series
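As a rough sketch of how combine_first could cover the forecast use case: the three series below reuse the variable names from the earlier comment, but their dates and values are invented for illustration. The highest-priority series is the caller, and each later series only fills in what is still missing.

```python
import pandas as pd

# Illustrative forecasts on partially overlapping date ranges.
long_term_forecast = pd.Series(
    [1.0] * 5, index=pd.date_range("2020-01-03", periods=5)
)
medium_term_forecast = pd.Series(
    [2.0] * 3, index=pd.date_range("2020-01-02", periods=3)
)
short_term_forecast = pd.Series(
    [3.0] * 2, index=pd.date_range("2020-01-01", periods=2)
)

# Short-term values win, then medium-term, then long-term; the result
# index is the union of all three indexes.
overall_forecast = (
    short_term_forecast
    .combine_first(medium_term_forecast)
    .combine_first(long_term_forecast)
)
print(overall_forecast)
```

Note that a NaN in a higher-priority series is treated as missing and gets filled from the lower-priority one.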

giuliobeseghi · 2020-01-28T09:54:47Z

> @giuliobeseghi If I understand correctly, it sounds like your use case would be solved by using combine_first instead of update?
>
> Combine Series values, choosing the calling Series's values
> first. Result index will be the union of the two indexes
>
> Parameters
> ----------
> other : Series

You are right.

Thanks for that, I got confused with the example: I might have to keep the NaNs of the most recent series. I realised I can do it with reindex + loc:

import pandas as pd
import numpy as np

s = pd.Series([1, 2, 3, 4, 5])
t = pd.Series([10, 11, 12], index=range(3, 6))
t.iloc[1] = np.nan
print(t)

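# Index.union replaces `t.index | s.index`; the | set operation on Index was removed in pandas 2.0
res = s.reindex(s.index.union(t.index))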
res.loc[t.index] = t.to_numpy()
print(res)

It would be good if combine_first had an option to overwrite values even if they're np.nan :)

giuliobeseghi · 2020-01-29T09:29:52Z

Shall we suggest combine_first in the docs of Series.update and DataFrame.update?

"For non-inplace update consider combine_first"

peterhadlaw · 2023-03-30T00:28:33Z

combine_first only works when the original DataFrame has null values to be ... updated. update, however, will overwrite whatever the original values were at a given index, regardless of whether they were null (and, more importantly, leave values from the starting DF intact if their index was not in the updating DF).

I think there is no way to do this sort of "pick the values from the new DataFrame" operation in a functional manner.

I guess maybe .combine(other, lambda a, b: b) would accomplish this? but update directly feels more elegant and appropriate
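To make the comparison concrete, here is a small sketch with invented data (`original` and `new` are illustrative names). It contrasts combine_first, update on a copy, and the .combine(other, lambda a, b: b) idea. As far as I can tell, the combine variant does not keep rows that exist only in the original frame, because both frames are aligned first and the lambda returns the aligned (NaN-padded) column from `other`.

```python
import numpy as np
import pandas as pd

# Illustrative data, not taken from the issue.
original = pd.DataFrame({"v": [1.0, np.nan, 3.0]}, index=["a", "b", "c"])
new = pd.DataFrame({"v": [10.0, 20.0]}, index=["a", "b"])

# combine_first keeps the caller's values and only fills its NaNs.
filled = original.combine_first(new)

# update overwrites wherever `new` has a non-NaN value; doing it on a
# copy leaves `original` untouched.
patched = original.copy()
patched.update(new)

# combine with "take b" returns the aligned column from `new`, so row
# "c", which only exists in `original`, ends up as NaN.
combined = original.combine(new, lambda a, b: b)

print(filled["v"].tolist())    # [1.0, 20.0, 3.0]
print(patched["v"].tolist())   # [10.0, 20.0, 3.0]
print(combined["v"].tolist())  # [10.0, 20.0, nan]
```

In this sketch only the copy-then-update version matches the "overwrite where present, otherwise leave intact" behaviour described above.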

mroeschke · 2023-05-11T23:40:08Z

Given the more active discussion about deprecating/retooling inplace parameters across pandas, I'm going to close this in favor of that discussion: #16529

jreback added the Indexing, Missing-data, and API Design labels Aug 3, 2015

jreback added this to the Next Major Release milestone Aug 3, 2015

jreback added the Needs Discussion label Aug 3, 2015

jreback mentioned this issue May 23, 2017

BUG: handle nan values in DataFrame.update when overwrite=False (#15593) #16430

Merged

4 tasks

giuliobeseghi mentioned this issue Jan 27, 2020

Allow non-inplace option for ndframe.update() #31372

Closed

mroeschke added the Enhancement label and removed the API Design and Indexing labels Apr 18, 2021

jbrockmendel added the inplace label Oct 29, 2021

mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022

mroeschke closed this as completed May 11, 2023
