DOC: DataFrame.update() fails to change values in calling dataframe if new value is NaN #16812

Closed
lvphj opened this issue Jul 1, 2017 · 5 comments · Fixed by #17859
Labels: Docs; good first issue; Missing-data (np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate)

lvphj commented Jul 1, 2017

Code Sample, a copy-pastable example if possible

import numpy as np
import pandas as pd

# Define dataframe
tempDF1 = pd.DataFrame({'c1':[1,2,3,4,5],
                       'c2':[True,True,False,False,True]})

# Select part of original dataframe
tempDF2 = tempDF1.loc[tempDF1['c2']==True,:].copy()

print('Original dataframe')
print(tempDF1)
print('\nSelection of dataframe')
print(tempDF2)

# Make some changes to selection
tempDF2.loc[:,'c1'] = tempDF2.loc[:,'c1'] + 100
tempDF2.iloc[2,0] = np.nan

# Update original dataframe with new values
tempDF1.update(tempDF2,overwrite=True)

print('\nSelection of dataframe - with changes')
print(tempDF2)
print('\nUpdated original dataframe')
print(tempDF1)

Problem description

DataFrame.update() does not overwrite values in the calling dataframe when the new value is NaN. Non-NaN values are copied to the original dataframe as expected, although the dtype of the updated column changes in the process (int64 becomes float64).

Expected Output

Expected output would be

    c1     c2
0  101   True
1  102   True
2    3  False
3    4  False
4  NaN   True

However, actual output is:

      c1     c2
0  101.0   True
1  102.0   True
2    3.0  False
3    4.0  False
4    5.0   True

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.4.6.final.0
python-bits: 64
OS: Darwin
OS-release: 15.6.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.20.2
pytest: None
pip: 9.0.1
setuptools: 34.3.3
Cython: None
numpy: 1.13.0
scipy: None
xarray: None
IPython: 5.3.0
sphinx: None
patsy: None
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: 2.4.7
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.999999999
sqlalchemy: None
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

Contributor

dsm054 commented Jul 1, 2017

I think that's what it's supposed to do, though. From the documentation:

Modify DataFrame in place using non-NA values from passed DataFrame.
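
The documented behaviour quoted above can be seen in a small sketch. The frame and column names here are illustrative, not taken from the original report, and a float column is used so that no dtype change is involved:

```python
import numpy as np
import pandas as pd

# Calling frame and the frame holding the new values (illustrative data).
df = pd.DataFrame({"c1": [1.0, 2.0, 3.0]})
other = pd.DataFrame({"c1": [101.0, np.nan, 103.0]})

# update() copies only the non-NA cells of `other`;
# the NaN in row 1 is skipped, so the existing value 2.0 is kept.
df.update(other)
print(df)
```

Rows 0 and 2 take the new values, while row 1 is left untouched because the corresponding cell in `other` is NaN.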

Contributor

jreback commented Jul 1, 2017

Some examples in the docstring would be great for this. @lvphj, want to do a PR?

@jreback jreback added Difficulty Novice Docs Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Jul 1, 2017
@jreback jreback added this to the Next Major Release milestone Jul 1, 2017
@jreback jreback changed the title DataFrame.update() fails to change values in calling dataframe if new value is NaN DOC: DataFrame.update() fails to change values in calling dataframe if new value is NaN Jul 1, 2017
Author

lvphj commented Jul 1, 2017

You're right. Sorry, somehow I missed that in the docs. Yes, I'll add some examples to the docs and send a PR.

reidy-p added a commit to reidy-p/pandas that referenced this issue Oct 12, 2017
@jreback jreback modified the milestones: Next Major Release, 0.21.0 Oct 14, 2017
jorisvandenbossche pushed a commit that referenced this issue Oct 16, 2017
* DOC: Adding examples to update docstring (#16812)

* formatting issues

* improving examples
@maxwellflitton

Is there a parameter to track NaN values and change old DF values to NaN in the update function?


joh-mue commented Dec 18, 2018

Yes, what if I want to force NaN values in the update function?
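
One possible workaround, offered as a sketch rather than as an option of update() itself: label-based assignment with .loc copies every cell, NaN included. The example assumes that every index label and column of the second frame already exists in the first, and it uses a float column to sidestep the int64 to float64 conversion mentioned earlier in the thread. The names df1 and df2 are illustrative.

```python
import numpy as np
import pandas as pd

# Original frame and an edited selection of it (illustrative data).
df1 = pd.DataFrame({"c1": [1.0, 2.0, 3.0, 4.0, 5.0]})
df2 = pd.DataFrame({"c1": [101.0, 102.0, np.nan]}, index=[0, 1, 4])

# Assign by label: every cell of df2 is written into df1,
# so the NaN at index 4 replaces the old value 5.0.
df1.loc[df2.index, df2.columns] = df2
print(df1)
```

Unlike update(), this overwrites unconditionally, so it is only appropriate when the NaN values in the second frame are intended.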
