DataFrame doesn't define density #19028

hexgnu · 2018-01-01T18:48:46Z

Code Sample, a copy-pastable example if possible

This is more of an open question about whether this should be implemented. If a DataFrame has a SparseSeries inside of it, shouldn't 'density' also be defined?

import pandas as pd

df = pd.DataFrame({'a': pd.SparseSeries([1, 0, 0, 1])})
df.a.density  #=> 0.5
df.density  #=> throws an error; I would think this should be 0.5

sdf = pd.SparseDataFrame({'a': pd.SparseSeries([1,0,0,1])})
sdf.density #=> 0.5
sdf.a.density #=> 0.5

Problem description

Basically this is a consistency problem between SparseDataFrame and DataFrame. Since DataFrames can contain SparseSeries, DataFrame should probably define 'density' as well.

Expected Output

I would expect a DataFrame to have density defined. If it is dense it would just be 1.0.
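The expectation above can be sketched as a small helper. This is only an illustration, not pandas API: `frame_density` is a hypothetical name, averaging the per-column densities is an assumed definition, and the sketch uses `pd.arrays.SparseArray` and the `.sparse` accessor because `SparseSeries` was removed in pandas 1.0.

```python
import pandas as pd


def frame_density(df: pd.DataFrame) -> float:
    """Hypothetical helper: mean per-column density of a DataFrame.

    Sparse columns report their own density; dense columns count as 1.0.
    """
    if df.shape[1] == 0:
        # Assumption: a frame with no columns is treated as fully dense.
        return 1.0
    densities = [
        col.sparse.density if isinstance(col.dtype, pd.SparseDtype) else 1.0
        for _, col in df.items()
    ]
    return sum(densities) / len(densities)


# One sparse column (2 of 4 values stored) and one ordinary dense column.
mixed = pd.DataFrame({
    "a": pd.arrays.SparseArray([1, 0, 0, 1]),
    "b": [1, 2, 3, 4],
})
mixed_density = frame_density(mixed)

# A frame with only dense columns.
dense_density = frame_density(pd.DataFrame({"x": [1, 2]}))
```

With equal-length columns, the mean of the column densities equals the share of stored values over the whole frame, which is why a fully dense frame comes out as 1.0.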

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.16-202.fc26.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.3.1
pip: 9.0.1
setuptools: 28.8.0
Cython: 0.27.3
numpy: 1.13.1
scipy: 0.19.1
xarray: 0.10.0
IPython: 6.1.0
sphinx: 1.6.5
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: 1.5.1
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.0.2
openpyxl: 2.4.9
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0b10
sqlalchemy: 1.1.15
pymysql: 0.7.11.None
psycopg2: None
jinja2: 2.9.6
s3fs: 0.1.2
pandas_gbq: None
pandas_datareader: None


hexgnu · 2018-01-01T18:49:12Z

See issue #16874

jreback · 2018-01-01T19:48:43Z

yeah, suppose we could define this function on DataFrame itself. maybe want to rename this (e.g. deprecate .density()) to avoid namespace pollution, or better yet have a .sparse namespace

hexgnu · 2018-01-05T01:27:15Z

Another issue from #16874 is that to_coo won't be defined on a DataFrame either.

I think a good piece of work would be to go through all of the differences and at least document them, or implement them where possible. to_coo doesn't seem like it should be implemented, imho.

TomAugspurger · 2019-09-16T17:16:50Z

This should be an attribute of the .sparse accessor.

lithomas1 · 2021-07-30T15:46:16Z

This is implemented already in the sparse accessor.
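For readers arriving later, a minimal sketch of the accessor route mentioned in the last two comments, assuming a pandas version that ships the `.sparse` accessor and using `pd.arrays.SparseArray` in place of the removed `SparseSeries`:

```python
import pandas as pd

# All columns are sparse here; the DataFrame-level accessor is meant for
# frames whose columns all have a sparse dtype.
df = pd.DataFrame({"a": pd.arrays.SparseArray([1, 0, 0, 1])})

# Density of a single column: 2 of the 4 values differ from the fill value 0.
col_density = df["a"].sparse.density

# Density of the whole frame through the DataFrame-level accessor.
frame_level_density = df.sparse.density

print(col_density, frame_level_density)
```

Both values are 0.5 for this data, matching the `sdf.density` result in the original report.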

hexgnu mentioned this issue Jan 1, 2018

Concatting dense and sparse dataframes breaks many common operations #16874

Closed

jreback added the API Design, Numeric Operations (arithmetic, comparison, and logical operations), and Sparse (sparse data type) labels Jan 1, 2018

mroeschke added the Enhancement label Apr 20, 2020

mroeschke removed the API Design and Numeric Operations (arithmetic, comparison, and logical operations) labels Jun 12, 2021

lithomas1 closed this as completed Jul 30, 2021
