Concatting dense and sparse dataframes breaks many common operations #16874

Closed
kbattocchi opened this issue Jul 10, 2017 · 2 comments · Fixed by #18924

kbattocchi commented Jul 10, 2017

xref #15737

Code Sample, a copy-pastable example if possible

import pandas as pd

# Each statement concatenates a dense frame with a sparse one column-wise,
# then applies one common operation to the result (all four raise on pandas 0.20.2).
pd.concat([pd.DataFrame([0.0]), pd.SparseDataFrame([0.0])], axis=1).isnull()
pd.concat([pd.DataFrame([0.0]), pd.SparseDataFrame([0.0])], axis=1).density
pd.concat([pd.DataFrame([0.0], columns=['A']), pd.SparseDataFrame([0.0])], axis=1)['A']
pd.concat([pd.DataFrame([0.0], columns=['A']), pd.SparseDataFrame([0.0])], axis=1).iloc[0,0]

Problem description

Each of the above lines generates an error when the dataframes are of mixed sparsity, but would succeed if both dataframes were dense or both were sparse. This means that we can't seamlessly swap sparse dataframes for dense ones without knowing how they'll be used downstream.

Expected Output

None of the lines raises an error.
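For readers on newer pandas, here is a rough sketch of the same four operations. It is not the fix referenced in this issue. It assumes pandas 1.0 or later, where `SparseDataFrame` no longer exists and sparsity is a per-column dtype instead. The column names `A` and `B` and the variable names are illustrative only, and the density lookup is first restricted to the sparse columns, which is a choice made for this sketch rather than the behavior requested above.

```python
import pandas as pd

# Assumption: pandas >= 1.0, where sparse data lives in ordinary DataFrame
# columns backed by pd.arrays.SparseArray (SparseDataFrame was removed).
dense = pd.DataFrame({"A": [0.0]})
sparse = pd.DataFrame({"B": pd.arrays.SparseArray([0.0])})

# Column-wise concatenation of a dense frame and a sparse-backed frame.
mixed = pd.concat([dense, sparse], axis=1)

null_mask = mixed.isnull()      # element-wise null check
column_a = mixed["A"]           # label-based column selection
first_cell = mixed.iloc[0, 0]   # positional scalar access

# The .sparse accessor only accepts frames whose columns are all sparse,
# so select the sparse columns before asking for the density.
is_sparse = [isinstance(dtype, pd.SparseDtype) for dtype in mixed.dtypes]
sparse_density = mixed.loc[:, is_sparse].sparse.density
```

Selecting the sparse columns explicitly keeps the density question well defined for a mixed frame. Whether a mixed frame should report a density at all is the part of this report that was split out separately.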

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.2.final.0
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 63 Stepping 2, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.20.2
pytest: 2.9.2
pip: 8.1.2
setuptools: 27.2.0
Cython: 0.25.2
numpy: 1.13.0
scipy: 0.18.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.6
patsy: 0.4.1
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: 1.1.0
tables: 3.2.2
numexpr: 2.6.1
feather: None
matplotlib: 2.0.2
openpyxl: 2.3.2
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: 0.4.0

kawochen mentioned this issue Jul 11, 2017

jreback commented Jul 11, 2017

We would certainly take a PR to fix this. Lots of sparse issues are open.

jreback added the Difficulty Intermediate, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode), and Sparse (Sparse Data Type) labels Jul 11, 2017
jreback added this to the Next Major Release milestone Jul 11, 2017

hexgnu commented Jan 1, 2018

I have a fix for all of these problems except for

pd.concat([pd.DataFrame([0.0]), pd.SparseDataFrame([0.0])], axis=1).density

I will open up an issue about this. (See #19028)
