Parallel categories not recognizing dimension columns #2008

Closed
robroc opened this issue Jun 6, 2019 · 7 comments · Fixed by #2102

@robroc

robroc commented Jun 6, 2019

I'm passing a data frame with two categorical columns and one numerical column to px.parallel_categories. Only one column gets visualized, no matter what is passed to dimensions: a single column, two columns, or no columns.

I changed the data types to str and category, but it made no difference.

Data:

[image: px]

Code:

px.parallel_categories(occupations_by_year, dimensions=['Main_jobs', 'Election year'], color='count')

Result:

[image: px2]

Python 3.6
plotly==3.10.0
plotly-express==0.3.0

@nicolaskruchten
Contributor

How many unique values of main_jobs are there?

@robroc
Author

robroc commented Jun 6, 2019 via email

@nicolaskruchten
Contributor

Yes, but I'll have to generalize the cutoff heuristic, which right now caps the number of unique values per column at 20, IIRC. This is a little blunt for cases like this where you only have 2 dimensions. It's reasonable if you had e.g. 10 dimensions, because in that case you might have up to 20^10 combinations :)
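
A minimal sketch of the kind of cutoff described above, assuming the heuristic simply drops any column with more than 20 unique values. The figure of 20 is quoted from memory in this comment, and `filter_dimensions`, `max_combinations`, and `max_cardinality` are hypothetical names used for illustration, not plotly's API. It also computes the worst-case number of combinations that motivates the cap.

```python
import math  # math.prod needs Python 3.8+

import pandas as pd


def filter_dimensions(df, dimensions=None, max_cardinality=20):
    """Return the column names that survive a unique-value cap (hypothetical helper)."""
    candidates = list(df.columns) if dimensions is None else list(dimensions)
    return [name for name in candidates if df[name].nunique() <= max_cardinality]


def max_combinations(df, columns):
    """Upper bound on distinct paths: the product of each column's unique-value count."""
    return math.prod(df[name].nunique() for name in columns)


# Toy data (made up for this sketch): 'job' has 30 distinct values, 'year' has 3.
df = pd.DataFrame({
    "job": [f"job_{i % 30}" for i in range(90)],
    "year": [2011 + (i % 3) for i in range(90)],
})

kept = filter_dimensions(df, dimensions=["job", "year"])
print(kept)                                   # ['year']
print(max_combinations(df, ["job", "year"]))  # 90
print(20 ** 10)                               # 10240000000000
```

With only two dimensions the worst case here is 90 combinations, yet the 30-value column is still dropped, which is the bluntness the comment refers to.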

@jacobbaron

I have encountered this issue as well. I have four columns, one of which has ~30 categories. There are about 1100 unique combinations in the data, so not ridiculous, but the column with the 30 categories is left out of the plot.

@jasonsross

I second the desire to set the cutoff on the number of categories. I like these charts, but I've hit this roadblock, and I think I was just barely over the cutoff limit. I'm OK with grouping my categories down to reduce complexity, but I'd like a little more flexibility in doing so.
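
One way to group categories down so a column fits under a cap is to keep the most frequent values and fold the rest into a single catch-all label. This is a sketch in plain pandas; `lump_rare_categories`, `max_categories`, and the "Other" label are hypothetical choices, and the cap of 20 is the value quoted from memory earlier in the thread.

```python
import pandas as pd


def lump_rare_categories(series, max_categories=20, other_label="Other"):
    """Keep the most frequent values and replace the rest with one label (hypothetical helper)."""
    counts = series.value_counts()  # sorted by frequency, most common first
    if len(counts) <= max_categories:
        return series
    # Reserve one slot for the catch-all label so the result stays within the cap.
    keep = counts.index[: max_categories - 1]
    return series.where(series.isin(keep), other_label)


# Toy column (made up for this sketch): 30 distinct values of decreasing frequency.
values = [f"cat_{i}" for i in range(30) for _ in range(30 - i)]
s = pd.Series(values, name="category")

lumped = lump_rare_categories(s, max_categories=20)
print(s.nunique(), lumped.nunique())  # 30 20
```

Note that missing values are also folded into the catch-all label in this sketch, which may or may not be what you want.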

@nicolaskruchten
Contributor

I'll get this fixed in the next version :) I'll move this issue to the plotly.py repo to get it scheduled.

nicolaskruchten transferred this issue from plotly/plotly_express Dec 17, 2019
nicolaskruchten added this to the v4.5.0 milestone Dec 17, 2019
@fjprobos

Same issue over here. It is an easy way to look for collinearity among categorical features. Please get it fixed.

unique transmission_model : 19
unique model_family: 24

Python 3.5.6
Plotly 4.4.1
