Allowing hover_data and custom_data to pass string of column name instead of requiring a list. #4083

Merged (5 commits) on Mar 6, 2023

Contributor

@lukefeilberg lukefeilberg commented Feb 25, 2023

Howdy y'all!

Quick background

I often get tripped up when using the hover_data argument, since nearly every other argument accepts a single column name as a string, but hover_data (and custom_data) does not. To make matters worse, the error message is quite confusing; see the example below.

import plotly.express as px
df = px.data.tips()
px.scatter(
    df,
    x='total_bill',
    y='tip',
    color='sex',
    hover_data='day',
    title="Hover data won't work as string 😢"
)

This produces the following error

ValueError: Value of 'hover_data_0' is not the name of a column in 'data_frame'. Expected one of ['total_bill', 'tip', 'sex', 'smoker', 'day', 'time', 'size'] but received: d

which is pretty misleading since I did pass 'day'!

This has been brought up in the following issues:

Instead of making the error message clearer, I think we should just allow a single string of a column name, to be consistent with nearly all other args.


Source of problem

It appears the source of the problem is that hover_data and custom_data are both in array_attrables and get cast to a list using list(arg). If arg is a string like 'my_col', this turns it into ['m', 'y', '_', 'c', 'o', 'l'], which is undesirable.
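The failure mode is easy to reproduce in plain Python, without Plotly. A minimal sketch (the variable names are only for illustration):

```python
# Calling list() on a string iterates over its characters,
# so a column name is split into single letters.
arg = "my_col"
print(list(arg))  # ['m', 'y', '_', 'c', 'o', 'l']

# With hover_data='day', the first element is 'd', which matches
# the "received: d" at the end of the error message quoted earlier.
hover_data = "day"
first_item = list(hover_data)[0]
print(first_item)  # d
```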

Simple Solution

Thus my simple band-aid solution is to check whether the field is in ["custom_data", "hover_data"] and whether args[field] in args["data_frame"].columns; if so, we instead wrap it in a list with args[field] = [args[field]]. This suffices, as shown in the screenshot below.
image
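The check described in this section could look roughly like the sketch below. This is not the code from the PR: wrap_single_column is a hypothetical helper, and a plain list of column names stands in for args["data_frame"].columns so the example runs without pandas.

```python
def wrap_single_column(args, field, columns):
    """Wrap a bare column-name string in a one-element list.

    Hypothetical helper for illustration only; `columns` stands in
    for args["data_frame"].columns.
    """
    value = args.get(field)
    if (
        field in ("custom_data", "hover_data")
        and isinstance(value, str)  # assumption: guard added for this sketch
        and value in columns
    ):
        args[field] = [value]
    return args


columns = ["total_bill", "tip", "sex", "smoker", "day", "time", "size"]

# A bare string naming a column gets wrapped in a list.
args = wrap_single_column({"hover_data": "day"}, "hover_data", columns)
print(args["hover_data"])  # ['day']

# A list passes through untouched.
args = wrap_single_column({"hover_data": ["day", "time"]}, "hover_data", columns)
print(args["hover_data"])  # ['day', 'time']
```

The isinstance guard is an addition in this sketch, so that list or dict values are never tested for membership in the columns.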

Other ideas?

I'm open to ideas if folks prefer a different approach (in particular, the hardcoded list might be better defined explicitly up top with an informative name), but I wanted to propose a simple enough solution that works.

I've also updated the docs in this PR to reflect that this change allows you to pass a string to both hover_data and custom_data -- I don't know whether it's preferable for these to be in the same PR.

Happy to get any other feedback as well as this is my first attempt at an open source contribution. 😎

Thanks,
Luke Feilberg


Documentation PR

  • I've seen the doc/README.md file
  • This change runs in the current version of Plotly on PyPI and targets the doc-prod branch OR it targets the master branch
  • If this PR modifies the first example in a page or adds a new one, it is a px example if at all possible
  • Every new/modified example has a descriptive title and motivating sentence or paragraph
  • Every new/modified example is independently runnable
  • Every new/modified example is optimized for short line count and focuses on the Plotly/visualization-related aspects of the example rather than the computation required to produce the data being visualized
  • Meaningful/relatable datasets are used for all new examples instead of randomly-generated data where possible
  • The random seed is set if using randomly-generated data in new/modified examples
  • New/modified remote datasets are loaded from https://plotly.github.io/datasets and added to https://github.com/plotly/datasets
  • Large computations are avoided in the new/modified examples in favour of loading remote datasets that represent the output of such computations
  • Imports are plotly.graph_objects as go / plotly.express as px / plotly.io as pio
  • Data frames are always called df
  • fig = <something> call is high up in each new/modified example (either px.<something> or make_subplots or go.Figure)
  • Liberal use is made of fig.add_* and fig.update_* rather than go.Figure(data=..., layout=...) in every new/modified example
  • Specific adders and updaters like fig.add_shape and fig.update_xaxes are used instead of big fig.update_layout calls in every new/modified example
  • fig.show() is at the end of each new/modified example
  • plotly.plot() and plotly.iplot() are not used in any new/modified example
  • Hex codes for colors are not used in any new/modified example in favour of these nice ones

Code PR

  • I have read through the contributing notes and understand the structure of the package. In particular, if my PR modifies code of plotly.graph_objects, my modifications concern the codegen files and not generated files.
  • I have added tests (if submitting a new feature or correcting a bug) or
    modified existing tests.
  • For a new feature, I have added documentation examples in an existing or
    new tutorial notebook (please see the doc checklist as well).
  • I have added a CHANGELOG entry if fixing/changing/adding anything substantial.
  • For a new feature or a change in behaviour, I have updated the relevant docstrings in the code to describe the feature or behaviour (please see the doc checklist as well).

@nicolaskruchten
Contributor

Thanks very much for this PR! I agree that the current behaviour is annoying enough that we should change it, and just accepting bare strings should work. I'll add some thoughts in the PR comments.

@lukefeilberg
Contributor Author

The build-doc job is failing on make html. I made some really small changes to the documentation, but only within strings. Can't tell if I've made some hard-to-spot or silly mistake, or if this is just being funky. 🤔

@nicolaskruchten
Contributor

build-doc is unfortunately flakey sometimes, and in this case it's not due to your PR :)

@nicolaskruchten nicolaskruchten merged commit 8a151e1 into plotly:master Mar 6, 2023
@nicolaskruchten
Contributor

Thanks very much for this PR!

@lukefeilberg
Contributor Author

lukefeilberg commented Mar 6, 2023

Happy to contribute and I appreciate your help in the process!

Out of curiosity, what's the general cadence of releases or expected next release date? Not in any rush but will be excited to see my minor contribution in the wild 😎

EDIT: Actually looking at the changelog gives me a sense. Thanks again!

@nicolaskruchten
Contributor

We generally do minor releases timed with Plotly.js minor releases, or sooner if there is a batch of Python-driven features waiting for release. I'm not sure when the next Plotly.js minor is expected, but likely within a few weeks.
