Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to Vega-Lite 4.17.0 #2513

Merged
merged 48 commits into from
Nov 8, 2021
Merged

Update to Vega-Lite 4.17.0 #2513

merged 48 commits into from
Nov 8, 2021

Conversation

ChristopherDavisUCI
Copy link
Contributor

Hi again,

@mattijn and I have tried to update the Pull Request from last week, taking into account your suggestions and trying to get all the tests (that I know about) to work.

Here are some of the main changes from the current Altair release:

  • Changed the Vega-Lite schema version to 4.17.0.
  • Added definitions of DatumSchemaGenerator and DatumChannelMixin in generate_schema_wrapper.py.
  • Updated the loop in generate_vegalite_channel_wrappers here to allow for the possibility of a datum.
  • Updated infer_encoding_types from altair/utils/core.py here to recognize Datum class names.
  • Changed get_valid_identifier in tools/schemapi/utils.py to deal with some symbols like [] appearing in certain names in the Vega-Lite schema. (Removing those symbols led to some duplicated class names.)
  • Updated TopLevelMixin from altair/vegalite/v4/api.py to allow for layer to be repeated, in addition to row and column.
  • Updated the Encoding Channel Options part of the docs here. (I wasn't sure if there was a way to auto-generate these groupings of for example Row with Column with Facet.)
  • Added some examples to the example gallery here and some additional documentation about mark_arc and layering in repeat here.
  • Applied a temporary fix and linked Vega-Lite issue to get test_vegalite_to_vega_mimebundle to work here.

Main outstanding issue that we know of:

  • Starting with v4.9, the format of TopLevelRepeatSpec changed in the Vega-Lite schema. We have some code to deal with this by editing the definition of TopLevelRepeatSpec in the downloaded JSON file. We've tried asking on the Vega-Lite Slack channel and as a Vega-Lite GitHub issue, but it seems like the issue is on the Altair side, not the Vega-Lite side. If the rest of the changes seem mostly in good shape, I can focus on finding a more adequate solution.

This is not part of the current Pull Request, but to get some of the tests to work, we made the following changes to altair_viewer:

Thank you for any feedback!

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 1, 2021

This looks great! Thanks for taking a stab at it.

In terms of the RepeatSpec issue, I do think this is an Altair issue rather than a Vega-Lite issue. In the past, all top-level specs have been objects rather than unions, but I don't think deviating from this is indicative of a bug. It's just something the Altair wrappers need to be expanded to handle.

I think rather than patching the JSON, the best fix would be to define the expected API in this class: https://github.com/altair-viz/altair/blob/8a8642b2e7eeee3b914850a8f7aacd53335302d9/altair/vegalite/v3/api.py#L1824

That is, you can derive that from VegaLiteSchema and set the _schema element manually. That way there's no mucking with the Vega-Lite schema itself.

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 1, 2021

Regarding the testing: the way I've done it in the past is to add a branch to altair viewer with the necessary changes, and temporarily adjust the CI in this branch to point to that. Then when we're happy with everything, we cut a new altair-viewer release and then change this CI back to normal.

@mattijn
Copy link
Contributor

mattijn commented Nov 2, 2021

From here: 3fc9a86 I see you update master of altair_viewer to include support for VL4.8.1. Once altair-viz/altair_viewer#43 is merged we can follow same principle.

1eef3de and d594587 now point to master of altair_viewer

@ChristopherDavisUCI
Copy link
Contributor Author

Hmm, this isn't an error I've seen before; is it obvious what it means? Trying it on my computer I got a similar error, and it seemed to point to line 278 here: https://github.com/ChristopherDavisUCI/altair/blob/25baa86dbd921a30761bcc5300b0c06397eae99d/altair/sphinxext/altairplot.py#L275-L282

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 4, 2021

Ah, yeah I think the issue there is that our default class wrapper thing is too aggressive now. alt.InlineDataSet has a type signature that looks something like anyOf(List(float), List(string), List(boolean), string), which means that any object that in from_dict(), any object that matches these types will be wrapped in an InlineDataSet class.

We probably don't want that.

So we'll need a way to distinguish between anyOf schemas that we do want to wrap with the parent type vs. ones that we don't.

Perhaps we could explicitly register "simple" schemas (like {type: "string"}, {type: "array", "items": {type: "float"}}, etc.) that we don't want to be wrapped with schema classes?

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 4, 2021

Another piece of context here: it's very tempting to introduce specific patches or checks to fix this kind of bug as it comes up... the problem with that is that as Vega-Lite evolves, those kinds of one-off changes become very difficult to manage and maintain. It's a mistake I made early on in writing Altair. That's why, for each of these issues, my inclination is to try to solve the most general version of the problem that's surfacing, because then maybe future upgrades will be just a bit easier.

@ChristopherDavisUCI
Copy link
Contributor Author

I experimented a little with some of the examples and most of them still seem to work correctly. Here is the smallest example I could find that does not work:

d = {'data': {'values': [{}]},
       'mark': {'type': 'point'}}

alt.Chart.from_dict(d).to_json()

Here is the result of alt.Chart.from_dict(d).to_dict():

{'config': {'view': {'continuousWidth': 400, 'continuousHeight': 300}},
 'data': {'name': 'data-a21cafb4c405e6997671a02e578b9b1e'}, 
 'mark': {'type': 'point'}, 
 '$schema': 'https://vega.github.io/schema/vega-lite/v4.17.0.json', 
 'datasets': {'data-a21cafb4c405e6997671a02e578b9b1e': InlineDataset([{}])}}

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 5, 2021

Ah, that’s helpful. My guess is there’s a to_dict call missing somewhere in the dataset handler.

@mattijn
Copy link
Contributor

mattijn commented Nov 5, 2021

I'm not sure if my commit resolves the effect or the cause. But before, this:

'data': {'values': [{}]}

resolved into alt.InlineData([{}]).
Where it now resolves into alt.InlineData(alt.InlineDataset([{}]).

So I apply alt.InlineDataset([{}]).to_dict() within the data consolidation function.

if isinstance(data.values, core.InlineDataset):
values = data.values.to_dict()
else:
values = data.values
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice find! This seems like the right fix. Maybe make it a bit more robust and do

values = data.to_dict()['values']

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 5, 2021

I think this is looking pretty good! What do you think? Anything missing at this point?

Before merging, I'll create a new altair_viewer release and we can change the CI requirements back to normal.

@ChristopherDavisUCI
Copy link
Contributor Author

I think this is looking pretty good! What do you think? Anything missing at this point?

Nothing missing that I'm aware of!

@mattijn
Copy link
Contributor

mattijn commented Nov 5, 2021

Merge-ready! +1

@joelostblom
Copy link
Contributor

joelostblom commented Nov 5, 2021

This is awesome! Thank you all so much for working on this PR, the notifications from this thread have been my favorite morning read over the last few days ❤️

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 6, 2021

In altair-viz/altair_viewer#45 I added the most recent vega & vega-embed versions to altair viewer. I triggered a re-run of the CI here to make sure there are no issues.

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 6, 2021

Alright, altair-viewer 0.4.0 is released! We can remove the temporary reference to its github branch from this PR

@mattijn
Copy link
Contributor

mattijn commented Nov 6, 2021

I checked if the tests use altair_viewer version 0.4.0, and this seems to be the case. All good.

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 8, 2021

Thanks all - I think we should merge this!

Given the number of commits & nonlinear history, I think I'll opt to squash and merge – I believe that will maintain correct authorship (@ChristopherDavisUCI will be in the author field, @mattijn will be listed in "Co-authored By" and I would be in the committer field). Is everyone happy with that?

@ChristopherDavisUCI
Copy link
Contributor Author

Great! That definitely sounds good to me. (It will be especially nice to get rid of some of those commits going back and forth between different fixes for TopLevelRepeatSpec.)

@mattijn
Copy link
Contributor

mattijn commented Nov 8, 2021

Perfect

@jakevdp jakevdp merged commit 55bb4b7 into vega:master Nov 8, 2021
@jakevdp
Copy link
Collaborator

jakevdp commented Nov 8, 2021

Alright, it's done! Thanks all, this is really great work 🎉

@ChristopherDavisUCI
Copy link
Contributor Author

Awesome! Will there be a corresponding "release"?

This has been a very fun (and fast-moving) process so I'd like to try playing around with vega-lite version 5. Is it better if I make a pull request early and then you can see what we're trying along the way, or should I wait until we're "finished" or "stuck" and then make the pull request? I think many changes will be necessary related to selections. In general I'm very open to suggestions, like "make fewer commits" or "make smaller self-contained pull requests" or whatever.

Thanks for all the attention!

@jakevdp
Copy link
Collaborator

jakevdp commented Nov 8, 2021

Regarding a release... I'd like to iron-out the "default data" issue discussed above. I also did a bit of maintenance stuff this morning (updating CI jobs, etc.) but yes, it would be great if we could get a release candidate out sometime this week.

Regarding VL5 - that would be great if you want to open a draft PR! The biggest issue I anticipate there is that VL5 replaced "selections" with a more general "params" specification, so I suspect a number of things will have to change to accommodate that.

@ChristopherDavisUCI
Copy link
Contributor Author

I've experimented a little with Vega-Lite v5 on this branch, although I might want to redo some things now that I better understand what's going on.

Surprisingly the biggest obstacle so far hasn't been params but a change to layer. I think in the newest Vega-Lite schema, charts in a layer are not allowed to specify height or width, which seems to break many Altair examples. Here is a minimal example that doesn't work:

no_data = pd.DataFrame()

c = alt.Chart(no_data).mark_circle().properties(
    width=100
)

c+c

I don't see a good way to deal with that. Do you have a suggestion?

I've read the list of "breaking changes" for the Vega-Lite 5.0.0 release and don't see anything that seems related to this, so it does make me wonder if maybe I misunderstand the cause of the problem.

@ChristopherDavisUCI ChristopherDavisUCI deleted the sample417 branch November 9, 2021 05:56
@mattijn
Copy link
Contributor

mattijn commented Nov 9, 2021

Hi Chris, if you move your last comment to here #2425, or start a new pull request titled 'WIP: update to Vega-Lite 5' and add your observations there, than I'm more than happy to join the process (and maybe others as well!). It would be nice if we can get further with fewer help from Jake.

Its indeed super awesome to align a few brains and go forward, but be careful, it's addicting and if its your spare-time than other things also probably like attention;)

But if we can give ourselves a Christmas gift with a new Altair release, either based on VL4.17 or 5, then I think we are doing great.

@ChristopherDavisUCI
Copy link
Contributor Author

ChristopherDavisUCI commented Nov 9, 2021

Thanks @mattijn and yes, "addicting" is a good word for it! Thanks for the suggestion; I made a new draft PR #2516 #2517

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants