-
Notifications
You must be signed in to change notification settings - Fork 794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce the number of entries available in the top level API #2918
Comments
I really like the idea of cleaning up the top-level API! Thanks for bringing this up. I also find it quite overwhelming to go through the API references in the docs and to use autocompletion if you don't specifically know what you are looking for (and therefore I never use completion on In favour of cleaning up the API references in the docs which we can maybe do without any code changes except to Exclusion from tab completion is a bit tricky. As we noticed in #2814, hiding in Jupyter works great but we haven't yet found a way to do the same in VS Code. Of course still worth the effort to adjust |
My hope was to move it to a submodule and import it at top level, but hide it from the top level tab completion and only allow tab completion via the submodule. I think this would allow for old code to work, while also encouraging people to use the submodules from now on as well as reduce the top-level tab completion.
I think it would be great to support VS Code too, but if there isn't a way that pylance offers for hiding individual classes/functions, I am not sure there is much we can do until they add that. I tried asking in the discussion you linked before, let's see if there is a way. |
Based on the helpful reply in microsoft/pylance-release#3709 (reply in thread) it seems like we need to get #2614 or similar merged, but maybe it is worthwhile if it resolves some of the VS Code issues we have been seeing and makes it easier to maintain both VS Code and JupyterLab support? |
Some notes from an offline discussion with @mattijn and @arvind:
What is very unlikely to break users code but would improve the situation a lot is removing classes from the top-level namespace which are never used by users anyway as they just represent a primitive datatype such as A first step could be to analyze which class names only appear ones in our code base, i.e. are defined but never used anywhere else, not even in a type hint. This can be a manual list and with every VL release, we could check in the PR if new entries get added to the top-level, if we really want them there. Maybe we can remove these classes completely instead of just not making them available in the top-level namespace. This would make the Altair wheels even smaller. It might require a bit of a rewrite of |
Due to the many objects, Altair takes up 23MB of memory (if I measure this correctly?) and this excludes it's requirements which are small: from memory_profiler import profile
@profile
def import_packages():
import typing_extensions
import jinja2
import jsonschema
import packaging
import narwhals
import altair
if __name__ == "__main__":
import_packages() |
Exciting to hear this being discussed on the @binste is there any chance this relates to better support for generic types in more recent drafts of A while back I tried exploring what we could do in If there is a |
If it makes it easier for us to generate type hints in Python, we should definitely look at it once we get to that rewrite, thanks for flagging! In general, I think their is some consensus around simplifying the type hierarchy in VL. At least for Altair, I don't think we need |
When we look into this, let's also review SPEC 1 which has recommendations on how to do lazy-loading submodules and functions. |
Will find the link later, but EditIn terms of LOC, not much more than utils._importers and is doing a whole lot more: |
I haven't seen this mentioned anywhere. We also have several modules available at multiple levels, which is making it complicated to resolve #3610 I discovered this yesterday and have written a summary of my findings here: |
@joelostblom @binste #2918 (comment) I think I can see a path forward here. Building on some of the ideas from #3366 (comment) and
|
Currently, the top level of the Altair API is huge, there are 573 entries in the tab completion menu of
alt.<TAB>
. This can make it rather hard to find what you're looking for unless you know the first few character of the name already, and exploring the API from scratch as a new user must be intimidating. Many of these seem unhelpful to have exposed at the top-level, e.g. there are 70 different entries that relate to conditions, but I only usealt.condition
myself. I suggest that the top level API only includes the most relevant classes and functions that we think are helpful on a day to day basis, and the rest go under sub-modules.The Altair API consists of four parts:
I have listed my comments below on how we could reduce these, and I would love to hear what you think @mattijn, @ChristopherDavisUCI, @binste (and anyone else that wants to chime in)
API functions
Introduced specifically by Altair as conveniences and we should keep all of them (although maybe
check_fields_and_encodings
is meant to be a private function based on the name and since there is no docstring?).(just showing the first few)
Top level objects
The are only eight of these and we can probably keep all of them. Alternatively, since all except
Chart
have corresponding API functions and are rarely used directly we could move them to a charts sub domain:alt.charts.HConcatChart
, etc, to make it more clear that the preferred usage is the API methodalt.hconcat
, etc.Encoding channels
There is wasted space here by having each channel repeated three times, e.g.
Angle
,AngleDatum
,AngleValue
. Especially as the most common use of datum and value is by using the top level API functionsalt.datum
andalt.value
. I see to it possible ways of improving the situation:AngleDatum
we haveAngle.datum
.AngleDatum
, we would havealt.datum.Angle
.Maybe the second is lightly more natural in syntax, but it will probably make it confusing since you can use
datum.<colum_name>
in expression strings, so the first would be better for that reason.(just showing the first few)
Low level schema wrappers
This is where we could make the most progress. There are around 400 low level schemer wrappers of which I've only used around 10 myself, most of which are now unnecessary since we added the method based syntax for channel encoding options.
We should identify which of these are just internal helper classes for Altair that end users rarely need to know about. To me it seems like almost everything here could be moved to a submodule, except the classes that end in
Params
(although they could also be moved if we introduce aliases).I don't know what a good name of of this submodule might be: maybe
alt.extras
,alt.schema_wrappers
,alt.misc
,alt.wrappers
, or similar?(just showing the first few)
Again, none of these changes should break old code. For example, if we move
FillDatum
toalt.Fill.datum
, then there should no longer be any tab completion foralt.FillDatum
, but old code that use it should work as it just callsalt.Fill.datum
under the hood. The tab completion should be onalt.Fill.<tab>
instead, which also seems more natural to me.The text was updated successfully, but these errors were encountered: