-
-
Notifications
You must be signed in to change notification settings - Fork 539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow pandas DataFrame df
and rows
as two sources of rows, make columns optional, and introduce default column config
#3263
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this pull request, @tmlmt!
I've been thinking about it again and again this week. Somehow I'm not a fan of adding even more functionality to the from_pandas
method. Manipulating columns seems a bit unrelated, and could even be useful when creating tables without pandas.
What do you think about extracting a dedicated update_columns
method? It could be called like this:
ui.table.from_pandas(df).update_columns(
{'sortable': True},
age={':format': '(val, _row) => `${val} years`'},
)
A single positional argument controls "default" settings, and keyword arguments control settings per column.
Thanks for the comment @falkoschindler and sorry to have caused you trouble the entire week 😅
I think it would actually be even greater to be able to either pass columns and rows to As you suggest though, we could also introduce a You're the maintainer, so I'll let you decide and I can update this PR accordingly 😊 Let me know what you think. |
Ok, you're starting to convince me that support for pandas (or polars) dataframes should be improved. Calling some column transformation every time the data is updated from a dataframe is tedious.
Let's draft an API for this idea: class Table(FilterElement, component='table.js'):
def __init__(self,
*,
columns: Optional[List[Dict]] = None,
rows: Optional[List[Dict]] = None,
df: Union['pd.DataFrame', 'pl.DataFrame'],
...)
# 1. "normal" table from row and column dictionaries
ui.table(rows=[...], columns=[...], ...)
# 2. table from pandas dataframe
ui.table(df=df)
# 3. table from pandas dataframe, extending given colums dictionaries
ui.table(columns=[...], df=df)
# 4. table from pandas dataframe _and_ row dictionaries raises an exception
ui.table(rows=[...], df=df) This would remove some code (and documentation) duplication by merging different ways to create tables into one. Same holds for With this API you can define some columns using dictionaries and still load data with additional columns from a dataframe. But how to define default attributes that apply to all columns like "sortable"? Maybe we just need an additional # 5. table from pandas dataframe, extending given columns and default column definition
ui.table(default_column={'sortable': True}, columns=[...], df=df) And like table.df = my_updated_df |
I like the API, more or less what I had in mind. A few things:
What's the behavior in this case? Do you merge columns from
I agree. What should be the behavior when:
|
Very good questions...
My intuition was that we copy
Actually, I don't know. And if there isn't an obvious, intuitive behavior and the user can't be sure what will happen when updating this or that, we might be on the wrong track. Maybe we need to treat
Note that these are breaking changes, thus need to wait for the upcoming 2.0.0 release. |
Sounds really good that Quasar can infer columns. I can read here:
I therefore also propose that default_column defaults to All in all that would really fit well: calling I'll start working on it 🙂. I suggest that I introduce all this logic first and then I can work on extending it to polars dataframe in a separate PR. |
ui.table.from_pandas
df
and rows
as two sources of rows, make columns optional, and introduce default column config
Hi @falkoschindler, I've now implemented things on Due to the merging step, we now lose the ability for the table to update automatically when the initially provided lists are changed, i.e: rows = <some list>
columns = <some list>
t = ui.table(columns=columns, rows=rows)
columns = [{'name': 'name', 'field': 'name', 'label': 'name'}]
t.update()
# would add the 'name' column before as a sortable column, but it does not add the sortable property now. Do we need to keep this ability to update a table using update() after changing the initially provided input variables? Or is it fine to have to reassign the columns property ( If we need to:
Let me know what you think about what the user experience should be for updating rows, columns and df |
New implementation of
I've also updated the first post of this PR with a more detailed motivation and description of the proposal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tmlmt! I finally found the time for a thorough code review:
-
Even though I like type annotations and see the benefits of
TypedDict
, I removedColumnsConfigOptions
for now:- It seemed to be using an "old" syntax (https://peps.python.org/pep-0589/#alternative-syntax) and could be rewritten for Python 3.8.
- There was an unused
ColumnsConfig
. - When using
ui.table
, the IDE's type hint popup will only show "ColumnsConfigOptions", which I find less useful than "Dict". So even a static type checker will be more helpful with typed dicts, I don't like type information for the user being "hidden" behind another definition. - Most importantly: This PR is already quite complex. And because we don't really need the typed dict, I chose to remove it in favor for a smaller, more manageable diff.
-
I found the "none" string passed to the frontend rather confusing. Instead I'm inferring columns from the first row in Python, which is just one line and allows to keep the frontend unchanged. Sure, we miss the chance to use Quasar's default behavior. But it's easily reproduced and gives full power to the backend.
There are a few open questions / issue we'll need to address:
-
Do we really need to wait for 2.0 and simply break the API? Or can we provide backward compatibility?
- For now I only made
df
anddefault_column
keyword arguments. This way the old API with positional arguments still works. - The
from_pandas
method could be added back in and marked as deprecated. It should print a deprecation warning like we do, e.g., inget_slot_stack()
. - The new behavior of all columns being sortable by default might be the only remaining breaking change. But we could initialize
default_columns
with an empty dictionary for now and change it to{sortable: True}
in 2.0.
- For now I only made
-
For consistency we should update
ui.aggrid
accordingly. We might want to do it in a separate PR, but if we break anything, we should break both APIs at once in 2.0. -
What bothers me the most is that updating columns (and rows!) doesn't work like before. I think your example from Allow pandas DataFrame
df
androws
as two sources of rows, make columns optional, and introduce default column config #3263 (comment) didn't work before, buttable.column['...'] = ...; table.update()
is broken now. I find it rather critical, because it is hard to spot. So even if our users are aware of all breaking changes, they might miss this one in their project. Using observable collections for_rows
,_columns
(anddefault_column
?) might be a great solution though. We just need to make sure to instantiate it once and update it in the property setters so that we don't loose their reference.
What do you think? There are still some things to do, but it feels like we are on the home stretch. 🙂
Hi @falkoschindler, sorry for my late reply. Thanks for the thorough review, I had in mind to also take care of aggrid and polars, but didn't want to go too far before you had a look and make sure that coding style and so forth was OK. Looks like that was a good idea! I agree with your comments, or at least I understand the motivation of not wanting to introduce any breaking change for the moment. It seems you are already addressing the points you mentioned. Let me know if you'd like me to take care of any piece. |
I just experimented with observable collections, but noticed a fundamental problem: If you pass a list Worked before this PR: table = ui.table(columns=columns, rows=rows)
columns[:] = [{'name': 'name', 'field': 'name', 'label': 'name'}]
table.update() Works after this PR: table = ui.table(columns=columns, rows=rows)
table.columns[:] = [{'name': 'name', 'field': 'name', 'label': 'name'}]
# no update() call This is a subtle but ugly breaking change. A possible solution might be to call both merge functions with every I'll drop the non-working implementation here for reference.```py from typing import Any, Callable, Dict, List, Literal, Optional, Union from typing_extensions import Self from .. import helpers, observables, optional_features try: class Table(FilterElement, component='table.js'):
|
Thanks again for this pull request, @tmlmt!
The idea of moving Because I felt like starting over again, based on all the insights from your PR, implementing these three features one by one without breaking existing code, I created a new PR #3525. It also implements #2633 to increase consistency of the methods for adding, removing and updating rows. It turned out pretty well and got reviewed and merged today. Therefore I'm closing this PR in favor of #3525. It always feels a bit weird to discard valuable contributions like yours, but sometimes we need to try things out to spark new ideas and to see what works best. I hope you understand. Thanks again for your work! |
Hey @falkoschindler thanks a lot for the follow-up. And sorry I haven't continued the work before you did, things have been busy at home and I wouldn't have been able to resume work before October. I still think that a unified way to generate tables regardless of the sources would offer a better DX but I fully understand your intentions here to avoid breaking changes and keep consistency across elements. Thanks a lot for reusing up the ideas and thought process from this PR to create the new one! With |
Motivation
While it's relatively easy to edit an ui.table's column options after creating it from a pandas DataFrame, similar to how it can be done for ui.aggrid options (#947), it is a bit verbose and requires an update of the element.
More over, that method returns a table that cannot be easily updated as the dataframe is changed, unless something like PR is implemented.
Besides, the logic would have to be duplicated to support polars dataframes.
Finally, there is an opportunity to simplify the call for creating a new table from any type of source, while leveraging the embedded feature in Quasar's table element to infer columns from the first row of data.
Proposal
This PR proposes to partially rewrite the table element to: