filterx: remove fields with empty values #226

jszigetvari · 2024-07-25T09:56:16Z

Add the ability to remove certain fields with empty values:

we need to be able to specify multiple potential null values ("", N/A, NA, None, 0, Default)
alternative names: drop_fields/remove_fields/ ...

#275

jszigetvari · 2024-07-25T15:22:06Z

We should also add different modes of operation, or split the functionality into two functions:

to only set the properties to empty string (to be used with CSV-like formatting)
to properly remove the properties (for KV-like formatting)
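
To make the difference between the two modes concrete, here is a minimal Python sketch (not filterx code). The names `blank_nulls`, `drop_nulls`, `to_csv_line` and `to_kv_line` are hypothetical, and the null markers are simply the ones listed in the issue description. It shows why CSV-like output needs the keys kept, while KV-like output can drop them.

```python
# Null markers from the issue description; purely illustrative, not a filterx API.
NULL_VALUES = {"", "N/A", "NA", "None", "0", "Default"}


def blank_nulls(record):
    """Mode 1: keep every key, replace null-like values with an empty string."""
    return {k: ("" if str(v) in NULL_VALUES else v) for k, v in record.items()}


def drop_nulls(record):
    """Mode 2: remove the keys whose value is null-like."""
    return {k: v for k, v in record.items() if str(v) not in NULL_VALUES}


def to_csv_line(record):
    """CSV-like formatting: only the position identifies a column."""
    return ",".join(str(v) for v in record.values())


def to_kv_line(record):
    """KV-like formatting: every value carries its own key."""
    return " ".join(f"{k}={v}" for k, v in record.items())


record = {"user": "alice", "port": "0", "proto": "N/A", "host": "web1"}

print(to_csv_line(blank_nulls(record)))  # alice,,,web1  (columns stay aligned)
print(to_csv_line(drop_nulls(record)))   # alice,web1    (host shifts into the port column)
print(to_kv_line(drop_nulls(record)))    # user=alice host=web1
```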

MrAnno · 2024-08-02T08:15:04Z

A more general solution would be implementing group_unset() for filterx (probably with a better name), where lists and regexp patterns can be specified.
I think it's not a much bigger task, so we should do this instead.
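
A rough Python sketch of what such a function could look like. The thread does not say whether the lists and patterns would match keys or values; this sketch assumes values, since that is what the null-value use case needs. The signature is hypothetical, not an existing filterx function.

```python
import re


def group_unset(record, values=(), patterns=()):
    """Return a copy of record without the entries whose value matches.

    values:   literal strings to treat as "unset me" (assumed semantics)
    patterns: regular expressions that must match the whole value
    """
    literals = set(values)
    compiled = [re.compile(p) for p in patterns]

    def matches(value):
        text = str(value)
        return text in literals or any(rx.fullmatch(text) for rx in compiled)

    return {k: v for k, v in record.items() if not matches(v)}


record = {"a": "N/A", "b": "-", "c": "ok", "d": "  "}
cleaned = group_unset(record, values=["N/A"], patterns=[r"-+", r"\s*"])
print(cleaned)  # {'c': 'ok'}
```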

For AxoRouter, we can always add an axo-remove-null-values() SCL that parameterizes a group_set call with the known null-values.

alltilla · 2024-08-02T08:27:40Z

I would go even further in the generalization, with something like this:

create kv transformation callback functions, that can only be passed to some other functions, but not called on their own
the callbacks could have a result to either modify the key or the value, or to skip/unset the kv
make our parser functions accept a list of those callbacks, and apply them before adding a value to the dict they are creating
make a map()-like function that accepts a dict and a list of those callbacks and applies them on all the elements

I think this architecture would scale really well with our parsers and transformation logic.
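
The proposal above can be sketched in Python roughly as follows. Everything here is hypothetical (`SKIP`, `apply_callbacks`, `parse_kv`, `map_kv`, and the two example callbacks); it only illustrates the shape of the idea: a callback receives a key and a value and either returns a possibly modified pair or asks for the pair to be skipped, and both a parser and a map()-like function run the same callback list before storing anything.

```python
SKIP = object()  # sentinel a callback returns to drop the key-value pair


def skip_nulls(key, value):
    """Callback: skip pairs whose value looks empty."""
    return SKIP if value in ("", "N/A", "None") else (key, value)


def lowercase_key(key, value):
    """Callback: modify the key, keep the value."""
    return key.lower(), value


def apply_callbacks(key, value, callbacks):
    """Run the callbacks in order; stop as soon as one asks to skip."""
    item = (key, value)
    for callback in callbacks:
        item = callback(*item)
        if item is SKIP:
            return SKIP
    return item


def parse_kv(text, callbacks=()):
    """Toy key=value parser that applies callbacks before adding to the dict."""
    result = {}
    for token in text.split():
        key, _, value = token.partition("=")
        item = apply_callbacks(key, value, callbacks)
        if item is not SKIP:
            new_key, new_value = item
            result[new_key] = new_value
    return result


def map_kv(record, callbacks):
    """map()-like helper: apply the same callbacks to an existing dict."""
    result = {}
    for key, value in record.items():
        item = apply_callbacks(key, value, callbacks)
        if item is not SKIP:
            new_key, new_value = item
            result[new_key] = new_value
    return result


print(parse_kv("User=alice Proto=N/A Port=", [skip_nulls, lowercase_key]))
# {'user': 'alice'}
```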

jszigetvari · 2024-08-02T08:35:07Z

@MrAnno

For AxoRouter, we can always add an axo-remove-null-values() SCL that parameterizes a group_set call with the known null-values.

Well, in that regard we would need a way to specify (maybe through different functions, or through a parameter) whether we need to actually unset the attributes or set them to an empty string. (This is important because of CSV-like data, where the order of values is important.)
On top of that, perhaps there should be a default set of known null/empty values, which the user could extend or override (through an argument) if necessary.
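
As a sketch of that interface in Python: the function name `strip_nulls` and its parameters (`extra_nulls`, `nulls`, `unset`) are made up for illustration, and the default set is just the list from the issue description.

```python
# Assumed default set, copied from the issue description.
DEFAULT_NULLS = frozenset({"", "N/A", "NA", "None", "0", "Default"})


def strip_nulls(record, extra_nulls=(), nulls=None, unset=True):
    """Handle null-like values in record.

    extra_nulls: values added on top of the defaults (extend)
    nulls:       if given, replaces the defaults entirely (override)
    unset:       True removes the key, False sets the value to ""
    """
    base = DEFAULT_NULLS if nulls is None else frozenset(nulls)
    effective = base | frozenset(extra_nulls)
    if unset:
        return {k: v for k, v in record.items() if str(v) not in effective}
    return {k: ("" if str(v) in effective else v) for k, v in record.items()}


record = {"a": "-", "b": "0", "c": "x"}

print(strip_nulls(record))                     # {'a': '-', 'c': 'x'}
print(strip_nulls(record, extra_nulls=["-"]))  # {'c': 'x'}
print(strip_nulls(record, nulls=["-"]))        # {'b': '0', 'c': 'x'}
print(strip_nulls(record, unset=False))        # {'a': '-', 'b': '', 'c': 'x'}
```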

MrAnno · 2024-08-02T10:13:36Z

I don't fully see how this could scale well if we consider our original decision not to allow custom functions/callbacks to be defined in filterx. This means we will have to implement a set of transformation functions and also allow them to be passed to almost all of our functions that "create" new sets of data.

Coupling these transformations with the functions that create new values seems unnecessary to me in a language where we have functional-style building blocks and where we are closed to in-language extensions (defining our own functions).

It seems cleaner to me to provide general enough transformation functions that can work on their own.

that can only be passed to some other functions, but not called on their own

This was my first trigger to think that coupling transformations with the "source of creation" may not be the best idea (both from the user's and the C implementation's perspective).

alltilla · 2024-08-02T10:43:40Z

Sure, we can do everything by having separate functions for iterating through the dict and modifying its values, but it quickly gets resource-intensive if we do it multiple times for the necessary transformations. I thought we could optimize it, but that might introduce some complications in our implementation.
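
The trade-off can be pictured with a small Python sketch (hypothetical helper names, no measurements implied): standalone transformations each walk the dict and build an intermediate copy, while a fused version does the same work in one walk but ties the steps together.

```python
NULLS = {"", "N/A", "None"}


def drop_nulls(record):
    """Standalone step 1: one full iteration, one new dict."""
    return {k: v for k, v in record.items() if v not in NULLS}


def lowercase_keys(record):
    """Standalone step 2: another full iteration, another new dict."""
    return {k.lower(): v for k, v in record.items()}


def separate_passes(record):
    """Composable building blocks; iterates once per transformation."""
    return lowercase_keys(drop_nulls(record))


def fused_pass(record):
    """Same result in a single iteration, at the cost of coupling the steps."""
    return {k.lower(): v for k, v in record.items() if v not in NULLS}


record = {"User": "alice", "Proto": "N/A", "Host": "web1"}
print(separate_passes(record))  # {'user': 'alice', 'host': 'web1'}
print(fused_pass(record))       # {'user': 'alice', 'host': 'web1'}
```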

MrAnno · 2024-08-02T11:44:02Z

Let's measure some filterx performance in real-world use cases, identify the bottlenecks, and do perf-related optimizations on places we are sure are worth the complexity compared to the actual numbers.

I'm not against adding some complexity when the end result is cleaner for the user, but in this case, this coupling seems comfortable, but it's not that clean if we think about it.

pepov added the filterx label Jul 30, 2024

pepov changed the title ~~filterx: unset_empties() improvements~~ filterx: remove fields with empty values Aug 6, 2024

pepov assigned alltilla and bshifter and unassigned alltilla Aug 6, 2024

pepov closed this as completed Sep 30, 2024
