Some databases have "hidden" columns that can be referenced in a query, but won't show up (by default) in a SELECT *. Oracle calls these "pseudocolumns", and that name seems to have stuck with other databases too. Personally I think "hidden" is a more descriptive name, but 🤷
In #9375 a (pragmatic) hack was added to the bigquery backend to support filtering on `_TABLE_SUFFIX` for partitioned tables. This hack is unfortunate in a few ways:
- The schema of the table expression won't always match the resulting schema (in the common case, the ibis schema has `_TABLE_SUFFIX`, while the result doesn't).
- If a user intentionally tries to select `_TABLE_SUFFIX`, the resulting table still won't include it since we unconditionally drop that column. For example, `t.select("_TABLE_SUFFIX", "a").execute()` will just have `"a"`.
I propose we drop this special case in favor of a generic mechanism. This won't be as convenient for users trying to access _TABLE_SUFFIX, but it will be generic and less of a pain to maintain.
I think the easiest way to do this would be to add a method to `Table` (I'll call it `hidden` here, but I'm not attached to the name; it could also be `pseudo`/`pseudo_col`/`pseudocol`/...).
# When called, this method takes a name and an optional type.
# If no type is given, the type is `unknown` and will require a cast
# to do much with it.
t.filter(t.hidden("_TABLE_SUFFIX", "string") > "abc")

# With no type specified, defaults to unknown
t.filter(t.hidden("_TABLE_SUFFIX").cast("string") > "abc")
One nice thing about this (besides dropping the special casing) is it still allows users to include these columns in the result set if they're explicitly asked for:
expr = t.mutate(t.hidden("_TABLE_SUFFIX"))
expr.to_pandas() # this will include _TABLE_SUFFIX, while currently we don't
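
To make the intended semantics concrete, here is a rough, self-contained sketch of how such a method could behave. Everything in it (`ToyTable`, `HiddenColumn`, `to_sql`) is hypothetical and made up for illustration; it is not ibis code and says nothing about how the real implementation would look. The points it tries to show: hidden columns stay out of the schema by default, the type defaults to `unknown` until cast, and a hidden column only shows up in the result when it is explicitly asked for.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HiddenColumn:
    """Hypothetical reference to a column the backend knows but the schema omits."""

    name: str
    dtype: str = "unknown"

    def cast(self, dtype: str) -> "HiddenColumn":
        # Casting only changes the declared type; the name stays the same.
        return HiddenColumn(self.name, dtype)


@dataclass(frozen=True)
class ToyTable:
    """Hypothetical stand-in for a table expression (NOT the ibis Table class)."""

    name: str
    schema: dict  # visible column name -> type name
    extra: tuple = ()  # hidden columns explicitly added to the output

    def hidden(self, name: str, dtype: str = "unknown") -> HiddenColumn:
        # No validation here: the table cannot know which hidden columns exist.
        return HiddenColumn(name, dtype)

    def mutate(self, col: HiddenColumn) -> "ToyTable":
        # An explicitly requested hidden column becomes part of the schema.
        new_schema = {**self.schema, col.name: col.dtype}
        return ToyTable(self.name, new_schema, self.extra + (col.name,))

    def to_sql(self) -> str:
        # SELECT * never returns hidden columns, so they are named explicitly.
        cols = ", ".join(["*", *self.extra])
        return f"SELECT {cols} FROM {self.name}"


t = ToyTable("events", {"a": "int64"})
expr = t.mutate(t.hidden("_TABLE_SUFFIX", "string"))
print(t.to_sql())
print(expr.to_sql())
```

With this shape the schema of the expression and the schema of the result stay in sync in both directions, which is the property the current special case breaks.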
A few examples of hidden columns: `METADATA` columns, and `_TABLE_SUFFIX` and `_PARTITIONTIME` (docs).
A third drawback listed for the #9375 hack involves `_TABLE_SUFFIX` in all `to_*` methods (#10048).