[CT-3096] Refactor selection + dbt list output, for better consistency across node types #8599

Open
jtcohen6 opened this issue Sep 8, 2023 · 1 comment
Labels
  • behavior_change_flag
  • Impact: Exp
  • paper_cut: A small change that impacts lots of users in their day-to-day
  • tech_debt: Behind-the-scenes changes, with little direct impact on end-user functionality

Comments

@jtcohen6
Contributor

jtcohen6 commented Sep 8, 2023

Based on conversation in #8546 / #8589 (comment)

What's the problem?

It's a bit odd that, for the purposes of listing & selection, we so strongly distinguish between:

  1. "Logical" node types (models/seeds/snapshots/tests/analyses), which dbt actually compiles/builds
  2. "Pointer" node types (sources/exposures/semantic_models/metrics), for which dbt just resolves & stores config

(these are my recently invented terms, not officially established anywhere, and by no means set in stone)

Specific behaviors

  • The default output of dbt list includes {node_type}:{node_package}.{node_name} for the "pointer" node types, but then {node_fqn} for the "logical" node types
  • It's possible to select a "pointer" node by its type (e.g. source:*, metric:*), but this is not possible for "logical" nodes
  • The fqn:* selection method only includes "logical" node types

As an example:

$ dbt -q ls --resource-type model semantic_model
jaffle_shop.marts.customers
jaffle_shop.marts.metricflow_time_spine
jaffle_shop.marts.order_items
jaffle_shop.marts.orders
jaffle_shop.staging.stg_customers
jaffle_shop.staging.stg_locations
jaffle_shop.staging.stg_order_items
jaffle_shop.staging.stg_orders
jaffle_shop.staging.stg_products
jaffle_shop.staging.stg_supplies
semantic_model:jaffle_shop.customers
semantic_model:jaffle_shop.locations
semantic_model:jaffle_shop.order_item
semantic_model:jaffle_shop.orders
semantic_model:jaffle_shop.stg_products

The models are displayed by FQN. The semantic_models are displayed as semantic_model:{package}.{name}. The idea is that you could copy-paste or pipe those output strings and use them as selectors. But wouldn't it be better if the format were consistent across node types, so that every line could be used as a selector in the same way?
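To make the inconsistency concrete, here is a minimal Python sketch of the two formatting rules described above. It is an illustration only, not dbt-core's implementation: the function name, the LOGICAL_TYPES set, and the FQN assumed for the semantic model are all hypothetical.

```python
# Illustrative only: these names are hypothetical and not taken from dbt-core.
LOGICAL_TYPES = {"model", "seed", "snapshot", "test", "analysis"}


def current_default_output(node_type, package, name, fqn):
    """Mimic the two output shapes that `dbt list` prints today."""
    if node_type in LOGICAL_TYPES:
        # "Logical" nodes are printed as their bare FQN.
        return ".".join(fqn)
    # "Pointer" nodes are printed as {node_type}:{node_package}.{node_name}.
    return f"{node_type}:{package}.{name}"


# The FQN given to the semantic model here is an assumption for the example.
model_line = current_default_output(
    "model", "jaffle_shop", "customers", ["jaffle_shop", "marts", "customers"]
)
semantic_line = current_default_output(
    "semantic_model", "jaffle_shop", "customers", ["jaffle_shop", "marts", "customers"]
)
print(model_line)
print(semantic_line)
```

The two printed lines have different shapes even though both nodes carry a type, a package, a name, and an FQN, which is the asymmetry this issue is about.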

Why is it like this?

This feels like an organic outgrowth of two differences:

  • "Logical" nodes are more closely tied to files (a model is 1:1 with a file), and their FQNs are therefore more meaningfully linked to their relative file path
  • "Pointer" nodes are YAML config, many to a file. Their FQN matters less than their node type, package/project namespace, and unique identifier. (In the case of sources, the unique identifier is source_name + table name.)

How could this be better?

Some potential acceptance criteria:

  • All node types already have an FQN. So the fqn:* selector should actually include all nodes. Then we wouldn't need this wonky default selector.
  • It should be possible to select all node types using the {node_type}: syntax - either {node_type}:{node_fqn_part}, or {node_type}:* to select all nodes of that type. (Today, the workaround for "select all snapshots" is something like --select config.materialized:snapshot, which is janky and not really documented.)
  • We should support a resource_type:{type} selection method, similar to the --resource-type flag, but as an actual method that can be combined with other selection methods in unions/intersections.
  • The default output of dbt list should change to {node_type}:{node_fqn} for all node types — or the default --output should be unique_id instead of selector.
  • dbt ls should include all resource types by default. The fact that it excludes analyses by default (and only analyses) is pretty confusing.
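As a rough sketch of how the criteria above could fit together, the Python below prints every node as {node_type}:{node_fqn} and accepts the same string shape back as a selector, including {node_type}:*. Everything here is hypothetical: the NODES list, the function names, and the use of glob matching on the dotted FQN are assumptions made for illustration, not dbt-core's data model or selector logic.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for manifest nodes: (node_type, fqn) pairs.
NODES = [
    ("model", ["jaffle_shop", "marts", "customers"]),
    ("model", ["jaffle_shop", "staging", "stg_orders"]),
    ("semantic_model", ["jaffle_shop", "marts", "customers"]),
    ("snapshot", ["jaffle_shop", "snapshots", "orders_snapshot"]),
]


def proposed_output(node_type, fqn):
    """Uniform output shape for every node type: {node_type}:{node_fqn}."""
    return f"{node_type}:{'.'.join(fqn)}"


def select(selector, nodes=NODES):
    """Return the output strings of nodes matching '{node_type}:{pattern}'.

    The pattern is glob-matched against the dotted FQN, so '*' selects
    every node of that type.
    """
    type_part, _, pattern = selector.partition(":")
    return {
        proposed_output(node_type, fqn)
        for node_type, fqn in nodes
        if node_type == type_part and fnmatchcase(".".join(fqn), pattern)
    }


# Each printed line can be pasted back as a selector for exactly that node.
for node_type, fqn in NODES:
    line = proposed_output(node_type, fqn)
    assert select(line) == {line}

# Because results are plain sets, unions and intersections compose directly.
all_snapshots = select("snapshot:*")
models_or_semantic = select("model:*") | select("semantic_model:*")
mart_models = select("model:*") & select("model:jaffle_shop.marts.*")
```

In this sketch, two nodes that share an FQN (the customers model and the customers semantic model) stay distinguishable because the type prefix is part of the output string.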

Out of scope

Unrelated pet peeves about dbt list:

@ChenyuLInx
Contributor

  • All behavior changes will be gated by one flag, selection_include_analysis (name TBD).
  • points:
    • 3
    • 4?
    • don't do it
    • 2
    • 2
