Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

{cube.sql()} is not supported in YAML/Python #7484

Open
wasd171 opened this issue Dec 1, 2023 · 12 comments
Open

{cube.sql()} is not supported in YAML/Python #7484

wasd171 opened this issue Dec 1, 2023 · 12 comments

Comments

@wasd171
Copy link
Contributor

wasd171 commented Dec 1, 2023

Is your feature request related to a problem? Please describe.
Extending cubes brings great code reusability. However, it is currently not supported for the Python cubes. Look at these 2 cubes:

# cubes/applications.yml
{% set model = dbt_model('applications') %}

cubes:
  - name: "cube_{{ model.name | safe }}"
    sql: >
      SELECT * FROM {{ model.sql_table | safe }}
    public: false

    dimensions:
      {{ model.as_dimensions() }}

    # Model-specific measures
    measures:
      - name: total_count
        type: count
# cubes/applications_sent.yml
cubes:
  - name: cube_applications_sent
    extends: cube_applications
    sql: >
      SELECT * FROM { cube_applications.sql() } WHERE applied_at is not null
    public: false

According to the documentation, this could work, but it gives me the following error:

Error: Compile errors:
Can't parse python expression. Most likely this type of syntax isn't supported yet: Unsupported Python Trailer children node: TrailerContext: ()
    at ErrorReporter.throwIfAny (/cube/node_modules/@cubejs-backend/schema-compiler/src/compiler/ErrorReporter.ts:133:13)
    at DataSchemaCompiler.throwIfAnyErrors (/cube/node_modules/@cubejs-backend/schema-compiler/src/compiler/DataSchemaCompiler.js:213:23)
    at /cube/node_modules/@cubejs-backend/schema-compiler/src/compiler/DataSchemaCompiler.js:108:16
    at CompilerApi.getCompilers (/cube/node_modules/@cubejs-backend/server-core/src/core/CompilerApi.js:66:26)
    at CompilerApi.scheduledPreAggregations (/cube/node_modules/@cubejs-backend/server-core/src/core/CompilerApi.js:189:31)
    at RefreshScheduler.roundRobinRefreshPreAggregationsQueryIterator (/cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:474:38)
    at /cube/node_modules/@cubejs-backend/server-core/src/core/RefreshScheduler.ts:595:12
    at async Promise.all (index 0)

If I modify the applications_sent cube's SQL to SELECT * FROM { cube_applications.sql } WHERE applied_at is not null then I can see that the cube_applications.sql is injected as a JS arrow function () = > query.cubeSql(cube.cubeName()) – obviously Python cannot evaluate it

Describe the solution you'd like
I would like to be able to extend Python cubes the same way as it is possible for the YAML / JS ones

Describe alternatives you've considered
Since I am preparing my data in dbt I can refactor the applications_sent cube to

# cubes/applications_sent.yml
{% set model = dbt_model('applications') %}

cubes:
  - name: cube_applications_sent
    extends: cube_applications
    sql: >
      SELECT * FROM {{ model.sql_table | safe }} WHERE applied_at is not null
    public: false

^ this will work but feels a bit hacky

@igorlukanin igorlukanin changed the title Support extending Python cubes {cube.sql()} is not supported in YAML/Python Dec 1, 2023
@igorlukanin
Copy link
Member

Hey @wasd171 👋 Thanks for reporting this! My understanding is that extends works as it should, the real issue is that {cube.sql()} doesn't work in YAML data models. You can try removing that—everything should just work then. However, I agree that this is something that should be implemented

@wasd171
Copy link
Contributor Author

wasd171 commented Dec 1, 2023

Thanks for a swift answer @igorlukanin

You can try removing that—everything should just work then

But I need some table to select from and to apply my where filter. How could I do it without { cube.sql() }?

@igorlukanin
Copy link
Member

As a workaround, I can recommend to put that table name in a variable, then use it in both cubes. In the meantime, we should definitely make {cube.sql()} work.

@ZiggiZagga
Copy link

ZiggiZagga commented Apr 5, 2024

Any news on {cube.sql()} issue?

@igorlukanin
Copy link
Member

@ZiggiZagga Supporting this is on the roadmap, however, there's currently no estimate as to when this might be done.

@ZiggiZagga
Copy link

ZiggiZagga commented Apr 11, 2024

Hallo @igorlukanin,

I am learning cube and how it works. I have now come across the concept of Data Blending:

I am looking at the example: https://cube.dev/docs/product/data-modeling/concepts/data-blending

cube(`all_sales`, {
  sql: `
    SELECT
      amount,
      user_id AS customer_id,
      created_at,
      'online' AS row_type
    FROM (${online_orders.sql()}) AS online --source cube via literal template
    UNION ALL
    SELECT
      amount,
      customer_id,
      created_at,
      'retail' AS row_type
    FROM (${retail_orders.sql()}) AS retail --source cube via literal template
`,
});

I was wondering isn't it more intuitiv if we had an addition property called "cube_sql" which works as follows

cube(`all_sales`, {
  -- new property "cube_sql"
  cube_sql: `
    SELECT
      amount,
      user_id AS customer_id,
      created_at,
      'online' AS row_type
    FROM online_orders AS online --source cube via cube id
    UNION ALL
    SELECT
      amount,
      customer_id,
      created_at,
      'retail' AS row_type
    FROM retail_orders AS retail --source cube via cube id
`,
});

this way, as a developer i dont need to call cube.sql() because it will be treated behind the scenes. All I need to do is refer to cubes by their identifiers.

What do you think is this possible?

@igorlukanin
Copy link
Member

Hi @ZiggiZagga, thanks for the code example!

Frankly, I don't see it as an improvement over the current situation.

First, it doesn't simplify the definition of data model: instead of {cube.sql()} one would need to write cube_sql. No big win syntax-wise.

Second, it introduces additional ambiguity: without checking which parameter is used, one can never be sure if some_table_or_cube in the SQL is a reference to a SQL table or to a cube.

I also doubt that bare some_table_or_cube references would work reliably at all times: now Cube lets you define cubes with names like from or where; would you like these to be treated as cube references in cube_sql as well?

@steven-luabase
Copy link

Commenting to follow for updates, would love to see this feature work for .yml models too. Current workaround is a series of lengthy CTEs

@haithem-souala
Copy link

+1

@pnadolny13
Copy link

pnadolny13 commented Nov 15, 2024

+1 to this! I just spent a decent amount of time trying to get this to work and posted in slack before finding out its not supported for yaml syntax.

Supporting this is on the roadmap, however, there's currently no estimate as to when this might be done.

@igorlukanin its fine if this isnt a top priority right now, understandable, but I might suggest adding a warning to the docs so users are aware that the data blending feature isnt fully supported for yaml yet. Maybe a link to this issue would be helpful too.

@igorlukanin
Copy link
Member

@pnadolny13 Sorry for the confusion. I have added a couple more links to this issue to the docs.

@pnadolny13
Copy link

Thanks @igorlukanin!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants