Submit scala jobs beta #8701

Closed
wants to merge 111 commits into from

@pekapa pekapa commented Sep 24, 2023

Problem

Python models became available in dbt starting with version 1.3. Scala is another very common programming language for submitting Spark jobs. This PR enables Scala models to be built with dbt.

Solution

The structure created for Python models can be extended to also support Scala models.
A Scala parser is not readily available, so a "cheap" version was built that provides only the minimum needed. More complete solutions (using ANTLR, for example) might be desirable in the future.
Since the Python validator has no tests, tests for the Scala validator are also skipped for now. The same goes for type annotations: this PR follows the precedent set by the Python models code.
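
To illustrate what a "cheap" parser of this kind could look like, here is a minimal sketch in Python. It is not the code from this PR. It assumes, purely for illustration, that Scala models reference upstream nodes with `dbt.ref("name")` and `dbt.source("source", "table")` calls using plain string literals, mirroring the Python model API. The function name `extract_dependencies` and both patterns are hypothetical.

```python
import re

# Hypothetical patterns: assume Scala models call dbt.ref("name") and
# dbt.source("source", "table") with plain double-quoted string literals.
REF_PATTERN = re.compile(r'dbt\.ref\(\s*"([^"]+)"\s*\)')
SOURCE_PATTERN = re.compile(r'dbt\.source\(\s*"([^"]+)"\s*,\s*"([^"]+)"\s*\)')


def extract_dependencies(scala_source: str) -> dict:
    """Collect ref and source names found in the text of a Scala model.

    This is a text scan, not a real parse: it does not understand
    comments, string interpolation, or names built at runtime.
    """
    refs = REF_PATTERN.findall(scala_source)        # list of model names
    sources = SOURCE_PATTERN.findall(scala_source)  # list of (source, table) tuples
    return {"refs": refs, "sources": sources}


if __name__ == "__main__":
    example = '''
    val orders = dbt.ref("orders")
    val events = dbt.source("raw", "events")
    '''
    print(extract_dependencies(example))
```

A scan like this would also match calls inside comments or string literals, which is one reason a grammar-based parser might be preferred later.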

Related dbt-spark PR: dbt-labs/dbt-spark/pull/891

Checklist

  • I have read the contributing guide and understand what's expected of me
  • I have run this code in development and it appears to resolve the stated issue
  • This PR includes tests, or tests are not required/relevant for this PR
  • This PR has no interface changes (e.g. macros, cli, logs, json artifacts, config files, adapter interface, etc) or this PR has already received feedback and approval from Product or DX
  • This PR includes type annotations for new and modified functions

aranke and others added 30 commits April 13, 2023 16:31
…labs#7409)

(cherry picked from commit ada8860)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
… (dbt-labs#7392)

(cherry picked from commit 6fedfe0)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
(cherry picked from commit 57e9096)

Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
…est version has been modified (dbt-labs#7439) (dbt-labs#7460)

(cherry picked from commit 2739d5f)

Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
…eRefResolver (dbt-labs#7438) (dbt-labs#7461)

(cherry picked from commit 9874f9e)

Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
* Latest version should use un-suffixed alias

* Latest version can be in un-suffixed file

* FYI when unpinned ref to model with prerelease version

* [WIP] Nicer error if versioned ref to unversioned model

* Revert "Latest version should use un-suffixed alias"

This reverts commit 3616c52.

* Revert "[WIP] Nicer error if versioned ref to unversioned model"

This reverts commit c9ae4af.

* Define real event for UnpinnedRefNewVersionAvailable

* Update pp test for implicit unsuffixed defined_in

* Add changelog entry

* Fix unit test

* marky feedback

* Add test case for UnpinnedRefNewVersionAvailable event

(cherry picked from commit d53bb37)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
…#7535) (dbt-labs#7548)

* Back compat for previous return type of 'collect_freshness'

* Test fixups

* PR feedback
(cherry picked from commit 19d6dab)

Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
…) (dbt-labs#7555)

(cherry picked from commit 5a7b73b)

Co-authored-by: Jeremy Cohen <jeremy@dbtlabs.com>
* Pin protobuf to >=4.0.0

* Changie

(cherry picked from commit d34c511)

Co-authored-by: Gerda Shank <gerda@dbtlabs.com>
…bs#7572)

test statically parsed two-argument ref

(cherry picked from commit 0891aef)

Co-authored-by: Michelle Ark <MichelleArk@users.noreply.github.com>
…t-labs#7543) (dbt-labs#7557)

(cherry picked from commit 40aca4b)

Co-authored-by: Kshitij Aranke <kshitij.aranke@dbtlabs.com>
dbt-labs#7605)

* CT 2510 Throw error for duplicate versioned and non versioned model names (dbt-labs#7577)

* Check for versioned/unversioned duplicates

* Add new exception DuplicateVersionedUnversionedError

* Changie

* Handle packages when finding versioned and unversioned duplicates

(cherry picked from commit 29f2cfc)

* Issue AmbiguousAlias error after DuplicateResourceName
* Exclude some profile fields from Jinja rendering when they are not valid Jinja. (dbt-labs#7630)

* CT-2583: Exclude some profile fields from Jinja rendering.

* CT-2583: Add functional test.

* CT-2583: Change approach to password jinja detection

* CT-2583: Extract string constant and add additional checks

* CT-2583: Improve unit test coverage

* CT-2583: Update changelog entry to reflect new approach

codecov bot commented Sep 26, 2023

Codecov Report

Attention: 148 lines in your changes are missing coverage. Please review.

Comparison is base (417fc2a) 86.65% compared to head (1f34060) 86.16%.
Report is 7 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8701      +/-   ##
==========================================
- Coverage   86.65%   86.16%   -0.49%     
==========================================
  Files         176      176              
  Lines       25674    25842     +168     
==========================================
+ Hits        22247    22268      +21     
- Misses       3427     3574     +147     
Flag Coverage Δ
integration 83.01% <16.85%> (-0.45%) ⬇️
unit 64.80% <12.35%> (-0.35%) ⬇️

Files Coverage Δ
core/dbt/adapters/base/__init__.py 100.00% <ø> (ø)
core/dbt/graph/selector_spec.py 93.54% <100.00%> (ø)
core/dbt/node_types.py 98.07% <100.00%> (+0.03%) ⬆️
core/dbt/parser/read_files.py 84.50% <ø> (ø)
core/dbt/tests/fixtures/project.py 98.29% <100.00%> (ø)
core/setup.py 0.00% <ø> (ø)
core/dbt/compilation.py 95.53% <87.50%> (-0.32%) ⬇️
core/dbt/contracts/files.py 93.69% <0.00%> (-0.86%) ⬇️
core/dbt/parser/base.py 92.76% <33.33%> (-0.82%) ⬇️
core/dbt/context/providers.py 88.60% <40.00%> (-0.35%) ⬇️
... and 2 more


@dataders dataders changed the base branch from main to 1.5.latest September 26, 2023 17:59
@dataders dataders changed the base branch from 1.5.latest to main September 26, 2023 17:59
Contributor

dataders commented Sep 26, 2023

hey @pekapa, first I applaud your ambition and tenacity to tackle this. We all would like to live in a world in which dbt becomes more language agnostic. I actually opened #5796 just over a year ago to start the conversation in that direction. @max-sixty has also done a lot of work considering how it might be done. More recently, I waxed poetic in our newsletter about linguistic relativism, which might also be up your alley.

It's not the answer that you'd like to hear, but I don't think we can prioritize doing this in the short- or medium-term time frame. I'm going to close this PR now because we're not yet in a place to merge it: there are technical, architectural, and design decisions that would need to come first.

That said, we do welcome your help in designing how we will support more data transformation APIs in dbt.

To that end, I'd really appreciate:

  1. opening an "add support for Scala" issue that both this PR and dbt-spark#891 ("Submit scala jobs beta") link to
  2. contributing to #5796 ("dbt & Python: language agnosticism") by adding your input on design considerations
  3. checking out others' contributions toward adding more languages to dbt Core, e.g. #5982 ("A very-WIP implementation of the PRQL plugin, for discussion")
