[CT-2612] [Feature] Add Contract to Seeds #7742

Closed
3 tasks done
sanromeo opened this issue May 31, 2023 · 3 comments
Labels
enhancement (New feature or request), wontfix (Not a bug or out of scope for dbt-core)

Comments

@sanromeo

Is this your first time submitting a feature request?

  • I have read the expectations for open source contributors
  • I have searched the existing issues, and I could not find an existing issue for this feature
  • I am requesting a straightforward extension of existing dbt functionality, rather than a Big Idea better suited to a discussion

Describe the feature

This feature would extend the model contracts introduced in dbt Core 1.5 to seeds. Users could define a contract for seed data so that the data must satisfy specific rules or conditions before it is loaded into the database.

Implementing it would involve adding a new contract attribute to the seed configuration in the dbt project file. Users would define their contracts with this attribute, and dbt would validate the seed data against them during the seed operation.

Describe alternatives you've considered

---
version: 2

seeds:
  - name: seed_name
    config:
      schema: seed_schema
      tags: seed
      column_types:
        seed_id: int
        seed_name: varchar(255)
      contract:
        enforced: true

Currently, this configuration returns an error:

10:48:52  Completed with 1 error and 0 warnings:
10:48:52  
10:48:52  Compilation Error in seed seed_name (seeds/ref_sources/seed_name.csv)
10:48:52    expected string or bytes-like object, got 'Undefined'
10:48:52    
10:48:52    > in macro athena__get_empty_schema_sql (macros/adapters/columns.sql)
10:48:52    > called by macro get_empty_schema_sql (macros/adapters/columns.sql)
10:48:52    > called by macro assert_columns_equivalent (macros/materializations/models/table/columns_spec_ddl.sql)
10:48:52    > called by macro default__get_assert_columns_equivalent (macros/materializations/models/table/columns_spec_ddl.sql)
10:48:52    > called by macro get_assert_columns_equivalent (macros/materializations/models/table/columns_spec_ddl.sql)
10:48:52    > called by macro athena__create_table_as (macros/materializations/models/table/create_table_as.sql)
10:48:52    > called by macro create_table_as (macros/materializations/models/table/create_table_as.sql)
10:48:52    > called by macro athena__create_csv_table (macros/materializations/seeds/helpers.sql)
10:48:52    > called by macro create_csv_table (macros/materializations/seeds/helpers.sql)
10:48:52    > called by macro materialization_seed_default (macros/materializations/seeds/seed.sql)
10:48:52    > called by seed seed_name (seeds/ref_sources/seed_name.csv)
10:48:52  

Who will this benefit?

Anyone who wants the same contract functionality for seeds that models have now.

Are you interested in contributing this feature?

No response

Anything else?

No response

@sanromeo added the enhancement and triage labels May 31, 2023
@github-actions bot changed the title from "[Feature] Add Contract to Seeds" to "[CT-2612] [Feature] Add Contract to Seeds" May 31, 2023
@dbeatty10
Contributor

Thanks for raising this proposal @sanromeo !

Is there something about dbt model contracts that you are hoping for dbt seeds that wouldn't be covered by using column_types with those seeds?

@sanromeo
Author

@dbeatty10, thanks for your answer!

I hadn't noticed that I needed to specify data_type when using the contract. Now, it works well when I set data_type in the definition for seeds, as I've written below:

seeds:
  - name: seed_name
    config:
      schema: seed_schema
      tags: seed
      column_types:
        seed_id: int
        seed_name: varchar(255)
      contract:
        enforced: true
    columns:
      - name: seed_id
        data_type: integer
      - name: seed_name
        data_type: varchar

It works both with and without column_types. As the documentation for contracts states:

When enforced, your contract must include every column's name and data_type (where data_type matches one that your data platform understands).

I will be more attentive next time, thank you for your help! 🔥 ❤️
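
The rule quoted above (an enforced contract must give every column a name and a data_type) can be checked before running dbt seed. The following is only a rough sketch and not part of dbt: the function name columns_missing_data_type and the "<no columns declared>" marker are hypothetical, and the two dictionaries are hand-written stand-ins for the parsed YAML from the two configs in this thread. It assumes the properties file has already been loaded into a plain Python dict.

```python
from typing import Any

# Hypothetical stand-in for the first config in this thread (no `columns` block).
FAILING = {
    "seeds": [{
        "name": "seed_name",
        "config": {
            "schema": "seed_schema",
            "tags": "seed",
            "column_types": {"seed_id": "int", "seed_name": "varchar(255)"},
            "contract": {"enforced": True},
        },
    }]
}

# Hypothetical stand-in for the second config (every column has a data_type).
WORKING = {
    "seeds": [{
        "name": "seed_name",
        "config": {"contract": {"enforced": True}},
        "columns": [
            {"name": "seed_id", "data_type": "integer"},
            {"name": "seed_name", "data_type": "varchar"},
        ],
    }]
}


def columns_missing_data_type(properties: dict[str, Any]) -> dict[str, list[str]]:
    """Map each contract-enforced seed to the columns that lack a data_type.

    Illustrative helper only; it does not call or reproduce any dbt API.
    """
    problems: dict[str, list[str]] = {}
    for seed in properties.get("seeds", []):
        config = seed.get("config") or {}
        contract = config.get("contract") or {}
        if not contract.get("enforced", False):
            continue  # the rule only applies when the contract is enforced
        declared = seed.get("columns") or []
        if not declared:
            # No `columns` block at all, so no column has a data_type.
            missing = ["<no columns declared>"]
        else:
            missing = [
                col.get("name", "<unnamed>")
                for col in declared
                if not col.get("data_type")
            ]
        if missing:
            problems[seed.get("name", "<unnamed seed>")] = missing
    return problems


if __name__ == "__main__":
    print(columns_missing_data_type(FAILING))
    print(columns_missing_data_type(WORKING))
```

With these inputs, the first config is flagged because it has no columns block, and the second produces an empty result. The check deliberately ignores column_types, since the report above says the contract works with or without it.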

@dbeatty10
Contributor

You are very welcome @sanromeo ! 🔥 ❤️ 🙌

I'm going to close this, but please reach out if there's any other follow-up needed here.

@dbeatty10 closed this as not planned Jun 30, 2023
@dbeatty10 added the wontfix label and removed the triage label Jun 30, 2023
2 participants