Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Package creating a new table instead of using existing seed #88

Open
4 tasks
Elsayed91 opened this issue Jul 5, 2023 · 0 comments
Open
4 tasks

Package creating a new table instead of using existing seed #88

Elsayed91 opened this issue Jul 5, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@Elsayed91
Copy link

Describe the bug

I was working on a personal project 5 months ago which had dbt, and this package, everything was working well.
Now as I was trying to run the project again, it is not going as well.

Steps to reproduce

      - dbt_datamocktool.unit_test:
          input_mapping:
            source('historical_data', 'yellow_historical'): ref('test_dbt__bi__yellow_m2m')
          expected_output: ref('dbt__bi__yellow_m2m_expected')
          depends_on:
            - ref('seed_zones')
          # tags: ["unit-test"]

test_dbt__bi__yellow_m2m and dbt__bi__yellow_m2m_expected being seeds that have same configuration and are in the same directory and are also loaded to the same bigquery dataset.
the 2 seeds originate from 2 csv files with the same names.

Expected results

A big query job combining the 2 seeds and testing the results of apply a model to one of the input and comparing it to the output.

Actual results

the tool uses only 1 of the two table, the expected_output table, and attempts to create the other table.

/* {"app": "dbt", "dbt_version": "1.5.2", "profile_name": "default", "target_name": "dev", "node_id": "test.nytaxi.dbt_datamocktool_unit_test_dbt__bi__yellow_m2m_ref_seed_zones___ref_dbt__bi__yellow_m2m_expected___ref_test_dbt__bi__yellow_m2m_.ac26a1c36a"} */

    











with a as (

    
select

  

   
    `vendors` 
    
      , 
     
   
    `year` 
    
      , 
     
   
    `month` 
    
      , 
     
   
    `monthly_fare_amount` 
    
      , 
     
   
    `average_passenger_count` 
    
      , 
     
   
    `average_trip_distance` 
    
      , 
     
   
    `monthly_extra_amount` 
    
      , 
     
   
    `monthly_mta_tax` 
    
      , 
     
   
    `monthly_tip_amount` 
    
      , 
     
   
    `monthly_tolls_amount` 
    
      , 
     
   
    `monthly_improvement_surcharge` 
    
      , 
     
   
    `monthly_total_amount` 
    
      , 
     
   
    `monthly_congestion_surcharge` 
    
      , 
     
   
    `pickup_borough` 
    
      , 
     
   
    `pickup_zone` 
    
      , 
     
   
    `dropoff_borough` 
    
      , 
     
   
    `dropoff_zone` 
    
      , 
     
   
    `avg_duration_minutes` 
    
      , 
     
   
    `trip_count` 
    
      , 
     
   
    `total_duration_hours` 
    
      , 
     
   
    `busiest_hour_of_day` 
    
      , 
     
   
    `airport_trip` 
    
      , 
     
   
    `payment_type` 
     
  



from `stellarismusv5`.`models_ut`.`dbt__bi__yellow_m2m_expected`


),

b as (

    
select

  

   
    `vendors` 
    
      , 
     
   
    `year` 
    
      , 
     
   
    `month` 
    
      , 
     
   
    `monthly_fare_amount` 
    
      , 
     
   
    `average_passenger_count` 
    
      , 
     
   
    `average_trip_distance` 
    
      , 
     
   
    `monthly_extra_amount` 
    
      , 
     
   
    `monthly_mta_tax` 
    
      , 
     
   
    `monthly_tip_amount` 
    
      , 
     
   
    `monthly_tolls_amount` 
    
      , 
     
   
    `monthly_improvement_surcharge` 
    
      , 
     
   
    `monthly_total_amount` 
    
      , 
     
   
    `monthly_congestion_surcharge` 
    
      , 
     
   
    `pickup_borough` 
    
      , 
     
   
    `pickup_zone` 
    
      , 
     
   
    `dropoff_borough` 
    
      , 
     
   
    `dropoff_zone` 
    
      , 
     
   
    `avg_duration_minutes` 
    
      , 
     
   
    `trip_count` 
    
      , 
     
   
    `total_duration_hours` 
    
      , 
     
   
    `busiest_hour_of_day` 
    
      , 
     
   
    `airport_trip` 
    
      , 
     
   
    `payment_type` 
     
  



from -- depends_on: `stellarismusv5`.`models_seeds`.`seed_zones`
    
    
    `stellarismusv5`.`models_historical`.`dbt__bi__yellow_m2m_dmt_49567973`


),

a_intersect_b as (

    select * from a
    

    intersect distinct


    select * from b

),

a_except_b as (

    select * from a
    

    except distinct


    select * from b

),

b_except_a as (

    select * from b
    

    except distinct


    select * from a

),

all_records as (

    select
        *,
        true as in_a,
        true as in_b
    from a_intersect_b

    union all

    select
        *,
        true as in_a,
        false as in_b
    from a_except_b

    union all

    select
        *,
        false as in_a,
        true as in_b
    from b_except_a

),

final as (
    
    select * from all_records
    where not (in_a and in_b)
    order by  in_a desc, in_b desc

)

select * from final


dbt__bi__yellow_m2m_dmt_49567973 is not a table that I have created. It is basically an identical copy of the table that is being tested, and it is even created in the same dataset as the original table.

while stellarismusv5.models_ut.dbt__bi__yellow_m2m_expected is the correct seed for expected output.

Screenshots and log output

image image

System information

packages:
  - package: calogica/dbt_date
    version: 0.7.2
  - package: calogica/dbt_expectations
    version: 0.8.2
  - package: elementary-data/elementary
    version: 0.8.2
  - package: dbt-labs/dbt_utils
    version: 1.0.0
  - package: mjirv/dbt_datamocktool
    version: 0.2.0

Which database are you using dbt with?

  • postgres
  • redshift
  • [ x] bigquery
  • snowflake
  • other (specify: ____________)

The output of dbt --version:

1.5.2

Additional context

I checked issues first, and it does not look like this is a common issue, so I'd wager that this is something incorrect on my end.
I have spent the better part of the day fiddling around, trying things out, changing directories, names, nothing works. I cannot seem to
pinpoint the cause, so any hints would be appreciated.

@Elsayed91 Elsayed91 added the bug Something isn't working label Jul 5, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant