Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(ingestion/lookml): emit dummy sql condition for lookml custom condition tag #11008

Merged

Conversation

sid-acryl
Copy link
Contributor

@sid-acryl sid-acryl commented Jul 26, 2024

Summary by CodeRabbit

  • New Features

    • Simplified Looker tag rendering by defaulting to a static condition.
    • Introduction of new LookML views for enhanced finance note tracking with dynamic SQL logic.
    • Added explore block for employee salary ratings to improve compensation analysis.
  • Improvements

    • Enhanced SQL query processing with conditional logic for flexible data source selection.
    • Improved handling of SQL queries to ensure completeness and proper formatting.
  • Documentation

    • Updated JSON configurations for better dynamic SQL filtering and tracking of finance notes.
  • Style

    • Streamlined import statements for improved readability and maintainability.

Copy link
Contributor

coderabbitai bot commented Jul 26, 2024

Warning

Rate limit exceeded

@sid-acryl has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 16 minutes and 21 seconds before requesting another review.

How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

Commits

Files that changed from the base of the PR and between 9430d10 and 0a35459.

Walkthrough

The recent changes enhance the Looker integration by refining SQL handling and improving metadata structures. Key updates include shifting from dynamic SQL conditions to static outputs, reorganizing import statements for clarity, and introducing new LookML views for better data querying. These modifications bolster metadata tracking and enhance error handling, facilitating improved data governance and lineage management within the framework.

Changes

Files Change Summary
.../looker/looker_liquid_tag.py Simplified render_to_output to output a static "dummy_value" instead of performing dynamic filter handling.
.../looker/lookml_config.py Restructured import statements for improved readability.
.../integration/lookml/vv-lineage-and-liquid-templates/finance_notes.view.lkml Introduced two new LookML views for finance notes with filters and dimensions.
.../integration/lookml/vv_lineage_liquid_template_golden.json Enhanced SQL logic with conditional data source selection and introduced new metadata aspects.
.../integration/lookml/vv-lineage-and-liquid-templates/data.model.lkml Added new explore block and included employee_salary_rating.view.lkml for salary data insights.
.../integration/lookml/vv-lineage-and-liquid-templates/employee_salary_rating.view.lkml Defined a new LookML view to dynamically select employee data based on specified conditions.
.../looker/view_upstream.py Updated SQL method calls to utilize transformed SQL and introduced a new reporter parameter for error handling.
.../looker/looker_template_language.py Added a new private function _complete_incomplete_sql for improving SQL query handling in templates.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant LookerIntegration
    participant MetadataStore

    User->>LookerIntegration: Request data with conditions
    LookerIntegration->>LookerIntegration: Process SQL with new static logic
    LookerIntegration->>MetadataStore: Store updated metadata
    MetadataStore-->>LookerIntegration: Confirm storage
    LookerIntegration-->>User: Return requested data
Loading

🐰 In the meadow, I leap with glee,
With SQL updates as bright as can be.
No more filters, just a static tune,
Our data sings beneath the moon!
Hooray for changes, let’s all rejoice,
For bunnies and data, we celebrate with voice! 🌼✨


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

Share
Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai generate interesting stats about this repository and render them as a table.
    • @coderabbitai show all the console.log statements in this repository.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (invoked as PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Additionally, you can add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@github-actions github-actions bot added the ingestion PR or Issue related to the ingestion of metadata label Jul 26, 2024
@sid-acryl sid-acryl requested a review from hsheth2 July 26, 2024 15:05
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (10)
metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (1)

98-114: Document the function with examples.

Adding examples of input and output SQL strings in the function docstring would help future maintainers understand the function's behavior.

def _complete_in_complete_sql(raw_view: dict, sql: str) -> str:
    """
    Ensures SQL fragments include necessary SELECT and FROM clauses.

    Args:
        raw_view (dict): The raw view dictionary containing view details.
        sql (str): The SQL fragment to be completed.

    Returns:
        str: The completed SQL string.

    Example:
        Input: "column1, column2"
        Output: "SELECT column1, column2 FROM view_name"
    """
    ...
metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py (2)

317-323: Handle edge cases when retrieving transformed SQL.

The datahub_transformed_sql method should include error handling for cases where the "datahub_transformed_sql" key is missing or its value is malformed. This will prevent potential runtime errors and ensure the method handles edge cases gracefully.

  • File: metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py
  • Lines: 317-323

Suggested Update:
Add error handling in the datahub_transformed_sql method to check if the "datahub_transformed_sql" key exists and its value is valid before returning it.

Analysis chain

Handle edge cases when retrieving transformed SQL.

Ensure that the method handles edge cases, such as missing or malformed SQL strings, gracefully.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Test: Ensure the method handles missing or malformed SQL strings.
rg --type python -A 5 $'datahub_transformed_sql'

Length of output: 78


Script:

#!/bin/bash
# Search for the implementation of the `datahub_transformed_sql` method in Python files
ast-grep --lang python --pattern $'def datahub_transformed_sql(self) -> str: $$$'

# Search for any error handling related to `derived_table` or `datahub_transformed_sql`
rg 'derived_table' -A 5
rg 'datahub_transformed_sql' -A 5

Length of output: 53203


273-287: Handle edge cases when retrieving transformed SQL table names.

The method datahub_transformed_sql_table_name should be reviewed to ensure it handles edge cases, such as missing or malformed SQL table names, gracefully. No direct evidence of handling these edge cases was found in the current codebase.

  • Ensure the method has proper error handling for missing or malformed SQL table names.
  • Consider adding unit tests to cover these scenarios.
Analysis chain

Handle edge cases when retrieving transformed SQL table names.

Ensure that the method handles edge cases, such as missing or malformed SQL table names, gracefully.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Test: Ensure the method handles missing or malformed SQL table names.
rg --type python -A 5 $'datahub_transformed_sql_table_name'

Length of output: 89


Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Search for test cases or invocations of the `datahub_transformed_sql_table_name` method
rg 'datahub_transformed_sql_table_name' -A 10

Length of output: 6659


Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Search for unit tests or error handling logic involving the `datahub_transformed_sql_table_name` method
rg 'datahub_transformed_sql_table_name' -g '*test*.py' -A 10

Length of output: 60


Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Search for any references to the `datahub_transformed_sql_table_name` method across the entire codebase
rg 'datahub_transformed_sql_table_name' -A 10

Length of output: 6659

metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py (2)

262-262: Handle edge cases when retrieving transformed SQL.

The datahub_transformed_sql method does not handle edge cases such as missing or malformed SQL strings. It directly returns the value from the derived_table dictionary, which could lead to errors if the key is missing or the value is malformed. Consider adding checks to handle these scenarios gracefully.

  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py (lines around 262)
Analysis chain

Handle edge cases when retrieving transformed SQL.

Ensure that the method handles edge cases, such as missing or malformed SQL strings, gracefully.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Test: Ensure the method handles missing or malformed SQL strings.
rg --type python -A 5 $'datahub_transformed_sql'

Length of output: 78


Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Test: Ensure the method handles missing or malformed SQL strings.
rg -A 10 'def datahub_transformed_sql' 

Length of output: 2549


481-483: Handle edge cases when retrieving transformed SQL table names.

The method datahub_transformed_sql_table_name does not handle edge cases for malformed SQL table names. It falls back to the sql_table_name method, which retrieves the name but does not validate or handle malformed inputs. Consider adding validation and error handling to ensure robustness.

  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py:
    • datahub_transformed_sql_table_name
    • sql_table_name
    • _get_sql_table_name_field
Analysis chain

Handle edge cases when retrieving transformed SQL table names.

Ensure that the method handles edge cases, such as missing or malformed SQL table names, gracefully.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Test: Ensure the method handles missing or malformed SQL table names.
rg --type python -A 5 $'datahub_transformed_sql_table_name'

Length of output: 89


Script:

#!/bin/bash
# Description: Verify the method handles edge cases gracefully.

# Test: Ensure the method handles missing or malformed SQL table names.
rg --type py -A 5 'datahub_transformed_sql_table_name'

Length of output: 4037


Script:

#!/bin/bash
# Description: Verify how the sql_table_name method handles malformed SQL table names.

# Locate the sql_table_name method definition
rg --type py -A 5 'def sql_table_name'

Length of output: 816


Script:

#!/bin/bash
# Description: Verify how the _get_sql_table_name_field method handles malformed SQL table names.

# Locate the _get_sql_table_name_field method definition
rg --type py -A 5 'def _get_sql_table_name_field'

Length of output: 790

metadata-ingestion/tests/integration/lookml/lookml_mces_badsql_parser.json (5)

1717-1765: Add descriptions for schema fields.

The description fields for the schema fields are empty. Adding descriptions will improve the usability of the metadata.

- "description": "",
+ "description": "Description of the field",

Line range hint 1476-1524:
Add descriptions for schema fields.

The description fields for the schema fields are empty. Adding descriptions will improve the usability of the metadata.

- "description": "",
+ "description": "Description of the field",

Line range hint 1291-1339:
Add descriptions for schema fields.

The description fields for the schema fields are empty. Adding descriptions will improve the usability of the metadata.

- "description": "",
+ "description": "Description of the field",

Line range hint 1100-1148:
Add descriptions for schema fields.

The description fields for the schema fields are empty. Adding descriptions will improve the usability of the metadata.

- "description": "",
+ "description": "Description of the field",

Line range hint 911-959:
Add descriptions for schema fields.

The description fields for the schema fields are empty. Adding descriptions will improve the usability of the metadata.

- "description": "",
+ "description": "Description of the field",
Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 01b3461 and dd02704.

Files selected for processing (17)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py (2 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (3 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py (2 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_config.py (1 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py (3 hunks)
  • metadata-ingestion/tests/integration/lookml/expected_output.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_badsql_parser.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/data.model.lkml (2 hunks)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/finance_notes.view.lkml (1 hunks)
  • metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (3 hunks)
Files skipped from review due to trivial changes (1)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_config.py
Additional comments not posted (38)
metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/data.model.lkml (2)

8-8: Inclusion of finance_notes.view.lkml is approved.

The new include statement adds finance_notes.view.lkml, enhancing the data model by integrating additional financial notes.


25-25: Addition of explore: latest_account_holder_notes is approved.

The new explore block expands the model's functionality, allowing users to access and analyze new data dimensions related to account holder notes.

metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py (1)

28-30: Simplification of render_to_output is approved, but verify the removal of error handling.

The new implementation simplifies the method by removing error handling and dynamic output based on the filter value. Ensure that this change aligns with the intended functionality.

metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/finance_notes.view.lkml (2)

1-35: Definition of latest_account_holder_notes_base view is approved.

The view is well-structured and provides necessary filters and dimensions for querying account holder notes.


37-109: Definition of latest_account_holder_notes view is approved.

The view is well-structured and provides a comprehensive SQL query for retrieving the latest account holder notes.

metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (3)

123-139: Verify the consistency of new keys usage.

Ensure that the new keys datahub_transformed_sql_table_name and datahub_transformed_sql are used consistently throughout the codebase.

Verification successful

The new keys datahub_transformed_sql_table_name and datahub_transformed_sql are used consistently throughout the codebase.

  • view_upstream.py
  • lookml_concept_context.py
  • looker_template_language.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of new keys `datahub_transformed_sql_table_name` and `datahub_transformed_sql`.

# Test: Ensure the new keys are used consistently.
rg --type python -A 5 $'datahub_transformed_sql_table_name|datahub_transformed_sql'

Length of output: 115


Script:

#!/bin/bash
# Description: Verify the usage of new keys `datahub_transformed_sql_table_name` and `datahub_transformed_sql`.

# Test: Ensure the new keys are used consistently.
rg --glob '*.py' -A 5 'datahub_transformed_sql_table_name|datahub_transformed_sql'

Length of output: 7531


98-114: Ensure the regex patterns are correct.

The regex patterns used to detect SELECT and FROM clauses should be thoroughly tested to ensure they cover all possible cases. Additionally, the pattern used in re.sub should be verified for correctness.

Verification successful

The regex patterns used in _complete_in_complete_sql are correct.

The patterns for detecting SELECT and FROM clauses and the substitution pattern for DERIVED_VIEW_PATTERN are appropriate and should function as intended.

  • re.search(r"SELECT\s", sql_query, flags=re.I)
  • re.search(r"FROM\s", sql_query, flags=re.I)
  • re.sub(DERIVED_VIEW_PATTERN, r"\1", sql_query) where DERIVED_VIEW_PATTERN is r"\$\{([^}]*)\}"

These patterns ensure that SQL fragments are correctly completed and cleaned.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the regex patterns used in `_complete_in_complete_sql`.

# Test: Ensure the regex patterns cover all possible cases.
rg --type python -A 5 $'_complete_in_complete_sql'

Length of output: 80


Script:

#!/bin/bash
# Description: Verify the regex patterns used in `_complete_in_complete_sql`.

# Test: Ensure the regex patterns cover all possible cases.
rg --type py -A 5 $'_complete_in_complete_sql'

Length of output: 1163


Script:

#!/bin/bash
# Description: Extract the implementation of `_complete_in_complete_sql` to review regex patterns.

# Extract the function implementation
ast-grep --lang python --pattern $'def _complete_in_complete_sql($_, $_) -> $_:\n    $$$' 

Length of output: 2113


Script:

#!/bin/bash
# Description: Search for the definition of DERIVED_VIEW_PATTERN in the codebase.

# Search for the definition
rg 'DERIVED_VIEW_PATTERN'

Length of output: 706


123-139: Handle edge cases when transforming SQL.

Ensure that the function handles edge cases, such as missing or malformed SQL strings, gracefully.

Verification successful

Edge cases handled for SQL transformation.

The resolve_liquid_variable function includes exception handling for malformed SQL strings, and the _complete_in_complete_sql function ensures SQL fragments are complete by adding missing clauses. These mechanisms should adequately handle edge cases involving missing or malformed SQL strings.

  • resolve_liquid_variable handles LiquidSyntaxError and CustomTagException by logging warnings and returning the original text.
  • _complete_in_complete_sql adds missing SELECT and FROM clauses to SQL fragments.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the function handles edge cases gracefully.

# Test: Ensure the function handles missing or malformed SQL strings.
rg --type python -A 5 $'resolve_liquid_variable_in_view_dict'

Length of output: 91


Script:

#!/bin/bash
# Description: Verify the function handles edge cases gracefully.

# Test: Ensure the function handles missing or malformed SQL strings.
rg --type py -A 5 $'resolve_liquid_variable_in_view_dict'

Length of output: 2035


Script:

#!/bin/bash
# Description: Extract the full implementation of the resolve_liquid_variable_in_view_dict function.

ast-grep --lang python --pattern $'def resolve_liquid_variable_in_view_dict($_, $_):\n    $$$' 

Length of output: 94


Script:

#!/bin/bash
# Description: Extract the full implementation of the resolve_liquid_variable_in_view_dict function.

rg --type py -A 50 $'def resolve_liquid_variable_in_view_dict'

Length of output: 3399


Script:

#!/bin/bash
# Description: Extract the implementations of resolve_liquid_variable and _complete_in_complete_sql functions.

# Search for the resolve_liquid_variable function implementation
rg --type py -A 50 $'def resolve_liquid_variable'

# Search for the _complete_in_complete_sql function implementation
rg --type py -A 50 $'def _complete_in_complete_sql'

Length of output: 14571

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py (4)

309-315: Verify the impact of removing the transformed parameter.

Ensure that the removal of the transformed parameter does not impact other parts of the codebase.

Verification successful

The previous script failed due to an unrecognized file type. To address this, we need to correct the file type and rerun the script. The correct file type for Python in rg is py.


Verified: The removal of the transformed parameter does not impact other parts of the codebase.

The rg search results confirm that there are no references to the transformed parameter in the sql method across the codebase.

  • The sql method in the provided snippet and other parts of the codebase does not include a transformed parameter.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the impact of removing the `transformed` parameter.

# Test: Ensure the removal of the `transformed` parameter does not impact other parts of the codebase.
rg --type python -A 5 $'sql\(self\)'

Length of output: 68


Script:

#!/bin/bash
# Description: Verify the impact of removing the `transformed` parameter.

# Test: Ensure the removal of the `transformed` parameter does not impact other parts of the codebase.
rg --type py -A 5 $'sql\(self\)'

Length of output: 2133


273-287: Verify the consistency of method usage.

Ensure that the datahub_transformed_sql_table_name method is used consistently throughout the codebase.


317-323: Verify the consistency of method usage.

Ensure that the datahub_transformed_sql method is used consistently throughout the codebase.

Verification successful

The datahub_transformed_sql method is used consistently.

The method datahub_transformed_sql is utilized in multiple files across the codebase, confirming its consistent usage.

  • metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the consistency of `datahub_transformed_sql` method usage.

# Test: Ensure the method is used consistently.
rg --type python -A 5 $'datahub_transformed_sql'

Length of output: 78


Script:

#!/bin/bash
# Description: Verify the consistency of `datahub_transformed_sql` method usage.

# Test: Ensure the method is used consistently.
rg -A 5 'datahub_transformed_sql'

Length of output: 7480


309-315: Handle edge cases when retrieving SQL.

Ensure that the method handles edge cases, such as missing or malformed SQL strings, gracefully.

metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py (4)

262-262: Verify the correctness and consistency of the new method usage.

Ensure that the datahub_transformed_sql method is used correctly and consistently.

Verification successful

The datahub_transformed_sql method is used correctly and consistently.

The datahub_transformed_sql method is integrated consistently across the codebase, ensuring its correctness in various contexts.

  • metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness and consistency of `datahub_transformed_sql` method usage.

# Test: Ensure the method is used correctly and consistently.
rg --type python -A 5 $'datahub_transformed_sql'

Length of output: 78


Script:

#!/bin/bash
# Description: Verify the correctness and consistency of `datahub_transformed_sql` method usage.

# Test: Ensure the method is used correctly and consistently.
rg --glob '*.py' -A 5 'datahub_transformed_sql'

Length of output: 7494


481-483: Verify the correctness and consistency of the new method usage.

Ensure that the datahub_transformed_sql_table_name method is used correctly and consistently.

Verification successful

The datahub_transformed_sql_table_name method is used correctly and consistently.

The method is defined and used appropriately across the codebase, ensuring it returns a valid table name.

  • metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness and consistency of `datahub_transformed_sql_table_name` method usage.

# Test: Ensure the method is used correctly and consistently.
rg --type python -A 5 $'datahub_transformed_sql_table_name'

Length of output: 89


Script:

#!/bin/bash
# Description: Verify the correctness and consistency of `datahub_transformed_sql_table_name` method usage.

# Test: Ensure the method is used correctly and consistently.
rg --type py -A 5 $'datahub_transformed_sql_table_name'

Length of output: 4037


535-538: Verify the correctness and consistency of the new method usage.

Ensure that the datahub_transformed_sql_table_name method is used correctly and consistently.

Verification successful

Verified the correctness and consistency of the new method usage.

The datahub_transformed_sql_table_name method is used correctly and consistently across the codebase.

  • Defined in metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py.
  • Used in metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py.
  • Used in metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness and consistency of `datahub_transformed_sql_table_name` method usage.

# Test: Ensure the method is used correctly and consistently.
rg --type python -A 5 $'datahub_transformed_sql_table_name'

Length of output: 89


Script:

#!/bin/bash
# Description: Verify the correctness and consistency of `datahub_transformed_sql_table_name` method usage.

# Test: Ensure the method is used correctly and consistently.
rg --type py -A 5 'datahub_transformed_sql_table_name'

Length of output: 4037


535-538: Handle edge cases when retrieving transformed SQL table names.

Ensure that the method handles edge cases, such as missing or malformed SQL table names, gracefully.

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1)

204-206: LGTM! But verify the function usage in the codebase.

The code changes are approved.

However, ensure that all function calls to view_context.sql match the new usage.

Verification successful

Verification successful!

All function calls to view_context.sql match the new usage pattern.

  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py:204
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify all function calls to `view_context.sql` match the new usage.

# Test: Search for the function usage. Expect: Only occurances of the new usage.
rg --type python -A 5 $'view_context.sql'

Length of output: 71


Script:

#!/bin/bash
# Description: Verify all function calls to `view_context.sql` match the new usage.

# Test: Search for the function usage. Expect: Only occurrences of the new usage.
rg --type py -A 5 $'view_context.sql'

Length of output: 691

metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (5)

1303-1320: LGTM!

The changes to the latest_account_holder_notes_base entity are approved.


1620-1635: LGTM!

The changes to the latest_account_holder_notes entity are approved.


Line range hint 305-320:
LGTM!

The changes to the employee_income_source entity are approved.


1707-1722: LGTM!

The changes to the employee_total_income entity are approved.


1655-1670: LGTM!

The changes to the employee_tax_report entity are approved.

metadata-ingestion/tests/integration/lookml/expected_output.json (1)

1635-1635: LGTM! The dynamic condition enhances flexibility.

The SQL query modification to use a dynamic LookML condition for order_region improves the query's adaptability, allowing it to filter based on different regions dynamically.

metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json (1)

1635-1635: Dynamic SQL condition improves flexibility.

The change from a static condition to a dynamic condition in the SQL query enhances the configurability and adaptability of the query. This allows the query to dynamically filter based on the order_region parameter.

metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json (1)

1635-1635: Dynamic SQL condition enhances flexibility.

The modification to use LookML syntax {% condition order_region %} order.region {% endcondition %} in the viewLogic field allows for dynamic filtering based on the order_region. This change improves the flexibility of the query by enabling it to adapt to different input parameters.

metadata-ingestion/tests/integration/lookml/lookml_mces_badsql_parser.json (5)

1679-1714: LGTM!

The JSON structure for UpstreamLineage is correct and provides detailed lineage information.


Line range hint 1458-1474:
LGTM!

The JSON structure for UpstreamLineage is correct and provides detailed lineage information.


Line range hint 1273-1289:
LGTM!

The JSON structure for UpstreamLineage is correct and provides detailed lineage information.


Line range hint 1082-1098:
LGTM!

The JSON structure for UpstreamLineage is correct and provides detailed lineage information.


Line range hint 893-909:
LGTM!

The JSON structure for UpstreamLineage is correct and provides detailed lineage information.

metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json (2)

1679-1714: LGTM! The UpstreamLineage section is well-structured and detailed.

The UpstreamLineage section provides a clear lineage context for data processing, enhancing traceability of data sources.


1717-1766: LGTM! The SchemaMetadata section is comprehensive and well-defined.

The SchemaMetadata section includes detailed attributes and field definitions, improving the clarity of the data model.

metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json (2)

1716-1766: LGTM! Ensure the correctness of the SchemaMetadata structure.

The SchemaMetadata structure enhances the descriptive capabilities of the dataset and ensures better integration with data lineage tracking. Verify the correctness of the data types and structure.


1678-1714: LGTM! Ensure the correctness of the UpstreamLineage structure.

The UpstreamLineage structure enhances data lineage tracking by providing detailed lineage information, including audit timestamps, dataset URIs, and types. Verify the correctness of the data types and structure.

Verification successful

The UpstreamLineage structure is correctly included and consistent with the schema definitions.

The structure is present in the test file metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json and matches the definitions in the schema files.

  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness of the `UpstreamLineage` structure.

# Test: Check for the presence of `UpstreamLineage` in the schema definition.
rg --type json -A 5 $'"UpstreamLineage"'

Length of output: 3656


Script:

#!/bin/bash
# Description: Compare the `UpstreamLineage` structure in the test file with the definitions in the schema files.

# Extract `UpstreamLineage` structure from the test file.
test_structure=$(rg --type json -A 20 '"UpstreamLineage"' metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json)

# Extract `UpstreamLineage` structure from the schema files.
schema_files=$(fd --type f --name '*.json' metadata-service/restli-api/src/main/snapshot/)
schema_structures=$(rg --type json -A 20 '"UpstreamLineage"' $schema_files)

# Print the extracted structures for comparison.
echo "Test File Structure:"
echo "$test_structure"
echo
echo "Schema File Structures:"
echo "$schema_structures"

Length of output: 39020


Script:

#!/bin/bash
# Description: Compare the `UpstreamLineage` structure in the test file with the definitions in the schema files.

# Extract `UpstreamLineage` structure from the test file.
test_structure=$(rg --type json -A 20 '"UpstreamLineage"' metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json)

# Extract `UpstreamLineage` structure from the schema files.
schema_files=$(fd . --type f -e json metadata-service/restli-api/src/main/snapshot/)
schema_structures=$(rg --type json -A 20 '"UpstreamLineage"' $schema_files)

# Print the extracted structures for comparison.
echo "Test File Structure:"
echo "$test_structure"
echo
echo "Schema File Structures:"
echo "$schema_structures"

Length of output: 40598


Script:

#!/bin/bash
# Description: List lines around the expected location of the `UpstreamLineage` structure in the test file.

# List lines around the expected location of the `UpstreamLineage` structure.
rg --type json -A 20 -B 20 '"UpstreamLineage"' metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json

Length of output: 135


Script:

#!/bin/bash
# Description: List the contents of the test file to manually verify the presence of the `UpstreamLineage` structure and other relevant elements.

# List the contents of the test file.
cat metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json

Length of output: 110561

metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json (2)

1659-1659: The change enhances query flexibility.

Replacing the hardcoded condition with a dynamic condition using templating syntax improves the adaptability of the query to different contexts or inputs.


1659-1659: Verify the correctness and usage of the templating syntax.

Ensure that the templating syntax {% condition order_region %} order.region {% endcondition %} is correctly implemented and used throughout the codebase.

metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json (2)

1686-1723: LGTM! Verify the correctness of the lineage data.

The lineage data for the customer_facts dataset appears to be well-formed and includes detailed information about upstream datasets and fields.

However, ensure that the lineage data is correct and accurately represents the relationships between datasets and fields.

Verification successful

Verified: Lineage data for customer_facts is correctly represented.

The lineage data for the customer_facts dataset includes detailed information about upstream datasets and fields, and the relationships are accurately represented.

  • Files containing lineage data:
    • metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json
    • metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json
    • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json
    • metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json
    • metadata-ingestion/tests/integration/lookml/lookml_mces_badsql_parser.json
    • metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json
    • metadata-ingestion/tests/integration/lookml/expected_output.json
    • metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness of the lineage data for `customer_facts`.

# Test: Search for the lineage data. Expect: Correct relationships between datasets and fields.
rg --type json -A 10 $'"urn:li:dataset:(urn:li:dataPlatform:looker,lkml_samples.view.customer_facts,PROD)"'

Length of output: 108


Script:

#!/bin/bash
# Description: Verify the correctness of the lineage data for `customer_facts`.

# Test: Search for the lineage data. Expect: Correct relationships between datasets and fields.
rg --type json -A 20 'customer_facts'

Length of output: 135820


1724-1774: LGTM! Verify the correctness of the schema metadata.

The schema metadata for the customer_facts dataset appears to be well-formed and includes detailed information about fields and their properties.

However, ensure that the schema metadata is correct and accurately represents the schema of the customer_facts dataset.

Verification successful

Schema Metadata Verification Successful

The schema metadata for the customer_facts dataset is correct and accurately represents the schema. The fields customer_id and lifetime_spend are consistently defined across different files with the expected properties and types.

  • metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json
  • metadata-ingestion/tests/integration/lookml/expected_output.json
  • metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_badsql_parser.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the correctness of the schema metadata for `customer_facts`.

# Test: Search for the schema metadata. Expect: Correct field properties and types.
rg --type json -A 10 $'"schemaName": "customer_facts"'

Length of output: 10641


Script:

#!/bin/bash
# Description: Extract the full schema metadata for `customer_facts` and verify field properties and types.

# Extract schema metadata for `customer_facts` from relevant JSON files
rg --type json -A 50 $'"schemaName": "customer_facts"' metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json

Length of output: 2427


Script:

#!/bin/bash
# Description: Search for the definition or documentation of the `customer_facts` schema to validate the correctness of the extracted schema metadata.

# Search for `customer_facts` schema definition or documentation
rg --type json -A 10 'customer_facts'

Length of output: 90530

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 01b3461 and dd02704.

Files selected for processing (17)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py (2 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (3 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py (2 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_config.py (1 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1 hunks)
  • metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py (3 hunks)
  • metadata-ingestion/tests/integration/lookml/expected_output.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_badsql_parser.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json (1 hunks)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/data.model.lkml (2 hunks)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/finance_notes.view.lkml (1 hunks)
  • metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (3 hunks)
Files skipped from review due to trivial changes (1)
  • metadata-ingestion/src/datahub/ingestion/source/looker/lookml_config.py
Additional comments not posted (73)
metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/data.model.lkml (2)

8-8: LGTM!

The inclusion of finance_notes.view.lkml aligns with the objective of enhancing the data model.


Line range hint 25-27:
LGTM!

The explore declaration for latest_account_holder_notes aligns with the objective of enhancing the data model.

metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py (1)

28-30: Verify the implications of the change.

The method now writes a static string "1=1" to the output buffer, removing error handling and dynamic content generation. Ensure this change does not negatively impact the functionality.

Consider adding a comment to explain the rationale behind this change.

metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/finance_notes.view.lkml (2)

1-35: LGTM!

The latest_account_holder_notes_base view is well-defined and aligns with the objective of enhancing the data model.


37-109: LGTM! Verify the SQL logic.

The latest_account_holder_notes view is well-defined and aligns with the objective of enhancing the data model. Ensure the SQL logic is correct and efficient.

metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (2)

124-139: LGTM! But verify the use of remove_extra_spaces_and_newlines and _complete_in_complete_sql.

The function updates the view dictionary with transformed SQL fields. The changes look correct.

However, ensure that remove_extra_spaces_and_newlines is necessary and that _complete_in_complete_sql is correctly handling the transformed SQL.

Verification successful

LGTM! But verify the use of remove_extra_spaces_and_newlines and _complete_in_complete_sql.

The function updates the view dictionary with transformed SQL fields. The changes look correct.

The use of remove_extra_spaces_and_newlines is necessary to ensure that extra spaces and newlines are removed from the sql_table_name, preventing issues in URNs. The function _complete_in_complete_sql is correctly handling the transformed SQL by ensuring that necessary SQL clauses are present.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the use of `remove_extra_spaces_and_newlines` and `_complete_in_complete_sql`.

# Test: Search for the usage of `resolve_liquid_variable_in_view_dict` and check the context.
rg --type python -A 5 $'resolve_liquid_variable_in_view_dict'

Length of output: 91


Script:

#!/bin/bash
# Search for the definition and usage of `remove_extra_spaces_and_newlines` in the codebase.
rg 'def remove_extra_spaces_and_newlines' -A 10
rg 'remove_extra_spaces_and_newlines' -A 5

# Search for the definition and usage of `_complete_in_complete_sql` in the codebase.
rg 'def _complete_in_complete_sql' -A 10
rg '_complete_in_complete_sql' -A 5

Length of output: 6017


98-114: LGTM! But verify the regex patterns and raw_view['name'].

The function ensures SQL completeness by adding missing SELECT and FROM clauses. The regex patterns and the use of raw_view['name'] look correct.

However, ensure that the regex patterns match all possible variations of SELECT and FROM clauses and that raw_view['name'] is always available and correctly formatted.

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py (5)

273-287: LGTM! But verify the use of raw_view['datahub_transformed_sql_table_name'] and sql_table_name.

The function retrieves the datahub_transformed_sql_table_name from raw_view and, if not present, falls back to invoking the existing sql_table_name method. The changes look correct.

However, ensure that raw_view['datahub_transformed_sql_table_name'] is always available and correctly formatted, and that the call to sql_table_name is necessary.

Verification successful

LGTM! But verify the use of raw_view['datahub_transformed_sql_table_name'] and sql_table_name.

The function retrieves the datahub_transformed_sql_table_name from raw_view and, if not present, falls back to invoking the existing sql_table_name method. The changes look correct.

However, ensure that raw_view['datahub_transformed_sql_table_name'] is always available and correctly formatted, and that the call to sql_table_name is necessary.

  • The datahub_transformed_sql_table_name method is correctly implemented.
  • The sql_table_name method is correctly called if datahub_transformed_sql_table_name is not found.
  • Both methods handle the necessary transformations and cleanups.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the use of `raw_view['datahub_transformed_sql_table_name']` and `sql_table_name`.

# Test: Search for the usage of `datahub_transformed_sql_table_name` and check the context.
rg --type python -A 5 $'datahub_transformed_sql_table_name'

Length of output: 89


Script:

#!/bin/bash
# Description: Verify the use of `raw_view['datahub_transformed_sql_table_name']` and `sql_table_name`.

# Test: Search for the usage of `raw_view['datahub_transformed_sql_table_name']` and check the context.
rg --type py -A 5 $'raw_view\\[\'datahub_transformed_sql_table_name\'\\]'

# Test: Search for the usage of `sql_table_name` and check the context.
rg --type py -A 5 $'sql_table_name'

Length of output: 31506


Line range hint 481-483:
LGTM! But verify the use of datahub_transformed_sql_table_name.

The function has been updated to use datahub_transformed_sql_table_name. The changes look correct.

However, ensure that datahub_transformed_sql_table_name is always available and correctly formatted.

Verification successful

LGTM! The function _get_upstream_dataset_urn is correctly using datahub_transformed_sql_table_name.

The changes look correct and the usage of datahub_transformed_sql_table_name is appropriate.

  • metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py: Verified the usage of _get_upstream_dataset_urn.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the use of `datahub_transformed_sql_table_name`.

# Test: Search for the usage of `_get_upstream_dataset_urn` and check the context.
rg --type python -A 5 $'_get_upstream_dataset_urn'

Length of output: 80


Script:

#!/bin/bash
# Description: Verify the use of `datahub_transformed_sql_table_name`.

# Test: Search for the usage of `_get_upstream_dataset_urn` and check the context.
rg --type py -A 5 '_get_upstream_dataset_urn'

Length of output: 12345


317-323: LGTM! But verify the use of raw_view['datahub_transformed_sql'].

The function returns the datahub_transformed_sql from the derived table. The changes look correct.

However, ensure that raw_view['datahub_transformed_sql'] is always available and correctly formatted.

Verification successful

LGTM! The use of raw_view['datahub_transformed_sql'] is verified.

The datahub_transformed_sql is correctly accessed from the derived table and is used appropriately in the codebase.

  • looker_template_language.py: Ensures datahub_transformed_sql is set in the derived table.
  • view_upstream.py: Utilizes datahub_transformed_sql in various contexts.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the use of `raw_view['datahub_transformed_sql']`.

# Test: Search for the usage of `datahub_transformed_sql` and check the context.
rg --type python -A 5 $'datahub_transformed_sql'

Length of output: 78


Script:

#!/bin/bash
# Description: Verify the use of `raw_view['datahub_transformed_sql']`.

# Test: Search for the usage of `datahub_transformed_sql` and check the context.
rg --glob '*.py' -A 5 'datahub_transformed_sql'

Length of output: 7494


309-315: LGTM! But verify the removal of the transformed parameter.

The function has been simplified by removing the transformed parameter. The method now directly returns the SQL query from the derived table. The changes look correct.

However, ensure that the removal of the transformed parameter does not impact other parts of the codebase.

Verification successful

Verified: The removal of the transformed parameter from the sql function does not impact other parts of the codebase.

The sql function is used correctly without the transformed parameter, and the term transformed is present in different contexts unrelated to the sql function.

  • No issues found with the removal of the transformed parameter.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the removal of the `transformed` parameter.

# Test: Search for the usage of `sql` and check the context.
rg --type python -A 5 $'sql'

Length of output: 58


Script:

#!/bin/bash
# Description: Verify the removal of the `transformed` parameter.

# Test: Search for the definition of the `sql` function and its usage in the codebase.
rg 'def sql' -A 5
rg 'sql(' -A 5

Length of output: 7108


Script:

#!/bin/bash
# Description: Verify the removal of the `transformed` parameter.

# Test: Search for the usage of the `sql` function in the lookml_concept_context.py file.
rg 'sql(' metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py -A 5

Length of output: 156


Script:

#!/bin/bash
# Description: Verify the removal of the `transformed` parameter.

# Test: Search for the usage of the `sql` function and the term `transformed` in the lookml_concept_context.py file.
rg 'sql\(' metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py -A 5
rg 'transformed' metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py -A 5

Length of output: 1849


269-271: LGTM! But verify the use of raw_view[NAME].

The function ensures that if the sql_table_name field is not set, it directly assigns the lowercased raw_view[NAME] to sql_table_name. The changes look correct.

However, ensure that raw_view[NAME] is always available and correctly formatted.

Verification successful

LGTM! The usage of raw_view[NAME] is valid and consistent.

The self.raw_view is properly initialized, and NAME is used correctly as a key to access elements within self.raw_view. The changes look correct.

  • The self.raw_view is initialized with the raw_view parameter.
  • The NAME constant is used consistently to access elements within self.raw_view.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the use of `raw_view[NAME]`.

# Test: Search for the usage of `sql_table_name` and check the context.
rg --type python -A 5 $'sql_table_name'

Length of output: 69


Script:

#!/bin/bash
# Description: Verify the use of `raw_view[NAME]`.

# Test: Search for the usage of `sql_table_name` and check the context.
rg --type py -A 5 $'sql_table_name'

Length of output: 31429


Script:

#!/bin/bash
# Description: Verify the initialization and usage of `self.raw_view` and `NAME`.

# Test: Search for the initialization of `self.raw_view` and usage of `NAME` in `lookml_concept_context.py`.
rg --type py 'self.raw_view' -A 5
rg --type py 'NAME' -A 5

Length of output: 335557

metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py (2)

481-483: LGTM! But verify the use of datahub_transformed_sql_table_name.

The function has been updated to use datahub_transformed_sql_table_name. The changes look correct.

However, ensure that datahub_transformed_sql_table_name is always available and correctly formatted.

Verification successful

LGTM! But verify the use of datahub_transformed_sql_table_name.

The function has been updated to use datahub_transformed_sql_table_name. The changes look correct.

However, ensure that datahub_transformed_sql_table_name is always available and correctly formatted.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the use of `datahub_transformed_sql_table_name`.

# Test: Search for the usage of `_get_upstream_dataset_urn` and check the context.
rg --type python -A 5 $'_get_upstream_dataset_urn'

Length of output: 80


Script:

#!/bin/bash
# Description: Verify the use of `_get_upstream_dataset_urn` and `datahub_transformed_sql_table_name`.

# Test 1: Search for the usage of `_get_upstream_dataset_urn` and check the context.
rg --type py -A 5 '_get_upstream_dataset_urn'

# Test 2: Search for the definition and usage of `datahub_transformed_sql_table_name`.
rg --type py 'datahub_transformed_sql_table_name'

Length of output: 13477


262-262: LGTM! But verify the use of datahub_transformed_sql.

The function has been updated to use datahub_transformed_sql. The changes look correct.

However, ensure that datahub_transformed_sql is always available and correctly formatted.

Verification successful

The datahub_transformed_sql method is correctly defined and returns the expected SQL string from the derived_table.

The method is implemented as follows:

  • It retrieves the derived_table and returns the value associated with the "datahub_transformed_sql" key.
  • This method is used in the view_upstream.py file as expected.

The changes look correct and the method is properly implemented and used.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the use of `datahub_transformed_sql`.

# Test: Search for the usage of `__get_spr` and check the context.
rg --type python -A 5 $'__get_spr'

Length of output: 64


Script:

#!/bin/bash
# Description: Verify the use of `datahub_transformed_sql`.

# Test: Search for the usage of `__get_spr` and check the context.
rg --type py -A 5 $'__get_spr'

# Test: Search for the definition and usage of `datahub_transformed_sql`.
rg --type py $'datahub_transformed_sql'

Length of output: 3203


Script:

#!/bin/bash
# Description: Verify the definition of `datahub_transformed_sql`.

# Test: Search for the definition of `datahub_transformed_sql` method.
rg --type py -A 10 $'def datahub_transformed_sql' metadata-ingestion/src/datahub/ingestion/source/looker/lookml_concept_context.py

Length of output: 858

metadata-ingestion/src/datahub/ingestion/source/looker/lookml_source.py (1)

204-204: Ensure the change in SQL handling logic is intentional.

The call to view_context.sql() without the transformed argument might change the SQL logic being processed. Verify that this change is intentional and does not affect the expected behavior.

metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (39)

1303-1320: New aspect added: subTypes.

The subTypes aspect has been added to the dataset. Ensure that the type names are correct and relevant.


1321-1338: New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


1339-1354: New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


1355-1593: New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


1594-1617: New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


1620-1635: New aspect added: subTypes.

The subTypes aspect has been added to the dataset. Ensure that the type names are correct and relevant.


1636-1653: New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


1654-1669: New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


1670-1705: New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


1706-1729: New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


Line range hint 305-320:
New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


Line range hint 321-336:
New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


Line range hint 337-370:
New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


Line range hint 371-394:
New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


Line range hint 395-410:
New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


Line range hint 411-426:
New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


Line range hint 427-460:
New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


Line range hint 461-484:
New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


Line range hint 485-500:
New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


Line range hint 501-516:
New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


Line range hint 517-550:
New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


Line range hint 551-574:
New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


Line range hint 575-590:
New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


Line range hint 591-606:
New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


Line range hint 607-640:
New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


Line range hint 641-664:
New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


1303-1320: New aspect added: subTypes.

The subTypes aspect has been added to the dataset. Ensure that the type names are correct and relevant.


1321-1338: New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


1339-1354: New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


1355-1593: New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


1594-1617: New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


1620-1635: New aspect added: subTypes.

The subTypes aspect has been added to the dataset. Ensure that the type names are correct and relevant.


1636-1653: New aspect added: viewProperties.

The viewProperties aspect includes detailed SQL logic for the dataset. Ensure that the SQL logic is correct and optimized for performance.


1654-1669: New aspect added: container.

The container aspect specifies the container for the dataset. Ensure that the container URN is correct and relevant.


1670-1705: New aspect added: UpstreamLineage.

The UpstreamLineage aspect includes upstream datasets and fine-grained lineages. Ensure that the lineage information is accurate and complete.


1706-1729: New aspect added: browsePathsV2.

The browsePathsV2 aspect includes browse paths for the dataset. Ensure that the browse paths are correct and relevant.


Line range hint 1729-1736:
New aspect added: tagKey for Dimension.

The tagKey aspect has been added to the tag. Ensure that the tag name is correct and relevant.


Line range hint 1737-1744:
New aspect added: tagKey for Measure.

The tagKey aspect has been added to the tag. Ensure that the tag name is correct and relevant.


Line range hint 1745-1752:
New aspect added: tagKey for Temporal.

The tagKey aspect has been added to the tag. Ensure that the tag name is correct and relevant.

metadata-ingestion/tests/integration/lookml/expected_output.json (1)

1635-1635: Verify the correctness of the dynamic condition in the SQL query.

The static condition order.region = 'ap-south-1' has been replaced with a dynamic condition using LookML's templating language: {% condition order_region %} order.region {% endcondition %}. Ensure that this dynamic condition is correctly implemented and enhances the query's adaptability.

metadata-ingestion/tests/integration/lookml/lookml_mces_api_hive2.json (1)

1635-1635: Confirm the correctness of the new dynamic filtering mechanism.

The new templated approach {% condition order_region %} order.region {% endcondition %} enhances the flexibility of the query by allowing the region to be specified at runtime. Ensure that the templating syntax is correctly supported and tested in the LookML environment.

metadata-ingestion/tests/integration/lookml/lookml_mces_api_bigquery.json (1)

1635-1635: LGTM!

The use of LookML templating syntax for the order_region condition enhances the flexibility and maintainability of the query.

metadata-ingestion/tests/integration/lookml/lookml_mces_badsql_parser.json (2)

1678-1714: LGTM!

The UpstreamLineage segment is well-structured and includes detailed information about upstream datasets and their relationships. The fine-grained lineage information is comprehensive and correctly formatted.


1717-1766: LGTM!

The SchemaMetadata segment is well-structured and includes comprehensive details about the customer_facts schema. The fields are correctly defined with appropriate attributes such as nullability, descriptions, and data types.

metadata-ingestion/tests/integration/lookml/lookml_mces_offline.json (2)

1679-1714: Verify the correctness and completeness of the UpstreamLineage entry.

Ensure that the UpstreamLineage entry correctly represents the relationships between upstream and downstream datasets and fields. Verify that the audit stamp, upstream datasets, and fine-grained lineages are accurate and complete.


1717-1766: Verify the correctness and completeness of the SchemaMetadata entry.

Ensure that the SchemaMetadata entry correctly represents the schema named customer_facts. Verify that the platform, version, timestamps, hash, fields, and primary keys are accurate and complete.

metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json (6)

1717-1765: Ensure correct and complete SchemaMetadata definitions.

The SchemaMetadata aspect is comprehensive, but ensure that all field paths, types, and descriptions are correctly defined and consistent across datasets.


1766-1768: Ensure correct DatasetProperties definitions.

The DatasetProperties aspect is well-defined, but ensure that all custom properties and dataset names are correctly defined and consistent across datasets.


Line range hint 1769-1771:
Ensure correct BrowsePathsV2 definitions.

The BrowsePathsV2 aspect is well-defined, but ensure that all browse paths are correctly defined and consistent across datasets.


1679-1713: Ensure consistency in UpstreamLineage definitions.

The UpstreamLineage aspect is well-defined, but ensure that all upstream and downstream fields are correctly referenced and consistent across datasets.

Verification successful

Ensure consistency in UpstreamLineage definitions.

The UpstreamLineage aspect is consistently defined across datasets with accurate references. No issues were found in the structure or field definitions.

Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify consistency in UpstreamLineage definitions across datasets.

# Test: Search for UpstreamLineage aspect. Expect: Consistent references across datasets.
rg --type json -A 10 '"com.linkedin.pegasus2avro.dataset.UpstreamLineage"'

Length of output: 456783


Line range hint 1775-1777:
Ensure correct TagProperties definitions.

The TagProperties aspect is well-defined, but ensure that all tag properties are correctly defined and consistent across tags.

Verification successful

TagProperties definitions are correct and consistent across tags.

The TagProperties aspect is consistently defined in multiple files, and the definitions are correct.

  • Examples of consistent definitions:
    • "name": "Dimension", "description": "A tag that is applied to all dimension fields."
    • "name": "Temporal", "description": "A tag that is applied to all time-based (temporal) fields such as timestamps or durations."
    • "name": "Measure", "description": "A tag that is applied to all measures (metrics). Measures are typically the columns that you aggregate on"
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify correctness of TagProperties definitions across tags.

# Test: Search for TagProperties aspect. Expect: Correct definitions across tags.
rg --type json -A 5 '"com.linkedin.pegasus2avro.tag.TagProperties"'

Length of output: 48591


Line range hint 1772-1774:
Ensure correct Status definitions.

The Status aspect is well-defined, but ensure that all removal statuses are correctly defined and consistent across datasets and tags.

Verification successful

Verification Successful: Correct Status definitions

The Status aspect is consistently defined across datasets and tags, with all instances of the removed field set to false.

  • Files Checked:
    • smoke-test/tests/cypress/cypress_dbt_data.json
    • metadata-ingestion/tests/unit/glue/glue_mces_golden_table_lineage.json
    • metadata-ingestion/tests/integration/trino/trino_hive_instance_mces_golden.json
    • metadata-ingestion/tests/integration/lookml/lookml_mces_offline_platform_instance.json
    • And many more as listed in the output.
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify correctness of Status definitions across datasets and tags.

# Test: Search for Status aspect. Expect: Correct definitions across datasets and tags.
rg --type json -A 5 '"com.linkedin.pegasus2avro.common.Status"'

Length of output: 436706

metadata-ingestion/tests/integration/lookml/refinements_ingestion_golden.json (1)

1659-1659: LGTM! The use of a templated condition enhances flexibility.

The change from a hardcoded filter to a templated condition for order.region improves the adaptability and maintainability of the SQL query.

metadata-ingestion/tests/integration/lookml/lookml_mces_with_external_urls.json (5)

1686-1696: Ensure completeness of UpstreamLineage attributes.

The UpstreamLineage structure includes auditStamp, dataset, and type. Verify that all necessary attributes are included and correctly populated.


1698-1721: Check fineGrainedLineages for accuracy and completeness.

The fineGrainedLineages attribute within UpstreamLineage includes detailed lineage mappings. Ensure that all mappings are accurate and complete.


1725-1736: Ensure completeness of SchemaMetadata attributes.

The SchemaMetadata structure includes schemaName, platform, version, created, and lastModified. Verify that all necessary attributes are included and correctly populated.


1743-1771: Check fields attribute for accuracy and completeness.

The fields attribute within SchemaMetadata includes field definitions. Ensure that all fields are accurately defined and all necessary attributes are included.


1773-1774: Verify primaryKeys attribute.

The primaryKeys attribute within SchemaMetadata should list all primary keys. Ensure that it is correctly populated.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between dd02704 and 75ede04.

Files selected for processing (3)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py (2 hunks)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/finance_notes.view.lkml (1 hunks)
  • metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/finance_notes.view.lkml
Additional comments not posted (18)
metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py (2)

30-30: LGTM!

The code changes are approved.


28-30: Verify the impact of removing dynamic content generation and error handling.

The method now writes a static string "dummy_value" to the output buffer, which simplifies the rendering process but alters the expected behavior when using the Looker condition tag. Ensure that this change does not negatively impact other parts of the codebase that rely on dynamic content generation and error handling.

Verification successful

Verify the impact of removing dynamic content generation and error handling.

The ConditionNode class and its render_to_output method are only defined and used within metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py. There are no other references to these entities in the codebase, indicating that the changes are isolated and do not negatively impact other parts of the codebase.

  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_liquid_tag.py
Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the impact of removing dynamic content generation and error handling.

# Test: Search for the usage of the `ConditionNode` class and its `render_to_output` method. Expect: No negative impact on the codebase.
rg --type python -A 5 $'ConditionNode'

Length of output: 68


Script:

#!/bin/bash
# Description: Verify the impact of removing dynamic content generation and error handling.

# Test: Search for the usage of the `ConditionNode` class and its `render_to_output` method. Expect: No negative impact on the codebase.
rg 'class ConditionNode' --glob '*.py' -A 10
rg 'def render_to_output' --glob '*.py' -A 10

Length of output: 2597


Script:

#!/bin/bash
# Description: Verify the impact of removing dynamic content generation and error handling.

# Test: Search for the usage of the `ConditionNode` class and its `render_to_output` method throughout the entire codebase.
rg 'ConditionNode' -A 5
rg 'render_to_output' -A 5

Length of output: 2100

metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (16)

305-305: LGTM!

The code changes are approved.


1644-1644: LGTM!

The code changes are approved.


1303-1320: LGTM!

The code changes are approved.


1321-1338: LGTM!

The code changes are approved.


1339-1354: LGTM!

The code changes are approved.


1594-1617: LGTM!

The code changes are approved.


305-305: LGTM!

The code changes are approved.


1762-1776: LGTM!

The code changes are approved.


1303-1320: Verify the correctness of the new aspect subTypes.

The new aspect subTypes enhances metadata management. Ensure that the aspect is correctly defined and behaves as expected.


1339-1354: Verify the correctness of the new aspect container.

The new aspect container enhances metadata management. Ensure that the aspect is correctly defined and behaves as expected.


1594-1617: Verify the correctness of the new aspect browsePathsV2.

The new aspect browsePathsV2 enhances metadata management. Ensure that the aspect is correctly defined and behaves as expected.


1644-1644: Verify the correctness of the new SQL logic.

The new SQL logic enhances data querying capabilities. Ensure that the SQL syntax is correct and that the logic behaves as expected.


305-305: Verify the correctness of the conditional SQL logic.

The new logic introduces conditional SQL for dynamic data source selection. Ensure that the conditions and SQL syntax are correct and that the logic behaves as expected.


1762-1776: Verify the correctness of the new aspect tagKey.

The new aspect tagKey enhances metadata management. Ensure that the aspect is correctly defined and behaves as expected.


305-305: Verify the correctness of the new SQL logic.

The new SQL logic enhances data querying capabilities. Ensure that the SQL syntax is correct and that the logic behaves as expected.


1321-1338: Verify the correctness of the new view logic.

The new view logic enhances data querying capabilities. Ensure that the SQL syntax is correct and that the logic behaves as expected.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 75ede04 and 13c07f2.

Files selected for processing (4)
  • metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py (16 hunks)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/data.model.lkml (2 hunks)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/employee_salary_rating.view.lkml (1 hunks)
  • metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (2 hunks)
Files skipped from review as they are similar to previous changes (1)
  • metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/data.model.lkml
Additional comments not posted (36)
metadata-ingestion/tests/integration/lookml/vv-lineage-and-liquid-templates/employee_salary_rating.view.lkml (7)

1-2: Define view and derived table correctly.

The view employee_salary_rating and its derived table are defined correctly.


3-20: Ensure SQL logic is correct and readable.

The SQL logic uses conditional statements to select the appropriate data source based on filters. This approach enhances flexibility but ensure the conditions and SQL syntax are correct and readable.


23-28: Filter definition is correct.

The filter rating_window is defined correctly with appropriate metadata.


30-33: Dimension id definition is correct.

The dimension id is defined correctly with appropriate SQL logic.


35-38: Dimension name definition is correct.

The dimension name is defined correctly with appropriate SQL logic.


40-43: Dimension source definition is correct.

The dimension source is defined correctly with appropriate SQL logic.


45-48: Dimension income definition is correct.

The dimension income is defined correctly with appropriate SQL logic.

metadata-ingestion/src/datahub/ingestion/source/looker/view_upstream.py (12)

209-223: Include reporter parameter in constructor.

The reporter parameter is correctly included in the constructor to enhance error reporting.


250-253: Include reporter parameter in constructor.

The reporter parameter is correctly included in the constructor to enhance error reporting.


266-274: Use transformed SQL in __get_spr method.

The method correctly uses datahub_transformed_sql to handle SQL transformations.


282-289: Report warning for table-level lineage errors.

The method correctly reports warnings for table-level lineage errors using the reporter.


311-319: Report warning for column-level lineage errors.

The method correctly reports warnings for column-level lineage errors using the reporter.


350-358: Report warning for column-level lineage errors.

The method correctly reports warnings for column-level lineage errors using the reporter.


407-411: Include reporter parameter in constructor.

The reporter parameter is correctly included in the constructor to enhance error reporting.


492-495: Include reporter parameter in constructor.

The reporter parameter is correctly included in the constructor to enhance error reporting.


504-507: Use transformed SQL table name in __get_upstream_dataset_urn method.

The method correctly uses datahub_transformed_sql_table_name to handle SQL transformations.


548-551: Include reporter parameter in constructor.

The reporter parameter is correctly included in the constructor to enhance error reporting.


Line range hint 559-566: Use transformed SQL table name in __get_upstream_dataset_urn method.

The method correctly uses datahub_transformed_sql_table_name to handle SQL transformations.


Line range hint 618-662: Include reporter parameter in create_view_upstream function.

The reporter parameter is correctly included in the function calls to enhance error reporting.

metadata-ingestion/tests/integration/lookml/vv_lineage_liquid_template_golden.json (17)

Line range hint 1-20: Container properties are defined correctly.

The container properties are defined with appropriate metadata.


1303-1320: Define subTypes aspect for employee_salary_rating.

The subTypes aspect is defined correctly for the dataset.


1323-1338: Define viewProperties aspect for employee_salary_rating.

The viewProperties aspect is defined correctly with appropriate SQL logic and metadata.


1341-1354: Define container aspect for employee_salary_rating.

The container aspect is defined correctly for the dataset.


1560-1582: Define browsePathsV2 aspect for employee_salary_rating.

The browsePathsV2 aspect is defined correctly with appropriate metadata.


Line range hint 305-320: Define subTypes aspect for employee_income_source.

The subTypes aspect is defined correctly for the dataset.


Line range hint 323-338: Define viewProperties aspect for employee_income_source.

The viewProperties aspect is defined correctly with appropriate SQL logic and metadata.


Line range hint 341-354: Define container aspect for employee_income_source.

The container aspect is defined correctly for the dataset.


Line range hint 560-582: Define browsePathsV2 aspect for employee_income_source.

The browsePathsV2 aspect is defined correctly with appropriate metadata.


Line range hint 823-838: Define subTypes aspect for employee_total_income.

The subTypes aspect is defined correctly for the dataset.


Line range hint 841-856: Define viewProperties aspect for employee_total_income.

The viewProperties aspect is defined correctly with appropriate SQL logic and metadata.


Line range hint 859-872: Define container aspect for employee_total_income.

The container aspect is defined correctly for the dataset.


Line range hint 1078-1100: Define browsePathsV2 aspect for employee_total_income.

The browsePathsV2 aspect is defined correctly with appropriate metadata.


Line range hint 1281-1296: Define subTypes aspect for employee_tax_report.

The subTypes aspect is defined correctly for the dataset.


Line range hint 1299-1314: Define viewProperties aspect for employee_tax_report.

The viewProperties aspect is defined correctly with appropriate SQL logic and metadata.


1317-1330: Define container aspect for employee_tax_report.

The container aspect is defined correctly for the dataset.


1519-1541: Define browsePathsV2 aspect for employee_tax_report.

The browsePathsV2 aspect is defined correctly with appropriate metadata.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Outside diff range, codebase verification and nitpick comments (2)
metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (2)

98-114: Ensure correct function name in the docstring.

The function name in the docstring should match the actual function name.

- def _complete_incomplete_sql(raw_view: dict, sql: str) -> str:
+ def _complete_incomplete_sql(raw_view: dict, sql: str) -> str:

Line range hint 115-139:
Fix the function call typo.

The function _complete_incomplete_sql is called with a typo.

- _complete_in_complete_sql(
+ _complete_incomplete_sql(
Tools
Ruff

138-138: Undefined name _complete_in_complete_sql

(F821)

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 13c07f2 and 1bf1344.

Files selected for processing (1)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (3 hunks)
Additional context used
Ruff
metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py

138-138: Undefined name _complete_in_complete_sql

(F821)

Additional comments not posted (1)
metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (1)

98-114: Ensure proper handling of SQL fragments.

The function _complete_incomplete_sql correctly handles SQL fragments by adding missing SELECT and FROM clauses. The regex pattern DERIVED_VIEW_PATTERN is used to clean up the SQL.

Copy link
Collaborator

@hsheth2 hsheth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a minor tweak, to exception reporting

Once that's done, we can merge this

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

Commits

Files that changed from the base of the PR and between 1bf1344 and 9430d10.

Files selected for processing (1)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py (3 hunks)
Files skipped from review as they are similar to previous changes (1)
  • metadata-ingestion/src/datahub/ingestion/source/looker/looker_template_language.py

@hsheth2 hsheth2 merged commit 0667470 into datahub-project:master Jul 31, 2024
58 checks passed
arosanda added a commit to infobip/datahub that referenced this pull request Sep 23, 2024
* feat(forms) Handle deleting forms references when hard deleting forms (datahub-project#10820)

* refactor(ui): Misc improvements to the setup ingestion flow (ingest uplift 1/2)  (datahub-project#10764)

Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>

* fix(ingestion/airflow-plugin): pipeline tasks discoverable in search (datahub-project#10819)

* feat(ingest/transformer): tags to terms transformer (datahub-project#10758)

Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>

* fix(ingestion/unity-catalog): fixed issue with profiling with GE turned on (datahub-project#10752)

Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>

* feat(forms) Add java SDK for form entity PATCH + CRUD examples (datahub-project#10822)

* feat(SDK) Add java SDK for structuredProperty entity PATCH + CRUD examples (datahub-project#10823)

* feat(SDK) Add StructuredPropertyPatchBuilder in python sdk and provide sample CRUD files (datahub-project#10824)

* feat(forms) Add CRUD endpoints to GraphQL for Form entities (datahub-project#10825)

* add flag for includeSoftDeleted in scroll entities API (datahub-project#10831)

* feat(deprecation) Return actor entity with deprecation aspect (datahub-project#10832)

* feat(structuredProperties) Add CRUD graphql APIs for structured property entities (datahub-project#10826)

* add scroll parameters to openapi v3 spec (datahub-project#10833)

* fix(ingest): correct profile_day_of_week implementation (datahub-project#10818)

* feat(ingest/glue): allow ingestion of empty databases from Glue (datahub-project#10666)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* feat(cli): add more details to get cli (datahub-project#10815)

* fix(ingestion/glue): ensure date formatting works on all platforms for aws glue (datahub-project#10836)

* fix(ingestion): fix datajob patcher (datahub-project#10827)

* fix(smoke-test): add suffix in temp file creation (datahub-project#10841)

* feat(ingest/glue): add helper method to permit user or group ownership (datahub-project#10784)

* feat(): Show data platform instances in policy modal if they are set on the policy (datahub-project#10645)

Co-authored-by: Hendrik Richert <hendrik.richert@swisscom.com>

* docs(patch): add patch documentation for how implementation works (datahub-project#10010)

Co-authored-by: John Joyce <john@acryl.io>

* fix(jar): add missing custom-plugin-jar task (datahub-project#10847)

* fix(): also check exceptions/stack trace when filtering log messages (datahub-project#10391)

Co-authored-by: John Joyce <john@acryl.io>

* docs(): Update posts.md (datahub-project#9893)

Co-authored-by: Hyejin Yoon <0327jane@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* chore(ingest): update acryl-datahub-classify version (datahub-project#10844)

* refactor(ingest): Refactor structured logging to support infos, warnings, and failures structured reporting to UI (datahub-project#10828)

Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* fix(restli): log aspect-not-found as a warning rather than as an error (datahub-project#10834)

* fix(ingest/nifi): remove duplicate upstream jobs (datahub-project#10849)

* fix(smoke-test): test access to create/revoke personal access tokens (datahub-project#10848)

* fix(smoke-test): missing test for move domain (datahub-project#10837)

* ci: update usernames to not considered for community (datahub-project#10851)

* env: change defaults for data contract visibility (datahub-project#10854)

* fix(ingest/tableau): quote special characters in external URL (datahub-project#10842)

* fix(smoke-test): fix flakiness of auto complete test

* ci(ingest): pin dask dependency for feast (datahub-project#10865)

* fix(ingestion/lookml): liquid template resolution and view-to-view cll (datahub-project#10542)

* feat(ingest/audit): add client id and version in system metadata props (datahub-project#10829)

* chore(ingest): Mypy 1.10.1 pin (datahub-project#10867)

* docs: use acryl-datahub-actions as expected python package to install (datahub-project#10852)

* docs: add new js snippet (datahub-project#10846)

* refactor(ingestion): remove company domain for security reason (datahub-project#10839)

* fix(ingestion/spark): Platform instance and column level lineage fix (datahub-project#10843)

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* feat(ingestion/tableau): optionally ingest multiple sites and create site containers (datahub-project#10498)

Co-authored-by: Yanik Häni <Yanik.Haeni1@swisscom.com>

* fix(ingestion/looker): Add sqlglot dependency and remove unused sqlparser (datahub-project#10874)

* fix(manage-tokens): fix manage access token policy (datahub-project#10853)

* Batch get entity endpoints (datahub-project#10880)

* feat(system): support conditional write semantics (datahub-project#10868)

* fix(build): upgrade vercel builds to Node 20.x (datahub-project#10890)

* feat(ingest/lookml): shallow clone repos (datahub-project#10888)

* fix(ingest/looker): add missing dependency (datahub-project#10876)

* fix(ingest): only populate audit stamps where accurate (datahub-project#10604)

* fix(ingest/dbt): always encode tag urns (datahub-project#10799)

* fix(ingest/redshift): handle multiline alter table commands (datahub-project#10727)

* fix(ingestion/looker): column name missing in explore (datahub-project#10892)

* fix(lineage) Fix lineage source/dest filtering with explored per hop limit (datahub-project#10879)

* feat(conditional-writes): misc updates and fixes (datahub-project#10901)

* feat(ci): update outdated action (datahub-project#10899)

* feat(rest-emitter): adding async flag to rest emitter (datahub-project#10902)

Co-authored-by: Gabe Lyons <gabe.lyons@acryl.io>

* feat(ingest): add snowflake-queries source (datahub-project#10835)

* fix(ingest): improve `auto_materialize_referenced_tags_terms` error handling (datahub-project#10906)

* docs: add new company to adoption list (datahub-project#10909)

* refactor(redshift): Improve redshift error handling with new structured reporting system (datahub-project#10870)

Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* feat(ui) Finalize support for all entity types on forms (datahub-project#10915)

* Index ExecutionRequestResults status field (datahub-project#10811)

* feat(ingest): grafana connector (datahub-project#10891)

Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* fix(gms) Add Form entity type to EntityTypeMapper (datahub-project#10916)

* feat(dataset): add support for external url in Dataset (datahub-project#10877)

* docs(saas-overview) added missing features to observe section (datahub-project#10913)

Co-authored-by: John Joyce <john@acryl.io>

* fix(ingest/spark): Fixing Micrometer warning (datahub-project#10882)

* fix(structured properties): allow application of structured properties without schema file (datahub-project#10918)

* fix(data-contracts-web) handle other schedule types (datahub-project#10919)

* fix(ingestion/tableau): human-readable message for PERMISSIONS_MODE_SWITCHED error (datahub-project#10866)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* Add feature flag for view defintions (datahub-project#10914)

Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>

* feat(ingest/BigQuery): refactor+parallelize dataset metadata extraction (datahub-project#10884)

* fix(airflow): add error handling around render_template() (datahub-project#10907)

* feat(ingestion/sqlglot): add optional `default_dialect` parameter to sqlglot lineage (datahub-project#10830)

* feat(mcp-mutator): new mcp mutator plugin (datahub-project#10904)

* fix(ingest/bigquery): changes helper function to decode unicode scape sequences (datahub-project#10845)

* feat(ingest/postgres): fetch table sizes for profile (datahub-project#10864)

* feat(ingest/abs): Adding azure blob storage ingestion source (datahub-project#10813)

* fix(ingest/redshift): reduce severity of SQL parsing issues (datahub-project#10924)

* fix(build): fix lint fix web react (datahub-project#10896)

* fix(ingest/bigquery): handle quota exceeded for project.list requests (datahub-project#10912)

* feat(ingest): report extractor failures more loudly (datahub-project#10908)

* feat(ingest/snowflake): integrate snowflake-queries into main source (datahub-project#10905)

* fix(ingest): fix docs build (datahub-project#10926)

* fix(ingest/snowflake): fix test connection (datahub-project#10927)

* fix(ingest/lookml): add view load failures to cache (datahub-project#10923)

* docs(slack) overhauled setup instructions and screenshots (datahub-project#10922)

Co-authored-by: John Joyce <john@acryl.io>

* fix(airflow): Add comma parsing of owners to DataJobs (datahub-project#10903)

* fix(entityservice): fix merging sideeffects (datahub-project#10937)

* feat(ingest): Support System Ingestion Sources, Show and hide system ingestion sources with Command-S (datahub-project#10938)

Co-authored-by: John Joyce <john@Johns-MBP.lan>

* chore() Set a default lineage filtering end time on backend when a start time is present (datahub-project#10925)

Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>
Co-authored-by: John Joyce <john@Johns-MBP.lan>

* Added relationships APIs to V3. Added these generic APIs to V3 swagger doc. (datahub-project#10939)

* docs: add learning center to docs (datahub-project#10921)

* doc: Update hubspot form id (datahub-project#10943)

* chore(airflow): add python 3.11 w/ Airflow 2.9 to CI (datahub-project#10941)

* fix(ingest/Glue): column upstream lineage between S3 and Glue (datahub-project#10895)

* fix(ingest/abs): split abs utils into multiple files (datahub-project#10945)

* doc(ingest/looker): fix doc for sql parsing documentation (datahub-project#10883)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* fix(ingest/bigquery): Adding missing BigQuery types (datahub-project#10950)

* fix(ingest/setup): feast and abs source setup (datahub-project#10951)

* fix(connections) Harden adding /gms to connections in backend (datahub-project#10942)

* feat(siblings) Add flag to prevent combining siblings in the UI (datahub-project#10952)

* fix(docs): make graphql doc gen more automated (datahub-project#10953)

* feat(ingest/athena): Add option for Athena partitioned profiling (datahub-project#10723)

* fix(spark-lineage): default timeout for future responses (datahub-project#10947)

* feat(datajob/flow): add environment filter using info aspects (datahub-project#10814)

* fix(ui/ingest): correct privilege used to show tab (datahub-project#10483)

Co-authored-by: Kunal-kankriya <127090035+Kunal-kankriya@users.noreply.github.com>

* feat(ingest/looker): include dashboard urns in browse v2 (datahub-project#10955)

* add a structured type to batchGet in OpenAPI V3 spec (datahub-project#10956)

* fix(ui): scroll on the domain sidebar to show all domains (datahub-project#10966)

* fix(ingest/sagemaker): resolve incorrect variable assignment for SageMaker API call (datahub-project#10965)

* fix(airflow/build): Pinning mypy (datahub-project#10972)

* Fixed a bug where the OpenAPI V3 spec was incorrect. The bug was introduced in datahub-project#10939. (datahub-project#10974)

* fix(ingest/test): Fix for mssql integration tests (datahub-project#10978)

* fix(entity-service) exist check correctly extracts status (datahub-project#10973)

* fix(structuredProps) casing bug in StructuredPropertiesValidator (datahub-project#10982)

* bugfix: use anyOf instead of allOf when creating references in openapi v3 spec (datahub-project#10986)

* fix(ui): Remove ant less imports (datahub-project#10988)

* feat(ingest/graph): Add get_results_by_filter to DataHubGraph (datahub-project#10987)

* feat(ingest/cli): init does not actually support environment variables (datahub-project#10989)

* fix(ingest/graph): Update get_results_by_filter graphql query (datahub-project#10991)

* feat(ingest/spark): Promote beta plugin (datahub-project#10881)

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* feat(ingest): support domains in meta -> "datahub" section (datahub-project#10967)

* feat(ingest): add `check server-config` command (datahub-project#10990)

* feat(cli): Make consistent use of DataHubGraphClientConfig (datahub-project#10466)

Deprecates get_url_and_token() in favor of a more complete option: load_graph_config() that returns a full DatahubClientConfig.
This change was then propagated across previous usages of get_url_and_token so that connections to DataHub server from the client respect the full breadth of configuration specified by DatahubClientConfig.

I.e: You can now specify disable_ssl_verification: true in your ~/.datahubenv file so that all cli functions to the server work when ssl certification is disabled.

Fixes datahub-project#9705

* fix(ingest/s3): Fixing container creation when there is no folder in path (datahub-project#10993)

* fix(ingest/looker): support platform instance for dashboards & charts (datahub-project#10771)

* feat(ingest/bigquery): improve handling of information schema in sql parser (datahub-project#10985)

* feat(ingest): improve `ingest deploy` command (datahub-project#10944)

* fix(backend): allow excluding soft-deleted entities in relationship-queries; exclude soft-deleted members of groups (datahub-project#10920)

- allow excluding soft-deleted entities in relationship-queries
- exclude soft-deleted members of groups

* fix(ingest/looker): downgrade missing chart type log level (datahub-project#10996)

* doc(acryl-cloud): release docs for 0.3.4.x (datahub-project#10984)

Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: RyanHolstien <RyanHolstien@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Pedro Silva <pedro@acryl.io>

* fix(protobuf/build): Fix protobuf check jar script (datahub-project#11006)

* fix(ui/ingest): Support invalid cron jobs (datahub-project#10998)

* fix(ingest): fix graph config loading (datahub-project#11002)

Co-authored-by: Pedro Silva <pedro@acryl.io>

* feat(docs): Document __DATAHUB_TO_FILE_ directive (datahub-project#10968)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* fix(graphql/upsertIngestionSource): Validate cron schedule; parse error in CLI (datahub-project#11011)

* feat(ece): support custom ownership type urns in ECE generation (datahub-project#10999)

* feat(assertion-v2): changed Validation tab to Quality and created new Governance tab (datahub-project#10935)

* fix(ingestion/glue): Add support for missing config options for profiling in Glue (datahub-project#10858)

* feat(propagation): Add models for schema field docs, tags, terms (datahub-project#2959) (datahub-project#11016)

Co-authored-by: Chris Collins <chriscollins3456@gmail.com>

* docs: standardize terminology to DataHub Cloud (datahub-project#11003)

* fix(ingestion/transformer): replace the externalUrl container (datahub-project#11013)

* docs(slack) troubleshoot docs (datahub-project#11014)

* feat(propagation): Add graphql API (datahub-project#11030)

Co-authored-by: Chris Collins <chriscollins3456@gmail.com>

* feat(propagation):  Add models for Action feature settings (datahub-project#11029)

* docs(custom properties): Remove duplicate from sidebar (datahub-project#11033)

* feat(models): Introducing Dataset Partitions Aspect (datahub-project#10997)

Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>

* feat(propagation): Add Documentation Propagation Settings (datahub-project#11038)

* fix(models): chart schema fields mapping, add dataHubAction entity, t… (datahub-project#11040)

* fix(ci): smoke test lint failures (datahub-project#11044)

* docs: fix learning center color scheme & typo (datahub-project#11043)

* feat: add cloud main page (datahub-project#11017)

Co-authored-by: Jay <159848059+jayacryl@users.noreply.github.com>

* feat(restore-indices): add additional step to also clear system metadata service (datahub-project#10662)

Co-authored-by: John Joyce <john@acryl.io>

* docs: fix typo (datahub-project#11046)

* fix(lint): apply spotless (datahub-project#11050)

* docs(airflow): example query to get datajobs for a dataflow (datahub-project#11034)

* feat(cli): Add run-id option to put sub-command (datahub-project#11023)

Adds an option to assign run-id to a given put command execution. 
This is useful when transformers do not exist for a given ingestion payload, we can follow up with custom metadata and assign it to an ingestion pipeline.

* fix(ingest): improve sql error reporting calls (datahub-project#11025)

* fix(airflow): fix CI setup (datahub-project#11031)

* feat(ingest/dbt): add experimental `prefer_sql_parser_lineage` flag (datahub-project#11039)

* fix(ingestion/lookml): enable stack-trace in lookml logs (datahub-project#10971)

* (chore): Linting fix (datahub-project#11015)

* chore(ci): update deprecated github actions (datahub-project#10977)

* Fix ALB configuration example (datahub-project#10981)

* chore(ingestion-base): bump base image packages (datahub-project#11053)

* feat(cli): Trim report of dataHubExecutionRequestResult to max GMS size (datahub-project#11051)

* fix(ingestion/lookml): emit dummy sql condition for lookml custom condition tag (datahub-project#11008)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* fix(ingestion/powerbi): fix issue with broken report lineage (datahub-project#10910)

* feat(ingest/tableau): add retry on timeout (datahub-project#10995)

* change generate kafka connect properties from env (datahub-project#10545)

Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>

* fix(ingest): fix oracle cronjob ingestion (datahub-project#11001)

Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>

* chore(ci): revert update deprecated github actions (datahub-project#10977) (datahub-project#11062)

* feat(ingest/dbt-cloud): update metadata_endpoint inference (datahub-project#11041)

* build: Reduce size of datahub-frontend-react image by 50-ish% (datahub-project#10878)

Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>

* fix(ci): Fix lint issue in datahub_ingestion_run_summary_provider.py (datahub-project#11063)

* docs(ingest): update developing-a-transformer.md (datahub-project#11019)

* feat(search-test): update search tests from datahub-project#10408 (datahub-project#11056)

* feat(cli): add aspects parameter to DataHubGraph.get_entity_semityped (datahub-project#11009)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* docs(airflow): update min version for plugin v2 (datahub-project#11065)

* doc(ingestion/tableau): doc update for derived permission (datahub-project#11054)

Co-authored-by: Pedro Silva <pedro.cls93@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* fix(py): remove dep on types-pkg_resources (datahub-project#11076)

* feat(ingest/mode): add option to exclude restricted (datahub-project#11081)

* fix(ingest): set lastObserved in sdk when unset (datahub-project#11071)

* doc(ingest): Update capabilities (datahub-project#11072)

* chore(vulnerability): Log Injection (datahub-project#11090)

* chore(vulnerability): Information exposure through a stack trace (datahub-project#11091)

* chore(vulnerability): Comparison of narrow type with wide type in loop condition (datahub-project#11089)

* chore(vulnerability): Insertion of sensitive information into log files (datahub-project#11088)

* chore(vulnerability): Risky Cryptographic Algorithm (datahub-project#11059)

* chore(vulnerability): Overly permissive regex range (datahub-project#11061)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* fix: update customer data (datahub-project#11075)

* fix(models): fixing the datasetPartition models (datahub-project#11085)

Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>

* fix(ui): Adding view, forms GraphQL query, remove showing a fallback error message on unhandled GraphQL error (datahub-project#11084)

Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>

* feat(docs-site): hiding learn more from cloud page (datahub-project#11097)

* fix(docs): Add correct usage of orFilters in search API docs (datahub-project#11082)

Co-authored-by: Jay <159848059+jayacryl@users.noreply.github.com>

* fix(ingest/mode): Regexp in mode name matcher didn't allow underscore (datahub-project#11098)

* docs: Refactor customer stories section (datahub-project#10869)

Co-authored-by: Jeff Merrick <jeff@wireform.io>

* fix(release): fix full/slim suffix on tag (datahub-project#11087)

* feat(config): support alternate hashing algorithm for doc id (datahub-project#10423)

Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
Co-authored-by: John Joyce <john@acryl.io>

* fix(emitter): fix typo in get method of java kafka emitter (datahub-project#11007)

* fix(ingest): use correct native data type in all SQLAlchemy sources by compiling data type using dialect (datahub-project#10898)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* chore: Update contributors list in PR labeler (datahub-project#11105)

* feat(ingest): tweak stale entity removal messaging (datahub-project#11064)

* fix(ingestion): enforce lastObserved timestamps in SystemMetadata (datahub-project#11104)

* fix(ingest/powerbi): fix broken lineage between chart and dataset (datahub-project#11080)

* feat(ingest/lookml): CLL support for sql set in sql_table_name attribute of lookml view (datahub-project#11069)

* docs: update graphql docs on forms & structured properties (datahub-project#11100)

* test(search): search openAPI v3 test (datahub-project#11049)

* fix(ingest/tableau): prevent empty site content urls (datahub-project#11057)

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* feat(entity-client): implement client batch interface (datahub-project#11106)

* fix(snowflake): avoid reporting warnings/info for sys tables (datahub-project#11114)

* fix(ingest): downgrade column type mapping warning to info (datahub-project#11115)

* feat(api): add AuditStamp to the V3 API entity/aspect response (datahub-project#11118)

* fix(ingest/redshift): replace r'\n' with '\n' to avoid token error redshift serverless… (datahub-project#11111)

* fix(entiy-client): handle null entityUrn case for restli (datahub-project#11122)

* fix(sql-parser): prevent bad urns from alter table lineage (datahub-project#11092)

* fix(ingest/bigquery): use small batch size if use_tables_list_query_v2 is set (datahub-project#11121)

* fix(graphql): add missing entities to EntityTypeMapper and EntityTypeUrnMapper (datahub-project#10366)

* feat(ui): Changes to allow editable dataset name (datahub-project#10608)

Co-authored-by: Jay Kadambi <jayasimhan_venkatadri@optum.com>

* fix: remove saxo (datahub-project#11127)

* feat(mcl-processor): Update mcl processor hooks (datahub-project#11134)

* fix(openapi): fix openapi v2 endpoints & v3 documentation update

* Revert "fix(openapi): fix openapi v2 endpoints & v3 documentation update"

This reverts commit 573c1cb.

* docs(policies): updates to policies documentation (datahub-project#11073)

* fix(openapi): fix openapi v2 and v3 docs update (datahub-project#11139)

* feat(auth): grant type and acr values custom oidc parameters support (datahub-project#11116)

* fix(mutator): mutator hook fixes (datahub-project#11140)

* feat(search): support sorting on multiple fields (datahub-project#10775)

* feat(ingest): various logging improvements (datahub-project#11126)

* fix(ingestion/lookml): fix for sql parsing error (datahub-project#11079)

Co-authored-by: Harshal Sheth <hsheth2@gmail.com>

* feat(docs-site) cloud page spacing and content polishes (datahub-project#11141)

* feat(ui) Enable editing structured props on fields (datahub-project#11042)

* feat(tests): add md5 and last computed to testResult model (datahub-project#11117)

* test(openapi): openapi regression smoke tests (datahub-project#11143)

* fix(airflow): fix tox tests + update docs (datahub-project#11125)

* docs: add chime to adoption stories (datahub-project#11142)

* fix(ingest/databricks): Updating code to work with Databricks sdk 0.30 (datahub-project#11158)

* fix(kafka-setup): add missing script to image (datahub-project#11190)

* fix(config): fix hash algo config (datahub-project#11191)

* test(smoke-test): updates to smoke-tests (datahub-project#11152)

* fix(elasticsearch): refactor idHashAlgo setting (datahub-project#11193)

* chore(kafka): kafka version bump (datahub-project#11211)

* readd UsageStatsWorkUnit

* fix merge problems

* change logo

---------

Co-authored-by: Chris Collins <chriscollins3456@gmail.com>
Co-authored-by: John Joyce <john@acryl.io>
Co-authored-by: John Joyce <john@Johns-MBP.lan>
Co-authored-by: John Joyce <john@ip-192-168-1-200.us-west-2.compute.internal>
Co-authored-by: dushayntAW <158567391+dushayntAW@users.noreply.github.com>
Co-authored-by: sagar-salvi-apptware <159135491+sagar-salvi-apptware@users.noreply.github.com>
Co-authored-by: Aseem Bansal <asmbansal2@gmail.com>
Co-authored-by: Kevin Chun <kevin1chun@gmail.com>
Co-authored-by: jordanjeremy <72943478+jordanjeremy@users.noreply.github.com>
Co-authored-by: skrydal <piotr.skrydalewicz@gmail.com>
Co-authored-by: Harshal Sheth <hsheth2@gmail.com>
Co-authored-by: david-leifker <114954101+david-leifker@users.noreply.github.com>
Co-authored-by: sid-acryl <155424659+sid-acryl@users.noreply.github.com>
Co-authored-by: Julien Jehannet <80408664+aviv-julienjehannet@users.noreply.github.com>
Co-authored-by: Hendrik Richert <github@richert.li>
Co-authored-by: Hendrik Richert <hendrik.richert@swisscom.com>
Co-authored-by: RyanHolstien <RyanHolstien@users.noreply.github.com>
Co-authored-by: Felix Lüdin <13187726+Masterchen09@users.noreply.github.com>
Co-authored-by: Pirry <158024088+chardaway@users.noreply.github.com>
Co-authored-by: Hyejin Yoon <0327jane@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: cburroughs <chris.burroughs@gmail.com>
Co-authored-by: ksrinath <ksrinath@users.noreply.github.com>
Co-authored-by: Mayuri Nehate <33225191+mayurinehate@users.noreply.github.com>
Co-authored-by: Kunal-kankriya <127090035+Kunal-kankriya@users.noreply.github.com>
Co-authored-by: Shirshanka Das <shirshanka@apache.org>
Co-authored-by: ipolding-cais <155455744+ipolding-cais@users.noreply.github.com>
Co-authored-by: Tamas Nemeth <treff7es@gmail.com>
Co-authored-by: Shubham Jagtap <132359390+shubhamjagtap639@users.noreply.github.com>
Co-authored-by: haeniya <yanik.haeni@gmail.com>
Co-authored-by: Yanik Häni <Yanik.Haeni1@swisscom.com>
Co-authored-by: Gabe Lyons <itsgabelyons@gmail.com>
Co-authored-by: Gabe Lyons <gabe.lyons@acryl.io>
Co-authored-by: 808OVADOZE <52988741+shtephlee@users.noreply.github.com>
Co-authored-by: noggi <anton.kuraev@acryl.io>
Co-authored-by: Nicholas Pena <npena@foursquare.com>
Co-authored-by: Jay <159848059+jayacryl@users.noreply.github.com>
Co-authored-by: ethan-cartwright <ethan.cartwright.m@gmail.com>
Co-authored-by: Ethan Cartwright <ethan.cartwright@acryl.io>
Co-authored-by: Nadav Gross <33874964+nadavgross@users.noreply.github.com>
Co-authored-by: Patrick Franco Braz <patrickfbraz@poli.ufrj.br>
Co-authored-by: pie1nthesky <39328908+pie1nthesky@users.noreply.github.com>
Co-authored-by: Joel Pinto Mata (KPN-DSH-DEX team) <130968841+joelmataKPN@users.noreply.github.com>
Co-authored-by: Ellie O'Neil <110510035+eboneil@users.noreply.github.com>
Co-authored-by: Ajoy Majumdar <ajoymajumdar@hotmail.com>
Co-authored-by: deepgarg-visa <149145061+deepgarg-visa@users.noreply.github.com>
Co-authored-by: Tristan Heisler <tristankheisler@gmail.com>
Co-authored-by: Andrew Sikowitz <andrew.sikowitz@acryl.io>
Co-authored-by: Davi Arnaut <davi.arnaut@acryl.io>
Co-authored-by: Pedro Silva <pedro@acryl.io>
Co-authored-by: amit-apptware <132869468+amit-apptware@users.noreply.github.com>
Co-authored-by: Sam Black <sam.black@acryl.io>
Co-authored-by: Raj Tekal <varadaraj_tekal@optum.com>
Co-authored-by: Steffen Grohsschmiedt <gitbhub@steffeng.eu>
Co-authored-by: jaegwon.seo <162448493+wornjs@users.noreply.github.com>
Co-authored-by: Renan F. Lima <51028757+lima-renan@users.noreply.github.com>
Co-authored-by: Matt Exchange <xkollar@users.noreply.github.com>
Co-authored-by: Jonny Dixon <45681293+acrylJonny@users.noreply.github.com>
Co-authored-by: Pedro Silva <pedro.cls93@gmail.com>
Co-authored-by: Pinaki Bhattacharjee <pinakipb2@gmail.com>
Co-authored-by: Jeff Merrick <jeff@wireform.io>
Co-authored-by: skrydal <piotr.skrydalewicz@acryl.io>
Co-authored-by: AndreasHegerNuritas <163423418+AndreasHegerNuritas@users.noreply.github.com>
Co-authored-by: jayasimhankv <145704974+jayasimhankv@users.noreply.github.com>
Co-authored-by: Jay Kadambi <jayasimhan_venkatadri@optum.com>
Co-authored-by: David Leifker <david.leifker@acryl.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ingestion PR or Issue related to the ingestion of metadata
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants