Merge branch 'dev/octavius-catto' into fix/sql-header-in-incrementals
drewbanin authored May 1, 2020
2 parents 29e2bbc + 6e1665d commit f3d4377
Showing 57 changed files with 2,239 additions and 971 deletions.
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,10 @@
 ## dbt next (release TBD)
+
+### Breaking changes
+- Added a new dbt_project.yml version format. This emits a deprecation warning currently, but support for the existing version will be removed in a future dbt version ([#2300](https://github.com/fishtown-analytics/dbt/issues/2300), [#2312](https://github.com/fishtown-analytics/dbt/pull/2312))
+- The `graph` object available in some dbt contexts now has an additional member `sources` (alongside the existing `nodes`). Sources have been removed from `nodes` and added to `sources` instead ([#2312](https://github.com/fishtown-analytics/dbt/pull/2312))
+- The 'location' field has been removed from bigquery catalogs ([#2382](https://github.com/fishtown-analytics/dbt/pull/2382))
+
 ### Features
 - Added --fail-fast argument for dbt run and dbt test to fail on first test failure or runtime error. ([#1649](https://github.com/fishtown-analytics/dbt/issues/1649), [#2224](https://github.com/fishtown-analytics/dbt/pull/2224))
 - Support for appending query comments to SQL queries. ([#2138](https://github.com/fishtown-analytics/dbt/issues/2138), [#2199](https://github.com/fishtown-analytics/dbt/pull/2199))
@@ -12,6 +18,10 @@
 - Snowflake now uses "describe table" to get the columns in a relation ([#2260](https://github.com/fishtown-analytics/dbt/issues/2260), [#2324](https://github.com/fishtown-analytics/dbt/pull/2324))
 - Add a 'depends_on' attribute to the log record extra field ([#2316](https://github.com/fishtown-analytics/dbt/issues/2316), [#2341](https://github.com/fishtown-analytics/dbt/pull/2341))
 - Added a '--no-browser' argument to "dbt docs serve" so you can serve docs in an environment that only has a CLI browser which would otherwise deadlock dbt ([#2004](https://github.com/fishtown-analytics/dbt/issues/2004), [#2364](https://github.com/fishtown-analytics/dbt/pull/2364))
+- Snowflake now uses "describe table" to get the columns in a relation ([#2260](https://github.com/fishtown-analytics/dbt/issues/2260), [#2324](https://github.com/fishtown-analytics/dbt/pull/2324))
+- Sources (and therefore freshness tests) can be enabled and disabled via dbt_project.yml ([#2283](https://github.com/fishtown-analytics/dbt/issues/2283), [#2312](https://github.com/fishtown-analytics/dbt/pull/2312), [#2357](https://github.com/fishtown-analytics/dbt/pull/2357))
+- schema.yml files are now fully rendered in a context that is aware of vars declared in dbt_project.yml files ([#2269](https://github.com/fishtown-analytics/dbt/issues/2269), [#2357](https://github.com/fishtown-analytics/dbt/pull/2357))
+- Sources from dependencies can be overridden in schema.yml files ([#2287](https://github.com/fishtown-analytics/dbt/issues/2287), [#2357](https://github.com/fishtown-analytics/dbt/pull/2357))

 ### Fixes
 - When a jinja value is undefined, give a helpful error instead of failing with cryptic "cannot pickle ParserMacroCapture" errors ([#2110](https://github.com/fishtown-analytics/dbt/issues/2110), [#2184](https://github.com/fishtown-analytics/dbt/pull/2184))
@@ -27,6 +37,7 @@
 - Fix "dbt deps" command so it respects the "--project-dir" arg if specified. ([#2338](https://github.com/fishtown-analytics/dbt/issues/2338), [#2339](https://github.com/fishtown-analytics/dbt/issues/2339))
 - On `run_cli` API calls that are passed `--vars` differing from the server's `--vars`, the RPC server rebuilds the manifest for that call. ([#2265](https://github.com/fishtown-analytics/dbt/issues/2265), [#2363](https://github.com/fishtown-analytics/dbt/pull/2363))
 - Fix "Object of type Decimal is not JSON serializable" error when BigQuery queries returned numeric types in nested data structures ([#2336](https://github.com/fishtown-analytics/dbt/issues/2336), [#2348](https://github.com/fishtown-analytics/dbt/pull/2348))
+- No longer query the information_schema.schemata view on bigquery ([#2320](https://github.com/fishtown-analytics/dbt/issues/2320), [#2382](https://github.com/fishtown-analytics/dbt/pull/2382))
 - Add support for `sql_header` config in incremental models ([#2136](https://github.com/fishtown-analytics/dbt/issues/2136), [#2200](https://github.com/fishtown-analytics/dbt/pull/2200))

 ### Under the hood
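
The breaking change to the `graph` object means that code which used to find sources by scanning `nodes` now has to read `sources` instead. A minimal sketch of that migration, assuming a plain-dict stand-in for `graph` (the ids, fields, and the helper names `source_ids` and `all_ids` are hypothetical, not part of dbt):

```python
# Hypothetical stand-in for the `graph` context object; the real one is
# supplied by dbt, and these ids and fields are illustrative only.
graph = {
    "nodes": {
        "model.my_project.orders": {"resource_type": "model"},
    },
    "sources": {
        "source.my_project.raw.orders": {"resource_type": "source"},
    },
}


def source_ids(graph):
    """Return sorted source ids, reading only the new `sources` member."""
    return sorted(graph.get("sources", {}))


def all_ids(graph):
    """Return every node and source id, for callers that need both."""
    return sorted(list(graph["nodes"]) + list(graph.get("sources", {})))
```

Callers that previously filtered `nodes` by resource type would find nothing after this change, which is why the entry is listed as breaking.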
7 changes: 6 additions & 1 deletion core/dbt/adapters/base/impl.py
@@ -3,6 +3,7 @@
 from contextlib import contextmanager
 from dataclasses import dataclass
 from datetime import datetime
+from itertools import chain
 from typing import (
     Optional, Tuple, Callable, Iterable, Type, Dict, Any, List, Mapping,
     Iterator, Union, Set
@@ -289,7 +290,11 @@ def _get_cache_schemas(
         lowercase strings.
         """
         info_schema_name_map = SchemaSearchMap()
-        for node in manifest.nodes.values():
+        nodes: Iterator[CompileResultNode] = chain(
+            manifest.nodes.values(),
+            manifest.sources.values(),
+        )
+        for node in nodes:
             if exec_only and node.resource_type not in NodeType.executable():
                 continue
             relation = self.Relation.create_from(self.config, node)
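
Because sources no longer live in `manifest.nodes`, the hunk above walks both mappings with `itertools.chain`. A minimal sketch of the same pattern, assuming simplified manifest members that map a unique id straight to a `(database, schema)` pair (the data and the `cache_schemas` helper are hypothetical):

```python
from itertools import chain
from typing import Dict, Set, Tuple

# Hypothetical, simplified manifest members: unique id -> (database, schema).
nodes: Dict[str, Tuple[str, str]] = {
    "model.my_project.orders": ("Analytics", "Marts"),
}
sources: Dict[str, Tuple[str, str]] = {
    "source.my_project.shop.orders": ("Raw", "Shop"),
}


def cache_schemas(
    nodes: Dict[str, Tuple[str, str]],
    sources: Dict[str, Tuple[str, str]],
) -> Set[Tuple[str, str]]:
    """Collect lowercased (database, schema) pairs from both mappings."""
    # chain() walks both value views lazily, without building a merged dict.
    combined = chain(nodes.values(), sources.values())
    return {(db.lower(), schema.lower()) for db, schema in combined}
```

Chaining the two views keeps the loop body unchanged, so the only edit needed at the call site is the iterable itself.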
53 changes: 46 additions & 7 deletions core/dbt/clients/jinja.py
@@ -4,7 +4,9 @@
 import re
 import tempfile
 import threading
+from ast import literal_eval
 from contextlib import contextmanager
+from itertools import chain, islice
 from typing import (
     List, Union, Set, Optional, Dict, Any, Iterator, Type, NoReturn
 )
@@ -102,9 +104,51 @@ class NativeSandboxEnvironment(MacroFuzzEnvironment):
     code_generator_class = jinja2.nativetypes.NativeCodeGenerator


+def quoted_native_concat(nodes):
+    """This is almost native_concat from the NativeTemplate, except in the
+    special case of a single argument that is a quoted string and returns a
+    string, the quotes are re-inserted.
+    """
+    head = list(islice(nodes, 2))
+
+    if not head:
+        return None
+
+    if len(head) == 1:
+        raw = head[0]
+    else:
+        raw = "".join([str(v) for v in chain(head, nodes)])
+
+    try:
+        result = literal_eval(raw)
+    except (ValueError, SyntaxError, MemoryError):
+        return raw
+
+    if len(head) == 1 and len(raw) > 2 and isinstance(result, str):
+        return _requote_result(raw, result)
+    else:
+        return result
+
+
 class NativeSandboxTemplate(jinja2.nativetypes.NativeTemplate):  # mypy: ignore
     environment_class = NativeSandboxEnvironment
+
+    def render(self, *args, **kwargs):
+        """Render the template to produce a native Python type. If the
+        result is a single node, its value is returned. Otherwise, the
+        nodes are concatenated as strings. If the result can be parsed
+        with :func:`ast.literal_eval`, the parsed value is returned.
+        Otherwise, the string is returned.
+        """
+        vars = dict(*args, **kwargs)
+
+        try:
+            return quoted_native_concat(
+                self.root_render_func(self.new_context(vars))
+            )
+        except Exception:
+            return self.environment.handle_exception()


 NativeSandboxEnvironment.template_class = NativeSandboxTemplate  # type: ignore

@@ -425,7 +469,7 @@ def render_template(template, ctx: Dict[str, Any], node=None) -> str:
         return template.render(ctx)


-def _requote_result(raw_value, rendered):
+def _requote_result(raw_value: str, rendered: str) -> str:
     double_quoted = raw_value.startswith('"') and raw_value.endswith('"')
     single_quoted = raw_value.startswith("'") and raw_value.endswith("'")
     if double_quoted:
@@ -451,12 +495,7 @@ def get_rendered(
         capture_macros=capture_macros,
         native=native,
     )
-
-    result = render_template(template, ctx, node)
-
-    if native and isinstance(result, str):
-        result = _requote_result(string, result)
-    return result
+    return render_template(template, ctx, node)


 def undefined_error(msg) -> NoReturn:
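
The jinja.py hunks move requoting out of `get_rendered` and into the template's own render path. A standalone sketch of that behavior follows, so the edge cases can be tried without Jinja. It assumes `nodes` is an iterator of rendered chunks, and the `requote` helper is an assumed completion of `_requote_result`, whose body is only partly visible in the diff:

```python
from ast import literal_eval
from itertools import chain, islice


def requote(raw_value: str, rendered: str) -> str:
    # Assumed completion of _requote_result: wrap the rendered text in
    # whichever quote character surrounded the raw value, if any.
    for quote in ('"', "'"):
        if raw_value.startswith(quote) and raw_value.endswith(quote):
            return f"{quote}{rendered}{quote}"
    return rendered


def quoted_native_concat(nodes):
    """Parse rendered chunks into a native value, keeping explicit quotes."""
    # Peek at up to two chunks to tell "single chunk" from "many chunks".
    head = list(islice(nodes, 2))
    if not head:
        return None
    if len(head) == 1:
        raw = head[0]
    else:
        # `nodes` is an iterator, so chain() resumes after the peeked chunks.
        raw = "".join(str(v) for v in chain(head, nodes))
    try:
        result = literal_eval(raw)
    except (ValueError, SyntaxError, MemoryError):
        return raw
    # A lone quoted string keeps its quotes so it is not re-parsed later.
    if len(head) == 1 and len(raw) > 2 and isinstance(result, str):
        return requote(raw, result)
    return result
```

Under these assumptions, `quoted_native_concat(iter(["5432"]))` yields the integer `5432`, while `quoted_native_concat(iter(["'5432'"]))` keeps the quoted string `"'5432'"`. Passing a list rather than an iterator would repeat the first two chunks, since `chain(head, nodes)` would restart from the beginning.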
36 changes: 24 additions & 12 deletions core/dbt/compilation.py
@@ -1,4 +1,3 @@
-import itertools
 import os
 from collections import defaultdict
 from typing import List, Dict, Any
@@ -11,6 +10,7 @@
 from dbt.linker import Linker

 from dbt.context.providers import generate_runtime_model
+from dbt.contracts.graph.compiled import NonSourceNode
 from dbt.contracts.graph.manifest import Manifest
 import dbt.exceptions
 import dbt.flags
@@ -60,22 +60,24 @@ def print_compile_stats(stats):
     logger.info("Found {}".format(stat_line))


-def _node_enabled(node):
+def _node_enabled(node: NonSourceNode):
     # Disabled models are already excluded from the manifest
     if node.resource_type == NodeType.Test and not node.config.enabled:
         return False
     else:
         return True


-def _generate_stats(manifest):
+def _generate_stats(manifest: Manifest):
     stats: Dict[NodeType, int] = defaultdict(int)
-    for node_name, node in itertools.chain(
-            manifest.nodes.items(),
-            manifest.macros.items()):
+    for node in manifest.nodes.values():
         if _node_enabled(node):
             stats[node.resource_type] += 1

+    for source in manifest.sources.values():
+        stats[source.resource_type] += 1
+    for macro in manifest.macros.values():
+        stats[macro.resource_type] += 1
     return stats
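
The rewritten `_generate_stats` applies the enabled check only to nodes, then counts sources and macros unconditionally. A minimal sketch of that counting rule, assuming a hypothetical `Item` dataclass and string resource types in place of dbt's node classes and `NodeType`:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Item:
    """Hypothetical stand-in for a manifest entry."""
    resource_type: str
    enabled: bool = True


def generate_stats(
    nodes: List[Item], sources: List[Item], macros: List[Item]
) -> Dict[str, int]:
    stats: Dict[str, int] = defaultdict(int)
    for node in nodes:
        # Mirrors _node_enabled: only disabled tests are skipped here.
        if node.resource_type == "test" and not node.enabled:
            continue
        stats[node.resource_type] += 1
    # Sources and macros are counted without any enabled check.
    for item in sources + macros:
        stats[item.resource_type] += 1
    return dict(stats)
```

Splitting the loops also keeps the `NonSourceNode` annotation on `_node_enabled` honest, since sources and macros never reach it.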


@@ -182,24 +184,34 @@ def compile_node(self, node, manifest, extra_context=None):

         return injected_node

-    def write_graph_file(self, linker, manifest):
+    def write_graph_file(self, linker: Linker, manifest: Manifest):
         filename = graph_file_name
         graph_path = os.path.join(self.config.target_path, filename)
         if dbt.flags.WRITE_JSON:
             linker.write_graph(graph_path, manifest)

-    def link_node(self, linker, node, manifest):
+    def link_node(
+        self, linker: Linker, node: NonSourceNode, manifest: Manifest
+    ):
         linker.add_node(node.unique_id)

         for dependency in node.depends_on_nodes:
-            if manifest.nodes.get(dependency):
+            if dependency in manifest.nodes:
                 linker.dependency(
                     node.unique_id,
-                    (manifest.nodes.get(dependency).unique_id))
+                    (manifest.nodes[dependency].unique_id)
+                )
+            elif dependency in manifest.sources:
+                linker.dependency(
+                    node.unique_id,
+                    (manifest.sources[dependency].unique_id)
+                )
             else:
                 dbt.exceptions.dependency_not_found(node, dependency)

-    def link_graph(self, linker, manifest):
+    def link_graph(self, linker: Linker, manifest: Manifest):
+        for source in manifest.sources.values():
+            linker.add_node(source.unique_id)
         for node in manifest.nodes.values():
             self.link_node(linker, node, manifest)

@@ -208,7 +220,7 @@ def link_graph(self, linker, manifest):
         if cycle:
             raise RuntimeError("Found a cycle: {}".format(cycle))

-    def compile(self, manifest, write=True):
+    def compile(self, manifest: Manifest, write=True):
         linker = Linker()

         self.link_graph(linker, manifest)
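
With sources split out, `link_graph` registers every source as a vertex first, and `link_node` resolves each dependency against `nodes` and then `sources`. A rough sketch of that lookup order, assuming a simplified manifest where `nodes` maps a unique id to its dependency ids and `sources` is a set of ids. The edge direction (dependency first) and the `KeyError` are choices made for this illustration, not dbt's `Linker` API:

```python
from typing import Dict, List, Set, Tuple


def link_graph(
    nodes: Dict[str, List[str]], sources: Set[str]
) -> Tuple[Set[str], Set[Tuple[str, str]]]:
    """Return (vertices, edges), with edges as (dependency, dependent)."""
    # Sources become vertices up front, even if nothing depends on them.
    vertices: Set[str] = set(sources)
    edges: Set[Tuple[str, str]] = set()
    for unique_id, depends_on in nodes.items():
        vertices.add(unique_id)
        for dependency in depends_on:
            # Membership tests replace the old truthiness check on .get().
            if dependency in nodes or dependency in sources:
                edges.add((dependency, unique_id))
            else:
                raise KeyError(
                    f"{unique_id} depends on unknown node {dependency}"
                )
    return vertices, edges
```

Adding sources as vertices before linking means an unused source still appears in the graph rather than being silently dropped.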
10 changes: 9 additions & 1 deletion core/dbt/config/project.py
@@ -575,7 +575,12 @@ def render_from_dict(
         rendered_project['project-root'] = project_root
         package_renderer = renderer.get_package_renderer()
         rendered_packages = package_renderer.render_data(packages_dict)
-        return cls.from_project_config(rendered_project, rendered_packages)
+        try:
+            return cls.from_project_config(rendered_project, rendered_packages)
+        except DbtProjectError as exc:
+            if exc.path is None:
+                exc.path = os.path.join(project_root, 'dbt_project.yml')
+            raise

     @classmethod
     def partial_load(
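
The `try`/`except` added above attaches a file path to an error that was raised without one, then re-raises it unchanged. A small sketch of that pattern, assuming a hypothetical `ProjectError` class and `parse_project` function in place of `DbtProjectError` and `from_project_config`:

```python
import os
from typing import Optional


class ProjectError(Exception):
    """Hypothetical stand-in for DbtProjectError with an optional path."""

    def __init__(self, message: str, path: Optional[str] = None):
        super().__init__(message)
        self.path = path


def parse_project(data: dict) -> dict:
    # Hypothetical validation step that knows nothing about file paths.
    if "name" not in data:
        raise ProjectError("missing required key 'name'")
    return data


def load_project(project_root: str, data: dict) -> dict:
    try:
        return parse_project(data)
    except ProjectError as exc:
        # Fill in the path only when the lower layer did not set one.
        if exc.path is None:
            exc.path = os.path.join(project_root, "dbt_project.yml")
        # A bare raise keeps the original exception and traceback.
        raise
```

Checking `exc.path is None` first avoids overwriting a more specific path set closer to the failure.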
@@ -645,6 +650,9 @@ def as_v1(self):
             # stuff any 'vars' entries into the old-style
             # models/seeds/snapshots dicts
             for project_name, items in dct['vars'].items():
+                if not isinstance(items, dict):
+                    # can't translate top-level vars
+                    continue
                 for cfgkey in ['models', 'seeds', 'snapshots']:
                     if project_name not in mutated[cfgkey]:
                         mutated[cfgkey][project_name] = {}
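
The new guard in `as_v1` skips any `vars` entry whose value is not a dict, since an unscoped top-level var has no place in the old per-project layout. A sketch of that translation, assuming the scoped vars end up under a nested `vars` key (the rest of the loop body is not visible in this hunk, so that destination and the `vars_to_v1` name are assumptions):

```python
from copy import deepcopy

CONFIG_KEYS = ("models", "seeds", "snapshots")


def vars_to_v1(project: dict) -> dict:
    """Copy project-scoped vars into old-style per-resource config dicts."""
    mutated = deepcopy(project)
    for key in CONFIG_KEYS:
        mutated.setdefault(key, {})
    for project_name, items in project.get("vars", {}).items():
        if not isinstance(items, dict):
            # A top-level var is not scoped to a project, so skip it.
            continue
        for key in CONFIG_KEYS:
            scoped = mutated[key].setdefault(project_name, {})
            # Assumption: scoped vars land under a nested 'vars' key.
            scoped.setdefault("vars", {}).update(items)
    return mutated
```

Without the `isinstance` check, a scalar such as `debug: true` would reach the inner loop and fail when treated as a mapping.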
13 changes: 1 addition & 12 deletions core/dbt/config/renderer.py
@@ -35,7 +35,7 @@ def render_value(
         # if it wasn't read as a string, ignore it
         if not isinstance(value, str):
             return value
-        return str(get_rendered(value, self.context))
+        return get_rendered(value, self.context, native=True)

     def render_data(
         self, data: Dict[str, Any]
@@ -131,17 +131,6 @@ class ProfileRenderer(BaseRenderer):
     def name(self):
         'Profile'

-    def render_entry(self, value, keypath):
-        result = super().render_entry(value, keypath)
-
-        if len(keypath) == 1 and keypath[-1] == 'port':
-            try:
-                return int(result)
-            except ValueError:
-                # let the validator or connection handle this
-                pass
-        return result
-

 class SchemaYamlRenderer(BaseRenderer):
     DOCUMENTABLE_NODES = frozenset(
58 changes: 57 additions & 1 deletion core/dbt/context/configured.py
@@ -7,7 +7,7 @@
 from dbt.include.global_project import PACKAGES
 from dbt.include.global_project import PROJECT_NAME as GLOBAL_PROJECT_NAME

-from dbt.context.base import contextproperty
+from dbt.context.base import contextproperty, Var
 from dbt.context.target import TargetContext
 from dbt.exceptions import raise_duplicate_macro_name

@@ -25,6 +25,55 @@ def project_name(self) -> str:
         return self.config.project_name


+class ConfiguredVar(Var):
+    def __init__(
+        self,
+        context: Dict[str, Any],
+        config: AdapterRequiredConfig,
+        project_name: str,
+    ):
+        super().__init__(context, config.cli_vars)
+        self.config = config
+        self.project_name = project_name
+
+    def __call__(self, var_name, default=Var._VAR_NOTSET):
+        my_config = self.config.load_dependencies()[self.project_name]
+
+        # cli vars > active project > local project
+        if var_name in self.config.cli_vars:
+            return self.config.cli_vars[var_name]
+
+        if self.config.config_version == 2 and my_config.config_version == 2:
+
+            active_vars = self.config.vars.to_dict()
+            active_vars = active_vars.get(self.project_name, {})
+            if var_name in active_vars:
+                return active_vars[var_name]
+
+            if self.config.project_name != my_config.project_name:
+                config_vars = my_config.vars.to_dict()
+                config_vars = config_vars.get(self.project_name, {})
+                if var_name in config_vars:
+                    return config_vars[var_name]
+
+        if default is not Var._VAR_NOTSET:
+            return default
+
+        return self.get_missing_var(var_name)
+
+
+class SchemaYamlContext(ConfiguredContext):
+    def __init__(self, config, project_name: str):
+        super().__init__(config)
+        self._project_name = project_name
+
+    @contextproperty
+    def var(self) -> ConfiguredVar:
+        return ConfiguredVar(
+            self._ctx, self.config, self._project_name
+        )
+
+
 FlatNamespace = Dict[str, MacroGenerator]
 NamespaceMember = Union[FlatNamespace, MacroGenerator]
 FullNamespace = Dict[str, NamespaceMember]
@@ -134,3 +183,10 @@ def generate_query_header_context(
 ):
     ctx = QueryHeaderContext(config, manifest)
     return ctx.to_dict()
+
+
+def generate_schema_yml(
+    config: AdapterRequiredConfig, project_name: str
+) -> Dict[str, Any]:
+    ctx = SchemaYamlContext(config, project_name)
+    return ctx.to_dict()
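
`ConfiguredVar.__call__` resolves a var in a fixed order: CLI vars, then the active project's vars for the package, then the package's own vars, then the caller's default. A simplified sketch of that precedence using plain dicts. It leaves out the config version checks and the project name comparison from the real method, and `resolve_var` with its parameters is hypothetical:

```python
from typing import Any, Dict

_NOTSET = object()  # sentinel, so that None can be a legitimate default


def resolve_var(
    name: str,
    cli_vars: Dict[str, Any],
    active_vars: Dict[str, Dict[str, Any]],
    local_vars: Dict[str, Dict[str, Any]],
    package: str,
    default: Any = _NOTSET,
) -> Any:
    """Resolve a var: cli vars > active project > local project > default."""
    if name in cli_vars:
        return cli_vars[name]
    # Both project-level mappings are keyed by package name first.
    for scoped in (active_vars, local_vars):
        package_vars = scoped.get(package, {})
        if name in package_vars:
            return package_vars[name]
    if default is not _NOTSET:
        return default
    raise KeyError(f"Required var '{name}' not found")
```

Using a sentinel rather than `None` for the missing default matters here, because a caller may legitimately pass `None` as the fallback value.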