[pull] main from dbt-labs:main #29

Status: Open. pull wants to merge 681 commits into pgoslatara:main from dbt-labs:main.
Conversation
* Add more accurate RSS high water mark measurement for Linux
* Add changelog entry.
* Checks to avoid exception based flow control, per review.
* Fix review nit.
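The first item above only names the technique. As a rough illustration (not dbt-core's actual implementation), one common way to get a peak-RSS high water mark on Linux is to read `VmHWM` from `/proc/self/status`, falling back to `resource.getrusage` elsewhere:

```python
import resource
import sys


def rss_high_water_mark_bytes() -> int:
    """Best-effort peak resident set size in bytes (illustrative sketch)."""
    if sys.platform == "linux":
        # /proc/self/status reports VmHWM (peak RSS) in kB.
        with open("/proc/self/status") as status:
            for line in status:
                if line.startswith("VmHWM:"):
                    return int(line.split()[1]) * 1024
    # Fallback: ru_maxrss is reported in kB on Linux, bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak * 1024 if sys.platform == "linux" else peak
```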
* Add fixtures for setting and resetting flags for unit tests
* Remove unnecessary `set_from_args` in non-`unittest.TestCase`-based unit tests

  In the previous commit we added a pytest fixture which sets and tears down the global flags via `set_from_args` for every pytest-based unit test. Previously we had added a `set_from_args` in tests or test files to reset the global flags if they were modified by a previous test. This is no longer necessary because of the work done in the previous commit. Note: we did not modify any tests that use the `unittest.TestCase` class because they don't use pytest fixtures; those tests need to continue operating as they currently do until we shift them to pytest unit tests.
* Utilize the new `args_for_flags` fixture for setting flags in `test_contracts_graph_parsed.py`
* Convert `test_compilation.py` from `TestCase` tests to pytest tests

  We did this so we can drop the unnecessary `set_from_args` in the next commit. That is its own commit because converting these tests is a restructuring, and doing it separately makes the changes easier to follow. That is to say, all changes in this commit were just to convert the tests to pytest; no other changes were made.
* Drop unnecessary `set_from_args` in `test_compilation.py`
* Add return types to all methods in `test_compilation.py`
* Reduce imports from `compilation` in `test_compilation.py`
* Update `test_logging.py` now that we don't need to worry about global flags
* Conditionally import the `Generator` type for Python 3.8

  Since Python 3.9, `Generator` should be imported from `collections.abc`; the alias in `typing` is deprecated. We still support 3.8 and thus need to import `Generator` conditionally. We should remove this once we drop support for 3.8.
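A minimal sketch of the two patterns referenced above, the autouse flags fixture and the version-conditional `Generator` import; `GLOBAL_FLAGS` and this `set_from_args` are simplified stand-ins for dbt-core's real flag machinery, not its API:

```python
import sys
from argparse import Namespace

import pytest

if sys.version_info >= (3, 9):
    from collections.abc import Generator
else:
    # Python 3.8: Generator is only subscriptable via typing.
    from typing import Generator

# Simplified stand-in for dbt's module-level flags object.
GLOBAL_FLAGS: Namespace = Namespace()


def set_from_args(args: Namespace) -> None:
    """Simplified set_from_args-style setter (illustrative only)."""
    global GLOBAL_FLAGS
    GLOBAL_FLAGS = args


@pytest.fixture(autouse=True)
def set_test_flags() -> Generator[None, None, None]:
    # Give every pytest-based unit test a known flags state...
    set_from_args(Namespace(USE_COLORS=False))
    yield
    # ...and reset afterwards so flag mutations never leak between tests.
    set_from_args(Namespace())
```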
* add saved query resource type
* nits
…al with multiple nulls (#10202)
* add new mode
* reflect dbt-common changes
* changelog
* rename var
* Create `runtime_config` fixture and necessary upstream fixtures
* Check for the better-scoped `ProjectContractError` in test_runtime tests

  Previously in `test_unsupported_version_extra_config` and `test_archive_not_allowed` we were checking for `DbtProjectError`. This worked because `ProjectContractError` is a subclass of `DbtProjectError`. However, if we check for `DbtProjectError` in these tests, then some tangential failure that raises a `DbtProjectError` would go undetected. As we plan on converting these tests to pytest in the coming commits, we want to ensure that the tests are succeeding for the right reason.
* Convert `test_str` of `TestRuntimeConfig` to a pytest test using fixtures
* Convert `test_from_parts` of `TestRuntimeConfig` to a pytest test using fixtures

  While converting `test_from_parts` I noticed the comment "TODO(jeb): Adapters must assert that quoting is populated?" This led me to believe that `quoting` shouldn't be "fully" realized in our project fixture unless we're saying that it's gone through adapter instantiation. Thus I updated the `quoting` on our project fixture to be an empty dict. This change affected `test__str__` in `test_project.py`, which we needed to update accordingly.
* Convert runtime version specifier tests to pytest tests and move them to test_project

  We've done two things in this commit, which arguably _should_ have been done in two commits. First we moved the version specifier tests from `test_runtime.py::TestRuntimeConfig` to `test_project.py::TestGetRequiredVersion`, because what is really being tested is the `_get_required_version` method. Doing it via the `RuntimeConfig.from_parts` method made testing it a lot harder, as it requires setting up more of the world and running with a _full_ project config dict. The second thing we did was convert it from the old unittest implementation to a pytest implementation, which saves us from having to create most of the world as we were doing previously in these tests. Of note, I did not move the test `test_unsupported_version_range_bad_config`. This test is a bit different from the rest of the version specifier tests. It was introduced in [1eb5857](1eb5857) of [#2726](#2726) to resolve [#2638](#2638). The focus of #2726 was to ensure the version specifier checks were run _before_ the validation of the `dbt_project.yml`. Thus what this test is actually testing is the order of operations at parse time. As such, it is really more a _functional_ test than a unit test. In the next commit we'll get this test moved (and renamed).
* Create a better test for checking that version checks come before project schema validation
* Convert `test_get_metadata` to a pytest test
* Refactor `test_archive_not_allowed` to a functional test

  We already have tests that ensure "extra" keys aren't allowed in `dbt_project.yml`. This test is different because it checks that a specific key, `archive`, isn't allowed. We do this because at one point in time `archive` _was_ an allowed key. Specifically, we stopped allowing `archive` in dbt-core 0.15.0 via commit [f26948d](f26948d). Given that it's been 5 years and a major version, we could probably remove this test, but let's keep it around unless we start supporting `archive` again.
* Convert `warn_for_unused_resource_config_paths` tests to use pytest
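To make the "right reason" point concrete, here is a small self-contained sketch; the exception hierarchy mirrors the one described above, while `validate_project` is a hypothetical stand-in for the real validation logic:

```python
import pytest


class DbtProjectError(Exception):
    """Broad project-level error."""


class ProjectContractError(DbtProjectError):
    """Raised when dbt_project.yml violates its schema contract."""


def validate_project(project: dict) -> None:
    # Hypothetical stand-in for the real project validation.
    if "archive" in project:
        raise ProjectContractError("'archive' is not an allowed key")


def test_archive_not_allowed() -> None:
    # Asserting the subclass means an unrelated DbtProjectError raised
    # elsewhere in the code path cannot make this test pass by accident.
    with pytest.raises(ProjectContractError):
        validate_project({"archive": []})
```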
* init push for arbitrary configs for generic tests PR
* iterative work
* initial test design attempts
* test reformatting
* test rework: have basic structure for 3 of 4 passing (failing correctly though); need to figure out how to best represent the same-key error
* swap test formats to the new config dict and a mixed variety, running dbt parse and inspecting the manifest
* modify tests to get them passing, then modify the earlier TestBuilder class work to be more DRY
* add changelog
* modify code to match suggested changes around separate methods and the test id fix
* add column_name reference to init so deeper nested _render_values can use the input
* add type annotations
* feedback based on Mike's review
* Fire skipped events at debug level

  Closes #8774
* add changelog entry
* Update to work with 1.9.*.
* Add tests for --fail-fast not showing skip messages unless --debug.
* Update test that works by itself but assumes too much to work in integration tests.

Co-authored-by: Scott Gigante <scottgigante@users.noreply.github.com>
* Update semantic interface requirement
* Update setup.py
* Rename `manifest` fixture in `test_selector` to `mock_manifest`

  We have a globally available `manifest` fixture in our unit tests. In the coming commits we're going to add tests to this file which use the globally available `manifest` fixture. Prior to this commit, the locally defined `manifest` fixture was taking precedence. The easiest way around this was to rename the locally defined fixture. I had tried to isolate the locally defined fixture by moving it, and the relevant tests, to a test class like `TestNodeSelector`. However, because of _how_ the relevant tests were parameterized, this proved difficult. Basically, for readability we define a variable which holds a list of all the parameterization variables; by moving to a test class, those variables would have had to be defined directly in the parameterization macro call. Although possible, that made readability slightly worse. It might be worth doing anyway in the long run, but instead I used the less heavy-handed alternative already described.
* Improve type hinting in `tests/unit/utils/manifest.py`
* Ensure `args` get set from global flags for the `runtime_config` fixture in unit tests

  The `Compiler.compile` method accesses `self.config.args.which`. The `config` is the `RuntimeConfig` the `Compiler` was instantiated with. Our `runtime_config` fixture was being instantiated with an empty dict for the `args` property. Thus a `which` property of the args wasn't available, and if `compile` was run a runtime error would occur. To solve this, we've begun instantiating the args from the global flags via `get_flags()`. This works because we ensure the `set_test_flags` fixture runs first, which calls `set_from_args`.
* Create a `make_manifest` utility function for use in unit tests and fixture creation
* Refactor `Compiler` and `NodeSelector` tests in `test_selector.py` to use the pytest methodology
* Remove parsing tests that exist in `test_selector.py`

  We had some tests in `test_selector.py::GraphTest` that didn't add anything on top of what was already being tested elsewhere in the file, except the parsing of models. However, the tests in `test_parser.py::ModelParserTest` cover everything being tested here (and then some). Thus these tests in `test_selector.py::GraphTest` are unnecessary and can be deleted.
* Move `test__partial_parse` from `test_selector.py` to `test_manifest.py`

  There was a test `test__partial_parse` in `test_selector.py` which tested the `is_partial_parsable` functionality of the `ManifestLoader`. This doesn't really make sense in `test_selector.py`, where we are testing selectors. We test the `ManifestLoader` class in `test_manifest.py`, which seemed like a more appropriate place for the test. Additionally we renamed the test to `test_is_partial_parsable_by_version` to more accurately describe what is being tested.
* Make `test_selector`'s manifest fixture name even more specific
* Add a type hint to `expected_nodes` in `test_selector.py` tests

  In the test `tests/unit/graph/test_selector.py::TestCompiler::test_two_models_simple_ref` we have a variable `expected_nodes` that we set via a list comprehension. At a glance it isn't immediately obvious what `expected_nodes` actually is: it's a list, but a list of what? One suggestion was to explicitly write out the list of strings. However, I worry about the brittleness of doing so. That might be the way we head long term, but as a compromise for now, I've added a type hint to the variable definition.
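A minimal sketch of the fixture-shadowing problem and the rename; the manifest contents here are placeholders rather than dbt's real fixtures:

```python
import pytest


# conftest.py (simplified): a `manifest` fixture shared by all unit tests.
@pytest.fixture
def manifest() -> dict:
    return {"nodes": {"model.pkg.a": {}, "model.pkg.b": {}}}


# tests/unit/graph/test_selector.py: renamed from `manifest` to `mock_manifest`
# so it no longer shadows (takes precedence over) the shared fixture above.
@pytest.fixture
def mock_manifest() -> dict:
    return {"nodes": {}}


def test_uses_shared_manifest(manifest: dict) -> None:
    assert "model.pkg.a" in manifest["nodes"]


def test_uses_local_mock(mock_manifest: dict) -> None:
    assert mock_manifest["nodes"] == {}
```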
* update on_skip to adjust for nodes that do not have a schema
* changelog
* Add changelog entry.
* Update schemas and test fixtures for new snapshot meta-column
* Add back comment.
…1056)

* Rename `batch_info` to `previous_batch_results`
* Exclude `previous_batch_results` from serialization of model node to avoid jinja context bloat
* Drop `previous_batch_results` key from `test_manifest.py` unit tests

  In 4050e37 we began excluding `previous_batch_results` from the serialized representation of the ModelNode. As such, we no longer need to check for it in `test_manifest.py`.
* Add `batch_id` to the jinja context of microbatch batches
* Add changie doc
* Update `format_batch_start` to assume `batch_start` is always provided
* Add "runtime only" property `batch_context` to `ModelNode`

  By "runtime only" we mean that it doesn't exist on the artifact and thus won't be written out to the manifest artifact.
* Begin populating `batch_context` during materialization execution for microbatch batches
* Fix circular import
* Fix up the MicrobatchBuilder.batch_id property method
* Ensure MicrobatchModelRunner doesn't double compile batches

  We were compiling the node for each batch _twice_. Besides making microbatch models more expensive than they needed to be, the double compilation wasn't causing any issue. However, the first compilation was happening _before_ we had added the batch context information to the model node for the batch. This caused models which try to access the `batch_context` information on the model to blow up, which was undesirable. As such, we've now skipped the first compilation, similar to how SavedQuery nodes skip compilation.
* Add `__post_serialize__` method to `BatchContext` to ensure the correct dict shape

  This is weird, but necessary, I apologize. Mashumaro handles the dictification of this class via a compile-time-generated `to_dict` method based on the _typing_ of the class. By default `datetime` types are converted to strings. We don't want that; we want them to stay datetimes.
* Update tests to check for `batch_context`
* Update `resolve_event_time_filter` to use the new `batch_context`
* Stop testing for batchless compiled code for microbatch models

  In 45daec7 we stopped an extra compilation that was happening per batch prior to the batch_context being loaded. Stopping this extra compilation means that compiled sql for the microbatch model without the event time filter / batch context is no longer produced. We have discussed this and _believe_ it is okay given that this is a new node type that has not hit GA yet.
* Rename `ModelNode.batch_context` to `ModelNode.batch`
* Rename `build_batch_context` to `build_jinja_context_for_batch`

  The name `build_batch_context` was confusing because 1) we have a `BatchContext` object, which the method was not building, and 2) the method builds the jinja context for the batch. As such it felt appropriate to rename the method to more accurately communicate what it does.
* Rename test macro `invalid_batch_context_macro_sql` to `invalid_batch_jinja_context_macro_sql`

  This rename makes it clearer that the jinja context for a batch is being checked, as a batch_context has a slightly different connotation.
* Update changie doc
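A rough sketch of the `__post_serialize__` idea, assuming mashumaro's `DataClassDictMixin` hook; the field names are illustrative rather than dbt's exact `BatchContext` definition:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict

from mashumaro import DataClassDictMixin


@dataclass
class BatchContext(DataClassDictMixin):
    id: str
    event_time_start: datetime
    event_time_end: datetime

    def __post_serialize__(self, d: Dict[str, Any]) -> Dict[str, Any]:
        # mashumaro's generated to_dict() renders datetimes as ISO strings;
        # put the original datetime objects back so downstream consumers
        # (e.g. the batch jinja context) see real datetimes, not strings.
        d["event_time_start"] = self.event_time_start
        d["event_time_end"] = self.event_time_end
        return d
```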
Co-authored-by: Courtney Holcomb <courtneyeholcomb@gmail.com>
* Improve performance of select_children() and select_parents()
* Add changelog entry.
* New function to add graph edges.
* Clean up, leave out flag temporarily for testing.
* Put new test edge behavior behind flag.
* Final draft of documentation.
* microbatch: split out first and last batch to run in serial
* only run pre_hook on first batch, post_hook on last batch
* refactor: internalize parallel to RunTask._submit_batch
* Add optional `force_sequential` to `_submit_batch` to allow skipping the parallelism check
* Force last batch to run sequentially
* Force first batch to run sequentially
* Remove batch_idx check in `should_run_in_parallel`

  `should_run_in_parallel` shouldn't, and no longer needs to, take into consideration where the batch exists in a larger context. The first and last batch for a microbatch model are now forced to run sequentially by `handle_microbatch_model`.
* Begin skipping batches if the first batch fails
* Write custom `on_skip` for `MicrobatchModelRunner` to better handle when batches are skipped

  This was necessary specifically because the default on_skip set the `X of Y` part of the skipped log using the `node_index` and the `num_nodes`. If there were 2 nodes and we were on the 4th batch of the second node, we'd get a message like `SKIPPED 4 of 2...`, which didn't make much sense. In a future commit we're likely to add a custom event for logging the start, result, and skipping of batches for better readability of the logs.
* Add microbatch pre-hook, post-hook, and sequential first/last batch tests
* Fix/add tests around first batch failure vs latter batch failure
* Correct MicrobatchModelRunner.on_skip to handle skipping the entire node

  Previously `MicrobatchModelRunner.on_skip` only handled when a _batch_ of the model was being skipped. However, that method is also used when the entire microbatch model is being skipped due to an upstream node error. Because we previously _weren't_ handling this second case, it caused an unhandled runtime exception. We now check whether we're running a batch or not, and if there is no batch, we use the super's on_skip method.
* Correct conditional logic for setting pre- and post-hooks for batches

  Previously we were using an if+elif for setting pre- and post-hooks for batches, wherein the `if` matched if the batch wasn't the first batch, and the `elif` matched if the batch wasn't the last batch. The issue is that if the `if` was hit, the `elif` _wouldn't_ be hit. This caused the first batch to appropriately not run the `post-hook`, but then every batch after it would run the `post-hook`.
* Add two new event types `LogStartBatch` and `LogBatchResult`
* Update MicrobatchModelRunner to use the new batch-specific log events
* Fix event testing
* Update microbatch integration tests to catch batch-specific event types

Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
* Update single batch test case to check for generic exceptions
* Explicitly skip the final batch execution when there is only one batch

  Previously, if there was only one batch, we would try to execute _two_ batches: the first batch, and a "last" non-existent batch. This would result in an unhandled exception.
* Changie doc
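A sketch of the scheduling rule described in the last two messages (not dbt's actual runner code): the first and last batches run alone, the middle batches may run concurrently, and a single-batch model never executes a separate "last" batch:

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_batches(
    batches: List[dict],
    run_batch: Callable[[dict], None],
    run_concurrently: bool,
) -> None:
    if not batches:
        return
    first, last = batches[0], batches[-1]
    middle = batches[1:-1]

    # First batch always runs alone (pre-hooks and initial setup happen exactly once).
    run_batch(first)

    if middle:
        if run_concurrently:
            with ThreadPoolExecutor() as pool:
                list(pool.map(run_batch, middle))
        else:
            for batch in middle:
                run_batch(batch)

    # Last batch also runs alone (post-hooks fire after everything else),
    # but is skipped entirely when there is only one batch.
    if last is not first:
        run_batch(last)
```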
…it code (#11115)

* Update partial success test to assert partial successes mean that the run failed
* Update results interpretation to include `PartialSuccess` as failure status
* fix MicrobatchExecutionDebug message
* Fix typing in `describe_batch` to convince mypy `batch_start` exists when needed

Co-authored-by: Quigley Malcolm <quigley.malcolm@dbtlabs.com>
…le adapter doesn't support it (#11145)

* Begin producing a warning when attempting to force concurrent batches without adapter support

  Batches of microbatch models can be executed sequentially or concurrently, and we try to figure out which to do intelligently. As part of that, we implemented an override, the model config `concurrent_batches`, to allow the user to bypass _some_ of our automatic detection. However, a user _cannot_ force batches to run concurrently if the adapter doesn't support concurrent batches (declaring support is opt-in). Thus, if an adapter _doesn't_ support running batches concurrently and a user tries to force concurrent execution via `concurrent_batches`, we need to warn the user that that isn't happening.
* Add custom event type for warning about an invalid `concurrent_batches` config
* Fire the `InvalidConcurrentBatchesConfig` warning via `warn_or_error` so it can be silenced
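A hedged sketch of that check, using plain `warnings.warn` as a stand-in for dbt's `warn_or_error` / `InvalidConcurrentBatchesConfig` event, with illustrative parameter names:

```python
import warnings


def resolve_batch_concurrency(
    user_forces_concurrency: bool, adapter_supports_concurrency: bool
) -> bool:
    """Decide whether batches may run concurrently (illustrative only)."""
    if user_forces_concurrency and not adapter_supports_concurrency:
        # The model config asked for concurrent batches, but the adapter has
        # not opted in: fall back to sequential execution and warn. The real
        # implementation fires a silenceable InvalidConcurrentBatchesConfig event.
        warnings.warn(
            "concurrent_batches=True is ignored because this adapter does not "
            "support running microbatch batches concurrently"
        )
        return False
    return user_forces_concurrency and adapter_supports_concurrency
```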
Co-authored-by: Doug Beatty <44704949+dbeatty10@users.noreply.github.com>
* roadmap post december 2024
* fix yml spacing
* fix code snippet format
See Commits and Changes for more details.
Created by pull[bot]
Can you help keep this open source service alive? 💖 Please sponsor : )