[AIRFLOW-3001] Add index 'ti_dag_date' to taskinstance #3885
I've resolved some CI issues. Please confirm or correct any problems. Thanks :)
Codecov Report
@@            Coverage Diff             @@
##           master    #3885      +/-   ##
==========================================
+ Coverage   15.22%    75.5%   +60.27%
==========================================
  Files         199      199
  Lines       15946    15947       +1
==========================================
+ Hits         2428    12040    +9612
+ Misses      13518     3907    -9611
All commits have been squashed into one. Thanks :)
Thanks @ubermen Looks good! 👍
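For readers who want to see what an index like 'ti_dag_date' does, here is a minimal sketch using an in-memory SQLite database. The table layout and the indexed columns (dag_id, execution_date) are assumptions inferred from the index name, not taken from this PR's diff, and the real change would be made through an Airflow migration rather than raw SQL.

```python
import sqlite3

# In-memory database with a simplified, hypothetical task_instance table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_instance ("
    " task_id TEXT, dag_id TEXT, execution_date TEXT, state TEXT)"
)

# Assumption: the index covers (dag_id, execution_date), as its name suggests.
conn.execute(
    "CREATE INDEX ti_dag_date ON task_instance (dag_id, execution_date)"
)

# List the indexes on the table; the second column of each row is the name.
index_names = [
    row[1] for row in conn.execute("PRAGMA index_list('task_instance')")
]

# Ask SQLite how it would run a lookup by DAG and execution date.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM task_instance "
    "WHERE dag_id = ? AND execution_date = ?",
    ("example_dag", "2018-09-01"),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)
print(index_names)
print(plan_text)
```

The query plan should mention the index by name, which indicates that lookups filtered on both columns no longer need a full table scan.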
Fixed variable for deleting resources.
[AIRFLOW-XXX] Remove residual line in Changelog (apache#3814)
[AIRFLOW-2930] Fix celery executor scheduler crash (apache#3784)
    Caused by an update in PR apache#3740.
    execute_command.apply_async(args=command, ...)
    - command is a list of short unicode strings, and the above code passes multiple arguments to a function defined as taking only one argument.
    - command = ["airflow", "run", "dag323", ...]
    - args = command = ["airflow", "run", "dag323", ...]
    - execute_command("airflow", "run", "dag323", ...) raises an error and exits.
[AIRFLOW-2854] kubernetes_pod_operator add more configuration items (apache#3697)
    * kubernetes_pod_operator add more configuration items
    * fix test_kubernetes_pod_operator test_faulty_service_account failure case
    * fix review comment issues
    * pod_operator add hostnetwork config
    * add doc example
[AIRFLOW-2994] Fix command status check in Qubole Check operator (apache#3790)
[AIRFLOW-2949] Add syntax highlight for single quote strings (apache#3795)
    * AIRFLOW-2949: Add syntax highlight for single quote strings
    * AIRFLOW-2949: Also updated new UI main.css
[AIRFLOW-2948] Arg check & better doc - SSHOperator & SFTPOperator (apache#3793)
    There may be different combinations of arguments, and some processing is done 'silently', so users may not be fully aware of it. For example:
    - The user only needs to provide either `ssh_hook` or `ssh_conn_id`, but this is not clear in the doc.
    - If both are provided, `ssh_conn_id` will be ignored.
    - If `remote_host` is provided, it will replace the `remote_host` which was defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`.
    These should be documented clearly to ensure the behaviour is transparent to users. log.info() should also be used to remind users and provide clear logs. In addition, add an instance check for ssh_hook to ensure it is of the correct type (SSHHook). Tests are updated for this PR.
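The AIRFLOW-2930 entry above describes a list being unpacked into separate positional arguments. The sketch below reproduces that behaviour without Celery: `apply_async` here is a simplified stand-in written for illustration (it only mimics how Celery treats `args` as the list of positional arguments), and `execute_command` is a hypothetical one-argument task. Wrapping the command in an outer list is one way to pass it as a single argument; the source does not state that this is the exact fix that was merged.

```python
def execute_command(command):
    """Hypothetical task that takes ONE argument: the full command list."""
    return " ".join(command)


def apply_async(task, args):
    """Simplified stand-in for Celery's apply_async.

    `args` is treated as the list of positional arguments, so it is
    unpacked when the task is finally called.
    """
    return task(*args)


command = ["airflow", "run", "dag323"]

# Passing the command list directly unpacks it into three arguments.
try:
    apply_async(execute_command, args=command)
    buggy_outcome = "ok"
except TypeError:
    buggy_outcome = "TypeError"

# Wrapping the list keeps it as a single positional argument.
fixed_outcome = apply_async(execute_command, args=[command])
print(buggy_outcome, "|", fixed_outcome)
```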
[AIRFLOW-XXX] Fix Broken Link in CONTRIBUTING.md
[AIRFLOW-2980] ReadTheDocs - Fix Missing API Reference
[AIRFLOW-2779] Make GHE auth third party licensed (apache#3803)
    This reinstates the original license.
[AIRFLOW-XXX] Add Format to list of companies (apache#3824)
[AIRFLOW-2900] Show code for packaged DAGs (apache#3749)
[AIRFLOW-2983] Add prev_ds_nodash and next_ds_nodash macro (apache#3821)
[AIRFLOW-2974] Extended Databricks hook with clusters operation (apache#3817)
    Add hooks for:
    - cluster start,
    - restart,
    - terminate.
    Add unit tests for the added hooks.
    Add hooks for cluster start, restart and terminate. Add unit tests for the added hooks. Add cluster_id variable for performing cluster operation tests.
[AIRFLOW-2951] Update dag_run table end_date when state change (apache#3798)
    Currently Airflow only changes the dag_run table end_date value when a user terminates a dag in the web UI. The end_date is not updated if Airflow detects that a dag has finished and updates its state. This commit adds an end_date update in DagRun's set_state function to fix the problem mentioned above.
[AIRFLOW-2145] fix deadlock on clearing running TI (apache#3657)
    A `shutdown` task is not considered to be `unfinished`, so a dag run can deadlock when all `unfinished` downstreams are waiting on a task that's in the `shutdown` state. Fix this by considering `shutdown` to be `unfinished`, since it's not truly a terminal state.
[AIRFLOW-XXX] Fix typo in docstring of gcs_to_bq (apache#3833)
[AIRFLOW-2476] Allow tabulate up to 0.8.2 (apache#3835)
[AIRFLOW-XXX] Fix typos in faq.rst (apache#3837)
[AIRFLOW-2979] Make celery_result_backend conf Backwards compatible (apache#3832)
    (apache#2806) Renamed `celery_result_backend` to `result_backend` and broke backwards compatibility.
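A backwards-compatible lookup for a renamed option, as in the AIRFLOW-2979 entry above, could look roughly like this. It is only a sketch: the `get_result_backend` helper and the plain dict standing in for the `[celery]` config section are hypothetical, and Airflow's own configuration code may handle the fallback differently.

```python
import warnings


def get_result_backend(celery_section):
    """Return the result backend, accepting the old option name as a fallback.

    `celery_section` is a plain dict standing in for the [celery] section
    of airflow.cfg (an assumption made for this example).
    """
    if "result_backend" in celery_section:
        return celery_section["result_backend"]
    if "celery_result_backend" in celery_section:
        # Old name still works, but tell the user to migrate.
        warnings.warn(
            "celery_result_backend is deprecated; use result_backend",
            DeprecationWarning,
        )
        return celery_section["celery_result_backend"]
    return None


new_style = get_result_backend({"result_backend": "db+mysql://new"})
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = get_result_backend({"celery_result_backend": "db+mysql://old"})
print(new_style, old_style, len(caught))
```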
[AIRFLOW-2866] Fix missing CSRF token head when using RBAC UI (apache#3804)
[AIRFLOW-491] Add feature to pass extra api configs to BQ Hook (apache#3733)
[AIRFLOW-3007] Update backfill example in Scheduler docs
    The scheduler docs at https://airflow.apache.org/scheduler.html#backfill-and-catchup use a deprecated way of passing `schedule_interval`. `schedule_interval` should be passed to DAG as a separate parameter and not as a default arg.
[AIRFLOW-3005] Replace 'Airbnb Airflow' with 'Apache Airflow' (apache#3845)
[AIRFLOW-3002] Fix variable & tests in GoogleCloudBucketHelper (apache#3843)
[AIRFLOW-2991] Log path to driver output after Dataproc job (apache#3827)
[AIRFLOW-XXX] Fix python3 and flake8 errors in dev/airflow-jira
    This is a script that checks whether the Jira issues marked as fixed in a release are actually merged in - getting this working is helpful to me in preparing 1.10.1
[AIRFLOW-2883] Add import and export for pool cli using JSON
[AIRFLOW-3021] Add Censys to who uses Airflow list
    > Censys
    > Find and analyze every reachable server and device on the Internet
    > https://censys.io/
    closes AIRFLOW-3021 https://issues.apache.org/jira/browse/AIRFLOW-3021
Add Branch to Company List
[AIRFLOW-3008] Move Kubernetes example DAGs to contrib
[AIRFLOW-2997] Support cluster fields in bigquery (apache#3838)
    This adds a cluster_fields argument to the bigquery hook, GCS to bigquery operator and bigquery query operators. This field requests that bigquery store the result of the query/load operation sorted according to the specified fields (the order of fields given is significant).
[AIRFLOW-XXX] Redirect FAQ `airflow[crypto]` to How-to Guides.
[AIRFLOW-XXX] Remove redundant space in Kerberos (apache#3866)
[AIRFLOW-3028] Update Text & Images in Readme.md
[AIRFLOW-1917] Trim extra newline and trailing whitespace from log (apache#3862)
[AIRFLOW-2985] Operators for S3 object copying/deleting (apache#3823)
    1. Copying: Under the hood, it's `boto3.client.copy_object()`.
    It can only handle the situation in which the S3 connection used can access both source and destination bucket/key.
    2. Deleting:
    2.1 Under the hood, it's `boto3.client.delete_objects()`. It supports deleting either a single object or multiple objects.
    2.2 If users try to delete a non-existent object, the request will still succeed, but there will be an 'Errors' entry in the response. Other causes may produce similar 'Errors' (the request itself succeeds without an explicit exception). So an argument `silent_on_errors` is added to let users decide whether this sort of 'Errors' should fail the operator.
    The corresponding methods are added to S3Hook, and these two operators are 'wrappers' around these methods.
[AIRFLOW-3030] Fix CLI docs (apache#3872)
[AIRFLOW-XXX] Update kubernetes.rst docs (apache#3875)
    Update kubernetes.rst with correct KubernetesPodOperator inputs for the volumes.
[AIRFLOW-XXX] Add Enigma to list of companies
[AIRFLOW-2965] CLI tool to show the next execution datetime
    Cover different cases:
    - If schedule_interval is "@once" or None, the following_schedule method always returns None
    - If the dag is paused, print a reminder
    - If latest_execution_date is not found, print a warning saying not applicable.
[AIRFLOW-XXX] Add Bombora Inc using Airflow
[AIRFLOW-XXX] Move Dag level access control out of 1.10 section (apache#3882)
    It isn't in 1.10 (and wasn't in this section when the PR was created).
[AIRFLOW-3012] Fix Bug when passing emails for SLA
[AIRFLOW-2797] Create Google Dataproc cluster with custom image (apache#3871)
[AIRFLOW-XXX] Updated README to include CAVA
[AIRFLOW-3035] Allow custom 'job_error_states' in dataproc ops (apache#3884)
    Allow the caller to pass a custom list of Dataproc job states into the DataProc*Operator classes that should result in the _DataProcJob.raise_error() method raising an Exception.
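The `silent_on_errors` behaviour described in the AIRFLOW-2985 entry above can be sketched as follows. The helper name `check_delete_response` and the sample response dicts are hypothetical and hand-written for illustration; only the 'Deleted' and 'Errors' keys follow the shape of a `delete_objects()` response, and no call to S3 is made.

```python
def check_delete_response(response, silent_on_errors=True):
    """Return the deleted keys, optionally failing on per-object errors.

    `response` mimics the dict returned by delete_objects(): the request
    can succeed overall while still listing failures under 'Errors'.
    """
    errors = response.get("Errors", [])
    if errors and not silent_on_errors:
        failed = ", ".join(item["Key"] for item in errors)
        raise RuntimeError("Errors when deleting: " + failed)
    return [item["Key"] for item in response.get("Deleted", [])]


# Hand-written sample response (an assumption, not captured from S3).
sample = {
    "Deleted": [{"Key": "data/a.csv"}],
    "Errors": [{"Key": "data/missing.csv", "Code": "SampleError"}],
}

deleted = check_delete_response(sample, silent_on_errors=True)

try:
    check_delete_response(sample, silent_on_errors=False)
    strict_failed = False
except RuntimeError:
    strict_failed = True
print(deleted, strict_failed)
```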
[AIRFLOW-3034]: Readme updates : Add Slack & Twitter, remove Gitter
[AIRFLOW-3056] Add happn to Airflow user list
[AIRFLOW-3052] Add logo options to Airflow (apache#3892)
[AIRFLOW-2524] Add SageMaker Batch Inference (apache#3767)
    * Fix for comments
    * Fix sensor test
    * Update non_terminal_states and failed_states to static variables of SageMakerHook
    Add SageMaker Transform Operator & Sensor
    Co-authored-by: srrajeev-aws <srrajeev@amazon.com>
[AIRFLOW-XXX] Added Jeitto as one of happy Airflow users! (apache#3902)
[AIRFLOW-XXX] Add Jeitto as one happy Airflow user!
[AIRFLOW-3044] Dataflow operators accept templated job_name param (apache#3887)
    * Default value of new job_name param is templated task_id, to match the existing behavior as much as possible.
    * Change expected value in test_mlengine_operator_utils.py to match default for new job_name param.
[AIRFLOW-2707] Validate task_log_reader on upgrade from <=1.9 (apache#3881)
    We changed the default logging config from 1.9 to 1.10, but anyone who upgrades and has an existing airflow.cfg won't know they need to change this value - instead they will get nothing displayed in the UI (ajax request fails) and see "'NoneType' object has no attribute 'read'" in the error log. This validates that config section at start up, and seamlessly upgrades the old value.
[AIRFLOW-3025] Enable specifying dns and dns_search options for DockerOperator (apache#3860)
    Enable specifying dns and dns_search options for DockerOperator
[AIRFLOW-1298] Clear UPSTREAM_FAILED using the clean cli (apache#3886)
    * [AIRFLOW-1298] Fix 'clear only_failed'
    * [AIRFLOW-1298] Fix 'clear only_failed'
[AIRFLOW-3059] Log how many rows are read from Postgres (apache#3905)
    To know how much data is being read from Postgres, it is nice to log this to the Airflow log. Previously when there was no data, it would still create a single file. This is not something that we want, and therefore we've changed this behaviour.
    Refactored the tests to make use of Postgres itself since we have it running. This makes the tests more realistic, instead of mocking everything.
[AIRFLOW-XXX] Fix typo in docs/timezone.rst (apache#3904)
[AIRFLOW-3068] Remove deprecated imports
[AIRFLOW-3036] Add relevant ECS options to ECS operator. (apache#3908)
    The ECS operator currently supports only a subset of available options for running ECS tasks. This patch adds all ECS options that could be relevant to airflow; options that wouldn't make sense here, like `count`, were skipped.
[AIRFLOW-1195] Add feature to clear tasks in Parent Dag (apache#3907)
[AIRFLOW-3073] Add note-Profiling feature not supported in new webserver (apache#3909)
    Adhoc queries and Charts features are no longer supported in the new FAB-based webserver and UI. But this is not mentioned at all in the doc "Data Profiling" (https://airflow.incubator.apache.org/profiling.html). This commit adds a note to remind users of this.
[AIRFLOW-XXX] Fix SlackWebhookOperator docs (apache#3915)
    The docs refer to `conn_id` while the actual argument is `http_conn_id`.
[AIRFLOW-1441] Fix inconsistent tutorial code (apache#2466)
[AIRFLOW-XXX] Add 90 Seconds to companies
[AIRFLOW-3096] Further reduce DaysUntilStale for probo/stale
[AIRFLOW-3072] Assign permission get_logs_with_metadata to viewer role (apache#3913)
[AIRFLOW-3090] Demote dag start/stop log messages to debug (apache#3920)
[AIRFLOW-2407] Use feature detection for reload() (apache#3298)
    * [AIRFLOW-2407] Use feature detection for reload()
    [Use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection) is a Python porting best practice that avoids a flake8 undefined name error...
    flake8 testing of https://github.com/apache/incubator-airflow on Python 3.6.3
[AIRFLOW-XXX] Fix a wrong sample bash command, a display issue & a few typos (apache#3924)
[AIRFLOW-3090] Make No tasks to consider for execution debug (apache#3923)
    During normal operation, it is not necessary to see the message. This can only be useful when debugging an issue.
AIRFLOW-2952 Fix Kubernetes CI (apache#3922)
    The current dockerised CI pipeline doesn't run minikube and the Kubernetes integration tests. This starts a Kubernetes cluster using minikube and runs k8s integration tests using docker-compose.
[AIRFLOW-2918] Fix Flake8 violations (apache#3931)
[AIRFLOW-3076] Remove preloading of MySQL testdata (apache#3911)
    Tests should be self-contained. This means that they should not depend on anything external, such as loading data. This PR uses setUp and tearDown to load the data into MySQL and remove it afterwards. This removes the actual bash mysql commands and will make it easier to dockerize the whole test suite in the future.
[AIRFLOW-2918] Remove unused imports
[AIRFLOW-3090] Specify path of key file in log message (apache#3921)
[AIRFLOW-3067] Display www_rbac Flask flash msg properly (apache#3903)
    The Flask flash messages are not displayed properly. When we don't give a category for a flash message, the default value will be 'message'. In some cases, we specify the 'error' category. Using Flask-AppBuilder, the flash message will be given a CSS class 'alert-[category]'. But we don't have 'alert-message' or 'alert-error' in the current 'bootstrap-theme.css' file. This makes the flash messages in the www_rbac UI come with no background color. This commit addresses this issue by adding 'alert-message' (using specs of existing CSS class 'alert-info') and 'alert-error' (using specs of existing CSS class 'alert-danger') into 'bootstrap-theme.css'.
[AIRFLOW-3109] Bugfix to allow user/op roles to clear task instance via UI by default
add show statements to hql filtering.
[AIRFLOW-3051] Change CLI to make users ops similar to connections
    The ability to manipulate users from the command line is a bit clunky. Currently the commands are 'airflow create_user', 'airflow delete_user' and 'airflow list_users'. It seems that these ought to be made more like connections, so that they become 'airflow users list ...', 'airflow users delete ...' and 'airflow users create ...'
[AIRFLOW-3009] Import Hashable from collections.abc to fix Python 3.7 deprecation warning (apache#3849)
[AIRFLOW-3111] Fix instructions in UPDATING.md and remove comment (apache#3944)
    artifacts in default_airflow.cfg
    - fixed incorrect instructions in UPDATING.md regarding core.log_filename_template and elasticsearch.elasticsearch_log_id_template
    - removed comments referencing "additional curly braces" from default_airflow.cfg since they're irrelevant to the rendered airflow.cfg
[AIRFLOW-3117] Add instructions to allow GPL dependency (apache#3949)
    The installation instructions failed to mention how to proceed with the GPL dependency. For those who are not concerned by GPL, it is useful to know how to proceed with the GPL dependency.
[AIRFLOW-XXX] Add Square to the companies lists
[AIRFLOW-XXX] Add Fathom Health to readme
[AIRFLOW-XXX] Pin Click to 6.7 to Fix CI (apache#3962)
[AIRFLOW-XXX] Fix SlackWebhookOperator execute method comment (apache#3963)
[AIRFLOW-3100][AIRFLOW-3101] Improve docker compose local testing (apache#3933)
[AIRFLOW-3127] Fix out-dated doc for Celery SSL (apache#3967)
    Now in `airflow.cfg`, for Celery-SSL, the item names are "ssl_active", "ssl_key", "ssl_cert", and "ssl_cacert".
    (since PR https://github.com/apache/incubator-airflow/pull/2806/files) But in the documentation https://airflow.incubator.apache.org/security.html?highlight=celery or https://github.com/apache/incubator-airflow/blob/master/docs/security.rst, it's "CELERY_SSL_ACTIVE", "CELERY_SSL_KEY", "CELERY_SSL_CERT", and "CELERY_SSL_CACERT", which is out-dated and may confuse readers.
[AIRFLOW-XXX] Fix PythonVirtualenvOperator tests (apache#3968)
    The recent update to the CI image changed the default python from python2 to python3. The PythonVirtualenvOperator tests expected python2 as default and fail due to serialisation errors.
[AIRFLOW-2952] Fix Kubernetes CI (apache#3957)
    - Update outdated cli command to create user
    - Remove `airflow/example_dags_kubernetes` as the dag already exists in `contrib/example_dags/`
    - Update the path to copy K8s dags
[AIRFLOW-3104] Add .airflowignore info into doc (apache#3939)
    .airflowignore is a nice feature, but it was not mentioned at all in the documentation.
[AIRFLOW-XXX] Add Delete for CLI Example in UPDATING.md
[AIRFLOW-3123] Use a stack for DAG context management (apache#3956)
[AIRFLOW-3125] Monitor Task Instances creation rates (apache#3966)
    Monitor Task Instance creation rates by Operator type. These stats can provide some visibility into how much workload Airflow is getting. They can be used for resource allocation in the long run (i.e. to determine when we should scale up workers) and for debugging in scenarios such as a spike in the creation rate of a certain type of Task Instance.
[AIRFLOW-3129] Backfill mysql hook unit tests. (apache#3970)
[AIRFLOW-3124] Fix RBAC webserver debug mode (apache#3958)
[AIRFLOW-XXX] Add Compass to companies list (apache#3972)
    We're using Airflow at Compass now.
[AIRFLOW-XXX] Speed up DagBagTest cases (apache#3974)
    I noticed that many of the DagBag tests operate on a specific DAG only, and don't need to load the example or test dags. Not loading the dags we don't need shaves about 10-20s off the test time.
[AIRFLOW-2912] Add Deploy and Delete operators for GCF (apache#3969)
    Both Deploy and Delete operators interact with Google Cloud Functions to manage functions. Both are idempotent and make use of GcfHook, a hook that encapsulates communication with GCP over the GCP API.
[AIRFLOW-1390] Update Alembic to 0.9 (apache#3935)
[AIRFLOW-2238] Update PR tool to remove outdated info (apache#3978)
[AIRFLOW-XXX] Don't spam test logs with "bad cron expression" messages (apache#3973)
    We needed these test dags to check the behaviour of invalid cron expressions, but by default we were loading them every time we create a DagBag (which many, many tests do). Instead we ignore these known-bad dags by default, and the test checking those (tests/models.py:DagBagTest.test_process_file_cron_validity_check) is already explicitly processing those DAGs directly, so it remains tested.
[AIRFLOW-XXX] Fix undocumented params in S3_hook
    Some function parameters were undocumented. Additional docstrings were added for clarity.
[AIRFLOW-3079] Improve migration scripts to support MSSQL Server (apache#3964)
    There were two problems for MSSQL. First, the 'timestamp' data type in MSSQL Server is essentially a row-id, and not a timezone enabled date/time stamp. Second, alembic creates invalid SQL when applying the 0/1 constraint to boolean values. MSSQL should enforce this constraint by simply asserting a boolean value.
[AIRFLOW-XXX] Add DoorDash to README.md (apache#3980)
    DoorDash uses Airflow https://softwareengineeringdaily.com/2018/09/28/doordash/
[AIRFLOW-3062] Add Qubole in integration docs (apache#3946)
[AIRFLOW-3129] Improve test coverage of airflow.models. (apache#3982)
[AIRFLOW-2574] Cope with '%' in SQLA DSN when running migrations (apache#3787)
    Alembic uses a ConfigParser like Airflow does, and "%" is a special value in there, so we need to escape it. As per the Alembic docs:
    > Note that this value is passed to ConfigParser.set, which supports
    > variable interpolation using pyformat (e.g.
    > `%(some_value)s`). A raw percent sign not part of an interpolation
    > symbol must therefore be escaped, e.g. `%%`
[AIRFLOW-3137] Make ProxyFix middleware optional. (apache#3983)
    The ProxyFix middleware should only be used when airflow is running behind a trusted proxy. This patch adds a `USE_PROXY_FIX` flag that defaults to `False`.
[AIRFLOW-3004] Add config disabling scheduler cron (apache#3899)
[AIRFLOW-3103][AIRFLOW-3147] Update flask-appbuilder (apache#3937)
[AIRFLOW-XXX] Fixing the issue in Documentation (apache#3998)
    Fixing the operator name from DataFlowOperation to DataFlowJavaOperator in Documentation
[AIRFLOW-3088] Include slack-compatible emoji image
[AIRFLOW-3161] fix TaskInstance log link in RBAC UI
[AIRFLOW-3148] Remove unnecessary arg "parameters" in RedshiftToS3Transfer (apache#3995)
    "Parameters" are used to help render the SQL command. But in this operator, only "schema" and "table" are needed. There is no SQL command to render. By checking the code, we can also find that the argument "parameters" is never really used. (Fix a minor issue in the docstring as well)
[AIRFLOW-3159] Update GCS logging docs for latest code (apache#3952)
[AIRFLOW-XXX] Fix airflow.models.DAG docstring mistake
    Closes apache#4004 from Sambeth/sambeth
[AIRFLOW-XXX] Adding Home Depot as users of Apache airflow (apache#4013)
    * Adding Home Depot as users of Apache airflow
[AIRFLOW-XXX] Added ThoughtWorks as user of Airflow in README (apache#4012)
[AIRFLOW-XXX] Added DataCamp to list of companies in README (apache#4009)
[AIRFLOW-3165] Document interpolation of '%' and warn (apache#4007)
[AIRFLOW-3099] Complete list of optional airflow.cfg sections (apache#4002)
[AIRFLOW-3162] Fix HttpHook URL parse error when port is specified (apache#4001)
[AIRFLOW-3055] add get_dataset and get_datasets_list to bigquery_hook (apache#3894)
    * [AIRFLOW-3055] add get_dataset and get_datasets_list to bigquery_hook
[AIRFLOW-3141] Add missing sensor tests.
(apache#3991)
    Fixed string encoding error and updated with master.
[AIRFLOW-XXX] Fix wrong {{ next_ds }} description (apache#4017)
[AIRFLOW-XXX] Fix Typo in SFTPOperator docstring (apache#4016)
(since PR https://github.com/apache/incubator-airflow/pull/2806/files) But in the documentation https://airflow.incubator.apache.org/security.html?highlight=celery or https://github.com/apache/incubator-airflow/blob/master/docs/security.rst, it's "CELERY_SSL_ACTIVE", "CELERY_SSL_KEY", "CELERY_SSL_CERT", and "CELERY_SSL_CACERT", which is out-dated and may confuse readers. [AIRFLOW-XXX] Fix PythonVirtualenvOperator tests (apache#3968) The recent update to the CI image changed the default python from python2 to python3. The PythonVirtualenvOperator tests expected python2 as default and fail due to serialisation errors. [AIRFLOW-2952] Fix Kubernetes CI (apache#3957) - Update outdated cli command to create user - Remove `airflow/example_dags_kubernetes` as the dag already exists in `contrib/example_dags/` - Update the path to copy K8s dags [AIRFLOW-3104] Add .airflowignore info into doc (apache#3939) .airflowignore is a nice feature, but it was not mentioned at all in the documentation. [AIRFLOW-XXX] Add Delete for CLI Example in UPDATING.md [AIRFLOW-3123] Use a stack for DAG context management (apache#3956) [AIRFLOW-3125] Monitor Task Instances creation rates (apache#3966) Montor Task Instances creation rates by Operator type. These stats can provide some visibility on how much workload Airflow is getting. They can be used for resource allocation in the long run (i.e. to determine when we should scale up workers) and debugging in scenarios like the creation rate of certain type of Task Instances spikes. [AIRFLOW-3129] Backfill mysql hook unit tests. (apache#3970) [AIRFLOW-3124] Fix RBAC webserver debug mode (apache#3958) [AIRFLOW-XXX] Add Compass to companies list (apache#3972) We're using Airflow at Compass now. [AIRFLOW-XXX] Speed up DagBagTest cases (apache#3974) I noticed that many of the tests of DagBags operate on a specific DAG only, and don't need to load the example or test dags. By not loading the dags we don't need to this shaves about 10-20s of test time. 
[AIRFLOW-2912] Add Deploy and Delete operators for GCF (apache#3969) Both Deploy and Delete operators interact with Google Cloud Functions to manage functions. Both are idempotent and make use of GcfHook - hook that encapsulates communication with GCP over GCP API. [AIRFLOW-1390] Update Alembic to 0.9 (apache#3935) [AIRFLOW-2238] Update PR tool to remove outdated info (apache#3978) [AIRFLOW-XXX] Don't spam test logs with "bad cron expression" messages (apache#3973) We needed these test dags to check the behaviour of invalid cron expressions, but by default we were loading them every time we create a DagBag (which many, many tests to). Instead we ignore these known-bad dags by default, and the test checking those (tests/models.py:DagBagTest.test_process_file_cron_validity_check) is already explicitly processing those DAGs directly, so it remains tested. [AIRFLOW-XXX] Fix undocumented params in S3_hook Some function parameters were undocumented. Additional docstrings were added for clarity. [AIRFLOW-3079] Improve migration scripts to support MSSQL Server (apache#3964) There were two problems for MSSQL. First, 'timestamp' data type in MSSQL Server is essentially a row-id, and not a timezone enabled date/time stamp. Second, alembic creates invalid SQL when applying the 0/1 constraint to boolean values. MSSQL should enforce this constraint by simply asserting a boolean value. [AIRFLOW-XXX] Add DoorDash to README.md (apache#3980) DoorDash uses Airflow https://softwareengineeringdaily.com/2018/09/28/doordash/ [AIRFLOW-3062] Add Qubole in integration docs (apache#3946) [AIRFLOW-3129] Improve test coverage of airflow.models. (apache#3982) [AIRFLOW-2574] Cope with '%' in SQLA DSN when running migrations (apache#3787) Alembic uses a ConfigParser like Airflow does, and "%% is a special value in there, so we need to escape it. As per the Alembic docs: > Note that this value is passed to ConfigParser.set, which supports > variable interpolation using pyformat (e.g. 
`%(some_value)s`). A raw > percent sign not part of an interpolation symbol must therefore be > escaped, e.g. `%%` [AIRFLOW-3137] Make ProxyFix middleware optional. (apache#3983) The ProxyFix middleware should only be used when airflow is running behind a trusted proxy. This patch adds a `USE_PROXY_FIX` flag that defaults to `False`. [AIRFLOW-3004] Add config disabling scheduler cron (apache#3899) [AIRFLOW-3103][AIRFLOW-3147] Update flask-appbuilder (apache#3937) [AIRFLOW-XXX] Fixing the issue in Documentation (apache#3998) Fixing the operator name from DataFlowOperation to DataFlowJavaOperator in Documentation [AIRFLOW-3088] Include slack-compatible emoji image [AIRFLOW-3161] fix TaskInstance log link in RBAC UI [AIRFLOW-3148] Remove unnecessary arg "parameters" in RedshiftToS3Transfer (apache#3995) "Parameters" are used to help render the SQL command. But in this operator, only "schema" and "table" are needed. There is no SQL command to render. By checking the code,we can also find argument "parameters" is never really used. (Fix a minor issue in the docstring as well) [AIRFLOW-3159] Update GCS logging docs for latest code (apache#3952) [AIRFLOW-XXX] Fix airflow.models.DAG docstring mistake Closes apache#4004 from Sambeth/sambeth [AIRFLOW-XXX] Adding Home Depot as users of Apache airflow (apache#4013) * Adding Home Depot as users of Apache airflow [AIRFLOW-XXX] Added ThoughtWorks as user of Airflow in README (apache#4012) [AIRFLOW-XXX] Added DataCamp to list of companies in README (apache#4009) [AIRFLOW-3165] Document interpolation of '%' and warn (apache#4007) [AIRFLOW-3099] Complete list of optional airflow.cfg sections (apache#4002) [AIRFLOW-3162] Fix HttpHook URL parse error when port is specified (apache#4001) [AIRFLOW-3055] add get_dataset and get_datasets_list to bigquery_hook (apache#3894) * [AIRFLOW-3055] add get_dataset and get_datasets_list to bigquery_hook [AIRFLOW-3141] Add missing missing sensor tests. 
(apache#3991) [AIRFLOW-XXX] Fix wrong {{ next_ds }} description (apache#4017) [AIRFLOW-XXX] Fix Typo in SFTPOperator docstring (apache#4016) Addressed changes from comments made in the PR. [AIRFLOW-3139] include parameters into log.info in SQL operators, if any (apache#3986) For all SQL-operators based on DbApiHook, sql command itself is printed into log.info. But if parameters are used for the sql command, the parameters would not be included in the printing. This makes the log less useful. This commit ensures that the parameters are also printed into the log.info, if any. [AIRFLOW-XXX] Include Danamica in list of companies using Airflow (apache#4019) [AIRFLOW-XXX] Update manage-connections.rst (apache#4020) Explain how to connect with MySQL [AIRFLOW-XXX] Add CarLabs to companies list (apache#4021) [AIRFLOW-3175] Fix docstring format in airflow/jobs.py (apache#4025) These docstrings could not parsed properly in Sphinx syntax [AIRFLOW-3086] Add extras group for google auth to setup.py. (apache#3917) To clarify installation instructions for the google auth backend, add an install group to `setup.py` that installs dependencies google auth via `pip install apache-airflow[google_auth]`. [AIRFLOW-XXX] Include Pagar.me in list of users of Airflow (apache#4026) [AIRFLOW-3173] Add _cmd options for password config options (apache#4024) There were a few more "password" config options added over the last few months that didn't have _cmd options. Any config option that is a password should be able to be provided via a _cmd version. [AIRFLOW-3078] Basic operators for Google Compute Engine (apache#4022) Add GceInstanceStartOperator, GceInstanceStopOperator and GceSetMachineTypeOperator. 
Each operator includes: - core logic - input params validation - unit tests - presence in the example DAG - docstrings - How-to and Integration documentation Additionally, in GceHook error checking if response is 200 OK was added: Some types of errors are only visible in the response's "error" field and the overall HTTP response is 200 OK. That is why apart from checking if status is "done" we also check if "error" is empty, and if not an exception is raised with error message extracted from the "error" field of the response. In this commit we also separated out Body Field Validator to separate module in tools - this way it can be reused between various GCP operators, it has proven to be usable in at least two of them now. Co-authored-by: sprzedwojski <szymon.przedwojski@polidea.com> Co-authored-by: potiuk <jarek.potiuk@polidea.com> [AIRFLOW-3168] More resillient database use in CI (apache#4014) Make sure mysql is available before calling it in CI [AIRFLOW-3177] Change scheduler_heartbeat from gauge to counter (apache#4027) This updates the scheduler_heartbeat metric from a gauge to a counter to better support the statsd_exporter for usage with Prometheus. A counter allows users to track the rate of the heartbeat, and integrates with the exporter better. A crashing or down scheduler will no longer emit the metric, but the statsd_exporter will continue to show a 1 for the metric value. This fixes that issue because a counter will continually change, and the lack of change indicates an issue with the scheduler. Add statsd change notice in UPDATING.md [AIRFLOW-2956] Add kubernetes tolerations (apache#3806) [AIRFLOW-3183] Fix bug in DagFileProcessorManager.max_runs_reached() (apache#4031) The condition is intended to ensure the function will return False if any file's run_count is still smaller than max_run. But the operator used here is "!=". Instead, it should be "<". 
This is because in DagFileProcessorManager, there is no statement helping limit the upper limit of run_count. It's possible that files' run_count will be bigger than max_run. In such case, max_runs_reached() method may fail its purpose. [AIRFLOW-3099] Don't ever warn about missing sections of config (apache#4028) Rather than looping through and setting each config variable individually, and having to know which sections are optional and which aren't, instead we can just call a single function on ConfigParser and it will read the config from the dict, and more importantly here, never error about missing sections - it will just create them as needed. [AIRFLOW-1837] Respect task start_date when different from dag's (apache#4010) Currently task instances get created and scheduled based on the DAG's start date rather than their own. This commit adds a check before creating a task instance to see that the start date is not after the execution date. [AIRFLOW-3089] Drop hard-coded url scheme in google auth redirect. (apache#3919) The google auth provider hard-codes the `_scheme` in the callback url to `https` so that airflow generates correct urls when run behind a proxy that terminates tls. But this means that google auth can't be used when running without https--for example, during local development. Also, hard-coding `_scheme` isn't the correct solution to the problem of running behind a proxy. Instead, the proxy should be configured to set the `X-Forwarded-Proto` header to `https`; Flask interprets this header and generates the appropriate callback url without hard-coding the scheme. [AIRFLOW-XXX] Add Grab to companies list (apache#4041) [AIRFLOW-3178] Handle percents signs in configs for airflow run (apache#4029) * [AIRFLOW-3178] Don't mask defaults() function from ConfigParser ConfigParser (the base class for AirflowConfigParser) expects defaults() to be a function - so when we re-assign it to be a property some of the methods from ConfigParser no longer work. 
* [AIRFLOW-3178] Correctly escape percent signs when creating temp config Otherwise we have a problem when we come to use those values. * [AIRFLOW-3178] Use os.chmod instead of shelling out There's no need to run another process for a built-in Python function. This also removes a possible race condition that would make the temporary config file readable by more than the airflow or run-as user. The exact behaviour would depend on the umask we run under and the primary group of our user; likely this would mean the file was readable by members of the airflow group (which in most cases would be just the airflow user). To remove any such possibility we chmod the file before we write to it. [AIRFLOW-2216] Use profile for AWS hook if S3 config file provided in aws_default connection extra parameters (apache#4011) Use profile for AWS hook if S3 config file provided in aws_default connection extra parameters Add test to validate profile set [AIRFLOW-3001] Add index 'ti_dag_date' to taskinstance (apache#3885) To optimize query performance [AIRFLOW-2794] Add WasbDeleteBlobOperator (apache#3961) Deleting Azure blobs is now supported. Either single blobs can be deleted, or one can choose to supply a prefix, in which case one can match multiple blobs to be deleted. [AIRFLOW-3138] Use current data type for migrations (apache#3985) * Use timestamp instead of timestamp with timezone for migration.
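The AIRFLOW-3178 entries above describe two ideas: doubling percent signs before values go through ConfigParser, and restricting the temp config file's permissions before anything is written to it. The following is a minimal sketch of both, not the code from that commit. The function name `write_private_config` and the sample connection string are hypothetical. On POSIX, `tempfile.mkstemp` already creates the file with owner-only permissions, so the explicit `os.chmod` here only mirrors the ordering the commit message describes.

```python
import configparser
import os
import stat
import tempfile


def write_private_config(values):
    """Write {section: {key: value}} to a temp file readable only by its owner.

    Hypothetical helper for illustration; not part of Airflow.
    """
    parser = configparser.ConfigParser()
    for section, options in values.items():
        parser.add_section(section)
        for key, value in options.items():
            # A raw '%' would be treated as the start of an interpolation,
            # so double it before handing the value to ConfigParser.
            parser.set(section, key, value.replace("%", "%%"))

    fd, path = tempfile.mkstemp(suffix=".cfg")
    # Restrict permissions before any content is written to the file.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    with os.fdopen(fd, "w") as handle:
        parser.write(handle)
    return path


path = write_private_config(
    {"core": {"sql_alchemy_conn": "mysql://user:p%40ss@localhost/airflow"}}
)

# Reading the file back undoes the escaping.
reader = configparser.ConfigParser()
reader.read(path)
print(reader.get("core", "sql_alchemy_conn"))
```

Doing the permission change in-process also avoids spawning a separate `chmod` process, which is the other point the commit message makes.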
[AIRFLOW-393] Add callback for FTP downloads (apache#2372) [AIRFLOW-3119] Enable debugging with Celery (apache#3950) This will enable --loglevel when launching a celery worker and inherit that LOGGING_LEVEL setting from airflow.cfg [AIRFLOW-3112] Make SFTP hook inherit SSH hook (apache#3945) This is to align the arguments of SFTP hook with SSH hook [AIRFLOW-3195] Log query and task_id in druid-hook (apache#4018) Log query and task_id in druid-hook [AIRFLOW-3187] Update airflow.gif file with a slower version (apache#4033) [AIRFLOW-2789] Create single node DataProc cluster (apache#4015) Create single node cluster - infer from num_workers
To optimize query performance
Thanks @ubermen for this fix. We experienced increasing load on our DB (Postgres 10, RDS m4.large, 1.7M rows in the task_instance table) and slower task scheduling. After analysis, this query was identified as the cause. After creating the index, the load went down and tasks are scheduled quickly again.
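To illustrate why such an index helps, here is a small sketch using an in-memory SQLite database as a stand-in for Postgres. The table is a simplified, hypothetical version of `task_instance`, and the indexed columns (`dag_id`, `execution_date`) are an assumption inferred from the index name `ti_dag_date`; the page itself does not list them. The query plan shows whether a lookup by DAG and date can use the index rather than scanning the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in for the task_instance table (columns are assumptions).
conn.execute(
    """
    CREATE TABLE task_instance (
        task_id TEXT NOT NULL,
        dag_id TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        state TEXT,
        PRIMARY KEY (task_id, dag_id, execution_date)
    )
    """
)

# The index this PR is named after; column choice is assumed from the name.
conn.execute(
    "CREATE INDEX ti_dag_date ON task_instance (dag_id, execution_date)"
)

# Ask SQLite how it would run a lookup by DAG and execution date.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT task_id, state FROM task_instance "
    "WHERE dag_id = ? AND execution_date = ?",
    ("example_dag", "2018-09-16T00:00:00"),
).fetchall()

# The last column of each plan row is the human-readable detail string.
plan_text = " ".join(str(row[-1]) for row in plan)
print(plan_text)
```

The primary key here leads with `task_id`, so without the extra index a filter on `dag_id` and `execution_date` alone has no index prefix to search on. Postgres has its own planner and `EXPLAIN` output, so this only demonstrates the general idea.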
commit 5b95be403a4ca8e1d163d65b69a4c609d416b760 Author: Chris Fei <chris@indicative.com> Date: Thu Jan 24 13:06:55 2019 -0500 Added custom file support in code view commit 666f1f103f6dda0f31217677e67629301f01dbdc Author: Chris Fei <chris@indicative.com> Date: Wed Jan 23 18:42:37 2019 -0500 compat with older mysql commit c51cc139f838125d908b7022b6449e00e79545b9 Author: Chris Fei <chris@indicative.com> Date: Wed Jan 23 10:58:40 2019 -0500 Added patch info commit 5a041ad90e02cad9b227b2817eb177a91afcf9fb Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 19 16:17:55 2019 +0000 Add Changes to CHANGELOG commit 346dede8ace2a0eb77360deb352b043b354e515f Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 19 15:04:17 2019 +0000 Fix issue when trying to edit connection in RBAC UI commit 6a637d24ece295520e1ec99650f5c48862a570b7 Author: Fokko Driesprong <fokkodriesprong@godatadriven.com> Date: Mon Oct 15 07:16:29 2018 +0200 Make flake8 compliant One voilation that slipped in by PR that didn't rebase onto latest master commit 5dbda81064106ab1b9e7b94707fcc61772edb3a5 Author: ubermen <kjh3477@gmail.com> Date: Sun Sep 16 05:01:03 2018 +0900 Clear UPSTREAM_FAILED using the clean cli (#3886) * [AIRFLOW-1298] Fix 'clear only_failed' * [AIRFLOW-1298] Fix 'clear only_failed' commit 3d87232efbdefdffe504a0cdf394dfd1262b98c7 Author: Xiaodong <xd_deng@hotmail.com> Date: Sun Sep 16 20:38:09 2018 +0800 Refine web UI authentication-related docs (#3863) commit 55ccae87e72239197900793d1dee270b079f14e3 Author: Nathaniel Ritholtz <nritholtz@gmail.com> Date: Thu Sep 27 15:43:26 2018 -0400 Fix SlackWebhookOperator execute method comment (#3963) commit afaa7cfda07851d632feec0ce089d561abbe3b56 Author: Mingye Xia <mingye.xia@outlook.com> Date: Fri Sep 28 10:07:43 2018 -0700 Monitor Task Instances creation rates (#3966) Montor Task Instances creation rates by Operator type. These stats can provide some visibility on how much workload Airflow is getting. 
They can be used for resource allocation in the long run (i.e. to determine when we should scale up workers) and debugging in scenarios like the creation rate of certain type of Task Instances spikes. commit e189fbdabb30fcbc463e9b81d5b9dd65f05088ff Author: Szymon Bilinski <szymon.bilinski@gmail.com> Date: Sat Sep 29 15:45:37 2018 +0200 Fix undocumented params in S3_hook Some function parameters were undocumented. Additional docstrings were added for clarity. commit 0d1aed9bb3a372e17d83339b2a2f8d8a67b808a3 Author: Santhoshkumar. P <sann3@users.noreply.github.com> Date: Thu Oct 4 22:50:48 2018 +0530 Fixing the issue in Documentation (#3998) Fixing the operator name from DataFlowOperation to DataFlowJavaOperator in Documentation commit e46886608546423b1e575a97acaf0bd8322afeb4 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Mon Oct 8 13:22:03 2018 +0100 Fix Typo in SFTPOperator docstring (#4016) commit aed387b97b2f63d8bdee3e9d96fdc83e4caa3d43 Author: marengaz <marengaz@users.noreply.github.com> Date: Mon Oct 29 14:28:52 2018 +0000 Correct misleading BigQuery error (#4098) commit d7b24721a98460eb917de3fae4796348412e4619 Author: mishikaSingh <mishikaps@gmail.com> Date: Wed Oct 31 18:51:28 2018 +0530 Catch transient DB exceptions from scheduler's heartbeat it does not crash (#3650) If there is any issue in DB connection then rest of the functions take care of those exceptions but in heartbeat of scheduler, there is no handling for this kind of situation. Airflow Scheduler should not crash if a "transient" DB exception occurs in the heartbeat of scheduler. 
commit f4503047520f478e604cec60281b475dfc199ad5 Author: Marcin Szymański <ms32035@gmail.com> Date: Tue Nov 13 14:37:57 2018 +0100 fix list processing in resolve_template_files (#4086) * [AIRFLOW-3245] fix list processing in resolve_template_files * [AIRFLOW-3245] add tests * [AIRFLOW-3245] modify tests commit a77f51081fe841466f715f288f6f34ec40f8e5ba Author: Nicholas Huang <nicholas.ykhuang@gmail.com> Date: Tue Nov 20 01:15:32 2018 -0800 AIRFLOW-XXX Fix copy&paste mistake (#4212) In emr_create_job_flow_operator.py the :type clearly mismatches with the :param name, suggesting a copy&paste mistake. commit 384845d3db4fa4cd3e13e5c9059f4a8642112be0 Author: Ryan Yuan <ryan.yuan@outlook.com> Date: Thu Nov 22 22:58:32 2018 +1100 Fix incorrect docstring in DatastoreHook (#4222) Correct docstring in DatastoreHook commit 043a4c35ea2f22b802a70b4936e38ce155c4b325 Author: rmn36 <rmn36@case.edu> Date: Fri Nov 23 10:41:04 2018 -0800 Add new TriggerRule for 0 upstream failures (#4182) Add new TriggerRule that triggers only if all upstream do not fail (success or skipped tasks are allowed) commit f59111760e07938c6b42b20bc45a3dc1070a1764 Author: Victor Noël <victornoel@users.noreply.github.com> Date: Mon Nov 26 10:02:08 2018 +0100 KubernetesPodOperator does not delete on timeout failure (#4218) Signed-off-by: Victor Noel <victor.noel@brennus-analytics.com> commit 8c9a39e9a7dc7a62d2de487bd4d33ae334aed116 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Dec 8 22:31:39 2018 +0000 Fix Minor issues with Azure Cosmos Operator (#4289) - Fixed Documentation in integration.rst - Fixed Incorrect type in docstring of `AzureCosmosInsertDocumentOperator` - Added the Hook, Sensor and Operator in code.rst - Updated the name of example DAG and its filename to follow the convention commit 8ef0c9dfadca37fc399396e8c002ee119289e3d1 Author: Michal Dziemianko <michal.dziemianko@gmail.com> Date: Wed Dec 26 20:01:34 2018 +0000 Fix FTPSensor failing on error message with unexpected text. 
(#2450) * [AIRFLOW-1413] Fix FTPSensor file presence check Currently FTPSensor operates by checking text of error message returned from ftp lib. It only succeeds if the message matches the expected text. Otherwise it fails with an exception. However the message is dependend on a system, locale and possibly other factors. This patch changes the operation to inspect error code rather than message text. It also adds option to ignore certain classes of errors such as Host Unavailable that are recoverable, thus the performed action can and should be retried according to ftp spec. * [AIRFLOW-1413] Adjustments as per code review * [AIRFLOW-1413] fixing style Co-Authored-By: mdziemianko <michal.dziemianko@gmail.com> commit e30a6a8232882fa38c0f0058449f0ae5cee6f363 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Sep 5 23:24:56 2018 +0100 Fix Minor issues in Documentation commit 208728c1d13b880a08e75e359786de812350ae66 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Oct 14 20:11:44 2018 +0100 Fix BashOperator Docstring (#4052) commit 0e0a9dda8f6cc99917136a70519d7b29911773a1 Author: Marcin Szymański <ms32035@gmail.com> Date: Thu Nov 22 22:34:46 2018 +0000 update run statistics on dag refresh (#4197) * [AIRFLOW-3348] update run statistics on dag refresh commit 152d93dc8be07fe4c020c878eef8ce7528407c71 Author: Marcus <marcuseagan@gmail.com> Date: Thu Dec 13 23:19:22 2018 -0800 removed an unused/dangerous display-none (#4295) * removed an unused display-none that is currently overriden but could resurface as a bug. 
* remove the other display none in /www commit ddd292fd79fcf250eb131e68cb618e258939d9aa Author: cclauss <cclauss@bluewin.ch> Date: Thu Sep 20 21:48:36 2018 +0200 Use feature detection for reload() (#3298) * [AIRFLOW-2407] Use feature detection for reload() [Use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection) is a Python porting best practice that avoids a flake8 undefined name error... flake8 testing of https://github.com/apache/incubator-airflow on Python 3.6.3 commit f7e6dfe7cac5012c27134bb90e08cf8f3c091701 Author: Joshua Carp <jm.carp@gmail.com> Date: Thu Oct 25 06:33:21 2018 -0400 Add SKIPPED to task states. (#4059) commit f350aff54e99dab586cd4165625fafcd1ac86907 Author: Abdul Nimeri <abdul@stripe.com> Date: Thu Jul 26 20:53:57 2018 +0200 Compress tree view JSON The tree view generates JSON that can be massive for bigger DAGs, up to 10s of MBs. The JSON is currently prettified, which both takes up more CPU time during serialization, and slows down everything else that uses it. 
Considering the JSON is only meant to be used programmatically, this is an easy win Closes #3620 from abdul-stripe/smaller-tree-view- json commit 6e349ea80abe1d3f618033aac700ab98dd4991af Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Thu Jul 26 20:55:04 2018 +0200 Respect shared datetime across tabs Closes #3615 from verdan/AIRFLOW-2766-shared- datetime commit 496b6684d66636007e2887bf66876bd9b7303ea3 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Wed Aug 8 13:47:59 2018 +0200 Enables FAB's theme support (#3719) commit 94a004cc758b81b1d863b85c782fc3d2ffab436c Author: Gabriel Silk <gabe@nomic.com> Date: Mon Sep 3 11:37:20 2018 -0700 Fix missing CSRF token head when using RBAC UI (#3804) commit 2b542f33ebbb24ff166ab07feaab79e1af1950d9 Author: Stefan Seelmann <mail@stefan-seelmann.de> Date: Thu Sep 20 20:49:20 2018 +0200 Assign permission get_logs_with_metadata to viewer role (#3913) commit 078c7b05d6231cf3d29ac226b3abd479bfa17d05 Author: Xiaodong <xd_deng@hotmail.com> Date: Tue Sep 25 02:20:17 2018 +0800 Display www_rbac Flask flash msg properly (#3903) The Flask flash messages are not displayed properly. When we don't give a category for a flash message, defautl value will be 'message'. In some cases, we specify 'error' category. Using Flask-AppBuilder, the flash message will be given a CSS class 'alert-[category]'. But We don't have 'alert-message' or 'alert-error' in the current 'bootstrap-theme.css' file. This makes the the flash messages in www_rbac UI come with no background color. This commit addresses this issue by adding 'alert-message' (using specs of existing CSS class 'alert-info') and 'alert-error' (using specs of existing CSS class 'alert-danger') into 'bootstrap-theme.css'. commit 711582538c62e789bea7e5f8b36ed27dc5a3e75a Author: Joshua Carp <jm.carp@gmail.com> Date: Mon Oct 15 09:12:33 2018 -0400 Handle duration for missing dag. 
(#3984) commit 44abb72349622bff0e1a267cf286514713dd71b1 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Mon Nov 19 18:26:18 2018 +0000 Don't publish md5 sigs as part of release (#4210) Apache recommend against publishing MD5 files now as they are relatively easy to collide and shouldn't be trusted anymore commit 48d5ef35960b88622c997fdd823aee97f971503b Author: smithakoduri <41048347+smithakoduri@users.noreply.github.com> Date: Wed Nov 14 17:45:24 2018 -0800 Fix issue with persistence of RBAC Permissions modified via UI (#4118) commit f815f13bd74ec6c71154ef68d8ef6a5c779d4317 Author: Sumit Maheshwari <sumeet.manit@gmail.com> Date: Wed Nov 7 15:42:17 2018 +0530 Small CSS fixes (#4140) * Don't highlight logout button when viewing Log tab of a task * Align Airflow logo to the center of the login page commit 6c445291240dca676fd050b416d56bbfcacf7363 Author: Zakaria EL Mesaoudi <elmesaoudee@gmail.com> Date: Wed Nov 7 21:58:15 2018 +0000 AIRFLOW-3259] Fix internal server error when displaying charts (#4114) This is caused by the fact that the function 'sort' is no longer a part of Dataframe in pandas and is still used in the code base. It has ever since been replaced by 'sort_values'. Replacing the function gets the chart display back to its normal behaviour. 
commit d08940e7cc3a78ac1509332ff091cc8d7048c40f Author: Kaxil Naik <kaxilnaik@apache.org> Date: Thu Jan 17 21:31:07 2019 +0000 Update Changelog commit a4effc1a9d6a17de1088c8537c9a182318e0421b Author: Vivek <3vivekb@gmail.com> Date: Wed Jan 16 21:57:42 2019 -0800 Fix a typo of config (#4544) commit 625ec7398995724b1b7469153ab7906226e20eec Author: Daniel Lamblin <dlamblin+github@gmail.com> Date: Fri Jan 18 00:21:37 2019 +0900 Correct Typo in sensor's exception (#4545) commit 238339b35eb963cf7b337bedd3092dbbc9208038 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Thu Jan 17 14:39:59 2019 +0000 Fix the broken refresh button on Graph View in RBAC UI commit faaba1fd401f49d811e9380ff57bbd3868c42cee Author: Kaxil Naik <kaxilnaik@apache.org> Date: Wed Jan 16 21:59:57 2019 +0000 Changelog and version for 1.10.2 commit d3ff2abde31f837ddf038c7cae232fa4d80693fb Author: Felix <feluelle@users.noreply.github.com> Date: Fri Jan 11 19:17:20 2019 +0100 Update github_enterprise_auth.py commit 1a6153eb0f3dc2445b3fbe91cd9a4976aa45e25c Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Thu Aug 30 13:30:40 2018 +0100 Make GHE auth third party licensed (#3803) commit a74331f6cc36c6f3a0ebd3a9ffb3c577d9e2d5e9 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Fri Aug 3 14:07:50 2018 +0200 Display multiple timezones on UI (#3687) commit d70dd93c99693770bd3f29d25634004bc99030d0 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Thu Jul 26 20:45:14 2018 +0200 Implement eslint for JS code check (#3641) commit 777b176624ce126c2c02c40ed56b0f357a15723c Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Wed Jul 25 14:15:30 2018 +0200 Removes unused hard-coded dagreD3 Closes #3635 from verdan/AIRFLOW-2782-dagred3-fix commit fd6217b982eb31573a7b0029b75d6525b92c9ca2 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Tue Jul 24 11:22:53 2018 +0200 Upgrades the Dagre D3 version Closes #3634 from verdan/AIRFLOW-2782-upgrade- dagre-d3 commit 
3856ea4948b2c7069681c10a785ce2db5c9e1af0 Author: Jarek Potiuk <jarek@potiuk.com> Date: Tue Jan 15 01:54:03 2019 +0100 All GCP operators have now optional GCP Project ID (#4500) commit cd746a253fe45188cbd1b9285d927565f4007a35 Author: Xiaodong <xd_deng@hotmail.com> Date: Tue Jan 15 01:34:45 2019 +0800 Support SSL Protection When Redis is Used as Broker for CeleryExecutor (#4521) From Celery 4.1 (current Airflow is using 4.1.1), "broker_use_ssl" argument starts to support Redis (earlier this argument is only supported when amqp is used for broker) (REF: https://github.com/celery/celery/blob/4.1/docs/userguide/configuration.rst). commit 53f65ed33b42c8ddce28aea6383c9d92f47d6c65 Author: Wyndham Blanton <bo.blanton@gmail.com> Date: Mon Jan 14 10:06:46 2019 -0800 - KubernetsExecutor: Need in try_number in labels if getting them later (#4163) * Need in labels if getting them later * has to be an int to match running keys - otherwise running list will never empty * pr comments * bad merge * mend pep issue * add try_numer to make_pod test commit 4d721fe96e468c2d35eab7f5cffcd0f295bdc706 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Mon Jan 14 15:59:03 2019 +0000 Escape links generated in model views (#4519) commit c834f3003a2ef16793f74c889b735c6527af5fd8 Author: Xiaodong <xd_deng@hotmail.com> Date: Mon Jan 14 18:10:04 2019 +0800 Change the lowest allowed version of "requests" (#4517) commit f62834f0871c40ad38473107fe255f1d377804c5 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Mon Jan 14 00:39:37 2019 +0000 Revert [AIRFLOW-3692] Remove ENV variables to avoid GPL (#4506) commit 111f48aa328de3781b9e7f23f703388ed6821b9d Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 20:42:16 2019 +0000 Update CHANGELOG.txt commit 8afb59e7100bd1618788474c2d7ec7b7c8e68038 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 20:36:42 2019 +0000 Add CHANGELOG & K8s to Documentation commit 2cccaef9565f3863281b99428f67338bb0657571 Author: Kaxil Naik 
<kaxilnaik@gmail.com> Date: Sun Jan 13 20:12:53 2019 +0000 Add Version info to Airflow Documentation (#4512) commit bf34820c76575bb49cfdb223a7be1f84bfa66fa6 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 19:38:56 2019 +0000 Remove Duplicates from Changelog commit 74f8ce662754b45c2ef1f7658cfca05c11efb48d Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 17:48:18 2019 +0000 Update CHANGELOG.txt commit 938442bc9597d21bbcf6ecbad3c3c8956aa649bd Author: Gabriel Nicolas Avellaneda <avellaneda.gabriel@gmail.com> Date: Wed Dec 5 17:55:38 2018 -0200 Add Kubernetes Dependency in Extra Packages Doc (#4281) commit 9e01ac740cde14ec928bca5ea89a4f1d21934401 Author: Joshua Carp <jm.carp@gmail.com> Date: Tue Oct 9 11:14:07 2018 -0400 Add extras group for google auth to setup.py. (#3917) To clarify installation instructions for the google auth backend, add an install group to `setup.py` that installs dependencies google auth via `pip install apache-airflow[google_auth]`. commit 7c2b1c2173e8714ecd9b435a614bb6f66b1cbe07 Author: Naman Bhalla <namanbhalla1998@gmail.com> Date: Sat Sep 8 21:40:27 2018 +0530 Remove redundant space in Kerberos (#3866) commit bb02c0334c42825f36e1db56a3b87313927b70c4 Author: Taylor D. Edmiston <tedmiston@gmail.com> Date: Wed Aug 15 01:09:26 2018 -0400 Clean up installation extra packages table (#3750) Sort the extra packages table, use official product names, improve capitalization, and make table whitespace consistent. 
commit 0f4df115fd81f9623d1d448b1f96a94b8480bb90 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Fri Jun 22 16:35:48 2018 +0200 Add instructions to install SSH dependencies Closes #3536 from kaxil/patch-1 commit 082372692059899008e746849b00cfa985d6c91e Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 16:56:54 2019 +0000 Update CHANGELOG.txt commit a9107377d0c4c36c03246572cbd9a306716c7cd5 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Sun Jan 13 17:54:45 2019 +0100 Replace psycopg2-binary by psycopg2 (#4508) For Python packages, psycopg2 is preferred over psycopg2-binary http://initd.org/psycopg/docs/install.html#binary-install-from-pypi commit 9e09fd63726bf1cc35528350991ef2afe5b697de Author: Bryant Biggs <bryantbiggs@gmail.com> Date: Sun Dec 2 00:58:50 2018 -0500 Correct Python Version Documentation Reference (#4259) commit dc3ca9595f0ddd28f7a567230aa4e5e923dd7355 Author: Felix <feluelle@users.noreply.github.com> Date: Sun Nov 11 23:40:03 2018 +0100 Update Contributing Guide - Git Hooks (#4120) - changes pre-commit example to use methods - adds activating virtual env for python to run things like flake8 locally - changes pre-commit file to use set -e command to instantly exit if any non-zero error occurs - changes flake8 call to lint the repo instead of not only the changes files commit a7840fdf8277def04a27370673e8c9b1db996309 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Aug 29 21:44:24 2018 +0100 Fix Broken Link in CONTRIBUTING.md commit 21e3ab7116b8f6b434718d06345ddc4a240192b1 Author: BasPH <BasPH@users.noreply.github.com> Date: Sat Oct 27 15:48:25 2018 +0200 Fix incorrect statement in contributing guide (#4104) commit 18fcc5eedf357265a5de0cef2b691e1c0b32adba Author: bolkedebruin <bolkedebruin@users.noreply.github.com> Date: Sun Jan 13 13:34:00 2019 +0100 Remove ENV variables to avoid GPL (#4506) commit 539d924f058ce4bd49175562767bf25e39ac5523 Author: Taylor D. 
Edmiston <tedmiston@gmail.com> Date: Thu Jul 26 11:02:47 2018 +0200 Skip test_mark_success_no_kill in PostgreSQL on CI See mailing list thread "Flaky test case: test_mark_success_no_kill". Closes #3642 from tedmiston/test_mark_success_no_kill-postgresql commit 611b277e5f6d5d204ca1c85445b22068dae9ade6 Author: Xiaodong <xd_deng@hotmail.com> Date: Sun Jan 13 22:33:08 2019 +0800 Fix bug to set state of a task for manually-triggered DAGs (#4504) commit e63655089ecaa313b80b3c894142fafd356d2c85 Author: Xiaodong <xd_deng@hotmail.com> Date: Sun Jan 13 21:11:12 2019 +0800 Update pop-up message when deleting DAG in RBAC UI (#4505) This feature was added in https://github.com/apache/airflow/pull/4287, but the pop-up messages was only updated in airflow/www/templates/airflow/dag.html, while it should be updated for all dag.html & dags.html for both /www and /www_rbac. commit 3da675eb8c94e6446b59e3ed04cf1ca9211f4c40 Author: bolkedebruin <bolkedebruin@users.noreply.github.com> Date: Sun Jan 13 09:02:34 2019 +0100 Update notice to 2019 (#4503) commit c07d80c113be728de96a32d454e539876ea9c94a Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 01:28:05 2019 +0000 Bump version to 1.10.2b2 commit 3031387a74630a84e2ce53e971a48cde9e66d4b3 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 12 23:00:13 2019 +0000 Version 1.10.2b1 commit 8efc392697a4c31bf66d32cd7cb504076e0a2c03 Author: Kamil Breguła <mik-laj@users.noreply.github.com> Date: Sat Jan 12 20:54:43 2019 +0100 Add missing @apply_defaults decorators (#4498) commit 26dffa0f396ffedd297c9e8fdab53a3d4b1c161d Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 12 22:03:47 2019 +0000 Fix CI commit 0387008886706e21745f41ddaaec43c5b2a1d68e Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Sun Nov 11 22:22:11 2018 +0000 Speed up RBAC view tests (#4162) Not re-creating the FAB app ones per test functions took the run time of the TestAirflowBaseViews from 223s down to 53s on my laptop, _and_ made it only print the 
deprecation warning (fixed in another PR already open) once instead of 10+ times. commit 2f540911ce937321af6bef22b48a3877a712aadd Author: Fokko Driesprong <fokko@driesprong.frl> Date: Tue Jan 8 11:45:40 2019 +0100 Make sure that the session is closed (#4298) commit f399719bf86bc9cfe4478c01474960a02e0c0d81 Author: Xiaodong <xd_deng@hotmail.com> Date: Thu Nov 29 23:21:06 2018 +0800 Fix/refine tests for api/common/experimental/ (#4255) Follow-up on [AIRFLOW-3239] Related PRs: #4074, #4131 1. Fix (test_)trigger_dag.py 2. Fix (test_)mark_tasks.py 2-1. Properly name the file 2-2. Correct the name of the sample DAG 2-3. Correct the range of sample execution_dates (the earlier one conflicted with the start_date of the sample DAG) 2-4. Skip the test when running on MySQL. Something seems to be wrong with airflow.api.common.experimental.mark_tasks.set_state: the corresponding test case works on Postgres & SQLite, but fails on MySQL ("(1062, "Duplicate entry '110' for key 'PRIMARY'")"). A TODO note is added to remind us to fix it for MySQL later. 3. Remove unnecessary lines in test_pool.py commit e62866a903e439d51508c5324f72c6d5c32abf53 Author: Yingbo Wang <ybwang@gmail.com> Date: Fri Aug 31 16:49:39 2018 -0700 Update dag_run table end_date when state change (#3798) Previously, Airflow only changed the dag_run table's end_date value when a user terminated a DAG in the web UI. The end_date was not updated when Airflow detected that a DAG had finished and updated its state. This commit adds an end_date update to DagRun's set_state function to fix that problem.
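The end_date change in the last commit above can be pictured with a small stand-in class. This is only a sketch under assumptions: `DagRunSketch`, `RUNNING`, and `FINISHED_STATES` are hypothetical names for illustration, and the real `DagRun.set_state` in Airflow may differ in detail.

```python
from datetime import datetime, timezone

# Hypothetical state names for this sketch, not Airflow's State class.
RUNNING = "running"
FINISHED_STATES = {"success", "failed"}


class DagRunSketch:
    """Stand-in for a DAG run row that tracks state and end_date."""

    def __init__(self):
        self._state = RUNNING
        self.end_date = None

    def set_state(self, state):
        # Only act on real transitions so repeated calls do not move end_date.
        if self._state != state:
            self._state = state
            # Stamp end_date when the run finishes, clear it when it resumes.
            if state in FINISHED_STATES:
                self.end_date = datetime.now(timezone.utc)
            else:
                self.end_date = None


run = DagRunSketch()
run.set_state("success")
print(run.end_date is not None)
```

Putting the update inside the setter means every code path that changes the state, whether the web UI or the scheduler, gets the same end_date behaviour.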
commit 851d32846fd3590e134678adb216a0e77413b91c Author: yrqls21 <yrqls21@gmail.com> Date: Wed Aug 1 01:31:30 2018 -0700 Fix bug in set DAG run state workflow (#3606) commit 98681fefbecde6d9dcde4f49da735fce499bdd78 Author: Tao Feng <tfeng@lyft.com> Date: Sat Jan 5 06:10:31 2019 -0800 Update committer list based on latest TLP discussion (#4427) commit fc200df99e1af1f98086c0dca8eb1ed301d1d6fd Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Jan 5 16:32:12 2019 +0000 Remove remaining incubator mention & Fix CI Behaviour (#4441) commit 38b6775d3108856f83c06824b3e3099a99d589ec Author: XD-DENG <xd_deng@hotmail.com> Date: Sun Sep 9 21:06:51 2018 +0800 CLI tool to show the next execution datetime (#3834) commit b6c9c46a70a7443198a36ec79374ec1cba22046a Author: aoen <aoen@users.noreply.github.com> Date: Sat Jan 12 15:59:40 2019 +0200 Fix not being able to specify execution_date when creating dagrun (#4037) commit 090ffbc77171fa14c5d0b86df21bc74a5badec9b Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 13:44:01 2019 +0100 Consistency update in tests for All GCP-related operators (#4493) commit e30e2192c62613869888989fd303d6d615f336a6 Author: Jarek Potiuk <jarek@potiuk.com> Date: Wed Dec 5 21:35:29 2018 +0100 GCP operators documentation clarifications (#4273) commit e737f5588db7d108d98e481759a83c1c1150a720 Author: Cameron Moberg <cjmoberg@gmail.com> Date: Thu Aug 2 12:44:16 2018 -0700 Add GCP specific k8s pod operator (#3532) Executes a task in a Kubernetes pod in the specified Google Kubernetes Engine cluster. This makes it easier to interact with GCP kubernetes engine service because it encapsulates acquiring credentials. commit 179831bc72d8a447c33e45aa8b0ca3258c5f3615 Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 10:07:45 2019 +0100 Use googlapiclient for google apis (#4484) The deprecated apiclient package name is used in a number of places. This commit changes it to googleapiclient and modifies the right packages to be used instead. 
commit 5b546769598e1c6782d7155964c9b20b4bfc9945 Author: Gordon Ball <chronitis@gmail.com> Date: Mon Nov 5 15:48:11 2018 +0100 Support multipart uploads to GCS (#4084) * [AIRFLOW-3205] Support multipart uploads to GCS Cloud Storage supports resumable/multipart uploads for large files, which can be used to avoid limitations on the size of a single HTTP request, or by adding a retry behaviour, increase the reliability of large transfers. * [AIRFLOW-3205] Use only the multipart keyword This removes the chunksize keyword, using instead multipart=True for a default chunk size or multipart=int to override the default. commit 2965c0d125f89579c0abd6a1e782a47819470ade Author: Jasper Kahn <jasperakahn@gmail.com> Date: Wed Aug 8 01:04:19 2018 -0700 Add GoogleCloudKMSHook (#3677) Adds a hook enabling encryption and decryption through Google Cloud KMS. This should also contribute to AIRFLOW-2062. commit 2956727745c358045df61e1ae110798748c25492 Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 02:18:40 2019 +0100 Add required permission to CloudSQL export/import example (#4489) commit b752d33b9bb568a1682fa78c72f468358ea2ab86 Author: Szymon Przedwojski <szymon.przedwojski@gmail.com> Date: Wed Dec 5 21:33:00 2018 +0100 Google Cloud SQL import/export operator (#4251) commit 8f5145bc68a38c26c29a3b677bed87f932d03d20 Author: Stefan Seelmann <mail@stefan-seelmann.de> Date: Sat Jan 12 02:06:11 2019 +0100 Fix logs when task is in rescheduled state (#4492) commit cfc2addcfe3156118867f6cd7fca60e6b049e6fb Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 02:01:49 2019 +0100 Added Google Cloud Base Hook to documentation (#4487) commit c6d9e516b358224e181098e78d5dff1f97cc69ba Author: Felix <feluelle@users.noreply.github.com> Date: Fri Jan 11 19:17:20 2019 +0100 Unify different License Header commit 25703d468d04436409bd4845570f46c5a428daf2 Author: Mike Mole <mikemole@gmail.com> Date: Fri Jan 11 14:35:08 2019 -0500 Add AwsGlueCatalogPartitionSensor (#4112) Adds 
AwsGlueCatalogPartitionSensor and AwsGlueCatalogHook with supporting functions. Unit tests are included but rely on mocking since Moto does not yet fully support AWS Glue Catalog at this time. commit 5d3e362a1bf6a70ed9af0aa4a3e99f6af3a6e0eb Author: Ant Weiss <antweiss@users.noreply.github.com> Date: Fri Jan 11 21:04:54 2019 +0200 Remove invalid parameter KeepJobFlowAliveWhenNoSteps in example DAG (#4404) The parameter 'KeepJobFlowAliveWhenNoSteps' in JOB_FLOW_OVERRIDES doesn't pass boto API parameter validation, as it should be a part of 'Instances' object. Signed-off-by: Anton Weiss <anton@otomato.link> commit a9f70793e70a1f4cdf2d0b25a4a3806199d809a2 Author: Xiaodong <xd_deng@hotmail.com> Date: Thu Jan 10 14:58:29 2019 +0800 Refine the functionality of "/health" endpoint (#4309) commit ba49fe857670505ccaa63e64d61a34d3e50c49ad Author: Tobias Kaymak <tobias.kaymak@ricardo.ch> Date: Thu Jan 10 18:57:15 2019 +0100 Fix zendesk integration (#4466) commit df603b5582f6d51d30cf9711bb4d03e430715894 Author: Drew J. Sonne <drew.sonne@gmail.com> Date: Thu Jan 10 23:03:59 2019 +0000 Load plugins from entry_points (#4412) * [AIRFLOW-3605] Add entrypoint plugin docs This documentation came from https://github.com/apache/incubator-airflow/pull/730 which had already started work on a PR for this functionality. * [AIRFLOW-3605] Extend plugin loading functionality Added business logic to import AirflowPlugin classes through entry_points. This means we don’t have to interact with the file system directly to install plugins, and can manage them via `pip`. 
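As a rough illustration of the entry_points loading described in the last commit above, the sketch below discovers plugin classes that installed packages advertise. It is not the code from the PR: it uses the standard library's `importlib.metadata` as a stand-in, the helper name `load_entry_point_plugins` is hypothetical, and the group name `airflow.plugins` is an assumption here.

```python
from importlib.metadata import entry_points


def load_entry_point_plugins(group="airflow.plugins"):
    """Return the objects advertised under the given entry point group."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Newer Python versions expose a selectable collection.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])

    plugins = []
    for entry_point in selected:
        try:
            plugins.append(entry_point.load())
        except Exception as exc:
            # A broken plugin should not stop the others from loading.
            print("Failed to load plugin %s: %s" % (entry_point.name, exc))
    return plugins


print(load_entry_point_plugins("example.group.that.does.not.exist"))
```

A package would opt in by declaring an entry point in that group in its `setup.py`, so installing or removing it with `pip` is enough to add or remove the plugin.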
commit f26660fcd7164b43441dc20d79eb051a5f59a39c Author: Tao Feng <tfeng@lyft.com> Date: Wed Jan 9 12:23:03 2019 -0800 Rename plugins_manager.py to test_xx to trigger tests (#4464) commit 1d37beba3e1a3cf76967259c2cb819ce5b0084f6 Author: Stefan Seelmann <mail@stefan-seelmann.de> Date: Fri Jan 11 10:58:14 2019 +0100 Visualize reschedule state in all views (#4408) * [AIRFLOW-3589] Visualize reschedule state in all views * Add explicit `UP_FOR_RESCHEDULE` state * Add legend and CSS to views * [AIRFLOW-3589] Visualize reschedule state in all views * Use set or tuple instad of list * Use `with` statement for session handling commit 3b066368f579a3523d7aba563d5dd825d87f8ce3 Author: Kamil Breguła <mik-laj@users.noreply.github.com> Date: Fri Jan 11 07:46:22 2019 +0100 Docs: Fix paths to GCS transfer operator (#4479) commit 20944272a40d723238ac6b5369a2bd35f72386d0 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Wed Jan 9 23:06:42 2019 +0000 Escape links generated in model views (#4463) commit 0856fb5df5c88ea2ff13527f81c1aaacfe59c39a Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Jan 9 23:09:48 2019 +0000 Add dependency for Enum (#4468) commit 7b9e03c23356e7cd4044a43ab2574f523115c888 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Wed Jan 9 21:50:02 2019 +0000 Update Updating instructions for changes in 1.10.2 commit 8be3f469d97f1d30f6b40a9c30a476659228cc67 Author: Jarek Potiuk <jarek@potiuk.com> Date: Wed Jan 9 21:36:39 2019 +0100 Cleanup of GCP Cloud SQL Connection (#4451) commit 68e2ea0786ae2f1ff4852ab9fe1c28669ebcab48 Author: dima-asana <42555784+dima-asana@users.noreply.github.com> Date: Fri Oct 12 00:14:47 2018 -0700 Respect task start_date when different from dag's (#4010) commit a95dc3c3befe3b2725186051f894ccc1e4e2bca4 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Wed Jan 9 00:10:03 2019 +0000 Remove Flake8 Diff checker commit ec97c480eb39ecde01013e499bb54828343935d6 Author: Joshua Carp <jm.carp@gmail.com> Date: Thu Oct 4 03:20:24 2018 -0400 Update 
flask-appbuilder (#3937) commit b6b107725b915a2e6fa69d73d93fa388b047aceb Author: dima-asana <42555784+dima-asana@users.noreply.github.com> Date: Thu Oct 11 01:55:15 2018 -0700 More resillient database use in CI (#4014) commit 31630ccbf42577c814c6ce372be0a077e09cb816 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Fri Sep 21 16:37:55 2018 +0200 Remove preloading of MySQL testdata (#3911) One of the things for tests is being self contained. This means that it should not depend on anything external, such as loading data. This PR will use the setUp and tearDown to load the data into MySQL and remove it afterwards. This removes the actual bash mysql commands and will make it easier to dockerize the whole testsuite in the future commit 633829b7171234b33f244abd28539c53161e0941 Author: Kengo Seki <sekikn@apache.org> Date: Wed Aug 1 18:07:34 2018 +0900 Brush up the CI script for minikube commit 0db140666298acdfe0a9070d6bb38f876713ec8d Author: Felix <feluelle@users.noreply.github.com> Date: Tue Jan 8 10:36:52 2019 +0100 Fix example http operator (#4455) commit 72e9543f8e6cce218b124728876e2e6fc677966e Author: Kaxil Naik <kaxilnaik@apache.org> Date: Tue Jan 8 01:50:49 2019 +0000 Fix flake8 issues commit 759459a5a02b39649201fda395dfbf64f9991fe9 Author: Jeff Payne <jeffkpayne@gmail.com> Date: Wed Sep 12 14:42:13 2018 -0700 Allow custom 'job_error_states' in dataproc ops (#3884) commit f12aee6b631d465da65fec78a6e945f1654a4ecd Author: gseva <gavrilovseva@gmail.com> Date: Wed Dec 19 06:48:57 2018 -0300 Make hmsclient optional in airflow.hooks.hive_hooks (#4080) Delay the import right up until it is needed, like how we do with the thrift imports. commit 804c0b1886dabe145ba69e306c25c1d8d98cd08b Author: Kengo Seki <sekikn@apache.org> Date: Tue Aug 7 01:42:02 2018 +0900 Fix scheduler_ops_metrics.py to work (#3653) This PR fixes timezone problem in scheduler_ops_metrics.py and makes its timeout configurable. 
commit 652cab9c7bc8dfbe229536d89348365f245828e7 Author: juhwi.lee <juhwi.lee@navercorp.com> Date: Sun Jul 15 12:14:21 2018 +0200 add job properties update in hive to druid operator. Closes #3600 from happyjulie/AIRFLOW-2751 commit 8e591079499dcd495059d2668ff9de102a072fdb Author: Fokko Driesprong <fokko@driesprong.frl> Date: Sun Sep 16 14:31:10 2018 +0200 Log how many rows are read from Postgres (#3905) To know how many data is being read from Postgres, it is nice to log this to the Airflow log. Previously when there was no data, it would still create a single file. This is not something that we want, and therefore we've changed this behaviour. Refactored the tests to make use of Postgres itself since we have it running. This makes the tests more realistic, instead of mocking everything. commit 44463c10979163674d7e9cf6c1ad443e6a15ea8f Author: Kevin Yang <kevin.yang@airbnb.com> Date: Wed Jul 11 10:28:06 2018 +0200 Make task instance context available for hive queries commit d38cc5034cbf32bea4e8f50e2d5fcff770628f22 Author: Fokko Driesprong <fokkodriesprong@godatadriven.com> Date: Fri Sep 21 16:36:28 2018 +0200 Remove unused imports commit f676c887e27bcf54a21249f67623297e0719cbf1 Author: johnhofman <johncarlhofman@gmail.com> Date: Fri Sep 28 12:04:29 2018 +0200 Fix PythonVirtualenvOperator tests (#3968) The recent update to the CI image changed the default python from python2 to python3. The PythonVirtualenvOperator tests expected python2 as default and fail due to serialisation errors. 
commit 05314944fb27b8b337dc4baa856d7b1f00445539 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Fri Sep 21 16:25:54 2018 +0200 Fix Flake8 violations (#3931) commit c0f450f014c07d03271712600745bc95c75113fe Author: Fokko Driesprong <fokkodriesprong@godatadriven.com> Date: Thu Nov 8 00:02:18 2018 +0100 Make flake8 compliant commit 1291037f7fcd711e834317b6c566546ab962db41 Author: Mike Ascah <mike.ascah@joinroot.com> Date: Fri Jul 20 13:46:50 2018 +0200 Add except type to broad S3Hook try catch clauses S3Hook will silently fail if given a conn_id that does not exist. The calls to check_for_key done by an S3KeySensor will never fail if the credentials object is not configured correctly. This adds the expected ClientError exception type when performing a HEAD operation on an object that doesn't exist to the try catch statements so that other exceptions are properly raised. Closes #3616 from mascah/AIRFLOW-2771-S3hook- except-type commit d6457df20786e50dfd324ab091e1749684605695 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Tue Aug 21 00:44:36 2018 +0200 Fix Flake8 violations (#3772) commit 9bd3eac129e8c7f50d8727e2892e65416de36b55 Author: Matt Revell <nightowlmatt@gmail.com> Date: Thu Aug 2 08:43:39 2018 +0100 Handle getsource() calls gracefully commit f10699c09538c8ab8b86abaf3fd8ca0c9d3002e3 Author: BrechtDeVlieger <brechtdevlieger@hotmail.com> Date: Tue Dec 11 19:20:14 2018 +0100 Fix integrety error in rbac AirflowSecurityManager (#4305) This was caused by the variable `role` being shadowed in a loop statement. 
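The shadowing problem named in the last commit above is easy to reproduce in isolation. The sketch below is not the AirflowSecurityManager code; the role dictionaries and the `merge_perms_*` helpers are hypothetical and only show how reusing `role` as a loop variable silently changes which object gets updated.

```python
def make_roles():
    """Build fresh sample data so each call starts from the same state."""
    admin = {"name": "Admin", "perms": {"can_edit"}}
    others = [
        {"name": "Viewer", "perms": {"can_read"}},
        {"name": "Op", "perms": {"can_trigger"}},
    ]
    return admin, others


def merge_perms_buggy(role, existing_roles):
    collected = set()
    for role in existing_roles:  # BUG: rebinds the `role` argument
        collected |= role["perms"]
    role["perms"] |= collected  # updates the last loop item, not the argument
    return role["name"]


def merge_perms_fixed(role, existing_roles):
    collected = set()
    for existing in existing_roles:  # distinct name, argument stays intact
        collected |= existing["perms"]
    role["perms"] |= collected
    return role["name"]


admin, others = make_roles()
buggy_target = merge_perms_buggy(admin, others)

admin2, others2 = make_roles()
fixed_target = merge_perms_fixed(admin2, others2)

print(buggy_target, fixed_target)
```

In the buggy version the permissions end up on the wrong role, which in a database-backed model can surface later as a constraint violation rather than as an obvious error at the loop.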
commit 0bd6ff1d405919d6d165116ad5129e7c57a3b5f8 Author: Riccardo Bini <odracci@gmail.com> Date: Mon Dec 31 06:03:33 2018 +0100 Fix Kubernetes operator with git-sync (#3770) commit f70eb6d4c0df08987ed1c7ea3e36ff4909bd436f Author: Fokko Driesprong <fokko@driesprong.frl> Date: Fri Oct 12 23:22:52 2018 +0200 Make flake8 compliant (#4035) commit 049645ede236698aba0e8fdafbf2bc8e7630b4f5 Author: Israel Knight <israel.s.knight@gmail.com> Date: Thu Sep 6 00:07:28 2018 -0700 Implemented DatabricksRunNowOperator for jobs/run-now … (#3813) Add functionality to kick of a Databricks job right away. * Per feedback: fixed a documentation error, reintegrated the execute and on_kill onto the objects. * Fixed a documentation issue. commit 7cb23c3871496c2560ecd342f1ac1b3c9e5f5681 Author: Giovanni Lanzani <gglanzani@users.noreply.github.com> Date: Wed Nov 7 23:07:48 2018 +0100 Simplify Kerberos code (#3563) Some functions were not used. On top of that, the `principal_from_username` function was getting the wrong config value ("security" instead of "kerberos"). Since the results were only used by `kerberos.checkPassword`, and the function can cope with needing a realm in the `username` when `realm` is provided, we removed the `principal_from_username` function altogether. commit 28f9f7b7b9a7212a4f9aecf492124b1b5004a6b4 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Tue Jul 24 01:07:15 2018 +0100 Limit DAGs parsing to once only Closes #3614 from verdan/double-dag-parsing commit d6916e389fcf7227bea6dbe66a7656e53689cd9c Author: Kengo Seki <sekikn@apache.org> Date: Tue Jul 17 13:52:28 2018 +0100 Add subcommands to delete and list users Currently, adding user is the only operation that CLI has on RBAC. This PR adds functionality to delete and list users via CLI. 
Closes #3610 from sekikn/AIRFLOW-2750 commit add64ef1bb48a8a16db6dc71b4552e918435c57a Author: Tao feng <tfeng@lyft.com> Date: Mon Jul 16 13:13:42 2018 -0700 Airflow DAG level access (#3197) commit cbb809e9ffd350aa13c348b3557f5b387983f75a Author: Kevin Yang <kevin.yang@airbnb.com> Date: Thu Jun 28 13:30:36 2018 -0700 Add set failed for DagRun and task in tree view (#3255) commit 3b634feea05ca64321eb53af0c031b902020dae7 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Sep 5 00:46:41 2018 +0100 Move Kubernetes example DAGs to contrib commit 32e190b655362ffd1403fe29558c78af70919bb0 Author: Riccardo Bini <odracci@gmail.com> Date: Fri Sep 21 14:36:09 2018 +0200 Fix Kubernetes CI (#3922) commit 7d710f68c415e096f8566d1fef444d5b8cc124cc Author: Fokko Driesprong <fokko@driesprong.frl> Date: Sat Aug 25 19:50:16 2018 +0200 Enable Codecov on Docker-CI Build (#3780) - Add missing variables and use codecov instead of coveralls. The issue why it wasn't working was because missing environment variables. The codecov library heavily depends on the environment variables in the CI to determine how to push the reports to codecov. - Remove the explicit passing of the variables in the `tox.ini` since it is already done in the `docker-compose.yml`, having to maintain this at two places makes it brittle. - Removed the empty Codecov yml since codecov was complaining that it was unable to parse it commit 1bfe1f5a4edb582ce67e8c61dfc7319b1f6e9aef Author: Gerardo Curiel <gerardo@gerar.do> Date: Wed Aug 22 18:26:54 2018 +1000 Dockerise CI pipeline (#3393) commit 4db444226734ac50f23edd78312e8aed2432282a Author: Kaxil Naik <kaxilnaik@apache.org> Date: Mon Jan 7 21:05:50 2019 +0000 Revert "[AIRFLOW-XXX] Switch to openjdk8 in Travis tests" This reverts commit 47ab0401d9a7c369151cc1800fee6adfe0efde53. 
commit 23fe47c970abdae9aa9f66ccfb0ec05c899b24c9 Author: Kevin Pullin <kevin.pullin@gmail.com> Date: Mon Jan 7 12:46:05 2019 -0800 Support global k8s affinity and toleration configs (#4247) commit e5d11820525af25ad287c0fc7edf0a5656cd62b5 Author: Tao Feng <tfeng@lyft.com> Date: Sun Jan 6 20:45:49 2019 -0800 Fix a flake8 error to unblock CI (#4453) commit 0a94c7e3677fc4c3227bc32e2b78baeb1da7134e Author: Raja Gangopadhya <raja.gangopadhya@remix.com> Date: Sun Jan 6 13:59:55 2019 -0800 Resolve a bug in adding password_auth to api as auth method (#4343) commit 400a460b1bf1ad2c83aa233e593ef7ab9321dcb7 Author: Dana Ma <dana.ma537@gmail.com> Date: Mon Jan 7 08:51:01 2019 +1100 Add region param for EMR jobflow creation (#4418) commit 294317f2deee0a3ff517949261986c1d0da806e4 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Jan 6 21:40:54 2019 +0000 Fix test for GCS to GCS Transfer Hook (#4452) commit 5afde1d4e726f5f60b3d6cf3bf604859fe3be6b7 Author: Joshua Carp <jm.carp@gmail.com> Date: Sun Jan 6 14:35:31 2019 -0500 Add gcs to gcs transfer operator. 
(#4331) commit 4a8cbb084377e2d5e98b486b8645bc5c00023cf4 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Dec 2 11:08:26 2018 +0000 Add missing GCP operators to Docs (#4260) commit 01513d55c62f97186c168adf941c897071e9be73 Author: Tao Feng <tfeng@lyft.com> Date: Sat Jan 5 06:05:25 2019 -0800 Remove incubation/incubator mention (#4419) commit d5f50c2532c1cb08f9b09a733963b19044ae4fff Author: r39132 <siddharthanand@yahoo.com> Date: Mon Sep 10 14:30:35 2018 -0700 Readme updates : Add Slack & Twitter, remove Gitter commit 0ec010edd1c91cc4b2cae608ed29b5db9c80d14c Author: r39132 <siddharthanand@yahoo.com> Date: Thu Sep 6 11:57:15 2018 -0700 Update Text & Images in Readme.md commit 7449cf4290d698239628adcbfef412755a76496a Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Tue Sep 4 08:34:20 2018 +0100 Add badge to show supported Python versions (#3839) commit a7df3bb1bbb0def34b019dcae1376e92353a1e5d Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Thu May 24 16:20:16 2018 +0100 Update PR tool to push directly to Github commit 4a9b98cb2baeed2d0feec1d898947addd4337706 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Thu May 24 16:05:37 2018 +0100 Flake8 fixes on dev/airflow-pr commit 9f40dd9557c1aa87241c90ab29154a2105e9545d Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Sep 29 11:35:26 2018 +0100 Update PR tool to remove outdated info (#3978) commit 99ad866a7f47d58e72ea5b651ff12e8f5dcf9f71 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Sep 5 01:04:42 2018 +0100 Replace 'Airbnb Airflow' with 'Apache Airflow' (#3845) commit c99303398b11ce88d5c263777e84d960d881a29f Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Jan 5 13:15:56 2019 +0000 Fix GCP Spanner Test (#4440) commit bac9dcea19fbb7671790381aff7ce30c9bd538b0 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Jan 5 02:17:45 2019 +0000 Make execution_date templated in TriggerDagRunOperator (#4359) commit 55a96af903c9b50a494e2163d09c7a1e803e50d5 Author: aoen <aoen@users.noreply.github.com> Date: 
Mon Dec 31 08:31:11 2018 +0200 Fix next_ds/prev_ds semantics for manual runs (#4385) commit 704ce1a6bf029ead0a04ea242b922db6b3c55cc3 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Nov 25 21:44:07 2018 +0000 Add templated field in TriggerDagRunOperator (#4228) * [AIRFLOW-1196][AIRFLOW-2399] Make trigger_dag_id a templated field for TriggerDagRunOperator * Update dagrun_operator.py commit 69af9f0095edeae1a0a480857a0192ca1e579755 Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 5 13:06:56 2019 +0100 Add GCP Spanner Database Operators (#4353) commit ab05880739f941fc5924b1477ee666e6e626a89d Author: Jarek Potiuk <jarek@potiuk.com> Date: Fri Jan 4 14:54:33 2019 +0100 Update Cloud SQL Proxy to have shorter path for UNIX socket (#4350) commit d92221d858c072b7c475bbeb9492d669e2d28f98 Author: Sumit Maheshwari <sumeet.manit@gmail.com> Date: Fri Jan 4 19:25:56 2019 +0530 Placeholder support in connections form (#4185) commit 43e813c4ccdb96461b182be1c2097654ffc9eec9 Author: Dariusz Aniszewski <dariusz@aniszewski.eu> Date: Fri Jan 4 14:50:15 2019 +0100 Add Google Cloud BigTable operators (#4354) commit 8caff95104ec8142c45f0d98f0787455808a4e1e Author: Conrad Lee <conradlee@gmail.com> Date: Fri Jan 4 00:18:05 2019 +0100 For gcs_to_bq: add missing init of schema_fields var (#4430) commit 5e11f9744fb2d35bbfbd20b6645876eb93d2c304 Author: Chinh Nguyen <chinhngt@gmail.com> Date: Thu Jan 3 14:39:24 2019 -0800 Fix AirflowException import (#4389) Looks like the class path changed and broke wasb_hook commit f5123265fc729cc4965079b9c21d981d16ca8d01 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Jan 2 12:36:23 2019 +0000 Fix Type Error for BigQueryOperator (#4384) commit 1a440cacc6d5042c3614544580c1250673bc7b6d Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Fri Sep 28 11:51:04 2018 +0100 Fix Kubernetes CI (#3957) commit 7fea1fb0f7431496eadfbc58fb0f3afabab2f045 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Thu Jan 3 20:50:09 2019 +0000 Add license to Contrib Example DAG Init 
file commit 9cdc1e18fd31fed03a10b40227f8ab74f84a0188 Author: Steve Jacobs <brokenjacobs@gmail.com> Date: Wed Jan 2 00:57:08 2019 -0700 Add support for https and user auth (#2879) commit bccaacdda3785fd47d643b3fc87b352fbc544398 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Thu Jan 3 10:51:46 2019 +0000 Fix WeekDay Sensor Example (#4431) commit cfdb4b46273bf5532629b2d4269d2881e3a4c4dc Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Thu Jan 3 09:35:35 2019 +0000 Add DayOfWeek Sensor (#4363) * [AIRFLOW-3560] Add WeekEnd & DayOfWeek Sensors * Change to using Enum * Fix Docstring * Refactor into a Single Sensor commit a0c7c9f1c7bab372ee66f16f7b25505c60c5e9cd Author: Kaxil Naik <kaxilnaik@apache.org> Date: Mon Dec 31 12:50:01 2018 +0000 Fix Flake8 issues commit fb0e74d4bc0a83e25c5ee1d170f74671175ebf69 Author: Kevin Pullin <kevin.pullin@gmail.com> Date: Sun Dec 16 23:05:26 2018 -0800 Read `dags_in_image` config value as a boolean (#4319) * Read `dags_in_image` config value as a boolean This PR is a minor fix for #3683 The dags_in_image config value is read as a string. However, the existing code expects this to be a boolean. For example, in worker_configuration.py there is the statement: if not self.kube_config.dags_in_image: Since the value is a non-empty string ('False') and not a boolean, this evaluates to true (since non-empty strings are truthy) and skips the logic to add the dags_volume_claim volume mount. This results in the CI tests failing because the dag volume is missing in the k8s pod definition. This PR reads the dags_in_image using the conf.getboolean to fix this error. Rebased on 457ad83e4eb02b7348e5ce00292ca9bd27032651, before the previous dags_in_image commit was reverted. * Revert "Revert [AIRFLOW-2770] [AIRFLOW-3505] (#4318)" This reverts commit 77c368fd228fe5edfdb3304ed4cb000a50667010. 
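The `dags_in_image` bug described above comes down to how Python treats non-empty strings. The sketch below uses the standard library's `configparser` rather than Airflow's own `conf` object, and the inline config text is made up for illustration, but the difference between `get` and `getboolean` is the same idea.

```python
from configparser import ConfigParser

conf = ConfigParser()
# Illustrative config text; the section and key mirror the commit message.
conf.read_string("[kubernetes]\ndags_in_image = False\n")

as_string = conf.get("kubernetes", "dags_in_image")
as_bool = conf.getboolean("kubernetes", "dags_in_image")

# The raw value is the string 'False', which is truthy, so
# `if not as_string:` never runs the branch meant for "not in image".
print(repr(as_string), bool(as_string))

# getboolean parses the text, so `if not as_bool:` behaves as intended.
print(as_bool)
```

Any option that is later used in an `if` test is safer read through the boolean accessor at the point where the configuration is loaded.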
commit 29d140fa61480cd9b7a9e9cd30b1b7425fd7d6c0 Author: John Cheng <ckljohn@gmail.com> Date: Tue Nov 6 00:28:01 2018 +0800 Add volume mount to KubernetesExecutorConfig (#3855) Added volumes and volume_mounts to the KubernetesExecutorConfig so `volumes` or `secrets` can be mount to worker pod. commit 9233726288666fdb506a0a7a32754b5e01f223c6 Author: John Cheng <ckljohn@gmail.com> Date: Sun Aug 19 22:07:53 2018 +0800 Set AIRFLOW__CORE__SQL_ALCHEMY_CONN only when needed (#3766) Only when `airflow_configmap` is not provided and `AIRFLOW__CORE__SQL_ALCHEMY_CONN` not in secrets, it is set as an env var. commit a83d355016954a6c7fc46103aefde11200b16870 Author: Aldo Giambelluca <xoen@users.noreply.github.com> Date: Mon Aug 6 21:44:48 2018 +0100 Added `kubernetes.worker_dags_folder` configuration (#3612) It was previously hardcoded to `/tmp/dags`. This causes problems with python import of modules in the DAGs folder. commit c602e2850c64012cf1fd28914b33b35154c819bc Author: Shintaro Murakami <mrkm4ntr@gmail.com> Date: Wed Jul 4 17:48:51 2018 +0100 Fix inconsistency of default config of kubernetes worker Closes #3529 from mrkm4ntr/airflow-2655 commit f216bdd7e58e15b617d38b8a18ec20680cd84bb3 Author: roc <rockerchen@tencent.com> Date: Wed Jun 20 20:37:39 2018 +0200 Add worker_container_image_pull_policy Set worker_container_image_pull_policy in default_airflow.cfg As AIRFLOW-2617 added worker_container_image_pull_policy config to the section of kubernetes, but the airflow_default.cfg was not updated, this PR add worker_container_image_pull_policy to default_airflow.cfg. Closes #3521 from imroc/AIRFLOW-2645 commit 911884bef5106797e503b830c35f6087b0de4117 Author: Ravi Kotecha <kotecha.ravi@gmail.com> Date: Fri Jun 22 16:37:46 2018 +0200 fix config dags_volume_subpath and logs_volume_subpath Make sure you have checked _all_ steps below. 
### JIRA
- [x] My PR addresses the following [Airflow JIRA](https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR"
    - https://issues.apache.org/jira/browse/AIRFLOW-2661
    - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue.

### Description
- [x] Here are some details about my PR, including screenshots of any UI changes:
Changes the use of `log_volume_subpath` and `dags_volume_subpath` which are now passed into the construction of the worker pod's volumeMounts instead of the volume section (where subPath is not valid).

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
Unit tests have been added but I'm not sure how to add integration tests for this without breaking the other minikube tests

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

### Documentation
- [x] In case of new functionality, my PR adds documentation that describes how to use it.
No new functionality added ### Code Quality - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` Closes #3537 from r4vi/AIRFLOW-2661 commit 27a3c893a8bad408d96cd423f71fec017c7c52f0 Author: Tom Kunc <tom.kunc@kinesis.org> Date: Tue Jul 24 01:12:09 2018 +0100 Pass annotations to KubernetesExecutorConfig commit 2ccae80e4358f47470bc94a20ce7f95ce0d85b49 Author: Joshua Carp <jm.carp@gmail.com> Date: Sat Dec 29 19:12:17 2018 -0500 Standardize GKE hook (#4364) commit f2f649a1255991498946fb1edf09080d4caeebdb Author: Cameron Moberg <cjmoberg@gmail.com> Date: Tue Aug 7 09:57:41 2018 -0700 Fix GKEClusterHook catching wrong exception (#3711) commit 8534ef4c2c852e258eba5dcdaf8daf79dacb25f9 Author: Yohei Onishi <vivre214@gmail.com> Date: Sun Dec 30 08:09:00 2018 +0800 Fix TypeError in GCSToS3Op & S3ToGCSOp (#4371) Fix TypeError on GoogleCloudStorageToS3Operator & S3ToGoogleCloudStorageOperator commit 5dd247b2d7e15597414c7e67f1de3325965356b0 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Dec 29 11:00:20 2018 +0000 Add support for location in BigQueryHook (#4324) commit fbab5ca4eedfcbbde5bae86eed45e2aa4b7708a5 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Dec 9 22:29:11 2018 +0000 Fix default values in BigQuery Hook & BigQueryOperator (#4274) commit 290da3586da05c792fb9b7f4664aa9bd9c739f46 Author: Ryan Yuan <ryan.yuan@outlook.com> Date: Thu Nov 22 10:16:18 2018 +1100 BigQueryHook's Ability to Create View (#4213) commit 246ad9e4067027450ebbe220edf3582619652cfe Author: Ryan Yuan <ryan.yuan@outlook.com> Date: Sat Nov 17 22:52:03 2018 +1100 Add method to allow inserting rows into BQ table (#4179) commit 7c08b9892de805dc684ba97b9bc74469092e01c0 Author: Kengo Seki <sekikn@apache.org> Date: Fri Nov 16 14:31:24 2018 -0800 Fix BigQueryCursor.execute to work with Python3 (#4198) BigQueryCursor.execute uses dict.iteritems internally, so it fails with Python3 if binding parameters are provided. This PR fixes this problem.
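The `dict.iteritems` failure described in the last commit above can be illustrated with a minimal sketch. This is not the code from `BigQueryCursor`; the helper name `bind_parameters` and its naive quoting are assumptions made only for illustration. The point is that `dict.items()` exists on both Python 2 and Python 3, while `dict.iteritems()` exists only on Python 2.

```python
def bind_parameters(operation, parameters):
    """Substitute named parameters into a '%(name)s' style template.

    Hypothetical helper, not the Airflow implementation. The quoting
    below is deliberately naive and not safe for untrusted input.
    """
    escaped = {}
    # .items() works on Python 2 and 3; .iteritems() raises
    # AttributeError on Python 3 because the method was removed.
    for name, value in parameters.items():
        if isinstance(value, str):
            escaped[name] = "'" + value.replace("'", "\\'") + "'"
        else:
            escaped[name] = str(value)
    return operation % escaped


query = bind_parameters("SELECT %(a)s, %(b)s", {"a": 1, "b": "x"})
```

On Python 2, `.items()` builds a list instead of an iterator, which costs a little memory for large dicts but is rarely significant for a handful of query parameters.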
commit e2975e531f718a256e5afbf2cc306ac57f5ef187 Author: Iuliia Volkova <xnuinside@gmail.com> Date: Mon Oct 22 12:03:22 2018 +0300 Support autodetected schemas in BigQuery run_load (#3880) commit f6145a5d140a1180168f33e26c1642ec23ecfcf8 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Oct 20 13:05:17 2018 +0100 BigQuery Hook - Minor Refactoring (#4066) commit 9da372b97b2fdf47b7868c30784fa88be6ac7e35 Author: Iuliia Volkova <xnuinside@gmail.com> Date: Sun Oct 7 21:49:50 2018 +0300 add get_dataset and get_datasets_list to bigquery_hook (#3894) * [AIRFLOW-3055] add get_dataset and get_datasets_list to bigquery_hook commit a3cf0b3899f7b3b27b48cf132d9ad217b295c87e Author: Iuliia Volkova <xnuinside@gmail.com> Date: Fri Sep 21 17:46:59 2018 +0300 Added BigQueryCreateEmptyDatasetOperator and create_emty_dataset to bigquery_hook (#3876) commit f6bbd7142c323734c0d91a6339362e71401813a8 Author: Gordon Ball <chronitis@gmail.com> Date: Fri Sep 7 18:41:03 2018 +0200 Support cluster fields in bigquery (#3838) This adds a cluster_fields argument to the bigquery hook, GCS to bigquery operator and bigquery query operators. This field requests that bigquery store the result of the query/load operation sorted according to the specified fields (the order of fields given is significant). commit b7e17e181b49941640a575ad33dcc07eabeeb68e Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Fri Aug 31 09:36:24 2018 +0100 Fix Docstrings for Operators (#3820) commit abc89aea43c20772ec659d3245855b4fca6e2b6a Author: Xiaodong <xd_deng@hotmail.com> Date: Tue Aug 28 20:36:29 2018 +0800 Arg `verify` for AwsHook() & S3 sensors/operators (#3764) commit 541793c510d4fa59caa9a62785a837c85adbf4ab Author: Kengo Seki <sekikn@apache.org> Date: Wed Jul 18 09:04:25 2018 -0700 Add a sensor for MongoDB This PR adds a sensor for MongoDB, which waits for some document that matches the given query to be inserted to the specified collection. 
Closes #3611 from sekikn/AIRFLOW-2758 commit 0a49c8fb0c3d3eb057ebea09a44b14453e79616a Author: Kengo Seki <sekikn@apache.org> Date: Wed Jun 20 20:36:32 2018 +0200 Add Cassandra table sensor Just like a partition sensor for Hive, this PR adds a sensor that waits for a table to be created in Cassandra cluster. Closes #3518 from sekikn/AIRFLOW-2640 commit de6d19f57f01a2d3380cd5600233c3e4afdfa8e2 Author: Yuliya Volkova <xnuinside@gmail.com> Date: Mon Sep 3 23:30:22 2018 +0300 Add feature to pass extra api configs to BQ Hook (#3733) commit a5dcbdef5626e46f86873be73655f775aebe3a0e Author: Kazuhiro Sera <seratch@gmail.com> Date: Sun Aug 12 13:11:19 2018 +0900 Fix typos detected by github.com/client9/misspell (#3732) commit bba5721d11528aba525e0c4d4804b1b113fe1f07 Author: Tao Feng <tfeng@lyft.com> Date: Sat Dec 22 10:13:39 2018 -0800 Add a PythonSensor (#4349) commit adcacfc1f2d49d57b4bfa5a81bf784e2ffb66fb7 Author: Xiaodong <xd_deng@hotmail.com> Date: Thu Dec 20 06:43:09 2018 +0800 Fix inconsistent comment in example_python_operator.py (#4337) commit 4137dda52e017d5c9bf8146ce6b0cd9a1ef4d824 Author: eladkal <45845474+eladkal@users.noreply.github.com> Date: Thu Dec 20 00:40:28 2018 +0200 Fix incorrect parameter in SFTPOperator example (#4344) commit 2ef9df6af223e45edb5a11b5872be25b2b0880e5 Author: Szymon Przedwojski <szymon.przedwojski@gmail.com> Date: Wed Dec 19 22:41:53 2018 +0100 Google Cloud Spanner instance database query operator (#4314) commit ef19572c6d04a39be81df2a921c2192cbd8580be Author: gseva <gavrilovseva@gmail.com> Date: Wed Dec 19 06:48:57 2018 -0300 Make hmsclient optional in airflow.hooks.hive_hooks (#4080) Delay the import right up until it is needed, like how we do with the thrift imports. 
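The delayed import described in the hmsclient commit above can be sketched roughly as follows. The class name `MetastoreHookSketch` and the `client_module` parameter are hypothetical and are not part of `airflow.hooks.hive_hooks`; the sketch only shows the general pattern of importing an optional dependency at the point of use rather than at module load time.

```python
import importlib


class MetastoreHookSketch:
    """Hypothetical hook that needs an optional client library."""

    def __init__(self, client_module="hmsclient"):
        # Only the module name is stored; nothing is imported yet, so
        # importing this file never fails if the dependency is missing.
        self.client_module = client_module

    def get_client_module(self):
        # The import happens right before the dependency is needed.
        try:
            return importlib.import_module(self.client_module)
        except ImportError as exc:
            raise RuntimeError(
                "Optional dependency %r is not installed" % self.client_module
            ) from exc
```

With this layout, users who never touch the metastore code path do not need the optional package installed at all.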
commit a80f01ddcb767860991acf0a1e6340d9d29d39f0 Author: Felix <feluelle@users.noreply.github.com> Date: Mon Dec 17 19:05:21 2018 +0100 Add missing remote logging field (#4333) commit 017f745f3f7c45c60f49e1b12c30686319069c98 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Dec 15 23:13:36 2018 +0000 Add 2 options for ts_nodash Macro (#4323) commit 3f0c50390ec2da69d1c2bde0d86dd78701575b53 Author: Alvin Ali Khaled <aakside@gmail.com> Date: Wed Nov 14 02:59:30 2018 -0800 Revise template variables documentation (#4172) Updated documentation to elaborate on the (yesterday|tomorrow)_.* variables' relations to the execution date. commit 5acc53d4d17154d1656336e8b446d414e100de46 Author: thomasbrockmeier <thomas.brockmeier@gmail.com> Date: Sat Dec 15 16:27:10 2018 +0100 Airflow Filter_by_owner not working with password_auth (#4276) Local users were always a superuser, this adds a column to the DB (and defaults to false, which is going to cause a bit of an upgrade pain for people, but defaulting to not being an admin is the only secure default.) commit c6a79388ff82cbc2848d7318e554c76814ab843b Author: Stefan Seelmann <mail@stefan-seelmann.de> Date: Fri Sep 21 07:00:29 2018 +0200 Explicit re-schedule of sensors (#3596) * [AIRFLOW-2747] Explicit re-schedule of sensors Add `mode` property to sensors. If set to `reschedule` an AirflowRescheduleException is raised instead of sleeping which sets the task back to state `NONE`. Reschedules are recorded in new `task_schedule` table and visualized in the Gantt view. New TI dependency checks if a sensor task is ready to be re-scheduled. 
* Reformat sqlalchemy imports * Make `_handle_reschedule` private * Remove print * Add comment * Add comment * Don't record reschule request in test mode commit 7abaa169edce078aaf7470e741f12e5d190ae318 Author: Kevin Yang <yrqls21@gmail.com> Date: Mon Nov 26 21:49:31 2018 -0800 Add index on dag_id in sla_miss table (#4235) The select queries on sla_miss table produce a great % of DB traffic and thus made the DB CPU usage unnecessarily high. It would be a low hanging fruit to add an index and reduce the load. commit f171593b5e3e6b7cc8a2cea24a2613c69e4368cc Author: ubermen <kjh3477@gmail.com> Date: Fri Oct 12 18:38:51 2018 +0900 Add index 'ti_dag_date' to taskinstance (#3885) To optimize query performance commit d221699325362ac9246f21a1edc76383826c567e Author: Vardan Gupta <vardanguptacse@gmail.com> Date: Wed Aug 8 22:43:53 2018 +0530 Add index on log table (#3709) commit 7740e1bc259a18f98dbe76076c98f11e7d0ed1d2 Author: Niels Zeilemaker <niels@zeilemaker.nl> Date: Sat Dec 15 16:11:26 2018 +0100 Performance fixes for topological_sort of Tasks (#4322) For larger DAGs topological_sort was found to be very inefficient. Made some small changes to the code to improve the data structures used in the method. commit 974bb04c013b0747c2c89edac939077f8be45431 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Sat Dec 15 11:50:22 2018 +0000 Fetch more than 50 items in `airflow-jira compare` script (#4300) commit 9b0e6963979441070a5efab7ac94997cf4d1d8da Author: Tao feng <tfeng@lyft.com> Date: Wed Jun 20 20:32:50 2018 +0200 Add option to query for DAG runs given a DAG ID Closes #3515 from feng-tao/airflow-1919 commit d8f709ae1944739990b01190d1bfb9eaca826a2a Author: Joshua Carp <jm.carp@gmail.com> Date: Fri Dec 14 07:23:23 2018 -0500 Explicitly set transfer operator description. 
(#4279) commit 71a3a0340970b85a75947690dd57741d82f2e973 Author: tal181 <tal181@gmail.com> Date: Thu Dec 13 03:23:47 2018 +0200 Add OpenFaaS hook (#4267) commit 8d178443c606f738bcd2f655b5724b62fb9ea00c Author: Szymon Przedwojski <szymon.przedwojski@gmail.com> Date: Thu Dec 13 02:15:43 2018 +0100 Google Cloud Spanner deploy / delete operators (#4286) commit f088e6be54f3846583e7fbb542d2b23c45c64928 Author: Andy Cooper <andycooper.s@gmail.com> Date: Tue Jul 24 01:01:38 2018 +0100 Add context manager entry points to mongoHook Closes #3628 from andscoop/Add-connection-close-to-mongo-hook commit e14c4e43cbc12bdc0647e4f0f8da3c1719c1d2ca Author: yangaws <31293788+yangaws@users.noreply.github.com> Date: Thu Dec 6 11:51:11 2018 -0800 Add SageMaker doc to AWS integration section (#4278) commit c1f63705e204db0ca1b1441a27660ef389b86d35 Author: Daniel Imberman <daniel.imberman@gmail.com> Date: Fri Dec 7 15:39:47 2018 -0800 Fix Over-logging in the k8s executor (#4296) There are two log lines in the k8sexecutor that can cause schedulers to crash due to too many logs. commit 5567d251d8da5041eefc56f96fbb3166d9d45b4c Author: Xiaodong <xd_deng@hotmail.com> Date: Sat Dec 8 09:14:43 2018 +0800 Keeps records in Log Table when DAG is deleted (#4287) Users will use either API or web UI to delete DAG (after DAG file is removed): - Using API: provide one boolean parameter to let users decide if they want to keep records in Log table when they delete a DAG. The default value is True (to keep records in Log table). - From UI: will keep records in the Log table when deleting records for a specific DAG ID (pop-up message is updated accordingly).
commit 65d04fd55d2ea2157201be5869069388ebcddddf Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Sep 2 22:27:19 2018 +0100 Fix typo in docstring of gcs_to_bq (#3833) commit 9dbe78c512c7219a850b99d61d1320ef8f0d35f4 Author: Tom Miller <tmiller@microsoft.com> Date: Thu Dec 6 10:18:44 2018 -0800 Implement an Azure CosmosDB operator (#4265) Add an operator and hook to manipulate and use Azure CosmosDB documents, including creation, deletion, and updating documents and collections. Includes sensor to detect documents being added to a collection. commit df178de8130cc39f778e77a3c1792e3563248cce Author: yangaws <31293788+yangaws@users.noreply.github.com> Date: Thu Dec 6 11:51:11 2018 -0800 Add SageMaker doc to AWS integration section (#4278) commit 1a50b8498e86741162b74962d8a59911c3ff5c1f Author: John Cheng <ckljohn@gmail.com> Date: Thu Nov 15 06:16:08 2018 +0800 Add MongoDB connection (#4154) commit 577e026d1608bfd3c087b21b7c56373d839a61fc Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Mon Dec 3 12:39:46 2018 +0000 Allows creating intermediate dirs in SFTPOperator (#4270) commit 6110c7152146d81e52c0a113e913929831956483 Author: Xiaodong <xd_deng@hotmail.com> Date: Thu Aug 30 03:20:11 2018 +0800 Arg check & better doc - SSHOperator & SFTPOperator (#3793) There may be different combinations of arguments, and some processing is done 'silently', while users may not be fully aware of it. For example - User only needs to provide either `ssh_hook` or `ssh_conn_id`, while this is not clear in doc - if both provided, `ssh_conn_id` will be ignored. - if `remote_host` is provided, it will replace the `remote_host` which was defined in `ssh_hook` or predefined in the connection of `ssh_conn_id` These should be documented clearly to ensure it's transparent to the users. log.info() should also be used to remind users and provide clear logs. In addition, add instance check for ssh_hook to ensure it is of the correct type (SSHHook). Tests are updated for this PR.
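The argument precedence described in the SSHOperator/SFTPOperator commit above can be sketched as below. This is an illustration under stated assumptions, not the operator's code: `FakeSSHHook` and `resolve_ssh_hook` are hypothetical names, and raising `TypeError`/`ValueError` for bad input is an assumption (the commit message only says that an instance check was added and that `log.info()` should be used).

```python
import logging

log = logging.getLogger(__name__)


class FakeSSHHook:
    """Hypothetical stand-in for SSHHook, holding only two attributes."""

    def __init__(self, ssh_conn_id=None, remote_host=None):
        self.ssh_conn_id = ssh_conn_id
        self.remote_host = remote_host


def resolve_ssh_hook(ssh_hook=None, ssh_conn_id=None, remote_host=None):
    # Instance check: reject objects that are not the expected hook type.
    if ssh_hook is not None and not isinstance(ssh_hook, FakeSSHHook):
        raise TypeError("ssh_hook must be a FakeSSHHook instance")

    if ssh_hook is not None:
        # If both are provided, ssh_conn_id is ignored (and that is logged).
        if ssh_conn_id is not None:
            log.info("ssh_hook is provided; ssh_conn_id will be ignored.")
    elif ssh_conn_id is not None:
        ssh_hook = FakeSSHHook(ssh_conn_id=ssh_conn_id)
    else:
        raise ValueError("Provide either ssh_hook or ssh_conn_id.")

    # An explicit remote_host replaces whatever the hook already had.
    if remote_host is not None:
        log.info("remote_host is provided explicitly; it replaces the "
                 "remote_host defined in the hook or connection.")
        ssh_hook.remote_host = remote_host
    return ssh_hook
```

Logging each silent decision makes the precedence visible in task logs instead of leaving it implicit.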
commit 3fbbded39c9bfc0a5e055eac29891325671e504d Author: John Cheng <ckljohn@gmail.com> Date: Sun Aug 19 22:10:25 2018 +0800 Add remote_host of SSH/SFTP operator as templated field (#3765) It allows remote_host to be passed to operator with XCOM. commit f81f14ec43fd035e0b71400b8016aa3145f27fd4 Author: Cameron Moberg <cjmoberg@gmail.com> Date: Tue Jul 31 15:24:34 2018 -0400 Update SSH Operator's Hook to respect timeout (#3666) commit b6c5d1f721c91bea27de336b8500c2854598e370 Author: Ash Berlin-Taylor <ash_gi…
commit 5b95be403a4ca8e1d163d65b69a4c609d416b760 Author: Chris Fei <chris@indicative.com> Date: Thu Jan 24 13:06:55 2019 -0500 Added custom file support in code view commit 666f1f103f6dda0f31217677e67629301f01dbdc Author: Chris Fei <chris@indicative.com> Date: Wed Jan 23 18:42:37 2019 -0500 compat with older mysql commit c51cc139f838125d908b7022b6449e00e79545b9 Author: Chris Fei <chris@indicative.com> Date: Wed Jan 23 10:58:40 2019 -0500 Added patch info commit 5a041ad90e02cad9b227b2817eb177a91afcf9fb Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 19 16:17:55 2019 +0000 Add Changes to CHANGELOG commit 346dede8ace2a0eb77360deb352b043b354e515f Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 19 15:04:17 2019 +0000 Fix issue when trying to edit connection in RBAC UI commit 6a637d24ece295520e1ec99650f5c48862a570b7 Author: Fokko Driesprong <fokkodriesprong@godatadriven.com> Date: Mon Oct 15 07:16:29 2018 +0200 Make flake8 compliant One violation that slipped in via a PR that didn't rebase onto latest master commit 5dbda81064106ab1b9e7b94707fcc61772edb3a5 Author: ubermen <kjh3477@gmail.com> Date: Sun Sep 16 05:01:03 2018 +0900 Clear UPSTREAM_FAILED using the clean cli (#3886) * [AIRFLOW-1298] Fix 'clear only_failed' * [AIRFLOW-1298] Fix 'clear only_failed' commit 3d87232efbdefdffe504a0cdf394dfd1262b98c7 Author: Xiaodong <xd_deng@hotmail.com> Date: Sun Sep 16 20:38:09 2018 +0800 Refine web UI authentication-related docs (#3863) commit 55ccae87e72239197900793d1dee270b079f14e3 Author: Nathaniel Ritholtz <nritholtz@gmail.com> Date: Thu Sep 27 15:43:26 2018 -0400 Fix SlackWebhookOperator execute method comment (#3963) commit afaa7cfda07851d632feec0ce089d561abbe3b56 Author: Mingye Xia <mingye.xia@outlook.com> Date: Fri Sep 28 10:07:43 2018 -0700 Monitor Task Instances creation rates (#3966) Monitor Task Instances creation rates by Operator type. These stats can provide some visibility on how much workload Airflow is getting.
They can be used for resource allocation in the long run (i.e. to determine when we should scale up workers) and for debugging scenarios such as a spike in the creation rate of a certain type of Task Instance. commit e189fbdabb30fcbc463e9b81d5b9dd65f05088ff Author: Szymon Bilinski <szymon.bilinski@gmail.com> Date: Sat Sep 29 15:45:37 2018 +0200 Fix undocumented params in S3_hook Some function parameters were undocumented. Additional docstrings were added for clarity. commit 0d1aed9bb3a372e17d83339b2a2f8d8a67b808a3 Author: Santhoshkumar. P <sann3@users.noreply.github.com> Date: Thu Oct 4 22:50:48 2018 +0530 Fixing the issue in Documentation (#3998) Fixing the operator name from DataFlowOperation to DataFlowJavaOperator in Documentation commit e46886608546423b1e575a97acaf0bd8322afeb4 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Mon Oct 8 13:22:03 2018 +0100 Fix Typo in SFTPOperator docstring (#4016) commit aed387b97b2f63d8bdee3e9d96fdc83e4caa3d43 Author: marengaz <marengaz@users.noreply.github.com> Date: Mon Oct 29 14:28:52 2018 +0000 Correct misleading BigQuery error (#4098) commit d7b24721a98460eb917de3fae4796348412e4619 Author: mishikaSingh <mishikaps@gmail.com> Date: Wed Oct 31 18:51:28 2018 +0530 Catch transient DB exceptions from scheduler's heartbeat so it does not crash (#3650) If there is an issue with the DB connection, the rest of the functions handle those exceptions, but the scheduler's heartbeat has no handling for this situation. The Airflow scheduler should not crash if a "transient" DB exception occurs in its heartbeat.
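The heartbeat behaviour described in the last commit above can be sketched as follows. This is a simplified illustration, not the scheduler code: `TransientDBError` is a hypothetical stand-in for whatever database exception the real job catches, and `heartbeat`, `update_db` and `on_success` are names invented for the sketch.

```python
import logging

log = logging.getLogger(__name__)


class TransientDBError(Exception):
    """Hypothetical stand-in for a transient database error."""


def heartbeat(update_db, on_success):
    """Run one heartbeat; survive a transient DB failure.

    Returns True if the heartbeat was recorded, False if it was skipped
    because of a transient error. Any other exception still propagates.
    """
    try:
        update_db()
    except TransientDBError as exc:
        # Log and carry on: the next heartbeat gets another chance.
        log.error("Scheduler heartbeat got an exception: %s", exc)
        return False
    on_success()
    return True
```

Catching only the transient error type keeps genuine bugs visible, since unrelated exceptions are not swallowed.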
commit f4503047520f478e604cec60281b475dfc199ad5 Author: Marcin Szymański <ms32035@gmail.com> Date: Tue Nov 13 14:37:57 2018 +0100 fix list processing in resolve_template_files (#4086) * [AIRFLOW-3245] fix list processing in resolve_template_files * [AIRFLOW-3245] add tests * [AIRFLOW-3245] modify tests commit a77f51081fe841466f715f288f6f34ec40f8e5ba Author: Nicholas Huang <nicholas.ykhuang@gmail.com> Date: Tue Nov 20 01:15:32 2018 -0800 AIRFLOW-XXX Fix copy&paste mistake (#4212) In emr_create_job_flow_operator.py the :type clearly mismatches with the :param name, suggesting a copy&paste mistake. commit 384845d3db4fa4cd3e13e5c9059f4a8642112be0 Author: Ryan Yuan <ryan.yuan@outlook.com> Date: Thu Nov 22 22:58:32 2018 +1100 Fix incorrect docstring in DatastoreHook (#4222) Correct docstring in DatastoreHook commit 043a4c35ea2f22b802a70b4936e38ce155c4b325 Author: rmn36 <rmn36@case.edu> Date: Fri Nov 23 10:41:04 2018 -0800 Add new TriggerRule for 0 upstream failures (#4182) Add new TriggerRule that triggers only if all upstream do not fail (success or skipped tasks are allowed) commit f59111760e07938c6b42b20bc45a3dc1070a1764 Author: Victor Noël <victornoel@users.noreply.github.com> Date: Mon Nov 26 10:02:08 2018 +0100 KubernetesPodOperator does not delete on timeout failure (#4218) Signed-off-by: Victor Noel <victor.noel@brennus-analytics.com> commit 8c9a39e9a7dc7a62d2de487bd4d33ae334aed116 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Dec 8 22:31:39 2018 +0000 Fix Minor issues with Azure Cosmos Operator (#4289) - Fixed Documentation in integration.rst - Fixed Incorrect type in docstring of `AzureCosmosInsertDocumentOperator` - Added the Hook, Sensor and Operator in code.rst - Updated the name of example DAG and its filename to follow the convention commit 8ef0c9dfadca37fc399396e8c002ee119289e3d1 Author: Michal Dziemianko <michal.dziemianko@gmail.com> Date: Wed Dec 26 20:01:34 2018 +0000 Fix FTPSensor failing on error message with unexpected text. 
(#2450) * [AIRFLOW-1413] Fix FTPSensor file presence check Currently FTPSensor operates by checking text of error message returned from ftp lib. It only succeeds if the message matches the expected text. Otherwise it fails with an exception. However, the message depends on the system, locale and possibly other factors. This patch changes the operation to inspect error code rather than message text. It also adds option to ignore certain classes of errors such as Host Unavailable that are recoverable, thus the performed action can and should be retried according to ftp spec. * [AIRFLOW-1413] Adjustments as per code review * [AIRFLOW-1413] fixing style Co-Authored-By: mdziemianko <michal.dziemianko@gmail.com> commit e30a6a8232882fa38c0f0058449f0ae5cee6f363 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Sep 5 23:24:56 2018 +0100 Fix Minor issues in Documentation commit 208728c1d13b880a08e75e359786de812350ae66 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Oct 14 20:11:44 2018 +0100 Fix BashOperator Docstring (#4052) commit 0e0a9dda8f6cc99917136a70519d7b29911773a1 Author: Marcin Szymański <ms32035@gmail.com> Date: Thu Nov 22 22:34:46 2018 +0000 update run statistics on dag refresh (#4197) * [AIRFLOW-3348] update run statistics on dag refresh commit 152d93dc8be07fe4c020c878eef8ce7528407c71 Author: Marcus <marcuseagan@gmail.com> Date: Thu Dec 13 23:19:22 2018 -0800 removed an unused/dangerous display-none (#4295) * removed an unused display-none that is currently overridden but could resurface as a bug.
* remove the other display none in /www commit ddd292fd79fcf250eb131e68cb618e258939d9aa Author: cclauss <cclauss@bluewin.ch> Date: Thu Sep 20 21:48:36 2018 +0200 Use feature detection for reload() (#3298) * [AIRFLOW-2407] Use feature detection for reload() [Use feature detection instead of version detection](https://docs.python.org/3/howto/pyporting.html#use-feature-detection-instead-of-version-detection) is a Python porting best practice that avoids a flake8 undefined name error... flake8 testing of https://github.com/apache/incubator-airflow on Python 3.6.3 commit f7e6dfe7cac5012c27134bb90e08cf8f3c091701 Author: Joshua Carp <jm.carp@gmail.com> Date: Thu Oct 25 06:33:21 2018 -0400 Add SKIPPED to task states. (#4059) commit f350aff54e99dab586cd4165625fafcd1ac86907 Author: Abdul Nimeri <abdul@stripe.com> Date: Thu Jul 26 20:53:57 2018 +0200 Compress tree view JSON The tree view generates JSON that can be massive for bigger DAGs, up to 10s of MBs. The JSON is currently prettified, which both takes up more CPU time during serialization, and slows down everything else that uses it. 
Considering the JSON is only meant to be used programmatically, this is an easy win Closes #3620 from abdul-stripe/smaller-tree-view-json commit 6e349ea80abe1d3f618033aac700ab98dd4991af Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Thu Jul 26 20:55:04 2018 +0200 Respect shared datetime across tabs Closes #3615 from verdan/AIRFLOW-2766-shared-datetime commit 496b6684d66636007e2887bf66876bd9b7303ea3 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Wed Aug 8 13:47:59 2018 +0200 Enables FAB's theme support (#3719) commit 94a004cc758b81b1d863b85c782fc3d2ffab436c Author: Gabriel Silk <gabe@nomic.com> Date: Mon Sep 3 11:37:20 2018 -0700 Fix missing CSRF token head when using RBAC UI (#3804) commit 2b542f33ebbb24ff166ab07feaab79e1af1950d9 Author: Stefan Seelmann <mail@stefan-seelmann.de> Date: Thu Sep 20 20:49:20 2018 +0200 Assign permission get_logs_with_metadata to viewer role (#3913) commit 078c7b05d6231cf3d29ac226b3abd479bfa17d05 Author: Xiaodong <xd_deng@hotmail.com> Date: Tue Sep 25 02:20:17 2018 +0800 Display www_rbac Flask flash msg properly (#3903) The Flask flash messages are not displayed properly. When we don't give a category for a flash message, the default value will be 'message'. In some cases, we specify 'error' category. Using Flask-AppBuilder, the flash message will be given a CSS class 'alert-[category]'. But we don't have 'alert-message' or 'alert-error' in the current 'bootstrap-theme.css' file. This makes the flash messages in www_rbac UI come with no background color. This commit addresses this issue by adding 'alert-message' (using specs of existing CSS class 'alert-info') and 'alert-error' (using specs of existing CSS class 'alert-danger') into 'bootstrap-theme.css'. commit 711582538c62e789bea7e5f8b36ed27dc5a3e75a Author: Joshua Carp <jm.carp@gmail.com> Date: Mon Oct 15 09:12:33 2018 -0400 Handle duration for missing dag.
(#3984) commit 44abb72349622bff0e1a267cf286514713dd71b1 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Mon Nov 19 18:26:18 2018 +0000 Don't publish md5 sigs as part of release (#4210) Apache recommends against publishing MD5 files now as they are relatively easy to collide and shouldn't be trusted anymore commit 48d5ef35960b88622c997fdd823aee97f971503b Author: smithakoduri <41048347+smithakoduri@users.noreply.github.com> Date: Wed Nov 14 17:45:24 2018 -0800 Fix issue with persistence of RBAC Permissions modified via UI (#4118) commit f815f13bd74ec6c71154ef68d8ef6a5c779d4317 Author: Sumit Maheshwari <sumeet.manit@gmail.com> Date: Wed Nov 7 15:42:17 2018 +0530 Small CSS fixes (#4140) * Don't highlight logout button when viewing Log tab of a task * Align Airflow logo to the center of the login page commit 6c445291240dca676fd050b416d56bbfcacf7363 Author: Zakaria EL Mesaoudi <elmesaoudee@gmail.com> Date: Wed Nov 7 21:58:15 2018 +0000 [AIRFLOW-3259] Fix internal server error when displaying charts (#4114) This is caused by the fact that the function 'sort' is no longer a part of Dataframe in pandas and is still used in the code base. It has since been replaced by 'sort_values'. Replacing the function gets the chart display back to its normal behaviour.
commit d08940e7cc3a78ac1509332ff091cc8d7048c40f Author: Kaxil Naik <kaxilnaik@apache.org> Date: Thu Jan 17 21:31:07 2019 +0000 Update Changelog commit a4effc1a9d6a17de1088c8537c9a182318e0421b Author: Vivek <3vivekb@gmail.com> Date: Wed Jan 16 21:57:42 2019 -0800 Fix a typo of config (#4544) commit 625ec7398995724b1b7469153ab7906226e20eec Author: Daniel Lamblin <dlamblin+github@gmail.com> Date: Fri Jan 18 00:21:37 2019 +0900 Correct Typo in sensor's exception (#4545) commit 238339b35eb963cf7b337bedd3092dbbc9208038 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Thu Jan 17 14:39:59 2019 +0000 Fix the broken refresh button on Graph View in RBAC UI commit faaba1fd401f49d811e9380ff57bbd3868c42cee Author: Kaxil Naik <kaxilnaik@apache.org> Date: Wed Jan 16 21:59:57 2019 +0000 Changelog and version for 1.10.2 commit d3ff2abde31f837ddf038c7cae232fa4d80693fb Author: Felix <feluelle@users.noreply.github.com> Date: Fri Jan 11 19:17:20 2019 +0100 Update github_enterprise_auth.py commit 1a6153eb0f3dc2445b3fbe91cd9a4976aa45e25c Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Thu Aug 30 13:30:40 2018 +0100 Make GHE auth third party licensed (#3803) commit a74331f6cc36c6f3a0ebd3a9ffb3c577d9e2d5e9 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Fri Aug 3 14:07:50 2018 +0200 Display multiple timezones on UI (#3687) commit d70dd93c99693770bd3f29d25634004bc99030d0 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Thu Jul 26 20:45:14 2018 +0200 Implement eslint for JS code check (#3641) commit 777b176624ce126c2c02c40ed56b0f357a15723c Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Wed Jul 25 14:15:30 2018 +0200 Removes unused hard-coded dagreD3 Closes #3635 from verdan/AIRFLOW-2782-dagred3-fix commit fd6217b982eb31573a7b0029b75d6525b92c9ca2 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Tue Jul 24 11:22:53 2018 +0200 Upgrades the Dagre D3 version Closes #3634 from verdan/AIRFLOW-2782-upgrade- dagre-d3 commit 
3856ea4948b2c7069681c10a785ce2db5c9e1af0 Author: Jarek Potiuk <jarek@potiuk.com> Date: Tue Jan 15 01:54:03 2019 +0100 All GCP operators have now optional GCP Project ID (#4500) commit cd746a253fe45188cbd1b9285d927565f4007a35 Author: Xiaodong <xd_deng@hotmail.com> Date: Tue Jan 15 01:34:45 2019 +0800 Support SSL Protection When Redis is Used as Broker for CeleryExecutor (#4521) From Celery 4.1 (current Airflow is using 4.1.1), "broker_use_ssl" argument starts to support Redis (earlier this argument is only supported when amqp is used for broker) (REF: https://github.com/celery/celery/blob/4.1/docs/userguide/configuration.rst). commit 53f65ed33b42c8ddce28aea6383c9d92f47d6c65 Author: Wyndham Blanton <bo.blanton@gmail.com> Date: Mon Jan 14 10:06:46 2019 -0800 - KubernetsExecutor: Need in try_number in labels if getting them later (#4163) * Need in labels if getting them later * has to be an int to match running keys - otherwise running list will never empty * pr comments * bad merge * mend pep issue * add try_numer to make_pod test commit 4d721fe96e468c2d35eab7f5cffcd0f295bdc706 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Mon Jan 14 15:59:03 2019 +0000 Escape links generated in model views (#4519) commit c834f3003a2ef16793f74c889b735c6527af5fd8 Author: Xiaodong <xd_deng@hotmail.com> Date: Mon Jan 14 18:10:04 2019 +0800 Change the lowest allowed version of "requests" (#4517) commit f62834f0871c40ad38473107fe255f1d377804c5 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Mon Jan 14 00:39:37 2019 +0000 Revert [AIRFLOW-3692] Remove ENV variables to avoid GPL (#4506) commit 111f48aa328de3781b9e7f23f703388ed6821b9d Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 20:42:16 2019 +0000 Update CHANGELOG.txt commit 8afb59e7100bd1618788474c2d7ec7b7c8e68038 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 20:36:42 2019 +0000 Add CHANGELOG & K8s to Documentation commit 2cccaef9565f3863281b99428f67338bb0657571 Author: Kaxil Naik 
<kaxilnaik@gmail.com> Date: Sun Jan 13 20:12:53 2019 +0000 Add Version info to Airflow Documentation (#4512) commit bf34820c76575bb49cfdb223a7be1f84bfa66fa6 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 19:38:56 2019 +0000 Remove Duplicates from Changelog commit 74f8ce662754b45c2ef1f7658cfca05c11efb48d Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 17:48:18 2019 +0000 Update CHANGELOG.txt commit 938442bc9597d21bbcf6ecbad3c3c8956aa649bd Author: Gabriel Nicolas Avellaneda <avellaneda.gabriel@gmail.com> Date: Wed Dec 5 17:55:38 2018 -0200 Add Kubernetes Dependency in Extra Packages Doc (#4281) commit 9e01ac740cde14ec928bca5ea89a4f1d21934401 Author: Joshua Carp <jm.carp@gmail.com> Date: Tue Oct 9 11:14:07 2018 -0400 Add extras group for google auth to setup.py. (#3917) To clarify installation instructions for the google auth backend, add an install group to `setup.py` that installs dependencies google auth via `pip install apache-airflow[google_auth]`. commit 7c2b1c2173e8714ecd9b435a614bb6f66b1cbe07 Author: Naman Bhalla <namanbhalla1998@gmail.com> Date: Sat Sep 8 21:40:27 2018 +0530 Remove redundant space in Kerberos (#3866) commit bb02c0334c42825f36e1db56a3b87313927b70c4 Author: Taylor D. Edmiston <tedmiston@gmail.com> Date: Wed Aug 15 01:09:26 2018 -0400 Clean up installation extra packages table (#3750) Sort the extra packages table, use official product names, improve capitalization, and make table whitespace consistent. 
commit 0f4df115fd81f9623d1d448b1f96a94b8480bb90 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Fri Jun 22 16:35:48 2018 +0200 Add instructions to install SSH dependencies Closes #3536 from kaxil/patch-1 commit 082372692059899008e746849b00cfa985d6c91e Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 16:56:54 2019 +0000 Update CHANGELOG.txt commit a9107377d0c4c36c03246572cbd9a306716c7cd5 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Sun Jan 13 17:54:45 2019 +0100 Replace psycopg2-binary by psycopg2 (#4508) For Python packages, psycopg2 is preferred over psycopg2-binary http://initd.org/psycopg/docs/install.html#binary-install-from-pypi commit 9e09fd63726bf1cc35528350991ef2afe5b697de Author: Bryant Biggs <bryantbiggs@gmail.com> Date: Sun Dec 2 00:58:50 2018 -0500 Correct Python Version Documentation Reference (#4259) commit dc3ca9595f0ddd28f7a567230aa4e5e923dd7355 Author: Felix <feluelle@users.noreply.github.com> Date: Sun Nov 11 23:40:03 2018 +0100 Update Contributing Guide - Git Hooks (#4120) - changes pre-commit example to use methods - adds activating virtual env for python to run things like flake8 locally - changes pre-commit file to use set -e command to instantly exit if any non-zero error occurs - changes flake8 call to lint the repo instead of not only the changes files commit a7840fdf8277def04a27370673e8c9b1db996309 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Aug 29 21:44:24 2018 +0100 Fix Broken Link in CONTRIBUTING.md commit 21e3ab7116b8f6b434718d06345ddc4a240192b1 Author: BasPH <BasPH@users.noreply.github.com> Date: Sat Oct 27 15:48:25 2018 +0200 Fix incorrect statement in contributing guide (#4104) commit 18fcc5eedf357265a5de0cef2b691e1c0b32adba Author: bolkedebruin <bolkedebruin@users.noreply.github.com> Date: Sun Jan 13 13:34:00 2019 +0100 Remove ENV variables to avoid GPL (#4506) commit 539d924f058ce4bd49175562767bf25e39ac5523 Author: Taylor D. 
Edmiston <tedmiston@gmail.com> Date: Thu Jul 26 11:02:47 2018 +0200 Skip test_mark_success_no_kill in PostgreSQL on CI See mailing list thread "Flaky test case: test_mark_success_no_kill". Closes #3642 from tedmiston/test_mark_success_no_kill-postgresql commit 611b277e5f6d5d204ca1c85445b22068dae9ade6 Author: Xiaodong <xd_deng@hotmail.com> Date: Sun Jan 13 22:33:08 2019 +0800 Fix bug to set state of a task for manually-triggered DAGs (#4504) commit e63655089ecaa313b80b3c894142fafd356d2c85 Author: Xiaodong <xd_deng@hotmail.com> Date: Sun Jan 13 21:11:12 2019 +0800 Update pop-up message when deleting DAG in RBAC UI (#4505) This feature was added in https://github.com/apache/airflow/pull/4287, but the pop-up messages was only updated in airflow/www/templates/airflow/dag.html, while it should be updated for all dag.html & dags.html for both /www and /www_rbac. commit 3da675eb8c94e6446b59e3ed04cf1ca9211f4c40 Author: bolkedebruin <bolkedebruin@users.noreply.github.com> Date: Sun Jan 13 09:02:34 2019 +0100 Update notice to 2019 (#4503) commit c07d80c113be728de96a32d454e539876ea9c94a Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sun Jan 13 01:28:05 2019 +0000 Bump version to 1.10.2b2 commit 3031387a74630a84e2ce53e971a48cde9e66d4b3 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 12 23:00:13 2019 +0000 Version 1.10.2b1 commit 8efc392697a4c31bf66d32cd7cb504076e0a2c03 Author: Kamil Breguła <mik-laj@users.noreply.github.com> Date: Sat Jan 12 20:54:43 2019 +0100 Add missing @apply_defaults decorators (#4498) commit 26dffa0f396ffedd297c9e8fdab53a3d4b1c161d Author: Kaxil Naik <kaxilnaik@apache.org> Date: Sat Jan 12 22:03:47 2019 +0000 Fix CI commit 0387008886706e21745f41ddaaec43c5b2a1d68e Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Sun Nov 11 22:22:11 2018 +0000 Speed up RBAC view tests (#4162) Not re-creating the FAB app ones per test functions took the run time of the TestAirflowBaseViews from 223s down to 53s on my laptop, _and_ made it only print the 
deprecation warning (fixed in another PR already open) once instead of 10+ times. commit 2f540911ce937321af6bef22b48a3877a712aadd Author: Fokko Driesprong <fokko@driesprong.frl> Date: Tue Jan 8 11:45:40 2019 +0100 Make sure that the session is closed (#4298) commit f399719bf86bc9cfe4478c01474960a02e0c0d81 Author: Xiaodong <xd_deng@hotmail.com> Date: Thu Nov 29 23:21:06 2018 +0800 Fix/refine tests for api/common/experimental/ (#4255) Follow-up on [AIRFLOW-3239] Related PRs: #4074, #4131 1. Fix (test_)trigger_dag.py 2. Fix (test_)mark_tasks.py 2-1. Properly name the file 2-2. Correct the name of the sample DAG 2-3. Correct the range of sample execution_dates (the earlier one conflicts with the start_date of the sample DAG) 2-4. Skip the test when running on MySQL. Something seems to be wrong with airflow.api.common.experimental.mark_tasks.set_state: the corresponding test case works on Postgres & SQLite, but fails on MySQL ("(1062, "Duplicate entry '110' for key 'PRIMARY'")"). A TODO note is added to remind us to fix it for MySQL later. 3. Remove unnecessary lines in test_pool.py commit e62866a903e439d51508c5324f72c6d5c32abf53 Author: Yingbo Wang <ybwang@gmail.com> Date: Fri Aug 31 16:49:39 2018 -0700 Update dag_run table end_date when state change (#3798) Currently Airflow only changes the dag_run table's end_date value when a user terminates a DAG in the web UI. The end_date is not updated when Airflow detects that a DAG has finished and updates its state. This commit adds an end_date update to DagRun's set_state function to fix the problem described above.
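The end_date behaviour described in the commit above (#3798) can be illustrated with a minimal sketch. This is not Airflow's DagRun model: the class `DagRunSketch`, the `FINISHED_STATES` set, and the plain string state names are assumptions made for illustration only.

```python
from datetime import datetime, timezone

# Hypothetical set of terminal states; Airflow defines its own State class.
FINISHED_STATES = {"success", "failed"}


class DagRunSketch:
    """Stand-in for a DAG run row with a state and an end_date."""

    def __init__(self):
        self.state = "running"
        self.end_date = None

    def set_state(self, state):
        # Only touch end_date when the state actually changes.
        if self.state != state:
            self.state = state
            # Record the end time on entering a terminal state and
            # clear it again if the run is moved back to a live state.
            if state in FINISHED_STATES:
                self.end_date = datetime.now(timezone.utc)
            else:
                self.end_date = None
```

Under these assumptions the timestamp is written wherever the state changes, so it no longer depends on the change coming from the web UI.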
commit 851d32846fd3590e134678adb216a0e77413b91c Author: yrqls21 <yrqls21@gmail.com> Date: Wed Aug 1 01:31:30 2018 -0700 Fix bug in set DAG run state workflow (#3606) commit 98681fefbecde6d9dcde4f49da735fce499bdd78 Author: Tao Feng <tfeng@lyft.com> Date: Sat Jan 5 06:10:31 2019 -0800 Update committer list based on latest TLP discussion (#4427) commit fc200df99e1af1f98086c0dca8eb1ed301d1d6fd Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Jan 5 16:32:12 2019 +0000 Remove remaining incubator mention & Fix CI Behaviour (#4441) commit 38b6775d3108856f83c06824b3e3099a99d589ec Author: XD-DENG <xd_deng@hotmail.com> Date: Sun Sep 9 21:06:51 2018 +0800 CLI tool to show the next execution datetime (#3834) commit b6c9c46a70a7443198a36ec79374ec1cba22046a Author: aoen <aoen@users.noreply.github.com> Date: Sat Jan 12 15:59:40 2019 +0200 Fix not being able to specify execution_date when creating dagrun (#4037) commit 090ffbc77171fa14c5d0b86df21bc74a5badec9b Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 13:44:01 2019 +0100 Consistency update in tests for All GCP-related operators (#4493) commit e30e2192c62613869888989fd303d6d615f336a6 Author: Jarek Potiuk <jarek@potiuk.com> Date: Wed Dec 5 21:35:29 2018 +0100 GCP operators documentation clarifications (#4273) commit e737f5588db7d108d98e481759a83c1c1150a720 Author: Cameron Moberg <cjmoberg@gmail.com> Date: Thu Aug 2 12:44:16 2018 -0700 Add GCP specific k8s pod operator (#3532) Executes a task in a Kubernetes pod in the specified Google Kubernetes Engine cluster. This makes it easier to interact with GCP kubernetes engine service because it encapsulates acquiring credentials. commit 179831bc72d8a447c33e45aa8b0ca3258c5f3615 Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 10:07:45 2019 +0100 Use googlapiclient for google apis (#4484) The deprecated apiclient package name is used in a number of places. This commit changes it to googleapiclient and modifies the right packages to be used instead. 
commit 5b546769598e1c6782d7155964c9b20b4bfc9945 Author: Gordon Ball <chronitis@gmail.com> Date: Mon Nov 5 15:48:11 2018 +0100 Support multipart uploads to GCS (#4084) * [AIRFLOW-3205] Support multipart uploads to GCS Cloud Storage supports resumable/multipart uploads for large files, which can be used to avoid limitations on the size of a single HTTP request, or by adding a retry behaviour, increase the reliability of large transfers. * [AIRFLOW-3205] Use only the multipart keyword This removes the chunksize keyword, using instead multipart=True for a default chunk size or multipart=int to override the default. commit 2965c0d125f89579c0abd6a1e782a47819470ade Author: Jasper Kahn <jasperakahn@gmail.com> Date: Wed Aug 8 01:04:19 2018 -0700 Add GoogleCloudKMSHook (#3677) Adds a hook enabling encryption and decryption through Google Cloud KMS. This should also contribute to AIRFLOW-2062. commit 2956727745c358045df61e1ae110798748c25492 Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 02:18:40 2019 +0100 Add required permission to CloudSQL export/import example (#4489) commit b752d33b9bb568a1682fa78c72f468358ea2ab86 Author: Szymon Przedwojski <szymon.przedwojski@gmail.com> Date: Wed Dec 5 21:33:00 2018 +0100 Google Cloud SQL import/export operator (#4251) commit 8f5145bc68a38c26c29a3b677bed87f932d03d20 Author: Stefan Seelmann <mail@stefan-seelmann.de> Date: Sat Jan 12 02:06:11 2019 +0100 Fix logs when task is in rescheduled state (#4492) commit cfc2addcfe3156118867f6cd7fca60e6b049e6fb Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 12 02:01:49 2019 +0100 Added Google Cloud Base Hook to documentation (#4487) commit c6d9e516b358224e181098e78d5dff1f97cc69ba Author: Felix <feluelle@users.noreply.github.com> Date: Fri Jan 11 19:17:20 2019 +0100 Unify different License Header commit 25703d468d04436409bd4845570f46c5a428daf2 Author: Mike Mole <mikemole@gmail.com> Date: Fri Jan 11 14:35:08 2019 -0500 Add AwsGlueCatalogPartitionSensor (#4112) Adds 
AwsGlueCatalogPartitionSensor and AwsGlueCatalogHook with supporting functions. Unit tests are included but rely on mocking since Moto does not yet fully support AWS Glue Catalog at this time. commit 5d3e362a1bf6a70ed9af0aa4a3e99f6af3a6e0eb Author: Ant Weiss <antweiss@users.noreply.github.com> Date: Fri Jan 11 21:04:54 2019 +0200 Remove invalid parameter KeepJobFlowAliveWhenNoSteps in example DAG (#4404) The parameter 'KeepJobFlowAliveWhenNoSteps' in JOB_FLOW_OVERRIDES doesn't pass boto API parameter validation, as it should be a part of 'Instances' object. Signed-off-by: Anton Weiss <anton@otomato.link> commit a9f70793e70a1f4cdf2d0b25a4a3806199d809a2 Author: Xiaodong <xd_deng@hotmail.com> Date: Thu Jan 10 14:58:29 2019 +0800 Refine the functionality of "/health" endpoint (#4309) commit ba49fe857670505ccaa63e64d61a34d3e50c49ad Author: Tobias Kaymak <tobias.kaymak@ricardo.ch> Date: Thu Jan 10 18:57:15 2019 +0100 Fix zendesk integration (#4466) commit df603b5582f6d51d30cf9711bb4d03e430715894 Author: Drew J. Sonne <drew.sonne@gmail.com> Date: Thu Jan 10 23:03:59 2019 +0000 Load plugins from entry_points (#4412) * [AIRFLOW-3605] Add entrypoint plugin docs This documentation came from https://github.com/apache/incubator-airflow/pull/730 which had already started work on a PR for this functionality. * [AIRFLOW-3605] Extend plugin loading functionality Added business logic to import AirflowPlugin classes through entry_points. This means we don’t have to interact with the file system directly to install plugins, and can manage them via `pip`. 
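As a rough illustration of the entry-point plugin loading described in the commit above (#4412), the sketch below uses the standard library's `importlib.metadata` rather than Airflow's own loader. The helper name `load_entry_point_plugins` and the default group name `airflow.plugins` are assumptions for this example, not details taken from the commit.

```python
from importlib.metadata import entry_points


def load_entry_point_plugins(group="airflow.plugins"):
    """Return the objects that installed packages register under `group`."""
    eps = entry_points()
    # Python 3.10+ exposes .select(); older versions return a dict of lists.
    if hasattr(eps, "select"):
        selected = eps.select(group=group)
    else:
        selected = eps.get(group, [])

    plugins = []
    for ep in selected:
        try:
            # ep.load() imports the module and returns the named object.
            plugins.append(ep.load())
        except Exception as exc:
            # A broken plugin should not prevent the others from loading.
            print(f"Failed to load plugin {ep.name!r}: {exc}")
    return plugins
```

Because discovery goes through package metadata, a plugin installed with `pip` is found without copying files into a plugins folder.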
commit f26660fcd7164b43441dc20d79eb051a5f59a39c Author: Tao Feng <tfeng@lyft.com> Date: Wed Jan 9 12:23:03 2019 -0800 Rename plugins_manager.py to test_xx to trigger tests (#4464) commit 1d37beba3e1a3cf76967259c2cb819ce5b0084f6 Author: Stefan Seelmann <mail@stefan-seelmann.de> Date: Fri Jan 11 10:58:14 2019 +0100 Visualize reschedule state in all views (#4408) * [AIRFLOW-3589] Visualize reschedule state in all views * Add explicit `UP_FOR_RESCHEDULE` state * Add legend and CSS to views * [AIRFLOW-3589] Visualize reschedule state in all views * Use set or tuple instad of list * Use `with` statement for session handling commit 3b066368f579a3523d7aba563d5dd825d87f8ce3 Author: Kamil Breguła <mik-laj@users.noreply.github.com> Date: Fri Jan 11 07:46:22 2019 +0100 Docs: Fix paths to GCS transfer operator (#4479) commit 20944272a40d723238ac6b5369a2bd35f72386d0 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Wed Jan 9 23:06:42 2019 +0000 Escape links generated in model views (#4463) commit 0856fb5df5c88ea2ff13527f81c1aaacfe59c39a Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Jan 9 23:09:48 2019 +0000 Add dependency for Enum (#4468) commit 7b9e03c23356e7cd4044a43ab2574f523115c888 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Wed Jan 9 21:50:02 2019 +0000 Update Updating instructions for changes in 1.10.2 commit 8be3f469d97f1d30f6b40a9c30a476659228cc67 Author: Jarek Potiuk <jarek@potiuk.com> Date: Wed Jan 9 21:36:39 2019 +0100 Cleanup of GCP Cloud SQL Connection (#4451) commit 68e2ea0786ae2f1ff4852ab9fe1c28669ebcab48 Author: dima-asana <42555784+dima-asana@users.noreply.github.com> Date: Fri Oct 12 00:14:47 2018 -0700 Respect task start_date when different from dag's (#4010) commit a95dc3c3befe3b2725186051f894ccc1e4e2bca4 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Wed Jan 9 00:10:03 2019 +0000 Remove Flake8 Diff checker commit ec97c480eb39ecde01013e499bb54828343935d6 Author: Joshua Carp <jm.carp@gmail.com> Date: Thu Oct 4 03:20:24 2018 -0400 Update 
flask-appbuilder (#3937) commit b6b107725b915a2e6fa69d73d93fa388b047aceb Author: dima-asana <42555784+dima-asana@users.noreply.github.com> Date: Thu Oct 11 01:55:15 2018 -0700 More resillient database use in CI (#4014) commit 31630ccbf42577c814c6ce372be0a077e09cb816 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Fri Sep 21 16:37:55 2018 +0200 Remove preloading of MySQL testdata (#3911) One of the things for tests is being self contained. This means that it should not depend on anything external, such as loading data. This PR will use the setUp and tearDown to load the data into MySQL and remove it afterwards. This removes the actual bash mysql commands and will make it easier to dockerize the whole testsuite in the future commit 633829b7171234b33f244abd28539c53161e0941 Author: Kengo Seki <sekikn@apache.org> Date: Wed Aug 1 18:07:34 2018 +0900 Brush up the CI script for minikube commit 0db140666298acdfe0a9070d6bb38f876713ec8d Author: Felix <feluelle@users.noreply.github.com> Date: Tue Jan 8 10:36:52 2019 +0100 Fix example http operator (#4455) commit 72e9543f8e6cce218b124728876e2e6fc677966e Author: Kaxil Naik <kaxilnaik@apache.org> Date: Tue Jan 8 01:50:49 2019 +0000 Fix flake8 issues commit 759459a5a02b39649201fda395dfbf64f9991fe9 Author: Jeff Payne <jeffkpayne@gmail.com> Date: Wed Sep 12 14:42:13 2018 -0700 Allow custom 'job_error_states' in dataproc ops (#3884) commit f12aee6b631d465da65fec78a6e945f1654a4ecd Author: gseva <gavrilovseva@gmail.com> Date: Wed Dec 19 06:48:57 2018 -0300 Make hmsclient optional in airflow.hooks.hive_hooks (#4080) Delay the import right up until it is needed, like how we do with the thrift imports. commit 804c0b1886dabe145ba69e306c25c1d8d98cd08b Author: Kengo Seki <sekikn@apache.org> Date: Tue Aug 7 01:42:02 2018 +0900 Fix scheduler_ops_metrics.py to work (#3653) This PR fixes timezone problem in scheduler_ops_metrics.py and makes its timeout configurable. 
commit 652cab9c7bc8dfbe229536d89348365f245828e7 Author: juhwi.lee <juhwi.lee@navercorp.com> Date: Sun Jul 15 12:14:21 2018 +0200 add job properties update in hive to druid operator. Closes #3600 from happyjulie/AIRFLOW-2751 commit 8e591079499dcd495059d2668ff9de102a072fdb Author: Fokko Driesprong <fokko@driesprong.frl> Date: Sun Sep 16 14:31:10 2018 +0200 Log how many rows are read from Postgres (#3905) To know how much data is being read from Postgres, it is useful to write the row count to the Airflow log. Previously, when there was no data, a single file was still created. This is not something that we want, and therefore we've changed this behaviour. Refactored the tests to make use of Postgres itself since we have it running. This makes the tests more realistic, instead of mocking everything. commit 44463c10979163674d7e9cf6c1ad443e6a15ea8f Author: Kevin Yang <kevin.yang@airbnb.com> Date: Wed Jul 11 10:28:06 2018 +0200 Make task instance context available for hive queries commit d38cc5034cbf32bea4e8f50e2d5fcff770628f22 Author: Fokko Driesprong <fokkodriesprong@godatadriven.com> Date: Fri Sep 21 16:36:28 2018 +0200 Remove unused imports commit f676c887e27bcf54a21249f67623297e0719cbf1 Author: johnhofman <johncarlhofman@gmail.com> Date: Fri Sep 28 12:04:29 2018 +0200 Fix PythonVirtualenvOperator tests (#3968) The recent update to the CI image changed the default python from python2 to python3. The PythonVirtualenvOperator tests expected python2 as default and fail due to serialisation errors.
commit 05314944fb27b8b337dc4baa856d7b1f00445539 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Fri Sep 21 16:25:54 2018 +0200 Fix Flake8 violations (#3931) commit c0f450f014c07d03271712600745bc95c75113fe Author: Fokko Driesprong <fokkodriesprong@godatadriven.com> Date: Thu Nov 8 00:02:18 2018 +0100 Make flake8 compliant commit 1291037f7fcd711e834317b6c566546ab962db41 Author: Mike Ascah <mike.ascah@joinroot.com> Date: Fri Jul 20 13:46:50 2018 +0200 Add except type to broad S3Hook try catch clauses S3Hook will silently fail if given a conn_id that does not exist. The calls to check_for_key done by an S3KeySensor will never fail if the credentials object is not configured correctly. This adds the expected ClientError exception type when performing a HEAD operation on an object that doesn't exist to the try catch statements so that other exceptions are properly raised. Closes #3616 from mascah/AIRFLOW-2771-S3hook-except-type commit d6457df20786e50dfd324ab091e1749684605695 Author: Fokko Driesprong <fokko@driesprong.frl> Date: Tue Aug 21 00:44:36 2018 +0200 Fix Flake8 violations (#3772) commit 9bd3eac129e8c7f50d8727e2892e65416de36b55 Author: Matt Revell <nightowlmatt@gmail.com> Date: Thu Aug 2 08:43:39 2018 +0100 Handle getsource() calls gracefully commit f10699c09538c8ab8b86abaf3fd8ca0c9d3002e3 Author: BrechtDeVlieger <brechtdevlieger@hotmail.com> Date: Tue Dec 11 19:20:14 2018 +0100 Fix integrity error in rbac AirflowSecurityManager (#4305) This was caused by the variable `role` being shadowed in a loop statement.
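The kind of bug named in the last commit above (#4305), a variable shadowed by a loop, can be reproduced in a few lines. The functions below are hypothetical and are not the AirflowSecurityManager code; they only show how a `for` loop that reuses an existing name silently rebinds it.

```python
def sync_buggy(role, all_roles):
    """Loop variable reuses the name `role` and overwrites the argument."""
    visited = []
    for role in all_roles:  # rebinds the parameter `role` on every iteration
        visited.append(role)
    # `role` is now the last element of all_roles, not the caller's value.
    return role, visited


def sync_fixed(role, all_roles):
    """A distinct loop variable leaves the argument untouched."""
    visited = []
    for other_role in all_roles:
        visited.append(other_role)
    return role, visited
```

Calling both with `"Admin"` and `["Viewer", "User"]` shows the difference: the buggy version hands back `"User"`, the fixed one `"Admin"`.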
commit 0bd6ff1d405919d6d165116ad5129e7c57a3b5f8 Author: Riccardo Bini <odracci@gmail.com> Date: Mon Dec 31 06:03:33 2018 +0100 Fix Kubernetes operator with git-sync (#3770) commit f70eb6d4c0df08987ed1c7ea3e36ff4909bd436f Author: Fokko Driesprong <fokko@driesprong.frl> Date: Fri Oct 12 23:22:52 2018 +0200 Make flake8 compliant (#4035) commit 049645ede236698aba0e8fdafbf2bc8e7630b4f5 Author: Israel Knight <israel.s.knight@gmail.com> Date: Thu Sep 6 00:07:28 2018 -0700 Implemented DatabricksRunNowOperator for jobs/run-now … (#3813) Add functionality to kick of a Databricks job right away. * Per feedback: fixed a documentation error, reintegrated the execute and on_kill onto the objects. * Fixed a documentation issue. commit 7cb23c3871496c2560ecd342f1ac1b3c9e5f5681 Author: Giovanni Lanzani <gglanzani@users.noreply.github.com> Date: Wed Nov 7 23:07:48 2018 +0100 Simplify Kerberos code (#3563) Some functions were not used. On top of that, the `principal_from_username` function was getting the wrong config value ("security" instead of "kerberos"). Since the results were only used by `kerberos.checkPassword`, and the function can cope with needing a realm in the `username` when `realm` is provided, we removed the `principal_from_username` function altogether. commit 28f9f7b7b9a7212a4f9aecf492124b1b5004a6b4 Author: Verdan Mahmood <verdan.mahmood@gmail.com> Date: Tue Jul 24 01:07:15 2018 +0100 Limit DAGs parsing to once only Closes #3614 from verdan/double-dag-parsing commit d6916e389fcf7227bea6dbe66a7656e53689cd9c Author: Kengo Seki <sekikn@apache.org> Date: Tue Jul 17 13:52:28 2018 +0100 Add subcommands to delete and list users Currently, adding user is the only operation that CLI has on RBAC. This PR adds functionality to delete and list users via CLI. 
Closes #3610 from sekikn/AIRFLOW-2750 commit add64ef1bb48a8a16db6dc71b4552e918435c57a Author: Tao feng <tfeng@lyft.com> Date: Mon Jul 16 13:13:42 2018 -0700 Airflow DAG level access (#3197) commit cbb809e9ffd350aa13c348b3557f5b387983f75a Author: Kevin Yang <kevin.yang@airbnb.com> Date: Thu Jun 28 13:30:36 2018 -0700 Add set failed for DagRun and task in tree view (#3255) commit 3b634feea05ca64321eb53af0c031b902020dae7 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Sep 5 00:46:41 2018 +0100 Move Kubernetes example DAGs to contrib commit 32e190b655362ffd1403fe29558c78af70919bb0 Author: Riccardo Bini <odracci@gmail.com> Date: Fri Sep 21 14:36:09 2018 +0200 Fix Kubernetes CI (#3922) commit 7d710f68c415e096f8566d1fef444d5b8cc124cc Author: Fokko Driesprong <fokko@driesprong.frl> Date: Sat Aug 25 19:50:16 2018 +0200 Enable Codecov on Docker-CI Build (#3780) - Add missing variables and use codecov instead of coveralls. The issue why it wasn't working was because missing environment variables. The codecov library heavily depends on the environment variables in the CI to determine how to push the reports to codecov. - Remove the explicit passing of the variables in the `tox.ini` since it is already done in the `docker-compose.yml`, having to maintain this at two places makes it brittle. - Removed the empty Codecov yml since codecov was complaining that it was unable to parse it commit 1bfe1f5a4edb582ce67e8c61dfc7319b1f6e9aef Author: Gerardo Curiel <gerardo@gerar.do> Date: Wed Aug 22 18:26:54 2018 +1000 Dockerise CI pipeline (#3393) commit 4db444226734ac50f23edd78312e8aed2432282a Author: Kaxil Naik <kaxilnaik@apache.org> Date: Mon Jan 7 21:05:50 2019 +0000 Revert "[AIRFLOW-XXX] Switch to openjdk8 in Travis tests" This reverts commit 47ab0401d9a7c369151cc1800fee6adfe0efde53. 
commit 23fe47c970abdae9aa9f66ccfb0ec05c899b24c9 Author: Kevin Pullin <kevin.pullin@gmail.com> Date: Mon Jan 7 12:46:05 2019 -0800 Support global k8s affinity and toleration configs (#4247) commit e5d11820525af25ad287c0fc7edf0a5656cd62b5 Author: Tao Feng <tfeng@lyft.com> Date: Sun Jan 6 20:45:49 2019 -0800 Fix a flake8 error to unblock CI (#4453) commit 0a94c7e3677fc4c3227bc32e2b78baeb1da7134e Author: Raja Gangopadhya <raja.gangopadhya@remix.com> Date: Sun Jan 6 13:59:55 2019 -0800 Resolve a bug in adding password_auth to api as auth method (#4343) commit 400a460b1bf1ad2c83aa233e593ef7ab9321dcb7 Author: Dana Ma <dana.ma537@gmail.com> Date: Mon Jan 7 08:51:01 2019 +1100 Add region param for EMR jobflow creation (#4418) commit 294317f2deee0a3ff517949261986c1d0da806e4 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Jan 6 21:40:54 2019 +0000 Fix test for GCS to GCS Transfer Hook (#4452) commit 5afde1d4e726f5f60b3d6cf3bf604859fe3be6b7 Author: Joshua Carp <jm.carp@gmail.com> Date: Sun Jan 6 14:35:31 2019 -0500 Add gcs to gcs transfer operator. 
(#4331) commit 4a8cbb084377e2d5e98b486b8645bc5c00023cf4 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Dec 2 11:08:26 2018 +0000 Add missing GCP operators to Docs (#4260) commit 01513d55c62f97186c168adf941c897071e9be73 Author: Tao Feng <tfeng@lyft.com> Date: Sat Jan 5 06:05:25 2019 -0800 Remove incubation/incubator mention (#4419) commit d5f50c2532c1cb08f9b09a733963b19044ae4fff Author: r39132 <siddharthanand@yahoo.com> Date: Mon Sep 10 14:30:35 2018 -0700 Readme updates : Add Slack & Twitter, remove Gitter commit 0ec010edd1c91cc4b2cae608ed29b5db9c80d14c Author: r39132 <siddharthanand@yahoo.com> Date: Thu Sep 6 11:57:15 2018 -0700 Update Text & Images in Readme.md commit 7449cf4290d698239628adcbfef412755a76496a Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Tue Sep 4 08:34:20 2018 +0100 Add badge to show supported Python versions (#3839) commit a7df3bb1bbb0def34b019dcae1376e92353a1e5d Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Thu May 24 16:20:16 2018 +0100 Update PR tool to push directly to Github commit 4a9b98cb2baeed2d0feec1d898947addd4337706 Author: Ash Berlin-Taylor <ash_github@firemirror.com> Date: Thu May 24 16:05:37 2018 +0100 Flake8 fixes on dev/airflow-pr commit 9f40dd9557c1aa87241c90ab29154a2105e9545d Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Sep 29 11:35:26 2018 +0100 Update PR tool to remove outdated info (#3978) commit 99ad866a7f47d58e72ea5b651ff12e8f5dcf9f71 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Sep 5 01:04:42 2018 +0100 Replace 'Airbnb Airflow' with 'Apache Airflow' (#3845) commit c99303398b11ce88d5c263777e84d960d881a29f Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Jan 5 13:15:56 2019 +0000 Fix GCP Spanner Test (#4440) commit bac9dcea19fbb7671790381aff7ce30c9bd538b0 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Jan 5 02:17:45 2019 +0000 Make execution_date templated in TriggerDagRunOperator (#4359) commit 55a96af903c9b50a494e2163d09c7a1e803e50d5 Author: aoen <aoen@users.noreply.github.com> Date: 
Mon Dec 31 08:31:11 2018 +0200 Fix next_ds/prev_ds semantics for manual runs (#4385) commit 704ce1a6bf029ead0a04ea242b922db6b3c55cc3 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Nov 25 21:44:07 2018 +0000 Add templated field in TriggerDagRunOperator (#4228) * [AIRFLOW-1196][AIRFLOW-2399] Make trigger_dag_id a templated field for TriggerDagRunOperator * Update dagrun_operator.py commit 69af9f0095edeae1a0a480857a0192ca1e579755 Author: Jarek Potiuk <jarek@potiuk.com> Date: Sat Jan 5 13:06:56 2019 +0100 Add GCP Spanner Database Operators (#4353) commit ab05880739f941fc5924b1477ee666e6e626a89d Author: Jarek Potiuk <jarek@potiuk.com> Date: Fri Jan 4 14:54:33 2019 +0100 Update Cloud SQL Proxy to have shorter path for UNIX socket (#4350) commit d92221d858c072b7c475bbeb9492d669e2d28f98 Author: Sumit Maheshwari <sumeet.manit@gmail.com> Date: Fri Jan 4 19:25:56 2019 +0530 Placeholder support in connections form (#4185) commit 43e813c4ccdb96461b182be1c2097654ffc9eec9 Author: Dariusz Aniszewski <dariusz@aniszewski.eu> Date: Fri Jan 4 14:50:15 2019 +0100 Add Google Cloud BigTable operators (#4354) commit 8caff95104ec8142c45f0d98f0787455808a4e1e Author: Conrad Lee <conradlee@gmail.com> Date: Fri Jan 4 00:18:05 2019 +0100 For gcs_to_bq: add missing init of schema_fields var (#4430) commit 5e11f9744fb2d35bbfbd20b6645876eb93d2c304 Author: Chinh Nguyen <chinhngt@gmail.com> Date: Thu Jan 3 14:39:24 2019 -0800 Fix AirflowException import (#4389) Looks like the class path changed and broke wasb_hook commit f5123265fc729cc4965079b9c21d981d16ca8d01 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Wed Jan 2 12:36:23 2019 +0000 Fix Type Error for BigQueryOperator (#4384) commit 1a440cacc6d5042c3614544580c1250673bc7b6d Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Fri Sep 28 11:51:04 2018 +0100 Fix Kubernetes CI (#3957) commit 7fea1fb0f7431496eadfbc58fb0f3afabab2f045 Author: Kaxil Naik <kaxilnaik@apache.org> Date: Thu Jan 3 20:50:09 2019 +0000 Add license to Contrib Example DAG Init 
file commit 9cdc1e18fd31fed03a10b40227f8ab74f84a0188 Author: Steve Jacobs <brokenjacobs@gmail.com> Date: Wed Jan 2 00:57:08 2019 -0700 Add support for https and user auth (#2879) commit bccaacdda3785fd47d643b3fc87b352fbc544398 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Thu Jan 3 10:51:46 2019 +0000 Fix WeekDay Sensor Example (#4431) commit cfdb4b46273bf5532629b2d4269d2881e3a4c4dc Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Thu Jan 3 09:35:35 2019 +0000 Add DayOfWeek Sensor (#4363) * [AIRFLOW-3560] Add WeekEnd & DayOfWeek Sensors * Change to using Enum * Fix Docstring * Refactor into a Single Sensor commit a0c7c9f1c7bab372ee66f16f7b25505c60c5e9cd Author: Kaxil Naik <kaxilnaik@apache.org> Date: Mon Dec 31 12:50:01 2018 +0000 Fix Flake8 issues commit fb0e74d4bc0a83e25c5ee1d170f74671175ebf69 Author: Kevin Pullin <kevin.pullin@gmail.com> Date: Sun Dec 16 23:05:26 2018 -0800 Read `dags_in_image` config value as a boolean (#4319) * Read `dags_in_image` config value as a boolean This PR is a minor fix for #3683 The dags_in_image config value is read as a string. However, the existing code expects this to be a boolean. For example, in worker_configuration.py there is the statement: if not self.kube_config.dags_in_image: Since the value is a non-empty string ('False') and not a boolean, this evaluates to true (since non-empty strings are truthy) and skips the logic to add the dags_volume_claim volume mount. This results in the CI tests failing because the dag volume is missing in the k8s pod definition. This PR reads the dags_in_image using the conf.getboolean to fix this error. Rebased on 457ad83e4eb02b7348e5ce00292ca9bd27032651, before the previous dags_in_image commit was reverted. * Revert "Revert [AIRFLOW-2770] [AIRFLOW-3505] (#4318)" This reverts commit 77c368fd228fe5edfdb3304ed4cb000a50667010. 
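The truthiness problem described in the `dags_in_image` commit above (#4319) can be seen with the standard library's `ConfigParser`, used here as a stand-in for Airflow's `conf` object. The inline config text is made up for the example.

```python
from configparser import ConfigParser

parser = ConfigParser()
parser.read_string("[kubernetes]\ndags_in_image = False\n")

# get() returns the raw text, so the value is the non-empty string 'False'.
as_string = parser.get("kubernetes", "dags_in_image")
# getboolean() parses the text into a real boolean.
as_bool = parser.getboolean("kubernetes", "dags_in_image")

# With the string, `not value` is False because any non-empty string is
# truthy, so a branch meant for "DAGs are not in the image" is skipped.
mounts_volume_with_string = not as_string
# With the boolean, the same check behaves as intended.
mounts_volume_with_bool = not as_bool
```

This is why reading the option with a boolean getter, as the commit does with `conf.getboolean`, changes which branch runs.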
commit 29d140fa61480cd9b7a9e9cd30b1b7425fd7d6c0 Author: John Cheng <ckljohn@gmail.com> Date: Tue Nov 6 00:28:01 2018 +0800 Add volume mount to KubernetesExecutorConfig (#3855) Added volumes and volume_mounts to the KubernetesExecutorConfig so `volumes` or `secrets` can be mount to worker pod. commit 9233726288666fdb506a0a7a32754b5e01f223c6 Author: John Cheng <ckljohn@gmail.com> Date: Sun Aug 19 22:07:53 2018 +0800 Set AIRFLOW__CORE__SQL_ALCHEMY_CONN only when needed (#3766) Only when `airflow_configmap` is not provided and `AIRFLOW__CORE__SQL_ALCHEMY_CONN` not in secrets, it is set as an env var. commit a83d355016954a6c7fc46103aefde11200b16870 Author: Aldo Giambelluca <xoen@users.noreply.github.com> Date: Mon Aug 6 21:44:48 2018 +0100 Added `kubernetes.worker_dags_folder` configuration (#3612) It was previously hardcoded to `/tmp/dags`. This causes problems with python import of modules in the DAGs folder. commit c602e2850c64012cf1fd28914b33b35154c819bc Author: Shintaro Murakami <mrkm4ntr@gmail.com> Date: Wed Jul 4 17:48:51 2018 +0100 Fix inconsistency of default config of kubernetes worker Closes #3529 from mrkm4ntr/airflow-2655 commit f216bdd7e58e15b617d38b8a18ec20680cd84bb3 Author: roc <rockerchen@tencent.com> Date: Wed Jun 20 20:37:39 2018 +0200 Add worker_container_image_pull_policy Set worker_container_image_pull_policy in default_airflow.cfg As AIRFLOW-2617 added worker_container_image_pull_policy config to the section of kubernetes, but the airflow_default.cfg was not updated, this PR add worker_container_image_pull_policy to default_airflow.cfg. Closes #3521 from imroc/AIRFLOW-2645 commit 911884bef5106797e503b830c35f6087b0de4117 Author: Ravi Kotecha <kotecha.ravi@gmail.com> Date: Fri Jun 22 16:37:46 2018 +0200 fix config dags_volume_subpath and logs_volume_subpath Make sure you have checked _all_ steps below. 
### JIRA - [x] My PR addresses the following [Airflow JIRA] (https://issues.apache.org/jira/browse/AIRFLOW/) issues and references them in the PR title. For example, "\[AIRFLOW-XXX\] My Airflow PR" - https://issues.apache.org/jira/browse/AIRFLOW-2661 - In case you are fixing a typo in the documentation you can prepend your commit with \[AIRFLOW-XXX\], code changes always need a JIRA issue. ### Description - [x] Here are some details about my PR, including screenshots of any UI changes: Changes the use of `log_volume_subpath` and `dags_volume_subpath` which are now passed into the construction of the worker pod's volumeMounts instead of the volume section (where subPath is not valid). ### Tests - [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason: Unit tests have been added but I'm not sure how to add integration tests for this without breaking the other minikube tests ### Commits - [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git- commit/)": 1. Subject is separated from body by a blank line 2. Subject is limited to 50 characters 3. Subject does not end with a period 4. Subject uses the imperative mood ("add", not "adding") 5. Body wraps at 72 characters 6. Body explains "what" and "why", not "how" ### Documentation - [x] In case of new functionality, my PR adds documentation that describes how to use it. 
No new functionality added ### Code Quality - [x] Passes `git diff upstream/master -u -- "*.py" | flake8 --diff` Closes #3537 from r4vi/AIRFLOW-2661 commit 27a3c893a8bad408d96cd423f71fec017c7c52f0 Author: Tom Kunc <tom.kunc@kinesis.org> Date: Tue Jul 24 01:12:09 2018 +0100 Pass annotations to KubernetesExecutorConfig commit 2ccae80e4358f47470bc94a20ce7f95ce0d85b49 Author: Joshua Carp <jm.carp@gmail.com> Date: Sat Dec 29 19:12:17 2018 -0500 Standardize GKE hook (#4364) commit f2f649a1255991498946fb1edf09080d4caeebdb Author: Cameron Moberg <cjmoberg@gmail.com> Date: Tue Aug 7 09:57:41 2018 -0700 Fix GKEClusterHook catching wrong exception (#3711) commit 8534ef4c2c852e258eba5dcdaf8daf79dacb25f9 Author: Yohei Onishi <vivre214@gmail.com> Date: Sun Dec 30 08:09:00 2018 +0800 Fix TypeError in GCSToS3Op & S3ToGCSOp (#4371) Fix TypeError on GoogleCloudStorageToS3Operator & S3ToGoogleCloudStorageOperator commit 5dd247b2d7e15597414c7e67f1de3325965356b0 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Dec 29 11:00:20 2018 +0000 Add support for location in BigQueryHook (#4324) commit fbab5ca4eedfcbbde5bae86eed45e2aa4b7708a5 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sun Dec 9 22:29:11 2018 +0000 Fix default values in BigQuery Hook & BigQueryOperator (#4274) commit 290da3586da05c792fb9b7f4664aa9bd9c739f46 Author: Ryan Yuan <ryan.yuan@outlook.com> Date: Thu Nov 22 10:16:18 2018 +1100 BigQueryHook's Ability to Create View (#4213) commit 246ad9e4067027450ebbe220edf3582619652cfe Author: Ryan Yuan <ryan.yuan@outlook.com> Date: Sat Nov 17 22:52:03 2018 +1100 Add method to allow inserting rows into BQ table (#4179) commit 7c08b9892de805dc684ba97b9bc74469092e01c0 Author: Kengo Seki <sekikn@apache.org> Date: Fri Nov 16 14:31:24 2018 -0800 Fix BigQueryCursor.execute to work with Python3 (#4198) BigQueryCursor.execute uses dict.iteritems internally, so it fails with Python3 if binding parameters are provided. This PR fixes this problem.
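The Python 3 failure mentioned in the last commit above (#4198) comes from `dict.iteritems`, which exists only on Python 2 dictionaries. The helper below is a hypothetical, simplified parameter binder written for this example, not the hook's actual code; it only shows the portable `.items()` form.

```python
def bind_parameters(operation, parameters):
    """Substitute %(name)s placeholders in `operation` with rendered values."""
    string_parameters = {}
    # .items() works on Python 2 and 3; .iteritems() raises AttributeError
    # on Python 3 because the method was removed from dict.
    for name, value in parameters.items():
        string_parameters[name] = "NULL" if value is None else repr(value)
    return operation % string_parameters
```

For instance, `bind_parameters("SELECT %(a)s, %(b)s", {"a": 1, "b": None})` yields `SELECT 1, NULL`. The `repr` call is a simplification and is not a safe way to escape values in real queries.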
commit e2975e531f718a256e5afbf2cc306ac57f5ef187 Author: Iuliia Volkova <xnuinside@gmail.com> Date: Mon Oct 22 12:03:22 2018 +0300 Support autodetected schemas in BigQuery run_load (#3880) commit f6145a5d140a1180168f33e26c1642ec23ecfcf8 Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Sat Oct 20 13:05:17 2018 +0100 BigQuery Hook - Minor Refactoring (#4066) commit 9da372b97b2fdf47b7868c30784fa88be6ac7e35 Author: Iuliia Volkova <xnuinside@gmail.com> Date: Sun Oct 7 21:49:50 2018 +0300 add get_dataset and get_datasets_list to bigquery_hook (#3894) * [AIRFLOW-3055] add get_dataset and get_datasets_list to bigquery_hook commit a3cf0b3899f7b3b27b48cf132d9ad217b295c87e Author: Iuliia Volkova <xnuinside@gmail.com> Date: Fri Sep 21 17:46:59 2018 +0300 Added BigQueryCreateEmptyDatasetOperator and create_emty_dataset to bigquery_hook (#3876) commit f6bbd7142c323734c0d91a6339362e71401813a8 Author: Gordon Ball <chronitis@gmail.com> Date: Fri Sep 7 18:41:03 2018 +0200 Support cluster fields in bigquery (#3838) This adds a cluster_fields argument to the bigquery hook, GCS to bigquery operator and bigquery query operators. This field requests that bigquery store the result of the query/load operation sorted according to the specified fields (the order of fields given is significant). commit b7e17e181b49941640a575ad33dcc07eabeeb68e Author: Kaxil Naik <kaxilnaik@gmail.com> Date: Fri Aug 31 09:36:24 2018 +0100 Fix Docstrings for Operators (#3820) commit abc89aea43c20772ec659d3245855b4fca6e2b6a Author: Xiaodong <xd_deng@hotmail.com> Date: Tue Aug 28 20:36:29 2018 +0800 Arg `verify` for AwsHook() & S3 sensors/operators (#3764) commit 541793c510d4fa59caa9a62785a837c85adbf4ab Author: Kengo Seki <sekikn@apache.org> Date: Wed Jul 18 09:04:25 2018 -0700 Add a sensor for MongoDB This PR adds a sensor for MongoDB, which waits for some document that matches the given query to be inserted to the specified collection. 
Closes #3611 from sekikn/AIRFLOW-2758 commit 0a49c8fb0c3d3eb057ebea09a44b14453e79616a Author: Kengo Seki <sekikn@apache.org> Date: Wed Jun 20 20:36:32 2018 +0200 Add Cassandra table sensor Just like a partition sensor for Hive, this PR adds a sensor that waits for a table to be created in Cassandra cluster. Closes #3518 from sekikn/AIRFLOW-2640 commit de6d19f57f01a2d3380cd5600233c3e4afdfa8e2 Author: Yuliya Volkova <xnuinside@gmail.com> Date: Mon Sep 3 23:30:22 2018 +0300 Add feature to pass extra api configs to BQ Hook (#3733) commit a5dcbdef5626e46f86873be73655f775aebe3a0e Author: Kazuhiro Sera <seratch@gmail.com> Date: Sun Aug 12 13:11:19 2018 +0900 Fix typos detected by github.com/client9/misspell (#3732) commit bba5721d11528aba525e0c4d4804b1b113fe1f07 Author: Tao Feng <tfeng@lyft.com> Date: Sat Dec 22 10:13:39 2018 -0800 Add a PythonSensor (#4349) commit adcacfc1f2d49d57b4bfa5a81bf784e2ffb66fb7 Author: Xiaodong <xd_deng@hotmail.com> Date: Thu Dec 20 06:43:09 2018 +0800 Fix inconsistent comment in example_python_operator.py (#4337) commit 4137dda52e017d5c9bf8146ce6b0cd9a1ef4d824 Author: eladkal <45845474+eladkal@users.noreply.github.com> Date: Thu Dec 20 00:40:28 2018 +0200 Fix incorrect parameter in SFTPOperator example (#4344) commit 2ef9df6af223e45edb5a11b5872be25b2b0880e5 Author: Szymon Przedwojski <szymon.przedwojski@gmail.com> Date: Wed Dec 19 22:41:53 2018 +0100 Google Cloud Spanner instance database query operator (#4314) commit ef19572c6d04a39be81df2a921c2192cbd8580be Author: gseva <gavrilovseva@gmail.com> Date: Wed Dec 19 06:48:57 2018 -0300 Make hmsclient optional in airflow.hooks.hive_hooks (#4080) Delay the import right up until it is needed, like how we do with the thrift imports. 
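Delaying an import until it is needed, as the commit above (#4080) does for hmsclient, keeps a module importable when an optional dependency is missing. The sketch below is an assumption-laden illustration: the function `get_optional_client` and its error message are hypothetical, and `importlib.import_module` stands in for an `import` statement placed inside the method that needs the package.

```python
import importlib


def get_optional_client(module_name="hmsclient"):
    """Import an optional dependency only when a caller actually needs it."""
    try:
        # The import runs here, at call time, not when this file is loaded.
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise RuntimeError(
            f"{module_name} is required for this feature but is not installed"
        ) from exc
```

Code paths that never call the function work without the package installed; only the feature that needs it fails, and it fails with a clear message.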
commit a80f01ddcb767860991acf0a1e6340d9d29d39f0
Author: Felix <feluelle@users.noreply.github.com>
Date: Mon Dec 17 19:05:21 2018 +0100

    Add missing remote logging field (#4333)

commit 017f745f3f7c45c60f49e1b12c30686319069c98
Author: Kaxil Naik <kaxilnaik@gmail.com>
Date: Sat Dec 15 23:13:36 2018 +0000

    Add 2 options for ts_nodash Macro (#4323)

commit 3f0c50390ec2da69d1c2bde0d86dd78701575b53
Author: Alvin Ali Khaled <aakside@gmail.com>
Date: Wed Nov 14 02:59:30 2018 -0800

    Revise template variables documentation (#4172)

    Updated documentation to elaborate on the (yesterday|tomorrow)_.* variables' relations to the execution date.

commit 5acc53d4d17154d1656336e8b446d414e100de46
Author: thomasbrockmeier <thomas.brockmeier@gmail.com>
Date: Sat Dec 15 16:27:10 2018 +0100

    Airflow Filter_by_owner not working with password_auth (#4276)

    Local users were always a superuser, this adds a column to the DB (and defaults to false, which is going to cause a bit of an upgrade pain for people, but defaulting to not being an admin is the only secure default.)

commit c6a79388ff82cbc2848d7318e554c76814ab843b
Author: Stefan Seelmann <mail@stefan-seelmann.de>
Date: Fri Sep 21 07:00:29 2018 +0200

    Explicit re-schedule of sensors (#3596)

    * [AIRFLOW-2747] Explicit re-schedule of sensors

    Add `mode` property to sensors. If set to `reschedule` an AirflowRescheduleException is raised instead of sleeping which sets the task back to state `NONE`. Reschedules are recorded in new `task_schedule` table and visualized in the Gantt view. New TI dependency checks if a sensor task is ready to be re-scheduled.
    * Reformat sqlalchemy imports
    * Make `_handle_reschedule` private
    * Remove print
    * Add comment
    * Add comment
    * Don't record reschule request in test mode

commit 7abaa169edce078aaf7470e741f12e5d190ae318
Author: Kevin Yang <yrqls21@gmail.com>
Date: Mon Nov 26 21:49:31 2018 -0800

    Add index on dag_id in sla_miss table (#4235)

    The select queries on sla_miss table produce a great % of DB traffic and thus made the DB CPU usage unnecessarily high. It would be a low hanging fruit to add an index and reduce the load.

commit f171593b5e3e6b7cc8a2cea24a2613c69e4368cc
Author: ubermen <kjh3477@gmail.com>
Date: Fri Oct 12 18:38:51 2018 +0900

    Add index 'ti_dag_date' to taskinstance (#3885)

    To optimize query performance

commit d221699325362ac9246f21a1edc76383826c567e
Author: Vardan Gupta <vardanguptacse@gmail.com>
Date: Wed Aug 8 22:43:53 2018 +0530

    Add index on log table (#3709)

commit 7740e1bc259a18f98dbe76076c98f11e7d0ed1d2
Author: Niels Zeilemaker <niels@zeilemaker.nl>
Date: Sat Dec 15 16:11:26 2018 +0100

    Performance fixes for topological_sort of Tasks (#4322)

    For larger DAGs topological_sort was found to be very inefficient. Made some small changes to the code to improve the data structures used in the method.

commit 974bb04c013b0747c2c89edac939077f8be45431
Author: Ash Berlin-Taylor <ash_github@firemirror.com>
Date: Sat Dec 15 11:50:22 2018 +0000

    Fetch more than 50 items in `airflow-jira compare` script (#4300)

commit 9b0e6963979441070a5efab7ac94997cf4d1d8da
Author: Tao feng <tfeng@lyft.com>
Date: Wed Jun 20 20:32:50 2018 +0200

    Add option to query for DAG runs given a DAG ID

    Closes #3515 from feng-tao/airflow-1919

commit d8f709ae1944739990b01190d1bfb9eaca826a2a
Author: Joshua Carp <jm.carp@gmail.com>
Date: Fri Dec 14 07:23:23 2018 -0500

    Explicitly set transfer operator description.
    (#4279)

commit 71a3a0340970b85a75947690dd57741d82f2e973
Author: tal181 <tal181@gmail.com>
Date: Thu Dec 13 03:23:47 2018 +0200

    Add OpenFaaS hook (#4267)

commit 8d178443c606f738bcd2f655b5724b62fb9ea00c
Author: Szymon Przedwojski <szymon.przedwojski@gmail.com>
Date: Thu Dec 13 02:15:43 2018 +0100

    Google Cloud Spanner deploy / delete operators (#4286)

commit f088e6be54f3846583e7fbb542d2b23c45c64928
Author: Andy Cooper <andycooper.s@gmail.com>
Date: Tue Jul 24 01:01:38 2018 +0100

    Add context manager entry points to mongoHook

    Closes #3628 from andscoop/Add-connection-close-to-mongo-hook

commit e14c4e43cbc12bdc0647e4f0f8da3c1719c1d2ca
Author: yangaws <31293788+yangaws@users.noreply.github.com>
Date: Thu Dec 6 11:51:11 2018 -0800

    Add SageMaker doc to AWS integration section (#4278)

commit c1f63705e204db0ca1b1441a27660ef389b86d35
Author: Daniel Imberman <daniel.imberman@gmail.com>
Date: Fri Dec 7 15:39:47 2018 -0800

    Fix Over-logging in the k8s executor (#4296)

    There are two log lines in the k8sexecutor that can cause schedulers to crash due to too many logs.

commit 5567d251d8da5041eefc56f96fbb3166d9d45b4c
Author: Xiaodong <xd_deng@hotmail.com>
Date: Sat Dec 8 09:14:43 2018 +0800

    Keeps records in Log Table when DAG is deleted (#4287)

    Users will use either API or web UI to delete DAG (after DAG file is removed):
    - Using API: provide one boolean parameter to let users decide if they want to keep records in Log table when they delete a DAG. Default value it True (to keep records in Log table).
    - From UI: will keep records in the Log table when delete records for a specific DAG ID (pop-up message is updated accordingly).
commit 65d04fd55d2ea2157201be5869069388ebcddddf
Author: Kaxil Naik <kaxilnaik@gmail.com>
Date: Sun Sep 2 22:27:19 2018 +0100

    Fix typo in docstring of gcs_to_bq (#3833)

commit 9dbe78c512c7219a850b99d61d1320ef8f0d35f4
Author: Tom Miller <tmiller@microsoft.com>
Date: Thu Dec 6 10:18:44 2018 -0800

    Implement an Azure CosmosDB operator (#4265)

    Add an operator and hook to manipulate and use Azure CosmosDB documents, including creation, deletion, and updating documents and collections. Includes sensor to detect documents being added to a collection.

commit df178de8130cc39f778e77a3c1792e3563248cce
Author: yangaws <31293788+yangaws@users.noreply.github.com>
Date: Thu Dec 6 11:51:11 2018 -0800

    Add SageMaker doc to AWS integration section (#4278)

commit 1a50b8498e86741162b74962d8a59911c3ff5c1f
Author: John Cheng <ckljohn@gmail.com>
Date: Thu Nov 15 06:16:08 2018 +0800

    Add MongoDB connection (#4154)

commit 577e026d1608bfd3c087b21b7c56373d839a61fc
Author: Kaxil Naik <kaxilnaik@gmail.com>
Date: Mon Dec 3 12:39:46 2018 +0000

    Allows creating intermediate dirs in SFTPOperator (#4270)

commit 6110c7152146d81e52c0a113e913929831956483
Author: Xiaodong <xd_deng@hotmail.com>
Date: Thu Aug 30 03:20:11 2018 +0800

    Arg check & better doc - SSHOperator & SFTPOperator (#3793)

    There may be different combinations of arguments, and some processings are being done 'silently', while users may not be fully aware of them.

    For example
    - User only needs to provide either `ssh_hook` or `ssh_conn_id`, while this is not clear in doc
    - if both provided, `ssh_conn_id` will be ignored.
    - if `remote_host` is provided, it will replace the `remote_host` which was defined in `ssh_hook` or predefined in the connection of `ssh_conn_id`

    These should be documented clearly to ensure it's transparent to the users. log.info() should also be used to remind users and provide clear logs.

    In addition, add instance check for ssh_hook to ensure it is of the correct type (SSHHook).

    Tests are updated for this PR.
commit 3fbbded39c9bfc0a5e055eac29891325671e504d
Author: John Cheng <ckljohn@gmail.com>
Date: Sun Aug 19 22:10:25 2018 +0800

    Add remote_host of SSH/SFTP operator as templated field (#3765)

    It allows remote_host to be passed to operator with XCOM.

commit f81f14ec43fd035e0b71400b8016aa3145f27fd4
Author: Cameron Moberg <cjmoberg@gmail.com>
Date: Tue Jul 31 15:24:34 2018 -0400

    Update SSH Operator's Hook to respect timeout (#3666)

commit b6c5d1f721c91bea27de336b8500c2854598e370
Author: Ash Berlin-Taylor <ash_gi…
[ Description ]
There was no index on (dag_id, execution_date). So when the scheduler fetches all task instances (TIs) of a DAG run with a query such as "select * from task_instance where dag_id = 'some_id' and execution_date = '2018-09-01 ...'", the query uses the ti_dag_state index. (I checked this in MySQL Workbench. I had expected ti_state_lkp, but that was not the case.) This is probably not a problem while the range of execution_date is small (under 1000 DAG runs), but I experienced slow allocation of TIs once a DAG had accumulated 1000+ DAG runs. So I am now running Airflow with a new index, ti_dag_date (dag_id, execution_date), added to the task_instance table. I have attached the results of my test :)
[ Test ]
I tested this on version 1.10. The query is reached through the following call chain:
models.py > DAG.run
jobs.py > BaseJob.run
jobs.py > BackfillJob._execute
jobs.py > BackfillJob._execute_for_run_dates
jobs.py > BackfillJob._task_instances_for_dag_run
models.py > DagRun.get_task_instances
    tis = session.query(TI).filter(
        TI.dag_id == self.dag_id,
        TI.execution_date == self.execution_date,
    )
Many slow-query entries for this query appear in the MySQL log file (sample query below):
"select * from task_instance where dag_id = 'some_id' and execution_date = '2018-09-01 ...'"
[ASIS] current
[TOBE] after adding new index
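For anyone who wants to reproduce the index-selection part of this comparison without a MySQL instance, here is a minimal sketch. It assumes SQLite (Python's built-in sqlite3) as a stand-in for MySQL and a reduced task_instance table holding only the columns relevant here, with ti_dag_state as the only pre-existing index. It is not the setup used for the test above, and MySQL's optimizer may choose differently. The helper name query_plan is hypothetical.

```python
import sqlite3

# Assumption: a reduced stand-in for Airflow's task_instance table,
# with only the columns and the one existing index that matter here.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE task_instance (
        task_id TEXT NOT NULL,
        dag_id TEXT NOT NULL,
        execution_date TEXT NOT NULL,
        state TEXT
    );
    CREATE INDEX ti_dag_state ON task_instance (dag_id, state);
    """
)

# Same shape as the slow query quoted in the description.
QUERY = (
    "SELECT * FROM task_instance "
    "WHERE dag_id = ? AND execution_date = ?"
)
PARAMS = ("some_id", "2018-09-01 00:00:00")


def query_plan(connection):
    """Return SQLite's description of how QUERY would be executed."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, PARAMS).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# Plan with only the existing index available.
plan_before = query_plan(conn)

# The composite index proposed in this PR.
conn.execute(
    "CREATE INDEX ti_dag_date ON task_instance (dag_id, execution_date)"
)
plan_after = query_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

With the new index in place, the planner can match both equality conditions in the index instead of narrowing by dag_id alone and then filtering rows on execution_date. The sketch only shows which index is selected; it says nothing about timings on a real task_instance table.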
Jira
Description
Tests
Commits
Documentation
Code Quality
git diff upstream/master -u -- "*.py" | flake8 --diff