airflow.git
2 months ago Update Dockerfile - set proper airflow version (#22925) v2-2-stable
piotrekklis [Tue, 12 Apr 2022 07:33:04 +0000 (09:33 +0200)] 
Update Dockerfile - set proper airflow version (#22925)

The currently set Airflow version prevents the image from being built.

2 months ago Update Airflow 2.2.5 release date on Changelog (#22732)
Ephraim Anierobi [Mon, 4 Apr 2022 16:37:56 +0000 (17:37 +0100)] 
Update Airflow 2.2.5 release date on Changelog (#22732)

3 months ago fixup! Add 2.2.5 to CHANGELOG.txt and UPDATING.md 2.2.5 2.2.5rc3
Ephraim Anierobi [Wed, 30 Mar 2022 06:06:10 +0000 (07:06 +0100)] 
fixup! Add 2.2.5 to CHANGELOG.txt and UPDATING.md

3 months ago Add 2.2.5 to CHANGELOG.txt and UPDATING.md
Ephraim Anierobi [Thu, 24 Mar 2022 01:39:28 +0000 (02:39 +0100)] 
Add 2.2.5 to CHANGELOG.txt and UPDATING.md

3 months ago Update FAB to latest released from 3.4 line (3.4.5) (#22596)
Jarek Potiuk [Tue, 29 Mar 2022 20:15:38 +0000 (22:15 +0200)] 
Update FAB to latest released from 3.4 line (3.4.5) (#22596)

We checked that the changes introduced between 3.4.4 and 3.4.5
do not require us to change the vendored-in security manager.

(cherry picked from commit d4846e4137b84e86ff107da6e495579c143fe7bd)

3 months ago fixup! Add 2.2.5 to CHANGELOG.txt and UPDATING.md 2.2.5rc2
Ephraim Anierobi [Mon, 28 Mar 2022 18:44:12 +0000 (19:44 +0100)] 
fixup! Add 2.2.5 to CHANGELOG.txt and UPDATING.md

3 months ago Add 2.2.5 to CHANGELOG.txt and UPDATING.md
Ephraim Anierobi [Thu, 24 Mar 2022 01:39:28 +0000 (02:39 +0100)] 
Add 2.2.5 to CHANGELOG.txt and UPDATING.md

3 months ago Revert "Fix handling some None parameters in kubernetes 23 libs. (#21905)"
Jarek Potiuk [Mon, 28 Mar 2022 15:38:20 +0000 (17:38 +0200)] 
Revert "Fix handling some None parameters in kubernetes 23 libs. (#21905)"

This reverts commit 24c84f0acee622f6bba4b8d73f2e9a6831cd364a.

3 months ago Revert "Update Kubernetes library version (#18797)"
Ash Berlin-Taylor [Tue, 4 Jan 2022 12:09:35 +0000 (12:09 +0000)] 
Revert "Update Kubernetes library version (#18797)"

This reverts commit cb9cdf5285502381298bf459d000dc689c6aab2a.

3 months ago Revert "Remove RefreshConfiguration workaround for K8s token refreshing (#20759)"
Jarek Potiuk [Mon, 28 Mar 2022 15:32:36 +0000 (17:32 +0200)] 
Revert "Remove RefreshConfiguration workaround for K8s token refreshing (#20759)"

This reverts commit d39197fd13b0d96c2ab84ca3f1f13391dbf59572.

3 months ago Fix rat-excludes and issue template licence (#22550) 2.2.5rc1
Jarek Potiuk [Sun, 27 Mar 2022 12:03:56 +0000 (14:03 +0200)] 
Fix rat-excludes and issue template licence (#22550)

3 months ago Add 2.2.5 to CHANGELOG.txt and UPDATING.md
Ephraim Anierobi [Thu, 24 Mar 2022 01:39:28 +0000 (02:39 +0100)] 
Add 2.2.5 to CHANGELOG.txt and UPDATING.md

3 months ago Update Kubernetes library version (#18797)
Ash Berlin-Taylor [Tue, 4 Jan 2022 12:09:35 +0000 (12:09 +0000)] 
Update Kubernetes library version (#18797)

Previously we pinned this version as v12 as a change to Kube library
internals meant v1.Pod objects now have a logger object inside them, and
couldn't be pickled on Python 3.6.

To fix that we have "backported" the change in Python 3.7 to make Logger
objects be pickled "by name". (In Python 3.7 the change adds
`__reduce__` methods to the Logger and RootLogger objects, but here
we achieve it with the `copyreg` stdlib module so we don't monkeypatch
anything.)

This fix is also applied to Airflow core in a separate commit, but we
also apply it here in the provider so that the cncf.kubernetes client
library can be updated but still used with older versions of Airflow
that don't have this fix.

(cherry picked from commit 7222f68d374787f95acc7110a1165bd21e7722a1)
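
A minimal sketch of the pickle-by-name approach described above, using only the standard library. The reducer name `_reduce_logger` is illustrative and not taken from the Airflow source, so the actual patch may differ in detail.

```python
import copyreg
import logging
import pickle


def _reduce_logger(logger):
    # Hypothetical reducer: rebuild the logger by looking it up by name
    # instead of pickling its internal state (handlers, locks, ...).
    if isinstance(logger, logging.RootLogger):
        return logging.getLogger, ()
    return logging.getLogger, (logger.name,)


# Registering through copyreg avoids monkeypatching the Logger classes.
# pickle looks reducers up by exact type, so both classes are registered.
copyreg.pickle(logging.Logger, _reduce_logger)
copyreg.pickle(logging.RootLogger, _reduce_logger)

original = logging.getLogger("airflow.example")
restored = pickle.loads(pickle.dumps(original))
root_restored = pickle.loads(pickle.dumps(logging.getLogger()))
```

Because `logging.getLogger` returns the same object for the same name, unpickling yields the already-existing logger rather than a copy.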

3 months ago Fix handling some None parameters in kubernetes 23 libs. (#21905)
Jarek Potiuk [Tue, 1 Mar 2022 19:33:15 +0000 (20:33 +0100)] 
Fix handling some None parameters in kubernetes 23 libs. (#21905)

Kubernetes 23.* is more picky when it comes to values passed to
Pod Generator - it requires:

* imagePullPolicy
* dnsPolicy
* restartPolicy

to be not None.

We fix this by simply not setting those fields
when they are None.

(cherry picked from commit e856aba9ab69094787dfd0f6e911f20782069e92)
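
The skip-if-None approach can be sketched as follows. The helper name and its keyword arguments are hypothetical and only mirror the three fields listed above; this is not the Pod Generator code itself.

```python
def optional_pod_fields(image_pull_policy=None, dns_policy=None, restart_policy=None):
    """Return only the fields that were actually provided (hypothetical helper)."""
    candidates = {
        "image_pull_policy": image_pull_policy,
        "dns_policy": dns_policy,
        "restart_policy": restart_policy,
    }
    # Leave out None values entirely instead of passing them on explicitly.
    return {name: value for name, value in candidates.items() if value is not None}


fields = optional_pod_fields(restart_policy="Never")
```

The resulting dict can then be unpacked as keyword arguments, so a field that was never set is never passed as None.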

3 months ago Remove RefreshConfiguration workaround for K8s token refreshing (#20759)
Daniel Standish [Wed, 16 Mar 2022 19:33:01 +0000 (12:33 -0700)] 
Remove RefreshConfiguration workaround for K8s token refreshing (#20759)

A workaround was added (https://github.com/apache/airflow/pull/5731) to handle the refreshing of EKS tokens.  It was necessary because of an upstream bug.  It has since been fixed (https://github.com/kubernetes-client/python-base/commit/70b78cd8488068c014b6d762a0c8d358273865b4) and released in v21.7.0 (https://github.com/kubernetes-client/python/blob/master/CHANGELOG.md#v2170).

(cherry picked from commit 7bd165fbe2cbbfa8208803ec352c5d16ca2bd3ec)

3 months ago Add the new Airflow Trove Classifier to setup.cfg (#22241)
Jarek Potiuk [Mon, 14 Mar 2022 11:33:50 +0000 (12:33 +0100)] 
Add the new Airflow Trove Classifier to setup.cfg (#22241)

We have new Trove Classifiers in PyPI for Apache Airflow:

https://github.com/pypa/trove-classifiers/pull/87

This PR adds it for Airflow. The next release of Providers will
add the classifiers for providers.

(cherry picked from commit d3c65b64e9ff4c24a1a899af0d5eba954775f646)

3 months ago Remove cache timeout limit
Jarek Potiuk [Fri, 25 Mar 2022 23:52:10 +0000 (23:52 +0000)] 
Remove cache timeout limit

3 months ago Fixed Dask TLS tests to work on Python 3.6
Jarek Potiuk [Fri, 25 Mar 2022 17:48:39 +0000 (17:48 +0000)] 
Fixed Dask TLS tests to work on Python 3.6

3 months ago Make regular expressions deep-copyable in Python 3.6
Jarek Potiuk [Fri, 25 Mar 2022 17:29:15 +0000 (17:29 +0000)] 
Make regular expressions deep-copyable in Python 3.6

3 months ago Check and disallow a relative path for sqlite (#22530)
YenchenLiu [Fri, 25 Mar 2022 16:56:12 +0000 (00:56 +0800)] 
Check and disallow a relative path for sqlite (#22530)

(cherry picked from commit 96e880d6ab33e05a0dc1d1e2b2178ed1053490af)

3 months ago Replace timedelta.max with year long timedelta in test_manager (#22527)
Jarek Potiuk [Fri, 25 Mar 2022 14:56:41 +0000 (15:56 +0100)] 
Replace timedelta.max with year long timedelta in test_manager (#22527)

Timedelta.max used in tests is not realistic and in some
circumstances, when it is added to date, it might cause
date OverflowError. Using long (but not 999999999 days long)
timedelta solves the problem.

(cherry picked from commit 18da1217d7ae593ff33c681353b027fac9252523)
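
The overflow is easy to reproduce with the standard library alone. The start date below is arbitrary and chosen only for illustration.

```python
from datetime import datetime, timedelta

start = datetime(2022, 3, 25)

try:
    start + timedelta.max  # 999999999 days is far beyond datetime.max
    overflowed = False
except OverflowError:
    overflowed = True

# A long but realistic interval stays within the representable range.
one_year_later = start + timedelta(days=365)
```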

3 months ago Fixed dask executor and tests (#22027)
Kanthi [Tue, 8 Mar 2022 15:34:48 +0000 (10:34 -0500)] 
Fixed dask executor and tests (#22027)

Fixed dask executor and tests. The distributed package does not ship with the tests folder and the certificates, so the certificates were added to the certs folder.

(cherry picked from commit d3c168c5a341327bb55eb855433620249941584a)

3 months ago Remove auto-generated LICENSES-ui from pre-commit
Jarek Potiuk [Fri, 25 Mar 2022 07:28:32 +0000 (08:28 +0100)] 
Remove auto-generated LICENSES-ui from pre-commit

3 months ago Remove provider check from pre-commits
Jarek Potiuk [Fri, 25 Mar 2022 07:23:42 +0000 (08:23 +0100)] 
Remove provider check from pre-commits

3 months ago Make v2-2-specific CI workflow
Jarek Potiuk [Thu, 24 Mar 2022 23:30:39 +0000 (00:30 +0100)] 
Make v2-2-specific CI workflow

This is a change in the workflow specific to v2-2-branch that
brings build-ci-images and tests in the same workflow.

This can only be done because the v2-2 branch no longer accepts PRs
from forks and workflows are only run with direct pushes from
committers. The workflows then have write access to the packages
and can be simplified (and can also contain a different set
of images than the main workflow).

3 months ago Remove tests that are not really needed in 2-2 branch
Jarek Potiuk [Thu, 24 Mar 2022 21:40:11 +0000 (21:40 +0000)] 
Remove tests that are not really needed in 2-2 branch

3 months ago Add `dbt` spelling
Jarek Potiuk [Thu, 24 Mar 2022 21:34:01 +0000 (21:34 +0000)] 
Add `dbt` spelling

3 months ago Add back celery intersphinx mapping (#22370)
Kaxil Naik [Fri, 18 Mar 2022 22:57:21 +0000 (22:57 +0000)] 
Add back celery intersphinx mapping (#22370)

We had disabled this previously in (#22254) but now the website is up on a different domain as listed in https://github.com/celery/celeryproject/issues/51#issuecomment-1072248499

(cherry picked from commit 1f7836e07e643b3e9e0eca57ae077072358c63c1)

3 months ago Fix broken links to celery documentation (#22364)
davidhagens [Fri, 18 Mar 2022 22:57:35 +0000 (23:57 +0100)] 
Fix broken links to celery documentation (#22364)

The celery documentation has been moved from https://docs.celeryproject.org/ to https://docs.celeryq.dev/. The old links now lead to a 404 error page; the new links point to the actual documentation.

(cherry picked from commit a8de170c9fb31d219ec4287f79cfa58cc0b44866)

3 months ago Limit docs build to apache-airflow + docker-stack for 2.2 branch
Jarek Potiuk [Thu, 24 Mar 2022 21:02:43 +0000 (21:02 +0000)] 
Limit docs build to apache-airflow + docker-stack for 2.2 branch

3 months ago Remove provider tests failing collection because lack of cherry-picks
Jarek Potiuk [Thu, 24 Mar 2022 20:45:44 +0000 (21:45 +0100)] 
Remove provider tests failing collection because lack of cherry-picks

3 months ago Remove memorystore, not importable example
Jarek Potiuk [Thu, 24 Mar 2022 20:38:35 +0000 (21:38 +0100)] 
Remove memorystore, not importable example

3 months ago Synchronize setup.py files with new providers
Jarek Potiuk [Thu, 24 Mar 2022 16:27:42 +0000 (17:27 +0100)] 
Synchronize setup.py files with new providers

3 months ago Update version added for `deactivate_stale_dags_interval` config (#22478)
Ephraim Anierobi [Tue, 22 Mar 2022 21:55:42 +0000 (22:55 +0100)] 
Update version added for `deactivate_stale_dags_interval` config (#22478)

(cherry picked from commit 24a0d6a6ad71cdb922fec3df59acbc0e9f2da39c)

3 months ago adding `on_execute_callback` to callbacks docs (#22362)
aspain [Mon, 21 Mar 2022 20:38:25 +0000 (16:38 -0400)] 
adding `on_execute_callback` to callbacks docs (#22362)

* adding on_execute_callback

`on_execute_callback` is not listed but is an available callback to use. Could not find a docs page to link the way that the rest of them have.

* Update docs/apache-airflow/logging-monitoring/callbacks.rst

Co-authored-by: Josh Fell <48934154+josh-fell@users.noreply.github.com>
* Update docs/apache-airflow/logging-monitoring/callbacks.rst

Co-authored-by: Josh Fell <48934154+josh-fell@users.noreply.github.com>
Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
(cherry picked from commit 179c5b67e9c838f2f6df2b2b9a4451bfa09db42d)

3 months ago Add documentation on specifying a DB schema. (#22347)
nick [Sun, 20 Mar 2022 07:28:19 +0000 (00:28 -0700)] 
Add documentation on specifying a DB schema. (#22347)

* Add documentation on specifying a DB schema.

From request - https://github.com/apache/airflow/issues/17374#issuecomment-1060019956

Co-authored-by: Nick Shook <nick.shook@apple.com>
(cherry picked from commit 3bc0da326e7963eab9154913d5cafbc1be1c1a67)

3 months ago DB upgrade is required when updating Airflow (#22061)
Jed Cunningham [Mon, 7 Mar 2022 22:05:58 +0000 (15:05 -0700)] 
DB upgrade is required when updating Airflow (#22061)

Just strengthen the language that it is "required", not "recommended" to
run `airflow db upgrade` when upgrading Airflow versions.

(cherry picked from commit e7fed6bb3d7c8def86b47204176cebbfc6c401ff)

3 months ago Change the storage of frame to use threadLocal rather than Dict (#21993)
Jarek Potiuk [Fri, 4 Mar 2022 17:46:37 +0000 (18:46 +0100)] 
Change the storage of frame to use threadLocal rather than Dict (#21993)

There is very probably a WeakKeyDictionary bug in the Python standard
library (to be confirmed and investigated further) that
manifests itself in a very rare failure of the
test_stacktrace_on_failure_starts_with_task_execute_method test.

This turned out to be related to an unexpected behaviour
(and most likely a bug - to be confirmed) of WeakKeyDictionary
when two different objects with the same `__eq__` and
`__hash__` results are added to the same
WeakKeyDictionary as keys.

More information on a similar Python bug report (raised for a
slightly different reason) can be found here:

https://bugs.python.org/issue44140

We submitted a PR to fix the problem we found:
https://github.com/python/cpython/pull/31685

(cherry picked from commit 1949f5d76b5842d56db91c868ae4655bb7a7689f)
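
The following sketch shows one way two equal-but-distinct keys can interact in a `weakref.WeakKeyDictionary`, followed by a thread-local alternative. It is an illustration under assumptions, not necessarily the exact failure seen in the test or the code used in the fix; the `Key` class and the `remember_frame`/`recall_frame` helpers are hypothetical.

```python
import gc
import threading
import weakref


class Key:
    """Hypothetical key type: distinct instances can compare equal."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value

    def __hash__(self):
        return hash(self.value)


registry = weakref.WeakKeyDictionary()
first, second = Key(1), Key(1)
registry[first] = "from first"
registry[second] = "from second"  # equal key: value replaced, entry still tied to `first`
entries_after_insert = len(registry)

del first     # drop the only strong reference to the first key
gc.collect()  # make collection explicit for non-refcounting interpreters
second_still_present = second in registry  # False on CPython: the entry died with `first`

# Thread-local storage avoids key-equality issues altogether:
# each thread sees only its own attribute values.
_state = threading.local()


def remember_frame(frame):
    _state.frame = frame


def recall_frame():
    return getattr(_state, "frame", None)
```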

3 months ago Fix incorrect data provided to tries & landing times charts (#21928)
Alexander Millin [Wed, 2 Mar 2022 10:29:07 +0000 (13:29 +0300)] 
Fix incorrect data provided to tries & landing times charts (#21928)

(cherry picked from commit 2c57ad4ff9ddde8102c62f2e25c2a2e82cceb3e7)

3 months ago Fix assignment of unassigned triggers (#21770)
jkramer-ginkgo [Sat, 26 Feb 2022 19:25:15 +0000 (14:25 -0500)] 
Fix assignment of unassigned triggers (#21770)

Previously, the query returned no alive triggerers, which resulted
in all triggers being assigned to the current triggerer. Despite the
logic bug, this works fine when there is a single triggerer.
But with multiple triggerers, concurrent iterations of
the TriggerJob loop would bounce trigger ownership to whichever
loop ran last.

Addresses https://github.com/apache/airflow/issues/21616

(cherry picked from commit b26d4d8a290ce0104992ba28850113490c1ca445)

3 months ago Fix the triggerer capacity test (#21760)
Mark Norman Francis [Wed, 23 Feb 2022 12:38:27 +0000 (12:38 +0000)] 
Fix the triggerer capacity test (#21760)

Commit 9076b67 changed the triggerer logic to use int not string.

(cherry picked from commit e1fe30c70d0fe9c033db9daf9d4420f7fa815b2d)

3 months ago Fix triggerer --capacity parameter (#21753)
Sumit Maheshwari [Wed, 23 Feb 2022 10:20:13 +0000 (15:50 +0530)] 
Fix triggerer --capacity parameter (#21753)

(cherry picked from commit 9076b67c05cdba23e8fa51ebe5ad7f7d53e1c2ba)

3 months ago Correct a couple grammatical errors in docs (#21750)
Zach McQuiston [Wed, 23 Feb 2022 20:48:06 +0000 (13:48 -0700)] 
Correct a couple grammatical errors in docs (#21750)

Just reading through the docs as we implement Airflow on our end, saw a couple additions that could be made.

(cherry picked from commit 3bb63d4cfbbb534298d32ec987f25a02c11fc4e6)

3 months ago Fix graph autorefresh on page load (#21736)
Brent Bovenzi [Tue, 22 Feb 2022 16:41:39 +0000 (11:41 -0500)] 
Fix graph autorefresh on page load (#21736)

* fix auto refresh check on page load

* minor code cleanup

* remove new line

(cherry picked from commit b2c0a921c155e82d1140029e6495594061945025)

3 months ago Fix filesystem sensor for directories (#21729)
Mikhail Ilchenko [Tue, 1 Mar 2022 09:18:34 +0000 (12:18 +0300)] 
Fix filesystem sensor for directories (#21729)

Fix walking through wildcarded directory in `FileSensor.poke` method

(cherry picked from commit 6b0ca646ec849af91fe8a10d3d5656cafa3ed4bd)

3 months ago Fix stray order_by(TaskInstance.execution_date) (#21705)
Tzu-ping Chung [Tue, 22 Feb 2022 20:12:22 +0000 (04:12 +0800)] 
Fix stray order_by(TaskInstance.execution_date) (#21705)

(cherry picked from commit bb577a98494369b22ae252ac8d23fb8e95508a1c)

3 months ago Correctly handle multiple '=' in LocalFileSystem secrets. (#21694)
Davy [Thu, 24 Feb 2022 07:40:57 +0000 (15:40 +0800)] 
Correctly handle multiple '=' in LocalFileSystem secrets. (#21694)

(cherry picked from commit 919b75ba20083cc83c4e84e35aae8102af2b5871)

3 months ago Log exception in local executor (#21667)
Tzu-ping Chung [Sun, 20 Feb 2022 21:05:01 +0000 (05:05 +0800)] 
Log exception in local executor (#21667)

(cherry picked from commit a0fb0bbad312df06dd0a85453bd4f93ee2e01cbb)

3 months ago Disable default_pool delete on web ui (#21658)
Chenglong Yan [Wed, 16 Mar 2022 08:31:12 +0000 (16:31 +0800)] 
Disable default_pool delete on web ui (#21658)

(cherry picked from commit df6058c862a910a99fbb86858502d9d93fdbe1e5)

3 months ago Fix postgres part of pipeline example of tutorial (#21586)
KevinYanesG [Tue, 15 Feb 2022 18:26:30 +0000 (19:26 +0100)] 
Fix postgres part of pipeline example of tutorial (#21586)

(cherry picked from commit 40028f3ea3e78a9cf0db9de6b16fa67fa730dd7a)

3 months ago extends typing-extensions to be installed with python 3.8+ #21566 (#21567)
Frank Cash [Fri, 25 Feb 2022 21:26:22 +0000 (16:26 -0500)] 
extends typing-extensions to be installed with python 3.8+ #21566 (#21567)

(cherry picked from commit e4ead2b10dccdbe446f137f5624255aa2ff2a99a)

3 months ago Dispose unused connection pool (#21565)
Ping Zhang [Tue, 15 Feb 2022 06:15:39 +0000 (22:15 -0800)] 
Dispose unused connection pool (#21565)

(cherry picked from commit 8155e8ac0abcaf3bb02b164fd7552e20fa702260)

3 months ago Fix logging JDBC SQL error when task fails (#21540)
hubert-pietron [Sun, 27 Feb 2022 13:07:14 +0000 (14:07 +0100)] 
Fix logging JDBC SQL error when task fails (#21540)

(cherry picked from commit bc1b422e1ce3a5b170618a7a6589f8ae2fc33ad6)

3 months ago Filter out default configs when overrides exist. (#21539)
Xiao Yu [Tue, 15 Mar 2022 18:06:50 +0000 (18:06 +0000)] 
Filter out default configs when overrides exist. (#21539)

* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases, instead favoring maintaining the raw `_cmd` settings. Unfortunately, when the configs are materialized via `as_dict()`, Airflow defaults are generated and materialized as well, including defaults for the non-`_cmd` versions of some settings. Because Airflow's setting precedence lets bare versions of settings win over `_cmd` versions, the `_cmd` settings end up being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()`, so that if the bare version of a value is exactly the same as the Airflow default and a "hidden" / special version of that config is being set, we remove the bare version so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63ec0e5b679c87de8e8d4cdff1cf4671146)
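
A simplified sketch of the filtering idea, assuming a flat dict of option names instead of Airflow's sectioned configuration and covering only the `_cmd` case (not env or secrets). The function name and sample values are hypothetical and do not reflect the real `as_dict()` implementation.

```python
def drop_shadowing_defaults(materialized, defaults):
    """Remove bare options that only carry the default while a `_cmd` override exists."""
    result = dict(materialized)
    for option, value in materialized.items():
        if option.endswith("_cmd"):
            continue
        has_cmd_override = f"{option}_cmd" in materialized
        if has_cmd_override and defaults.get(option) == value:
            # The bare value is just the default; keeping it would win over `_cmd`.
            del result[option]
    return result


defaults = {"sql_alchemy_conn": "sqlite:///airflow.db", "parallelism": "32"}
materialized = {
    "sql_alchemy_conn": "sqlite:///airflow.db",    # default leaked in during materialization
    "sql_alchemy_conn_cmd": "cat /run/secrets/db",  # the setting the user actually made
    "parallelism": "32",
}
filtered = drop_shadowing_defaults(materialized, defaults)
```

Options without a `_cmd` counterpart, such as `parallelism` here, are left untouched.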

3 months ago Fix Resources __eq__ check (#21442)
Ping Zhang [Thu, 10 Feb 2022 12:40:51 +0000 (04:40 -0800)] 
Fix Resources __eq__ check (#21442)

(cherry picked from commit 6b308446eae2f83bf379f976c7d7801aa53370a3)

3 months ago Fix max_active_runs=1 not scheduling runs when min_file_process_interval is high...
Ephraim Anierobi [Thu, 24 Feb 2022 07:12:12 +0000 (08:12 +0100)] 
Fix max_active_runs=1 not scheduling runs when min_file_process_interval is high (#21413)

The finished dagrun was still being seen as running when we called dag.get_num_active_runs
because the session was not flushed. This PR fixes it.

(cherry picked from commit feea143af9b1db3b1f8cd8d29677f0b2b2ab757a)

3 months ago Reduce DB load incurred by Stale DAG deactivation (#21399)
Sam Wheating [Sun, 20 Mar 2022 07:17:42 +0000 (00:17 -0700)] 
Reduce DB load incurred by Stale DAG deactivation (#21399)

Deactivating stale DAGs periodically in bulk

By moving this logic into the DagFileProcessorManager and running it across all processed files periodically, we can prevent the use of un-indexed queries.

The basic logic is that we can look at the last processed time of a file (for a given processor) and compare that to the last_parsed_time of an entry in the dag table. If the file has been processed significantly more recently than the DAG has been updated, then it's safe to assume that the DAG is missing and can be marked inactive.

(cherry picked from commit f309ea78f7d8b62383bc41eac217681a0916382b)
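
The comparison described above can be sketched with plain data structures. The function name, the tuple layout of `dag_rows` and the one-minute threshold are assumptions made for illustration and are not taken from the DagFileProcessorManager code.

```python
from datetime import datetime, timedelta


def find_stale_dag_ids(dag_rows, file_last_processed, threshold=timedelta(minutes=1)):
    """dag_rows: iterable of (dag_id, fileloc, last_parsed_time) tuples (hypothetical layout)."""
    stale = []
    for dag_id, fileloc, last_parsed_time in dag_rows:
        processed_at = file_last_processed.get(fileloc)
        if processed_at is None:
            continue  # file not processed by this processor, so nothing can be concluded
        # The file was parsed well after the DAG row was last updated,
        # so the DAG is presumably no longer defined in that file.
        if processed_at - last_parsed_time > threshold:
            stale.append(dag_id)
    return stale


now = datetime(2022, 3, 20, 12, 0, 0)
rows = [
    ("still_there", "/dags/a.py", now - timedelta(seconds=5)),
    ("removed_dag", "/dags/a.py", now - timedelta(hours=2)),
]
stale_ids = find_stale_dag_ids(rows, {"/dags/a.py": now})
```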

3 months ago Extend documentation for states of DAGs & tasks and update trigger rules docs (#21382)
Mateusz Nojek [Mon, 21 Feb 2022 16:27:36 +0000 (17:27 +0100)] 
Extend documentation for states of DAGs & tasks and update trigger rules docs (#21382)

(cherry picked from commit 4e959358ac4ef9554ff5d82cdc85ab7dc142a639)

3 months ago Fix race condition between triggerer and scheduler (#21316)
Malthe Borch [Tue, 15 Feb 2022 13:12:51 +0000 (13:12 +0000)] 
Fix race condition between triggerer and scheduler (#21316)

(cherry picked from commit 2a6792d94d153c6f2dd116843a43ee63cd296c8d)

3 months ago Fix trigger dag redirect from task instance log view (#21239)
Daniel Standish [Tue, 1 Feb 2022 05:39:45 +0000 (21:39 -0800)] 
Fix trigger dag redirect from task instance log view (#21239)

(cherry picked from commit fd8f21f1a3c1e9f447a96a193adf5d28e207731a)

3 months ago Log traceback in trigger excs (#21213)
Malthe Borch [Mon, 28 Feb 2022 22:41:39 +0000 (22:41 +0000)] 
Log traceback in trigger excs (#21213)

(cherry picked from commit 4ad21f5f7c2d416cf813a860564bc2bf3e161d46)

3 months ago A trigger might use a connection; make sure we mask passwords (#21207)
Malthe Borch [Sat, 29 Jan 2022 16:10:23 +0000 (16:10 +0000)] 
A trigger might use a connection; make sure we mask passwords (#21207)

(cherry picked from commit 3d0c1aea5a85a4d31d3ade530e4c5b85b045503a)

3 months ago Update `ExternalTaskSensorLink` to handle templated `external_dag_id` (#21192)
Josh Fell [Sun, 6 Feb 2022 21:14:21 +0000 (16:14 -0500)] 
Update `ExternalTaskSensorLink` to handle templated `external_dag_id` (#21192)

(cherry picked from commit 8da7af2bc0f27e6d926071439900ddb27f3ae6c1)

3 months ago Ensure clear_task_instances sets valid run state (#21116)
Tzu-ping Chung [Thu, 27 Jan 2022 05:36:58 +0000 (13:36 +0800)] 
Ensure clear_task_instances sets valid run state (#21116)

(cherry picked from commit d17db3ce8ee8a8a724ced9502c73bc308740a358)

3 months ago fix: Update custom connection field processing (#20883)
Mike McDonald [Mon, 24 Jan 2022 00:33:06 +0000 (16:33 -0800)] 
fix: Update custom connection field processing (#20883)

* fix: Update custom connection field processing

Fixes an issue where custom connection fields are not updated because the `extra` field is in the form and has previous values, overriding the custom field values.
Adds portion of connection form tests to test functionality.

(cherry picked from commit 44df1420582b358594c8d7344865811cff02956c)

3 months ago Truncate stack trace to DAG user code for exceptions raised during execution (#20731)
Malthe Borch [Sat, 26 Feb 2022 21:11:02 +0000 (21:11 +0000)] 
Truncate stack trace to DAG user code for exceptions raised during execution (#20731)

(cherry picked from commit 7ea0f7694f39358583cf8f0f8af896fc776bdffe)

3 months ago Fix duplicate trigger creation race condition (#20699)
Daniel Standish [Thu, 6 Jan 2022 23:16:02 +0000 (15:16 -0800)] 
Fix duplicate trigger creation race condition (#20699)

The process for queueing up a trigger, for execution by the TriggerRunner, is handled by the TriggerJob's `load_triggers` method. It fetches the triggers that should be running according to the database, checks if they are running and if not it adds them to `TriggerRunner.to_create`. The problem is that there's a small window of time between the moment a trigger (upon termination) is purged from the `TriggerRunner.triggers` set, and the time that the database is updated to reflect the trigger's doneness. If `TriggerJob.load_triggers` runs during this window, the trigger will be added back to the `TriggerRunner.to_create` set and it will run again.

To resolve this, before adding a trigger to the `to_create` queue, instead of comparing against the "running" triggers, we compare against all triggers known to the TriggerRunner instance. When triggers move out of the `triggers` set they move into other data structures such as `events` and `failed_triggers` and `to_cancel`. So we union all of these and only create those triggers which the database indicates should exist _and_ which are not already being handled (whatever state they may be in) by the TriggerRunner instance.

(cherry picked from commit 16b8c476518ed76e3689966ec4b0b788be935410)
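
The union-then-subtract step can be sketched as below. Each runner collection is modelled as a plain set of trigger ids, which is a simplification (the real TriggerRunner structures hold richer objects), and the function name is hypothetical.

```python
def triggers_to_create(db_trigger_ids, runner_state):
    """Return ids the database expects that the runner does not know about at all."""
    # Union every collection the runner keeps, not just the running triggers.
    known_ids = set().union(*runner_state.values())
    return set(db_trigger_ids) - known_ids


runner_state = {
    "triggers": {1},         # currently running
    "to_create": set(),
    "events": {2},           # finished, database not yet updated
    "failed_triggers": set(),
    "to_cancel": set(),
}
new_ids = triggers_to_create({1, 2, 3}, runner_state)
```

Comparing only against `triggers` would have returned `{2, 3}` and started trigger 2 a second time, which is the race described above.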

3 months ago Rename `to_delete` to `to_cancel` in TriggerRunner (#20658)
Daniel Standish [Wed, 5 Jan 2022 20:12:27 +0000 (12:12 -0800)] 
Rename `to_delete` to `to_cancel` in TriggerRunner (#20658)

The queue's purpose is to track triggers that need to be canceled. The language `to_delete` was a bit confusing because, for one, it does not actually delete them but cancels them. The deletion work is actually in `cleanup_finished_triggers`. It seems that this method will usually not do anything and it's only for cancelling triggers that are currently running but for whatever reason no longer should be, e.g. when a task is killed and therefore the trigger is no longer needed, or in some multi-triggerer scenarios. So putting cancel in the name also highlights that this is about stopping running triggers, not e.g. purging completed ones.

(cherry picked from commit c20ad79b40ea2b213f6dca221221c6dbd55bd08f)

3 months ago Fix Tasks getting stuck in scheduled state (#19747)
Tanel Kiis [Tue, 22 Mar 2022 17:30:37 +0000 (19:30 +0200)] 
Fix Tasks getting stuck in scheduled state (#19747)

The scheduler_job can get stuck in a state, where it is not able to queue new tasks. It will get out of this state on its own, but the time taken depends on the runtime of current tasks - this could be several hours or even days.

If the scheduler can't queue any tasks because of different concurrency limits (per pool, dag or task), then on next iterations of the scheduler loop it will try to queue the same tasks. Meanwhile there could be some scheduled tasks with lower priority_weight that could be queued, but they will remain waiting.

The proposed solution is to keep track of the dag and task ids that are concurrency limited and then repeat the query with these dags and tasks filtered out.

Co-authored-by: Tanel Kiis <tanel.kiis@reach-u.com>
(cherry picked from commit cd68540ef19b36180fdd1ebe38435637586747d4)
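
A toy model of the retry loop, assuming only a per-DAG concurrency limit and an in-memory list in place of the database query. All names and the tuple layout are hypothetical; the real scheduler also tracks pool and task-level limits.

```python
from collections import Counter


def pick_tasks(scheduled, running_per_dag, dag_limit, max_tis):
    """scheduled: list of (priority_weight, dag_id, task_id) tuples (hypothetical layout)."""
    running = Counter(running_per_dag)
    starved_dags = set()
    queued = []
    while len(queued) < max_tis:
        # Stand-in for the query: highest priority first, skipping
        # starved DAGs and tasks that were already picked.
        candidates = [
            task for task in sorted(scheduled, reverse=True)
            if task[1] not in starved_dags and task not in queued
        ]
        batch = candidates[: max_tis - len(queued)]
        if not batch:
            break
        for task in batch:
            dag_id = task[1]
            if running[dag_id] >= dag_limit:
                starved_dags.add(dag_id)  # remember it and filter it out next round
            else:
                running[dag_id] += 1
                queued.append(task)
    return queued


scheduled = [(10, "dag_a", "t1"), (10, "dag_a", "t2"), (1, "dag_b", "t1")]
picked = pick_tasks(scheduled, {"dag_a": 1}, dag_limit=1, max_tis=2)
```

In this example the first round returns only `dag_a` tasks, none of which can be queued. Without the second round the lower-priority `dag_b` task would keep waiting.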

3 months ago Fix: Do not render undefined graph edges (#19684)
Brent Bovenzi [Thu, 18 Nov 2021 17:01:52 +0000 (11:01 -0600)] 
Fix: Do not render undefined graph edges (#19684)

* Fix: Do not render undefined graph edges

A user had an issue where a `targetId` was undefined and that caused the whole graph view to crash. Instead, we should check for the source and target before rendering the edge.

* move all checks to one line

(cherry picked from commit bd109b4336d5950168b6ff04a752290e437b2b8c)

3 months ago Set X-Frame-Options header to DENY only if X_FRAME_ENABLED is set to true. (#19491)
Kanthi [Sat, 22 Jan 2022 23:09:51 +0000 (18:09 -0500)] 
Set X-Frame-Options header to DENY only if X_FRAME_ENABLED is set to true. (#19491)

(cherry picked from commit 084079f446570ba43114857ea1a54df896201419)

3 months ago Bump version to 2.2.5
Ephraim Anierobi [Thu, 24 Mar 2022 01:35:00 +0000 (02:35 +0100)] 
Bump version to 2.2.5

3 months ago Fix 2.2.4 changelog date and remove dups (#22011)
Jed Cunningham [Sat, 5 Mar 2022 09:38:48 +0000 (02:38 -0700)] 
Fix 2.2.4 changelog date and remove dups (#22011)

3 months ago Remove misleading MSSQL information from the docs (#21998)
Jarek Potiuk [Fri, 4 Mar 2022 19:03:05 +0000 (20:03 +0100)] 
Remove misleading MSSQL information from the docs (#21998)

Some of the past cherry-picks brought MSSQL information to Airflow
docs in stable version. This should be removed.

4 months ago fixup! Add changelog for 2.2.4rc1 21659/head 2.2.4 2.2.4rc1
Jarek Potiuk [Fri, 18 Feb 2022 12:02:28 +0000 (13:02 +0100)] 
fixup! Add changelog for 2.2.4rc1

4 months ago Pin Markupsafe until we are able to upgrade Flask/Jinja (#21664)
Jarek Potiuk [Fri, 18 Feb 2022 10:12:37 +0000 (11:12 +0100)] 
Pin Markupsafe until we are able to upgrade Flask/Jinja (#21664)

Markupsafe 2.1.0 breaks with error: cannot import name 'soft_unicode' from 'markupsafe'.
This pin should be removed either when this issue is closed:
https://github.com/pallets/markupsafe/issues/284
or when we are able to upgrade Jinja to a newer version (currently
limited due to Flask and Flask Application Builder)

(cherry picked from commit 366c66b8f6eddc0d22028ef494c62bb757bd8b8b)

4 months ago Add changelog for 2.2.4rc1
Jed Cunningham [Fri, 18 Feb 2022 00:11:00 +0000 (17:11 -0700)] 
Add changelog for 2.2.4rc1

4 months ago Clarify pendulum use in timezone cases (#21646)
Jarek Potiuk [Fri, 18 Feb 2022 00:08:42 +0000 (01:08 +0100)] 
Clarify pendulum use in timezone cases (#21646)

It is important to use Pendulum in case timezone is used - because
there are a number of limitations coming from using stdlib
timezone implementation.

However, our documentation was not very clear about it; in particular,
some examples used the standard datetime in DAGs, which could
mislead our users into continuing to use datetime if they use timezones.

This PR clarifies and stresses the use of pendulum is necessary
when timezone is used. Also it points to the documentation
in case serialization throws error about not using Pendulum
so that the users can learn about the reasoning.

This is the first part of the change - the follow up will be
changing all provider examples to also use timezone and
pendulum explicitly.

See also #20070

(cherry picked from commit f011da235f705411239d992bc3c92f1c072f89a9)

4 months ago added explaining concept of logical date in DAG run docs (#21433)
Howard Yoo [Thu, 17 Feb 2022 20:01:58 +0000 (14:01 -0600)] 
added explaining concept of logical date in DAG run docs (#21433)

(cherry picked from commit 752d53860e636ead2be7c3f2044b9b312ba86b95)

4 months ago Adding missing login provider related methods from Flask-Appbuilder (#21294)
Pankaj Singh [Thu, 17 Feb 2022 20:55:22 +0000 (02:25 +0530)] 
Adding missing login provider related methods from Flask-Appbuilder (#21294)

(cherry picked from commit 38894e8013b5c38468e912164f80282e3b579993)

4 months ago Add note about Variable precedence with env vars (#21568)
Madison Swain-Bowden [Tue, 15 Feb 2022 21:56:00 +0000 (13:56 -0800)] 
Add note about Variable precedence with env vars (#21568)

This PR updates some documentation regarding setting Airflow Variables using environment variables. Environment variables take precedence over variables defined in the UI/metastore based on this default search path list: https://github.dev/apache/airflow/blob/7864693e43c40fd8f0914c05f7e196a007d16d50/airflow/secrets/__init__.py#L29-L30

(cherry picked from commit 7a268cb3c9fc6bc03f2400c6632ff8dccf4e451e)
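
A minimal sketch of the lookup order described above, assuming the `AIRFLOW_VAR_<KEY>` environment variable naming and using a plain dict as a stand-in for the metastore. The `get_variable` function is hypothetical and is not Airflow's `Variable.get`.

```python
import os


def get_variable(key, metastore, environ=os.environ):
    """Look a Variable up in the environment first, then in the metastore stand-in."""
    env_name = f"AIRFLOW_VAR_{key.upper()}"
    if env_name in environ:
        return environ[env_name]  # the environment takes precedence
    return metastore[key]


metastore = {"region": "eu-west-1", "bucket": "from-ui"}
fake_environ = {"AIRFLOW_VAR_BUCKET": "from-env"}

bucket = get_variable("bucket", metastore, fake_environ)
region = get_variable("region", metastore, fake_environ)
```

Here `bucket` resolves to the environment value even though the UI/metastore defines it as well, while `region` falls through to the metastore.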

4 months ago Reorder migrations to include bugfix in 2.2.4 (#21598)
Jed Cunningham [Tue, 15 Feb 2022 23:38:56 +0000 (16:38 -0700)] 
Reorder migrations to include bugfix in 2.2.4 (#21598)

(cherry picked from commit 005cef042bc4184c24ad03c1b4ee40cdbaf96cb5)

4 months ago Fix slow DAG deletion due to missing ``dag_id`` index for job table (#20282)
Kush [Thu, 30 Dec 2021 10:26:24 +0000 (15:56 +0530)] 
Fix slow DAG deletion due to missing ``dag_id`` index for job table (#20282)

Fixes #20249

(cherry picked from commit ac9f29da200c208bb52d412186c5a1b936eb0b5a)

4 months ago update tutorial_etl_dag notes (#21503)
eladkal [Fri, 11 Feb 2022 08:17:18 +0000 (10:17 +0200)] 
update tutorial_etl_dag notes (#21503)

* update tutorial_etl_dag notes

(cherry picked from commit a42607a4b75586a396d6a56145ed048d127dd344)

4 months ago Simplify trigger cancel button (#21591)
Jed Cunningham [Tue, 15 Feb 2022 18:00:26 +0000 (11:00 -0700)] 
Simplify trigger cancel button (#21591)

Co-authored-by: Jed Cunningham <jedcunningham@apache.org>
(cherry picked from commit 65297673a318660fba76797e50d0c06804dfcafc)

4 months ago Add a session backend to store session data in the database (#21478)
Jed Cunningham [Tue, 15 Feb 2022 17:57:46 +0000 (10:57 -0700)] 
Add a session backend to store session data in the database (#21478)

Co-authored-by: Jed Cunningham <jedcunningham@apache.org>
(cherry picked from commit da9d0863c7ff121c111a455708163b026943bdf1)

4 months ago Show task status only for running dags or only for the last finished dag (#21352)
Aleksey Kirilishin [Mon, 14 Feb 2022 15:55:00 +0000 (18:55 +0300)] 
Show task status only for running dags or only for the last finished dag (#21352)

* Show task status only for running dags or only for the last finished dag

* Brought the logic of getting task statistics into a separate function

(cherry picked from commit 28d7bde2750c38300e5cf70ba32be153b1a11f2c)

4 months ago Use compat data interval shim in log handlers (#21289)
Tzu-ping Chung [Sat, 12 Feb 2022 03:40:29 +0000 (11:40 +0800)] 
Use compat data interval shim in log handlers (#21289)

(cherry picked from commit 44bd211b19dcb75eeb53ced5bea2cf0c80654b1a)

4 months ago Fix postgres hook import pipeline tutorial (#21491)
KevinYanesG [Thu, 10 Feb 2022 14:38:53 +0000 (15:38 +0100)] 
Fix postgres hook import pipeline tutorial (#21491)

(cherry picked from commit a2abf663157aea14525e1a55eb9735ba659ae8d6)

4 months ago Fix mismatch in generated run_id and logical date of DAG run (#18707)
David Caron [Fri, 4 Feb 2022 02:14:19 +0000 (21:14 -0500)] 
Fix mismatch in generated run_id and logical date of DAG run (#18707)

Co-authored-by: Tzu-ping Chung <tp@astronomer.io>
Co-authored-by: Jed Cunningham <jedcunningham@apache.org>
(cherry picked from commit 1f08d281632670aef1de8dfc62c9f63aeec18760)

4 months ago Fix TriggerDagRunOperator extra link (#19410)
Niko [Thu, 9 Dec 2021 13:46:59 +0000 (05:46 -0800)] 
Fix TriggerDagRunOperator extra link (#19410)

The extra link provided by the operator was previously using the
execution date of the triggering dag, not the triggered dag. Store the
execution date of the triggered dag in xcom so that it can be read back
later within the webserver when the link is being created.

(cherry picked from commit 820e836c4a2e45239279d4d71e1db9434022fec5)

4 months ago Add possibility to create user in the Remote User mode (#19963)
Łukasz Wyszomirski [Fri, 28 Jan 2022 05:18:05 +0000 (06:18 +0100)] 
Add possibility to create user in the Remote User mode (#19963)

(cherry picked from commit cdd9ea66208e3d70d1cf2a34530ba69bc3c58a50)

4 months ago Avoid deadlock when rescheduling task (#21362)
Jarek Potiuk [Mon, 7 Feb 2022 19:12:05 +0000 (20:12 +0100)] 
Avoid deadlock when rescheduling task (#21362)

The scheduler job performs scheduling after locking the "scheduled"
DagRun row for writing. This should prevent another scheduler, or a
"mini-scheduler" run after a task completes, from modifying the DagRun
and its related task instances.

However, there is apparently one more case where the DagRun is
locked by "Task" processes: when a task throws
AirflowRescheduleException. In this case a new "TaskReschedule"
entity is inserted into the database, and the insert also locks
the DagRun (because TaskReschedule has a "DagRun" relationship).

This PR modifies the handling of AirflowRescheduleException to obtain
the very same DagRun lock before it attempts to insert the
TaskReschedule entity.

It seems that TaskReschedule is the only entity with this relationship,
so likely all the mysterious SchedulerJob deadlock cases we
experienced might be explained (and fixed) by this one.

It is likely that this one:

* Fixes: #16982
* Fixes: #19957

(cherry picked from commit 6d110b565a505505351d1ff19592626fb24e4516)
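The idea behind the fix, taking the DagRun lock first on every code path, can be illustrated with an in-process analogy. This is only a sketch of consistent lock ordering using threads. It is not Airflow's code and does not touch a database. All names (`dag_run_lock`, `reschedule_lock`, `scheduler_pass`, `handle_reschedule`) are hypothetical.

```python
import threading

# Stand-ins for the database row locks discussed above (hypothetical).
dag_run_lock = threading.Lock()     # lock on the DagRun row
reschedule_lock = threading.Lock()  # lock taken while inserting TaskReschedule
reschedules = []                    # stand-in for the TaskReschedule table


def scheduler_pass(log):
    # The scheduler locks the DagRun first, then touches dependent rows.
    with dag_run_lock:
        with reschedule_lock:
            log.append("scheduler: scheduled with DagRun locked")


def handle_reschedule(task_id, log):
    # The task path now takes the same DagRun lock *before* inserting,
    # so both paths acquire locks in the same order and cannot deadlock.
    with dag_run_lock:
        with reschedule_lock:
            reschedules.append(task_id)
            log.append(f"task {task_id}: TaskReschedule inserted")


def run_demo():
    log = []
    threads = [
        threading.Thread(target=scheduler_pass, args=(log,)),
        threading.Thread(target=handle_reschedule, args=("t1", log)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

If `handle_reschedule` instead took `reschedule_lock` first and `dag_run_lock` second, the two threads could each hold one lock while waiting for the other. That is the shape of deadlock the commit message describes.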

4 months ago Fix docs link for smart sensor deprecation (#21394)
Jed Cunningham [Mon, 7 Feb 2022 15:55:20 +0000 (08:55 -0700)] 
Fix docs link for smart sensor deprecation (#21394)

We are releasing the deprecation in version 2.2.4, not 2.3.0 as
originally planned.

(cherry picked from commit 3a780380d8f5d50ffc876c326e70ee0eee033c0d)

4 months ago Update example DAGs (#21372)
Jed Cunningham [Mon, 7 Feb 2022 18:24:31 +0000 (11:24 -0700)] 
Update example DAGs (#21372)

(cherry picked from commit 7a38ec2ad3b3bd6fda5e1ee9fe9e644ccb8b4c12)

4 months ago Update error docs to include before_send option (#21275)
Abhijeet Prasad [Thu, 3 Feb 2022 23:15:21 +0000 (18:15 -0500)] 
Update error docs to include before_send option (#21275)

https://github.com/apache/airflow/pull/18261 added support for the `before_send` option when initializing the Sentry SDK in Airflow. This patch updates the documentation to reflect this change.

(cherry picked from commit b38391e2f91760e64576723c876341f532a6ee2d)
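For readers unfamiliar with the option, a `before_send` callable in the Sentry SDK receives the event and a hint, and returns either the (possibly modified) event or `None` to drop it. The function below is a minimal sketch of such a callable. The filtering rule and the `source` tag are illustrative assumptions, and how the callable is referenced from Airflow's configuration is covered by the updated docs rather than shown here.

```python
def before_send(event, hint):
    """Hypothetical Sentry hook: drop some events, tag the rest."""
    exc_info = hint.get("exc_info")
    if exc_info and exc_info[0] is KeyboardInterrupt:
        # Returning None tells the Sentry SDK to discard the event.
        return None
    # Illustrative tag only; not something Airflow sets by itself.
    event.setdefault("tags", {})["source"] = "airflow"
    return event
```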

4 months ago Fix the incorrect scheduling time for the first run of dag (#21011)
wano [Sun, 6 Feb 2022 18:02:57 +0000 (02:02 +0800)] 
Fix the incorrect scheduling time for the first run of dag (#21011)

When catchup_by_default is set to false and start_date in the DAG is the
previous day, the first schedule time for this DAG may be incorrect.

Co-authored-by: wanlce <who@foxmail.com>
(cherry picked from commit 0bcca55f4881bacc3fbe86f69e71981f5552b398)

4 months ago Update stat_name_handler documentation (#21298)
Fran Sánchez [Thu, 3 Feb 2022 18:12:08 +0000 (18:12 +0000)] 
Update stat_name_handler documentation (#21298)

Previously stat_name_handler was under the scheduler section of the
configuration, but it has been in the metrics section since 2.0.0.

(cherry picked from commit 0ae31e9cb95e5061a23c2f397ab9716391c1a488)
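A stat name handler is a callable that takes a stat name and returns the name to emit. The sketch below shows one possible handler. The function name and its normalisation rules are assumptions chosen for illustration. It would be referenced by its dotted import path from the `stat_name_handler` option in the metrics section.

```python
def my_stat_name_handler(stat_name: str) -> str:
    """Hypothetical handler: lower-case the name and cap its length."""
    # The 32-character cap is an arbitrary choice for this example.
    return stat_name.lower()[:32]
```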

4 months ago Docs: Fix task order in overview example (#21282)
Lucia Kasman [Thu, 3 Feb 2022 18:10:27 +0000 (15:10 -0300)] 
Docs: Fix task order in overview example (#21282)

(cherry picked from commit 1ba83c01b2b466ad5a76a453e5f6ee2884081e53)

4 months ago Update recipe for Google Cloud SDK (#21268)
Kamil Breguła [Thu, 3 Feb 2022 18:20:12 +0000 (19:20 +0100)] 
Update recipe for Google Cloud SDK (#21268)

(cherry picked from commit 874a22ee9b77f8f100736558723ceaf2d04b446b)