airflow.git
5 months agoSupport glob syntax in ``.airflowignore`` files (#21392) (#22051)
Ian Buss [Wed, 13 Apr 2022 10:19:58 +0000 (11:19 +0100)] 
Support glob syntax in ``.airflowignore`` files (#21392) (#22051)

A new configuration parameter "CORE_IGNORE_FILE_SYNTAX" is added to
allow patterns in .airflowignore files to be interpreted as either
regular expressions (the default) or glob expressions as found in
.gitignore files. This allows users to use patterns they will be
familiar with from tools such as git, helm and docker.

Glob expressions support wildcard matches ("*", "?") within a directory
as well as character classes ("[0-9]"). In addition, zero or more
directories can be matched using "**". Patterns can be negated by
prefixing a "!" at the beginning of the pattern.

The "fnmatch" library in core Python does not produce patterns that are
fully compliant with the kind of patterns that users will be used to
from gitignore or dockerignore files, so the globs are parsed using
the pathspec package from PyPI.
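
As a minimal illustration of the mismatch (a sketch using only the standard library, not the code from this change), `fnmatch` lets `*` cross directory separators and treats a leading `!` as a literal character, whereas gitignore-style globs keep `*` within one directory and use `!` for negation. The paths below are made up for the example.

```python
from fnmatch import fnmatchcase

# In gitignore-style globbing, "dags/*.py" would only match files directly
# inside "dags/". fnmatch translates "*" to "match anything", including "/".
crosses_directories = fnmatchcase("dags/nested/etl.py", "dags/*.py")

# fnmatch has no negation syntax: a leading "!" is just a literal character.
negation_is_literal = fnmatchcase("!keep.py", "!keep.py")
negation_ignored = fnmatchcase("keep.py", "!keep.py")

print(crosses_directories, negation_is_literal, negation_ignored)
```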

To aid with debugging ignorefile patterns a more helpful error
message is emitted in the logs for invalid patterns, which are
now skipped rather than causing a hard-to-read scheduler stack trace.

closes: #21392

5 months agoAdd template support for external_task_ids. (#22809)
Karthikeyan Singaravelan [Wed, 13 Apr 2022 09:44:21 +0000 (15:14 +0530)] 
Add template support for external_task_ids. (#22809)

5 months agoFix select * query xcom push for BigQueryGetDataOperator (#22936)
pankajastro [Wed, 13 Apr 2022 09:43:38 +0000 (15:13 +0530)] 
Fix select * query xcom push for BigQueryGetDataOperator (#22936)

Use `in` instead of `get` for the conditional check
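
A generic sketch of why the two checks can differ (not the operator's actual code; the mapping and key name are hypothetical): `dict.get` returns the stored value, so a key that is present with a falsy value looks the same as a missing key, while `in` tests only for the key.

```python
# Hypothetical mapping; the key name is illustrative only.
params = {"fields": ""}

# .get() returns the stored value, so an empty string looks like "missing".
present_via_get = bool(params.get("fields"))

# "in" checks for the key itself, regardless of how truthy its value is.
present_via_in = "fields" in params

print(present_via_get, present_via_in)
```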

5 months agoAllow DagParam to hold falsy values (#22964)
Tzu-ping Chung [Wed, 13 Apr 2022 07:48:46 +0000 (15:48 +0800)] 
Allow DagParam to hold falsy values (#22964)

5 months agoRefactor airbyte provider tests to use assert_has_calls (#22951)
GitStart-AirFlow [Wed, 13 Apr 2022 07:02:57 +0000 (08:02 +0100)] 
Refactor airbyte provider tests to use assert_has_calls (#22951)

* refactor airbyte provider tests to use assert_has_calls

Co-authored-by: gitstart-airflow <gitstart@users.noreply.github.com>
5 months agoDeprecate `DummyOperator` in favor of `EmptyOperator` (#22832)
eladkal [Wed, 13 Apr 2022 06:47:56 +0000 (09:47 +0300)] 
Deprecate `DummyOperator` in favor of `EmptyOperator` (#22832)

* Deprecate `DummyOperator` in favor of `EmptyOperator`

5 months agoFix screenshot generation for dumb terminal (#22962)
Jarek Potiuk [Tue, 12 Apr 2022 21:55:42 +0000 (23:55 +0200)] 
Fix screenshot generation for dumb terminal (#22962)

When a dumb terminal is set while the screenshot image is generated, the terminal width drops to 80 and the screenshots change.

We force 256 color xterm during screenshot generation.

5 months agoHide pagination when data is a single page (#22963)
Brent Bovenzi [Tue, 12 Apr 2022 21:49:35 +0000 (16:49 -0500)] 
Hide pagination when data is a single page (#22963)

5 months agoBug Fix for `apache-airflow-providers-jenkins` `JenkinsJobTriggerOperator` (#22802)
Sasan Ahmadi [Tue, 12 Apr 2022 19:58:04 +0000 (12:58 -0700)] 
Bug Fix for `apache-airflow-providers-jenkins` `JenkinsJobTriggerOperator` (#22802)

* Bugfix: when polling for the created job, a failure to get the job info should not fail the task; instead, polling should continue until it reaches the maximum allowed number of polling tries
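
The polling behaviour described above can be sketched roughly as follows. This is an illustration under assumptions, not the operator's code: `poll_until_available`, `get_job_info`, `max_tries` and `sleep_seconds` are hypothetical names.

```python
import time
from typing import Any, Callable, Optional


def poll_until_available(
    get_job_info: Callable[[], Any],
    max_tries: int,
    sleep_seconds: float = 0.0,
) -> Any:
    """Call get_job_info until it succeeds or max_tries is exhausted."""
    last_error: Optional[Exception] = None
    for _ in range(max_tries):
        try:
            return get_job_info()
        except Exception as exc:  # deliberately broad for this sketch
            # A failed lookup is not fatal; remember it and try again.
            last_error = exc
            time.sleep(sleep_seconds)
    raise RuntimeError(f"Job info not available after {max_tries} tries") from last_error
```

Only after the last allowed try does the failure propagate, which matches the intent of continuing to poll rather than failing on the first error.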

5 months agoRemove duplicate Apache License line in `ci.yml` (#22960)
Kaxil Naik [Tue, 12 Apr 2022 19:56:28 +0000 (20:56 +0100)] 
Remove duplicate Apache License line in `ci.yml` (#22960)

Remove duplicate line !

5 months agoRemove Grid labels and differentiate runs from tasks more (#22950)
Brent Bovenzi [Tue, 12 Apr 2022 19:19:43 +0000 (14:19 -0500)] 
Remove Grid labels and differentiate runs from tasks more (#22950)

5 months agoEnsure that BackfillJob re-runs existing mapped task instances (#22952)
Ash Berlin-Taylor [Tue, 12 Apr 2022 19:15:29 +0000 (20:15 +0100)] 
Ensure that BackfillJob re-runs existing mapped task instances (#22952)

* Ensure that BackfillJob re-runs existing mapped task instances

`expand_mapped_task` only returns _new_ TaskInstances, so if a backfill
job was run for a dag that already existed the mapped task would never
be executed.

* Fix tests for changed interface

This interface also isn't the best, and we could probably do with
refactoring it. Not now though

5 months agoFix regression in pool metrics (#22939)
Tanel Kiis [Tue, 12 Apr 2022 18:33:44 +0000 (21:33 +0300)] 
Fix regression in pool metrics (#22939)

Co-authored-by: Tanel Kiis <tanel.kiis@reach-u.com>
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
5 months agoRemove badly merged conflict for BREEZE.rst (#22953)
Jarek Potiuk [Tue, 12 Apr 2022 16:32:54 +0000 (18:32 +0200)] 
Remove badly merged conflict for BREEZE.rst (#22953)

Missed the conflict when merging #22876. Github hides
such big changes by default :(

5 months agoHelm support for LocalKubernetesExecutor (#22388)
Kanthi [Tue, 12 Apr 2022 16:13:31 +0000 (12:13 -0400)] 
Helm support for LocalKubernetesExecutor (#22388)

5 months agoPriority order tasks even when using pools (#22483)
Tanel Kiis [Tue, 12 Apr 2022 13:34:25 +0000 (16:34 +0300)] 
Priority order tasks even when using pools (#22483)

When picking tasks to queue, the scheduler_job groups candidate task instances
by their pools and picks tasks for queueing for each "pool group".

This way tasks with lower priority could be queued before tasks with higher
priority. This is demonstrated in the new unit test
`test_find_executable_task_instances_order_priority_with_pools` - before this
change `dummy3` and `dummy1` are queued instead of `dummy3` and `dummy2`.
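
A toy model of the ordering problem, as a sketch only: the pool names, priority weights and the `max_tis` limit below are assumptions chosen so the outcome matches the description above; they are not taken from the real test or scheduler code.

```python
from collections import defaultdict
from typing import List, NamedTuple


class Task(NamedTuple):
    task_id: str
    pool: str
    priority_weight: int


# Hypothetical candidates; pools and weights are made up for illustration.
CANDIDATES = [
    Task("dummy1", "pool_a", 1),
    Task("dummy2", "pool_b", 2),
    Task("dummy3", "pool_a", 3),
]


def pick_grouped_by_pool(tasks: List[Task], max_tis: int) -> List[str]:
    """Old behaviour (simplified): fill the queue one pool group at a time."""
    by_pool = defaultdict(list)
    for task in tasks:
        by_pool[task.pool].append(task)
    picked: List[str] = []
    for pool_tasks in by_pool.values():
        for task in sorted(pool_tasks, key=lambda t: -t.priority_weight):
            if len(picked) < max_tis:
                picked.append(task.task_id)
    return picked


def pick_by_priority(tasks: List[Task], max_tis: int) -> List[str]:
    """New behaviour (simplified): order all candidates by priority first."""
    ordered = sorted(tasks, key=lambda t: -t.priority_weight)
    return [task.task_id for task in ordered[:max_tis]]


print(pick_grouped_by_pool(CANDIDATES, max_tis=2))
print(pick_by_priority(CANDIDATES, max_tis=2))
```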

Co-authored-by: Tanel Kiis <tanel.kiis@reach-u.com>
5 months agoDelete old Spark Application in SparkKubernetesOperator (#21092)
Thees Gieselmann [Tue, 12 Apr 2022 13:32:13 +0000 (15:32 +0200)] 
Delete old Spark Application in SparkKubernetesOperator (#21092)

* Delete previous SparkApp in Kubernetes

+ KubernetesHook: adding delete_custom_object
+ SparkKubernetesOperator: extract name from k8
yaml and delete if exists
+ Update SparkKubernetesOperator docstring

5 months agoEnsure that mapped TIs in BackfillJob have a start_date (#22946)
Ash Berlin-Taylor [Tue, 12 Apr 2022 13:02:59 +0000 (14:02 +0100)] 
Ensure that mapped TIs in BackfillJob have a start_date (#22946)

Since BackfillJob is not at all like the scheduler it handles things
differently, and mapped TIs were ending up with a null start date when
executing!

5 months agoCall mapped_dependants only on the original task (#22904)
Tzu-ping Chung [Tue, 12 Apr 2022 08:38:17 +0000 (16:38 +0800)] 
Call mapped_dependants only on the original task (#22904)

* Add literal expands in test DAGs

* Call mapped_dependants only on the original task

We've made this change in the scheduler, but need to match it in
the BackfillJob.

5 months agocache and typo fix (#22876)
Bowrna [Tue, 12 Apr 2022 08:12:55 +0000 (13:42 +0530)] 
cache and typo fix (#22876)

5 months agoFix column names in "moved" tables created pre-upgrade (#22937)
Daniel Standish [Tue, 12 Apr 2022 07:17:11 +0000 (00:17 -0700)] 
Fix column names in "moved" tables created pre-upgrade (#22937)

I inadvertently prefixed all the column names with the table name. This fixes it by adding column aliases explicitly.

5 months agoFix airflow version in migration 587bdf053233 (#22935)
Daniel Standish [Tue, 12 Apr 2022 07:15:45 +0000 (00:15 -0700)] 
Fix airflow version in migration 587bdf053233 (#22935)

5 months agoReuse reflect_tables helper in db.py (#22922)
Daniel Standish [Tue, 12 Apr 2022 07:13:01 +0000 (00:13 -0700)] 
Reuse reflect_tables helper in db.py (#22922)

5 months agoAdd `parameters` to templated fields in `OracleOperator` (#22857)
Malthe Borch [Tue, 12 Apr 2022 06:51:03 +0000 (06:51 +0000)] 
Add `parameters` to templated fields in `OracleOperator` (#22857)

5 months agoDo not clear XCom when resuming from deferral (#22932)
Rocco Pascale [Tue, 12 Apr 2022 06:12:01 +0000 (02:12 -0400)] 
Do not clear XCom when resuming from deferral (#22932)

5 months agoUse full version string for deprecated config (#22930)
Jed Cunningham [Tue, 12 Apr 2022 06:06:30 +0000 (00:06 -0600)] 
Use full version string for deprecated config (#22930)

5 months agoMove database config move note to `main` section (#22929)
Jed Cunningham [Tue, 12 Apr 2022 05:53:19 +0000 (23:53 -0600)] 
Move database config move note to `main` section (#22929)

This was accidentally added to the `2.2.4` notes instead of `main`.

5 months agoDeprecate `S3PrefixSensor` and `S3KeySizeSensor` in favor of `S3KeySensor` (#22737)
Vincent [Tue, 12 Apr 2022 05:22:51 +0000 (01:22 -0400)] 
Deprecate `S3PrefixSensor` and `S3KeySizeSensor` in favor of `S3KeySensor` (#22737)

Deprecate `S3PrefixSensor` and `S3KeySizeSensor` in favor of `S3KeySensor`

Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
5 months agoFix changelog spelling (#22926)
Jed Cunningham [Mon, 11 Apr 2022 23:26:42 +0000 (17:26 -0600)] 
Fix changelog spelling (#22926)

5 months agoMove the database configuration to a new section (#22284)
GitStart-AirFlow [Mon, 11 Apr 2022 22:00:25 +0000 (23:00 +0100)] 
Move the database configuration to a new section (#22284)

Co-authored-by: gitstart-airflow <gitstart@users.noreply.github.com>
Co-authored-by: GitStart <1501599+gitstart@users.noreply.github.com>
Co-authored-by: Egbosi Kelechi <egbosikelechi@gmail.com>
5 months agoRemove installation instructions from Breeze's cheatsheet (#22923)
Jarek Potiuk [Mon, 11 Apr 2022 21:51:52 +0000 (23:51 +0200)] 
Remove installation instructions from Breeze's cheatsheet (#22923)

The cheatsheet is displayed only after Breeze is installed so it
makes no sense to display installation instructions in the
cheatsheet.

5 months agoFix bug where dynamically mapped tasks got set to REMOVED (#22909)
Ash Berlin-Taylor [Mon, 11 Apr 2022 21:34:06 +0000 (22:34 +0100)] 
Fix bug where dynamically mapped tasks got set to REMOVED (#22909)

* Fix bug where dynamically mapped tasks got set to REMOVED

This mostly affects backfill/`airflow tasks test`.

5 months agoimplements #22859 - Add .sql as templatable extension (#22920)
LennyKLB [Mon, 11 Apr 2022 20:48:56 +0000 (22:48 +0200)] 
implements #22859 - Add .sql as templatable extension (#22920)

5 months agotypo in BREEZE.rst (#22919)
D. Ferruzzi [Mon, 11 Apr 2022 19:51:38 +0000 (12:51 -0700)] 
typo in BREEZE.rst (#22919)

5 months agoAdd SmoothOperator (#22813)
Tomek Urbaszek [Mon, 11 Apr 2022 18:55:58 +0000 (20:55 +0200)] 
Add SmoothOperator (#22813)

Easter is coming, so I came up with the idea of an Easter egg.

5 months agoUpdate mapped task UX (#22911)
Brent Bovenzi [Mon, 11 Apr 2022 18:34:21 +0000 (13:34 -0500)] 
Update mapped task UX (#22911)

* Add Xcom button, hide map index actions, disabled run

* Allow bulk mapped task actions

* Remove table selection for now

* fix linting error

* Fix copy and isDisabled

5 months agoCatch error in Breeze when docker is not running (#22901)
Joppe Vos [Mon, 11 Apr 2022 14:30:21 +0000 (16:30 +0200)] 
Catch error in Breeze when docker is not running (#22901)

5 months agoAdd concept doc for Dynamic Task Mapping (#22867)
Ash Berlin-Taylor [Mon, 11 Apr 2022 14:28:40 +0000 (15:28 +0100)] 
Add concept doc for Dynamic Task Mapping (#22867)

Co-authored-by: Daniel Standish <15932138+dstandish@users.noreply.github.com>
Co-authored-by: Jed Cunningham <jedcunningham@apache.org>
Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
5 months agoHandle invalid JSON metadata in get_logs_with_metadata endpoint. (#22898)
Karthikeyan Singaravelan [Mon, 11 Apr 2022 10:48:10 +0000 (16:18 +0530)] 
Handle invalid JSON metadata in get_logs_with_metadata endpoint. (#22898)

5 months agoAllow using mapped upstream's aggregated XCom (#22849)
Tzu-ping Chung [Mon, 11 Apr 2022 09:29:32 +0000 (17:29 +0800)] 
Allow using mapped upstream's aggregated XCom (#22849)

This needs two changes. First, when the upstream pushes the return value
to XCom, we need to identify that the pushed value is not used on its
own, but only aggregated with other return values from other mapped task
instances. Fortunately, this is actually the only possible case right
now, since we have not implemented support for depending on individual
return values from a mapped task (aka nested mapping). So we instead
skip recording any TaskMap metadata from a mapped task to avoid the
problem altogether.

The second change is for when the downstream task is expanded. Since the
task depends on the mapped upstream as a whole, we should not use
TaskMap from the upstream (which corresponds to individual task
instances, as mentioned above), but the XComs pushed by every instance
of the mapped task. Again, since we don't support nested mapping now, we can cut
corners and simply check whether the upstream is mapped or not to decide
what to do, and leave further logic to the future.

Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
5 months agoAdd test case for clearTaskInstance call with invalid Task IDs. (#22894)
Karthikeyan Singaravelan [Mon, 11 Apr 2022 09:05:12 +0000 (14:35 +0530)] 
Add test case for clearTaskInstance call with invalid Task IDs. (#22894)

5 months agomake operator's execution_timeout configurable (#22389)
sagman.sercan [Mon, 11 Apr 2022 08:31:10 +0000 (11:31 +0300)] 
make operator's execution_timeout configurable (#22389)

* make operator's execution_timeout configurable

By this commit, execution_timeout attribute of the
operators is now configurable globally via airflow.cfg.

* The default value is still `None`. Users are expected to
define a positive integer value to be passed into timedelta object
to set timeout in terms of seconds by default, via configuration.
* If the key is missing or is set to a non-positive value, then it is
considered as `None`.
* Added `gettimedelta` method to be used in abstractoperator
to get timedelta or None type object. The method raises exception
for the values that are not convertible to integer and/or the values
too large to be converted to C int.
* Sample config cases are added into unit tests.

Closes #18578

* raise error for non-positive execution_timeout

* With this commit, an error is raised for values <= 0
instead of using the fallback value
* Updated unit tests

* include OverflowError error message in exception

To be clearer to the user, added the relevant error message
to AirflowConfigException.

* rename default_execution_timeout

This parameter specifies the tasks' execution timeout,
so all configuration and variable names now contain
`task`.

* update `version_added` for execution_timeout

* update execution_timeout description

fixed the description of default_task_execution_timeout
based on the recent changes

* update inline comment for non-positive value check

* update `gettimedelta` docstring

* allow non-positive values in gettimedelta

Before this commit, the gettimedelta method prevented
users from providing non-positive values. Now it is entirely up to
users to provide a sensible value for this configuration.
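
A rough sketch of the end state described in this series, assuming the raw config value arrives as a string of seconds. `gettimedelta_sketch` is a hypothetical stand-in, not Airflow's `gettimedelta`, and it raises a plain `ValueError` where the real code is described as raising a config exception.

```python
from datetime import timedelta
from typing import Optional


def gettimedelta_sketch(raw: Optional[str]) -> Optional[timedelta]:
    """Convert a config string holding seconds into a timedelta, or None."""
    if raw is None or raw.strip() == "":
        # A missing or empty option means "no timeout".
        return None
    try:
        seconds = int(raw)
    except ValueError as exc:
        raise ValueError(f"Cannot convert {raw!r} to an integer") from exc
    try:
        # Non-positive values are passed through; choosing a sensible
        # value is left to the user.
        return timedelta(seconds=seconds)
    except OverflowError as exc:
        # Surface the underlying overflow message to the user.
        raise ValueError(f"Cannot convert {raw!r} to a timedelta: {exc}") from exc
```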

Co-authored-by: sercan.sagman <sercan.sagman@inventanalytics.com>
5 months agoupdate INTHEWILD (#22896)
eladkal [Mon, 11 Apr 2022 08:00:27 +0000 (11:00 +0300)] 
update INTHEWILD (#22896)

5 months agoMSSQLToGCSOperator fails: datetime is not JSON Serializable (#22882)
pierrejeambrun [Mon, 11 Apr 2022 06:36:44 +0000 (08:36 +0200)] 
MSSQLToGCSOperator fails: datetime is not JSON Serializable (#22882)

* Handle date and time convert_type for MSSQLToGCSOperator

5 months agoFix "force_answers" parameter to be "yes" rather than "true" (#22891)
Jarek Potiuk [Sun, 10 Apr 2022 22:28:26 +0000 (00:28 +0200)] 
Fix "force_answers" parameter to be "yes" rather than "true" (#22891)

5 months agoPush CI image using new Python Breeze (#22888)
Jarek Potiuk [Sun, 10 Apr 2022 22:05:35 +0000 (00:05 +0200)] 
Push CI image using new Python Breeze (#22888)

Fixes: #22821

5 months agoRemove unnecessary python 3.6 conditionals (#20549)
Jed Cunningham [Sun, 10 Apr 2022 19:58:26 +0000 (13:58 -0600)] 
Remove unnecessary python 3.6 conditionals (#20549)

Since Python 3.7 is now the lowest supported version, we no longer need
to have conditionals to support 3.6.

5 months agoReplace old Breeze with Python based implementation (#22880)
Jarek Potiuk [Sun, 10 Apr 2022 18:47:29 +0000 (20:47 +0200)] 
Replace old Breeze with Python based implementation (#22880)

Over the last few months, together with Outreachy interns,
we rewrote the most important functionality of the old Bash-based
Breeze as a Python-based implementation.

We approached it in a systematic way, capturing all our decisions
in the ADR format (dev/breeze/docs) and implementing the parts
that are used on a daily basis by the users. Breeze2, as it was
called, is ready for prime time with the users, so we are swapping
out the old breeze with the new one.

The old `breeze` has been moved to `breeze-legacy` and we will
gradually remove the parts of it that are already migrated to Python
and proven, while continuing to rewrite the parts that are missing
(mostly the maintainer tools) and replacing the remaining CI shell
scripts with the new `breeze` commands.

We also need to make sure that there is no accidental top-level
import for extra packages added in the future, because people
who installed breeze before will not have it - so we have
a pre-commit that checks if breeze.py can be parsed and
--help executed with just rich and click installed.

This PR:

* moves `breeze` to `breeze-legacy`
* moves `Breeze2` to `breeze`
* updates documentation and screenshots where applicable
* explains old vs. new breeze in documentation
* adds protection so that no accidental top-level import is
  added to breeze.py and files imported from there

Fixes: #22827

5 months agoDatabricks SQL operators are now Python 3.10 compatible (#22886)
Alex Ott [Sun, 10 Apr 2022 18:32:01 +0000 (20:32 +0200)] 
Databricks SQL operators are now Python 3.10 compatible (#22886)

New version of databricks-sql-connector fixes incompatibility with
Python 3.10, so rolling back #22221, and bumping dependency.

This closes #22220

5 months agoDatabricks: Correctly handle HTTP exception (#22885)
Alex Ott [Sun, 10 Apr 2022 18:31:29 +0000 (20:31 +0200)] 
Databricks: Correctly handle HTTP exception (#22885)

Exception for non-existent repo wasn't correctly handled for Databricks
Repos operations

5 months agoFix new MyPy errors in main (#22884)
Jarek Potiuk [Sun, 10 Apr 2022 14:44:59 +0000 (16:44 +0200)] 
Fix new MyPy errors in main (#22884)

Those MyPy errors are a side effect of some new dependencies.

5 months agoSupport unknown backends in entrypoint_prod.sh (#22883)
Kamil Breguła [Sun, 10 Apr 2022 07:50:26 +0000 (09:50 +0200)] 
Support unknown backends in entrypoint_prod.sh (#22883)

5 months agoAdjust DAG/TI details panel (#22877)
Brent Bovenzi [Sat, 9 Apr 2022 22:07:02 +0000 (17:07 -0500)] 
Adjust DAG/TI details panel (#22877)

* Clean up Dag details

* Clean up TI details

* Add mapped task count

5 months agoBump moment from 2.29.1 to 2.29.2 in /airflow/www (#22873)
dependabot[bot] [Sat, 9 Apr 2022 15:43:04 +0000 (10:43 -0500)] 
Bump moment from 2.29.1 to 2.29.2 in /airflow/www (#22873)

Bumps [moment](https://github.com/moment/moment) from 2.29.1 to 2.29.2.
- [Release notes](https://github.com/moment/moment/releases)
- [Changelog](https://github.com/moment/moment/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/moment/moment/compare/2.29.1...2.29.2)

---
updated-dependencies:
- dependency-name: moment
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoSupport for sorting DAGs in the web UI (#22671)
pierrejeambrun [Sat, 9 Apr 2022 15:21:00 +0000 (17:21 +0200)] 
Support for sorting DAGs in the web UI (#22671)

* Add sort + small test

* clean code

* Remove useless forgotten macro, fix nullslast for mysql

* Changes following code review

* Remove nullslast

* Changes desc syntax

5 months agoCleanup dup code now that k8s provider requires 2.3.0+ (#22845)
Jed Cunningham [Sat, 9 Apr 2022 13:24:31 +0000 (07:24 -0600)] 
Cleanup dup code now that k8s provider requires 2.3.0+ (#22845)

5 months agoFix Grid view font sizing (#22866)
Brent Bovenzi [Fri, 8 Apr 2022 21:59:27 +0000 (16:59 -0500)] 
Fix Grid view font sizing (#22866)

5 months agoFix pre-upgrade check for rows dangling w.r.t. dag_run (#22850)
Daniel Standish [Fri, 8 Apr 2022 21:58:57 +0000 (14:58 -0700)] 
Fix pre-upgrade check for rows dangling w.r.t. dag_run (#22850)

Some migrations for 2.3.0 add keys to TI and DR.

We have a check that purges any rows that can't be mapped to a DR.

But it doesn't work correctly. It does DELETE FROM USING dag_run; but this is an inner join so if the dag run isn't there the rows won't be deleted.

Instead we can do DELETE FROM WHERE NOT EXISTS. This has a happy side effect of letting us remove some dialect-specific code.
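
A small, self-contained sketch of the `NOT EXISTS` form, run against an in-memory SQLite database. The two-column schema is a simplified assumption for illustration and is not the real Airflow schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_run (dag_id TEXT, run_id TEXT);
    CREATE TABLE task_instance (task_id TEXT, dag_id TEXT, run_id TEXT);
    INSERT INTO dag_run VALUES ('d1', 'r1');
    INSERT INTO task_instance VALUES ('t1', 'd1', 'r1'), ('t2', 'd1', 'missing');
    """
)

# Delete rows that have no matching dag_run. An inner join could never
# select these rows, because the join only yields rows that DO match.
conn.execute(
    """
    DELETE FROM task_instance
    WHERE NOT EXISTS (
        SELECT 1 FROM dag_run
        WHERE dag_run.dag_id = task_instance.dag_id
          AND dag_run.run_id = task_instance.run_id
    )
    """
)

remaining = [row[0] for row in conn.execute("SELECT task_id FROM task_instance ORDER BY task_id")]
print(remaining)
```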

This PR does not add a check for dangling w.r.t. TI -- that is deferred for a later PR.

5 months agoSpeed up `has_access` decorator by ~200ms (#22858)
Ash Berlin-Taylor [Fri, 8 Apr 2022 21:24:26 +0000 (22:24 +0100)] 
Speed up `has_access` decorator by ~200ms (#22858)

Using the ORM to create all the Role, Permission, Action and Resource
objects, only to throw them all away _on every request_ is slow. And
since we are now using the API more and more in the UI it's starting to
get noticeable

This changes the `user.perm` property to issue a custom query that
returns the tuple of action_name, permission_name we want, bypassing the
ORM object inflation entirely, and since `user.roles` isn't needed in
most requests we no longer eagerly load that.

* Fix tests

Caching issues that only crop up in tests (but are never a problem in the
request life cycle of the webserver).

5 months agoEvents Timetable (#22332)
Collin McNulty [Fri, 8 Apr 2022 20:38:23 +0000 (15:38 -0500)] 
Events Timetable (#22332)

This Timetable will be widely useful for timing based on sporting events, planned communication campaigns,
and other schedules that are arbitrary and irregular but predictable.

5 months agoFixed backfill interference with scheduler (#22701)
QP Hou [Fri, 8 Apr 2022 20:13:45 +0000 (13:13 -0700)] 
Fixed backfill interference with scheduler (#22701)

Co-authored-by: Dmirty Suvorov <dmitry.suvorov@scribd.com>
5 months agoSupport dag serialization with custom ti_deps rules (#22698)
QP Hou [Fri, 8 Apr 2022 20:12:18 +0000 (13:12 -0700)] 
Support dag serialization with custom ti_deps rules (#22698)

5 months agoAdd securityContext config for Redis to helm chart (#22182)
Dan Vaughan [Fri, 8 Apr 2022 19:38:32 +0000 (20:38 +0100)] 
Add securityContext config for Redis to helm chart (#22182)

Co-authored-by: Jed Cunningham <jedcunningham@apache.org>
5 months agoAdd `pre-commit` to check that `REVISION_HEADS_MAP` is up-to-date (#22860)
Ephraim Anierobi [Fri, 8 Apr 2022 17:20:54 +0000 (18:20 +0100)] 
Add `pre-commit` to check that `REVISION_HEADS_MAP` is up-to-date (#22860)

This PR adds a `pre-commit` to make sure that the `REVISION_HEADS_MAP` is
up-to-date in any release. Since this feature will be out in 2.3.0 and it is very important
to update it for releases, not just main, this would be helpful.

5 months agoAdd XComArg to lazy-imported list of Airflow module (#22862)
Ash Berlin-Taylor [Fri, 8 Apr 2022 16:12:49 +0000 (17:12 +0100)] 
Add XComArg to lazy-imported list of Airflow module (#22862)

In writing the docs for Dynamic Task Mapping (AIP-42) I noticed that
there are some cases where users need to use XComArg directly, and it
didn't feel right to make them import things from `airflow.models`.

And I've now refactored the lazy import to be "data-driven" as three
blocks of almost identical code was my limit.
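
One way a "data-driven" lazy import can look is a name-to-module mapping consulted by a module-level `__getattr__` (PEP 562). The sketch below is an assumption about the general technique, not the code from this change; `make_lazy_module` and the mapping contents are hypothetical.

```python
import importlib
import types
from typing import Dict


def make_lazy_module(name: str, lazy_map: Dict[str, str]) -> types.ModuleType:
    """Build a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr: str):
        try:
            target = lazy_map[attr]
        except KeyError:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}") from None
        value = getattr(importlib.import_module(target), attr)
        # Cache the result so later lookups bypass __getattr__ entirely.
        setattr(module, attr, value)
        return value

    # Module attribute lookup falls back to __getattr__ found in the
    # module's own namespace when normal lookup fails.
    module.__getattr__ = __getattr__
    return module


# Hypothetical mapping: attribute name -> module that provides it.
lazy = make_lazy_module("demo_lazy", {"dumps": "json", "OrderedDict": "collections"})
print(lazy.dumps([1, 2]))
```

Adding another lazily imported name then only needs one more entry in the mapping rather than another block of near-identical code.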

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Co-authored-by: D. Ferruzzi <ferruzzi@amazon.com>
5 months agoBetter handle auto-refresh errors (#22840)
Brent Bovenzi [Fri, 8 Apr 2022 15:33:11 +0000 (10:33 -0500)] 
Better handle auto-refresh errors (#22840)

5 months agoSupport log download in task log view (#22804)
Hank Ehly [Fri, 8 Apr 2022 14:55:00 +0000 (23:55 +0900)] 
Support log download in task log view (#22804)

* Add download button to ti_log view

* Use native anchor tag for ti_log download button

* Replace regex with data attribute

* Update airflow/www/static/js/ti_log.js

Co-authored-by: Brent Bovenzi <brent.bovenzi@gmail.com>
5 months agoAdd example DAG for demonstrating usage of GCS sensors (#22808)
Pankaj Koti [Fri, 8 Apr 2022 14:26:54 +0000 (19:56 +0530)] 
Add example DAG for demonstrating usage of GCS sensors (#22808)

The following GCS sensor examples are provided as part of the change:
1. GCSUploadSessionCompleteSensor
2. GCSObjectUpdateSensor

The commit does the following:
1. Delete the newly created top level example_gcs.py as it was
   the wrong place for the sensors
2. Add the intended sensors of the PR to the existing example_gcs.py file
   located in airflow/cloud/example_dags directory

5 months agofix message in prepare_provider_packages.py (#22856)
eladkal [Fri, 8 Apr 2022 14:07:20 +0000 (17:07 +0300)] 
fix message in prepare_provider_packages.py (#22856)

5 months agoAdd more fields to REST API dags/dag_id/details endpoint (#22756)
Ephraim Anierobi [Fri, 8 Apr 2022 11:58:47 +0000 (12:58 +0100)] 
Add more fields to REST API dags/dag_id/details endpoint (#22756)

Added more fields to the DAG details endpoint, which is the endpoint for
getting DAG `object` details

5 months agoDocs: `remote_log_conn_id` can also be used to write logs (#22844)
Jed Cunningham [Fri, 8 Apr 2022 11:57:40 +0000 (05:57 -0600)] 
Docs: `remote_log_conn_id` can also be used to write logs (#22844)

5 months agoBring back limits on branches/tags builds in Airflow repo (#22855)
Jarek Potiuk [Fri, 8 Apr 2022 11:05:30 +0000 (13:05 +0200)] 
Bring back limits on branches/tags builds in Airflow repo (#22855)

The change #22542 accidentally removed limit on branches
that trigger direct push workflows in CI.

Currently the builds are also triggered when a new tag is pushed,
not only when a new branch is created, and this is too much,
especially when we push multiple tags for providers :(

5 months agoRevert "Print configuration on scheduler startup. (#22588)" (#22851)
Jed Cunningham [Fri, 8 Apr 2022 08:29:26 +0000 (02:29 -0600)] 
Revert "Print configuration on scheduler startup. (#22588)" (#22851)

This reverts commit 78586b45a0f6007ab6b94c35b33790a944856e5e.

5 months agoDisable foreign keys on sqlite when modifying dag_run (#22848)
Daniel Standish [Fri, 8 Apr 2022 08:15:08 +0000 (01:15 -0700)] 
Disable foreign keys on sqlite when modifying dag_run (#22848)

If we do not disable FKs then it has the side effect of deleting all task instances.
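
The side effect can be reproduced in a toy setting, assuming the modification involves dropping the old table (as a table rebuild would) and that the child table uses `ON DELETE CASCADE`. The schema below is a simplified assumption, not the real Airflow schema.

```python
import sqlite3


def surviving_task_instances(foreign_keys_on: bool) -> int:
    """Drop dag_run and count the task_instance rows that are left."""
    # Autocommit mode, so the PRAGMA is not issued inside a transaction
    # (where SQLite would silently ignore it).
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(f"PRAGMA foreign_keys = {'ON' if foreign_keys_on else 'OFF'}")
    conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE task_instance ("
        "id INTEGER PRIMARY KEY, "
        "dag_run_id INTEGER REFERENCES dag_run(id) ON DELETE CASCADE)"
    )
    conn.execute("INSERT INTO dag_run VALUES (1)")
    conn.execute("INSERT INTO task_instance VALUES (10, 1), (11, 1)")
    # With foreign keys enabled, DROP TABLE performs an implicit DELETE
    # that triggers the cascade on the child table.
    conn.execute("DROP TABLE dag_run")
    return conn.execute("SELECT COUNT(*) FROM task_instance").fetchone()[0]


print(surviving_task_instances(foreign_keys_on=True))
print(surviving_task_instances(foreign_keys_on=False))
```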

5 months agoSupport conf param override for backfill runs (#22837)
QP Hou [Fri, 8 Apr 2022 04:58:25 +0000 (21:58 -0700)] 
Support conf param override for backfill runs (#22837)

Co-authored-by: Dmirty Suvorov <dmitry.suvorov@scribd.com>
5 months agoTemporarily disable task_fail pre-upgrade duplicates check (#22839)
Daniel Standish [Fri, 8 Apr 2022 02:22:11 +0000 (19:22 -0700)] 
Temporarily disable task_fail pre-upgrade duplicates check (#22839)

I am reworking it to actually move the rows but for now we can disable it.

5 months agoAdd 2.2.5 to revision heads map (#22841)
Daniel Standish [Fri, 8 Apr 2022 02:21:33 +0000 (19:21 -0700)] 
Add 2.2.5 to revision heads map (#22841)

This is necessary to allow for downgrade to 2.2.5

5 months agoBump prismjs from 1.26.0 to 1.27.0 in /airflow/www (#22823)
dependabot[bot] [Thu, 7 Apr 2022 18:42:10 +0000 (13:42 -0500)] 
Bump prismjs from 1.26.0 to 1.27.0 in /airflow/www (#22823)

Bumps [prismjs](https://github.com/PrismJS/prism) from 1.26.0 to 1.27.0.
- [Release notes](https://github.com/PrismJS/prism/releases)
- [Changelog](https://github.com/PrismJS/prism/blob/master/CHANGELOG.md)
- [Commits](https://github.com/PrismJS/prism/compare/v1.26.0...v1.27.0)

---
updated-dependencies:
- dependency-name: prismjs
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoPrepare mid-April provider documentation. (#22819) providers-google/6 providers-amazon/3.3.0 providers-amazon/3.3.0rc1 providers-apache-livy/2.2.3 providers-apache-livy/2.2.3rc1 providers-arangodb/1.0.0 providers-arangodb/1.0.0rc1 providers-celery/2.1.4 providers-celery/2.1.4rc1 providers-cncf-kubernetes/4.0.0 providers-cncf-kubernetes/4.0.0rc1 providers-databricks/2.6.0rc1 providers-discord/2.1.4 providers-discord/2.1.4rc1 providers-docker/2.6.0 providers-docker/2.6.0rc1 providers-elasticsearch/3.0.3 providers-elasticsearch/3.0.3rc1 providers-google/6.8.0 providers-google/6.8.0rc1 providers-hashicorp/2.2.0 providers-hashicorp/2.2.0rc1 providers-jenkins/2.1.0 providers-jenkins/2.1.0rc1 providers-microsoft-azure/3.8.0 providers-microsoft-azure/3.8.0rc1 providers-microsoft-psrp/1.1.4 providers-microsoft-psrp/1.1.4rc1 providers-presto/2.2.0 providers-presto/2.2.0rc1 providers-sftp/2.6.0 providers-sftp/2.6.0rc1 providers-trino/2.2.0 providers-trino/2.2.0rc1
Jarek Potiuk [Thu, 7 Apr 2022 14:41:56 +0000 (16:41 +0200)] 
Prepare mid-April provider documentation. (#22819)

5 months agoBump postcss from 7.0.35 to 7.0.39 in /airflow/ui (#22831)
dependabot[bot] [Thu, 7 Apr 2022 14:11:29 +0000 (09:11 -0500)] 
Bump postcss from 7.0.35 to 7.0.39 in /airflow/ui (#22831)

Bumps [postcss](https://github.com/postcss/postcss) from 7.0.35 to 7.0.39.
- [Release notes](https://github.com/postcss/postcss/releases)
- [Changelog](https://github.com/postcss/postcss/blob/7.0.39/CHANGELOG.md)
- [Commits](https://github.com/postcss/postcss/compare/7.0.35...7.0.39)

---
updated-dependencies:
- dependency-name: postcss
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoBump nanoid from 3.1.23 to 3.3.2 in /airflow/www (#22803)
dependabot[bot] [Thu, 7 Apr 2022 14:09:04 +0000 (09:09 -0500)] 
Bump nanoid from 3.1.23 to 3.3.2 in /airflow/www (#22803)

Bumps [nanoid](https://github.com/ai/nanoid) from 3.1.23 to 3.3.2.
- [Release notes](https://github.com/ai/nanoid/releases)
- [Changelog](https://github.com/ai/nanoid/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ai/nanoid/compare/3.1.23...3.3.2)

---
updated-dependencies:
- dependency-name: nanoid
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoDisable SLAs for mapped operators (#22641)
Daniel Standish [Thu, 7 Apr 2022 13:58:38 +0000 (06:58 -0700)] 
Disable SLAs for mapped operators (#22641)

When trying to update SLA logic to handle mapped operators we discovered some odd behavior and decided to defer adding support for SLAs with mapped tasks.

5 months agoBump url-parse from 1.5.1 to 1.5.10 in /airflow/ui (#22822)
dependabot[bot] [Thu, 7 Apr 2022 13:50:58 +0000 (08:50 -0500)] 
Bump url-parse from 1.5.1 to 1.5.10 in /airflow/ui (#22822)

Bumps [url-parse](https://github.com/unshiftio/url-parse) from 1.5.1 to 1.5.10.
- [Release notes](https://github.com/unshiftio/url-parse/releases)
- [Commits](https://github.com/unshiftio/url-parse/compare/1.5.1...1.5.10)

---
updated-dependencies:
- dependency-name: url-parse
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoBump minimist from 1.2.5 to 1.2.6 in /airflow/www (#22798)
dependabot[bot] [Thu, 7 Apr 2022 13:48:43 +0000 (08:48 -0500)] 
Bump minimist from 1.2.5 to 1.2.6 in /airflow/www (#22798)

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoCorrectly fetch logs for mapped task instances (#22818)
Ash Berlin-Taylor [Thu, 7 Apr 2022 13:38:53 +0000 (14:38 +0100)] 
Correctly fetch logs for mapped task instances (#22818)

We weren't passing the `map_index` param down to the server.

5 months agoGive up on trying to recreate task_id logic (#22794)
Tzu-ping Chung [Thu, 7 Apr 2022 13:14:01 +0000 (21:14 +0800)] 
Give up on trying to recreate task_id logic (#22794)

5 months agoCheck if map_index is accidentally used in providers. (#22817)
Jarek Potiuk [Thu, 7 Apr 2022 12:57:06 +0000 (14:57 +0200)] 
Check if map_index is accidentally used in providers. (#22817)

5 months agoMake ElasticSearch Provider compatible for Airflow<2.3 (#22814)
Kaxil Naik [Thu, 7 Apr 2022 12:50:02 +0000 (13:50 +0100)] 
Make ElasticSearch Provider compatible for Airflow<2.3 (#22814)

`ti.map_index` is not released yet, and even once it is released in 2.3 we still want this provider to be backwards compatible. This change fixes that.
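
A minimal sketch of the backwards-compatibility idea described above, not the provider's actual change: read `map_index` defensively so the same code runs against task instances that do not have the attribute. The helper name `get_map_index` and the fallback value `-1` are assumptions made for illustration.

```python
from types import SimpleNamespace

# Assumed fallback for task instances that have no map_index attribute.
UNMAPPED_INDEX = -1


def get_map_index(ti):
    """Return ti.map_index when it exists, otherwise the fallback value."""
    return getattr(ti, "map_index", UNMAPPED_INDEX)


# Stand-ins for task instances from an older and a newer Airflow release.
old_ti = SimpleNamespace(task_id="extract")
new_ti = SimpleNamespace(task_id="extract", map_index=3)

print(get_map_index(old_ti), get_map_index(new_ti))
```

Using `getattr` with a default keeps a single code path for both versions instead of branching on the installed Airflow version.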

5 months agoDon't show irrelevant/duplicated/"internal" Task attrs in UI (#22812)
Ash Berlin-Taylor [Thu, 7 Apr 2022 12:27:58 +0000 (13:27 +0100)] 
Don't show irrelevant/duplicated/"internal" Task attrs in UI (#22812)

For example, showing `Log Logger <airflow.models.mappedoperator.MappedOperator (INFO)>` isn't useful.

5 months agoCorrectly interpolate pool name in PoolSlotsAvailableDep statuses (#22807)
Tanel Kiis [Thu, 7 Apr 2022 10:33:35 +0000 (13:33 +0300)] 
Correctly interpolate pool name in PoolSlotsAvailableDep statuses (#22807)

5 months agoFix `email_on_failure` with `render_template_as_native_obj` (#22770)
Jed Cunningham [Thu, 7 Apr 2022 08:48:14 +0000 (02:48 -0600)] 
Fix `email_on_failure` with `render_template_as_native_obj` (#22770)

Co-authored-by: andyhuang <andyhuang@mirrormedia.mg>
Co-authored-by: Tzu-ping Chung <tp@astronomer.io>
5 months agoExpand mapped tasks at DagRun.verify_integrity (#22679)
Ephraim Anierobi [Thu, 7 Apr 2022 08:27:27 +0000 (09:27 +0100)] 
Expand mapped tasks at DagRun.verify_integrity (#22679)

Create the necessary task instances for a mapped task at dagrun.verify_integrity

Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
5 months agoBump minimist from 1.2.5 to 1.2.6 in /airflow/ui (#22799)
dependabot[bot] [Wed, 6 Apr 2022 23:24:27 +0000 (00:24 +0100)] 
Bump minimist from 1.2.5 to 1.2.6 in /airflow/ui (#22799)

Bumps [minimist](https://github.com/substack/minimist) from 1.2.5 to 1.2.6.
- [Release notes](https://github.com/substack/minimist/releases)
- [Commits](https://github.com/substack/minimist/compare/1.2.5...1.2.6)

---
updated-dependencies:
- dependency-name: minimist
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoBump axios from 0.21.1 to 0.21.2 in /airflow/ui (#22797)
dependabot[bot] [Wed, 6 Apr 2022 23:23:53 +0000 (00:23 +0100)] 
Bump axios from 0.21.1 to 0.21.2 in /airflow/ui (#22797)

Bumps [axios](https://github.com/axios/axios) from 0.21.1 to 0.21.2.
- [Release notes](https://github.com/axios/axios/releases)
- [Changelog](https://github.com/axios/axios/blob/master/CHANGELOG.md)
- [Commits](https://github.com/axios/axios/compare/v0.21.1...v0.21.2)

---
updated-dependencies:
- dependency-name: axios
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
5 months agoPass custom headers through in SES email backend (#22667)
Mike Roest [Wed, 6 Apr 2022 20:43:37 +0000 (14:43 -0600)] 
Pass custom headers through in SES email backend (#22667)

5 months agoFix processor cleanup on DagFileProcessorManager (#22685)
Pablo Collado [Wed, 6 Apr 2022 20:42:31 +0000 (22:42 +0200)] 
Fix processor cleanup on DagFileProcessorManager (#22685)

* Fix processor cleanup

References to processors weren't being cleaned up after
killing them in the event of a timeout. This led to
a crash caused by an unhandled exception when trying to
read from a closed end of a pipe.

* Reap the zombie when killing the processor

When calling `_kill_process()` we're generating
zombies which weren't being `wait()`ed for. This
led to a process leak we fix by just calling
`waitpid()` on the appropriate PIDs.

* Reap resulting zombies in a safe way

According to @potiuk's and @malthe's input, the way
we were reaping the zombies could cause some racy and
unwanted situations. As seen on the discussion over at
`https://bugs.python.org/issue42558` we can safely
reap the spawned zombies with the changes we have
introduced.
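
The zombie problem described above can be illustrated with a small POSIX-only sketch. It is not the code from this PR: it forks a throwaway child, kills it, and then collects it with `os.waitpid()` so that no zombie entry remains. The helper names `spawn_sleeping_child` and `kill_and_reap` are hypothetical.

```python
import os
import signal
import time


def spawn_sleeping_child() -> int:
    """Fork a child that only sleeps and return its PID (POSIX only)."""
    pid = os.fork()
    if pid == 0:
        # Child: sleep until killed, and never run the parent's cleanup code.
        try:
            time.sleep(60)
        finally:
            os._exit(0)
    return pid


def kill_and_reap(pid: int) -> int:
    """SIGKILL the child, then wait on it so it does not linger as a zombie."""
    os.kill(pid, signal.SIGKILL)
    _, status = os.waitpid(pid, 0)  # blocks until the child has been collected
    return status


if __name__ == "__main__":
    status = kill_and_reap(spawn_sleeping_child())
    print("terminated by signal:", os.WTERMSIG(status))
```

Without the `os.waitpid()` call the killed child would stay in the process table until the parent exits, which is the leak the message refers to.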

* Explain why we are actively waiting

As suggested by @potiuk, explaining why we chose to actively wait in a scenario such as this one can indeed be useful for anybody taking a look at the code some time from now...

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
* Fix small typo and trailing whitespace

After accepting the changes proposed on the PR
we found a small typo (we make those on a daily basis)
and a trailing whitespace we thought was nice to delete.
Hope we made the right choice!

* Fix call to `poll()`

We were calling `poll()` through the `_process` attribute
and, as shown on the static checks triggered by GitHub,
it's not defined for the `BaseProcess` class. We instead
have to call `poll()` through `BaseProcess`'s `_popen`
attribute.

* Prevent static check from failing

After reading through `multiprocessing`'s implementation we
really didn't know why the static check on line `239` was
failing: the process should contain a `_popen` attribute...
That's when we found line `223` and discovered the trailing
`# type: ignore` comment. After reading up on it we found
that it instructs *MyPy* not to statically check that very
line. Given we're having trouble with the exact same attribute
we decided to include the same directive for the static checker.
Hope we made the right call!
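
A rough sketch of what polling through the private `_popen` attribute with a `# type: ignore` directive can look like. It is an illustration under stated assumptions rather than the code in this PR: it assumes a POSIX system where the `fork` start method is available, relies on `_popen` being a CPython implementation detail of `multiprocessing`, and the helper name `kill_and_poll` and its timeout are hypothetical.

```python
import multiprocessing
import time


def kill_and_poll(proc, timeout: float = 5.0):
    """Kill a started multiprocessing process and actively wait until it is reaped."""
    proc.kill()
    deadline = time.monotonic() + timeout
    # `_popen` is private and not declared on BaseProcess for the type checker,
    # hence the per-line directive below.
    while proc._popen.poll() is None:  # type: ignore
        if time.monotonic() > deadline:
            raise TimeoutError("child was not reaped in time")
        time.sleep(0.001)
    return proc.exitcode


if __name__ == "__main__":
    ctx = multiprocessing.get_context("fork")  # assumption: fork is available
    child = ctx.Process(target=time.sleep, args=(60,))
    child.start()
    print("exit code:", kill_and_poll(child))
```

The `# type: ignore` comment silences the checker for that single line only, so the rest of the function is still type checked.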

* Fix test for `_kill_timed_out_processors()`

We hadn't updated the tests for the method whose
body we've altered. This caused the tests to fail
when trying to retrieve a processor's *waitable*,
a property similar to a *file descriptor* in
UNIX-like systems. We have added a mock property to
the `processor` and we've also updated the `manager`'s
attributes so as to faithfully recreate the state of
the data structures at a moment when a `processor`
is to be terminated.

Please note the `assertions` at the end are meant to
check we reach the `manager`'s expected state. We have
chosen to check the number of processors against an
explicit value because we're defining `manager._processors`
explicitly within the test. On the other hand, `manager.waitables`
can have a different length depending on the call to
`DagFileProcessorManager`'s `__init__()`. In this test the
expected initial length is `1` given we're passing `MagicMock()`
as the `signal_conn` when instantiating the manager. However,
if this were to be changed the tests would 'inexplicably' fail.
Instead of checking `manager.waitables`' length against a hardcoded
value we decided to instead compare it to its initial length
so as to emphasize we're interested in the change in length, not
its absolute value.
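
The assertion strategy described above can be sketched with a stand-in class. `FakeManager` and `check_cleanup` are hypothetical and do not reproduce `DagFileProcessorManager`; they only show why comparing against the initial length keeps the check valid when the constructor registers a different number of waitables.

```python
class FakeManager:
    """Hypothetical stand-in that tracks processors and their waitables."""

    def __init__(self, extra_waitables=()):
        self.processors = {}
        # The constructor may register waitables of its own (e.g. a signal connection).
        self.waitables = {w: None for w in extra_waitables}

    def add(self, path, waitable):
        self.processors[path] = waitable
        self.waitables[waitable] = path

    def kill_timed_out(self, path):
        # Drop both references so nothing reads from a closed pipe later.
        waitable = self.processors.pop(path)
        self.waitables.pop(waitable)


def check_cleanup(manager):
    manager.add("abc.py", "waitable-1")  # processors are set explicitly by the test
    initial_waitables = len(manager.waitables)
    manager.kill_timed_out("abc.py")
    assert len(manager.processors) == 0  # explicit expected value
    assert len(manager.waitables) == initial_waitables - 1  # change in length only
    return True


# Passes regardless of how many waitables the constructor registered.
print(check_cleanup(FakeManager()), check_cleanup(FakeManager(["signal_conn"])))
```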

* Fix `black` checks and `mock` decorators

One of the methods we are to mock required a rather
long `@mock.patch` decorator which didn't pass the
checks made by `black` on the precommit hooks. On
top of that, we messed up the ordering of the
`@mock.patch` decorators which meant we didn't
set them up properly. This manifested as a `KeyError`
on the method we're currently testing. O_o
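
For reference, a short sketch of the decorator-ordering rule that is easy to get wrong with stacked `@mock.patch` decorators: they are applied bottom-up, so the decorator closest to the function supplies the first mock argument. The patched targets (`os.getcwd`, `os.getpid`) and the function name are chosen purely for illustration and are unrelated to the test in this PR.

```python
import os
from unittest import mock


@mock.patch("os.getcwd")  # outermost decorator -> last mock argument
@mock.patch("os.getpid")  # innermost decorator -> first mock argument
def patched_values(mock_getpid, mock_getcwd):
    mock_getpid.return_value = 1234
    mock_getcwd.return_value = "/tmp/example"
    # Each name is bound to the mock installed by the matching decorator.
    assert os.getpid is mock_getpid
    assert os.getcwd is mock_getcwd
    return os.getpid(), os.getcwd()


print(patched_values())
```

Swapping the two parameter names would configure the wrong mocks, which typically shows up as a confusing failure far from the decorators themselves.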

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
5 months agoFix sqlalchemy warning about coercing subquery for use in IN() (#22788)
Ephraim Anierobi [Wed, 6 Apr 2022 20:36:06 +0000 (21:36 +0100)] 
Fix sqlalchemy warning about coercing subquery for use in IN() (#22788)

In the task log, when using a Postgres database, you would see this warning:

WARNING - /usr/local/lib/python3.7/site-packages/sqlalchemy/sql/coercions.py:521
SAWarning: Coercing Subquery object into a select() for use in IN();
please pass a select() construct explicitly

This PR fixes it.

5 months agoSwitch to `pipx` as the only installation Breeze2 method (#22740)
Jarek Potiuk [Wed, 6 Apr 2022 20:01:58 +0000 (22:01 +0200)] 
Switch to `pipx` as the only installation Breeze2 method (#22740)

Switch Breeze2 to use `pipx` as its only installation method,
because of problems that can arise with autocompletion if the
entrypoint is not available on PATH.

5 months agoMove CTAS logic into a CTAS function in db pre-upgrade (#22791)
Daniel Standish [Wed, 6 Apr 2022 20:01:01 +0000 (13:01 -0700)] 
Move CTAS logic into a CTAS function in db pre-upgrade (#22791)

We will be reusing this in later PRs that need to do the same thing, but doing it in a separate PR for ease of review.

5 months agoSerialize mapped operator expansion kwargs logic (#22792)
Tzu-ping Chung [Wed, 6 Apr 2022 19:48:18 +0000 (03:48 +0800)] 
Serialize mapped operator expansion kwargs logic (#22792)

This is needed because we need to be able to access the correct
expansion kwargs from the serialized mapped operator in the scheduler.