airflow.git
18 hours agoAdd slim image to docs/docker-stack/README.md (#23710) main
Kamil Breguła [Sun, 15 May 2022 09:52:51 +0000 (11:52 +0200)] 
Add slim image to docs/docker-stack/README.md (#23710)

2 days agoAdd UI tests for /utils and /components (#23456)
Brent Bovenzi [Fri, 13 May 2022 14:58:28 +0000 (10:58 -0400)] 
Add UI tests for /utils and /components (#23456)

* Add UI tests for /utils and /components

* add test for Table

* Address PR feedback

* Fix window prompt var

* Fix TaskName test from rebase

* fix lint errors

2 days agoAdd environment check and build image check for more Breeze commands (#23687)
Jarek Potiuk [Fri, 13 May 2022 13:16:33 +0000 (15:16 +0200)] 
Add environment check and build image check for more Breeze commands (#23687)

Several Breeze commands depend on docker and docker compose
being available, as well as on the breeze image. They will work
fine if you "just" built the image, but they might benefit
from the image being rebuilt (to make sure all the latest
dependencies are installed in the image). The common checks
done in the "shell" command for that are now extracted to common
utils and run as the first thing in the commands that need them.

2 days agoClarify that bundle extras should not be used for PyPi installs (#23697)
Jarek Potiuk [Fri, 13 May 2022 11:33:17 +0000 (13:33 +0200)] 
Clarify that bundle extras should not be used for PyPi installs (#23697)

The bundle extras we have are only used for development and they
should not be used to install airflow from PyPI. This update
to documentation clarifies it.

Closes: #23692

2 days agoFix property name in breeze Shell Params (#23696)
Jarek Potiuk [Fri, 13 May 2022 11:00:39 +0000 (13:00 +0200)] 
Fix property name in breeze Shell Params (#23696)

The rename from #23562 missed a few shell_parms usages that
should also have been replaced.

2 days agoDisable Flower by default from docker-compose (#23685)
Jarek Potiuk [Fri, 13 May 2022 10:21:36 +0000 (12:21 +0200)] 
Disable Flower by default from docker-compose (#23685)

2 days agoAdd git_source to DatabricksSubmitRunOperator (#23620)
akolar-db [Fri, 13 May 2022 09:56:13 +0000 (11:56 +0200)] 
Add git_source to DatabricksSubmitRunOperator (#23620)

The existing `DatabricksSubmitRunOperator` is extended with the support for the `git_source` parameter which allows users to run notebook tasks from files committed to git repositories.

If specified, any notebook task that is part of the payload will clone the repository and check out the commit, tag, or the tip of the specified branch. This is an alternative to dev repos ([docs](https://docs.databricks.com/repos/index.html)) where the checkout/update would have to be triggered manually.

Public documentation for the feature available here: https://docs.databricks.com/dev-tools/api/latest/jobs.html (NB: as noted in the docs, the feature is currently in public preview).

3 days agoUse func.count to count rows (#23657)
Ping Zhang [Thu, 12 May 2022 21:49:06 +0000 (14:49 -0700)] 
Use func.count to count rows (#23657)

3 days agoUpdate doc and sample dag for Quicksight (#23653)
Vincent [Thu, 12 May 2022 20:19:35 +0000 (14:19 -0600)] 
Update doc and sample dag for Quicksight (#23653)

3 days agoFix expand/collapse all buttons (#23590)
Brent Bovenzi [Thu, 12 May 2022 19:48:31 +0000 (15:48 -0400)] 
Fix expand/collapse all buttons (#23590)

* communicate via customevents

* Handle open group logic in wrapper

* fix tests

* Make grid action buttons sticky

* Add default toggle fn

* fix splitting task id by '.'

* fix missing dagrun ids

3 days agoMove around overflow, position and padding (#23044)
Brent Bovenzi [Thu, 12 May 2022 19:47:24 +0000 (15:47 -0400)] 
Move around overflow, position and padding (#23044)

3 days agoremove stale serialized dags (#22917)
Ping Zhang [Thu, 12 May 2022 19:01:47 +0000 (12:01 -0700)] 
remove stale serialized dags (#22917)

3 days agoShorten max pre-commit hook name length (#23677)
Daniel Standish [Thu, 12 May 2022 18:46:56 +0000 (11:46 -0700)] 
Shorten max pre-commit hook name length (#23677)

When names are too long, pre-commit output looks very ugly and takes up 2x lines. Here I reduce max length just a little bit further so that pre-commit output renders properly on a macbook pro 16" with terminal window splitting screen horizontally.

3 days agoUpgrade `pip` to latest released 22.1.0 version (#23665)
Jarek Potiuk [Thu, 12 May 2022 17:36:06 +0000 (19:36 +0200)] 
Upgrade `pip` to latest released 22.1.0 version (#23665)

We are finally able to get rid of the annoying false-positive
warnings, and we finally have a chance of a warning-free
installation during docker builds.

3 days agoReplace "absolute()" with "resolve()" in pathlib objects (#23675)
Jarek Potiuk [Thu, 12 May 2022 17:30:39 +0000 (19:30 +0200)] 
Replace "absolute()" with "resolve()" in pathlib objects (#23675)

TIL that absolute() is undocumented in pathlib and that we
should use resolve() instead.

So this is it.
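
A minimal sketch of the difference, using a made-up relative path
(`some_dir/../file.txt` is only an illustration and need not exist):
`absolute()` just prefixes the current working directory, while
`resolve()` also collapses `..` segments and follows symlinks.

```python
from pathlib import Path

# Hypothetical relative path containing a ".." segment; it need not exist.
relative = Path("some_dir") / ".." / "file.txt"

absolute_path = relative.absolute()  # cwd is prepended, ".." is kept as-is
resolved_path = relative.resolve()   # ".." is collapsed, symlinks are followed

print(absolute_path)
print(resolved_path)
```

Both results are absolute paths, but only the resolved one is normalized,
which matters when paths are compared or used as cache keys.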

3 days agoAdd wildcard possibility to `package-filter` parameter (#23672)
Jarek Potiuk [Thu, 12 May 2022 17:23:38 +0000 (19:23 +0200)] 
Add wildcard possibility to `package-filter` parameter (#23672)

The glob parameters (for example `apache-airflow-providers-*`) did
not work because only a fixed list of parameters was allowed.

This PR converts the package-filter parameter to stop verifying the
value passed, so autocomplete continues to work but you should
still be able to use glob.

It also removes a few places where the parameters were used with
the `--` separator.
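
As a rough illustration of what glob-style filtering means here, the
sketch below matches package names against patterns with the standard
library. The `filter_packages` helper and the package list are
hypothetical and are not Breeze's actual implementation.

```python
from fnmatch import fnmatchcase


def filter_packages(available, patterns):
    """Return the packages matching at least one glob pattern, keeping order."""
    return [
        pkg
        for pkg in available
        if any(fnmatchcase(pkg, pattern) for pattern in patterns)
    ]


# Illustrative package names only.
available_packages = [
    "apache-airflow",
    "apache-airflow-providers-amazon",
    "apache-airflow-providers-google",
    "docker-stack",
]

selected = filter_packages(available_packages, ["apache-airflow-providers-*"])
print(selected)
```

A plain name such as `apache-airflow` still works as a pattern, since a
pattern without wildcards only matches itself.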

3 days agoMigrate Dataproc to new system tests design (#22777)
Bartłomiej Hirsz [Thu, 12 May 2022 16:49:10 +0000 (18:49 +0200)] 
Migrate Dataproc to new system tests design (#22777)

3 days agoSynchronize support for Postgres and K8S in docs (#23673)
Jarek Potiuk [Thu, 12 May 2022 13:26:04 +0000 (15:26 +0200)] 
Synchronize support for Postgres and K8S in docs (#23673)

We just added support for Postgres 14 and K8S 1.24 and since we
did not have any changes to support either in main we are bringing
the support to 2.3 line as well.

This documentation syncs all remaining places where it should be
updated.

3 days agoremove `--` in `./breeze build-docs` command (#23671)
ishiis [Thu, 12 May 2022 12:33:53 +0000 (21:33 +0900)] 
remove `--` in `./breeze build-docs` command (#23671)

3 days agoAIP45 Remove dag parsing in airflow run local (#21877)
Ping Zhang [Thu, 12 May 2022 10:09:48 +0000 (03:09 -0700)] 
AIP45 Remove dag parsing in airflow run local (#21877)

4 days agoPrepare provider documentation 2022.05.11 (#23631) providers-amazon/3.4.0rc1 providers-apache-beam/3.4.0rc1 providers-apache-hive/2.3.3rc1 providers-cncf-kubernetes/4.0.2rc1 providers-databricks/2.7.0rc1 providers-docker/2.7.0rc1 providers-google/7.0.0rc1 providers-jira/2.0.5rc1 providers-microsoft-azure/3.9.0rc1 providers-opsgenie/3.1.0rc1 providers-presto/2.2.1rc1 providers-salesforce/3.4.4rc1 providers-snowflake/2.7.0rc1 providers-ssh/2.4.4rc1 providers-tableau/2.1.8rc1 providers-trino/2.3.0rc1
Jarek Potiuk [Wed, 11 May 2022 23:01:16 +0000 (01:01 +0200)] 
Prepare provider documentation 2022.05.11 (#23631)

Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
4 days agoRevert "Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497...
Jarek Potiuk [Wed, 11 May 2022 22:20:02 +0000 (00:20 +0200)] 
Revert "Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)" (#23656)

This reverts commit ee342b85b97649e2e29fcf83f439279b68f1b4d4.

4 days agoRename cluster_policy to task_policy (#23468)
humit [Wed, 11 May 2022 19:40:10 +0000 (04:40 +0900)] 
Rename cluster_policy to task_policy (#23468)

* Rename cluster_policy to task_policy

* rename task_policy as example_task_policy.

4 days ago[FEATURE] google provider - BigQueryInsertJobOperator log query (#23648)
raphaelauv [Wed, 11 May 2022 19:28:19 +0000 (21:28 +0200)] 
[FEATURE] google provider - BigQueryInsertJobOperator log query (#23648)

4 days agoFix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)
Sebastian Chamena [Wed, 11 May 2022 19:20:49 +0000 (12:20 -0700)] 
Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)

4 days agoFixed test and remove pytest.mark.xfail for test_exc_tb (#23650)
Kanthi [Wed, 11 May 2022 19:16:49 +0000 (15:16 -0400)] 
Fixed test and remove pytest.mark.xfail for test_exc_tb (#23650)

4 days agoAdded kubernetes version (1.24) in README.md(for Main version(dev)), … (#23649)
Kanthi [Wed, 11 May 2022 17:13:01 +0000 (13:13 -0400)] 
Added kubernetes version (1.24) in README.md(for Main version(dev)), … (#23649)

* Added kubernetes version (1.24) in README.md(for Main version(dev)), accidentally removed in merge conflict.

* Update README.md

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
4 days agoAdd `RedshiftDeleteClusterOperator` support (#23563)
pankajastro [Wed, 11 May 2022 17:07:01 +0000 (22:37 +0530)] 
Add `RedshiftDeleteClusterOperator` support (#23563)

Add support for `RedshiftDeleteClusterOperator`. This will help to clean up resources using airflow operators when needed. In the current implementation, the operator by default waits until the cluster is completely removed. To return immediately without waiting, set the `wait_for_completion` param to False.

- Add operator class
- Add basic unit test
- Add an example task
- Add relevant documentation

4 days agoAdded postgres 14 to support versions(including breeze) (#23506)
Kanthi [Wed, 11 May 2022 16:26:19 +0000 (12:26 -0400)] 
Added postgres 14 to support versions(including breeze) (#23506)

* Added postgres 14 to support versions(including breeze)

4 days agoDon't run pre-migration checks for downgrade (#23634)
Daniel Standish [Wed, 11 May 2022 16:08:06 +0000 (09:08 -0700)] 
Don't run pre-migration checks for downgrade (#23634)

These checks only make sense for upgrades.  Generally they exist to resolve referential integrity issues etc before adding constraints.  In the downgrade context, we generally only remove constraints, so it's a non-issue.

4 days agoAdd index for event column in log table (#23625)
Gabriel Machado [Wed, 11 May 2022 14:45:33 +0000 (16:45 +0200)] 
Add index for event column in log table (#23625)

4 days agoSimplify flash message for _airflow_moved tables (#23635)
Daniel Standish [Wed, 11 May 2022 14:13:57 +0000 (07:13 -0700)] 
Simplify flash message for _airflow_moved tables (#23635)

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
4 days agoFix assuming "Feature" answer on CI when generating docs (#23640)
Jarek Potiuk [Wed, 11 May 2022 11:15:22 +0000 (13:15 +0200)] 
Fix assuming "Feature" answer on CI when generating docs (#23640)

We now have different answers possible when generating docs, and
for testing we assume we answered randomly during the generation
of documentation.

4 days agoFix typo issue (#23633)
humit [Wed, 11 May 2022 10:58:26 +0000 (19:58 +0900)] 
Fix typo issue (#23633)

4 days ago[FEATURE] add K8S 1.24 support (#23637)
raphaelauv [Wed, 11 May 2022 10:52:24 +0000 (12:52 +0200)] 
[FEATURE] add K8S 1.24 support (#23637)

4 days ago[FEATURE] update K8S-KIND to 0.13.0 (#23636)
raphaelauv [Wed, 11 May 2022 08:26:14 +0000 (10:26 +0200)] 
[FEATURE] update K8S-KIND to 0.13.0 (#23636)

4 days agoPrevent KubernetesJobWatcher getting stuck on resource too old (#23521)
Ruben Laguna [Wed, 11 May 2022 06:25:49 +0000 (08:25 +0200)] 
Prevent KubernetesJobWatcher getting stuck on resource too old (#23521)

* Prevent KubernetesJobWatcher getting stuck on resource too old

If the watch fails because of "resource too old", the
KubernetesJobWatcher should not retry with the same resource version,
as that will end up in a loop where there is no progress.

* Reset ResourceVersion().resource_version to 0
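
The retry behaviour described above can be sketched in isolation. This
is a toy model, not the KubernetesJobWatcher code: `ResourceTooOldError`,
`watch_with_reset` and `fake_watch` are hypothetical names used only to
show why the version has to be reset before retrying.

```python
class ResourceTooOldError(Exception):
    """Hypothetical stand-in for the "resource too old" watch failure."""


def watch_with_reset(watch_once, resource_version, max_attempts=3):
    """Call watch_once(resource_version); on "too old", retry from version "0"."""
    for _ in range(max_attempts):
        try:
            return watch_once(resource_version)
        except ResourceTooOldError:
            # Retrying with the same stale version would fail forever,
            # so start over from "0".
            resource_version = "0"
    raise RuntimeError("watch kept failing after resetting the resource version")


def fake_watch(resource_version):
    """Toy watch: any version other than "0" is treated as expired."""
    if resource_version != "0":
        raise ResourceTooOldError(resource_version)
    return f"watching from {resource_version}"


result = watch_with_reset(fake_watch, "12345")
print(result)
```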

5 days agoMake provider doc preparation a bit more fun :) (#23629)
Jarek Potiuk [Tue, 10 May 2022 22:19:54 +0000 (00:19 +0200)] 
Make provider doc preparation a bit more fun :) (#23629)

Previously you had to manually add versions when the changelog was
modified. But why not have a bit more fun and get the versions
bumped automatically based on your assessment when reviewing the
providers, rather than after looking at the generated changelog?

5 days agoFix: Exception when parsing log #20966 (#23301)
Jakub Novák [Tue, 10 May 2022 20:43:25 +0000 (22:43 +0200)] 
Fix: Exception when parsing log #20966 (#23301)

* UnicodeDecodeError: 'utf-8' codec can't decode byte 0xXX in position X: invalid start byte

  File "/opt/work/python395/lib/python3.9/site-packages/airflow/hooks/subprocess.py", line 89, in run_command
    line = raw_line.decode(output_encoding).rstrip()            # raw_line ==  b'\x00\x00\x00\x11\xa9\x01\n'
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 4: invalid start byte

* Update subprocess.py

* Update subprocess.py

* Fix:  Exception when parsing log #20966

* Fix:  Exception when parsing log #20966

 Another alternative is: try-catch it.

e.g.

```
            line = ''
            for raw_line in iter(self.sub_process.stdout.readline, b''):
                try:
                    line = raw_line.decode(output_encoding).rstrip()
                except UnicodeDecodeError as err:
                    print(err, output_encoding, raw_line)
                self.log.info("%s", line)
```
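
Another option, shown here only as an illustration and not necessarily
what was merged, is to let the decoder escape undecodable bytes instead
of raising. The sketch uses the standard `errors="backslashreplace"`
handler; `decode_line` is a hypothetical helper name.

```python
def decode_line(raw_line: bytes, output_encoding: str = "utf-8") -> str:
    """Decode one output line, escaping bytes that are invalid in the encoding."""
    return raw_line.decode(output_encoding, errors="backslashreplace").rstrip()


# The raw line from the traceback above: 0xa9 is not a valid UTF-8 start byte.
raw_line = b"\x00\x00\x00\x11\xa9\x01\n"
line = decode_line(raw_line)
print(repr(line))
```

With this handler the invalid byte shows up in the log as the literal
text `\xa9`, so no output line is lost or replaced by a stale value.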

* Create test_subprocess.sh

* Update test_subprocess.py

* Added shell directive and license to test_subprocess.sh

* Distinguish between raw and decoded lines as suggested by @uranusjr

* simplify test

Co-authored-by: muhua <microhuang@live.com>
5 days agoImplement send_callback method for CeleryKubernetesExecutor and LocalKubernetesExecut...
mhenc [Tue, 10 May 2022 17:13:00 +0000 (19:13 +0200)] 
Implement send_callback method for CeleryKubernetesExecutor and LocalKubernetesExecutor (#23617)

5 days ago[FEATURE] google provider - split GkeStartPodOperator execute (#23518)
raphaelauv [Tue, 10 May 2022 15:51:37 +0000 (17:51 +0200)] 
[FEATURE] google provider - split GkeStartPodOperator execute (#23518)

5 days agoFixed Kubernetes Operator large xcom content Defect (#23490)
rahulgoyal2987 [Tue, 10 May 2022 15:46:55 +0000 (21:16 +0530)] 
Fixed Kubernetes Operator large xcom content Defect  (#23490)

5 days agoAdd slim images to docker-stack docs index (#23601)
Jarek Potiuk [Tue, 10 May 2022 15:24:26 +0000 (17:24 +0200)] 
Add slim images to docker-stack docs index (#23601)

5 days agoAdd Quicksight create ingestion Hook and Operator (#21863)
Harpreet Singh [Tue, 10 May 2022 14:54:13 +0000 (20:24 +0530)] 
Add Quicksight create ingestion Hook and Operator (#21863)

* Add Quicksight create ingestion Hook and Operator

Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
5 days agoMake Breeze help generation independent from having breeze installed (#23612)
Jarek Potiuk [Tue, 10 May 2022 09:49:39 +0000 (11:49 +0200)] 
Make Breeze help generation independent from having breeze installed (#23612)

Generation of Breeze help requires breeze to be installed. However
if you have locally installed breeze with different dependencies
and did not run self-upgrade, the results of generation of the
images might be different (for example when different rich
version is used). This change works in the way that:
* you do not have to have breeze installed at all to make it work
* it always upgrades to latest breeze when it is not installed
* but this only happens when you actually modified some breeze code

5 days agoAdd exportContext.offload flag to CLOUD_SQL_EXPORT_VALIDATION. (#23614)
ishiis [Tue, 10 May 2022 09:49:18 +0000 (18:49 +0900)] 
Add exportContext.offload flag to CLOUD_SQL_EXPORT_VALIDATION. (#23614)

5 days agoUpdate min requirements for rich to 12.4.1 (#23604)
Jarek Potiuk [Tue, 10 May 2022 06:36:28 +0000 (08:36 +0200)] 
Update min requirements for rich to 12.4.1 (#23604)

6 days agoAdd sample dag and doc for S3ListPrefixesOperator (#23448)
Vincent [Mon, 9 May 2022 22:54:59 +0000 (16:54 -0600)] 
Add sample dag and doc for S3ListPrefixesOperator (#23448)

* Add sample dag and doc for S3ListPrefixesOperator

* Fix static checks

6 days agoAdd exception to catch single line private keys (#23043)
nsAstro [Mon, 9 May 2022 22:49:22 +0000 (18:49 -0400)] 
Add exception to catch single line private keys (#23043)

6 days agoUse inclusive words in apache airflow project (#23090)
Edith Puclla [Mon, 9 May 2022 21:52:29 +0000 (16:52 -0500)] 
Use inclusive words in apache airflow project (#23090)

6 days agoImprove caching for multi-platform images. (#23562)
Jarek Potiuk [Mon, 9 May 2022 21:02:25 +0000 (23:02 +0200)] 
Improve caching for multi-platform images. (#23562)

This is another attempt to improve caching performance for
multi-platform images, as the previous ones were undermined by a
bug in the buildx multi-platform cache-to implementation that caused
the image cache to be overwritten between platforms
when multiple images were built.

The bug is reported for the buildx behaviour at
https://github.com/docker/buildx/issues/1044 and until it is fixed
we have to prepare separate caches for each platform and push them
to separate tags.

That adds a bit of overhead to the building step, but for now it is
the simplest way we can work around the bug if we do not want to
manually manipulate manifests and images.

6 days ago19943 Grid view status filters (#23392)
pierrejeambrun [Mon, 9 May 2022 20:32:02 +0000 (22:32 +0200)] 
19943 Grid view status filters (#23392)

* Move tree filtering inside react and add some filters

* Move filters from context to utils

* Fix tests for useTreeData

* Fix last tests.

* Add tests for useFilters

* Refactor to use existing SimpleStatus component

* Additional fix after rebase.

* Update following bbovenzi code review

* Update following code review

* Fix tests.

* Fix page flickering issues from react-query

* Fix side panel and small changes.

* Use default_dag_run_display_number in the filter options

* Handle timezone

* Fix flaky test

Co-authored-by: Brent Bovenzi <brent.bovenzi@gmail.com>
6 days agoAdd sample dag and doc for S3ListOperator (#23449)
Vincent [Mon, 9 May 2022 18:21:51 +0000 (12:21 -0600)] 
Add sample dag and doc for S3ListOperator (#23449)

* Add sample dag and doc for S3ListOperator

* Fix doc

6 days agoHelm chart 1.6.0rc1 (#23548)
Jed Cunningham [Mon, 9 May 2022 18:14:44 +0000 (12:14 -0600)] 
Helm chart 1.6.0rc1 (#23548)

6 days agoAdd doc and sample dag for EC2 (#23547)
Vincent [Mon, 9 May 2022 17:56:50 +0000 (11:56 -0600)] 
Add doc and sample dag for EC2 (#23547)

6 days agoApply specific ID collation to root_dag_id too (#23536)
Michael Peteuil [Mon, 9 May 2022 17:48:11 +0000 (13:48 -0400)] 
Apply specific ID collation to root_dag_id too (#23536)

In certain databases there is a need to set the collation for ID fields
like dag_id or task_id to something different than the database default.
This is because in MySQL with utf8mb4 the index size becomes too big for
the MySQL limits. This was handled in past pull requests
([#7570](https://github.com/apache/airflow/pull/7570),
[#17729](https://github.com/apache/airflow/pull/17729)), but the
root_dag_id field on the dag model was missed. Since this field is used
to join with the dag_id in various other models ([and
self-referentially](https://github.com/apache/airflow/blob/451c7cbc42a83a180c4362693508ed33dd1d1dab/airflow/models/dag.py#L2766)),
it also needs to have the same collation as other ID fields.

This can be seen by running `airflow db reset` before and after applying
this change while also specifying `sql_engine_collation_for_ids` in the
configuration.

Other related PRs
[#19408](https://github.com/apache/airflow/pull/19408)

6 days agoClean up in-line f-string concatenation (#23591)
Josh Fell [Mon, 9 May 2022 17:44:41 +0000 (13:44 -0400)] 
Clean up in-line f-string concatenation (#23591)

6 days agoUpdate sample dag and doc for Datasync (#23511)
Vincent [Mon, 9 May 2022 17:40:27 +0000 (11:40 -0600)] 
Update sample dag and doc for Datasync (#23511)

6 days agoAdd default 'aws_conn_id' to SageMaker Operators #21808 (#23515)
Harpreet Singh [Mon, 9 May 2022 17:36:35 +0000 (23:06 +0530)] 
Add default 'aws_conn_id' to SageMaker Operators #21808 (#23515)

6 days agoFix broken dagrun links when many runs start at the same time (#23462)
Chris Redekop [Mon, 9 May 2022 15:49:53 +0000 (09:49 -0600)] 
Fix broken dagrun links when many runs start at the same time (#23462)

* Load requested dagrun even when there are many dagruns at (almost) the same time

* Fix code formatting issues

6 days agoFix `PythonVirtualenvOperator` templated_fields (#23559)
eladkal [Mon, 9 May 2022 15:17:34 +0000 (18:17 +0300)] 
Fix `PythonVirtualenvOperator` templated_fields (#23559)

* Fix `PythonVirtualenvOperator` templated_fields
The `PythonVirtualenvOperator` templated_fields override `PythonOperator` templated_fields which caused functionality not to work as expected.
fixes: https://github.com/apache/airflow/issues/23557

6 days agoPools with negative open slots should not block other pools (#23143)
Tanel Kiis [Mon, 9 May 2022 15:12:40 +0000 (18:12 +0300)] 
Pools with negative open slots should not block other pools (#23143)

6 days agoAdd `device_requests` parameter to `DockerOperator` (#23554)
eladkal [Mon, 9 May 2022 15:08:15 +0000 (18:08 +0300)] 
Add `device_requests` parameter to `DockerOperator` (#23554)

* Expose device_requests to DockerOperator

Co-authored-by: Tedi Papajorgji <tedi.papajorgji@hotmail.com>
6 days agoFix scheduler crash when expanding with mapped task that returned none (#23486)
Ephraim Anierobi [Mon, 9 May 2022 12:44:35 +0000 (13:44 +0100)] 
Fix scheduler crash when expanding with mapped task that returned none (#23486)

When a task is expanded from a mapped task that returned no value, it
crashes the scheduler. This PR fixes it by first checking whether there
is a return value from the mapped task; if there is none, the error is
raised in the task itself instead of crashing the scheduler.

6 days agoAdd support for queued state in DagRun update endpoint. (#23481)
Karthikeyan Singaravelan [Mon, 9 May 2022 12:25:48 +0000 (17:55 +0530)] 
Add support for queued state in DagRun update endpoint. (#23481)

6 days agoFixed option name in Breeze description (#23582)
Jarek Potiuk [Mon, 9 May 2022 10:15:43 +0000 (12:15 +0200)] 
Fixed option name in Breeze description (#23582)

6 days agoThe output of Breeze commands is only generated when they change (#23570)
Jarek Potiuk [Mon, 9 May 2022 09:59:11 +0000 (11:59 +0200)] 
The output of Breeze commands is only generated when they change (#23570)

Previously we generated output of all the commands from Breeze always,
hoping that they will be the same, but rich already had two changes
in the format of the SVG files which made the output different and
breaking our PRs.

Temporarily we pinned rich to fix the output, but a better solution is
to get the hash of all the configuration options and see if it changed,
and only run generation when it did. This way we keep automated
generation on pre-commit but we are protected from accidental change
of the output.
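
The hash-and-compare idea can be sketched as follows. This is an
assumption-laden illustration, not the Breeze code: `options_hash`,
`needs_regeneration`, the option dictionary and the hash file name are
all hypothetical.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def options_hash(options: dict) -> str:
    """Return a stable SHA-256 digest of the command options."""
    payload = json.dumps(options, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def needs_regeneration(options: dict, hash_file: Path) -> bool:
    """Return True and store the new hash only when the options changed."""
    new_hash = options_hash(options)
    old_hash = hash_file.read_text().strip() if hash_file.exists() else None
    if new_hash == old_hash:
        return False
    hash_file.write_text(new_hash)
    return True


with tempfile.TemporaryDirectory() as tmp:
    stored = Path(tmp) / "command_hash.txt"  # hypothetical file name
    opts = {"shell": ["--python", "--backend"]}  # illustrative options
    print(needs_regeneration(opts, stored))  # first run: nothing stored yet
    print(needs_regeneration(opts, stored))  # unchanged options
```

Because only the options are hashed, a change in how a third-party
library renders the output no longer triggers regeneration by itself.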

We also removed the rich limits and regenerated all svg files to ones
generated by 12.4.0. We also found a way to check whether we should
run generation at all in pre-commit without installing breeze first.

Fixes: #22908

6 days agoFix dag-processor fetch metabase config (#23575)
Andrey Anshin [Mon, 9 May 2022 08:50:33 +0000 (11:50 +0300)] 
Fix dag-processor fetch metabase config (#23575)

6 days agoUpdate dags.rst (#23579)
mthakare-onshape [Mon, 9 May 2022 08:17:12 +0000 (13:47 +0530)] 
Update dags.rst (#23579)

Update missing bracket

6 days agoTemporarily pin xmltodict to 0.12.0 to fix main failure (#23577)
Jarek Potiuk [Mon, 9 May 2022 05:41:06 +0000 (07:41 +0200)] 
Temporarily pin xmltodict to 0.12.0 to fix main failure (#23577)

xmltodict 0.13.0 breaks some tests, and 0.13.0 is likely buggy
as the error is ValueError: Malformatted input.

We pin it to 0.12.0 to fix the failing main.

Related: #23576

7 days agoFix conn close error on retrieving log events (#23470)
thinhnd2104 [Sun, 8 May 2022 22:32:09 +0000 (05:32 +0700)] 
Fix conn close error on retrieving log events (#23470)

related: [#23469](https://github.com/apache/airflow/issues/23469).

7 days agoFix `PostgresToGCSOperator` does not allow nested JSON (#23063)
pierrejeambrun [Sun, 8 May 2022 22:06:23 +0000 (00:06 +0200)] 
Fix `PostgresToGCSOperator` does not allow nested JSON (#23063)

* Avoid double json.dumps for json data export in PostgresToGCSOperator.

* Fix CI

7 days agoOpsgenie: Fix `close_alert` to properly send `kwargs` (#23442)
Benoit Person [Sun, 8 May 2022 21:38:50 +0000 (23:38 +0200)] 
Opsgenie: Fix `close_alert` to properly send `kwargs` (#23442)

7 days agoAmazon Sagemaker Sample DAG and docs update (#23256)
D. Ferruzzi [Sun, 8 May 2022 21:37:51 +0000 (14:37 -0700)] 
Amazon Sagemaker Sample DAG and docs update (#23256)

7 days agowasb hook: use defaultAzureCredentials instead of managedIdentity (#23394)
sanjayp [Sun, 8 May 2022 21:12:26 +0000 (02:42 +0530)] 
wasb hook: use defaultAzureCredentials instead of managedIdentity (#23394)

Co-authored-by: Sanjay Pillai <sanjaypillai11 [at] gmail.com>
7 days agoMove dag_processing.processor_timeouts to counters section (#23393)
Yeachan Park [Sun, 8 May 2022 21:11:51 +0000 (23:11 +0200)] 
Move dag_processing.processor_timeouts to counters section (#23393)

7 days agoFix GCSToGCSOperator ignores replace parameter when there is no wildcard (#23340)
GitStart-AirFlow [Sun, 8 May 2022 19:46:55 +0000 (20:46 +0100)] 
Fix GCSToGCSOperator ignores replace parameter when there is no wildcard (#23340)

7 days agoTests for provider code structure (#23351)
Bartłomiej Hirsz [Sun, 8 May 2022 19:32:26 +0000 (21:32 +0200)] 
Tests for provider code structure (#23351)

Improved test for code structure that can be re-used among various providers.

8 days agoAdd slim images to release process (#23391)
Jarek Potiuk [Sat, 7 May 2022 21:53:11 +0000 (23:53 +0200)] 
Add slim images to release process (#23391)

This PR adds slim images to release process of Airflow.

Those images are small as they do not contain any extras.

Fixes: #20849

8 days agoFix _PIP_ADDITIONAL_REQUIREMENTS case for docker-compose (#23517)
Jarek Potiuk [Sat, 7 May 2022 14:17:48 +0000 (16:17 +0200)] 
Fix _PIP_ADDITIONAL_REQUIREMENTS case for docker-compose (#23517)

Recent versions of Airflow do not allow running `pip install` as
root, but the `init` job runs as root, so when the variable
_PIP_ADDITIONAL_REQUIREMENTS is set, the init container fails.

This PR forces _PIP_ADDITIONAL_REQUIREMENTS to be empty for the init
job.

8 days agoRefactor Breeze to group related methods and classes together (#23556)
Jarek Potiuk [Sat, 7 May 2022 13:56:34 +0000 (15:56 +0200)] 
Refactor Breeze to group related methods and classes together (#23556)

This change refactors Breeze classes to a more consistent approach.

* The "commands" package only contains commands
* All Parameters (BuildCi, BuildProd, BuildDoc, Shell) are now
  in "params" package
* Required/Optional Build args are now members of the
  BuildCiParams, BuildProdParams which makes the params
  much more self-contained.
* All utils are in "utils" package

This helps with avoiding circular imports (all utils are now
standalone and do not use any of the commands).

Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
8 days agoAdd IPV6 form of the address in cassandra status check (#23537)
Jarek Potiuk [Sat, 7 May 2022 13:36:55 +0000 (15:36 +0200)] 
Add IPV6 form of the address in cassandra status check (#23537)

This PR fixes a problem introduced in 3.0.26 of the cassandra image,
which adds square brackets around the IP address regardless of its type.

The problem was worked around by pinning cassandra to 3.0.25 in
#23522 as a quick fix, but this one introduces a permanent,
future-proof solution.

Based on discussion in https://issues.apache.org/jira/browse/CASSANDRA-17612

Fixes: #23523

8 days agoAdd logging in to Github Registry for breeze pull (#23551)
Jarek Potiuk [Sat, 7 May 2022 10:24:05 +0000 (12:24 +0200)] 
Add logging in to Github Registry for breeze pull (#23551)

All of the Airflow images are public in ghcr.io, but the default setting
for images is "private", and when users wanted to build CI workflows
in their forks, they had to manually change their images to public so
that the ci.yml workflow could pull the images prepared in the
build-images workflow.

This PR adds logging in for the `breeze pull` command when GITHUB_TOKEN
is available; the workflow also gets packages: read permissions.

This way CI should work in users' forks without any action from
the user except first-time workflow enabling.

8 days agoFix LocalFilesystemToS3Operator and S3CreateObjectOperator to support full s3://...
Vincent [Sat, 7 May 2022 09:19:45 +0000 (03:19 -0600)] 
Fix LocalFilesystemToS3Operator and S3CreateObjectOperator to support full s3:// style keys (#23180)

* Fix LocalFilesystemToS3Operator and S3CreateObjectOperator.
Support full s3:// style keys

* Fix spelling error

8 days agoChange chart annotation generator to use RELEASE_NOTES (#23549)
Jed Cunningham [Sat, 7 May 2022 09:15:25 +0000 (03:15 -0600)] 
Change chart annotation generator to use RELEASE_NOTES (#23549)

8 days agoUpdate the Athena Sample DAG and Docs (#23428)
D. Ferruzzi [Sat, 7 May 2022 06:28:44 +0000 (23:28 -0700)] 
Update the Athena Sample DAG and Docs (#23428)

* Update the Athena Sample DAG and Docs

8 days agoFix accidental including of providers in airflow package (#23552)
Jarek Potiuk [Sat, 7 May 2022 06:26:04 +0000 (08:26 +0200)] 
Fix accidental including of providers in airflow package (#23552)

The change #23454 accidentally removed the INSTALL_PROVIDERS_FROM_SOURCES
setting to "false", which resulted in the airflow package containing all
providers. This has been caught by our tests (but it was only
visible after merging).

This PR brings the variable back.

9 days agoReplace `pytest.mark.xfail` in Postgres tests (#23541)
eladkal [Fri, 6 May 2022 23:51:56 +0000 (02:51 +0300)] 
Replace `pytest.mark.xfail` in Postgres tests (#23541)

9 days agoSeparate provider verification as standalone breeze command (#23454)
Jarek Potiuk [Fri, 6 May 2022 22:47:00 +0000 (00:47 +0200)] 
Separate provider verification as standalone breeze command (#23454)

This is another step in simplifying and converting to Python all of
the CI/local development tooling.

This PR separates out verification of providers as a separate
breeze command `verify-provider-packages`. It was previously part of
"prepare_provider_packages.py" but it has now been
extracted to a separate in-container python file and it was
wrapped with breeze's `verify-provider-packages` command.

Provider verification is no longer run with "preparing provider docs"
nor "preparing provider packages" - it's a standalone command.

This command is also used in CI now to run the tests:

* all provider packages are built and created on CI together with
  airflow version
* the packages are installed inside the CI image and providers are
  verified
* the 2.1 version of Airflow is installed together with all
  2.1-compatible providers and provider verification is run there too.

This all is much simpler now - we got rid of some 500 lines of bash
code again in favour of breeze python code.

Fixes: #23430

9 days agoTrinoHook add authentication via JWT token and Impersonation (#23116)
Pragya [Fri, 6 May 2022 19:45:33 +0000 (01:15 +0530)] 
TrinoHook add authentication via JWT token and Impersonation  (#23116)

* added trino authentication via JWT token and impersonation

* added test cases for jwt verification in trino

* added documentation for trino hook

9 days agoUpdate docs Amazon Glacier Docs (#23372)
Niko [Fri, 6 May 2022 18:03:24 +0000 (11:03 -0700)] 
Update docs Amazon Glacier Docs (#23372)

9 days agoChange approach to finding bad rows to LEFT OUTER JOIN. (#23528)
Ash Berlin-Taylor [Fri, 6 May 2022 16:02:27 +0000 (17:02 +0100)] 
Change approach to finding bad rows to LEFT OUTER JOIN. (#23528)

Rather than sub-selects (two for count, or one for the CREATE TABLE).

For a _large_ database (27m TaskInstances, 2m DagRuns) this takes the
time from around 10 minutes per table (we have 3) down to around 3
minutes per table. (All times on Postgres.)

Before:

```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT
  rendered_task_instance_fields.dag_id AS dag_id,
  rendered_task_instance_fields.task_id AS task_id,
  rendered_task_instance_fields.execution_date AS execution_date,
  rendered_task_instance_fields.rendered_fields AS rendered_fields,
  rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM
  rendered_task_instance_fields
WHERE
  NOT (
    EXISTS (
      SELECT
        1
      FROM
        task_instance
        JOIN dag_run ON dag_run.dag_id = task_instance.dag_id
        AND dag_run.run_id = task_instance.run_id
      WHERE
        rendered_task_instance_fields.dag_id = task_instance.dag_id
        AND rendered_task_instance_fields.task_id = task_instance.task_id
        AND rendered_task_instance_fields.execution_date = dag_run.execution_date
    )
  )
```

After:

```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT
  rendered_task_instance_fields.dag_id AS dag_id,
  rendered_task_instance_fields.task_id AS task_id,
  rendered_task_instance_fields.execution_date AS execution_date,
  rendered_task_instance_fields.rendered_fields AS rendered_fields,
  rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM
  rendered_task_instance_fields
  LEFT OUTER JOIN dag_run ON rendered_task_instance_fields.dag_id = dag_run.dag_id
  AND rendered_task_instance_fields.execution_date = dag_run.execution_date
  LEFT OUTER JOIN task_instance ON dag_run.dag_id = task_instance.dag_id
  AND dag_run.run_id = task_instance.run_id
  AND rendered_task_instance_fields.task_id = task_instance.task_id
WHERE
  task_instance.dag_id IS NULL
  OR dag_run.dag_id IS NULL
;
```
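As a rough illustration of why the two query shapes select the same rows, the sketch below runs simplified versions of both against an in-memory SQLite database. The three-column tables and sample rows are made up for the example (the real Airflow tables have many more columns), and it assumes `(dag_id, execution_date)` identifies at most one `dag_run`. It says nothing about the Postgres timings quoted above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schemas: only the columns the join conditions need.
conn.executescript("""
CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, execution_date TEXT);
CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, run_id TEXT);
CREATE TABLE rendered_task_instance_fields (
    dag_id TEXT, task_id TEXT, execution_date TEXT
);
INSERT INTO dag_run VALUES ('d1', 'r1', '2022-01-01');
INSERT INTO task_instance VALUES ('d1', 't1', 'r1');
INSERT INTO rendered_task_instance_fields VALUES
    ('d1', 't1', '2022-01-01'),  -- has both a dag run and a task instance
    ('d1', 't2', '2022-01-01'),  -- dag run exists, task instance missing
    ('d2', 't1', '2022-01-02');  -- no dag run at all
""")

# "Before" shape: correlated NOT EXISTS sub-select.
NOT_EXISTS_SQL = """
SELECT r.dag_id, r.task_id, r.execution_date
FROM rendered_task_instance_fields AS r
WHERE NOT EXISTS (
    SELECT 1
    FROM task_instance AS ti
    JOIN dag_run AS dr ON dr.dag_id = ti.dag_id AND dr.run_id = ti.run_id
    WHERE r.dag_id = ti.dag_id
      AND r.task_id = ti.task_id
      AND r.execution_date = dr.execution_date
)
ORDER BY r.dag_id, r.task_id
"""

# "After" shape: LEFT OUTER JOINs, keeping rows where either side is missing.
LEFT_JOIN_SQL = """
SELECT r.dag_id, r.task_id, r.execution_date
FROM rendered_task_instance_fields AS r
LEFT OUTER JOIN dag_run AS dr
    ON r.dag_id = dr.dag_id AND r.execution_date = dr.execution_date
LEFT OUTER JOIN task_instance AS ti
    ON dr.dag_id = ti.dag_id
   AND dr.run_id = ti.run_id
   AND r.task_id = ti.task_id
WHERE ti.dag_id IS NULL OR dr.dag_id IS NULL
ORDER BY r.dag_id, r.task_id
"""

dangling_not_exists = conn.execute(NOT_EXISTS_SQL).fetchall()
dangling_left_join = conn.execute(LEFT_JOIN_SQL).fetchall()
print(dangling_not_exists == dangling_left_join)
```

Both queries flag the two rows that lack a matching task instance or dag run and leave the fully matched row alone.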

9 days agoOnly count bad refs when `moved` table exists (#23491)
Daniel Standish [Fri, 6 May 2022 12:42:22 +0000 (05:42 -0700)] 
Only count bad refs when `moved` table exists (#23491)

This keeps the logic to fail without upgrading when (A) there are bad rows and
(B) the "moved" table already exists. But we optimize so that we don't count
the bad rows unless the "moved" table is there. Previously we counted always,
but the first time a user attempts upgrade, the tables won't be there so
there's no point in counting.

Instead what we do is skip right to the CTAS, creating the _airflow_moved
tables. If there aren't any rows in the "moved" table, then we delete the table
immediately.

Also included here is a delete optimization, where we join to the moved table
instead of running the not exists query again.
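The flow described above can be sketched roughly as follows, using SQLite and a toy `parent`/`child` schema. This is not the Airflow migration code: the table names, the `move_dangling_rows` helper and the error message are all hypothetical, and because SQLite has no join syntax for `DELETE`, the delete step uses an `IN` sub-select against the moved table instead.

```python
import sqlite3

MOVED = "_airflow_moved__dangling__child"  # hypothetical "moved" table name

# Hypothetical query selecting child rows whose parent is missing.
DANGLING_SQL = """
SELECT child.id AS id, child.parent_id AS parent_id
FROM child
LEFT OUTER JOIN parent ON child.parent_id = parent.id
WHERE parent.id IS NULL
"""


def scalar(conn, sql, params=()):
    # fetchall() exhausts the cursor, so no read is left open before later DDL.
    return conn.execute(sql, params).fetchall()[0][0]


def table_exists(conn, name):
    sql = "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = ?"
    return scalar(conn, sql, (name,)) > 0


def move_dangling_rows(conn):
    """Move dangling child rows aside; return how many rows were moved."""
    if table_exists(conn, MOVED):
        # Only count bad rows when the "moved" table is already there.
        bad = scalar(conn, f"SELECT COUNT(*) FROM ({DANGLING_SQL})")
        if bad:
            raise RuntimeError(f"{bad} dangling rows, but {MOVED} already exists")
        return 0
    # First attempt: skip the count and go straight to the CTAS.
    conn.execute(f"CREATE TABLE {MOVED} AS {DANGLING_SQL}")
    moved = scalar(conn, f"SELECT COUNT(*) FROM {MOVED}")
    if moved == 0:
        # Nothing was dangling, so drop the empty table immediately.
        conn.execute(f"DROP TABLE {MOVED}")
        return 0
    # Delete by referring to the moved table rather than re-running the check.
    conn.execute(f"DELETE FROM child WHERE id IN (SELECT id FROM {MOVED})")
    return moved


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (id INTEGER);
CREATE TABLE child (id INTEGER, parent_id INTEGER);
INSERT INTO parent VALUES (1);
INSERT INTO child VALUES (10, 1), (11, 99);
""")
moved_rows = move_dangling_rows(conn)
remaining = conn.execute("SELECT id, parent_id FROM child").fetchall()
```

Calling the helper again on the same database hits the "moved table exists" branch, which is the only place where the bad rows are counted.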

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Co-authored-by: Ash Berlin-Taylor <ash@apache.org>

9 days agoAdd `OpsgenieDeleteAlertOperator` (#23405)
eladkal [Fri, 6 May 2022 11:37:03 +0000 (14:37 +0300)] 
Add `OpsgenieDeleteAlertOperator` (#23405)

* Add `OpsgenieDeleteAlertOperator`

9 days agoFix cassandra to 3.0.25 (#23522)
Jarek Potiuk [Fri, 6 May 2022 10:25:45 +0000 (12:25 +0200)] 
Fix cassandra to 3.0.25 (#23522)

Pin Cassandra to 3.0.25, as the latest 3.0 release (3.0.26) does not start cleanly.

9 days agoMove tests command in new breeze (#23445)
Joppe Vos [Fri, 6 May 2022 09:03:05 +0000 (11:03 +0200)] 
Move tests command in new breeze (#23445)

10 days agoExpand/collapse all groups (#23487)
Brent Bovenzi [Thu, 5 May 2022 18:20:22 +0000 (14:20 -0400)] 
Expand/collapse all groups (#23487)

* Add expand/collapse all groups button to Grid

* add tests

* add comments

* Switch to 2 icon buttons

Disable buttons if all groups are expanded or collapsed

* Update localStorage key

10 days agoReplace DummyOperator references in docs (#23502)
Leah E. Cole [Thu, 5 May 2022 15:26:14 +0000 (11:26 -0400)] 
Replace DummyOperator references in docs (#23502)

10 days agoChanged word 'the' instead 'his' (#23493)
Edith Puclla [Thu, 5 May 2022 15:06:35 +0000 (10:06 -0500)] 
Changed word 'the' instead 'his' (#23493)

10 days agoUse kubernetes queue in kubernetes hybrid executors (#23048)
Tanel Kiis [Thu, 5 May 2022 10:23:18 +0000 (13:23 +0300)] 
Use kubernetes queue in kubernetes hybrid executors (#23048)

When using "hybrid" executors (`CeleryKubernetesExecutor` or `LocalKubernetesExecutor`),
the `clear_not_launched_queued_tasks` mechanism in the `KubernetesExecutor` can
reset queued tasks that were given to the other executor.

`KubernetesExecutor` should limit itself to the configured queue when working in
"hybrid" mode.
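A minimal sketch of the idea, using plain Python objects rather than Airflow's models: when a Kubernetes queue is configured, only not-yet-launched tasks on that queue are candidates for a reset. The `QueuedTask` class, the `tasks_to_reset` function and its `kubernetes_queue` parameter are hypothetical names chosen for illustration, not the executor's actual API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueuedTask:
    # Hypothetical stand-in for a queued task instance.
    task_id: str
    queue: str
    launched: bool


def tasks_to_reset(
    queued: List[QueuedTask], kubernetes_queue: Optional[str] = None
) -> List[QueuedTask]:
    """Pick queued tasks that were never launched and may be reset."""
    candidates = [task for task in queued if not task.launched]
    if kubernetes_queue is not None:
        # Hybrid mode: leave tasks that belong to the other executor alone.
        candidates = [task for task in candidates if task.queue == kubernetes_queue]
    return candidates


queued = [
    QueuedTask("a", queue="kubernetes", launched=False),
    QueuedTask("b", queue="default", launched=False),  # other executor's task
    QueuedTask("c", queue="kubernetes", launched=True),
]
standalone_ids = [task.task_id for task in tasks_to_reset(queued)]
hybrid_ids = [task.task_id for task in tasks_to_reset(queued, "kubernetes")]
```

Without the queue filter, task `b` would also be picked up, which is the kind of cross-executor reset the commit message describes.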