airflow.git
6 weeks agoPrepare provider documentation 2022.05.11 (#23631) providers-amazon/3.4.0 providers-amazon/3.4.0rc1 providers-apache-beam/3.4.0 providers-apache-beam/3.4.0rc1 providers-apache-hive/2.3.3 providers-apache-hive/2.3.3rc1 providers-cncf-kubernetes/4.0.2 providers-cncf-kubernetes/4.0.2rc1 providers-databricks/2.7.0 providers-databricks/2.7.0rc1 providers-docker/2.7.0 providers-docker/2.7.0rc1 providers-google/7.0.0 providers-google/7.0.0rc1 providers-jira/2.0.5 providers-jira/2.0.5rc1 providers-microsoft-azure/3.9.0 providers-microsoft-azure/3.9.0rc1 providers-opsgenie/3.1.0 providers-opsgenie/3.1.0rc1 providers-presto/2.2.1 providers-presto/2.2.1rc1 providers-salesforce/3.4.4 providers-salesforce/3.4.4rc1 providers-snowflake/2.7.0 providers-snowflake/2.7.0rc1 providers-ssh/2.4.4 providers-ssh/2.4.4rc1 providers-tableau/2.1.8 providers-tableau/2.1.8rc1 providers-trino/2.3.0 providers-trino/2.3.0rc1
Jarek Potiuk [Wed, 11 May 2022 23:01:16 +0000 (01:01 +0200)] 
Prepare provider documentation 2022.05.11 (#23631)

Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
6 weeks agoRevert "Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497...
Jarek Potiuk [Wed, 11 May 2022 22:20:02 +0000 (00:20 +0200)] 
Revert "Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)" (#23656)

This reverts commit ee342b85b97649e2e29fcf83f439279b68f1b4d4.

7 weeks agoRename cluster_policy to task_policy (#23468)
humit [Wed, 11 May 2022 19:40:10 +0000 (04:40 +0900)] 
Rename cluster_policy to task_policy (#23468)

* Rename cluster_policy to task_policy

* rename task_policy as example_task_policy.

7 weeks ago[FEATURE] google provider - BigQueryInsertJobOperator log query (#23648)
raphaelauv [Wed, 11 May 2022 19:28:19 +0000 (21:28 +0200)] 
[FEATURE] google provider - BigQueryInsertJobOperator log query (#23648)

7 weeks agoFix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)
Sebastian Chamena [Wed, 11 May 2022 19:20:49 +0000 (12:20 -0700)] 
Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)

7 weeks agoFixed test and remove pytest.mark.xfail for test_exc_tb (#23650)
Kanthi [Wed, 11 May 2022 19:16:49 +0000 (15:16 -0400)] 
Fixed test and remove pytest.mark.xfail for test_exc_tb (#23650)

7 weeks agoAdded kubernetes version (1.24) in README.md(for Main version(dev)), … (#23649)
Kanthi [Wed, 11 May 2022 17:13:01 +0000 (13:13 -0400)] 
Added kubernetes version (1.24) in README.md(for Main version(dev)), … (#23649)

* Added Kubernetes version (1.24) in README.md (for Main version (dev)), accidentally removed in a merge conflict.

* Update README.md

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
7 weeks agoAdd `RedshiftDeleteClusterOperator` support (#23563)
pankajastro [Wed, 11 May 2022 17:07:01 +0000 (22:37 +0530)] 
Add `RedshiftDeleteClusterOperator` support (#23563)

Add support for `RedshiftDeleteClusterOperator`. This will help to clean up resources using Airflow operators when needed. In the current implementation, the operator by default waits until the cluster is completely removed. To return immediately without waiting, set the `wait_for_completion` param to False.

- Add operator class
- Add basic unit test
- Add an example task
- Add relevant documentation

7 weeks agoAdded postgres 14 to support versions(including breeze) (#23506)
Kanthi [Wed, 11 May 2022 16:26:19 +0000 (12:26 -0400)] 
Added postgres 14 to support versions(including breeze) (#23506)

* Added postgres 14 to support versions(including breeze)

7 weeks agoDon't run pre-migration checks for downgrade (#23634)
Daniel Standish [Wed, 11 May 2022 16:08:06 +0000 (09:08 -0700)] 
Don't run pre-migration checks for downgrade (#23634)

These checks only make sense for upgrades.  Generally they exist to resolve referential integrity issues etc. before adding constraints.  In the downgrade context, we generally only remove constraints, so it's a non-issue.

7 weeks agoAdd index for event column in log table (#23625)
Gabriel Machado [Wed, 11 May 2022 14:45:33 +0000 (16:45 +0200)] 
Add index for event column in log table (#23625)

7 weeks agoSimplify flash message for _airflow_moved tables (#23635)
Daniel Standish [Wed, 11 May 2022 14:13:57 +0000 (07:13 -0700)] 
Simplify flash message for _airflow_moved tables (#23635)

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
7 weeks agoFix assuming "Feature" answer on CI when generating docs (#23640)
Jarek Potiuk [Wed, 11 May 2022 11:15:22 +0000 (13:15 +0200)] 
Fix assuming "Feature" answer on CI when generating docs (#23640)

We now have different answers possible when generating docs, and
for testing we assume we answered randomly during the generation
of documentation.

7 weeks agoFix typo issue (#23633)
humit [Wed, 11 May 2022 10:58:26 +0000 (19:58 +0900)] 
Fix typo issue (#23633)

7 weeks ago[FEATURE] add K8S 1.24 support (#23637)
raphaelauv [Wed, 11 May 2022 10:52:24 +0000 (12:52 +0200)] 
[FEATURE] add K8S 1.24 support (#23637)

7 weeks ago[FEATURE] update K8S-KIND to 0.13.0 (#23636)
raphaelauv [Wed, 11 May 2022 08:26:14 +0000 (10:26 +0200)] 
[FEATURE] update K8S-KIND to 0.13.0 (#23636)

7 weeks agoPrevent KubernetesJobWatcher getting stuck on resource too old (#23521)
Ruben Laguna [Wed, 11 May 2022 06:25:49 +0000 (08:25 +0200)] 
Prevent KubernetesJobWatcher getting stuck on resource too old (#23521)

* Prevent KubernetesJobWatcher getting stuck on resource too old

If the watch fails because of "resource too old", the
KubernetesJobWatcher should not retry with the same resource version,
as that will end up in a loop where there is no progress.

* Reset ResourceVersion().resource_version to 0
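
The retry behaviour described above can be sketched as follows. This is a minimal illustration, not the actual KubernetesJobWatcher code: `ResourceTooOldError`, `run_watch_loop` and `fake_watch` are hypothetical names, and the real watcher talks to the Kubernetes API rather than to a plain callable.

```python
class ResourceTooOldError(Exception):
    """Hypothetical stand-in for a "resource too old" watch failure."""


def run_watch_loop(watch_once, resource_version, max_attempts=5):
    """Call watch_once(resource_version) until it succeeds.

    watch_once is a hypothetical callable that returns the last resource
    version it observed, or raises ResourceTooOldError.
    """
    for _ in range(max_attempts):
        try:
            return watch_once(resource_version)
        except ResourceTooOldError:
            # Retrying with the same stale version would fail forever,
            # so reset to "0" and let the next attempt start from scratch.
            resource_version = "0"
    raise RuntimeError("watch did not recover")


def fake_watch(resource_version):
    """Simulated watch that rejects every version except "0"."""
    if resource_version != "0":
        raise ResourceTooOldError(resource_version)
    return "67890"


last_seen = run_watch_loop(fake_watch, "12345")
print(last_seen)
```

Without the reset in the `except` branch, every attempt would raise the same error and the loop would make no progress.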

7 weeks agoMake provider doc preparation a bit more fun :) (#23629)
Jarek Potiuk [Tue, 10 May 2022 22:19:54 +0000 (00:19 +0200)] 
Make provider doc preparation a bit more fun :) (#23629)

Previously you had to manually add versions when the changelog was
modified. But why not have a bit more fun and get the versions
bumped automatically based on your assessment when reviewing the
providers, rather than after looking at the generated changelog?

7 weeks agoFix: Exception when parsing log #20966 (#23301)
Jakub Novák [Tue, 10 May 2022 20:43:25 +0000 (22:43 +0200)] 
Fix: Exception when parsing log #20966 (#23301)

* UnicodeDecodeError: 'utf-8' codec can't decode byte 0xXX in position X: invalid start byte

  File "/opt/work/python395/lib/python3.9/site-packages/airflow/hooks/subprocess.py", line 89, in run_command
    line = raw_line.decode(output_encoding).rstrip()            # raw_line ==  b'\x00\x00\x00\x11\xa9\x01\n'
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 4: invalid start byte

* Update subprocess.py

* Update subprocess.py

* Fix:  Exception when parsing log #20966

* Fix:  Exception when parsing log #20966

 Another alternative is: try-catch it.

e.g.

```
            line = ''
            for raw_line in iter(self.sub_process.stdout.readline, b''):
                try:
                    line = raw_line.decode(output_encoding).rstrip()
                except UnicodeDecodeError as err:
                    print(err, output_encoding, raw_line)
                self.log.info("%s", line)
```

* Create test_subprocess.sh

* Update test_subprocess.py

* Added shell directive and license to test_subprocess.sh

* Distinguish between raw and decoded lines as suggested by @uranusjr

* simplify test
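
One way to avoid the `UnicodeDecodeError` shown in the traceback above is to decode with a lenient error handler instead of the default strict one. The sketch below is an illustration under that assumption and is not necessarily the exact change made in `subprocess.py`; `decode_line` is a hypothetical helper name.

```python
def decode_line(raw_line: bytes, output_encoding: str = "utf-8") -> str:
    """Decode one line of subprocess output without raising on bad bytes.

    Bytes that are invalid in output_encoding are rendered as backslash
    escapes (for example \\xa9) instead of raising UnicodeDecodeError.
    """
    return raw_line.decode(output_encoding, errors="backslashreplace").rstrip()


# The raw line from the traceback above no longer raises.
print(repr(decode_line(b"\x00\x00\x00\x11\xa9\x01\n")))
```

Compared with the try/except alternative, this keeps every line in the log, with the undecodable bytes still visible in escaped form.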

Co-authored-by: muhua <microhuang@live.com>
7 weeks agoImplement send_callback method for CeleryKubernetesExecutor and LocalKubernetesExecut...
mhenc [Tue, 10 May 2022 17:13:00 +0000 (19:13 +0200)] 
Implement send_callback method for CeleryKubernetesExecutor and LocalKubernetesExecutor (#23617)

7 weeks ago[FEATURE] google provider - split GkeStartPodOperator execute (#23518)
raphaelauv [Tue, 10 May 2022 15:51:37 +0000 (17:51 +0200)] 
[FEATURE] google provider - split GkeStartPodOperator execute (#23518)

7 weeks agoFixed Kubernetes Operator large xcom content Defect (#23490)
rahulgoyal2987 [Tue, 10 May 2022 15:46:55 +0000 (21:16 +0530)] 
Fixed Kubernetes Operator large xcom content Defect  (#23490)

7 weeks agoAdd slim images to docker-stack docs index (#23601)
Jarek Potiuk [Tue, 10 May 2022 15:24:26 +0000 (17:24 +0200)] 
Add slim images to docker-stack docs index (#23601)

7 weeks agoAdd Quicksight create ingestion Hook and Operator (#21863)
Harpreet Singh [Tue, 10 May 2022 14:54:13 +0000 (20:24 +0530)] 
Add Quicksight create ingestion Hook and Operator (#21863)

* Add Quicksight create ingestion Hook and Operator

Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
7 weeks agoMake Breeze help generation indepdent from having breeze installed (#23612)
Jarek Potiuk [Tue, 10 May 2022 09:49:39 +0000 (11:49 +0200)] 
Make Breeze help generation indepdent from having breeze installed (#23612)

Generation of Breeze help requires breeze to be installed. However
if you have locally installed breeze with different dependencies
and did not run self-upgrade, the results of generation of the
images might be different (for example when different rich
version is used). This change works in the way that:
* you do not have to have breeze installed at all to make it work
* it always upgrades to latest breeze when it is not installed
* but this only happens when you actually modified some breeze code

7 weeks agoAdd exportContext.offload flag to CLOUD_SQL_EXPORT_VALIDATION. (#23614)
ishiis [Tue, 10 May 2022 09:49:18 +0000 (18:49 +0900)] 
Add exportContext.offload flag to CLOUD_SQL_EXPORT_VALIDATION. (#23614)

7 weeks agoUpdate min requirements for rich to 12.4.1 (#23604)
Jarek Potiuk [Tue, 10 May 2022 06:36:28 +0000 (08:36 +0200)] 
Update min requirements for rich to 12.4.1 (#23604)

7 weeks agoAdd sample dag and doc for S3ListPrefixesOperator (#23448)
Vincent [Mon, 9 May 2022 22:54:59 +0000 (16:54 -0600)] 
Add sample dag and doc for S3ListPrefixesOperator (#23448)

* Add sample dag and doc for S3ListPrefixesOperator

* Fix static checks

7 weeks agoAdd exception to catch single line private keys (#23043)
nsAstro [Mon, 9 May 2022 22:49:22 +0000 (18:49 -0400)] 
Add exception to catch single line private keys (#23043)

7 weeks agoUse inclusive words in apache airflow project (#23090)
Edith Puclla [Mon, 9 May 2022 21:52:29 +0000 (16:52 -0500)] 
Use inclusive words in apache airflow project (#23090)

7 weeks agoImprove caching for multi-platform images. (#23562)
Jarek Potiuk [Mon, 9 May 2022 21:02:25 +0000 (23:02 +0200)] 
Improve caching for multi-platform images. (#23562)

This is another attempt to improve caching performance for
multi-platform images, as the previous ones were undermined by a
bug in the buildx multi-platform cache-to implementation that caused
the image cache to be overwritten between platforms
when multiple images were built.

A bug has been filed for the buildx behaviour at
https://github.com/docker/buildx/issues/1044 and until it is fixed
we have to prepare separate caches for each platform and push them
to separate tags.

That adds a bit of overhead to the building step, but for now it is
the simplest way we can work around the bug if we do not want to
manually manipulate manifests and images.

7 weeks ago19943 Grid view status filters (#23392)
pierrejeambrun [Mon, 9 May 2022 20:32:02 +0000 (22:32 +0200)] 
19943 Grid view status filters (#23392)

* Move tree filtering inside react and add some filters

* Move filters from context to utils

* Fix tests for useTreeData

* Fix last tests.

* Add tests for useFilters

* Refact to use existing SimpleStatus component

* Additional fix after rebase.

* Update following bbovenzi code review

* Update following code review

* Fix tests.

* Fix page flickering issues from react-query

* Fix side panel and small changes.

* Use default_dag_run_display_number in the filter options

* Handle timezone

* Fix flaky test

Co-authored-by: Brent Bovenzi <brent.bovenzi@gmail.com>
7 weeks agoAdd sample dag and doc for S3ListOperator (#23449)
Vincent [Mon, 9 May 2022 18:21:51 +0000 (12:21 -0600)] 
Add sample dag and doc for S3ListOperator (#23449)

* Add sample dag and doc for S3ListOperator

* Fix doc

7 weeks agoHelm chart 1.6.0rc1 (#23548)
Jed Cunningham [Mon, 9 May 2022 18:14:44 +0000 (12:14 -0600)] 
Helm chart 1.6.0rc1 (#23548)

7 weeks agoAdd doc and sample dag for EC2 (#23547)
Vincent [Mon, 9 May 2022 17:56:50 +0000 (11:56 -0600)] 
Add doc and sample dag for EC2 (#23547)

7 weeks agoApply specific ID collation to root_dag_id too (#23536)
Michael Peteuil [Mon, 9 May 2022 17:48:11 +0000 (13:48 -0400)] 
Apply specific ID collation to root_dag_id too (#23536)

In certain databases there is a need to set the collation for ID fields
like dag_id or task_id to something different than the database default.
This is because in MySQL with utf8mb4 the index size becomes too big for
the MySQL limits. In past pull requests this was handled
[#7570](https://github.com/apache/airflow/pull/7570),
[#17729](https://github.com/apache/airflow/pull/17729), but the
root_dag_id field on the dag model was missed. Since this field is used
to join with the dag_id in various other models ([and
self-referentially](https://github.com/apache/airflow/blob/451c7cbc42a83a180c4362693508ed33dd1d1dab/airflow/models/dag.py#L2766)),
it also needs to have the same collation as other ID fields.

This can be seen by running `airflow db reset` before and after applying
this change while also specifying `sql_engine_collation_for_ids` in the
configuration.

Other related PRs
[#19408](https://github.com/apache/airflow/pull/19408)

7 weeks agoClean up in-line f-string concatenation (#23591)
Josh Fell [Mon, 9 May 2022 17:44:41 +0000 (13:44 -0400)] 
Clean up in-line f-string concatenation (#23591)

7 weeks agoUpdate sample dag and doc for Datasync (#23511)
Vincent [Mon, 9 May 2022 17:40:27 +0000 (11:40 -0600)] 
Update sample dag and doc for Datasync (#23511)

7 weeks agoAdd default 'aws_conn_id' to SageMaker Operators #21808 (#23515)
Harpreet Singh [Mon, 9 May 2022 17:36:35 +0000 (23:06 +0530)] 
Add default 'aws_conn_id' to SageMaker Operators #21808 (#23515)

7 weeks agoFix broken dagrun links when many runs start at the same time (#23462)
Chris Redekop [Mon, 9 May 2022 15:49:53 +0000 (09:49 -0600)] 
Fix broken dagrun links when many runs start at the same time (#23462)

* Load requested dagrun even when there are many dagruns at (almost) the same time

* Fix code formatting issues

7 weeks agoFix `PythonVirtualenvOperator` templated_fields (#23559)
eladkal [Mon, 9 May 2022 15:17:34 +0000 (18:17 +0300)] 
Fix `PythonVirtualenvOperator` templated_fields (#23559)

* Fix `PythonVirtualenvOperator` templated_fields
The `PythonVirtualenvOperator` templated_fields override `PythonOperator` templated_fields which caused functionality not to work as expected.
fixes: https://github.com/apache/airflow/issues/23557

7 weeks agoPools with negative open slots should not block other pools (#23143)
Tanel Kiis [Mon, 9 May 2022 15:12:40 +0000 (18:12 +0300)] 
Pools with negative open slots should not block other pools (#23143)

7 weeks agoAdd `device_requests` parameter to `DockerOperator` (#23554)
eladkal [Mon, 9 May 2022 15:08:15 +0000 (18:08 +0300)] 
Add `device_requests` parameter to `DockerOperator` (#23554)

* Expose device_requests to DockerOperator

Co-authored-by: Tedi Papajorgji <tedi.papajorgji@hotmail.com>
7 weeks agoFix scheduler crash when expanding with mapped task that returned none (#23486)
Ephraim Anierobi [Mon, 9 May 2022 12:44:35 +0000 (13:44 +0100)] 
Fix scheduler crash when expanding with mapped task that returned none (#23486)

When a task is expanded from a mapped task that returned no value, it
crashes the scheduler. This PR fixes it by first checking whether there
is a return value from the mapped task; if there is none, the task
itself errors instead of crashing the scheduler.

7 weeks agoAdd support for queued state in DagRun update endpoint. (#23481)
Karthikeyan Singaravelan [Mon, 9 May 2022 12:25:48 +0000 (17:55 +0530)] 
Add support for queued state in DagRun update endpoint. (#23481)

7 weeks agoFixed option name in Breeze description (#23582)
Jarek Potiuk [Mon, 9 May 2022 10:15:43 +0000 (12:15 +0200)] 
Fixed option name in Breeze description (#23582)

7 weeks agotHe output of commands of Breeze are only generated when they change (#23570)
Jarek Potiuk [Mon, 9 May 2022 09:59:11 +0000 (11:59 +0200)] 
tHe output of commands of Breeze are only generated when they change (#23570)

Previously we generated output of all the commands from Breeze always,
hoping that they will be the same, but rich already had two changes
in the format of the SVG files which made the output different and
breaking our PRs.

Temporarily we pinned rich to fix the output, but a better solution is
to get the hash of all the configuration options and see if it changed,
and only run generation when it did. This way we keep automated
generation on pre-commit but we are protected from accidental change
of the output.

We also removed the rich limits and regenerated all svg files to ones
generated by 12.4.0. We also found a way to check in pre-commit whether
generation should run at all, without installing breeze first.
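
The "hash the options and regenerate only on change" idea can be sketched like this. It is a simplified illustration, not the Breeze implementation: `config_hash`, `needs_regeneration`, the option dictionary and the hash file location are all assumptions made for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def config_hash(options: dict) -> str:
    """Return a stable hash of the command configuration options."""
    payload = json.dumps(options, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def needs_regeneration(options: dict, hash_file: Path) -> bool:
    """Return True (and store the new hash) only when the options changed."""
    new_hash = config_hash(options)
    old_hash = hash_file.read_text().strip() if hash_file.exists() else None
    if new_hash == old_hash:
        return False
    hash_file.write_text(new_hash + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp_dir:
    stored = Path(tmp_dir) / "commands.hash"
    print(needs_regeneration({"verbose": False}, stored))  # first run
    print(needs_regeneration({"verbose": False}, stored))  # unchanged options
```

Because the decision depends only on the options, a change in how the rendering library formats its output no longer alters the generated files by accident.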

Fixes: #22908

7 weeks agoFix dag-processor fetch metabase config (#23575)
Andrey Anshin [Mon, 9 May 2022 08:50:33 +0000 (11:50 +0300)] 
Fix dag-processor fetch metabase config (#23575)

7 weeks agoUpdate dags.rst (#23579)
mthakare-onshape [Mon, 9 May 2022 08:17:12 +0000 (13:47 +0530)] 
Update dags.rst (#23579)

Update missing bracket

7 weeks agoTemporarily pin xmltodict to 0.12.0 to fix main failure (#23577)
Jarek Potiuk [Mon, 9 May 2022 05:41:06 +0000 (07:41 +0200)] 
Temporarily pin xmltodict to 0.12.0 to fix main failure (#23577)

The xmltodict 0.13.0 release breaks some tests and is likely buggy,
as the error is ValueError: Malformatted input.

We pin it to 0.12.0 to fix the failing main build.

Related: #23576

7 weeks agoFix conn close error on retrieving log events (#23470)
thinhnd2104 [Sun, 8 May 2022 22:32:09 +0000 (05:32 +0700)] 
Fix conn close error on retrieving log events (#23470)

related: [#23469](https://github.com/apache/airflow/issues/23469).

7 weeks agoFix `PostgresToGCSOperator` does not allow nested JSON (#23063)
pierrejeambrun [Sun, 8 May 2022 22:06:23 +0000 (00:06 +0200)] 
Fix `PostgresToGCSOperator` does not allow nested JSON (#23063)

* Avoid double json.dumps for json data export in PostgresToGCSOperator.

* Fix CI

7 weeks agoOpsgenie: Fix `close_alert` to properly send `kwargs` (#23442)
Benoit Person [Sun, 8 May 2022 21:38:50 +0000 (23:38 +0200)] 
Opsgenie: Fix `close_alert` to properly send `kwargs` (#23442)

7 weeks agoAmazon Sagemaker Sample DAG and docs update (#23256)
D. Ferruzzi [Sun, 8 May 2022 21:37:51 +0000 (14:37 -0700)] 
Amazon Sagemaker Sample DAG and docs update (#23256)

7 weeks agowasb hook: user defaultAzureCredentials instead of managedIdentity (#23394)
sanjayp [Sun, 8 May 2022 21:12:26 +0000 (02:42 +0530)] 
wasb hook: user defaultAzureCredentials instead of managedIdentity (#23394)

Co-authored-by: Sanjay Pillai <sanjaypillai11 [at] gmail.com>
7 weeks agoMove dag_processing.processor_timeouts to counters section (#23393)
Yeachan Park [Sun, 8 May 2022 21:11:51 +0000 (23:11 +0200)] 
Move dag_processing.processor_timeouts to counters section (#23393)

7 weeks agoFix GCSToGCSOperator ignores replace parameter when there is no wildcard (#23340)
GitStart-AirFlow [Sun, 8 May 2022 19:46:55 +0000 (20:46 +0100)] 
Fix GCSToGCSOperator ignores replace parameter when there is no wildcard (#23340)

7 weeks agoTests for provider code structure (#23351)
Bartłomiej Hirsz [Sun, 8 May 2022 19:32:26 +0000 (21:32 +0200)] 
Tests for provider code structure (#23351)

Improved test for code structure that can be re-used among various providers.

7 weeks agoAdd slim images to release process (#23391)
Jarek Potiuk [Sat, 7 May 2022 21:53:11 +0000 (23:53 +0200)] 
Add slim images to release process (#23391)

This PR adds slim images to release process of Airflow.

Those images are small as they do not contain any extras.

Fixes: #20849

7 weeks agoFix _PIP_ADDITIONAL_REQUIREMENTS case for docker-compose (#23517)
Jarek Potiuk [Sat, 7 May 2022 14:17:48 +0000 (16:17 +0200)] 
Fix _PIP_ADDITIONAL_REQUIREMENTS case for docker-compose (#23517)

Recent versions of Airflow do not allow running `pip install` as
root, but the `init` job runs as root, so when the variable
_PIP_ADDITIONAL_REQUIREMENTS is set, the init container fails.

This PR forces _PIP_ADDITIONAL_REQUIREMENTS to be empty for the init
job.

7 weeks agoRefactor Breeze to group related methods and classes together (#23556)
Jarek Potiuk [Sat, 7 May 2022 13:56:34 +0000 (15:56 +0200)] 
Refactor Breeze to group related methods and classes together (#23556)

This change refactors Breeze classes to more consistent approach.

* The "commands" package only contains commands
* All Parameters (BuildCi, BuildProd, BuildDoc, Shell) are now
  in "params" package
* Required/Optional Build args are now members of the
  BuildCiParams, BuildProdParams which makes the params
  much more self-contained.
* All utils are in "utils" package

This helps with avoiding circular imports (all utils are now
standalone and do not use any of the commands).

Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
7 weeks agoAdd IPV6 form of the address in cassandra status check (#23537)
Jarek Potiuk [Sat, 7 May 2022 13:36:55 +0000 (15:36 +0200)] 
Add IPV6 form of the address in cassandra status check (#23537)

This PR fixes a problem introduced in 3.0.26 of the cassandra image, which
adds square brackets around the IP address regardless of its type.

The problem was worked around by pinning cassandra to 3.0.25 in
#23522 as a quick fix, but this one introduces a permanent,
future-proof solution.

Based on discussion in https://issues.apache.org/jira/browse/CASSANDRA-17612

Fixes: #23523

7 weeks agoAdd logging in to Github Registry for breeze pull (#23551)
Jarek Potiuk [Sat, 7 May 2022 10:24:05 +0000 (12:24 +0200)] 
Add logging in to Github Registry for breeze pull (#23551)

All of the Airflow images are public in ghcr.io, but the default setting
for images is "private", so when users wanted to build CI workflows
in their forks, they had to manually change their images to public so
that the ci.yml workflow could pull the images prepared in the build-images
workflow.

This PR adds logging in for the `breeze pull` command when GITHUB_TOKEN
is available; the workflow also gets `packages: read` permissions.

This way CI should work in users' forks without any action from the
user except first-time workflow enabling.

7 weeks agoFix LocalFilesystemToS3Operator and S3CreateObjectOperator to support full s3://...
Vincent [Sat, 7 May 2022 09:19:45 +0000 (03:19 -0600)] 
Fix LocalFilesystemToS3Operator and S3CreateObjectOperator to support full s3:// style keys (#23180)

* Fix LocalFilesystemToS3Operator and S3CreateObjectOperator.
Support full s3:// style keys

* Fix spelling error
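
Supporting both a full `s3://` style key and a bucket-relative key can be sketched as follows. This is only an illustration using the standard library: `resolve_bucket_and_key` is a hypothetical helper, and raising an error when both a full URL and a bucket name are given is an assumption of this sketch rather than a statement about the operators' behaviour.

```python
from urllib.parse import urlsplit


def resolve_bucket_and_key(key, bucket_name=None):
    """Return (bucket, key) for either a full s3:// URL or a relative key."""
    parsed = urlsplit(key)
    if parsed.scheme == "s3":
        if bucket_name is not None:
            # Assumption: passing both forms at once is ambiguous.
            raise ValueError("Pass either a full s3:// key or a bucket name, not both")
        return parsed.netloc, parsed.path.lstrip("/")
    if bucket_name is None:
        raise ValueError("A relative key requires a bucket name")
    return bucket_name, key


print(resolve_bucket_and_key("s3://my-bucket/path/to/file.txt"))
print(resolve_bucket_and_key("path/to/file.txt", bucket_name="my-bucket"))
```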

7 weeks agoChange chart annotation generator to use RELEASE_NOTES (#23549)
Jed Cunningham [Sat, 7 May 2022 09:15:25 +0000 (03:15 -0600)] 
Change chart annotation generator to use RELEASE_NOTES (#23549)

7 weeks agoUpdate the Athena Sample DAG and Docs (#23428)
D. Ferruzzi [Sat, 7 May 2022 06:28:44 +0000 (23:28 -0700)] 
Update the Athena Sample DAG and Docs (#23428)

* Update the Athena Sample DAG and Docs

7 weeks agoFix accidental including of providers in airflow package (#23552)
Jarek Potiuk [Sat, 7 May 2022 06:26:04 +0000 (08:26 +0200)] 
Fix accidental including of providers in airflow package (#23552)

The change #23454 accidentally remove INSTALL_PROVIDERS_FROM_SOURCES
setting to "false" which resulted in airflow package containing all
providers. This has been caught by our tests (but it was only
visible after merging)

This PR brings the variable back.

7 weeks agoReplace `pytest.mark.xfail` in Postgres tests (#23541)
eladkal [Fri, 6 May 2022 23:51:56 +0000 (02:51 +0300)] 
Replace `pytest.mark.xfail` in Postgres tests (#23541)

7 weeks agoSeperate provider verification as standalone breeze command (#23454)
Jarek Potiuk [Fri, 6 May 2022 22:47:00 +0000 (00:47 +0200)] 
Seperate provider verification as standalone breeze command (#23454)

This is another step in simplifying and converting to Python all of
the CI/local development tooling.

This PR separates out verification of providers as a separate
breeze command `verify-provider-packages`. It was previously part of
"prepare_provider_packages.py" but it has been now
extracted to a separate in-container python file and it was
wrapped with breeze's `verify-provider-packages` command.

Provider verification is no longer run with "preparing provider docs"
nor "preparing provider packages" - it's a standalone command.

This command is also used in CI now to run the tests:

* all provider packages are built and created on CI together with
  airflow version
* the packages are installed inside the CI image and providers are
  verified
* the 2.1 version of Airflow is installed together with all
  2.1-compatible providers and provider verification is run there too.

This all is much simpler now - we got rid of some 500 lines of bash
code again in favour of breeze python code.

Fixes: #23430

7 weeks agoTrinoHook add authentication via JWT token and Impersonation (#23116)
Pragya [Fri, 6 May 2022 19:45:33 +0000 (01:15 +0530)] 
TrinoHook add authentication via JWT token and Impersonation  (#23116)

* added trino authentication via JWT token and impersonation

* added test cases for jwt verification in trino

* added documentation for trino hook

7 weeks agoUpdate docs Amazon Glacier Docs (#23372)
Niko [Fri, 6 May 2022 18:03:24 +0000 (11:03 -0700)] 
Update docs Amazon Glacier Docs (#23372)

7 weeks agoChange approach to finding bad rows to LEFT OUTER JOIN. (#23528)
Ash Berlin-Taylor [Fri, 6 May 2022 16:02:27 +0000 (17:02 +0100)] 
Change approach to finding bad rows to LEFT OUTER JOIN. (#23528)

Rather than sub-selects (two for count, or one for the CREATE TABLE).

For a _large_ database (27m TaskInstances, 2m DagRuns) this takes the
time from around 10 minutes per table (we have 3) down to around 3
minutes per table. (All times on Postgres.)

Before:

```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT
  rendered_task_instance_fields.dag_id AS dag_id,
  rendered_task_instance_fields.task_id AS task_id,
  rendered_task_instance_fields.execution_date AS execution_date,
  rendered_task_instance_fields.rendered_fields AS rendered_fields,
  rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM
  rendered_task_instance_fields
WHERE
  NOT (
    EXISTS (
      SELECT
        1
      FROM
        task_instance
        JOIN dag_run ON dag_run.dag_id = task_instance.dag_id
        AND dag_run.run_id = task_instance.run_id
      WHERE
        rendered_task_instance_fields.dag_id = task_instance.dag_id
        AND rendered_task_instance_fields.task_id = task_instance.task_id
        AND rendered_task_instance_fields.execution_date = dag_run.execution_date
    )
  )
```

After:

```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT
  rendered_task_instance_fields.dag_id AS dag_id,
  rendered_task_instance_fields.task_id AS task_id,
  rendered_task_instance_fields.execution_date AS execution_date,
  rendered_task_instance_fields.rendered_fields AS rendered_fields,
  rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM
  rendered_task_instance_fields
  LEFT OUTER JOIN dag_run ON rendered_task_instance_fields.dag_id = dag_run.dag_id
  AND rendered_task_instance_fields.execution_date = dag_run.execution_date
  LEFT OUTER JOIN task_instance ON dag_run.dag_id = task_instance.dag_id
  AND dag_run.run_id = task_instance.run_id
  AND rendered_task_instance_fields.task_id = task_instance.task_id
WHERE
  task_instance.dag_id IS NULL
  OR dag_run.dag_id IS NULL
;
```
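
The equivalence of the two query shapes can be checked on a toy dataset. The sketch below uses SQLite from the Python standard library with heavily simplified tables (only the columns used in the join conditions), so it says nothing about the Postgres timings quoted above; it only shows that the `NOT EXISTS` form and the `LEFT OUTER JOIN ... IS NULL` form select the same dangling rows when each (dag_id, execution_date) pair identifies at most one dag run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, execution_date TEXT);
    CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, run_id TEXT);
    CREATE TABLE rendered_task_instance_fields (
        dag_id TEXT, task_id TEXT, execution_date TEXT
    );
    INSERT INTO dag_run VALUES ('d', 'r1', '2022-05-01');
    INSERT INTO task_instance VALUES ('d', 't1', 'r1');
    INSERT INTO rendered_task_instance_fields VALUES
        ('d', 't1', '2022-05-01'),  -- has a task instance and a dag run
        ('d', 't2', '2022-05-01'),  -- no matching task instance
        ('d', 't1', '2022-05-02');  -- no matching dag run
    """
)

not_exists_rows = conn.execute(
    """
    SELECT r.dag_id, r.task_id, r.execution_date
    FROM rendered_task_instance_fields AS r
    WHERE NOT EXISTS (
        SELECT 1
        FROM task_instance AS ti
        JOIN dag_run AS dr ON dr.dag_id = ti.dag_id AND dr.run_id = ti.run_id
        WHERE r.dag_id = ti.dag_id
          AND r.task_id = ti.task_id
          AND r.execution_date = dr.execution_date
    )
    ORDER BY r.task_id, r.execution_date
    """
).fetchall()

left_join_rows = conn.execute(
    """
    SELECT r.dag_id, r.task_id, r.execution_date
    FROM rendered_task_instance_fields AS r
    LEFT OUTER JOIN dag_run AS dr
        ON r.dag_id = dr.dag_id AND r.execution_date = dr.execution_date
    LEFT OUTER JOIN task_instance AS ti
        ON dr.dag_id = ti.dag_id AND dr.run_id = ti.run_id
        AND r.task_id = ti.task_id
    WHERE ti.dag_id IS NULL OR dr.dag_id IS NULL
    ORDER BY r.task_id, r.execution_date
    """
).fetchall()

# Both queries return the two dangling rows and skip the valid one.
print(not_exists_rows)
print(left_join_rows)
```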

7 weeks agoOnly count bad refs when `moved` table exists (#23491)
Daniel Standish [Fri, 6 May 2022 12:42:22 +0000 (05:42 -0700)] 
Only count bad refs when `moved` table exists (#23491)

This keeps the logic to fail without upgrading when (A) there are bad rows and
(B) the "moved" table already exists. But we optimize so that we don't count
the bad rows unless the "moved" table is there. Previously we counted always,
but the first time a user attempts upgrade, the tables won't be there so
there's no point in counting.

Instead what we do is skip right to the CTAS, creating the _airflow_moved
tables. If there aren't any rows in the "moved" table, then we delete the table
immediately.

Also included here is a delete optimization, where we join to the moved table
instead of running the not exists query again.
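
The "create the moved table first, drop it if empty, otherwise delete by referring to it" flow can be sketched with SQLite. This is a simplified illustration with hypothetical table names (`task`, `dag_run`, `_moved_task`) and does not reproduce the actual migration code, which also fails early when a moved table already exists alongside bad rows.

```python
import sqlite3


def table_exists(conn, name):
    """Return True if a table with this name exists in the SQLite database."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def move_dangling_tasks(conn):
    """Copy rows with no parent dag_run aside, then remove them from task."""
    # Skip the separate count query: go straight to CREATE TABLE ... AS SELECT.
    conn.execute(
        "CREATE TABLE _moved_task AS "
        "SELECT * FROM task WHERE dag_run_id NOT IN (SELECT id FROM dag_run)"
    )
    (moved_count,) = conn.execute("SELECT COUNT(*) FROM _moved_task").fetchone()
    if moved_count == 0:
        # Nothing was dangling, so the empty table is dropped immediately.
        conn.execute("DROP TABLE _moved_task")
    else:
        # Delete via the moved table instead of re-running the dangling check.
        conn.execute("DELETE FROM task WHERE id IN (SELECT id FROM _moved_task)")
    return moved_count


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_run (id INTEGER PRIMARY KEY);
    CREATE TABLE task (id INTEGER PRIMARY KEY, dag_run_id INTEGER);
    INSERT INTO dag_run VALUES (10);
    INSERT INTO task VALUES (1, 10), (2, 99);
    """
)
print(move_dangling_tasks(conn))
print(conn.execute("SELECT id, dag_run_id FROM task").fetchall())
```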

Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
7 weeks agoAdd `OpsgenieDeleteAlertOperator` (#23405)
eladkal [Fri, 6 May 2022 11:37:03 +0000 (14:37 +0300)] 
Add `OpsgenieDeleteAlertOperator` (#23405)

* Add `OpsgenieDeleteAlertOperator`

7 weeks agoFix cassandra to 3.0.25 (#23522)
Jarek Potiuk [Fri, 6 May 2022 10:25:45 +0000 (12:25 +0200)] 
Fix cassandra to 3.0.25 (#23522)

fix cassandra to 3.0.25 as latest 3.0 (3.0.26) does not start cleanly

7 weeks agoMove tests command in new breeze (#23445)
Joppe Vos [Fri, 6 May 2022 09:03:05 +0000 (11:03 +0200)] 
Move tests command in new breeze (#23445)

7 weeks agoExpand/collapse all groups (#23487)
Brent Bovenzi [Thu, 5 May 2022 18:20:22 +0000 (14:20 -0400)] 
Expand/collapse all groups (#23487)

* Add expand/collapse all groups button to Grid

* add tests

* add comments

* Switch to 2 icon buttons

Disable buttons if all groups are expanded or collapsed

* Update localStorage key

7 weeks agoReplace DummyOperator references in docs (#23502)
Leah E. Cole [Thu, 5 May 2022 15:26:14 +0000 (11:26 -0400)] 
Replace DummyOperator references in docs (#23502)

7 weeks agoChanged word 'the' instead 'his' (#23493)
Edith Puclla [Thu, 5 May 2022 15:06:35 +0000 (10:06 -0500)] 
Changed word 'the' instead 'his' (#23493)

7 weeks agoUse kubernetes queue in kubernetes hybrid executors (#23048)
Tanel Kiis [Thu, 5 May 2022 10:23:18 +0000 (13:23 +0300)] 
Use kubernetes queue in kubernetes hybrid executors (#23048)

When using "hybrid" executors (`CeleryKubernetesExecutor` or `LocalKubernetesExecutor`),
the `clear_not_launched_queued_tasks` mechanism in the `KubernetesExecutor` can
reset the queued tasks that were given to the other executor.

`KubernetesExecutor` should limit itself to the configured queue when working in the
"hybrid" mode.

7 weeks agoAdd Stackdriver assets and migrate system tests to AIP-47 (#23320)
Bartłomiej Hirsz [Wed, 4 May 2022 21:49:58 +0000 (23:49 +0200)] 
Add Stackdriver assets and migrate system tests to AIP-47 (#23320)

Change-Id: I6f751e6576f57a89a5145aeb05f506da8a22b379

Co-authored-by: Bartlomiej Hirsz <bartomiejh@google.com>
7 weeks agoAdd doc and example dag for Amazon SQS Operators (#23312)
Niko [Wed, 4 May 2022 21:45:38 +0000 (14:45 -0700)] 
Add doc and example dag for Amazon SQS Operators (#23312)

7 weeks agoAdds resultBackendSecretName warning in Helm production docs (#23307)
Rafael Passos [Wed, 4 May 2022 21:28:15 +0000 (18:28 -0300)] 
Adds resultBackendSecretName warning in Helm production docs (#23307)

7 weeks agoCloudTasks assets & system tests migration (AIP-47) (#23282)
Bartłomiej Hirsz [Wed, 4 May 2022 20:42:45 +0000 (22:42 +0200)] 
CloudTasks assets & system tests migration (AIP-47) (#23282)

7 weeks agoAdd support for timezone as string in cron interval timetable (#23279)
Malthe Borch [Wed, 4 May 2022 20:41:34 +0000 (20:41 +0000)] 
Add support for timezone as string in cron interval timetable (#23279)

7 weeks agoTextToSpeech assets & system tests migration (AIP-47) (#23247)
Bartłomiej Hirsz [Wed, 4 May 2022 20:40:00 +0000 (22:40 +0200)] 
TextToSpeech assets & system tests migration (AIP-47) (#23247)

8 weeks agoFix literal cross product expansion (#23434)
Jed Cunningham [Wed, 4 May 2022 19:02:09 +0000 (13:02 -0600)] 
Fix literal cross product expansion (#23434)

8 weeks agoVisually distinguish task group summarys (#23488)
Brent Bovenzi [Wed, 4 May 2022 19:01:51 +0000 (15:01 -0400)] 
Visually distinguish task group summarys (#23488)

Bold task groups names and darken their bottom row border.

8 weeks agoEnsure the messages from migration job show up early (#23479)
Ash Berlin-Taylor [Wed, 4 May 2022 15:52:42 +0000 (16:52 +0100)] 
Ensure the messages from migration job show up early (#23479)

The default for python is to buffer stdout, which means that log lines
might not show up in the output straight away (until a certain number of
lines or number of bytes of output have been written) -- this is
especially problematic if the pre-migration checks take a long time, as
it makes it look like it has hung
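
The buffering effect described above can be avoided in a few ways; the sketch below shows one of them, flushing explicitly after each message. It is an illustration only and not necessarily what this commit changed (setting the `PYTHONUNBUFFERED` environment variable or running `python -u` are other standard options); `report` is a hypothetical helper name.

```python
import sys


def report(message, stream=None):
    """Write a progress message and flush it so it is visible immediately."""
    stream = sys.stdout if stream is None else stream
    # flush=True pushes the line out even when stdout is block-buffered,
    # for example when the output goes to a pipe or a container log.
    print(message, file=stream, flush=True)


report("Running pre-migration checks, this may take a while...")
```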

8 weeks agoRemove color change for highly nested groups (#23482)
Brent Bovenzi [Wed, 4 May 2022 15:30:58 +0000 (11:30 -0400)] 
Remove color change for highly nested groups (#23482)

8 weeks agoRemove remaining Python3.6 references (#23474)
Ephraim Anierobi [Wed, 4 May 2022 14:23:00 +0000 (15:23 +0100)] 
Remove remaining Python3.6 references (#23474)

8 weeks agoAdd Python 3.10 trove classifier (#23464)
Jed Cunningham [Wed, 4 May 2022 00:12:47 +0000 (18:12 -0600)] 
Add Python 3.10 trove classifier (#23464)

8 weeks agoBump pre-commit hook versions (#22887)
Kamil Breguła [Tue, 3 May 2022 22:37:30 +0000 (00:37 +0200)] 
Bump pre-commit hook versions (#22887)

8 weeks agoMove non-opencontainer labeling of the image to breeze from Dockerfile (#23379)
Jarek Potiuk [Tue, 3 May 2022 22:15:52 +0000 (00:15 +0200)] 
Move non-opencontainer labeling of the image to breeze from Dockerfile (#23379)

* Extract "extra" labeling of the image to breeze from Dockerfile

Fixes: #21046

* Add more ArtifactHub-specific labels

Co-authored-by: Kamil Breguła <kamilbregula@apache.org>
8 weeks agoShow warning if '/' is used in a DAG run ID (#23106)
Tzu-ping Chung [Tue, 3 May 2022 21:22:12 +0000 (15:22 -0600)] 
Show warning if '/' is used in a DAG run ID (#23106)

8 weeks agoUnify approach for user questions asked in Breeze (#23335)
Jarek Potiuk [Tue, 3 May 2022 20:22:42 +0000 (22:22 +0200)] 
Unify approach for user questions asked in Breeze (#23335)

This change documents and unifies the approach we've taken for
the user input handling when it comes to confirmation questions.

8 weeks agoDocs: Python 3.10 is now supported (#23457)
Jed Cunningham [Tue, 3 May 2022 19:00:50 +0000 (13:00 -0600)] 
Docs: Python 3.10 is now supported (#23457)

8 weeks agoOptimize 2.3.0 pre-upgrade check queries (#23458)
Daniel Standish [Tue, 3 May 2022 18:50:23 +0000 (11:50 -0700)] 
Optimize 2.3.0 pre-upgrade check queries (#23458)

We have to check for rows that are missing either corresponding TI or DR and move them out of table before adding FKs.  We were doing correlation in the JOIN condition but it appears postgres does *not* like this so here we move correlation to WHERE.

8 weeks agoFix `check_files.py` to work on new minor releases (#23287)
Jed Cunningham [Tue, 3 May 2022 18:33:05 +0000 (12:33 -0600)] 
Fix `check_files.py` to work on new minor releases (#23287)

8 weeks agoSupport annotations on volumeClaimTemplates (#23433)
Jed Cunningham [Tue, 3 May 2022 17:17:33 +0000 (11:17 -0600)] 
Support annotations on volumeClaimTemplates (#23433)