Kamil Breguła [Sun, 15 May 2022 09:52:51 +0000 (11:52 +0200)]
Add slim image to docs/docker-stack/README.md (#23710)
Brent Bovenzi [Fri, 13 May 2022 14:58:28 +0000 (10:58 -0400)]
Add UI tests for /utils and /components (#23456)
* Add UI tests for /utils and /components
* add test for Table
* Address PR feedback
* Fix window prompt var
* Fix TaskName test from rebase
* fix lint errors
Jarek Potiuk [Fri, 13 May 2022 13:16:33 +0000 (15:16 +0200)]
Add environment check and build image check for more Breeze commands (#23687)
Several Breeze commands depend on docker and docker compose
being available, as well as on the breeze image. They will work
fine if you "just" built the image, but they might benefit
from the image being rebuilt (to make sure all the latest
dependencies are installed in the image). The common checks
done in the "shell" command for that are now extracted to common
utils and run as the first thing in those commands that need them.
Jarek Potiuk [Fri, 13 May 2022 11:33:17 +0000 (13:33 +0200)]
Clarify that bundle extras should not be used for PyPI installs (#23697)
The bundle extras we have are only used for development and they
should not be used to install airflow from PyPI. This update
to documentation clarifies it.
Closes: #23692
Jarek Potiuk [Fri, 13 May 2022 11:00:39 +0000 (13:00 +0200)]
Fix property name in breeze Shell Params (#23696)
The rename from #23562 missed a few shell_parms usages where it
should also have been replaced.
Jarek Potiuk [Fri, 13 May 2022 10:21:36 +0000 (12:21 +0200)]
Disable Flower by default from docker-compose (#23685)
akolar-db [Fri, 13 May 2022 09:56:13 +0000 (11:56 +0200)]
Add git_source to DatabricksSubmitRunOperator (#23620)
The existing `DatabricksSubmitRunOperator` is extended with the support for the `git_source` parameter which allows users to run notebook tasks from files committed to git repositories.
If specified, any notebook task that is part of the payload will clone the repository and check out the commit, tag, or the tip of the specified branch. This is an alternative to dev repos ([docs](https://docs.databricks.com/repos/index.html)) where the checkout/update would have to be triggered manually.
Public documentation for the feature is available here: https://docs.databricks.com/dev-tools/api/latest/jobs.html (NB: as noted in the docs, the feature is currently in public preview).
Ping Zhang [Thu, 12 May 2022 21:49:06 +0000 (14:49 -0700)]
Use func.count to count rows (#23657)
Vincent [Thu, 12 May 2022 20:19:35 +0000 (14:19 -0600)]
Update doc and sample dag for Quicksight (#23653)
Brent Bovenzi [Thu, 12 May 2022 19:48:31 +0000 (15:48 -0400)]
Fix expand/collapse all buttons (#23590)
* communicate via customevents
* Handle open group logic in wrapper
* fix tests
* Make grid action buttons sticky
* Add default toggle fn
* fix splitting task id by '.'
* fix missing dagrun ids
Brent Bovenzi [Thu, 12 May 2022 19:47:24 +0000 (15:47 -0400)]
Move around overflow, position and padding (#23044)
Ping Zhang [Thu, 12 May 2022 19:01:47 +0000 (12:01 -0700)]
remove stale serialized dags (#22917)
Daniel Standish [Thu, 12 May 2022 18:46:56 +0000 (11:46 -0700)]
Shorten max pre-commit hook name length (#23677)
When names are too long, pre-commit output looks very ugly and takes up 2x lines. Here I reduce max length just a little bit further so that pre-commit output renders properly on a macbook pro 16" with terminal window splitting screen horizontally.
Jarek Potiuk [Thu, 12 May 2022 17:36:06 +0000 (19:36 +0200)]
Upgrade `pip` to latest released 22.1.0 version (#23665)
We are finally able to get rid of the annoying false-positive
warnings, and we finally have a chance of a warning-free
installation during docker builds.
Jarek Potiuk [Thu, 12 May 2022 17:30:39 +0000 (19:30 +0200)]
Replace "absolute()" with "resolve()" in pathlib objects (#23675)
TIL that absolute() is undocumented in pathlib and that we
should use resolve() instead.
So this is it.
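A minimal sketch of the difference, using only the standard library (the path below is purely illustrative and is not taken from the change itself): `absolute()` only anchors the path to the current working directory, while `resolve()` also collapses `..` components and follows symlinks.

```python
from pathlib import Path

# Illustrative relative path containing a ".." component.
relative = Path("some_dir") / ".." / "setup.py"

# absolute() prefixes the current working directory and keeps ".." as-is.
anchored = relative.absolute()

# resolve() returns an absolute path with ".." collapsed and symlinks followed.
resolved = relative.resolve()

print(anchored)
print(resolved)
```

The two results point at the same file, but only the resolved form is normalized, which matters when paths are compared or used as keys.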
Jarek Potiuk [Thu, 12 May 2022 17:23:38 +0000 (19:23 +0200)]
Add wildcard possibility to `package-filter` parameter (#23672)
The glob parameters (for example `apache-airflow-providers-*`) did
not work because only a fixed list of parameters was allowed.
This PR converts the package-filter parameter to stop verifying the
value passed - so autocomplete continues to work but you should
still be able to use glob.
It also removes a few places where the parameters were used with
the `--` separator.
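As an illustration of what glob matching on package names looks like (this is not the Breeze implementation; the package list and the `filter_packages` helper are hypothetical), the standard `fnmatch` module is enough:

```python
from fnmatch import fnmatchcase

# Hypothetical package names, for illustration only.
ALL_PACKAGES = [
    "apache-airflow",
    "apache-airflow-providers-amazon",
    "apache-airflow-providers-google",
]


def filter_packages(pattern, packages=ALL_PACKAGES):
    """Return the package names that match the shell-style glob pattern."""
    return [name for name in packages if fnmatchcase(name, pattern)]


selected = filter_packages("apache-airflow-providers-*")
print(selected)
```

A fixed list of allowed values would reject the pattern outright, which is why the value can no longer be validated against the list.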
Bartłomiej Hirsz [Thu, 12 May 2022 16:49:10 +0000 (18:49 +0200)]
Migrate Dataproc to new system tests design (#22777)
Jarek Potiuk [Thu, 12 May 2022 13:26:04 +0000 (15:26 +0200)]
Synchronize support for Postgres and K8S in docs (#23673)
We just added support for Postgres 14 and K8S 1.24 and since we
did not have any changes to support either in main we are bringing
the support to 2.3 line as well.
This documentation syncs all remaining places where it should be
updated.
ishiis [Thu, 12 May 2022 12:33:53 +0000 (21:33 +0900)]
remove `--` in `./breeze build-docs` command (#23671)
Ping Zhang [Thu, 12 May 2022 10:09:48 +0000 (03:09 -0700)]
AIP45 Remove dag parsing in airflow run local (#21877)
Jarek Potiuk [Wed, 11 May 2022 23:01:16 +0000 (01:01 +0200)]
Prepare provider documentation 2022.05.11 (#23631)
Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
Jarek Potiuk [Wed, 11 May 2022 22:20:02 +0000 (00:20 +0200)]
Revert "Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)" (#23656)
This reverts commit
ee342b85b97649e2e29fcf83f439279b68f1b4d4.
humit [Wed, 11 May 2022 19:40:10 +0000 (04:40 +0900)]
Rename cluster_policy to task_policy (#23468)
* Rename cluster_policy to task_policy
* rename task_policy as example_task_policy.
raphaelauv [Wed, 11 May 2022 19:28:19 +0000 (21:28 +0200)]
[FEATURE] google provider - BigQueryInsertJobOperator log query (#23648)
Sebastian Chamena [Wed, 11 May 2022 19:20:49 +0000 (12:20 -0700)]
Fix k8s pod.execute randomly stuck indefinitely by logs consumption (#23497) (#23618)
Kanthi [Wed, 11 May 2022 19:16:49 +0000 (15:16 -0400)]
Fixed test and remove pytest.mark.xfail for test_exc_tb (#23650)
Kanthi [Wed, 11 May 2022 17:13:01 +0000 (13:13 -0400)]
Added kubernetes version (1.24) in README.md(for Main version(dev)), … (#23649)
* Added kubernetes version (1.24) in README.md (for Main version (dev)), accidentally removed in merge conflict.
* Update README.md
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
pankajastro [Wed, 11 May 2022 17:07:01 +0000 (22:37 +0530)]
Add `RedshiftDeleteClusterOperator` support (#23563)
Add support for `RedshiftDeleteClusterOperator`. This will help to clean up resources using Airflow operators when needed. In the current implementation, the operator by default waits until the cluster is completely removed; to return immediately without waiting, set the `wait_for_completion` param to False.
- Add operator class
- Add basic unit test
- Add an example task
- Add relevant documentation
Kanthi [Wed, 11 May 2022 16:26:19 +0000 (12:26 -0400)]
Added postgres 14 to support versions(including breeze) (#23506)
* Added postgres 14 to support versions(including breeze)
Daniel Standish [Wed, 11 May 2022 16:08:06 +0000 (09:08 -0700)]
Don't run pre-migration checks for downgrade (#23634)
These checks only make sense for upgrades. Generally they exist to resolve referential integrity issues etc. before adding constraints. In the downgrade context, we generally only remove constraints, so it's a non-issue.
Gabriel Machado [Wed, 11 May 2022 14:45:33 +0000 (16:45 +0200)]
Add index for event column in log table (#23625)
Daniel Standish [Wed, 11 May 2022 14:13:57 +0000 (07:13 -0700)]
Simplify flash message for _airflow_moved tables (#23635)
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Jarek Potiuk [Wed, 11 May 2022 11:15:22 +0000 (13:15 +0200)]
Fix assuming "Feature" answer on CI when generating docs (#23640)
We now have different answers possible when generating docs, and
for testing we assume we answered randomly during the generation
of documentation.
humit [Wed, 11 May 2022 10:58:26 +0000 (19:58 +0900)]
Fix typo issue (#23633)
raphaelauv [Wed, 11 May 2022 10:52:24 +0000 (12:52 +0200)]
[FEATURE] add K8S 1.24 support (#23637)
raphaelauv [Wed, 11 May 2022 08:26:14 +0000 (10:26 +0200)]
[FEATURE] update K8S-KIND to 0.13.0 (#23636)
Ruben Laguna [Wed, 11 May 2022 06:25:49 +0000 (08:25 +0200)]
Prevent KubernetesJobWatcher getting stuck on resource too old (#23521)
* Prevent KubernetesJobWatcher getting stuck on resource too old
If the watch fails because of "resource too old", the
KubernetesJobWatcher should not retry with the same resource version,
as that will end up in a loop where there is no progress.
* Reset ResourceVersion().resource_version to 0
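The retry behaviour described above can be sketched as follows. This is a simplified model, not the KubernetesJobWatcher code: `ResourceTooOldError`, `watch_with_reset`, and `fake_watch` are hypothetical names standing in for the real watch call and its failure mode.

```python
class ResourceTooOldError(Exception):
    """Hypothetical stand-in for a 'resource too old' watch failure."""


def watch_with_reset(run_watch, start_version, max_attempts=5):
    """Call run_watch(version), resetting the version to "0" on 'too old'."""
    version = start_version
    for _ in range(max_attempts):
        try:
            return run_watch(version)
        except ResourceTooOldError:
            # Retrying with the same stale version would fail the same way,
            # so start over from "0" instead of looping without progress.
            version = "0"
    raise RuntimeError("watch did not recover")


calls = []


def fake_watch(version):
    # Simulated API: every version other than "0" is treated as too old.
    calls.append(version)
    if version != "0":
        raise ResourceTooOldError(version)
    return "watching from " + version


result = watch_with_reset(fake_watch, "12345")
print(calls, result)
```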
Jarek Potiuk [Tue, 10 May 2022 22:19:54 +0000 (00:19 +0200)]
Make provider doc preparation a bit more fun :) (#23629)
Previously you had to manually add versions when the changelog was
modified. But why not have a bit more fun and get the versions
bumped automatically based on your assessment when reviewing the
providers, rather than after looking at the generated changelog?
Jakub Novák [Tue, 10 May 2022 20:43:25 +0000 (22:43 +0200)]
Fix: Exception when parsing log #20966 (#23301)
* UnicodeDecodeError: 'utf-8' codec can't decode byte 0xXX in position X: invalid start byte
File "/opt/work/python395/lib/python3.9/site-packages/airflow/hooks/subprocess.py", line 89, in run_command
line = raw_line.decode(output_encoding).rstrip() # raw_line == b'\x00\x00\x00\x11\xa9\x01\n'
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa9 in position 4: invalid start byte
* Update subprocess.py
* Update subprocess.py
* Fix: Exception when parsing log #20966
* Fix: Exception when parsing log #20966
Another alternative is: try-catch it.
e.g.
```
line = ''
for raw_line in iter(self.sub_process.stdout.readline, b''):
    try:
        line = raw_line.decode(output_encoding).rstrip()
    except UnicodeDecodeError as err:
        print(err, output_encoding, raw_line)
    self.log.info("%s", line)
```
* Create test_subprocess.sh
* Update test_subprocess.py
* Added shell directive and license to test_subprocess.sh
* Distinguish between raw and decoded lines as suggested by @uranusjr
* simplify test
Co-authored-by: muhua <microhuang@live.com>
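One way to keep undecodable bytes from raising, shown here as a standalone sketch and not necessarily the exact change made to subprocess.py, is to pass an error handler to `bytes.decode`. The sample bytes are the ones quoted in the traceback above; `output_encoding` is set to the value from the error message.

```python
raw_line = b"\x00\x00\x00\x11\xa9\x01\n"
output_encoding = "utf-8"

# Strict decoding fails on the invalid start byte 0xa9.
try:
    raw_line.decode(output_encoding)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# "backslashreplace" keeps the line readable by escaping the bad byte
# instead of raising, so log processing can continue.
line = raw_line.decode(output_encoding, errors="backslashreplace").rstrip()
print(repr(line))
```

Compared with the try/except alternative, this never drops a line and never logs a stale value from a previous iteration.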
mhenc [Tue, 10 May 2022 17:13:00 +0000 (19:13 +0200)]
Implement send_callback method for CeleryKubernetesExecutor and LocalKubernetesExecutor (#23617)
raphaelauv [Tue, 10 May 2022 15:51:37 +0000 (17:51 +0200)]
[FEATURE] google provider - split GkeStartPodOperator execute (#23518)
rahulgoyal2987 [Tue, 10 May 2022 15:46:55 +0000 (21:16 +0530)]
Fixed Kubernetes Operator large xcom content Defect (#23490)
Jarek Potiuk [Tue, 10 May 2022 15:24:26 +0000 (17:24 +0200)]
Add slim images to docker-stack docs index (#23601)
Harpreet Singh [Tue, 10 May 2022 14:54:13 +0000 (20:24 +0530)]
Add Quicksight create ingestion Hook and Operator (#21863)
* Add Quicksight create ingestion Hook and Operator
Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
Jarek Potiuk [Tue, 10 May 2022 09:49:39 +0000 (11:49 +0200)]
Make Breeze help generation independent from having breeze installed (#23612)
Generation of Breeze help requires breeze to be installed. However
if you have locally installed breeze with different dependencies
and did not run self-upgrade, the results of generation of the
images might be different (for example when different rich
version is used). This change works in the way that:
* you do not have to have breeze installed at all to make it work
* it always upgrades to latest breeze when it is not installed
* but this only happens when you actually modified some breeze code
ishiis [Tue, 10 May 2022 09:49:18 +0000 (18:49 +0900)]
Add exportContext.offload flag to CLOUD_SQL_EXPORT_VALIDATION. (#23614)
Jarek Potiuk [Tue, 10 May 2022 06:36:28 +0000 (08:36 +0200)]
Update min requirements for rich to 12.4.1 (#23604)
Vincent [Mon, 9 May 2022 22:54:59 +0000 (16:54 -0600)]
Add sample dag and doc for S3ListPrefixesOperator (#23448)
* Add sample dag and doc for S3ListPrefixesOperator
* Fix static checks
nsAstro [Mon, 9 May 2022 22:49:22 +0000 (18:49 -0400)]
Add exception to catch single line private keys (#23043)
Edith Puclla [Mon, 9 May 2022 21:52:29 +0000 (16:52 -0500)]
Use inclusive words in apache airflow project (#23090)
Jarek Potiuk [Mon, 9 May 2022 21:02:25 +0000 (23:02 +0200)]
Improve caching for multi-platform images. (#23562)
This is another attempt to improve caching performance for
multi-platform images, as the previous ones were undermined by a
bug in the buildx multi-platform cache-to implementation that caused
the image cache to be overwritten between platforms
when multiple images were built.
A bug has been filed for the buildx behaviour at
https://github.com/docker/buildx/issues/1044 and until it is fixed
we have to prepare separate caches for each platform and push them
to separate tags.
That adds a bit of overhead to the building step, but for now it is
the simplest way we can work around the bug if we do not want to
manually manipulate manifests and images.
pierrejeambrun [Mon, 9 May 2022 20:32:02 +0000 (22:32 +0200)]
19943 Grid view status filters (#23392)
* Move tree filtering inside react and add some filters
* Move filters from context to utils
* Fix tests for useTreeData
* Fix last tests.
* Add tests for useFilters
* Refact to use existing SimpleStatus component
* Additional fix after rebase.
* Update following bbovenzi code review
* Update following code review
* Fix tests.
* Fix page flickering issues from react-query
* Fix side panel and small changes.
* Use default_dag_run_display_number in the filter options
* Handle timezone
* Fix flaky test
Co-authored-by: Brent Bovenzi <brent.bovenzi@gmail.com>
Vincent [Mon, 9 May 2022 18:21:51 +0000 (12:21 -0600)]
Add sample dag and doc for S3ListOperator (#23449)
* Add sample dag and doc for S3ListOperator
* Fix doc
Jed Cunningham [Mon, 9 May 2022 18:14:44 +0000 (12:14 -0600)]
Helm chart 1.6.0rc1 (#23548)
Vincent [Mon, 9 May 2022 17:56:50 +0000 (11:56 -0600)]
Add doc and sample dag for EC2 (#23547)
Michael Peteuil [Mon, 9 May 2022 17:48:11 +0000 (13:48 -0400)]
Apply specific ID collation to root_dag_id too (#23536)
In certain databases there is a need to set the collation for ID fields
like dag_id or task_id to something different than the database default.
This is because in MySQL with utf8mb4 the index size becomes too big for
the MySQL limits. In past pull requests this was handled
[#7570](https://github.com/apache/airflow/pull/7570),
[#17729](https://github.com/apache/airflow/pull/17729), but the
root_dag_id field on the dag model was missed. Since this field is used
to join with the dag_id in various other models ([and
self-referentially](https://github.com/apache/airflow/blob/
451c7cbc42a83a180c4362693508ed33dd1d1dab/airflow/models/dag.py#L2766)),
it also needs to have the same collation as other ID fields.
This can be seen by running `airflow db reset` before and after applying
this change while also specifying `sql_engine_collation_for_ids` in the
configuration.
Other related PRs
[#19408](https://github.com/apache/airflow/pull/19408)
Josh Fell [Mon, 9 May 2022 17:44:41 +0000 (13:44 -0400)]
Clean up in-line f-string concatenation (#23591)
Vincent [Mon, 9 May 2022 17:40:27 +0000 (11:40 -0600)]
Update sample dag and doc for Datasync (#23511)
Harpreet Singh [Mon, 9 May 2022 17:36:35 +0000 (23:06 +0530)]
Add default 'aws_conn_id' to SageMaker Operators #21808 (#23515)
Chris Redekop [Mon, 9 May 2022 15:49:53 +0000 (09:49 -0600)]
Fix broken dagrun links when many runs start at the same time (#23462)
* Load requested dagrun even when there are many dagruns at (almost) the same time
* Fix code formatting issues
eladkal [Mon, 9 May 2022 15:17:34 +0000 (18:17 +0300)]
Fix `PythonVirtualenvOperator` templated_fields (#23559)
* Fix `PythonVirtualenvOperator` templated_fields
The `PythonVirtualenvOperator` templated_fields override `PythonOperator` templated_fields which caused functionality not to work as expected.
fixes: https://github.com/apache/airflow/issues/23557
Tanel Kiis [Mon, 9 May 2022 15:12:40 +0000 (18:12 +0300)]
Pools with negative open slots should not block other pools (#23143)
eladkal [Mon, 9 May 2022 15:08:15 +0000 (18:08 +0300)]
Add `device_requests` parameter to `DockerOperator` (#23554)
* Expose device_requests to DockerOperator
Co-authored-by: Tedi Papajorgji <tedi.papajorgji@hotmail.com>
Ephraim Anierobi [Mon, 9 May 2022 12:44:35 +0000 (13:44 +0100)]
Fix scheduler crash when expanding with mapped task that returned none (#23486)
When a task is expanded from a mapped task that returned no value, it
crashes the scheduler. This PR fixes it by first checking whether there is
a return value from the mapped task; if there is none, the error is raised
in the task itself instead of crashing the scheduler.
Karthikeyan Singaravelan [Mon, 9 May 2022 12:25:48 +0000 (17:55 +0530)]
Add support for queued state in DagRun update endpoint. (#23481)
Jarek Potiuk [Mon, 9 May 2022 10:15:43 +0000 (12:15 +0200)]
Fixed option name in Breeze description (#23582)
Jarek Potiuk [Mon, 9 May 2022 09:59:11 +0000 (11:59 +0200)]
The output of Breeze commands is only generated when they change (#23570)
Previously we generated output of all the commands from Breeze always,
hoping that they will be the same, but rich already had two changes
in the format of the SVG files which made the output different and
breaking our PRs.
Temporarily we pinned rich to fix the output, but a better solution is
to get the hash of all the configuration options and see if it changed,
and only run generation when it did. This way we keep automated
generation on pre-commit but we are protected from accidental change
of the output.
We also removed the rich limits and regenerated all svg files to ones
generated by 12.4.0. We also found a way to check whether we should
run generation at all in pre-commit without installing breeze first.
Fixes: #22908
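The hash-based check can be sketched like this. It is only an illustration of the idea: `config_hash`, `needs_regeneration`, and the option dictionaries are hypothetical and do not reflect how Breeze actually stores its command definitions.

```python
import hashlib
import json


def config_hash(options):
    """Return a stable digest of a dictionary of command options."""
    canonical = json.dumps(options, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(options, stored_hash):
    """Regenerate output only when the options differ from the stored hash."""
    return config_hash(options) != stored_hash


# Hypothetical option sets, for illustration only.
old_options = {"shell": ["--python", "--backend"]}
same_options = {"shell": ["--python", "--backend"]}
new_options = {"shell": ["--python", "--backend", "--verbose"]}

stored = config_hash(old_options)
print(needs_regeneration(same_options, stored))
print(needs_regeneration(new_options, stored))
```

Because the hash depends only on the options, a change in how the rendering library formats its output no longer triggers a diff.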
Andrey Anshin [Mon, 9 May 2022 08:50:33 +0000 (11:50 +0300)]
Fix dag-processor fetch metabase config (#23575)
mthakare-onshape [Mon, 9 May 2022 08:17:12 +0000 (13:47 +0530)]
Update dags.rst (#23579)
Update missing bracket
Jarek Potiuk [Mon, 9 May 2022 05:41:06 +0000 (07:41 +0200)]
Temporarily pin xmltodict to 0.12.0 to fix main failure (#23577)
The xmltodict 0.13.0 breaks some tests and 0.13.0 is likely buggy,
as the error is ValueError: Malformatted input.
We pin it to 0.12.0 to fix the failing main.
Related: #23576
thinhnd2104 [Sun, 8 May 2022 22:32:09 +0000 (05:32 +0700)]
Fix conn close error on retrieving log events (#23470)
related: [#23469](https://github.com/apache/airflow/issues/23469).
pierrejeambrun [Sun, 8 May 2022 22:06:23 +0000 (00:06 +0200)]
Fix `PostgresToGCSOperator` does not allow nested JSON (#23063)
* Avoid double json.dumps for json data export in PostgresToGCSOperator.
* Fix CI
Benoit Person [Sun, 8 May 2022 21:38:50 +0000 (23:38 +0200)]
Opsgenie: Fix `close_alert` to properly send `kwargs` (#23442)
D. Ferruzzi [Sun, 8 May 2022 21:37:51 +0000 (14:37 -0700)]
Amazon Sagemaker Sample DAG and docs update (#23256)
sanjayp [Sun, 8 May 2022 21:12:26 +0000 (02:42 +0530)]
wasb hook: use defaultAzureCredentials instead of managedIdentity (#23394)
Co-authored-by: Sanjay Pillai <sanjaypillai11 [at] gmail.com>
Yeachan Park [Sun, 8 May 2022 21:11:51 +0000 (23:11 +0200)]
Move dag_processing.processor_timeouts to counters section (#23393)
GitStart-AirFlow [Sun, 8 May 2022 19:46:55 +0000 (20:46 +0100)]
Fix GCSToGCSOperator ignores replace parameter when there is no wildcard (#23340)
Bartłomiej Hirsz [Sun, 8 May 2022 19:32:26 +0000 (21:32 +0200)]
Tests for provider code structure (#23351)
Improved test for code structure that can be re-used among various providers.
Jarek Potiuk [Sat, 7 May 2022 21:53:11 +0000 (23:53 +0200)]
Add slim images to release process (#23391)
This PR adds slim images to release process of Airflow.
Those images are small as they do not contain any extras.
Fixes: #20849
Jarek Potiuk [Sat, 7 May 2022 14:17:48 +0000 (16:17 +0200)]
Fix _PIP_ADDITIONAL_REQUIREMENTS case for docker-compose (#23517)
Recent versions of Airflow do not allow running `pip install` as
root, but the `init` job runs as root, so when the variable
_PIP_ADDITIONAL_REQUIREMENTS is set, the init container fails.
This PR forces _PIP_ADDITIONAL_REQUIREMENTS to be empty for the init
job.
Jarek Potiuk [Sat, 7 May 2022 13:56:34 +0000 (15:56 +0200)]
Refactor Breeze to group related methods and classes together (#23556)
This change refactors Breeze classes to a more consistent approach.
* The "commands" package only contains commands
* All Parameters (BuildCi, BuildProd, BuildDoc, Shell) are now
in "params" package
* Required/Optional Build args are now members of the
BuildCiParams, BuildProdParams which makes the params
much more self-contained.
* All utils are in "utils" package
This helps with avoiding circular imports (all utils are now
standalone and do not use any of the commands).
Co-authored-by: eladkal <45845474+eladkal@users.noreply.github.com>
Jarek Potiuk [Sat, 7 May 2022 13:36:55 +0000 (15:36 +0200)]
Add IPV6 form of the address in cassandra status check (#23537)
This PR fixes a problem introduced in 3.0.26 of the cassandra image, which
adds square brackets around the IP address regardless of its type.
The problem was worked around by pinning cassandra to 3.0.25 in
#23522 as a quick fix, but this one introduces a permanent,
future-proof solution.
Based on discussion in https://issues.apache.org/jira/browse/CASSANDRA-17612
Fixes: #23523
Jarek Potiuk [Sat, 7 May 2022 10:24:05 +0000 (12:24 +0200)]
Add logging in to Github Registry for breeze pull (#23551)
All of the Airflow images are public in ghcr.io, but the default setting
for images is "private", and when users wanted to build CI workflows
in their forks, they had to manually change their images to public so
that the ci.yml workflow could pull the images prepared in the build-images
workflow.
This PR adds logging in for the `breeze pull` command when GITHUB_TOKEN
is available; the workflow also gets the packages: read permission.
This way CI should work in users' forks without any action from the
user except first-time workflow enabling.
Vincent [Sat, 7 May 2022 09:19:45 +0000 (03:19 -0600)]
Fix LocalFilesystemToS3Operator and S3CreateObjectOperator to support full s3:// style keys (#23180)
* Fix LocalFilesystemToS3Operator and S3CreateObjectOperator.
Support full s3:// style keys
* Fix spelling error
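Assuming a "full s3:// style key" means a URL of the form `s3://bucket/key`, accepting both forms could look roughly like the sketch below. The `split_s3_key` helper is hypothetical and is not the provider's API.

```python
from urllib.parse import urlsplit


def split_s3_key(key, bucket=None):
    """Return (bucket, key) for a full s3:// URL or for a plain relative key."""
    if key.startswith("s3://"):
        parts = urlsplit(key)
        # netloc holds the bucket name; the path holds the object key.
        return parts.netloc, parts.path.lstrip("/")
    return bucket, key


full = split_s3_key("s3://my-bucket/data/file.csv")
plain = split_s3_key("data/file.csv", bucket="my-bucket")
print(full, plain)
```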
Jed Cunningham [Sat, 7 May 2022 09:15:25 +0000 (03:15 -0600)]
Change chart annotation generator to use RELEASE_NOTES (#23549)
D. Ferruzzi [Sat, 7 May 2022 06:28:44 +0000 (23:28 -0700)]
Update the Athena Sample DAG and Docs (#23428)
* Update the Athena Sample DAG and Docs
Jarek Potiuk [Sat, 7 May 2022 06:26:04 +0000 (08:26 +0200)]
Fix accidental including of providers in airflow package (#23552)
The change #23454 accidentally removed setting INSTALL_PROVIDERS_FROM_SOURCES
to "false", which resulted in the airflow package containing all
providers. This has been caught by our tests (but it was only
visible after merging).
This PR brings the variable back.
eladkal [Fri, 6 May 2022 23:51:56 +0000 (02:51 +0300)]
Replace `pytest.mark.xfail` in Postgres tests (#23541)
Jarek Potiuk [Fri, 6 May 2022 22:47:00 +0000 (00:47 +0200)]
Separate provider verification as standalone breeze command (#23454)
This is another step in simplifying and converting to Python all of
the CI/local development tooling.
This PR separates out verification of providers as a separate
breeze command `verify-provider-packages`. It was previously part of
"prepare_provider_packages.py" but it has been now
extracted to a separate in-container python file and it was
wrapped with breeze's `verify-provider-packages` command.
Provider verification is no longer run with "preparing provider docs"
nor "preparing provider packages" - it's a standalone command.
This command is also used in CI now to run the tests:
* all provider packages are built and created on CI together with
airflow version
* the packages are installed inside the CI image and providers are
verified
* the 2.1 version of Airflow is installed together with all
2.1-compatible providers and provider verification is run there too.
This all is much simpler now - we got rid of some 500 lines of bash
code again in favour of breeze python code.
Fixes: #23430
Pragya [Fri, 6 May 2022 19:45:33 +0000 (01:15 +0530)]
TrinoHook add authentication via JWT token and Impersonation (#23116)
* added trino authentication via JWT token and impersonation
* added test cases for jwt verification in trino
* added documentation for trino hook
Niko [Fri, 6 May 2022 18:03:24 +0000 (11:03 -0700)]
Update docs Amazon Glacier Docs (#23372)
Ash Berlin-Taylor [Fri, 6 May 2022 16:02:27 +0000 (17:02 +0100)]
Change approach to finding bad rows to LEFT OUTER JOIN. (#23528)
Rather than sub-selects (two for count, or one for the CREATE TABLE).
For a _large_ database (27m TaskInstances, 2m DagRuns) this takes the
time from around 10 minutes per table (we have 3) down to around 3
minutes per table. (All times on Postgres.)
Before:
```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT
    rendered_task_instance_fields.dag_id AS dag_id,
    rendered_task_instance_fields.task_id AS task_id,
    rendered_task_instance_fields.execution_date AS execution_date,
    rendered_task_instance_fields.rendered_fields AS rendered_fields,
    rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM
    rendered_task_instance_fields
WHERE
    NOT (
        EXISTS (
            SELECT
                1
            FROM
                task_instance
                JOIN dag_run ON dag_run.dag_id = task_instance.dag_id
                AND dag_run.run_id = task_instance.run_id
            WHERE
                rendered_task_instance_fields.dag_id = task_instance.dag_id
                AND rendered_task_instance_fields.task_id = task_instance.task_id
                AND rendered_task_instance_fields.execution_date = dag_run.execution_date
        )
    )
```
After:
```sql
CREATE TABLE _airflow_moved__2_3__dangling__rendered_task_instance_fields AS
SELECT
    rendered_task_instance_fields.dag_id AS dag_id,
    rendered_task_instance_fields.task_id AS task_id,
    rendered_task_instance_fields.execution_date AS execution_date,
    rendered_task_instance_fields.rendered_fields AS rendered_fields,
    rendered_task_instance_fields.k8s_pod_yaml AS k8s_pod_yaml
FROM
    rendered_task_instance_fields
    LEFT OUTER JOIN dag_run ON rendered_task_instance_fields.dag_id = dag_run.dag_id
    AND rendered_task_instance_fields.execution_date = dag_run.execution_date
    LEFT OUTER JOIN task_instance ON dag_run.dag_id = task_instance.dag_id
    AND dag_run.run_id = task_instance.run_id
    AND rendered_task_instance_fields.task_id = task_instance.task_id
WHERE
    task_instance.dag_id IS NULL
    OR dag_run.dag_id IS NULL
;
```
```
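To check that the two query shapes select the same dangling rows, here is a small self-contained sketch using the standard `sqlite3` module. The schema is heavily simplified and the table name `rtif` is a shortened, hypothetical stand-in for `rendered_task_instance_fields`; it says nothing about the timings quoted above, which were measured on Postgres.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_run (dag_id TEXT, run_id TEXT, execution_date TEXT);
    CREATE TABLE task_instance (dag_id TEXT, task_id TEXT, run_id TEXT);
    CREATE TABLE rtif (dag_id TEXT, task_id TEXT, execution_date TEXT);

    INSERT INTO dag_run VALUES ('d1', 'r1', '2022-05-01');
    INSERT INTO task_instance VALUES ('d1', 't1', 'r1');
    -- One row with a matching task instance, then two dangling rows.
    INSERT INTO rtif VALUES ('d1', 't1', '2022-05-01');
    INSERT INTO rtif VALUES ('d1', 't2', '2022-05-01');
    INSERT INTO rtif VALUES ('d2', 't1', '2022-05-02');
    """
)

NOT_EXISTS = """
    SELECT rtif.dag_id, rtif.task_id FROM rtif
    WHERE NOT EXISTS (
        SELECT 1 FROM task_instance
        JOIN dag_run ON dag_run.dag_id = task_instance.dag_id
            AND dag_run.run_id = task_instance.run_id
        WHERE rtif.dag_id = task_instance.dag_id
            AND rtif.task_id = task_instance.task_id
            AND rtif.execution_date = dag_run.execution_date
    )
    ORDER BY rtif.dag_id, rtif.task_id
"""

LEFT_JOIN = """
    SELECT rtif.dag_id, rtif.task_id FROM rtif
    LEFT OUTER JOIN dag_run ON rtif.dag_id = dag_run.dag_id
        AND rtif.execution_date = dag_run.execution_date
    LEFT OUTER JOIN task_instance ON dag_run.dag_id = task_instance.dag_id
        AND dag_run.run_id = task_instance.run_id
        AND rtif.task_id = task_instance.task_id
    WHERE task_instance.dag_id IS NULL OR dag_run.dag_id IS NULL
    ORDER BY rtif.dag_id, rtif.task_id
"""

before = conn.execute(NOT_EXISTS).fetchall()
after = conn.execute(LEFT_JOIN).fetchall()
print(before, after)
```

The results agree here because each row has at most one matching dag run and task instance; with duplicate matches the join form could return repeated rows.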
Daniel Standish [Fri, 6 May 2022 12:42:22 +0000 (05:42 -0700)]
Only count bad refs when `moved` table exists (#23491)
This keeps the logic to fail without upgrading when (A) there are bad rows and
(B) the "moved" table already exists. But we optimize so that we don't count
the bad rows unless the "moved" table is there. Previously we counted always,
but the first time a user attempts upgrade, the tables won't be there so
there's no point in counting.
Instead what we do is skip right to the CTAS, creating the _airflow_moved
tables. If there aren't any rows in the "moved" table, then we delete the table
immediately.
Also included here is a delete optimization, where we join to the moved table
instead of running the not exists query again.
Co-authored-by: Jed Cunningham <66968678+jedcunningham@users.noreply.github.com>
Co-authored-by: Ash Berlin-Taylor <ash@apache.org>
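The control flow described in this commit can be modelled roughly as below. This is a sketch on a toy `parent`/`child` schema with the standard `sqlite3` module; the table names, `table_exists`, and `move_dangling_rows` are hypothetical and are not the Airflow migration helpers.

```python
import sqlite3

BAD_ROWS = (
    "SELECT child.* FROM child "
    "LEFT JOIN parent ON child.parent_id = parent.id "
    "WHERE parent.id IS NULL"
)


def table_exists(conn, name):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def move_dangling_rows(conn):
    """Return an error message, or None when the check passes."""
    if table_exists(conn, "moved_child"):
        # Counting is only worthwhile once the moved table is already there.
        (count,) = conn.execute(f"SELECT COUNT(*) FROM ({BAD_ROWS})").fetchone()
        return "bad rows found and moved_child already exists" if count else None
    # First attempt: skip the count and go straight to CREATE TABLE AS.
    conn.execute(f"CREATE TABLE moved_child AS {BAD_ROWS}")
    (moved,) = conn.execute("SELECT COUNT(*) FROM moved_child").fetchone()
    if moved == 0:
        conn.execute("DROP TABLE moved_child")
    else:
        # Delete by referring to the moved table instead of re-running the check.
        conn.execute("DELETE FROM child WHERE id IN (SELECT id FROM moved_child)")
    return None


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parent (id INTEGER);
    CREATE TABLE child (id INTEGER, parent_id INTEGER);
    INSERT INTO parent VALUES (1);
    INSERT INTO child VALUES (1, 1);
    INSERT INTO child VALUES (2, 99);
    """
)

first = move_dangling_rows(conn)
remaining = conn.execute("SELECT id FROM child ORDER BY id").fetchall()
conn.execute("INSERT INTO child VALUES (3, 77)")
second = move_dangling_rows(conn)
print(first, remaining, second)
```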
eladkal [Fri, 6 May 2022 11:37:03 +0000 (14:37 +0300)]
Add `OpsgenieDeleteAlertOperator` (#23405)
* Add `OpsgenieDeleteAlertOperator`
Jarek Potiuk [Fri, 6 May 2022 10:25:45 +0000 (12:25 +0200)]
Fix cassandra to 3.0.25 (#23522)
fix cassandra to 3.0.25 as latest 3.0 (3.0.26) does not start cleanly
Joppe Vos [Fri, 6 May 2022 09:03:05 +0000 (11:03 +0200)]
Move tests command in new breeze (#23445)
Brent Bovenzi [Thu, 5 May 2022 18:20:22 +0000 (14:20 -0400)]
Expand/collapse all groups (#23487)
* Add expand/collapse all groups button to Grid
* add tests
* add comments
* Switch to 2 icon buttons
Disable buttons if all groups are expanded or collapsed
* Update localStorage key
Leah E. Cole [Thu, 5 May 2022 15:26:14 +0000 (11:26 -0400)]
Replace DummyOperator references in docs (#23502)
Edith Puclla [Thu, 5 May 2022 15:06:35 +0000 (10:06 -0500)]
Use the word 'the' instead of 'his' (#23493)
Tanel Kiis [Thu, 5 May 2022 10:23:18 +0000 (13:23 +0300)]
Use kubernetes queue in kubernetes hybrid executors (#23048)
When using "hybrid" executors (`CeleryKubernetesExecutor` or `LocalKubernetesExecutor`),
the `clear_not_launched_queued_tasks` mechanism in the `KubernetesExecutor` can
reset the queued tasks that were given to the other executor.
`KubernetesExecutor` should limit itself to the configured queue when working in the
"hybrid" mode.