Support glob syntax in ``.airflowignore`` files (#21392) (#22051)
author Ian Buss <ianbuss@users.noreply.github.com>
Wed, 13 Apr 2022 10:19:58 +0000 (11:19 +0100)
committer GitHub <noreply@github.com>
Wed, 13 Apr 2022 10:19:58 +0000 (11:19 +0100)
commit 0cd8833df74f4b0498026c4103bab130e1fc1068
tree 4fdac524b76544ac12c709e9e1897fa6c28cf889
parent 7331eefc393b8f1fae6f3cf061cf17eb5eaa3fc8

A new configuration parameter "CORE_IGNORE_FILE_SYNTAX" is added to
allow patterns in .airflowignore files to be interpreted as either
regular expressions (the default) or glob expressions as found in
.gitignore files. This lets users reuse patterns they already know
from tools such as Git, Helm and Docker.

Glob expressions support wildcard matches ("*", "?") within a directory
as well as character classes ("[0-9]"). In addition, "**" matches zero
or more directories. A pattern is negated by prefixing it with "!".
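
The glob rules above can be illustrated with a small, standard-library-only
sketch. This is a simplified stand-in for illustration, not the pathspec
implementation the commit uses: the helper names glob_to_regex and
is_ignored are hypothetical, only a subset of gitignore syntax is handled,
and patterns are assumed to be anchored at the start of a relative path.

```python
import re


def glob_to_regex(pattern):
    """Translate a small subset of gitignore-style globs into a regex.

    Hypothetical helper for illustration only; real gitignore matching
    (as done by pathspec) covers many more cases.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")  # "**/" matches zero or more directories
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")  # "*" stays within a single directory
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")  # "?" matches one non-separator character
            i += 1
        elif pattern[i] == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^" + body[1:]  # glob "[!x]" becomes regex "[^x]"
            out.append("[" + body + "]")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out) + r"\Z")


def is_ignored(path, patterns):
    """Apply patterns in order; the last matching pattern decides."""
    ignored = False
    for raw in patterns:
        negated = raw.startswith("!")
        regex = glob_to_regex(raw[1:] if negated else raw)
        if regex.match(path):
            ignored = not negated
    return ignored


# Example patterns, as they might appear in a glob-style .airflowignore
patterns = ["**/test_*.py", "data_[0-9]?.py", "!**/test_keep.py"]
for path in ["sub/dir/test_a.py", "sub/test_keep.py", "data_1x.py", "dag.py"]:
    print(path, is_ignored(path, patterns))
```

Here the negated pattern re-includes test_keep.py because it appears after
the broader "**/test_*.py" pattern, mirroring the ordering behaviour users
know from .gitignore files.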

The "fnmatch" module in the Python standard library does not fully
implement the pattern semantics that users expect from gitignore or
dockerignore files, so the globs are parsed using the pathspec package
from PyPI.
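
Two of the differences can be seen directly with the standard library.
The paths below are made-up examples; fnmatchcase is used so the result
does not depend on the platform's case or separator normalisation.

```python
from fnmatch import fnmatchcase

# In fnmatch, "*" also matches "/" characters, so a pattern meant for one
# directory level matches files in nested directories as well.
crosses_directories = fnmatchcase("dags/sub/example.py", "dags/*.py")

# fnmatch gives "**" no special meaning: it cannot match zero directories,
# whereas gitignore's "dags/**/example.py" would match "dags/example.py".
matches_zero_dirs = fnmatchcase("dags/example.py", "dags/**/example.py")

print(crosses_directories)  # True
print(matches_zero_dirs)    # False
```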

To aid with debugging ignore-file patterns, a more helpful error
message is now logged for invalid patterns, which are skipped rather
than causing a hard-to-read scheduler stack trace.
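
The skip-and-log behaviour can be sketched roughly as follows for the
regular-expression syntax. This is an assumption-laden illustration, not
the code from airflow/utils/file.py: the function name
compile_ignore_patterns, the source argument, the comment handling and
the log wording are all hypothetical.

```python
import logging
import re

log = logging.getLogger(__name__)


def compile_ignore_patterns(lines, source=".airflowignore"):
    """Compile regex patterns, skipping (and logging) any invalid ones.

    Hypothetical helper: blank lines and lines starting with "#" are
    treated as non-patterns here, which is an assumption of this sketch.
    """
    compiled = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue
        try:
            compiled.append(re.compile(text))
        except re.error as err:
            # Report the offending pattern and carry on with the rest.
            log.warning("Ignoring invalid regex %r from %s: %s", text, source, err)
    return compiled


valid = compile_ignore_patterns(["# comment", "", "test_.*", "foo[", "bar"])
print([p.pattern for p in valid])
```

The unterminated character class "foo[" is reported once in the log and
left out, while the remaining patterns are still applied.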

closes: #21392
16 files changed:
airflow/config_templates/config.yml
airflow/config_templates/default_airflow.cfg
airflow/configuration.py
airflow/models/dagbag.py
airflow/utils/file.py
docs/apache-airflow/concepts/dags.rst
docs/apache-airflow/howto/dynamic-dag-generation.rst
docs/apache-airflow/modules_management.rst
setup.cfg
tests/dags/.airflowignore
tests/dags/.airflowignore_glob [new file with mode: 0644]
tests/dags/subdir2/.airflowignore_glob [new file with mode: 0644]
tests/dags/subdir2/subdir3/test_nested_dag.py [new file with mode: 0644]
tests/jobs/test_scheduler_job.py
tests/plugins/test_plugin_ignore.py
tests/utils/test_file.py