
apache-airflow-providers-google

Content

.. toctree::
    :maxdepth: 1
    :caption: Guides

    Connection types <connections/index>
    Logging handlers <logging/index>
    Secrets backends <secrets-backends/google-cloud-secret-manager-backend>
    API Authentication backend <api-auth-backend/google-openid>
    Operators <operators/index>

.. toctree::
    :maxdepth: 1
    :caption: References

    Python API <_api/airflow/providers/google/index>
    Configuration <configurations-ref>

.. toctree::
    :maxdepth: 1
    :caption: Resources

    Example DAGs <example-dags>
    PyPI Repository <https://pypi.org/project/apache-airflow-providers-google/>

.. toctree::
    :maxdepth: 1
    :caption: Commits

    Detailed list of commits <commits>


Package apache-airflow-providers-google

Google services, including Google Ads, Google Cloud (GCP), Google Marketing Platform, and Google Workspace (formerly G Suite).

Release: 2.2.0

Provider package

This is a provider package for the google provider. All classes for this provider package are in the airflow.providers.google Python package.
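For example, classes can be imported straight from that package. A minimal sketch (the import path and the default connection ID "google_cloud_default" are this provider's documented names; the surrounding usage is illustrative):

# All classes for this provider live under the airflow.providers.google package,
# for example the Google Cloud Storage hook:
from airflow.providers.google.cloud.hooks.gcs import GCSHook

# The hook resolves credentials from an Airflow connection;
# "google_cloud_default" is the provider's default connection ID.
hook = GCSHook(gcp_conn_id="google_cloud_default")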

Installation

Note

In November 2020, a new version of pip (20.3) was released with a new 2020 resolver. This resolver does not yet work with Apache Airflow and might lead to errors during installation, depending on your choice of extras. To install Airflow, you need to either downgrade pip to version 20.2.4 with pip install --upgrade pip==20.2.4 or, if you use pip 20.3, add the option --use-deprecated legacy-resolver to your pip install command.

You can install this package on top of an existing Airflow 2.* installation via pip install apache-airflow-providers-google.
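Concretely, combining the installation command with the pip note above (both commands are taken from this page; nothing here is provider-specific beyond the package name):

# Option 1: downgrade pip to 20.2.4 first, then install the provider
pip install --upgrade pip==20.2.4
pip install apache-airflow-providers-google

# Option 2: keep pip 20.3 and opt into the legacy resolver
pip install --use-deprecated legacy-resolver apache-airflow-providers-google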

PIP requirements

==================================  ================
PIP package                         Version required
==================================  ================
PyOpenSSL
google-ads                          >=4.0.0,<8.0.0
google-api-core                     >=1.25.1,<2.0.0
google-api-python-client            >=1.6.0,<2.0.0
google-auth-httplib2                >=0.0.1
google-auth                         >=1.0.0,<2.0.0
google-cloud-automl                 >=2.1.0,<3.0.0
google-cloud-bigquery-datatransfer  >=3.0.0,<4.0.0
google-cloud-bigtable               >=1.0.0,<2.0.0
google-cloud-container              >=0.1.1,<2.0.0
google-cloud-datacatalog            >=3.0.0,<4.0.0
google-cloud-dataproc               >=2.2.0,<3.0.0
google-cloud-dlp                    >=0.11.0,<2.0.0
google-cloud-kms                    >=2.0.0,<3.0.0
google-cloud-language               >=1.1.1,<2.0.0
google-cloud-logging                >=2.1.1,<3.0.0
google-cloud-memcache               >=0.2.0
google-cloud-monitoring             >=2.0.0,<3.0.0
google-cloud-os-login               >=2.0.0,<3.0.0
google-cloud-pubsub                 >=2.0.0,<3.0.0
google-cloud-redis                  >=2.0.0,<3.0.0
google-cloud-secret-manager         >=0.2.0,<2.0.0
google-cloud-spanner                >=1.10.0,<2.0.0
google-cloud-speech                 >=0.36.3,<2.0.0
google-cloud-storage                >=1.30,<2.0.0
google-cloud-tasks                  >=2.0.0,<3.0.0
google-cloud-texttospeech           >=0.4.0,<2.0.0
google-cloud-translate              >=1.5.0,<2.0.0
google-cloud-videointelligence      >=1.7.0,<2.0.0
google-cloud-vision                 >=0.35.2,<2.0.0
google-cloud-workflows              >=0.1.0,<2.0.0
grpcio-gcp                          >=0.2.2
json-merge-patch                    ~=0.2
pandas-gbq                          <0.15.0
plyvel
==================================  ================

Cross provider package dependencies

These are dependencies that might be needed in order to use all the features of the package. You need to install the specified provider packages in order to use them.

You can install such cross-provider dependencies when installing from PyPI. For example:

pip install apache-airflow-providers-google[amazon]

=========================================  ================
Dependent package                          Extra
=========================================  ================
apache-airflow-providers-amazon            amazon
apache-airflow-providers-apache-beam       apache.beam
apache-airflow-providers-apache-cassandra  apache.cassandra
apache-airflow-providers-cncf-kubernetes   cncf.kubernetes
apache-airflow-providers-facebook          facebook
apache-airflow-providers-microsoft-azure   microsoft.azure
apache-airflow-providers-microsoft-mssql   microsoft.mssql
apache-airflow-providers-mysql             mysql
apache-airflow-providers-oracle            oracle
apache-airflow-providers-postgres          postgres
apache-airflow-providers-presto            presto
apache-airflow-providers-salesforce        salesforce
apache-airflow-providers-sftp              sftp
apache-airflow-providers-ssh               ssh
apache-airflow-providers-trino             trino
=========================================  ================
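Several extras can also be combined in a single command. This is standard pip extras syntax rather than anything specific to this provider; the quotes keep shells such as zsh from interpreting the square brackets:

pip install "apache-airflow-providers-google[amazon,postgres]"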

Changelog

2.2.0

Features
  • Adds 'Trino' provider (with lower memory footprint for tests) (#15187)
  • Update remaining old import paths of operators (#15127)
  • Override project in DataprocSubmitJobOperator (#14981)
  • GCS to BigQuery Transfer Operator with Labels and Description parameter (#14881)
  • Add GCS timespan transform operator (#13996)
  • Add job labels to BigQuery check operators (#14685)
  • Use libyaml C library when available (#14577)
  • Add Google LevelDB hook and operator (#13109) (#14105)
Bug fixes
  • Google Dataflow Hook to handle no Job Type (#14914)

2.1.0

Features
  • Correct order of arguments in GCSHook.download method docstring (#14497)
  • Refactor SQL/BigQuery/Qubole/Druid Check operators (#12677)
  • Add GoogleDriveToLocalOperator (#14191)
  • Add 'exists_ok' flag to BigQueryCreateEmptyTable(Dataset)Operator (#14026)
  • Add materialized view support for BigQuery (#14201)
  • Add BigQueryUpdateTableOperator (#14149)
  • Add param to CloudDataTransferServiceOperator (#14118)
  • Add gdrive_to_gcs operator, drive sensor, additional functionality to drive hook (#13982)
  • Improve GCSToSFTPOperator paths handling (#11284)
Bug fixes
  • Fixes to Dataproc operators and hook (#14086)
  • Fix bug in copy operation without wildcard (#9803) (#13919)

2.0.0

Breaking changes
Updated google-cloud-* libraries

This release of the provider package contains third-party library updates, which may require you to update your DAG files or custom hooks and operators if you were using objects from those libraries. Updating these libraries is necessary to use the new features made available by newer versions and to obtain bug fixes that ship only in those versions.

Details are covered in the UPDATING.md file of each library, but there are some changes you should pay particular attention to.

==================================  ====================  ===================  ============================================
Library name                        Previous constraints  Current constraints  Upgrade Documentation
==================================  ====================  ===================  ============================================
google-cloud-automl                 >=0.4.0,<2.0.0        >=2.1.0,<3.0.0       Upgrading google-cloud-automl
google-cloud-bigquery-datatransfer  >=0.4.0,<2.0.0        >=3.0.0,<4.0.0       Upgrading google-cloud-bigquery-datatransfer
google-cloud-datacatalog            >=0.5.0,<0.8          >=3.0.0,<4.0.0       Upgrading google-cloud-datacatalog
google-cloud-dataproc               >=1.0.1,<2.0.0        >=2.2.0,<3.0.0       Upgrading google-cloud-dataproc
google-cloud-kms                    >=1.2.1,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-kms
google-cloud-logging                >=1.14.0,<2.0.0       >=2.0.0,<3.0.0       Upgrading google-cloud-logging
google-cloud-monitoring             >=0.34.0,<2.0.0       >=2.0.0,<3.0.0       Upgrading google-cloud-monitoring
google-cloud-os-login               >=1.0.0,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-os-login
google-cloud-pubsub                 >=1.0.0,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-pubsub
google-cloud-tasks                  >=1.2.1,<2.0.0        >=2.0.0,<3.0.0       Upgrading google-cloud-tasks
==================================  ====================  ===================  ============================================

The field names use the snake_case convention

If your DAG uses an object from the above-mentioned libraries passed by XCom, it is necessary to update the naming convention of the fields that are read. Previously the fields used the camelCase convention; now the snake_case convention is used.

Before:

set_acl_permission = GCSBucketCreateAclEntryOperator(
    task_id="gcs-set-acl-permission",
    bucket=BUCKET_NAME,
    entity="user-{{ task_instance.xcom_pull('get-instance')['persistenceIamIdentity']"
    ".split(':', 2)[1] }}",
    role="OWNER",
)

After:

set_acl_permission = GCSBucketCreateAclEntryOperator(
    task_id="gcs-set-acl-permission",
    bucket=BUCKET_NAME,
    entity="user-{{ task_instance.xcom_pull('get-instance')['persistence_iam_identity']"
    ".split(':', 2)[1] }}",
    role="OWNER",
)
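The same mechanical renaming applies to any other field read from XCom. As a sketch, a hypothetical helper (not part of the provider) that maps a camelCase field name to its snake_case equivalent:

import re

def camel_to_snake(name: str) -> str:
    """Convert a camelCase field name to snake_case,
    e.g. 'persistenceIamIdentity' -> 'persistence_iam_identity'."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()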
Features
  • Add Apache Beam operators (#12814)
  • Add Google Cloud Workflows Operators (#13366)
  • Replace 'google_cloud_storage_conn_id' with 'gcp_conn_id' when using 'GCSHook' (#13851)
  • Add How To Guide for Dataflow (#13461)
  • Generalize MLEngineStartTrainingJobOperator to custom images (#13318)
  • Add Parquet data type to BaseSQLToGCSOperator (#13359)
  • Add DataprocCreateWorkflowTemplateOperator (#13338)
  • Add OracleToGCS Transfer (#13246)
  • Add timeout option to GCS hook methods (#13156)
  • Add regional support to Dataproc workflow template operators (#12907)
  • Add project_id to client inside BigQuery hook update_table method (#13018)
Bug fixes
  • Fix four bugs in StackdriverTaskHandler (#13784)
  • Decode Remote Google Logs (#13115)
  • Fix and improve GCP BigTable hook and system test (#13896)
  • Updated Google DV360 hook to fix SDF issue (#13703)
  • Fix insert_all method of BigQueryHook to support tables without schema (#13138)
  • Fix Google BigQueryHook method get_schema() (#13136)
  • Fix Data Catalog operators (#13096)

1.0.0

Initial version of the provider.