Monday, May 24, 2021

Apache Airflow - Workflow Engine

Introduction
It is described as follows:
Apache Airflow is an open-source workflow engine created by Airbnb. It allows users to create their workflow definitions using Python code. A workflow can be executed on a specified time interval (cron-based) or by an external or manual trigger (for example, via the CLI or the web UI). Airflow also provides a UI in which you can monitor the progress of an execution.
Another description:
Apache Airflow is an orchestration tool that helps you programmatically define tasks and combine their execution into a single workflow. It then monitors the workflow's progress and schedules future runs according to the schedule you define.

Workflows are created using Python scripts, which define how your tasks are executed. They are usually defined as Directed Acyclic Graphs (DAGs).
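To make the DAG idea concrete, here is a minimal sketch in plain Python. It does not use Airflow's API: the task names are hypothetical, and the standard-library graphlib module (Python 3.9+) stands in for the dependency resolution a workflow engine performs.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "validate": {"extract"},
    "load": {"transform", "validate"},
    "report": {"load"},
}


def execution_order(task_dependencies):
    """Return one valid run order in which every task follows its dependencies."""
    try:
        return list(TopologicalSorter(task_dependencies).static_order())
    except CycleError as error:
        # A cycle means the graph is not acyclic, so no valid order exists.
        raise ValueError(f"not a DAG: {error.args[1]}") from error


print(execution_order(dependencies))
```

"transform" and "validate" do not depend on each other, so either may come first. That independence is what allows a workflow engine to run such tasks in parallel.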

Workflow execution is based on the schedule you provide, which follows the Unix cron format. Once you create your Python scripts and place them in Airflow's dags folder, Airflow picks them up and creates the workflow automatically.
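The sketch below illustrates how a five-field cron expression (minute, hour, day of month, month, day of week) selects points in time. It is a simplified illustration, not the parser Airflow uses: it handles only `*`, `*/n`, ranges, comma lists and plain numbers, and it assumes all five fields must match, whereas real cron treats day of month and day of week specially when both are restricted.

```python
from datetime import datetime


def field_matches(field: str, value: int, minimum: int) -> bool:
    """Return True if one cron field (e.g. '*/15', '1-5', '0,30') accepts value."""
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            # Steps are counted from the smallest legal value of the field.
            if (value - minimum) % int(part[2:]) == 0:
                return True
        elif "-" in part:
            low, high = part.split("-")
            if int(low) <= value <= int(high):
                return True
        elif int(part) == value:
            return True
    return False


def cron_matches(expression: str, moment: datetime) -> bool:
    """Simplified check: does the moment satisfy all five cron fields?"""
    minute, hour, day, month, weekday = expression.split()
    cron_weekday = moment.isoweekday() % 7  # cron convention: 0 = Sunday
    return (
        field_matches(minute, moment.minute, 0)
        and field_matches(hour, moment.hour, 0)
        and field_matches(day, moment.day, 1)
        and field_matches(month, moment.month, 1)
        and field_matches(weekday, cron_weekday, 0)
    )


# "0 6 * * 1-5" reads as: at 06:00, Monday through Friday.
print(cron_matches("0 6 * * 1-5", datetime(2021, 5, 24, 6, 0)))  # True (a Monday)
```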
YAML
Normally you have to write Python code for Apache Airflow, but for more complex setups people use Python code that reads templated YAML files. Kestra makes this even easier. It is described as follows:
Engineers who treat Apache Airflow deployments seriously often build a config system on top of it, e.g., in the form of templated YAML files. This way, the end users don’t need to write boilerplate configuration in Python DAGs.
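A rough sketch of such a config layer follows. Everything in it is an assumption made for illustration: the key names (`pipeline`, `schedule`, `tasks`, `depends_on`) are hypothetical, the dictionary stands in for what a YAML loader would return after parsing a file, and the plan is built with the standard-library graphlib module (Python 3.9+) rather than with Airflow itself.

```python
from graphlib import TopologicalSorter

# Stand-in for a parsed YAML file; the schema is hypothetical.
pipeline_config = {
    "pipeline": "daily_sales",
    "schedule": "0 6 * * *",
    "tasks": [
        {"name": "extract", "command": "echo extract", "depends_on": []},
        {"name": "transform", "command": "echo transform", "depends_on": ["extract"]},
        {"name": "load", "command": "echo load", "depends_on": ["transform"]},
    ],
}


def build_plan(config):
    """Turn a declarative config into an ordered list of (task name, command)."""
    commands = {task["name"]: task["command"] for task in config["tasks"]}
    dependencies = {
        task["name"]: set(task.get("depends_on", [])) for task in config["tasks"]
    }
    # Reject references to tasks that the config never defines.
    for name, upstream in dependencies.items():
        unknown = upstream - commands.keys()
        if unknown:
            raise ValueError(f"task {name!r} depends on unknown tasks: {sorted(unknown)}")
    order = TopologicalSorter(dependencies).static_order()
    return [(name, commands[name]) for name in order]


for name, command in build_plan(pipeline_config):
    print(f"{name}: {command}")
```

In a real deployment the loop at the end would create Airflow tasks instead of printing, so end users only edit the config and never touch the generating code.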
Example
In this example, Apache Airflow reads data from an Airbyte server and sends it to Apache Spark.
