KB: Airflow Operators

In Apache Airflow, an operator is a template for a single unit of work in a workflow, which Airflow models as a Directed Acyclic Graph (DAG). Instantiating an operator inside a DAG creates a task. Each operator defines one kind of action, such as running a script, transferring files, or interacting with an external system. Airflow's operators can be grouped broadly as follows:


1. Basic Operators

  • PythonOperator: Executes Python functions directly.
  • BashOperator: Runs Bash commands.
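  • DummyOperator: A no-op placeholder, useful for grouping tasks or structuring dependencies. Airflow 2.3 and later provide it as EmptyOperator, and the DummyOperator name is deprecated.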
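
To make the three basic operators concrete, here is a minimal sketch that chains them in one DAG. It assumes Airflow 2.4 or later (for the schedule argument and the EmptyOperator import path); the DAG id, task ids, and the greet function are made up for illustration. The Airflow imports sit inside build_dag() only so that greet() can be run without Airflow installed.

```python
from datetime import datetime


def greet(name: str = "Airflow") -> str:
    """Plain Python function that a PythonOperator can call."""
    message = f"Hello, {name}!"
    print(message)
    return message


def build_dag():
    """Wire the three basic operators into one DAG (requires Airflow)."""
    # Assumed Airflow 2.x import paths; adjust for your installed version.
    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.operators.empty import EmptyOperator
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="basic_operators_demo",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule=None,  # no schedule: trigger manually
        catchup=False,
    ) as dag:
        start = EmptyOperator(task_id="start")  # placeholder task
        say_hello = PythonOperator(
            task_id="say_hello",
            python_callable=greet,
            op_kwargs={"name": "operators"},
        )
        print_date = BashOperator(task_id="print_date", bash_command="date")

        # >> sets the run order: start, then say_hello, then print_date
        start >> say_hello >> print_date
    return dag
```

In a real DAG file you would assign the result at module level (for example, dag = build_dag()) so that the scheduler can discover it.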

2. Transfer Operators

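  • Move data from one system or database to another. Most of them ship in provider packages rather than in core Airflow.
  • Examples: S3ToRedshiftOperator (transfers data from S3 to Redshift), MySqlToS3Operator, S3ToSnowflakeOperator. Exact class names can vary between provider versions.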

3. Sensors

  • Special operators that wait for a specific condition to become true, re-checking it at a regular interval.
  • Examples: FileSensor (waits for a file to be present), ExternalTaskSensor (waits for a task in another DAG to complete), TimeDeltaSensor (waits until a specified time interval has passed).
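
As a rough illustration of how a sensor works, the sketch below pairs a plain condition function with Airflow's PythonSensor, which calls the function repeatedly until it returns True or the timeout is reached. The function name, task id, and interval values are assumptions chosen for the example, and the import path assumes Airflow 2.x.

```python
from pathlib import Path


def file_is_ready(path: str) -> bool:
    """Condition to poll: the file exists and is not empty."""
    candidate = Path(path)
    return candidate.is_file() and candidate.stat().st_size > 0


def build_wait_task(dag, path: str):
    """Create a sensor task that polls file_is_ready (requires Airflow)."""
    from airflow.sensors.python import PythonSensor  # assumed Airflow 2.x path

    return PythonSensor(
        task_id="wait_for_file",  # hypothetical task id
        python_callable=file_is_ready,
        op_kwargs={"path": path},
        poke_interval=30,  # seconds between checks
        timeout=60 * 60,  # fail the task after one hour of waiting
        mode="reschedule",  # release the worker slot between checks
        dag=dag,
    )
```

For the simple case of waiting on a file, the built-in FileSensor already covers this; the function-based form is shown because the same pattern works for any condition you can express in Python.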

4. Hooks

  • Not operators themselves: hooks are the interfaces that operators use to connect to external systems (e.g., databases, APIs).
  • Commonly used inside custom operators to extend functionality.

5. Custom Operators

  • You can create a custom operator by subclassing BaseOperator and implementing its execute method, which is helpful when the standard operators don’t meet specific requirements.
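
A minimal sketch of such a subclass is shown below. GreetOperator and its recipient parameter are hypothetical, and the import path assumes Airflow 2.x. The fallback class exists only so the example can be run without Airflow installed; it is not the real BaseOperator.

```python
import logging

try:
    from airflow.models.baseoperator import BaseOperator  # assumed Airflow 2.x path
except ImportError:

    class BaseOperator:
        """Minimal stand-in used only when Airflow is not installed."""

        def __init__(self, task_id: str, **kwargs):
            self.task_id = task_id


log = logging.getLogger(__name__)


class GreetOperator(BaseOperator):
    """Hypothetical custom operator that builds and returns a greeting."""

    def __init__(self, recipient: str, **kwargs):
        # Pass task_id and other standard arguments on to BaseOperator.
        super().__init__(**kwargs)
        self.recipient = recipient

    def execute(self, context):
        """Called by Airflow when the task runs; context holds runtime details."""
        message = f"Hello, {self.recipient}!"
        log.info(message)
        return message  # Airflow pushes the return value to XCom by default
```

Inside a DAG it is used like any built-in operator, for example GreetOperator(task_id="greet", recipient="team").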

6. ETL Operators

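  • Operators aimed at ETL (extract, transform, load) processes, such as BigQueryOperator or HiveOperator. BigQueryOperator is the older name; current Google provider releases offer BigQueryInsertJobOperator for running queries.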

7. Utility Operators

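  • Control the flow of the workflow itself rather than doing the work.
  • BranchPythonOperator: Runs a Python function that chooses which path to take in the DAG.
  • SubDagOperator: Allows running a DAG as part of another DAG. It has been deprecated since Airflow 2.0 in favor of TaskGroups.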
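
Branching can be sketched as follows: the function given to BranchPythonOperator returns the task_id of the path to follow, and the other path is skipped. The DAG id, task ids, and the weekday/weekend rule are invented for illustration; the import paths and the schedule argument assume Airflow 2.4 or later.

```python
from datetime import datetime


def choose_path(logical_date: datetime, **_) -> str:
    """Return the task_id to run next: Saturday and Sunday take the weekend path."""
    return "weekend_task" if logical_date.weekday() >= 5 else "weekday_task"


def build_dag():
    """Build a DAG with one branch point and two paths (requires Airflow)."""
    from airflow import DAG
    from airflow.operators.empty import EmptyOperator
    from airflow.operators.python import BranchPythonOperator

    with DAG(
        dag_id="branch_demo",  # hypothetical name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        # Airflow passes logical_date from the task context to choose_path.
        branch = BranchPythonOperator(task_id="branch", python_callable=choose_path)
        weekday_task = EmptyOperator(task_id="weekday_task")
        weekend_task = EmptyOperator(task_id="weekend_task")

        # Only the task whose id choose_path returns will run.
        branch >> [weekday_task, weekend_task]
    return dag
```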

Each operator takes its own set of parameters for customization, and operators are chained together through dependencies (for example, with the >> operator) to define complex workflows. Together they are the building blocks for managing data pipelines and automating tasks.


Airflow Operators: https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/operators.html

