## Airflow’s world: Everything is a task, and a task can be anything

Like most other tools built within major tech companies, Airflow was initially designed to help with the set of use cases and data processing workflows that Airbnb had at the time. This was clearly explained in the initial blog post that accompanied Airflow’s open-sourcing: it was built to automate and optimize data processes - processes that are scheduled, mission-critical, evolving, and heterogeneous. Airflow never cared about what processes it was automating; to Airflow, they were all tasks.

Seven years, 52 releases, 27k GitHub stars, and 11k forks later, Airflow still relies on the same core concepts outlined in that 2015 post.

## A world still ruled by DAGs

Airflow’s undisputed cornerstone is the DAG (Directed Acyclic Graph). To this date, its DAGs still look and behave like they did seven years ago: they are still a collection of tasks that are executed in a given order. It’s amusing that the official Airflow documentation is still very consistent with the initial 2015 blog post - for example, the following snippet, taken from the Airflow 2.3.4 documentation, can easily be mistaken for one from the 2015 post:

> The DAG itself doesn’t care about what is happening inside the tasks; it is merely concerned with how to execute them - the order to run them in, how many times to retry them, if they have timeouts, and so on.
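The idea that a DAG encodes only execution order, never task content, can be illustrated with a toy scheduler. This is a minimal sketch using Python’s standard-library `graphlib`, not Airflow’s actual API; the task names and the `run_order` helper are illustrative inventions:

```python
from graphlib import TopologicalSorter

# A toy "DAG": each task name maps to the set of tasks it depends on.
# The graph knows nothing about what each task does -- only the ordering,
# which mirrors the point made in the Airflow documentation snippet above.
dag = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}

def run_order(graph):
    """Return one valid execution order (a topological sort) for the task graph."""
    return list(TopologicalSorter(graph).static_order())

print(run_order(dag))  # ['extract', 'transform', 'load', 'notify']
```

A real Airflow DAG adds scheduling, retries, and timeouts on top of exactly this kind of dependency structure, but the ordering contract is the same.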