Please share with the community what you think needs improvement with Apache Airflow.
What are its weaknesses? What would you like to see changed in a future version?
One specific feature missing from Airflow is pipelining: the steps of a workflow are stateless and independent of one another, so not every workflow can be implemented in Airflow. For example, Step 1 of my workflow produces output that I want passed automatically as input to Step 2. At the workflow level, we want common state management, so that state information is reachable across steps. Right now, we use an external state repository to maintain that state. It would be helpful if Airflow offered an implementation in which not every step of the pipeline is an independent step. I would like part of the output of the previous steps to be available as input for the next step. That kind of pipelining is missing. Other products, such as jBPM, Camunda, and Cadence, have the concept of pipelining. I would also like to see support for more platforms for programming workflows. Cadence supports Golang and Java. Legacy components can come from any platform, so more client support, such as Java and Golang client libraries, would be helpful. I want to be able to program it in Java.
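To make the request concrete, here is a minimal sketch of the pipelining described above: each step receives the previous step's output, and all steps share one workflow-level state object. It is plain Python, not Airflow's API; `run_pipeline`, `extract`, `total`, and the `orders.csv` value are hypothetical names chosen only for illustration.

```python
from typing import Any, Callable, Dict, List

# Hypothetical step signature: (previous step's output, shared state) -> output
Step = Callable[[Any, Dict[str, Any]], Any]


def run_pipeline(steps: List[Step], initial_input: Any = None) -> Dict[str, Any]:
    """Run steps in order, feeding each step's output into the next one."""
    state: Dict[str, Any] = {}  # workflow-level state visible to every step
    data = initial_input
    for step in steps:
        data = step(data, state)
    state["result"] = data  # keep the last step's output alongside the state
    return state


def extract(_: Any, state: Dict[str, Any]) -> List[int]:
    # Step 1: record where the data came from and return some rows
    state["source"] = "orders.csv"  # illustrative value only
    return [3, 5, 8]


def total(rows: List[int], state: Dict[str, Any]) -> int:
    # Step 2: receives Step 1's output directly, without an external store
    state["row_count"] = len(rows)
    return sum(rows)


final_state = run_pipeline([extract, total])
print(final_state)
```

In this sketch the shared dictionary plays the role of the external state repository mentioned above; a real workflow engine would also need to persist it between steps.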
There are some drawbacks to this solution. The code does not cover all tasks in the data warehouse automation process. Currently, in production, we have a large installation with a complex workflow that includes hundreds of tasks. Most of them are dispatched by the existing engine, but not all. For example, we sometimes need to create cycles in our workflow but cannot, because Airflow supports only Directed Acyclic Graphs (DAGs). We need to develop our own workflow descriptions and notations because, out of the box, Apache Airflow does not provide some features that we need. It is our understanding that this is a limitation by design. We will wait for the 2.0 version, as it is expected to be much more mature than the 1.8-1.10 versions. We believe that it will be better. The document management features within the UI should be improved. Outlook integration should be available out of the box, and the object model should be expanded to support cyclic graphs inside the workflow.
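The acyclic restriction can be illustrated with a short sketch. It uses Python's standard `graphlib` module (Python 3.9+), not Airflow itself, to show why a scheduler built on topological ordering cannot place a cyclic workflow in a run order. The task names and the `execution_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Optional, Set


def execution_order(dependencies: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return a valid run order, or None if the task graph contains a cycle.

    `dependencies` maps each task to the set of tasks it depends on.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        # No task in the cycle can run first, so no order exists
        return None


# A linear workflow: extract -> load -> report
acyclic = {"load": {"extract"}, "report": {"load"}}

# The same workflow with a loop back from validate to extract
cyclic = {"load": {"extract"}, "validate": {"load"}, "extract": {"validate"}}

print(execution_order(acyclic))
print(execution_order(cyclic))
```

One common workaround, assuming the loop has a bounded number of iterations, is to unroll it into repeated tasks so that the graph stays acyclic.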