2020-10-07T07:04:00Z

What needs improvement with Apache Flink?


Please share with the community what you think needs improvement with Apache Flink.

What are its weaknesses? What would you like to see changed in a future version?

ITCS user
Guest
88 Answers

Top 5 · Real User

One way to improve Flink would be to enhance integration between different ecosystems. For example, there could be more integration with other big data vendors and platforms, similar in scope to how Apache Flink works with Cloudera. Apache Flink is part of the same ecosystem as Cloudera, and for batch processing it's actually very useful, but for real-time processing there could be more development of big data capabilities across the various ecosystems out there. I am also looking for more options for what can be implemented in containers rather than in Kubernetes; I think our architecture would work really well with more options available to us in this sense. Finally, it's a challenge to find people with the appropriate skills for using Flink. There are a lot of people who know what should be done better in big data systems, but there are still very few people with Flink capabilities.

2021-03-03T20:13:19Z
Top 5 · Real User

I am using the Python API, and I have found the solution to be underdeveloped compared to others. There needs to be better integration with notebooks to allow for more practical development. Additionally, there are no managed services; on Azure, for example, you would have to set everything up yourself. In a future release, they could make the error descriptions clearer.

2021-02-02T17:14:03Z
Top 5 · Real User

Flink has become a lot more stable, but the machine learning library is still not very flexible. Some models are not plug and play. In order to use some of the libraries and models, I need a Python library, because there might be pre-processing or post-processing requirements, or simply a need to parse and use the models. The lack of Python support is something they could work on in the future.

2020-11-08T16:21:05Z
Top 10 · Real User

In terms of improvement, there should be better reporting. You can integrate with reporting solutions, but Flink doesn't offer one itself; they're more about the processing side. Low-latency serving is out of their scope, but as far as low latency is concerned, you can integrate with other backend solutions as well, so they have that flexibility. The APIs are good enough, and its in-memory processing is so fast that you can work with data much more quickly.

2020-10-21T04:33:00Z
Top 10 · Real User

In Flink, maintaining the infrastructure is not easy. You have to design the architecture well. If you want to scale to larger volumes of streaming data, you need good machines, and you need a resilient architecture so that if something fails, you can recover with minimum downtime. You should have good storage systems to store and retrieve intermediate Flink state (in the case of stateful applications). Basically, you face all the problems that come with a distributed system, so you have to have all that infrastructure in place for it to perform well. The best way is to look at the use cases you wish to support five to ten years ahead and design the architecture around Flink accordingly.
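The recovery concern described above, snapshotting intermediate state so a restart loses as little as possible, can be sketched in plain Python. This is only an illustration of the idea behind checkpointed stateful processing, not Flink's actual checkpointing API; the class name and parameters are invented for the example.

```python
import json
import os
import tempfile

class CheckpointedCounter:
    """Toy stateful operator: counts events per key and periodically
    snapshots its state so a restart can resume with minimal loss.
    A plain-Python sketch of the concept, not Flink's API."""

    def __init__(self, checkpoint_path, checkpoint_every=100):
        self.checkpoint_path = checkpoint_path
        self.checkpoint_every = checkpoint_every
        self.counts = {}
        self.seen = 0
        self._restore()

    def _restore(self):
        # On startup, recover the most recent snapshot if one exists.
        if os.path.exists(self.checkpoint_path):
            with open(self.checkpoint_path) as f:
                self.counts = json.load(f)

    def process(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.seen += 1
        if self.seen % self.checkpoint_every == 0:
            self._checkpoint()

    def _checkpoint(self):
        # Write atomically (temp file + rename) so a crash mid-write
        # cannot corrupt the last good snapshot.
        fd, tmp = tempfile.mkstemp(
            dir=os.path.dirname(self.checkpoint_path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(self.counts, f)
        os.replace(tmp, self.checkpoint_path)
```

A restarted instance pointed at the same checkpoint path picks up the counts from the last snapshot, which is the same durable-state-plus-recovery property the answer above says your storage layer must support at scale.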

2020-10-19T09:33:00Z
Top 5 · Real User

The TimeWindow feature could be improved. The timing of content and windowing changed a bit in 1.11, when they introduced watermarks. A watermark basically associates the data in a stream with a timestamp. The documentation can be consulted, but while they have updated the rest of the documentation, the testing documentation has not been updated, so we have to try things out manually to understand a few concepts. Integrating Apache Flink with metric services or failure-handling tools also needs some kind of update, or else in-depth knowledge of those tools is expected before integrating. Consider a use case where you want analytics about how much data you have processed and how much failed: Prometheus is one of the common metric tools supported by Flink out of the box, along with other metric services, and the documentation there is straightforward, but there is a learning curve with metric services that can consume a lot of time if you are not well versed in them. For failure handling, Flink provides basic documentation, such as restart on task failure, fixed-delay restart, etc.
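The watermark idea mentioned above, a moving timestamp that declares "no events older than this are expected", can be sketched with the common bounded-out-of-orderness rule: the watermark trails the largest event timestamp seen so far by a fixed delay. This is a plain-Python illustration of the concept, not Flink's WatermarkStrategy API; the class name is invented for the example.

```python
class BoundedOutOfOrdernessWatermark:
    """Sketch of a bounded-out-of-orderness watermark generator:
    watermark = (max event timestamp seen) - (allowed lateness).
    Illustrative only; not Flink's actual API."""

    def __init__(self, max_out_of_orderness_ms):
        self.max_delay = max_out_of_orderness_ms
        self.max_ts = None  # largest event timestamp observed so far

    def on_event(self, event_ts_ms):
        # Late (out-of-order) events never move the watermark backwards.
        if self.max_ts is None or event_ts_ms > self.max_ts:
            self.max_ts = event_ts_ms

    def current_watermark(self):
        if self.max_ts is None:
            return None
        return self.max_ts - self.max_delay
```

With a 2000 ms allowance, seeing events at 1000, 3000, and then a late one at 2500 leaves the watermark at 1000: windows ending at or before that point can safely fire.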

2020-10-13T07:21:29Z
Top 5 · Real User

We have a machine learning team that works with Python, but Apache Flink does not have full support for the language. We needed to use Java to implement some of our job posting pipelines.

2020-10-13T07:21:29Z
Top 5 · Real User

The state maintains checkpoints using RocksDB or S3. They are good, but sometimes performance is affected when you use RocksDB for checkpointing.

We can write Python bolts/applications inside Apache Storm code, since Storm supports Python as a programming language, but with Flink the Python support is not that great. When we do data science or machine learning work, we want to integrate the data science or machine learning pipeline with our real-time pipeline, and most of that work is in Python. It was very easy with Storm: Storm supports Python natively, so integration was easy. But Flink is mostly Java, and integrating Python with Java is difficult, so there is no direct integration. We needed to find an alternative: we created an API layer in between, so the Java and Python layers communicate over an API. We call the data science or ML models through the API, which runs in Python, while Flink runs in Java. We would like to see another way to run this. It works today, but it's not that great, and this is an area where we would like to see improvement.
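The API-layer workaround this answer describes, Flink staying in Java while Python serves the model behind an HTTP endpoint, can be sketched on the Python side with the standard library. The function names and the threshold model are hypothetical stand-ins invented for the example; a real deployment would put `handle_request` behind an HTTP server and have the Flink job POST JSON records to it.

```python
import json

def score_record(record):
    """Hypothetical stand-in for an ML model: flags a record by a
    simple threshold. In the setup described above, the real Python
    model would live here, behind the API layer."""
    return {"id": record["id"], "anomaly": record["value"] > 100.0}

def handle_request(body_bytes):
    """Minimal bridge handler: the Java Flink job sends a JSON record
    as the request body; this decodes it, scores it, and returns the
    model's response as JSON bytes. Framework-free sketch."""
    record = json.loads(body_bytes.decode("utf-8"))
    return json.dumps(score_record(record)).encode("utf-8")
```

The design cost the answer points out is visible even here: every record crosses a process boundary with JSON encode/decode on both sides, which is why native Python support in the stream processor itself would be preferable.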

2020-10-07T07:04:00Z
Updated: October 2021.