Apache Flink Overview

Apache Flink is the #4 ranked solution in our list of Streaming Analytics tools. It is most often compared to Amazon Kinesis.

What is Apache Flink?

Apache Flink is a framework and distributed processing engine for stateful computations over unbounded and bounded data streams. Flink has been designed to run in all common cluster environments, perform computations at in-memory speed and at any scale.

Apache Flink is also known as Flink.

Apache Flink Buyer's Guide

Download the Apache Flink Buyer's Guide including reviews and more. Updated: June 2021

Apache Flink Customers
LogRhythm, Inc., Inter-American Development Bank, Scientific Technologies Corporation, LotLinx, Inc., Benevity, Inc.

Pricing Advice

What users are saying about Apache Flink pricing:
  • "This is an open-source platform that can be used free of charge."
  • "Apache Flink is open source so we pay no licensing for the use of the software."
  • "The solution is open-source, which is free."

Rahul Agarwal
Sr. Software Engineer at a tech services company with 10,001+ employees
Real User
Top 10
Scalable framework for stateful streaming aggregations

What is our primary use case?

Initially, we created our own servers and then eBay created their infrastructure. Now it's deployed on the eBay cloud. Our primary use case is real-time or near-real-time aggregation. For example, we compute counts, sums, minimums, maximums, and distinct counts for the different metrics that we care about, and we do this in real time. Say you have an e-commerce company and you want to measure different metrics. Taking risk as an example, you might want to check whether one particular seller on your site is doing something fishy or not. What is the…
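
The per-key running aggregations described above (count, sum, min, max, distinct count) can be sketched in plain Python. This is only an illustration of the idea, not Flink's API: the `SellerStats` class, the `aggregate` function, and the event shape (seller ID, amount, buyer ID) are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SellerStats:
    """Running aggregates kept per seller (hypothetical metric set)."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")
    buyers: set = field(default_factory=set)  # backs the distinct count

    def update(self, amount: float, buyer_id: str) -> None:
        self.count += 1
        self.total += amount
        self.minimum = min(self.minimum, amount)
        self.maximum = max(self.maximum, amount)
        self.buyers.add(buyer_id)


def aggregate(events):
    """Fold (seller_id, amount, buyer_id) events into per-seller stats."""
    state = {}
    for seller_id, amount, buyer_id in events:
        state.setdefault(seller_id, SellerStats()).update(amount, buyer_id)
    return state


# Hypothetical sample events: (seller_id, amount, buyer_id)
events = [
    ("seller-1", 20.0, "buyer-a"),
    ("seller-1", 5.0, "buyer-b"),
    ("seller-2", 12.5, "buyer-a"),
    ("seller-1", 40.0, "buyer-a"),
]
stats = aggregate(events)
```

In a streaming engine the same update would run once per incoming event against state partitioned by key, rather than over a finished list.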

Pros and Cons

  • "Another feature is how Flink handles its failures. It has something called the checkpointing concept. You're dealing with billions and billions of requests, so your system is going to fail in large storage systems. Flink handles this by using the concept of checkpointing and savepointing, where they write the aggregated state into some separate storage. So in case of failure, you can basically recall from that state and come back."
  • "In terms of stability with Flink, it is something that you have to deal with every time. Stability is the number one problem that we have seen with Flink, and it really depends on the kind of problem that you're trying to solve."
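
The checkpointing idea in the first quote (write the aggregated state to separate storage, then resume from it after a failure) can be illustrated with a toy job in plain Python. This is a sketch under simplifying assumptions, not how Flink implements it: the `CountingJob` class, the JSON file format, and the single-process replay are hypothetical, whereas real Flink checkpoints are distributed and coordinated across operators.

```python
import json
import os
import tempfile


class CountingJob:
    """Toy stateful job: counts events per key and can snapshot its state."""

    def __init__(self):
        self.counts = {}
        self.offset = 0  # how much of the input is already reflected in counts

    def process(self, events, upto=None):
        end = len(events) if upto is None else upto
        for key in events[self.offset:end]:
            self.counts[key] = self.counts.get(key, 0) + 1
            self.offset += 1

    def checkpoint(self, path):
        # Persist the state together with the input position it corresponds to.
        with open(path, "w") as f:
            json.dump({"counts": self.counts, "offset": self.offset}, f)

    @classmethod
    def restore(cls, path):
        job = cls()
        with open(path) as f:
            snapshot = json.load(f)
        job.counts = snapshot["counts"]
        job.offset = snapshot["offset"]
        return job


events = ["a", "b", "a", "c", "a"]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

job = CountingJob()
job.process(events, upto=3)
job.checkpoint(path)          # snapshot after three events
job.process(events, upto=4)   # this progress is lost in the simulated crash

recovered = CountingJob.restore(path)
recovered.process(events)     # replay from the checkpointed offset
```

Storing the input offset alongside the counts is what lets the recovered job replay only the events that came after the snapshot, so nothing is counted twice.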

What other advice do I have?

This is general advice if you're trying to do anything: Any problem that you're trying to evaluate, you have to really understand the problem that you're trying to solve, what is the nature of the problem? And by nature of the problem, the business side is one thing, but you have to understand how you're solving things. For example, do you want something to be fast enough, scalable and for any new product? Every time they advertise it is fast, scalable, highly distributed, etc... But in what context? What kind of use cases is this product built for? You have to understand the principle and…
Sandesh Deshmane
Software Architect at a tech vendor with 501-1,000 employees
Real User
Top 5
Provides out-of-the-box checkpointing and state management

What is our primary use case?

We have our own infrastructure on AWS. We deploy Flink on a Kubernetes cluster in AWS. The Kubernetes cluster is managed by our internal DevOps team. We also use Apache Kafka. That is where we get our event streams. We get millions of events through Kafka. There are more than 300K to 500K events per second that we get through that channel. We aggregate the events and generate reporting metrics based on the actual events that are recorded. There are certain real-time high-volume events that are coming through Kafka which are like any other stream. We use Flink for aggregation purposes in this…

Pros and Cons

  • "With Flink, it provides out-of-the-box checkpointing and state management. It helps us in that way. When Storm used to restart, sometimes we would lose messages. With Flink, it provides guaranteed message processing, which helped us. It also helped us with maintenance or restarts."
  • "The state maintains checkpoints and they use RocksDB or S3. They are good but sometimes the performance is affected when you use RocksDB for checkpointing."

What other advice do I have?

My advice would be to validate your use case. If you are already using a streaming mechanism, I suggest that you validate what your actual use cases are and what the advantages of Flink are. Make sure that the use case that you are trying can be done by Flink. If you're doing simple aggregation and you don't want to worry about the message order then it's fine. You can use Storm or whatever you are using. If you see features that are there and are useful for you, then you should go for Flink. Validate your use case, validate your data and pipeline, do a small POC, and see if it is useful. If…
Jyala Rahul Jyala
Sr Software Engineer at a tech vendor with 10,001+ employees
Real User
Top 5
Good documentation, API support, and metrics, but it only has partial Python support

What is our primary use case?

We are using Flink as a pipeline for data cleaning. We are not using all of the features of Flink. Rather, we are using Flink Runner on top of Apache Beam. We are a CRM product-based company and we have a lot of customers that we provide our CRM for. We like to give them as much insight as we can, based on their activities. This includes how many transitions they do over a particular time. We do have other services, including machine learning, and so far, the resulting data is not very clean. This means that you have to clean it up manually. In real-time, working with Big Data in this…

Pros and Cons

  • "The documentation is very good."
  • "We have a machine learning team that works with Python, but Apache Flink does not have full support for the language."

What other advice do I have?

We are very happy with the product, and we have been able to achieve all of the use cases that we are expected to deliver for our customers. Over time, I have seen many improvements including in the documentation. An example is that when we first started using this product, almost two years ago, there was no support available. At this point, we do not have much opt-in but we have some use cases to ensure that our system is not breaking. We have QA who can validate these things based on what is expected versus what we have done. My advice for anybody who is considering Flink is that it has very…
RP
Software Development Engineer III at a tech services company with 5,001-10,000 employees
Real User
Top 10
Provides truly real-time data streaming with better control over resources; ML library could be more flexible

What is our primary use case?

My company is a cab aggregator, similar to Uber in terms of scale as well. Just like Uber, we have two sources of real-time events. One is the customer mobile app, one is the driver mobile app. We get a lot of events from both of these sources, and there are a lot of things which you have to process in real-time and that is our primary use case of Flink. It includes things like surge pricing, where you might have a lot of people wanting to book a cab so the price increases and if there are fewer people, the price drops. All that needs to be done quickly and precisely. We also need to process…

Pros and Cons

  • "This is truly a real-time solution."
  • "The machine learning library is not very flexible."

What other advice do I have?

My advice would be to make sure you understand your requirements, Flink's architecture, how it works and whether it is the right solution for you. They provide very good documentation which is useful. The solution isn't suitable for every case and it may be that Spark or some other framework is more suitable. If you are a major company that cannot afford any downtime, and given that Flink is a relatively new technology, it might be worthwhile investing in the monitoring. That would include writing scripts for monitoring and making sure that the throughput of the applications is always steady…
Vinod Iyer
Principal Software Engineer at a tech services company with 1,001-5,000 employees
Real User
Top 10
Offers good API extraction and in-memory state management

What is our primary use case?

The last POC we did was for map-making. I work for a map-making company. India is one ADR and you have states within, you have districts within, and you have cities within. There are certain hierarchical areas. When you go to Google and when you search for a city within India, you would see the entire hierarchy. It falls in India. We get third party sources, government sources, or we get it from different sources, if we can. We get the data, and this data is geometry. It's not a straightforward index. If we get raw geometry, we will get the entire map and the layout. We do geometry…

Pros and Cons

  • "Apache Flink is meant for low latency applications. You take one event opposite if you want to maintain a certain state. When another event comes and you want to associate those events together, in-memory state management was a key feature for us."
  • "In terms of improvement, there should be better reporting. You can integrate with reporting solutions but Flink doesn't offer it themselves."

What other advice do I have?

Flink is really simple and easy to adopt. You can use any backend state management tool, like a database or something of that sort. It has the ability to integrate with different technologies, which is also very important. It's pretty well suited, I believe, for low latency. The API is pretty well written to support you in that. I would rate Apache Flink an eight out of ten.
BH
Senior Software Engineer at a tech services company with 5,001-10,000 employees
Real User
Top 10
Drastically reduces turnaround/processing time; documentation is in-depth and most things are straightforward

What is our primary use case?

For services that need real-time, fast updates as well as a lot of data to process, Flink is the way to go. Apache Flink with Kubernetes is a good combination. Data transformation, grouping, keying, and state management are some of the features of Flink. My use case is to provide the latest data as quickly as possible in real time.

Pros and Cons

  • "The event processing function is the most useful or the most used function. The filter function and the mapping function are also very useful because we have a lot of data to transform. For example, we store a lot of information about a person, and when we want to retrieve this person's details, we need all the details. In the map function, we can actually map all persons based on their age group. That's why the mapping function is very useful. We can really get a lot of events, and then we keep on doing what we need to do."
  • "The TimeWindow feature is a bit tricky. The timing of the content and the windowing is a bit changed in 1.11. They have introduced watermarks. A watermark is basically associating every piece of data with a timestamp. The timestamp could be anything, and we can provide the timestamp. So, whenever I receive a tweet, I can actually assign a timestamp, like what time I got that tweet. The watermark helps us to uniquely identify the data. Watermarks are tricky if you use multiple events in the pipeline. For example, you have three sources from different locations, and you want to combine all those inputs and also perform some kind of logic. When you have more than one input stream and you want to collect all the information together, you have to apply timeWindowAll. That means that all the events from the upstream sources should be in that TimeWindow, and they were coming back. Internally, it is a batch of events that may be getting collected every five minutes or whatever timing is given. Sometimes, the use case for TimeWindow is a bit tricky. It depends on the application as well as on how people have given this TimeWindow. This kind of documentation is not updated. Even the test case documentation is a bit wrong. It doesn't work. Flink has updated the version of Apache Flink, but they have not updated the testing documentation. Therefore, I have to manually understand it. We have also been exploring failure handling. I was looking into changelogs for which they have posted the future plans and what they are going to deliver. We have two concerns regarding this, which have been noted down. I hope in the future that they will provide this functionality. Integration of Apache Flink with other metric services or failure handling data tools needs some kind of update, or the documentation needs to cover it in more depth. We have a use case where we want to actually analyze or get analytics about how much data we process and how many failures we have. For that, we need to use Tomcat, which is an analytics tool for implementing counters. We can manage reports in the analyzer. This kind of integration is pretty much straightforward. They say that people must be well familiar with all the things before using this type of integration. They have given this complete file, which you can update, but it took some time. There is a learning curve with it, which consumed a lot of time. It is evolving to a newer version, but the documentation is not demonstrating that update. The documentation is not well incorporated. Hopefully, these things will get resolved now that they are implementing it. Failure is another area where it is a bit rigid or not that flexible. We never use this for scaling because complexity is very high in case of a failure. Processing and providing the scaled data back to Apache Flink is a bit challenging. They have this concept of offsetting, which could be simplified."
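
The difficulty the reviewer describes with watermarks and several input sources can be made concrete with a small plain-Python simulation. It is a simplified sketch, not Flink's API: the `run` function, the five-minute window size, the ten-second out-of-orderness bound, and the rule of taking the minimum watermark across sources are assumptions chosen for illustration, and late events that arrive after their window has fired are not handled.

```python
from collections import defaultdict

WINDOW_SIZE = 300       # five-minute tumbling windows, in seconds
MAX_OUT_OF_ORDER = 10   # assumed bound on out-of-order arrival, in seconds


def run(events, sources):
    """events: (source, timestamp, value) tuples in arrival order.

    Returns a list of (window_start, sorted values) in firing order.
    """
    max_seen = {s: None for s in sources}  # highest timestamp per source
    windows = defaultdict(list)
    fired = []
    for source, ts, value in events:
        window_start = ts - ts % WINDOW_SIZE
        windows[window_start].append(value)
        prev = max_seen[source]
        max_seen[source] = ts if prev is None else max(prev, ts)
        # No combined watermark until every source has produced data.
        if any(v is None for v in max_seen.values()):
            continue
        # The slowest source holds the combined watermark back.
        watermark = min(max_seen.values()) - MAX_OUT_OF_ORDER
        for start in sorted(windows):
            if start + WINDOW_SIZE <= watermark:
                fired.append((start, sorted(windows.pop(start))))
    return fired


# Hypothetical events from three sources; source C lags behind A and B.
events = [
    ("A", 10, "a1"), ("B", 20, "b1"), ("C", 30, "c1"),
    ("A", 320, "a2"), ("B", 315, "b2"),
    ("C", 290, "c2"),   # still belongs to the first window
    ("C", 330, "c3"),   # only now can the first window fire
]
fired = run(events, ["A", "B", "C"])
```

The first window stays open while sources A and B are already past its end, because source C has not caught up yet. This is why combining inputs from different locations makes window timing harder to reason about.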

What other advice do I have?

Flink is a good way to get your hands wet with streaming or big data processing applications, and to understand the basic concepts of big data processing and how complex analytics or complications can be made simple. For example, if you want to analyze tweets or patterns, it's a simple use case where you just use flink-twitter-connector and provide that as your input source to Flink. The stream of random tweets keeps on coming and then you can apply your own grouping, keying, and filtering logic to understand these concepts. An important thing I learned while using Flink is the basic concepts of windowing, transformation, Data…
AB
Partner / Head of Data & Analytics at a computer software company with 11-50 employees
Real User
Top 10
Gives us low latency for fast, real-time data, with useful alerts for live data processing

What is our primary use case?

We use Apache Flink to monitor the network consumption for mobile data in fast, real-time data architectures in Mexico. The projects we get from clients are typically quite large, and there are around 100 users using Apache Flink currently. For maintenance and deployment, we split our team into two squads, with one squad that takes care of the data architecture and the other squad that handles the data analysis technology. Each squad has three members.

Pros and Cons

  • "The top feature of Apache Flink is its low latency for fast, real-time data. Another great feature is the real-time indicators and alerts which make a big difference when it comes to data processing and analysis."
  • "One way to improve Flink would be to enhance integration between different ecosystems. For example, there could be more integration with other big data vendors and platforms similar in scope to how Apache Flink works with Cloudera. Apache Flink is a part of the same ecosystem as Cloudera, and for batch processing it's actually very useful but for real-time processing there could be more development with regards to the big data capabilities amongst the various ecosystems out there."

What other advice do I have?

My advice to others when using Apache Flink is to hire good people to manage it. When you have the right team, it's very easy to operate and scale big data platforms. I would rate Apache Flink a nine out of ten.
JV
Product owner of the APO data science team at an energy/utilities company with 10,001+ employees
Real User
Top 10
Easy deployment and install, open-source, but underdeveloped API

What is our primary use case?

I use the solution for detection of streaming data.

Pros and Cons

  • "The setup was not too difficult."
  • "In a future release, they could improve on making the error descriptions more clear."

What other advice do I have?

When choosing this solution you have to look at your use case to see if this is the best choice for you. If you need to have super-fast real-time streaming, and you can develop in Scala, then it might make a lot of sense to use it. If you are looking at delays of seconds, and you are working in Python, then PySpark might be a better solution. I rate Apache Flink a six out of ten.