Apache Flink Valuable Features

Rahul Agarwal
Sr. Software Engineer at a tech services company with 10,001+ employees
You need to understand that Flink works on streaming data. Among the nuances of streaming, there is what we call exactly-once semantics. Exactly-once is the hardest guarantee, and it is very well supported by Flink. What I mean by that is, when an event or a piece of data comes in, you process that event only once. This is very important when you're computing aggregates. Let's say you are a seller on the site, and depending on your selling limit, you're allowed to sell only three items a day. If you made three listings in the last five minutes but the count is wrong because the most recent listing is not yet reflected in the aggregate, we might let you make a fourth listing. That is wrong. We can't allow a wrong decision on selling limits, and here the reason is that the data is outdated because it got delayed. The other reasons could be that the data got lost, or that it did not get aggregated exactly the way it was supposed to be. Exact aggregations are what make this much harder to do in real time; that's why people use offline systems like Spark and Hadoop. But you can do real time with exactly-once semantics, which is very well supported in Flink. So that's one of the best features.

Another feature is how Flink handles failures. It has something called the checkpointing concept. You're dealing with billions and billions of requests, so in a large distributed system something is going to fail. Flink handles this with checkpointing and savepointing, where it writes the aggregated state into separate storage. In case of failure, you can restore from that state and come back.

I'll take the example of Call of Duty. Let's say when you play a game of Call of Duty there are five levels, and in each level there are different obstacles that you need to clear to advance; say each level has 10 obstacles. If you've cleared three obstacles and then you die, you don't want to start the level from scratch again, right? You want to resume from the third obstacle. That is what checkpointing and savepointing allow you to do: I have done work up to this point, and now I want to restart from this point only. I don't want to redo that part again.

So how does it help in our case? What is the current date? It's Wednesday, October 15, right? So, up until 12:00 PM on October 15, we have been running all the aggregations. At a regular interval, we take a snapshot of the aggregated state and store it in a different storage. If the system goes down after 12:00 PM because the load is high, or because of some hardware failure, you can recover the state as of 12:00 PM and reprocess from there. You're not recovering the entire thing. Let's say a seller's listing count was two: you don't want to start counting for that particular seller from zero, you want to count from two after 12:00 PM. Right? That's what Flink helps you do.

Additionally, it helps you scale very well, but there are a lot of nuances. Because Flink allows you to do aggregations upon aggregations, maintaining the system is not that easy: you need machines with high RAM, and you need to provision good memory.
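To make that concrete, here is a minimal sketch in Flink's Java DataStream API of a seller-listing counter with exactly-once checkpointing enabled. The sample elements, the daily window, and the 60-second checkpoint interval are illustrative assumptions, not the reviewer's actual pipeline:

```java
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class SellerListingCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Snapshot the aggregated state every 60 seconds with exactly-once guarantees,
        // so a failure recovers each count from the last checkpoint instead of from zero.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        env.fromElements(
                Tuple2.of("sellerA", 1), Tuple2.of("sellerB", 1), Tuple2.of("sellerA", 1))
            .keyBy(t -> t.f0)                                        // one counter per seller
            .window(TumblingProcessingTimeWindows.of(Time.days(1)))  // daily selling limit
            .sum(1)                                                  // count listings per day
            .print();

        env.execute("seller-listing-count");
    }
}
```

If a machine dies mid-day, Flink restores each seller's running count from the last snapshot, which is exactly the "count from two, not from zero" behavior described above.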
It depends on what kind of problems you're solving. The problem that I'm describing is actually the hardest kind of problem: stateful problems, where you're keeping state in Flink's memory. There are other problems called stateless problems. Let's say you are a trucking company and you want to track the location data of your trucks. You have 15 trucks, and each and every truck is emitting latitude and longitude information as it moves. In this case, maybe your intention is only to track them in real time, and you're okay with it being delayed by five or 10 minutes. You're not aggregating anything; there's nothing stateful about it. You take the data and you dump it into storage. That storage could be anything, Elasticsearch or whatever, and then you can create a graph and plot a trendline on a map: where is each truck going? The data comes in, it gets written to a place, and then you read it to draw the graph; the data just stays put. In this kind of application, Flink works very well and you can really do a lot of real-time analytics on it. But the problem I described, where you do real-time aggregations, is a little bit trickier, because there you need to recover from the right state so that you don't mess up the aggregations.
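As a rough sketch of that stateless pattern, assuming made-up "truckId,latitude,longitude" reports (in production the source would be Kafka or a socket, and print() stands in for an Elasticsearch or file sink):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class TruckLocationPipeline {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Sample "truckId,latitude,longitude" reports emitted by the trucks.
        env.fromElements(
                "truck-01,37.77,-122.41",
                "truck-02,34.05,-118.24",
                "truck-01,37.78,-122.40")
            // Stateless transformation: each record is reshaped independently,
            // with no keyed state, windows, or aggregation involved.
            .map(line -> {
                String[] f = line.split(",");
                return String.format("{\"truck\":\"%s\",\"lat\":%s,\"lon\":%s}",
                        f[0], f[1], f[2]);
            })
            .print(); // stand-in for the sink that feeds the map or trendline

        env.execute("truck-location-tracking");
    }
}
```

Because no operator holds state, a restart here loses nothing: reprocessing a record just rewrites the same row to storage.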
Sandesh Deshmane
Solution Architect at a tech vendor with 501-1,000 employees
When we use the Flink streaming pipeline, the first thing we use is the windowing mechanism with the event-time feature, which makes aggregation in Flink very easy. Before this, we were using Apache Storm. Apache Storm is stateless, while Apache Flink is stateful. With Apache Storm, we had to use an intermediate distributed cache, but because Flink is stateful, it can manage the state and the failure recovery for us. The result is that we do aggregation every 10 minutes and we do not need to worry about our application stopping in between those 10 minutes and then restarting. When we were using Storm, we used to manage all of it ourselves; we created manual checkpoints in Redis, but Flink supports inbuilt features like checkpointing and statefulness. There is also event time or processing time that you can use for your messages.

Another important thing is out-of-order message processing. When you use any streaming mechanism, there is a chance that your source produces messages out of order. When you build a state machine, it's very important that you can get the messages in order, so that your computations and results are correct. With Storm, or any other such framework, to get messages in order you have to use an intermediate Redis cache and then sort the messages. Flink has an inbuilt way to put the messages in order before we process them. That saves a lot of time and a lot of code. I have written both Storm and Flink code. With Storm, I used to write a lot of code, hundreds of lines, but with Flink it's less, around 50 to 60 lines, and I don't need Redis as an intermediate cache. I have to aggregate over 10-minute windows, and there is an inbuilt mechanism for that; with Storm, I needed to write all that logic myself as a bunch of connected bolts plus the intermediate Redis. The same code that would take me one week to write in Storm, I could do in a couple of days with Flink.

I started with Flink five to six years ago for one use case, and the community support and documentation were not good at that time. When we started again in 2019, we saw that the documentation and community support had become good.
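A minimal sketch of that event-time windowing in Java, assuming made-up (key, value, timestamp) tuples: the bounded-out-of-orderness watermark strategy tolerates late arrivals without any intermediate Redis cache, and built-in checkpointing replaces the manual Redis checkpoints. The 10-second lateness bound and 30-second checkpoint interval are placeholder choices:

```java
import java.time.Duration;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeAggregation {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(30_000); // inbuilt checkpointing instead of manual Redis state

        // (key, value, eventTimestampMillis) — a stand-in for the real message schema.
        env.fromElements(
                Tuple3.of("k1", 5L, 1_000L),
                Tuple3.of("k1", 3L, 4_000L),
                Tuple3.of("k1", 7L, 2_000L)) // arrives after a later event: out of order
            .assignTimestampsAndWatermarks(
                // Tolerate up to 10 seconds of out-of-order arrival; Flink reorders
                // events into the right window, no Redis-based sorting needed.
                WatermarkStrategy
                    .<Tuple3<String, Long, Long>>forBoundedOutOfOrderness(Duration.ofSeconds(10))
                    .withTimestampAssigner((event, ts) -> event.f2))
            .keyBy(e -> e.f0)
            .window(TumblingEventTimeWindows.of(Time.minutes(10))) // the 10-minute aggregation
            .sum(1)
            .print();

        env.execute("event-time-aggregation");
    }
}
```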
Jyala Rahul Jyala
Sr Software Engineer at a tech vendor with 10,001+ employees
The most valuable feature is that there is no distinction between batch and streaming data. When we want to use batch mode, we use Apache Spark, but the problem with Spark is that it does not handle time-series data well. With Flink, however, we can have the streaming capability that we want. The documentation is very good, a lot of metrics are supported, and there is also logging capability and API support.
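As a small illustration of that batch/streaming unification (the input numbers and the squaring step are placeholders): since Flink 1.12, the same DataStream program can be switched between batch and streaming execution with a single setting, so bounded historical data and unbounded live data share one code path.

```java
import org.apache.flink.api.common.RuntimeExecutionMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UnifiedBatchStream {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The only batch-vs-streaming decision in the whole program:
        // BATCH for bounded input, STREAMING for unbounded, or AUTOMATIC.
        env.setRuntimeMode(RuntimeExecutionMode.BATCH);

        env.fromElements(1, 2, 3, 4, 5)
            .map(n -> n * n) // identical transformation logic in either mode
            .print();

        env.execute("unified-batch-stream");
    }
}
```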
Hitesh Baid
Senior Software Engineer at a tech services company with 5,001-10,000 employees
MapFunction and FilterFunction are the most useful, or at least the most used, functions in Flink. Data transformation becomes easy. For example, in an application that stores information about people, when we want to retrieve certain people's details, we can use these functions to filter all persons by their age group. That's why the map and filter functions are very useful; this can help in analytics, for targeting specific news to a specific age group.
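A short sketch of that pattern, using a hypothetical Person type invented for illustration: the FilterFunction keeps one age group and the MapFunction reshapes each record for downstream use.

```java
import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class AgeGroupFilter {
    // Hypothetical Person POJO (public fields + no-arg constructor, as Flink expects).
    public static class Person {
        public String name;
        public int age;
        public Person() {}
        public Person(String name, int age) { this.name = name; this.age = age; }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(new Person("Asha", 17), new Person("Ben", 34), new Person("Chen", 52))
            // FilterFunction: keep only the 18-40 age group, e.g. for targeted news.
            .filter(new FilterFunction<Person>() {
                @Override
                public boolean filter(Person p) {
                    return p.age >= 18 && p.age <= 40;
                }
            })
            // MapFunction: transform each remaining Person into a display string.
            .map(new MapFunction<Person, String>() {
                @Override
                public String map(Person p) {
                    return p.name + " (" + p.age + ")";
                }
            })
            .print();

        env.execute("age-group-filter");
    }
}
```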