Apache Kafka Review

This is the base streaming component of our IoT platform. It needs a separate cluster and a separate administrator.


What is most valuable?

  • Distributed
  • Persistence
  • Offset management by consumer
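
To illustrate the last two points, here is a minimal toy model in plain Python. It is not the Kafka client API: the `PartitionLog` and `Consumer` classes and the sample readings are hypothetical. It only sketches the idea that the log keeps records after they are read, while each consumer tracks its own position.

```python
# Toy in-memory model (NOT the Kafka client API). All names are hypothetical.
class PartitionLog:
    """Append-only log: records stay available after they are consumed."""

    def __init__(self):
        self.records = []

    def append(self, value):
        self.records.append(value)
        return len(self.records) - 1  # offset assigned to the new record

    def read(self, offset, max_records=10):
        return self.records[offset:offset + max_records]


class Consumer:
    """The consumer, not the log, owns the read position."""

    def __init__(self, log):
        self.log = log
        self.committed = 0  # next offset to read

    def poll(self, max_records=10):
        return self.log.read(self.committed, max_records)

    def commit(self, count):
        self.committed += count


log = PartitionLog()
for reading in ["t=20.1", "t=20.4", "t=20.9"]:
    log.append(reading)

consumer = Consumer(log)
batch = consumer.poll(max_records=2)
consumer.commit(len(batch))

# A later poll resumes from the committed offset.
remaining = consumer.poll()
```

Because the position lives with the consumer, a second consumer could read the same log from offset 0 without affecting the first.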

How has it helped my organization?

This is the base streaming component of our IoT platform.

For disaster recovery, we mirror the cluster's data, maintain the offsets ourselves, and store the data in HDFS on Hadoop 2.8.

What needs improvement?

  • It needs a separate cluster and a separate administrator to manage the Kafka cluster, adding an extra cost.
  • Moving data to a mirror cluster for disaster recovery is challenging, because the mirror cluster does not keep the original offsets.
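
The offset problem in the second point can be sketched with a small simulation in plain Python. This is only an illustration under stated assumptions: the record data is made up, the helper `offset_for_timestamp` is hypothetical, and it assumes record timestamps survive mirroring and increase monotonically. It shows why a committed offset from the source cluster does not point at the same record on the mirror, and one possible workaround of resuming by timestamp instead.

```python
from bisect import bisect_left

# (offset, timestamp_ms, value) on the source cluster. Offsets start at 100
# because older records are assumed to have expired under retention.
source = [(100, 1000, "a"), (101, 1010, "b"), (102, 1020, "c"), (103, 1030, "d")]

# The mirror re-appends the same records, so its offsets start again at 0.
mirror = [(i, ts, value) for i, (_, ts, value) in enumerate(source)]

committed_on_source = 102  # next record the consumer would have read


def offset_for_timestamp(records, timestamp_ms):
    """Return the first offset whose timestamp is >= timestamp_ms, or None."""
    timestamps = [ts for _, ts, _ in records]
    index = bisect_left(timestamps, timestamp_ms)
    return records[index][0] if index < len(records) else None


# Look up the timestamp of the next unread record on the source...
resume_ts = next(ts for off, ts, _ in source if off == committed_on_source)
# ...and find the matching position on the mirror.
resume_on_mirror = offset_for_timestamp(mirror, resume_ts)
```

Here offset 102 on the source corresponds to offset 2 on the mirror, so reusing the source offset directly would fail. The Kafka Java consumer offers a timestamp-based lookup (`offsetsForTimes`) that could play the role of the hypothetical helper, provided the timestamp assumption holds in your setup.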

For how long have I used the solution?

I have used this solution for one year.

How is customer service and technical support?

The open source community is very strong. Also, distributors like Cloudera and Hortonworks provide paid support.

Which solutions did we use previously?

For big data, we did not have a previous solution. I have used Microsoft Message Queuing (MSMQ) for building traditional systems.

How was the initial setup?

The setup was straightforward.

What's my experience with pricing, setup cost, and licensing?

Kafka is open source, so there is no licensing cost, but you need to budget for a cluster administrator.

Which other solutions did I evaluate?

We did not look at anything else. At that time, this was already accepted by the industry for streaming data processing.

What other advice do I have?

If your Hadoop distribution is MapR, consider MapR Streams. It overcomes the fundamental issues described above: it stores data within MapR-FS itself, so there is no extra cluster overhead, though it comes with a licensing cost.

Disclosure: I am a real user, and this review is based on my own experience and opinions.