Ravi Kuppusamy - PeerSpot reviewer
CEO and Founder at BAssure Solutions
Real User
Top 5 Leaderboard
Plenty of adapters, beneficial for enterprises, and high availability
Pros and Cons
  • "Apache Kafka has good integration capabilities and has plenty of adapters in its ecosystem if you want to build something. There are adapters for many platforms, such as Java, Azure, and Microsoft's ecosystem. Other solutions, such as Pulsar, have fewer adapters available."
  • "Pulsar gives more scalability to an even grouping, but Apache Kafka is used more if you want to send something in a time-series-based manner. If this does not matter to you, then Pulsar could be more customizable. Apache Kafka is nothing but a streaming system with local storage."

What is our primary use case?

We are building solutions on Apache Kafka for four customers. The customers we have are in various sectors, such as healthcare and architecture.

What is most valuable?

Apache Kafka has good integration capabilities and has plenty of adapters in its ecosystem if you want to build something. There are adapters for many platforms, such as Java, Azure, and Microsoft's ecosystem. Other solutions, such as Pulsar, have fewer adapters available.

For how long have I used the solution?

I have been using Apache Kafka for three years.

What do I think about the stability of the solution?

Apache Kafka is stable.

Buyer's Guide
Apache Kafka
May 2024
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: May 2024.
772,422 professionals have used our research since 2012.

What do I think about the scalability of the solution?

I would recommend Apache Kafka for any enterprise.

The number of people using the solution depends on the application. However, the starting point is 6,000 to 7,000 concurrent users.

How are customer service and support?

There is no support; Apache Kafka is open source.

Which solution did I use previously and why did I switch?

We have been experimenting with other solutions such as VMware RabbitMQ and Pulsar.

We are going to replace the Apache Kafka solution with Pulsar.

Pulsar gives more scalability to an even grouping, but Apache Kafka is used more if you want to send something in a time-series-based manner. If this does not matter to you, then Pulsar could be more customizable. Apache Kafka is nothing but a streaming system with local storage. Apache Kafka fits many use cases and is very direct, but if you have more specific use cases, Pulsar could be considered.

How was the initial setup?

Apache Kafka was simple to install. A complicated clustered production setup takes time. However, for development, it doesn't take more than one or two hours.

What about the implementation team?

We have approximately two to four technical managers who deploy and support Apache Kafka. A technical manager is necessary.

What's my experience with pricing, setup cost, and licensing?

Apache Kafka is an open-source solution. There are fees if you want support, which I would recommend for enterprises. There are annual subscriptions available.

What other advice do I have?

Apache Kafka is one of the best open-source solutions that are available today.

I would recommend this solution to others.

I rate Apache Kafka an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user653562 - PeerSpot reviewer
Solutions Architect at a consultancy with 1,001-5,000 employees
Consultant
Has the ability to write data at one velocity and have subscribing consumers read at different velocities.
Pros and Cons
  • "Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it."
  • "The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance."

How has it helped my organization?

Kafka has a guaranteed delivery mechanism that is very easy to set up. When starting out with minimal hardware, it can handle very large data volumes. When prototyping and creating a proof of concept, Kafka has helped to speed up the timeline from the prototype all the way to production volumes.

What is most valuable?

Apache Kafka is actually a distributed commit log. That is different than most messaging and queuing systems before it. I find the ability to write data at one velocity and have subscribing consumers read at different velocities to be the best feature.
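The commit-log idea described above can be illustrated with a small in-memory sketch. This is not Kafka's API: `CommitLog` and `Consumer` are hypothetical names invented for illustration. It only shows why readers can move at different speeds: records are appended once, never removed on read, and each consumer tracks its own offset.

```python
class CommitLog:
    """Append-only log; reading never removes records."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Append a record and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset, max_records):
        """Return up to max_records starting at the given offset."""
        return self._records[offset:offset + max_records]


class Consumer:
    """Hypothetical reader that keeps its own position in the log."""

    def __init__(self, log):
        self._log = log
        self.offset = 0

    def poll(self, max_records):
        batch = self._log.read(self.offset, max_records)
        self.offset += len(batch)  # advance only this consumer's position
        return batch


log = CommitLog()
for i in range(10):
    log.append(f"event-{i}")

fast = Consumer(log)
slow = Consumer(log)

fast_batch = fast.poll(10)  # the fast consumer drains everything written so far
slow_batch = slow.poll(3)   # the slow consumer takes three records and resumes later
```

Because the slow consumer's offset is independent, it can pick up at `event-3` on its next poll regardless of how far the fast consumer has read.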

What needs improvement?

The GUI tools for monitoring and support are still very basic and not very rich. There is no help in determining a shard key for performance.
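One way to reason about a candidate key before committing to it is to simulate how records would spread across partitions. The sketch below assumes that "shard key" here means the record key that selects a partition. The CRC32 hash is a stand-in chosen for determinism, not the hash Kafka clients use, and the event data and partition count are made up.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in hash (CRC32). Kafka clients use a different hash function,
    # so real partition numbers will differ; the skew pattern is the point.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def partition_load(keys, num_partitions):
    """Count how many records each partition would receive for a candidate key."""
    counts = Counter(partition_for(k, num_partitions) for k in keys)
    return [counts.get(p, 0) for p in range(num_partitions)]


# Hypothetical sample: 1,000 events from 1,000 devices in 2 regions.
events = [
    {"device": f"device-{i}", "region": "eu" if i % 2 else "us"}
    for i in range(1000)
]

by_region = partition_load((e["region"] for e in events), 6)
by_device = partition_load((e["device"] for e in events), 6)

print("keyed by region:", by_region)
print("keyed by device:", by_device)
```

A low-cardinality key such as the region can fill at most two of the six partitions, while a high-cardinality key such as the device ID has many more distinct values to spread across them.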

What do I think about the stability of the solution?

We did not have any issues with stability.

What do I think about the scalability of the solution?

We did not have any issues with scalability.

How are customer service and technical support?

  • Kafka is open source from LinkedIn and support comes from the community of users.
  • You can go with Confluent, the company that was founded by the original engineers from LinkedIn.
  • You can go with a cloud hosting service, like AWS EMR or Azure HDInsight.


    Which solution did I use previously and why did I switch?

    We used traditional message queues and file semaphores. There was a lot of overhead with asynchronous messages being put into an order and making sure nothing got dropped. It required a lot of code and maintenance.

    How was the initial setup?

    Since it is open source, you are on your own for setup. However, the tutorials from the Apache foundation and online sources have been an immense help.

    Getting started is very easy. The complexity of very large volumes of data and appropriate sharding, however, is difficult. There are fewer resources for tuning and best practices.

    What's my experience with pricing, setup cost, and licensing?

    When starting to look at a distributed message system, look for a cloud solution first. It is an easier entry point than an on-premises hardware solution. A lot of the complexity has already been taken care of. Both AWS and Azure have supported Kafka clusters that can be provisioned very easily.

    Which other solutions did I evaluate?

    We looked at RabbitMQ and Spark Streaming.

    What other advice do I have?

    Be sure to define the use cases as best as possible at first.

    Kafka is very good, but it is complex to support. It can handle any message size, whereas native cloud options have size limitations.

    Be sure to understand what messages will be sent and how many discrete topics will be needed.

    Be aware that you must code both producers and consumers.

    The bulk of the work is with the consumer.

    The Apache stack for Kafka is very open source. There are essentially no tools other than command-line options to monitor broker and topic health. There are third-party tools that help with that, some free, some paid, but they require that you install agents on the servers hosting Kafka and open up JMX ports (for MBeans) in the scripts that start the Kafka services. Additionally, you have to monitor ZooKeeper, which is very memory intensive. Cloud offerings that provide the whole modern data architecture stack, like AWS EMR and Azure HDInsight, as well as Hortonworks and Cloudera, provide a console GUI as part of each of their offerings. Confluent, a company founded by the LinkedIn engineers who designed Kafka, also has a paid enterprise offering with much better tools for maintaining the Kafka cluster. But with Apache Kafka and the community, you are on your own.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    Abdul-Samad - PeerSpot reviewer
    Software Engineer at a tech services company with 201-500 employees
    Real User
    Top 10
    It can manage a high volume of data from many sources
    Pros and Cons
    • "Kafka is scalable. It can manage a high volume of data from many sources."
    • "The interface has room for improvement, and there is a steep learning curve for Hadoop integration. It was a struggle learning to send from Hadoop to Kafka. In future releases, I'd like to see improvements in ETL functionality and Hadoop integration."

    What is our primary use case?

    I use Kafka to send network packets from different sources to my cluster. We have around 10 users at my company.

    What is most valuable?

    Kafka is scalable. It can manage a high volume of data from many sources.

    What needs improvement?

    The interface has room for improvement, and there is a steep learning curve for Hadoop integration. It was a struggle learning to send from Hadoop to Kafka. In future releases, I'd like to see improvements in ETL functionality and Hadoop integration. 

    For how long have I used the solution?

    I have used Kafka for around six months.

    What do I think about the stability of the solution?

    I rate Apache Kafka seven out of 10 for stability. 

    What do I think about the scalability of the solution?

    I rate Kafka eight out of 10 for scalability. 

    How are customer service and support?

    I rate Apache support six out of 10. It was hard to find the information I needed. 

    How would you rate customer service and support?

    Neutral

    Which solution did I use previously and why did I switch?

    Before Kafka, I sent feeds directly to Hadoop.

    How was the initial setup?

    I initially found Kafka difficult to set up, so I would rate it about five out of 10 for ease of setup. After I learned more about the platform, I would rate it eight out of 10. It is deployed on-premises over a cluster of three or four PCs. You can deploy Kafka in a few hours with one person. 

    What's my experience with pricing, setup cost, and licensing?

    Kafka is open source. 

    What other advice do I have?

    I rate Apache Kafka eight out of 10. I would recommend it to others. 

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    Guirino Ciliberti - PeerSpot reviewer
    Data Governance & Lineage Product Manager at Primeur
    Real User
    Top 5
    Impressive solution with a speedy deployment
    Pros and Cons
    • "Deployment is speedy."
    • "It's not possible to substitute IBM MQ with Apache Kafka because the JMS part is not very stable."

    What is our primary use case?

    Our primary use case for this solution is streaming.

    For how long have I used the solution?

    We have been using this solution for four years.

    What do I think about the stability of the solution?

    The solution is stable. However, it's not possible to substitute IBM MQ with Apache Kafka because the JMS part is not very stable. It is inadequate and doesn't have the support of the MQI interface of IBM MQ.

    What do I think about the scalability of the solution?

    The solution is scalable. Deployment is speedy, but we don't have many installations. We have over a thousand users using this solution and will most likely increase the number of users because we have tested 100,000 messages per second. The solution is impressive.

    Which solution did I use previously and why did I switch?

    We previously used Mosquitto and RabbitMQ, but we currently use Apache Kafka.

    What's my experience with pricing, setup cost, and licensing?

    We are licensed annually for this solution.

    What other advice do I have?

    I rate this solution a nine out of ten for streaming. I recommend it to other people. The solution is good, but its performance can be improved.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    Principal Technology Architect at a computer software company with 5,001-10,000 employees
    Real User
    Events and streaming are persistent, and multiple subscribers can consume the data
    Pros and Cons
    • "With Kafka, events and streaming are persistent, and multiple subscribers can consume the data. This is an advantage of Kafka compared to simple queue-based solutions."
    • "Kafka's interface could also use some work. Some of our products are in C, and we don't have any libraries to use with C. From an interface perspective, we had a library from Redis, and we are streaming some of the products we built to Redis. That is one of the requirements. It would be good to have those libraries available in a future release for our C++ clients or public libraries, so we can include them in our product and build on that."

    What is our primary use case?

    It's a combination of an on-premise and cloud deployment. We use AWS, and we have our offshore deployment that's on-premise for OpenShift, Red Hat, and Kafka. Red Hat provides managed services and everything. We use Kafka and a specific deployment where we deploy on our basic VMs and consume Kafka as well.

    We publish or stream all our business events as well as some of the technical events. You stream it out to Kafka, and multiple consumers develop a different set of solutions. It could be reporting, analytics, or even some data persistence. Later, we used it to build a data lake solution. They all would be consuming the data or events we are streaming into Kafka.

    What is most valuable?

    With Kafka, events and streaming are persistent, and multiple subscribers can consume the data. This is an advantage of Kafka compared to simple queue-based solutions.

    What needs improvement?

    We are still on the production aspect, with our service provider or hyper-scalers providing the solutions. I would like to see some improvement on the HA and DR solutions, where everything is happening in real-time. 

    Kafka's interface could also use some work. Some of our products are in C, and we don't have any libraries to use with C. From an interface perspective, we had a library from Redis, and we are streaming some of the products we built to Redis. That is one of the requirements. It would be good to have those libraries available in a future release for our C++ clients or public libraries, so we can include them in our product and build on that.

    For how long have I used the solution?

    We've been using Apache Kafka for the past two to three years.

    What do I think about the stability of the solution?

    Kafka is stable. It's a great product. 

    What do I think about the scalability of the solution?

    We did some benchmarking, but we are still looking to scale up some of the benchmarking and performance testing. So far, it meets all our business requirements. We are just developers, so everything goes to the clients, who will deploy it at their scale and use it for their end customers. So we are looking at it from a developer's perspective. Those who are developing the products are working on this.

    How are customer service and support?

    We haven't really contacted technical support, but some of our clients have subscribed to support from the vendors. We generally look for open-source solutions. From there, we try to figure out if there are any issues. There's a good online community where you can ask questions.

    How was the initial setup?

    We were able to deploy and use it with no problems for our use case. We didn't find it so complex. We work with so many applications, databases, Postgres, and so many other things, so we could manage it easily. We deployed Kafka in a few hours. We have an infrastructure team and DevOps. Those teams are pretty capable, and they've completely automated the whole deployment. It always takes time the first time you upgrade any application, not just Kafka. We might discover some issues, such as configuration, parameters, compatibility, etc. Once that becomes standard, it is stable, and then they only need to replicate it to the different environments or different developers groups. We have a sophisticated process.

    What other advice do I have?

    I rate Apache Kafka eight out of 10. There are so many products on the market, so my advice is to consider if Kafka suits your business requirements first. If it's suitable, the next step is to check whether all the technical requirements are met. If everything checks out, I would say that Kafka is a relatively stable, sound, and scalable product, so they can try it out. 

    Which deployment model are you using for this solution?

    Hybrid Cloud
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    Project Engineer at Wipro Limited
    Real User
    Top 20
    Free to use, mature, and offers good scalability
    Pros and Cons
    • "It's an open-source product, which means it doesn't cost us anything to use it."
    • "The UI is based on command line. It would be helpful if they could come up with a simpler user interface."

    What is our primary use case?

    We primarily use the solution for big data. We often get a million messages per second, and with such a high output we use Kafka to help us handle it. 

    What is most valuable?

    When we're working with big data, we need a throughput computing panel, which is something that Kafka provides, and something we find extremely valuable. It helps us support computing and ensures there's no loss of data. It can even do replication with some data.

    The delivery of data is its most valuable aspect.

    It's an easy-to-use product overall.

    The solution is quite mature.

    It's an open-source product, which means it doesn't cost us anything to use it.

    What needs improvement?

    We're still going through the solution. Right now, I can't suggest any features that might be missing. I don't see where there can be an improvement in that regard.

    The speed isn't as fast as RabbitMQ, even though the solution touts itself as very quick. It could be faster. They should work to make it at least as fast as RabbitMQ.

    The UI is based on command line. It would be helpful if they could come up with a simpler user interface.

    They should make it easier to configure items on the solution.

    The solution would benefit from the addition of better monitoring tools.

    For how long have I used the solution?

    I've been using the solution for six months.

    What do I think about the stability of the solution?

    The solution is a bit slow in comparison to RabbitMQ. It's supposed to be a very fast solution, and it has okay performance, but speed-wise, it's quite slow.

    What do I think about the scalability of the solution?

    The scaling of the solution is quite good.

    How are customer service and technical support?

    In terms of technical support, we don't get that directly from Apache Kafka. We have certain cloud data distribution so we get assistance from our cloud data support.

    How was the initial setup?

    We're continuously deploying the product. We're still in the process of deployment.

    What's my experience with pricing, setup cost, and licensing?

    It's an open-source product, so the pricing isn't an issue. It's free to use. We don't have costs associated with it.

    Which other solutions did I evaluate?

    I'm not the product owner, so I didn't have a say in what should be chosen. We were seeing a high throughput with Kafka which is why we ultimately chose it.

    What other advice do I have?

    I'd rate the solution eight out of ten. It's good at scaling, and, performance-wise, it's excellent. If they could improve the UI and allow for easier configuration, I'd rate it higher.

    Which deployment model are you using for this solution?

    On-premises
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    Technical Consultant at KPMG
    Real User
    It eases our current data flow and framework
    Pros and Cons
    • "It eases our current data flow and framework."
    • "Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc."

    What is our primary use case?

    It's convenient and flexible for almost all kinds of data producers. We integrated it with Kafka Streams, which can perform some easy data processing, like summary, count, group, etc.
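To make the "group and count" style of processing concrete, here is a minimal sketch of a running count per key over a stream of records. Kafka Streams itself is a Java library; this Python generator only mimics the shape of a grouped count and does not use any Kafka API. The `running_counts` function and the sample click data are hypothetical.

```python
from collections import defaultdict


def running_counts(records):
    """Yield (key, count_so_far) each time a record for that key arrives."""
    counts = defaultdict(int)
    for key, _value in records:
        counts[key] += 1
        yield key, counts[key]


# Hypothetical keyed records: (user, page visited).
clicks = [
    ("alice", "home"),
    ("bob", "cart"),
    ("alice", "cart"),
    ("alice", "pay"),
]

updates = list(running_counts(clicks))
print(updates)  # [('alice', 1), ('bob', 1), ('alice', 2), ('alice', 3)]
```

Each incoming record produces an updated count for its key, which is the general pattern a streaming aggregation follows: state is kept per key and updated as records arrive, rather than recomputed over a finished batch.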

    How has it helped my organization?

    It eases our current data flow and framework, which digests all types of sources regardless of it being structured or not.

    What is most valuable?

    • High availability
    • High throughput

    With such a large digest, I was genuinely impressed at the process being almost real-time.

    What needs improvement?

    Kafka 2.0 has been released for over a month, and I wanted to try out the new features. However, the configuration is a little bit complicated: Kafka Broker, Kafka Manager, ZooKeeper Servers, etc.

    For how long have I used the solution?

    Less than one year.
    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user
    it_user660591 - PeerSpot reviewer
    Senior Java Consultant at a tech services company with 501-1,000 employees
    Consultant
    The product is a distributed system for persistent messaging

    What is most valuable?

    The most valuable features are performance, persistent messaging, and reliability. It allows us to persist the message for a configurable number of days, even after it has been delivered to the consumer. The message delivery is also fast.
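As a rough illustration of the configurable retention mentioned above: Kafka topics accept a `retention.ms` setting, expressed in milliseconds. The helper below is a hypothetical convenience, not part of any Kafka client, and the seven-day value is only an example. It converts a number of days into the string value such a setting would take.

```python
from datetime import timedelta


def retention_ms(days: float) -> int:
    """Convert a retention period in days to milliseconds."""
    return int(timedelta(days=days).total_seconds() * 1000)


# Example topic-level override: keep messages for seven days,
# whether or not consumers have already read them.
topic_config = {"retention.ms": str(retention_ms(7))}

print(topic_config)  # {'retention.ms': '604800000'}
```

Retention is time-based (or size-based) rather than tied to delivery, which is why a message remains readable after a consumer has already received it.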

    How has it helped my organization?

    We wanted to track customer activities on our application and store those details on another system (RDBMS/Apache Hadoop). We do extensive analysis with that. This helps the company to analyze customer activities, such as search terms, and do better.

    What needs improvement?

    It’s perfect for our requirements.

    For how long have I used the solution?

    I have been using Apache Kafka for two years.

    What do I think about the stability of the solution?

    We have had no issues with stability.

    What do I think about the scalability of the solution?

    We have had no issues with scalability.

    How are customer service and technical support?

    We use the open source one, so we did not opt for any technical support.

    Which solution did I use previously and why did I switch?

    We started to use Apache Kafka with our application from scratch.

    How was the initial setup?

    The initial setup was straightforward. We faced some issues during development in areas such as the message producer and consumer. We rectified those by tweaking the producer and consumer configurations. The documentation is very good.

    What's my experience with pricing, setup cost, and licensing?

    I don’t have any idea, as we use the open source version.

    What other advice do I have?

    It's a high-performance distributed system. If you want to track user activities or do any stream processing, then this is perfect. We have used Docker Kafka for our implementation. It's very easy to set up and test. You could also try the same.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user