Joaquin Marques - PeerSpot reviewer
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC
Real User
Top 5 Leaderboard
Excellent for heavy-duty data classification; should do away with configuration problems
Pros and Cons
  • "Kafka allows you to handle huge amounts of data and classify it into different categories. If you have huge amounts of data, Kafka is a very good solution for data classification."
  • "Kafka is a nightmare to administer."

What is our primary use case?

My primary use case for Apache Kafka is replacing ETL and doing data transformations.

How has it helped my organization?

Kafka allows you to handle huge amounts of data and classify it into different categories. If you have huge amounts of data, Kafka is a very good solution for data classification. When you need to route data in different directions, you have to look at the messages you receive, interpret them, and then send them to the correct place. Kafka is a good product to use in the backend.
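As a rough illustration of the routing described here (look at each message, decide its category, send it to the right place), a minimal Python sketch could look like the following. It uses no Kafka client at all; the topic names, the fallback topic, and the `category` field are assumptions made for the example, not details from this review.

```python
# Hypothetical mapping from message category to destination topic.
TOPIC_BY_CATEGORY = {
    "billing": "billing-events",
    "orders": "order-events",
}
DEFAULT_TOPIC = "unclassified-events"  # assumed fallback for unknown categories


def classify(message: dict) -> str:
    """Inspect a message and return its category (assumes a 'category' field)."""
    return str(message.get("category", "")).lower()


def route(messages):
    """Group messages by the topic each one should be sent to."""
    routed = {}
    for message in messages:
        topic = TOPIC_BY_CATEGORY.get(classify(message), DEFAULT_TOPIC)
        routed.setdefault(topic, []).append(message)
    return routed
```

In a real deployment, each group would be handed to a producer for the matching topic instead of being collected in a dictionary.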

What is most valuable?

The feature I find most valuable is the classification feature. Kafka enables you to tag content with a category.

What needs improvement?

Kafka contains two components. The component that handles synchronization among the rest of the components is an older piece of software, and it causes all kinds of configuration problems. Confluent, the company that sells a commercial version of Kafka, is moving away from that component precisely for that reason. Kafka is a nightmare to administer.

In the next release, I would like to see that one troublesome component that causes configuration issues removed.

Buyer's Guide
Apache Kafka
May 2024
Learn what your peers think about Apache Kafka. Get advice and tips from experienced pros sharing their opinions. Updated: May 2024.
772,422 professionals have used our research since 2012.

For how long have I used the solution?

I have been using Apache Kafka for a couple of years.

What do I think about the stability of the solution?

The stability of this solution depends on whether it is properly configured. Having said that, Kafka is incredibly complex to configure, set up, administer, and maintain.

What do I think about the scalability of the solution?

My opinion is that Apache Kafka is a scalable solution. In our organization, there are hundreds of thousands of users using Kafka.

How was the initial setup?

The initial setup was extremely complex. In our case, it took a team of 12 two months to deploy.

What about the implementation team?

These systems were installed by somebody else, not me.

What's my experience with pricing, setup cost, and licensing?

I would advise others to schedule a month or two to just set it up and have it up and running.

Which other solutions did I evaluate?

There are other options. For example, Databricks is a Kafka alternative. We decided to go with Kafka because one of our clients already chose Kafka.

While evaluating, we found that Databricks is more expensive for the level of activity that Kafka handles (in this case, millions of requests per day). Databricks could do it, but it would be overly expensive.

I would rate Apache Kafka's pricing a seven out of ten, with one being cheap and 10 being very expensive.

What other advice do I have?

Since it has become so popular, large enterprises especially want to use it. For smaller enterprises, Kafka would probably be too expensive because they would have to hire people to maintain it.

I would rate the Apache Kafka solution a seven out of ten.

Which deployment model are you using for this solution?

Private Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Head of Technology - Money Movement Platform at a financial services firm with 10,001+ employees
Real User
Feature rich, highly scalable, and straightforward to implement
Pros and Cons
  • "All the features of Apache Kafka are valuable, I cannot single out one feature."
  • "Prioritization of messages in Apache Kafka could improve."

What is our primary use case?

We use Apache Kafka primarily to queue the transactions or total the transactions.

How has it helped my organization?

Apache Kafka has helped our organization handle larger volumes without affecting the infrastructure load.

What is most valuable?

All the features of Apache Kafka are valuable, I cannot single out one feature.

What needs improvement?

Prioritization of messages in Apache Kafka could improve.

For how long have I used the solution?

I have been using Apache Kafka for approximately six years.

What do I think about the stability of the solution?

The stability of Apache Kafka is very good.

What do I think about the scalability of the solution?

Apache Kafka is the most scalable solution in the market.

How are customer service and support?

I have not used the support from Apache Kafka.

How was the initial setup?

Apache Kafka is straightforward to implement.

What about the implementation team?

We did the implementation of Apache Kafka in-house.

Which other solutions did I evaluate?

I did not evaluate other solutions.

What other advice do I have?

I rate Apache Kafka a nine out of ten.

Which deployment model are you using for this solution?

Hybrid Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Teodor Muraru - PeerSpot reviewer
Developer at Emag
Real User
Top 10
Reliable solution for processing broker messages from many clients
Pros and Cons
  • "The most valuable feature is the messaging function and reliability."
  • "Something that could be improved is having an interface to monitor the consuming rate."

What is our primary use case?

I have a lot of messages, and we need to process those messages from many clients. Each client takes those messages and processes them.

I'm using the broker as a client only. I'm not hosting or maintaining the application on servers; I'm just a client of the Apache Kafka server.

The solution is deployed on-prem.

How has it helped my organization?

Apache Kafka has improved our organization because it's more reliable than RabbitMQ. That's the whole point for us.

What is most valuable?

The most valuable feature is the messaging function and reliability.

What needs improvement?

Something that could be improved is having an interface to monitor the consuming rate. We use something, but I'm not sure if it's from Apache Kafka, or if it's a borrowed third-party solution. So, the interface for monitoring the processes is an additional feature that could be added.
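For readers wondering what such a monitoring view would actually compute, the two usual numbers are consumer lag and consuming rate. The sketch below shows the arithmetic only, in plain Python. It assumes the end offsets and committed offsets have already been fetched from the cluster by some tool; the function names are hypothetical and not part of any Kafka API.

```python
def consumer_lag(end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: log end offset minus the group's committed offset."""
    return {
        partition: max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }


def consume_rate(committed_before: dict, committed_after: dict, seconds: float) -> float:
    """Messages consumed per second between two snapshots of committed offsets."""
    consumed = sum(
        committed_after[p] - committed_before.get(p, 0) for p in committed_after
    )
    return consumed / seconds
```

A lag that keeps growing between snapshots suggests the consumers are not keeping up with the producers.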

For how long have I used the solution?

I have been using this solution for two years.

What do I think about the stability of the solution?

The solution is pretty stable compared to RabbitMQ or other brokers.

What do I think about the scalability of the solution?

The solution is scalable. We have about 10 departments that use Kafka in various forms. Each department might have 5 or 10 people.

We use the solution all the time. Our consumers process messages that arrive every day, because we have clients and customers on the main website. All of those messages go to Kafka clients. Our backend departments consume messages generated by the actions of the final customers.

Which solution did I use previously and why did I switch?

We used RabbitMQ and switched to Kafka because it seemed like an upgrade in ability, reliability, and the process of consuming broker messages.

How was the initial setup?

Implementation took half a year because everyone had to learn the solution. It was quite lengthy.

What other advice do I have?

I would rate this solution 9 out of 10.

My advice is to take some time in investigating how to implement the solution.

It took us about half a year to implement the solution in our organization. Anyone who needs to implement Kafka has to be prepared for quite a lengthy process. Don't expect implementation to be completed in a week. It takes longer because it's complex.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Senior Technology Architect at a tech services company with 10,001+ employees
Real User
A resilient solution for metrics collection and monitoring
Pros and Cons
  • "Resiliency is great and also the fact that it handles different data formats."
  • "Some vendors don't offer extra features for monitoring."

What is our primary use case?

We use Apache Kafka for financial purposes. Every time one of our subscribed customers is due for an insurance payment, Apache Kafka sends an automated notification to the customer to let them know that their bill is due.

What is most valuable?

Resiliency is great and also the fact that it handles different data formats. There is one data format that's universal across multiple application domains: Avro. It's pretty universal compared to JSON, XML, SQL, and other formats.

What needs improvement?

Some vendors don't offer extra features for monitoring. Some come with Linux for default monitoring. Monitoring is very important. If something is not working properly, then our subscribers won't receive a notification. You then have to trace it back to Kafka and find the glitch or the messaging sequence that hasn't been racked up correctly.

It should support Avro — which handles different data formats — as a default data format. It would be much more flexible if it did.

For how long have I used the solution?

I have been using Apache Kafka for three years.

What do I think about the stability of the solution?

It seems to be quite stable.

What do I think about the scalability of the solution?

Apache Kafka is scalable. You can launch a server node, or broker, as needed. Three nodes plus ZooKeeper (the Kafka server management system) is optimal. If one of them goes down, you can automatically launch another one. You can fall back across three servers or brokers because there's replication on each Kafka broker.
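To make the three-broker point concrete, here is a simplified Python model of replication. It is a sketch under stated assumptions, not Kafka's real placement or leader-election logic: replicas are placed round-robin, and a partition is treated as available while any one replica sits on a live broker. The broker names are hypothetical.

```python
def assign_replicas(num_partitions: int, brokers: list, replication_factor: int = 3) -> dict:
    """Round-robin placement: each partition gets `replication_factor` distinct brokers."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


def available_partitions(assignment: dict, live_brokers: set) -> list:
    """A partition stays readable while at least one replica is on a live broker."""
    return [p for p, replicas in assignment.items() if live_brokers & set(replicas)]
```

With three brokers and a replication factor of three, every partition has a copy on every broker, so in this model the data survives the loss of any two of them.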

How are customer service and technical support?

Apache Kafka is open-source. They don't offer technical support.

What other advice do I have?

On a scale from one to ten, I would give Apache Kafka a rating of eight.

Which deployment model are you using for this solution?

On-premises
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user590451 - PeerSpot reviewer
Lead Engineer at a retailer with 10,001+ employees
Real User
We use the product for high-scale distributed messaging. Multiple consumers can sync with it and fetch messages.

What is most valuable?

We use the product for high-scale distributed messaging. The processing capability of the product is enormous. Being a distributed platform, multiple consumers can sync with it and fetch messages.

Another great feature is the consumer offset log, which records where a consumer left off and where it needs to resume. Consumers aren’t required to write code or put in extra effort to maintain the offset.
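The idea behind the offset log can be sketched in a few lines of Python. This is an in-memory stand-in written for illustration, assuming one committed offset per consumer group and a single partition modeled as a list; `OffsetStore` and `consume` are hypothetical names, not Kafka client classes.

```python
class OffsetStore:
    """Stand-in for the committed-offset log kept per consumer group."""

    def __init__(self):
        self._committed = {}

    def commit(self, group: str, offset: int) -> None:
        self._committed[group] = offset

    def committed(self, group: str) -> int:
        # A group that has never committed starts from the beginning.
        return self._committed.get(group, 0)


def consume(log: list, store: OffsetStore, group: str, max_messages: int) -> list:
    """Read up to `max_messages` from the group's committed offset, then commit."""
    start = store.committed(group)
    batch = log[start:start + max_messages]
    store.commit(group, start + len(batch))
    return batch
```

Because the position lives in the store rather than in the consumer, a restarted consumer in the same group picks up exactly where the previous one stopped.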

How has it helped my organization?

We were using another commercial messaging engine, which was not scalable unless you paid more. Each hub that we provisioned was expensive. This solution is open source, which is much easier to use and doesn’t cost us anything.

What needs improvement?

This product guarantees at-least-once delivery. We have requested, through JIRA, features such as at-most-once delivery to remove duplicate message consumption.
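Until such a feature is available, a common workaround for at-least-once delivery is to make the consumer skip messages it has already handled. The Python sketch below shows that pattern. It assumes each message carries a unique `id` field and keeps the seen IDs in memory; a real system would need durable storage for them. The function name is hypothetical.

```python
def process_once(deliveries, handler, seen_ids=None):
    """Apply `handler` to each message at most once, skipping redelivered IDs."""
    seen_ids = set() if seen_ids is None else seen_ids
    results = []
    for message in deliveries:
        message_id = message["id"]  # assumed unique per logical message
        if message_id in seen_ids:
            continue  # duplicate from a redelivery; already handled
        results.append(handler(message))
        seen_ids.add(message_id)
    return results
```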

What do I think about the stability of the solution?

We haven’t faced any issues so far. Some of the clusters churn through millions of records per second with ease.

What do I think about the scalability of the solution?

We have clustered environments and we haven’t seen any scalability issues. We can provision a new node in as little as 45 minutes.

How are customer service and technical support?

It is open source, so support is in our own hands. The only option is to make a new feature request through JIRA. When multiple people in the community request a similar feature, it gets priority.

Which solution did I use previously and why did I switch?

We switched from a previous solution mainly to reduce costs and to have a more scalable solution.

How was the initial setup?

The initial setup was a bit complex in terms of how to manage it across data centers. But once it was set up, we never faced issues.

Which other solutions did I evaluate?

We evaluated multiple options, such as ActiveMQ and RabbitMQ. We leaned towards this solution.

What other advice do I have?

I would advise others to start with non-SSL implementations and try to do PoCs. Afterwards, they should move towards more secure features.
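One way to follow this advice is to keep the security settings in a single place so that moving from a plaintext PoC to SSL is a small, explicit change. The Python sketch below builds a client configuration dictionary using the property names `bootstrap.servers`, `security.protocol`, and `ssl.ca.location` as used by librdkafka-based clients. The helper name, host names, ports, and certificate path are assumptions for illustration.

```python
def client_config(bootstrap_servers: str, use_ssl: bool = False, ca_path: str = "") -> dict:
    """Build a client configuration, plaintext for PoCs or SSL for secured clusters."""
    config = {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SSL" if use_ssl else "PLAINTEXT",
    }
    if use_ssl:
        if not ca_path:
            raise ValueError("an SSL configuration needs a CA certificate path")
        config["ssl.ca.location"] = ca_path  # CA used to verify the brokers
    return config


# PoC: plaintext listener on a local broker (hypothetical address).
poc = client_config("localhost:9092")

# Later: the same code path with SSL switched on (hypothetical host and path).
secured = client_config("broker.example.internal:9093", use_ssl=True,
                        ca_path="/etc/kafka/ca.pem")
```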

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Felipe Lopes - PeerSpot reviewer
Engineering Manager at Alice
Real User
Top 5 Leaderboard
You can receive and distribute data in real-time
Pros and Cons
  • "I have seen a return on investment with this solution."
  • "I suggest using cloud services because the solution is expensive if you are using it on-premises."

What is our primary use case?

The primary use case of the solution is for asset communication through our microservices.

How has it helped my organization?

The solution has allowed us to take the use cases provided by another communication tool and resolve those issues.

What is most valuable?

The most valuable feature is how persistent it is. For example, we are able to reprocess messages when we need to, and we're able to recover messages and consume them again.
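Reprocessing is possible because reading a message does not remove it from the log; a consumer can move its position back and read again. The following Python sketch models that behavior with an in-memory list. `ReplayableConsumer` is a hypothetical class written for this example, not a Kafka client API.

```python
class ReplayableConsumer:
    """Reads from a retained log and can rewind to reprocess earlier messages."""

    def __init__(self, log):
        self._log = log      # messages stay in the log after being read
        self.position = 0    # offset of the next message to read

    def poll(self, max_messages=10):
        batch = self._log[self.position:self.position + max_messages]
        self.position += len(batch)
        return batch

    def seek(self, offset):
        """Move the read position, for example back to an earlier offset."""
        if not 0 <= offset <= len(self._log):
            raise ValueError("offset outside the retained log")
        self.position = offset
```

After a bug fix in downstream code, for instance, the consumer could seek back to the offset where the bad processing started and poll again.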

What needs improvement?

The solution can be improved by reducing the cost of running it on-premises.

For how long have I used the solution?

I have been using the solution for four years.

What do I think about the stability of the solution?

The stability of the solution is good.

What do I think about the scalability of the solution?

The solution is scalable.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

The implementation was through a vendor.

What was our ROI?

I have seen a return on investment with this solution.

What other advice do I have?

I give the solution a nine out of ten.

We have 80 people using the solution and five people are required to maintain it.

I suggest using cloud services because the solution is expensive if you are using it on-premises.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Director at Tibco
Real User
Top 5
The solution is stable, scalable, and open-source
Pros and Cons
  • "The open-source version is relatively straightforward to set up and only takes a few minutes."
  • "The solution can improve its cloud support."

What is our primary use case?

We have a product that is meant for integration, so our use cases are essentially integrating with other systems using any messaging stack. We use these products in Dev and QA, and we have connectors for various messaging applications. Apache Kafka just happens to be one of the messaging applications that we connect with. We also have our own messaging products, Enterprise Message Service and Rendezvous, and we connect to those as well. Our product is essentially used for integration, so we connect to almost all messaging applications.

What is most valuable?

The most valuable feature is the speed at which the solution can be deployed.

What needs improvement?

The solution can improve its cloud support.

For how long have I used the solution?

I have been using the solution in Dev and QA for a few years.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

The solution is scalable.

Which solution did I use previously and why did I switch?

Since we support various messaging applications, we tend to use and support all the popular ones, such as SQS, Google Pub/Sub, ActiveMQ, RabbitMQ, MQTT, and IBM MQ.

How was the initial setup?

The open-source version is relatively straightforward to set up and only takes a few minutes.

What about the implementation team?

We typically implement the solution in-house.

What's my experience with pricing, setup cost, and licensing?

The solution is open source.

What other advice do I have?

I give the solution an eight out of ten.

We test all the supported versions of the solution based on our customers' use.

We support our integration product. So we need to do dev and QA with Apache Kafka or any other messaging applications. But we do not provide support. The solution can be supported by someone else.

We don't need to have any specific staff for deployment. All the developers in QA can install and configure the solution. We don't have a separate person for maintenance.

Our team and our product dev and QAs all use the solution.

I think Apache Kafka is a good solution and I recommend it to others.

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Other
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
System Analyst // System Architect at a tech services company with 10,001+ employees
Real User
Enables us to send or push messages through a specified port
Pros and Cons
  • "For example, when you want to send a message to inform all your clients about a new feature, you can publish that message to a single topic in Apache Kafka. This allows all clients subscribed to that topic to receive the message. On the other hand, if you need to send billing information to a specific customer, you can publish that message on a topic dedicated to that customer. This message can then be sent as an SMS to the customer, allowing them to view it on their mobile device."

    What is our primary use case?

    Apache Kafka is a messaging solution where you have topics to pass on your information. You can send messages to multiple topics.

    How has it helped my organization?

    We need to manage limited resources. Additionally, we can send or push messages through a specified port. This is a significant feature because, unlike traditional queues, Kafka uses a cluster of nodes, making it easy to integrate with various algorithms. This clustering is an advantage and a key feature of Kafka, providing good interaction and scalability.

    What is most valuable?

    For example, when you want to send a message to inform all your clients about a new feature, you can publish that message to a single topic in Apache Kafka. This allows all clients subscribed to that topic to receive the message. On the other hand, if you need to send billing information to a specific customer, you can publish that message on a topic dedicated to that customer. This message can then be sent as an SMS to the customer, allowing them to view it on their mobile device.
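The two cases in this answer, one shared topic for announcements and one dedicated topic per customer for billing, can be modeled with a tiny in-memory publish/subscribe bus. This Python sketch only illustrates the topic pattern and involves no Kafka client; `TopicBus`, `customer_topic`, and the topic names are hypothetical.

```python
from collections import defaultdict


class TopicBus:
    """Minimal publish/subscribe model: subscribers receive messages per topic."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver to every subscriber of `topic`; return how many received it."""
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


def customer_topic(customer_id):
    """Assumed naming scheme for a topic dedicated to one customer."""
    return f"billing.customer-{customer_id}"
```

With many customers, a common alternative to one topic per customer is a single billing topic with messages keyed by customer ID.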

    What needs improvement?

    Apache Kafka is different in its design. If you have topics around the front end of clusters in the facility, it is scalable. The software is scalable to handle and process data. However, it might not be suitable for handling specific types of images or media files. Other than that, it should handle the rest of the data processing needs.

    There are no multiple versions, which simplifies the process of granting access with Kaspersky. Every message is accurately delivered. However, Kafka does not support sending messages directly; you need to publish them to a topic. If you want to resend a message, you must resend it manually, because Kafka does not handle this automatically. Another thing is the need for a redo option if an issue occurs. If a message is not sent properly, it can be retransmitted within the core system. You should enable the gateway in your program for it to function correctly. Messages will not be delivered or refreshed unless you enable the direct replay option in the product settings.

    For how long have I used the solution?

    I have been using Apache Kafka since 2020-21.

    How was the initial setup?

    The initial setup of Apache Kafka is challenging and requires experience. Each message should always receive a response, so prioritizing traffic is essential. Furthermore, the client or consumer must always be in sync, or the message will not be processed.

    What other advice do I have?

    One pair of nodes is sufficient for the system. If our other system requires more than five nodes, it might not be feasible. Currently, other components are functioning as expected. The Kafka setup won't take much time.

    When using Apache Kafka, it’s important to manage different environments carefully to avoid confusion. For instance, you can configure different client applications for producing and consuming messages. Ensure that the configurations for each environment (development, testing, production, etc.) are separated. This includes managing source code and data appropriately to maintain security and efficiency. Proper management of Kafka assets and operations phases is crucial for a smooth workflow.
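One possible way to keep environments apart is to derive every client setting and topic name from a single per-environment table, so that development code cannot accidentally point at production. The Python sketch below shows the idea. `bootstrap.servers` and `group.id` are standard Kafka client properties, while the host names, the topic prefixes, and the helper names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KafkaEnvironment:
    bootstrap_servers: str  # where the clients of this environment connect
    topic_prefix: str       # application-level convention, not a Kafka setting


# Hypothetical clusters, one per environment.
ENVIRONMENTS = {
    "development": KafkaEnvironment("kafka.dev.example.internal:9092", "dev."),
    "testing": KafkaEnvironment("kafka.test.example.internal:9092", "test."),
    "production": KafkaEnvironment("kafka.prod.example.internal:9092", "prod."),
}


def settings_for(env_name: str, app_name: str) -> dict:
    """Client settings for one application in one environment."""
    env = ENVIRONMENTS[env_name]  # raises KeyError for an unknown environment
    return {
        "bootstrap.servers": env.bootstrap_servers,
        "group.id": f"{env_name}.{app_name}",
    }


def topic_for(env_name: str, base_topic: str) -> str:
    """Environment-specific topic name, e.g. 'prod.payments'."""
    return ENVIRONMENTS[env_name].topic_prefix + base_topic
```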

    I recommend Apache Kafka since it is extremely fast, stable and has been used for a very long time. We haven't encountered any major issues or concerns regarding its performance and customer service.

    Overall, I rate the solution a nine out of ten.

    Disclosure: I am a real user, and this review is based on my own experience and opinions.
    PeerSpot user