We performed a comparison between Confluent and StreamSets based on real PeerSpot user reviews.
Find out in this report how the two Streaming Analytics solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The client APIs are the most valuable feature."
"The most valuable feature that we are using is the data replication between the data centers, allowing us to configure a disaster recovery or software. However, it's not mandatory to use, because most of the features that we use are from Apache Kafka, such as end-to-end encryption. Internally, we can develop our own kind of product or service from Apache Kafka."
"The benefit is escaping email communication. Sometimes people ignore emails or put them into spam, but with Confluence, everyone sees the same text at the same time."
"The solution can handle a high volume of data because it works and scales well."
"I would rate the scalability of the solution at eight out of ten. We have 20 people who use Confluent in our organization now, and we hope to increase usage in the future."
"The documentation process is fast with the tool."
"We mostly use the solution's message queues and event-driven architecture."
"I find Confluent's Kafka Connectors and Kafka Streams invaluable for my use cases because they simplify real-time data processing and ETL tasks by providing reliable, pre-packaged connectors and tools."
"For me, the most valuable features in StreamSets have to be the Data Collector and Control Hub, but especially the Data Collector. That feature is very elegant and seamlessly works with numerous source systems."
"I really appreciate the numerous ready connectors available on both the source and target sides, the support for various media file formats, and the ease of configuring and managing pipelines centrally."
"I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."
"The best thing about StreamSets is its plugins, which are very useful and work well with almost every data source. It's also easy to use, especially if you're comfortable with SQL. You can customize it to do what you need. Many other tools have started to use features similar to those introduced by StreamSets, like automated workflows that are easy to set up."
"StreamSets' data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"The best feature that I really like is the integration."
"The scheduling within the data engineering pipeline is very much appreciated, and it has a wide range of connectors for connecting to any data sources like SQL Server, AWS, Azure, etc. We have used it with Kafka, Hadoop, and Azure Data Factory Datasets. Connecting to these systems with StreamSets is very easy."
"Important features include that it comprises lots of functionality to connect data from various sources through connector availability, scheduling pipelines at any time, and integration with third-party and security solutions for encryption."
"It would help if the knowledge base documents in the support portal could be available for public use as well."
"There is no local support team in Saudi Arabia."
"It could be improved by including a feature that automatically creates a new topic and puts failed messages."
"It could have more themes. They should also have more reporting-oriented plugins as well. It would be great to have free custom reports that can be dispatched directly from Jira."
"The pricing model should include the ability to pick features and be charged for them only."
"Currently, in the early stages, I see a gap on the security side. If you are using the SaaS version, we would like to get a fuller, more secure solution that can be adopted right out of the box. Confluence could do a better job sharing best practices or a reusable pattern that others have used, especially for companies that can not afford to hire professional services from Confluent."
"Confluence could improve the server version of the solution. However, most companies are going to the cloud."
"There is room for improvement in the visualization."
"The software is very good overall. Areas for improvement are the error logging and the version history. I would like to see better, more detailed error logging information."
"Currently, we can only use the query to read data from SAP HANA. What we would like to see, as soon as possible, is the ability to read from multiple tables from SAP HANA. That would be a really good thing that we could use immediately. For example, if you have 100 tables in SQL Server or Oracle, then you could just point it to the schema or the 100 tables and ingestion information. However, you can't do that in SAP HANA since StreamSets currently is lacking in this. They do not have a multi-table feature for SAP HANA. Therefore, a multi-table origin for SAP HANA would be helpful."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"One area for improvement could be the cloud storage server speed, as we have faced some latency issues here and there."
"Using ETL pipelines is a bit complicated and requires some technical aid."
"The monitoring visualization is not that user-friendly. It should include other features to visualize things, like how many records were streamed from a source to a destination on a particular date."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time."
Confluent is ranked 4th in Streaming Analytics with 21 reviews, while StreamSets is ranked 8th in Data Integration with 24 reviews. Both Confluent and StreamSets are rated 8.4. The top reviewer of Confluent writes "Has good technical support services and a valuable feature for real-time data streaming". On the other hand, the top reviewer of StreamSets writes "We no longer need to hire highly skilled data engineers to create and monitor data pipelines". Confluent is most compared with Amazon MSK, Amazon Kinesis, Databricks, AWS Glue and Oracle GoldenGate, whereas StreamSets is most compared with Fivetran, Informatica PowerCenter, Azure Data Factory, SSIS and Spring Cloud Data Flow. See our Confluent vs. StreamSets report.
We monitor all Streaming Analytics reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.