We performed a comparison between AWS Glue and StreamSets based on real PeerSpot user reviews.
Find out in this report how the two Cloud Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"The solution is highly user-friendly, and its features are easy to use. The new addition of AWS Glue Data Catalog is also very beneficial, making the tool even more helpful for its users."
"The most valuable feature of AWS Glue is that it provides a GUI format with a drag-and-drop feature."
"Our entire use case was very easily handled or solved using this solution."
"The most valuable feature of AWS Glue is its ease of use and good documentation. Additionally, we can do all the transformations that we need."
"The solution integrates well with other AWS products or services."
"What I like best about AWS Glue is its real-time data backup feature. Last week, there was a production push, and what used to take almost ten days to send out around fifty-six thousand emails now takes only two hours."
"I appreciate AWS Glue for its cost-effectiveness."
"Its user interface is quite good. You just need to choose some options to create a job in AWS Glue. The code-generation feature is also useful. If you don't want to customize it and simply want to read a file and store the data in the database, it can generate the code for you."
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"The UI is user-friendly, it doesn't require any technical know-how and we can navigate to social media or use it more easily."
"The best feature that I really like is the integration."
"In StreamSets, everything is in one place."
"The ability to have a good bifurcation rate and fewer mistakes is valuable."
"The most valuable features are the option of integration with a variety of protocols, languages, and origins."
"I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."
"One of the things I like is the data pipelines. They have a very good design. Implementing pipelines is very straightforward. It doesn't require any technical skill."
"While working on AWS Glue, I could not find any training material for it."
"It fails to handle massive databases acquired from various sources."
"One area that could be improved is the ETL view. The drag-and-drop interface is not as user-friendly as some other ETL tools."
"There is a learning curve to this tool."
"Only people who can code, either in Java or Python, can use the product freely. Those who don't know Java or Python might find using AWS Glue difficult."
"We face performance issues when using AWS Glue for data transformation and integration."
"There should be more connectors for different databases."
"If there's a cluster-related configuration, we have to make worker notes, which is quite a headache when processing a large amount of data."
"I would like to see further improvement in the UI. In addition, upgrades are not automatic and they should be automated. Currently, we have to manually upgrade versions."
"StreamSet works great for batch processing but we are looking for something that is more real-time. We need latency in numbers below milliseconds."
"We've seen a couple of cases where it appears to have a memory leak or a similar problem."
"In terms of the product, I don't think there is any room for improvement because it is very good. One small area of improvement that is very much needed is on the knowledge base side. Sometimes, it is not very clear how to set up a certain process or a certain node for a person who's using the platform for the first time."
"Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."
"One thing that I would like to add is the ability to manually enter data. The way the solution currently works is we don't have the option to manually change the data at any point in time. Being able to do that will allow us to do everything that we want to do with our data. Sometimes, we need to manually manipulate the data to make it more accurate in case our prior bifurcation filters are not good. If we have the option to manually enter the data or make the exact iterations on the data set, that would be a good thing."
"We often faced problems, especially with SAP ERP. We struggled because many columns weren't integers or primary keys, which StreamSets couldn't handle. We had to restructure our data tables, which was painful. Also, pipeline failures were common, and data drifting wasn't addressed, which made things worse. Licensing was another issue we encountered."
"They need to improve their customer care services. Sometimes it has taken more than 48 hours to resolve an issue. That should be reduced. They are aware of small or generic issues, but not the more technical or deep issues. For those, they require some time, generally 48 to 72 hours to respond. That should be improved."
AWS Glue is ranked 1st in Cloud Data Integration with 37 reviews while StreamSets is ranked 8th in Data Integration with 24 reviews. AWS Glue is rated 7.8, while StreamSets is rated 8.4. The top reviewer of AWS Glue writes "Provides serverless mechanism, easy data transformation and automated infrastructure management". On the other hand, the top reviewer of StreamSets writes "We no longer need to hire highly skilled data engineers to create and monitor data pipelines". AWS Glue is most compared with AWS Database Migration Service, Informatica PowerCenter, SSIS, Informatica Cloud Data Integration and Talend Open Studio, whereas StreamSets is most compared with Fivetran, Azure Data Factory, Informatica PowerCenter, SSIS and Talend Open Studio.
We monitor all Cloud Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.