We performed a comparison between Azure Data Factory and StreamSets based on real PeerSpot user reviews.
Find out in this report how the two Data Integration solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"For me, it was that there are dedicated connectors for different targets or sources, different data sources. For example, there is direct connector to Salesforce, Oracle Service Cloud, etcetera, and that was really helpful."
"The solution includes a feature that increases the number of processors used which makes it very powerful and adds to the scalability."
"The most valuable feature of Azure Data Factory is that it has a good combination of flexibility, fine-tuning, automation, and good monitoring."
"It is very modular. It works well. We've used Data Factory and then made calls to libraries outside of Data Factory to do things that it wasn't optimized to do, and it worked really well. It is obviously proprietary in regards to Microsoft created it, but it is pretty easy and direct to bring in outside capabilities into Data Factory."
"The most valuable features are data transformations."
"One advantage of Azure Data Factory is that it's fast, unlike SSIS and other on-premise tools. It's also very convenient because it has multiple connectors. The availability of native connectors allows you to connect to several resources to analyze data streams."
"Data Factory's best features are simplicity and flexibility."
"It is beneficial that the solution is written with Spark as the back end."
"The best feature that I really like is the integration."
"The most valuable feature is the pipelines because they enable us to pull in and push out data from different sources and to manipulate and clean things up within them."
"The entire user interface is very simple and the simplicity of creating pipelines is something that I like very much about it. The design experience is very smooth."
"What I love the most is that StreamSets is very light. It's a containerized application. It's easy to use with Docker. If you are a large organization, it's very easy to use Kubernetes."
"For me, the most valuable features in StreamSets have to be the Data Collector and Control Hub, but especially the Data Collector. That feature is very elegant and seamlessly works with numerous source systems."
"The most valuable features are the option of integration with a variety of protocols, languages, and origins."
"The ETL capabilities are very useful for us. We extract and transform data from multiple data sources, into a single, consistent data store, and then we put it in our systems. We typically use it to connect our Apache Kafka with data lakes. That process is smooth and saves us a lot of time in our production systems."
"The most valuable would be the GUI platform that I saw. I first saw it at a special session that StreamSets provided towards the end of the summer. I saw the way you set it up and how you have different processes going on with your data. The design experience seemed to be pretty straightforward to me in terms of how you drag and drop these nodes and connect them with arrows."
"One area for improvement is documentation. At present, there isn't enough documentation on how to use Azure Data Factory in certain conditions. It would be good to have documentation on the various use cases."
"Areas for improvement in Azure Data Factory include connectivity and integration. When you use integration runtime, whenever there's a failure, the backup process in Azure Data Factory takes time, so this is another area for improvement."
"Azure Data Factory could benefit from improvements in its monitoring capabilities to provide a more robust feature set. Enhancing the ease of deployment to higher environments within Azure DevOps would be beneficial, as the current process often requires extensive scripting and pipeline development. It is also known for the flexibility of the data flow feature, particularly in supporting more dynamic data-driven architectures. These enhancements would contribute to a more seamless and efficient workflow within GitLab."
"Azure Data Factory should be cheaper to move data to a data center abroad for calamities in case of disasters."
"The product could provide more ways to import and export data."
"The pricing model should be more transparent and available online."
"There's space for improvement in the development process of the data pipelines."
"When working with AWS, we have noticed that the difference between ADF and AWS is that AWS is more customer-focused. They're more responsive compared to any other company. ADF is not as good as AWS, but it should be. If AWS is ten out of ten, ADF is around eight out of ten. I think AWS is easier to understand from the GUI perspective compared to ADF."
"There aren't enough hands-on labs, and debugging is also an issue because it takes a lot of time. Logs are not that clear when you are debugging, and you can only select a single source for a pipeline."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."
"The execution engine could be improved. When I was at their session, they were using some obscure platform to run. There is a controller, which controls what happens on that, but you should be able to easily do this at any of the cloud services, such as Google Cloud. You shouldn't have any issues in terms of how to run it with their online development platform or design platform, basically their execution engine. There are issues with that."
"The monitoring visualization is not that user-friendly. It should include other features to visualize things, like how many records were streamed from a source to a destination on a particular date."
"Sometimes, when we have large amounts of data that is very efficiently stored in Hadoop or Kafka, it is not very efficient to run it through StreamSets, due to the lack of efficiency or the resources that StreamSets is using."
"If you use JDBC Lookup, for example, it generally takes a long time to process data."
"The design experience is the bane of our existence because their documentation is not the best. Even when they update their software, they don't publish the best information on how to update and change your pipeline configuration to make it conform to current best practices. We don't pay for the added support. We use the "freeware version." The user community, as well as the documentation they provide for the standard user, are difficult, at best."
"StreamSet works great for batch processing but we are looking for something that is more real-time. We need latency in numbers below milliseconds."
Azure Data Factory is ranked 1st in Data Integration with 81 reviews while StreamSets is ranked 8th in Data Integration with 24 reviews. Azure Data Factory is rated 8.0, while StreamSets is rated 8.4. The top reviewer of Azure Data Factory writes "The data factory agent is quite good but pricing needs to be more transparent". On the other hand, the top reviewer of StreamSets writes "We no longer need to hire highly skilled data engineers to create and monitor data pipelines". Azure Data Factory is most compared with Informatica PowerCenter, Informatica Cloud Data Integration, Alteryx Designer, Snowflake and Microsoft Azure Synapse Analytics, whereas StreamSets is most compared with Fivetran, Informatica PowerCenter, SSIS, Oracle GoldenGate and IBM InfoSphere DataStage.
We monitor all Data Integration reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.