We performed a comparison between Apache Hadoop and Azure Data Factory based on real PeerSpot user reviews.
Find out in this report how the two Cloud Data Warehouse solutions compare in terms of features, pricing, service and support, ease of deployment, and ROI.
"It's good for storing historical data and handling analytics on a huge amount of data."
"It's open-source, so it's very cost-effective."
"The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics."
"The most valuable feature is scalability and the possibility to work with major information and open source capability."
"The ability to add multiple nodes without any restriction is the solution's most valuable aspect."
"Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing."
"What I like about Apache Hadoop is that it's for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most because some solutions allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies."
"Hadoop File System is compatible with almost all the query engines."
"The solution includes a feature that increases the number of processors used which makes it very powerful and adds to the scalability."
"Data Factory's best feature is the ease of setting up pipelines for data and cloud integrations."
"Feature-wise, one of the most valuable ones is the data flows introduced recently in the solution."
"The tool's most valuable features are its connectors. It has many out-of-the-box connectors. We use ADF for ETL processes. Our main use case involves integrating data from various databases, processing it, and loading it into the target database. ADF plays a crucial role in orchestrating these ETL workflows."
"Its integrability with the rest of the activities on Azure is most valuable."
"Data Factory's best features are simplicity and flexibility."
"The function of the solution is great."
"The data copy template is a valuable feature."
"The upgrade path should be improved because it is not as easy as it should be."
"We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it."
"The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support."
"Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them."
"It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake."
"The solution needs a better tutorial. There are only documents available currently. There's a lot of YouTube videos available. However, in terms of learning, we didn't have great success trying to learn that way. There needs to be better self-paced learning."
"The solution is not easy to use. The solution should be easy to use and suitable for almost any case connected with the use of big data or working with large amounts of data."
"Since it is an open-source product, there won't be much support."
"The solution should offer better integration with Azure machine learning. We should be able to embed the cognitive services from Microsoft, for example as a web API. It should allow us to embed Azure machine learning in a more user-friendly way."
"The deployment should be easier."
"The user interface could use improvement. It's not a major issue but it's something that can be improved."
"It would be better if it had machine learning capabilities."
"The number of standard adaptors could be extended further."
"They require more detailed error reporting, data normalization tools, easier connectivity to other services, more data services, and greater compatibility with other commonly used schemas."
"The solution needs to be more connectable to its own services."
"Snowflake connectivity was recently added and if the vendor provided some videos on how to create data then that would be helpful."
Apache Hadoop is ranked 5th in Data Warehouse with 33 reviews, while Azure Data Factory is ranked 3rd in Cloud Data Warehouse with 81 reviews. Apache Hadoop is rated 7.8, while Azure Data Factory is rated 8.0. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Azure Data Factory writes "The data factory agent is quite good but pricing needs to be more transparent". Apache Hadoop is most compared with Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake, Teradata, and BigQuery, whereas Azure Data Factory is most compared with Informatica PowerCenter, Informatica Cloud Data Integration, Alteryx Designer, Snowflake, and IBM InfoSphere DataStage.
We monitor all Cloud Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.