Apache Hadoop vs Azure Data Factory comparison

Cancel
You must select at least 2 products to compare!
Apache Logo
2,467 views|2,109 comparisons
87% willing to recommend
Microsoft Logo
8,126 views|6,366 comparisons
91% willing to recommend
Comparison Buyer's Guide
Executive Summary

We performed a comparison between Apache Hadoop and Azure Data Factory based on real PeerSpot user reviews.

Find out in this report how the two Cloud Data Warehouse solutions compare in terms of features, pricing, service and support, easy of deployment, and ROI.
To learn more, read our detailed Apache Hadoop vs. Azure Data Factory Report (Updated: March 2024).
770,141 professionals have used our research since 2012.
Featured Review
Quotes From Members
We asked business professionals to review the solutions they use.
Here are some excerpts of what they said:
Pros
"It's good for storing historical data and handling analytics on a huge amount of data.""It's open-source, so it's very cost-effective.""The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.""The most valuable feature is scalability and the possibility to work with major information and open source capability.""The ability to add multiple nodes without any restriction is the solution's most valuable aspect.""Two valuable features are its scalability and parallel processing. There are jobs that cannot be done unless you have massively parallel processing.""What I like about Apache Hadoop is that it's for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most because some solutions allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies.""Hadoop File System is compatible with almost all the query engines."

More Apache Hadoop Pros →

"The solution includes a feature that increases the number of processors used which makes it very powerful and adds to the scalability.""Data Factory's best feature is the ease of setting up pipelines for data and cloud integrations.""Feature-wise, one of the most valuable ones is the data flows introduced recently in the solution.""The tool's most valuable features are its connectors. It has many out-of-the-box connectors. We use ADF for ETL processes. Our main use case involves integrating data from various databases, processing it, and loading it into the target database. ADF plays a crucial role in orchestrating these ETL workflows.""Its integrability with the rest of the activities on Azure is most valuable.""Data Factory's best features are simplicity and flexibility.""The function of the solution is great.""The data copy template is a valuable feature."

More Azure Data Factory Pros →

Cons
"The upgrade path should be improved because it is not as easy as it should be.""We would like to have more dynamics in merging this machine data with other internal data to make more meaning out of it.""The main thing is the lack of community support. If you want to implement a new API or create a new file system, you won't find easy support.""Based on our needs, we would like to see a tool for data visualization and enhanced Ambari for management, plus a pre-built IoT hub/model. These would reduce our efforts and the time needed to prove to a customer that this will help them.""It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake.""The solution needs a better tutorial. There are only documents available currently. There's a lot of YouTube videos available. However, in terms of learning, we didn't have great success trying to learn that way. There needs to be better self-paced learning.""The solution is not easy to use. The solution should be easy to use and suitable for almost any case connected with the use of big data or working with large amounts of data.""Since it is an open-source product, there won't be much support."

More Apache Hadoop Cons →

"The solution should offer better integration with Azure machine learning. We should be able to embed the cognitive services from Microsoft, for example as a web API. It should allow us to embed Azure machine learning in a more user-friendly way.""The deployment should be easier.""The user interface could use improvement. It's not a major issue but it's something that can be improved.""It would be better if it had machine learning capabilities.""The number of standard adaptors could be extended further.""They require more detailed error reporting, data normalization tools, easier connectivity to other services, more data services, and greater compatibility with other commonly used schemas.""The solution needs to be more connectable to its own services.""Snowflake connectivity was recently added and if the vendor provided some videos on how to create data then that would be helpful."

More Azure Data Factory Cons →

Pricing and Cost Advice
  • "Do take into consider that data storage and compute capacity scale differently and hence purchasing a "boxed" / 'all-in-one" solution (software and hardware) might not be the best idea."
  • "​There are no licensing costs involved, hence money is saved on the software infrastructure​."
  • "This is a low cost and powerful solution."
  • "The price of Apache Hadoop could be less expensive."
  • "If my company can use the cloud version of Apache Hadoop, particularly the cloud storage feature, it would be easier and would cost less because an on-premises deployment has a higher cost during storage, for example, though I don't know exactly how much Apache Hadoop costs."
  • "We don't directly pay for it. Our clients pay for it, and they usually don't complain about the price. So, it is probably acceptable."
  • "The price could be better. Hortonworks no longer exists, and Cloudera killed the free version of Hadoop."
  • "We just use the free version."
  • More Apache Hadoop Pricing and Cost Advice →

  • "In terms of licensing costs, we pay somewhere around S14,000 USD per month. There are some additional costs. For example, we would have to subscribe to some additional computing and for elasticity, but they are minimal."
  • "This is a cost-effective solution."
  • "The price you pay is determined by how much you use it."
  • "Understanding the pricing model for Data Factory is quite complex."
  • "I would not say that this product is overly expensive."
  • "The licensing is a pay-as-you-go model, where you pay for what you consume."
  • "Our licensing fees are approximately 15,000 ($150 USD) per month."
  • "The licensing cost is included in the Synapse."
  • More Azure Data Factory Pricing and Cost Advice →

    report
    Use our free recommendation engine to learn which Cloud Data Warehouse solutions are best for your needs.
    770,141 professionals have used our research since 2012.
    Questions from the Community
    Top Answer:Tools like Apache Hadoop are knowledge-intensive in nature. Unlike other tools in the market currently, we cannot understand knowledge-intensive products straight away. To use Apache Hadoop, a person… more »
    Top Answer:AWS Glue and Azure Data factory for ELT best performance cloud services.
    Top Answer:Azure Data Factory is flexible, modular, and works well. In terms of cost, it is not too pricey. It offers the stability and reliability I am looking for, good scalability, and is easy to set up and… more »
    Top Answer:Azure Data Factory is a solid product offering many transformation functions; It has pre-load and post-load transformations, allowing users to apply transformations either in code by using Power… more »
    Ranking
    5th
    out of 35 in Data Warehouse
    Views
    2,467
    Comparisons
    2,109
    Reviews
    11
    Average Words per Review
    573
    Rating
    7.9
    3rd
    Views
    8,126
    Comparisons
    6,366
    Reviews
    47
    Average Words per Review
    509
    Rating
    8.0
    Comparisons
    Learn More
    Overview
    The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.

    Azure Data Factory efficiently manages and integrates data from various sources, enabling seamless movement and transformation across platforms. Its valuable features include seamless integration with Azure services, handling large data volumes, flexible transformation, user-friendly interface, extensive connectors, and scalability. Users have experienced improved team performance, workflow simplification, enhanced collaboration, streamlined processes, and boosted productivity.

    Sample Customers
    Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
    1. Adobe 2. BMW 3. Coca-Cola 4. General Electric 5. Johnson & Johnson 6. LinkedIn 7. Mastercard 8. Nestle 9. Pfizer 10. Samsung 11. Siemens 12. Toyota 13. Unilever 14. Verizon 15. Walmart 16. Accenture 17. American Express 18. AT&T 19. Bank of America 20. Cisco 21. Deloitte 22. ExxonMobil 23. Ford 24. General Motors 25. IBM 26. JPMorgan Chase 27. Microsoft (Azure Data Factory is developed by Microsoft) 28. Oracle 29. Procter & Gamble 30. Salesforce 31. Shell 32. Visa
    Top Industries
    REVIEWERS
    Financial Services Firm38%
    Comms Service Provider25%
    Hospitality Company6%
    Consumer Goods Company6%
    VISITORS READING REVIEWS
    Financial Services Firm28%
    Computer Software Company10%
    Comms Service Provider6%
    University6%
    REVIEWERS
    Computer Software Company34%
    Insurance Company11%
    Manufacturing Company8%
    Financial Services Firm8%
    VISITORS READING REVIEWS
    Computer Software Company13%
    Financial Services Firm13%
    Manufacturing Company8%
    Healthcare Company7%
    Company Size
    REVIEWERS
    Small Business34%
    Midsize Enterprise20%
    Large Enterprise46%
    VISITORS READING REVIEWS
    Small Business14%
    Midsize Enterprise11%
    Large Enterprise74%
    REVIEWERS
    Small Business29%
    Midsize Enterprise19%
    Large Enterprise52%
    VISITORS READING REVIEWS
    Small Business18%
    Midsize Enterprise13%
    Large Enterprise70%
    Buyer's Guide
    Apache Hadoop vs. Azure Data Factory
    March 2024
    Find out what your peers are saying about Apache Hadoop vs. Azure Data Factory and other solutions. Updated: March 2024.
    770,141 professionals have used our research since 2012.

    Apache Hadoop is ranked 5th in Data Warehouse with 33 reviews while Azure Data Factory is ranked 3rd in Cloud Data Warehouse with 81 reviews. Apache Hadoop is rated 7.8, while Azure Data Factory is rated 8.0. The top reviewer of Apache Hadoop writes "Handles huge data volumes and create your own workflows and tables but you need to have deeper knowledge". On the other hand, the top reviewer of Azure Data Factory writes "The data factory agent is quite good but pricing needs to be more transparent". Apache Hadoop is most compared with Microsoft Azure Synapse Analytics, Oracle Exadata, Snowflake, Teradata and BigQuery, whereas Azure Data Factory is most compared with Informatica PowerCenter, Informatica Cloud Data Integration, Alteryx Designer, Snowflake and IBM InfoSphere DataStage. See our Apache Hadoop vs. Azure Data Factory report.

    See our list of best Cloud Data Warehouse vendors.

    We monitor all Cloud Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.