
Compare Apache Hadoop vs. Azure Data Factory

Apache Hadoop: 7,650 views | 6,306 comparisons
Azure Data Factory: 26,751 views | 21,970 comparisons
Find out what your peers are saying about Snowflake Computing, Oracle, Micro Focus and others in Data Warehouse. Updated: November 2021.
552,407 professionals have used our research since 2012.
Quotes From Members

We asked business professionals to review the solutions they use. Here are some excerpts of what they said:

Pros

Apache Hadoop:
"Hadoop is extensible; it's elastic."
"Hadoop is designed to be scalable, so I don't think that it has limitations in regards to scalability."
"The most valuable feature is the database."
"The solution is easy to expand. We haven't seen any issues with it in that sense. We've added 10 servers, and we've added two nodes. We've been expanding since we started using it since we started out so small. Companies that need to scale shouldn't have a problem doing so."
"The performance is pretty good."
"The most valuable features are powerful tools for ingestion, as data is in multiple systems."
"It's good for storing historical data and handling analytics on a huge amount of data."

Azure Data Factory:
"Its integrability with the rest of the activities on Azure is most valuable."
"Azure Data Factory's most valuable features are the packages and the data transformation that it allows us to do, which is more drag and drop, or a visual interface. So, that eases the entire process."
"Powerful but easy-to-use and intuitive."
"The best part of this product is the extraction, transformation, and load."
"The most valuable feature is the copy activity."
"The solution can scale very easily."
"It is easy to deploy workflows and schedule jobs."
"It has built-in connectors for more than 100 sources and onboarding data from many different sources to the cloud environment."

Cons

Apache Hadoop:
"Hadoop's security could be better."
"It would be good to have more advanced analytics tools."
"From the Apache perspective or the open-source community, they need to add more capabilities to make life easier from a configuration and deployment perspective."
"The solution could use a better user interface. It needs a more effective GUI in order to create a better user environment."
"It would be helpful to have more information on how to best apply this solution to smaller organizations, with less data, and grow the data lake."
"The solution is very expensive."
"The solution needs a better tutorial. There are only documents available currently. There's a lot of YouTube videos available. However, in terms of learning, we didn't have great success trying to learn that way. There needs to be better self-paced learning."

Azure Data Factory:
"You cannot use a custom data delimiter, which means that you have problems receiving data in certain formats."
"Real-time replication is required, and this is not a simple task."
"We have experienced some issues with the integration. This is an area that needs improvement."
"On the UI side, they could make it a little more intuitive in terms of how to add the radius components. Somebody who has been working with tools like Informatica or DataStage gets very used to how the UI looks and feels."
"The user interface could use improvement. It's not a major issue but it's something that can be improved."
"The speed and performance need to be improved."
"The number of standard adaptors could be extended further."
"Azure Data Factory should be cheaper to move data to a data center abroad for calamities in case of disasters."

Pricing and Cost Advice

Apache Hadoop:
Information not available

Azure Data Factory:
"It's not particularly expensive."
"The licensing is a pay-as-you-go model, where you pay for what you consume."
"In terms of licensing costs, we pay somewhere around $14,000 USD per month. There are some additional costs. For example, we would have to subscribe to some additional computing and for elasticity, but they are minimal."
"Understanding the pricing model for Data Factory is quite complex."
"This is a cost-effective solution."
"I would not say that this product is overly expensive."
"The licensing cost is included in the Synapse."
"The price you pay is determined by how much you use it."

Questions from the Community
Top Answer: I don't think using Apache Spark without Hadoop has any major drawbacks or issues. I have used Apache Spark quite successfully with AWS S3 on many projects which are batch based. Yes for very high…
Top Answer: Hadoop is extensible; it's elastic.
Top Answer: Hadoop's security could be better.
Top Answer: AWS Glue and Azure Data factory for ELT best performance cloud services.
Top Answer: Azure Data Factory is flexible, modular, and works well. In terms of cost, it is not too pricey. It offers the stability and reliability I am looking for, good scalability, and is easy to set up and…
Top Answer: Azure Data Factory is a solid product offering many transformation functions; It has pre-load and post-load transformations, allowing users to apply transformations either in code by using Power…
Ranking

Apache Hadoop
Ranking: 6th out of 30 in Data Warehouse
Views: 7,650
Comparisons: 6,306
Reviews: 8
Average Words per Review: 388
Rating: 7.5

Azure Data Factory
Ranking: 2nd
Views: 26,751
Comparisons: 21,970
Reviews: 24
Average Words per Review: 494
Rating: 7.7
Overview
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing. The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures.
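
As a rough illustration of the "simple programming models" mentioned above, the sketch below mimics the map, shuffle, and reduce steps of a word count in plain Python. It runs in a single process and does not use any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, the step a framework would
    perform between the map and reduce stages."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence (illustrative only,
    everything happens locally rather than on a cluster)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data big cluster", "data lake"]))
```

In a real cluster the map and reduce calls would run on different machines near the data they process; here they are ordinary function calls, which keeps the example self-contained.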

Create, schedule, and manage your data integration at scale with Azure Data Factory - a hybrid data integration (ETL) service. Work with data wherever it lives, in the cloud or on-premises, with enterprise-grade security.

Sample Customers
Apache Hadoop: Amazon, Adobe, eBay, Facebook, Google, Hulu, IBM, LinkedIn, Microsoft, Spotify, AOL, Twitter, University of Maryland, Yahoo!, Cornell University Web Lab
Azure Data Factory: Milliman, Pier 1 Imports, Rockwell Automation, Ziosk, Real Madrid
Top Industries
VISITORS READING REVIEWS
Computer Software Company: 30%
Comms Service Provider: 18%
Financial Services Firm: 13%
Energy/Utilities Company: 5%
REVIEWERS
Computer Software Company: 31%
Insurance Company: 13%
Healthcare Company: 6%
Logistics Company: 6%
VISITORS READING REVIEWS
Computer Software Company: 32%
Comms Service Provider: 14%
Financial Services Firm: 6%
Energy/Utilities Company: 6%
Company Size
REVIEWERS
Small Business: 40%
Midsize Enterprise: 20%
Large Enterprise: 40%
REVIEWERS
Small Business: 24%
Midsize Enterprise: 29%
Large Enterprise: 47%
VISITORS READING REVIEWS
Small Business: 17%
Midsize Enterprise: 8%
Large Enterprise: 75%

Apache Hadoop is ranked 6th in Data Warehouse with 7 reviews while Azure Data Factory is ranked 2nd in Data Integration Tools with 25 reviews. Apache Hadoop is rated 7.6, while Azure Data Factory is rated 7.8. The top reviewer of Apache Hadoop writes "Great micro-partitions, helpful technical support and quite stable". On the other hand, the top reviewer of Azure Data Factory writes "Easy to bring in outside capabilities, flexible, and works well". Apache Hadoop is most compared with Microsoft Azure Synapse Analytics, Snowflake, VMware Tanzu Greenplum, Oracle Exadata and Microsoft Parallel Data Warehouse, whereas Azure Data Factory is most compared with Informatica PowerCenter, Talend Open Studio, Informatica Cloud Data Integration, Palantir Foundry and Denodo.

We monitor all Data Warehouse reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.