Apache Hadoop Valuable Features

Syed Afroz Pasha - PeerSpot reviewer
Head Of Data Governance at Alibaba Group

Hadoop File System is a perfect choice if we want to use any database system or file system, because it is open-source and has no cost. Otherwise, we would have to use Amazon S3 or an Azure database, for which we would have to pay a lot. A lot of big data processing needs proper partitioning and structure. Hadoop File System is compatible with almost all query engines. That's another reason why people are very comfortable working with the Hadoop ecosystem.

Juliet Hoimonthi - PeerSpot reviewer
Manager at Robi Axiata Limited

What I like about Apache Hadoop is that it's built for big data, in particular big data analysis, and it's the easier solution. I like the data processing feature for AI/ML use cases the most, because some solutions only allow me to collect data from relational databases, while Hadoop provides me with more options for newer technologies.

GM
Data Architect at a computer software company with 51-200 employees

It's open-source, so it's very cost-effective. Apache Hadoop has its strengths. For example, in my previous organization, which was a small startup, we used it because it was cost-effective. 

We only had to pay for the servers, and we could optimize applications and performance using our employees, which was especially cost-effective in India. So, human resources were the main investment, not software. 

That was five years ago, though. In the last five years, I've mainly seen Redshift, Azure, and Oracle in the market.

Buyer's Guide
Apache Hadoop
April 2024
Learn what your peers think about Apache Hadoop. Get advice and tips from experienced pros sharing their opinions. Updated: April 2024.
767,995 professionals have used our research since 2012.
AM
Credit & Fraud Risk Analyst at a financial services firm with 10,001+ employees

The most crucial function that I have discovered is the ability to take a lot of data and deliver the appropriate slices and summary charts.

This stands in contrast to some of the other tools that are available, such as SQL and SAS, which are likely incapable of handling such a large volume of data. Even R, for instance, is unable to handle such data volumes. 

Apache Hadoop can manage large volumes of data with relative ease, which is a beneficial feature.

Abhik Ray - PeerSpot reviewer
Co-Founder at Quantic

The most important feature is its ability to handle large volumes. Some of our customers have really large volumes, and it is capable of handling their data in terms of the core volume and daily incremental volume. So, its processing power and speed are most valuable.

Another feature that I like is online analysis. In some cases, data requires online analysis. We like using Hadoop for that.

RC
Senior Associate at a financial services firm with 10,001+ employees

Impala. As compared to Hive on MapReduce, Impala on MPP returns results of SQL queries in a fairly short amount of time, and is relatively fast when reading data into other platforms like R (for further data analysis) or QlikView (for data visualisation).

Aria Amini - PeerSpot reviewer
Data Engineer at Behsazan Mellat

Its integration is Hadoop's best feature because that allows us to support different tools in a big data platform. Hadoop can integrate all of these features in various environments and have use cases beyond all of the tools in the environment.

YM
CEO at AM-BITS LLC

The most valuable features are scalability, the ability to work with large volumes of information, and the fact that it is open source.

SF
Analytics Platform Manager at a consultancy with 10,001+ employees
  • Scalability
  • Parallel processing

There are jobs that cannot be done unless you have massively parallel processing; for instance, processing call-detail records for telecom.
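As a rough illustration of the split, process, and merge pattern behind this kind of job, here is a minimal single-machine sketch in Python. It is not Hadoop code: the record layout `(caller_id, duration_seconds)`, the sample data, and all function names are assumptions made for the example, and threads merely stand in for cluster nodes.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical call-detail records: (caller_id, duration_seconds).
RECORDS = [
    ("A", 60), ("B", 30), ("A", 15), ("C", 45), ("B", 90), ("A", 5),
]

def split(records, n_parts):
    """Divide the input into n_parts roughly equal partitions."""
    return [records[i::n_parts] for i in range(n_parts)]

def map_partition(partition):
    """Map step: total call duration per caller within one partition."""
    totals = defaultdict(int)
    for caller, seconds in partition:
        totals[caller] += seconds
    return dict(totals)

def reduce_totals(partials):
    """Reduce step: merge the per-partition totals into one result."""
    merged = defaultdict(int)
    for partial in partials:
        for caller, seconds in partial.items():
            merged[caller] += seconds
    return dict(merged)

def total_duration_per_caller(records, n_parts=3):
    # Threads stand in for cluster nodes here; each handles one partition.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(map_partition, split(records, n_parts)))
    return reduce_totals(partials)

if __name__ == "__main__":
    print(total_duration_per_caller(RECORDS))
```

Each partition is summarized independently, so on a real cluster the map step could plausibly run on many machines at once, with only the small per-partition totals merged at the end.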

it_user340983 - PeerSpot reviewer
Infrastructure Engineer at Zirous, Inc.

The Distributed File System, which is the base of Hadoop, has been the most valuable feature with its ability to store video, pictures, JSON, XML, and plain text all in the same file system.

Lucas Dreyer - PeerSpot reviewer
Data Engineer at BBD

We don't use many of the Hadoop features, like Pig or Sqoop, but what I like most is Ambari. You have to use Ambari; otherwise, it is very difficult to configure.

What comes with the standard setup is what we mostly use, but Ambari is the most important.

AM
CEO

HDFS and Kafka: Ingestion of huge volumes and variety of unstructured/semi-structured data is feasible, and it helps us to quickly onboard a new Big Data analytics prospect.

YT
Business data analyst at RBSG Internet operations

One valuable feature is that we can download data. Another is that it is a low-cost solution. Hadoop has also made it feasible to have all the data available in one area.

JP
Vice President - Finance & IT at a consumer goods company with 1-10 employees

The data is stored in micro-partitions, which makes processing very fast compared to RDBMS systems. Apache Spark processes data in memory, and it's much better than MapReduce.

Micro-partitions and the HDFS are both excellent features.

MS
Works

The most valuable features are the ability to process the machine data at a high speed, and to add structure to our data so that we can generate relevant analytics.

it_user265830 - PeerSpot reviewer
Senior Hadoop Engineer with 1,001-5,000 employees
  • Storage
  • Processing (cost efficient)
DD
Partner at a tech services company with 11-50 employees

Hadoop is extensible — it's elastic.

MB
IT Expert at a comms service provider with 1,001-5,000 employees

I liked that Apache Hadoop was powerful, had a lot of tools, and the fact that it was free and community-developed. 

YM
CEO at AM-BITS LLC

The ability to add multiple nodes without any restriction is the solution's most valuable aspect.

MB
IT Expert at a comms service provider with 1,001-5,000 employees

The most valuable thing about this program for us is that it is very powerful and very cheap. We're using a lot of the program's modules and features because we're using software and hardware that can be difficult to integrate. For example, we're using supersets and a lot of old products from difficult systems. We love having the various options and features that allow us to work with flexibility.

SS
Technical Lead at a government with 201-500 employees

The distributed processing is excellent. 

Within the solution, Spark is very good.

The performance is pretty good.

CB
Database/Middleware Consultant (Currently at U.S. Department of Labor) at a tech services company with 51-200 employees
  • Data ingestion: It is fast when Apache Accumulo is used.
  • Data security
  • Inexpensive
GA
Founder & CTO at a tech services company with 1-10 employees

I actually like most of the capabilities, but I think Spark has added reposit capabilities on top of the Hadoop ecosystem. The Spark area includes the capabilities that I like the most with Hadoop. 

it_user1093134 - PeerSpot reviewer
Technical Architect at RBSG Internet Operations

The most valuable feature is the database.

it_user693231 - PeerSpot reviewer
Big Data Engineer at a tech vendor with 5,001-10,000 employees

HDFS allows you to store large data sets optimally.

Abhik Ray - PeerSpot reviewer
Co-Founder at Quantic

The most valuable features are powerful tools for ingestion, as data is in multiple systems.

it_user1208307 - PeerSpot reviewer
Practice Lead (BI/ Data Science) at a tech services company with 11-50 employees

The solution is perfect when you have big data. It's good for management and replication.

It's good for storing historical data and handling analytics on a huge amount of data.

it_user576504 - PeerSpot reviewer
Software Architect at a tech services company with 10,001+ employees

High throughput and low latency. We start with data mashing on Hive and finally use this for KPI visualization.
