Apache Hadoop Review

HDFS allows you to store large data sets optimally. After switching to big data pipelines, our query performance improved a hundred times.


What is most valuable?

HDFS allows you to store large data sets optimally.

How has it helped my organization?

After switching to big data pipelines, our query performance improved a hundred times.

What needs improvement?

Rolling restarts of DataNodes could be further optimized. I/O operations could also be tuned for better performance.

For how long have I used the solution?

I have used Hadoop for over three years.

What do I think about the stability of the solution?

We had one stability issue, caused by a complete shutdown of a cluster. Bringing the cluster back up took a long time because the services had to be started in a specific order.

What do I think about the scalability of the solution?

We have not had scalability issues.

How is customer service and technical support?

The community is very supportive and provides prompt replies and suggestions on JIRA tickets.

Which solutions did we use previously?

We didn’t have a previous solution. It was a move from RDBMS to big data.

How was the initial setup?

The initial setup of a few nodes was simple, but it became complex as we increased the node count, because we needed to maintain rack topology, among other things.
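
To illustrate the rack topology work: Hadoop can call an administrator-supplied script, configured through the net.topology.script.file.name property, to map each node to a rack. The following is only a minimal sketch under stated assumptions, not the script used in this deployment. The addresses, the rack names, and the resolve_racks helper are all hypothetical.

```python
#!/usr/bin/env python3
"""Hypothetical rack-mapping script for net.topology.script.file.name."""
import sys

# Assumed host-to-rack table; these addresses and rack names are made up.
RACK_BY_HOST = {
    "10.0.1.11": "/dc1/rack1",
    "10.0.1.12": "/dc1/rack1",
    "10.0.2.21": "/dc1/rack2",
}

# Hosts missing from the table fall back to Hadoop's default rack name.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes one or more IPs or hostnames as arguments and reads
    # one rack path per argument from standard output.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

In a sketch like this the table has to be updated whenever nodes are added or moved, which is part of why setup becomes harder as the node count grows.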

What's my experience with pricing, setup cost, and licensing?

It’s free and it is open source.

What other advice do I have?

I would suggest using this product. We were able to use this for petabytes of data.

Disclosure: I am a real user, and this review is based on my own experience and opinions.