Solution Architect at Teradata Corporation
Vendor
We use it for data science activities. Security and workload management need improvement.
Pros and Cons
  • "We use it for data science activities."
  • "Security and workload management need improvement."

What is our primary use case?

We use it for data science activities.

How has it helped my organization?

Data is now available.

What is most valuable?

I have no preference for any particular feature.

What needs improvement?

  • Security
  • Performance
  • Workload management
Buyer's Guide
Hadoop
April 2024
Find out what your peers are saying about Cloudera, IBM, Amazon Web Services (AWS) and others in Hadoop. Updated: April 2024.
770,141 professionals have used our research since 2012.

For how long have I used the solution?

Less than one year.
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Big Data Consultant at a tech services company with 51-200 employees
Consultant
It allows us to provide our customers with data insights that they previously were unable to obtain, but the governance initiatives are far from production ready.

What is most valuable?

Its ability to scale out seamlessly with little to no effort is very valuable to us. All the tools in the stack are built from the ground up to support massive amounts of data.

How has it helped my organization?

It allows us to provide our customers with data insights that they previously were unable to obtain.

What needs improvement?

There have been some governance initiatives, but they are far from production ready. I would like to see a big improvement in that space, as governance is critical in many regulated industries.

For how long have I used the solution?

I've been using it for one year.

What do I think about the stability of the solution?

Stability is good if configured properly, but for some tools, such as HBase, configuration is extremely hard to get right.

What do I think about the scalability of the solution?

Scalability is superb.

How are customer service and technical support?

Customer Service:

I never interacted with customer support.

Which solution did I use previously and why did I switch?

Cloudera and vanilla Big Data tech. We continue to use them alongside Hortonworks, depending on our clients' preferences and needs.

How was the initial setup?

With Ambari, it is pretty straightforward, but I have no idea why they prefer FQDN over IP.

What about the implementation team?

My colleagues and I are the implementation team. The general advice is to start out with a small enough scope. Try to get an MVP up and running before bringing out the big guns.

What's my experience with pricing, setup cost, and licensing?

Licensing is on a per-node basis, which encourages people to scale vertically rather than horizontally, yet the whole purpose of the tools they sell is to scale horizontally. I do like that everything is also freely available for those who do not require support.
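
To make the per-node licensing point concrete, here is a minimal arithmetic sketch. The fee, core counts, and function names are hypothetical figures chosen for illustration; they are not taken from any vendor price list.

```python
import math

def nodes_needed(total_cores: int, cores_per_node: int) -> int:
    """Smallest number of nodes that provides at least total_cores."""
    return math.ceil(total_cores / cores_per_node)

def license_cost(total_cores: int, cores_per_node: int, fee_per_node: float) -> float:
    """Licence cost when the fee is charged per node, whatever the node size."""
    return nodes_needed(total_cores, cores_per_node) * fee_per_node

# Hypothetical figures, for illustration only.
FEE_PER_NODE = 1000.0
TARGET_CORES = 256

# Same total compute, two cluster shapes.
scale_out = license_cost(TARGET_CORES, cores_per_node=8, fee_per_node=FEE_PER_NODE)   # 32 small nodes
scale_up = license_cost(TARGET_CORES, cores_per_node=64, fee_per_node=FEE_PER_NODE)   # 4 large nodes

print(f"scale out: {scale_out:.0f}, scale up: {scale_up:.0f}")
```

Under these assumed numbers the many-small-nodes layout pays eight times the licence fee for the same core count, which is the incentive described above.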

What other advice do I have?

Make sure you understand what happens under the hood. Out-of-the-box tools are sub-par. Customisation is the way to go for now.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
ICT Consultant (Advanced Infrastructure) at a tech services company with 1,001-5,000 employees
Consultant
The Ambari server provides the user an easy way to manage, administrate, and configure their clusters, but it needs to support having more than two HDFS namenodes.

What is most valuable?

There's not only one; the whole Hadoop stack is valuable: the distributed file system HDFS, Spark, Kafka, HBase, etc. Hortonworks certainly has the most up-to-date version of each Hadoop component.

Compared to the other Hadoop distributions, the Ambari server gives the user an easy way to manage, administer, and configure their cluster. Ambari also provides a single view that lets you use different Hadoop components from the same web interface.

How has it helped my organization?

This product lets the organization install and configure a Hadoop cluster easily and quickly. With this cluster, the organization can store and process its data and surface specific patterns in it, for example previously unknown commonalities between its clients, or key factors that increase or decrease client churn.

What needs improvement?

It would be interesting to have an easy way to implement multi-tenancy for HDFS with federation. At the moment, you have to do it manually on the command line.

Also, it needs to support having more than two HDFS namenodes. HDFS supports more than 2 namenodes, but Hortonworks doesn't.
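
As a rough illustration of what the manual configuration involves, the sketch below generates part of an `hdfs-site.xml` for one nameservice with three NameNodes. The property names (`dfs.nameservices`, `dfs.ha.namenodes.<nameservice>`, `dfs.namenode.rpc-address.<nameservice>.<namenode>`) are standard HDFS HA settings; the nameservice name, NameNode IDs, hostnames, and the RPC port 8020 are assumptions made up for this example. A working HA setup needs further properties (shared edits storage, failover proxy provider, fencing) that are omitted here, and nothing in this sketch shows whether a given distribution or Ambari version will accept a third NameNode.

```python
import xml.etree.ElementTree as ET
from typing import Dict

def ha_properties(nameservice: str, namenodes: Dict[str, str], rpc_port: int = 8020) -> Dict[str, str]:
    """Build the core HA properties for one nameservice.

    namenodes maps a NameNode ID (e.g. "nn1") to its host name.
    """
    props = {
        "dfs.nameservices": nameservice,
        f"dfs.ha.namenodes.{nameservice}": ",".join(namenodes),
    }
    for nn_id, host in namenodes.items():
        props[f"dfs.namenode.rpc-address.{nameservice}.{nn_id}"] = f"{host}:{rpc_port}"
    return props

def to_hdfs_site_xml(props: Dict[str, str]) -> str:
    """Render the properties in Hadoop's <configuration>/<property> XML layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")

# Hypothetical nameservice and hosts.
props = ha_properties(
    "mycluster",
    {"nn1": "master1.example.com", "nn2": "master2.example.com", "nn3": "master3.example.com"},
)
print(to_hdfs_site_xml(props))
```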

For how long have I used the solution?

I have worked with it on different projects and POCs for two years now.

What was my experience with deployment of the solution?

The only issue I had was when I tried to reinstall the software on every node. You have to clean up everything manually, as Hortonworks doesn't provide a way to perform a clean uninstall (software, libraries, logs, configuration files, etc.). In some cases, this can cause problems if the uninstall has not been done correctly.

How are customer service and technical support?

I have never had to open a support case, so I don't know. I have always found the answers to my questions on the web (forums or blogs). There's a big community that can support you.

Which solution did I use previously and why did I switch?

I also used Cloudera, MapR, and Microsoft HDInsight.

How was the initial setup?

The first time, I didn’t know anything about Big Data and Hadoop, so yes it was difficult because I did not clearly understand what I was doing.

What about the implementation team?

The implementation was at the client's datacenter. My advice is to perform a POC on premise or in a virtual machine to learn how to use it and how to tune the configuration of each Hadoop component.

When implementing it in production, you first need a clear view of the prerequisites for the install. For example, if you are using a local repository to install the software, it has to be updated with the Hortonworks sources, especially if there are security rules (firewall access, root access limitations, etc.).

My last piece of advice is that if you have a heavy load, it is really important to implement the solution on premise, not in a virtualized environment. If you do both, you will see the difference in performance.

What's my experience with pricing, setup cost, and licensing?

Hortonworks is free to use and there is no license, but paid support is available if you want it. It's up to you to decide whether you need it (you almost certainly do) and perhaps to negotiate the price.

Which other solutions did I evaluate?

I did not really make the choice; the client made it based on their experience, the functionality of each distribution, data privacy, and the licensing/support price.

What other advice do I have?

First, perform a POC to learn and to get an idea of the load of your future applications. Then you should be able to correctly design the infrastructure you need.
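
Once a POC has given you a data-volume estimate, a first-pass storage sizing can be sketched as below. The replication factor of 3 is the HDFS default; the 25% headroom for temporary and intermediate data, the 48 TB of disk per worker, and the 100 TB data volume are assumptions for illustration, not recommendations.

```python
import math

def raw_storage_tb(data_tb: float, replication: int = 3, temp_overhead: float = 0.25) -> float:
    """Raw disk needed: replicated data plus headroom for temporary/intermediate data."""
    return data_tb * replication * (1 + temp_overhead)

def worker_nodes(data_tb: float, disk_per_node_tb: float,
                 replication: int = 3, temp_overhead: float = 0.25) -> int:
    """Number of worker nodes needed to hold the raw storage, rounded up."""
    return math.ceil(raw_storage_tb(data_tb, replication, temp_overhead) / disk_per_node_tb)

# Hypothetical POC result: 100 TB of data, workers with 48 TB of usable disk each.
needed_tb = raw_storage_tb(100)
nodes = worker_nodes(100, disk_per_node_tb=48)
print(f"raw storage: {needed_tb} TB, worker nodes: {nodes}")
```

This only covers storage; CPU, memory, and network needs depend on the workload observed during the POC.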

Disclosure: My company has a business relationship with this vendor other than being a customer: We are partners.
PeerSpot user
Lead IT Consultant at a tech services company with 5,001-10,000 employees
MSP
We've integrated our current distribution of it with Tableau. We had issues upgrading to the newer versions, but these were resolved with their help.

What is most valuable?

The features I've found most valuable are:

  • Ambari UI
  • Hive
  • Pig
  • Tableau integration with this distribution

How has it helped my organization?

It's easy to deploy and we've used this distribution for some of our recommendation and trend analysis use cases.

For how long have I used the solution?

I've used it for almost one year.

What was my experience with deployment of the solution?

No issues encountered.

What do I think about the stability of the solution?

No issues encountered.

What do I think about the scalability of the solution?

We faced some issues while upgrading our current distribution to newer versions, but with their support we resolved them.

How are customer service and technical support?

Customer Service:

Customer service is great.

Technical Support:

Technical support is great.

Which solution did I use previously and why did I switch?

No, we did not use a previous solution.

How was the initial setup?

Initial setup was straightforward.

What about the implementation team?

We implemented it with our in-house team.

Disclosure: My company has a business relationship with this vendor other than being a customer: We're partners.
PeerSpot user
Associate Consultant at a tech vendor with 501-1,000 employees
Real User
The Ambari UI is valuable for cluster monitoring, but there are certain features that need tuning, such as the Hue UI.

What is most valuable?

From a product standpoint, their Ambari UI is incredibly valuable for cluster monitoring. It simplifies the deployment and maintenance of hosts, and we can provision, configure and test Hadoop services.

How has it helped my organization?

From an overall perspective, Hortonworks support is crucial to our operations.

What needs improvement?

As this is open source, there are certain features that need tuning, such as the Hue UI. More stability on this would be helpful.

For how long have I used the solution?

I've used it for one year.

What was my experience with deployment of the solution?

As this is all new technology, we face issues at every level. However, hardware support and documentation have been instrumental in helping us resolve the majority of those issues.

What do I think about the stability of the solution?

We've had some issues with stability, but hardware support and documentation have helped us resolve most of those.

What do I think about the scalability of the solution?

No issues with scalability.

How are customer service and technical support?

They have outstanding customer support. Their responses are prompt, and they resolve issues quickly.

Which solution did I use previously and why did I switch?

I have not used a solution of this nature before.

How was the initial setup?

The setup is straightforward enough, but at every level there are many parameters to be tuned. Ensuring all these parameters are set correctly is the complex part, as poorly set parameters can cause unwanted issues.

What about the implementation team?

We have an in-house team to do implementations. I would advise that every implementation be seen through all the way to having users smoke-test the applications to ensure correct functionality.

What other advice do I have?

I would suggest that if you are implementing this at an enterprise level, support is compulsory. Additionally, a high degree of patience is key, as this is open source and road bumps can be frequent when moving at a fast pace.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Consultant at a tech services company with 51-200 employees
Consultant
It enables customers to perform analyses ranging from sentiment analysis of social media data to engineering analytics. NameNode High Availability is still not stable.

Valuable Features:

Hortonworks is 100% Open Source. Hortonworks does a great job in managing all different components of Hadoop.

Improvements to My Organization:

We've done multiple implementations of it. It enables customers to perform analyses ranging from sentiment analysis of social media data to engineering analytics.

Room for Improvement:

Security: Although they support Knox, Ranger, and Kerberos, they are still missing attribute-level encryption features.

NameNode High Availability is still not stable (memory issues).

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Big Data Architect at a tech services company with 1,001-5,000 employees
Consultant
We have faster processing times for our apps, but it needs to automate deployment on multiple nodes.

What is most valuable?

There are several features that are most valuable for us:

  • Hue
  • Hive
  • Spark
  • S3

How has it helped my organization?

With it, we have faster processing times for our apps.

What needs improvement?

It needs to be quicker and to have the ability to automate deployment on multiple nodes.

For how long have I used the solution?

I've used it for two years.

What was my experience with deployment of the solution?

Sometimes there were issues.

What do I think about the stability of the solution?

Sometimes there were issues.

What do I think about the scalability of the solution?

Sometimes there were issues.

How are customer service and technical support?

I've not had to use it.

Which solution did I use previously and why did I switch?

No solution had been used previously, but we are using it alongside AWS EMR.

How was the initial setup?

It was complex to configure.

What about the implementation team?

It was done in-house.

What other advice do I have?

We provide services for product implementation, so people looking for such products can contact me.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
it_user742794 - PeerSpot reviewer
Works at a comms service provider with 10,001+ employees
Vendor
Enabled us to implement fraud detection and improve performance at a lower cost
Pros and Cons
  • "Ranger for security; with Ranger we can manage users' permissions/access controls very easily."
  • "Hive performance. If Hive performance increased, Hadoop would replace (not everywhere) traditional databases."

What is most valuable?

A few of them, namely: Hive/Tez, HBase, Ranger, YARN and Ambari. Ambari helps manage the platform, and Hive is very easy to use. Ranger for security; with Ranger we can manage users' permissions/access controls very easily.

How has it helped my organization?

We have successfully ported a Microsoft SSIS application to Hadoop, which saved the company millions of dollars while also delivering better performance. We also implemented fraud detection for online orders as quickly as possible. (Fraudulent orders had become a big headache for our company, and early detection of fraud is saving it a lot of money.)

What needs improvement?

Hive performance. If Hive performance increased, Hadoop would replace (not everywhere) traditional databases (Oracle/Teradata, etc.), which would save a lot of money for the company.

For how long have I used the solution?

I have been working on this HDP platform since Jan 2015.

What do I think about the stability of the solution?

We have had no stability issues; our company is a satisfied customer.

What do I think about the scalability of the solution?

We have had no scalability issues at all.

What other advice do I have?

The product is good. I gave it a rating of eight because the community is very large and relatively quick with bug fixes.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Hadoop Report and find out what your peers are saying about Cloudera, IBM, Amazon Web Services (AWS), and more!
Updated: April 2024