If you were talking to someone whose organization is considering Cloudera Distribution for Hadoop, what would you say?
How would you rate it and why? Any other tips or advice?
I would rate this solution a nine out of ten.
My advice is to focus on what tools are available on the market. Most companies now deliver open-source technologies and provide support for them. Now I have the option to purchase a license for whatever platform for $1, and I can deliver it with another small company at a lower cost. If I were the decision-maker, I'd invest in open-source tools. Cloudera and similar companies are trying to adapt to these big data technologies and open-source tools, and Cloudera is trying to put them inside its platform so that we can have a compatible solution. I would rate it an eight out of ten.
The suitability of this solution depends on the size of the data that you are going to be working with. If you are going to be working with a huge dataset that contains many gigabytes of data, then this is a good solution. For smaller datasets, you should also consider other technologies. My advice for anybody who is implementing this solution is to take some time to learn it. Beyond that, be sure to contact support if you have any problems because they are very helpful. I would rate this solution a seven out of ten.
I would recommend the solution provided that they have proven the business case and the technology. We have found that if you don't address the right business case, you end up buying a technology that doesn't necessarily solve your business problems. I would rate the solution seven out of ten. The main reasons for not rating it higher are that the overall support is not great and that we've found some limitations in functionality. It wasn't mature when we started, but it's getting better.
I will rate this solution a nine out of ten because nothing is ever perfect. You will always face problems, but I'm quite happy with Cloudera.
I had a bad experience connecting the Cloudera Distribution for Hadoop cluster to my other resources in the company, like Active Directory or the firewall. I would like integration with the outside environment to be easier to handle. I would rate this eight out of ten because the solution doesn't cover everything. It is a very complicated solution because it contains a lot of internal tools.
I would rate this solution seven out of ten. There's tons of room for improvement.
I would say that the product, as it currently is, rates an eight out of ten. The score is not higher because of the workarounds that we have to do for certain models that do not support using multiple programming languages. For example, a single notebook is inflexible if you want to use other programming languages. As for other advice for people considering this solution, I would say take a good look at your business need before you decide on this technology and which solution to choose. Make sure that you cannot already solve your particular, identified needs with your existing technology before even considering a change. You want to be sure you're applying the technology to the right business case because of actual need and not just change for change's sake.