If you were talking to someone whose organization is considering Spark SQL, what would you say?
How would you rate it and why? Any other tips or advice?
I would rate Spark SQL a nine out of ten. My advice would be to read the Databricks books about Spark; they are a good source of knowledge. In the next update, we'd like to see better performance on small amounts of data. Processing small data sets with Spark SQL is possible, but there are other tools that are faster and cheaper for that.
We will have a lot of big data, which is why we need it; without that volume, the solution would not be necessary. Whether it fits really depends on the size of your data, its complexity, and the analysis that you are doing. Spark is good, but it is not mandatory. Since I don't have production experience with the solution, the best I can rate it for now is five out of ten.
We use both the on-premises and cloud deployment models. We have a relationship with Cloudera and use their distribution channels. We don't have a relationship with Apache. Spark SQL is a good product. However, users need to be capable of implementing the right tools and using them efficiently. I'd rate the solution seven out of ten.
We've just started using this solution. Until recently, we were using it on a research basis, just to measure performance, cost, and so on. Many things could be improved, but it has been okay so far and I'm happy with it. I would recommend the product. I would rate this solution eight out of ten.