If you were talking to someone whose organization is considering Databricks, what would you say?
How would you rate it and why? Any other tips or advice?
Our client is a bank and some of the information can be shared outside of the organization, whereas some of the data is confidential and private. Using a purely on-premises solution would have made it more difficult to share information with the outside, which is one of the reasons that they wanted a cloud-based deployment. My advice for anybody who is considering this solution is that it is very good for unstructured or semi-structured data. If, however, you have structured data then I would recommend a columnar database like Snowflake or Vertica. These solutions are easier to deploy. This is a good solution that is working well, but I don't think that it is really a SaaS. I would rate this solution a seven out of ten.
On a scale from one to ten where one is the worst and ten is the best, I would rate Databricks overall as around a 7 or 7.5. If we had more experience with it and could be sure we had a solid understanding of what it can do and how reliable it is, I might recommend it with a better score. I do not think I should give it more than a seven for now.
It's more data scientists using Databricks. I would call them power users trying to see how they can get a handle on it, though they are not data scientists. They try to understand it a little bit better for their future use. On a scale of one to ten, I would rate it an eight, easily.
We're partners with Databricks. We're using the latest version of the solution, but I can't recall what version number we are on. I'd advise others considering the solution to look at usage. They shouldn't adopt the solution blindly. How the implementation and usage will go will depend on the skill of the data engineer and what your requirements are. I'd rate the solution seven out of ten.
I work in the data science field and I found Databricks to be very useful. If I want to run any models then I can code them in PySpark. If you are coming from a Python background then you can write code in PySpark and it runs quickly. This is a good solution in terms of performance. I would rate this solution a nine out of ten.
I'm a software development engineer. I'm working with the latest version. As long as the developers have an understanding of Spark and know the technical tricks, it's very fast in terms of using the database. I'd rate the solution eight out of ten.
Databricks has been good and I like it. However, it could be improved by enhancing the machine learning libraries and including visualization libraries. I would rate this solution an eight out of ten.
My advice for developers who are interested in working with this solution is to first go through the Spark architecture. I would rate this solution a nine out of ten.
The product has improved and I'm sure this will continue in the next versions. We are completely satisfied with it, particularly the ease of connecting to different sources of data or Parquet files in the search. I think it could be very interesting for users looking for a framework to use Databricks. I would, however, recommend a more complicated architecture for using Databricks and achieving a great result for end-users. I would rate this product an eight out of ten.
By investing in people skilled in data querying, Python coding, and even basic data science, a Databricks setup will reward the business. Once the Databricks data flows are established, it takes only a few incremental steps to open up streaming and run up-to-the-minute queries, allowing the business to build its data-driven processes. Databricks continues to advance the state of the art and will be my go-to choice for mission-critical PySpark and ML workflows.