Databricks Review

Great for dealing with huge amounts of data and it is easy to connect to different sources of data


What is our primary use case?

Our primary use case is really DevOps, for integration and continuous development. We've combined our database with some components from Azure to deploy elements in Sandbox for our data scientists and for our data engineers. 

What is most valuable?

Valuable features would have to include the Notebook for piping some models and the future of executing the notebooks in parallel, in batches, which is also something that we use. And we use the Notebook on Spark with Python. 

What needs improvement?

Improvements could include the pricing, the product is a little expensive, although I think comparable to other similar options. The integration features could be more interesting, more involved. For example, we use the Database Notebook, which is not as great as Jupyter Notebook, for providing a great user experience. The look and feel are not the same and we've had complaints from some of our users. They say that it's easier and more productive for them to use Jupyter Notebook.

And then there is the integration feature for connecting to data sources, for example, Jupyter Notebook through publishes connect. The problem is that when you do that, you don't get all the Jupyter features which is a shame for us. 

For additional features, having some PyTorch or TensorFlow type features inside would definitely be great. For now, my users are developing for themselves by importing their libraries into their Notebook and then creating models based on the potential flow of PyTorch. That requires a lot of imports, particularly library imports, something that is now available in the new version of  Machine Learning services. These things are very important because the self appliance community has shifted from the traditional way of preparing models, to a deeper learning system. It's now more common to have those features. 

For how long have I used the solution?

I've been using the product inside Azure for about six months now. 

What do I think about the stability of the solution?

Given my experience, the product is very stable. 

What do I think about the scalability of the solution?

The product is quite easy to scale and increasing the number of users is quite simple. 

Which solution did I use previously and why did I switch?

We previously used the earlier version of Azure Machine Learning services and we decided to move over because over time it became more difficult to deploy. That was two years ago, but now with the new version, it's much easier to deploy Machine Learning.

How was the initial setup?

The setup is straightforward, I did it myself. 

What other advice do I have?

The product has improved and I'm sure this will continue in the next versions. We are completely satisfied with it, the ease of connecting to different sources of data or pocket files in the search. 

I think it could be very interesting for users looking for a framework to use Databricks. I would, however, recommend a more complicated architecture for using Databricks and achieving a great result for end-users. 

I would rate this product an eight out of 10. 

Which deployment model are you using for this solution?

Private Cloud
**Disclosure: I am a real user, and this review is based on my own experience and opinions.
Add a Comment
Guest