Databricks Competitors and Alternatives

Get our free report covering Microsoft, Amazon, and other competitors of Databricks. Updated: November 2021.
555,358 professionals have used our research since 2012.

Read reviews of Databricks competitors and alternatives

JD
Full stack Data Analyst at a tech services company with 10,001+ employees
Real User
Plenty of features, powerful AutoML functionality, but better MLflow integration needed

Pros and Cons

  • "Azure Machine Learning Studio's most valuable features are the package from Azure AutoML. It is quite powerful compared to the building of ML in Databricks or other AutoMLs from other companies, such as Google and Amazon."
  • "I have found Databricks is a better solution because it has a lot of different cluster choices and better integration with MLflow, which is much easier to handle in a machine learning system."

What is our primary use case?

I use a combination of Microsoft Azure Machine Learning Studio and Azure Databricks, mostly Azure Databricks, to build a machine learning system. The system has several workflows for model tuning: data pre-processing, quick modeling pipelines that execute within a couple of seconds, and complex model pipelines that involve steps such as hyperparameter tuning. There is also a setting for configuring different AutoML parameters.

For the training and evaluation phase of the whole machine learning system, I use MLflow, which is one core component of Databricks, as a testing system and a model serving system. I use it for the Model Registry, which allows me to do many things, such as registering model info, logs, and evaluation metrics.

What is most valuable?

The newer version of this solution has better integration with automated ML processes and different APIs. I feel it is quite powerful in terms of general machine learning features: it handles training data well through different sampling methods and has more useful modeling parameter settings. People who are not data scientists or data analysts can quickly use the platform to build predictive models from their data.

Azure Machine Learning Studio's most valuable feature is the Azure AutoML package. It is quite powerful compared to building ML in Databricks or to AutoML offerings from other companies, such as Google and Amazon. It has the most sophisticated set of parameter categories. The data encodings and options are good, and it has the most detailed settings for specific models.

What needs improvement?

I have found Databricks is a better solution because it has a lot of different cluster choices and better integration with MLflow, which is much easier to handle in a machine learning system.

The developers of this solution have not been as active in improving it as the developers of other solutions, such as Databricks.

Sometimes there are data drift problems, and this is what I am currently working on: our new data drifts from the older data, and I first need to work out a solution. Both Azure Databricks and Azure Machine Learning Studio work fine in general. However, the standard data drift solution is not working that well for the problem I am facing. I can see the distribution changes and the changes in numerical metrics, but it does not tell me how to fix them.
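
To illustrate the kind of "distribution change" metric mentioned above, here is a minimal sketch of one common drift score, the Population Stability Index (PSI), written in plain Python. This is not the reviewer's method and not the drift monitor built into Azure Machine Learning Studio or Databricks; the function name `population_stability_index`, the bin count, and the sample data are all assumptions made for illustration. Like the tooling described in the review, it only measures drift and says nothing about how to fix it.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Hypothetical helper: PSI between an older (reference) sample and a newer one.

    Bins are equal-width over the range of the reference sample. Values in
    `current` that fall outside that range are counted in the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample must contain more than one distinct value")
    width = (hi - lo) / bins

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        total = len(values)
        # Floor each fraction at eps so the logarithm below is always defined.
        return [max(c / total, eps) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


# Illustrative data only: an "old" feature sample and a shifted "new" one.
old_data = [i / 100 for i in range(100)]
new_data = [0.5 + i / 200 for i in range(100)]

print("PSI, old vs old:", population_stability_index(old_data, old_data))
print("PSI, old vs new:", population_stability_index(old_data, new_data))
```

A score of zero means the binned distributions match; larger values indicate a larger shift. Any alerting threshold is a convention chosen by the team, not something the score itself provides.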

For how long have I used the solution?

I have been using this solution for approximately three months.

Which solution did I use previously and why did I switch?

I use Databricks alongside this solution.

What other advice do I have?

I rate Microsoft Azure Machine Learning Studio a seven out of ten.

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
MO
Consultant at a tech services company with 501-1,000 employees
Consultant
Top 20
Great for automating pipelines and creation of API endpoints

Pros and Cons

  • "Allows you to create API endpoints."
  • "Lacking in some machine learning pipelines."

What is our primary use case?

Our primary use case for SageMaker is developing end-to-end machine learning solutions and ready-made solutions for things such as computer vision, speech recognition, or speech-to-text. It basically provides off-the-shelf solutions. Our customers are generally medium to enterprise size companies. We're a partner of Amazon.

What is most valuable?

The most valuable feature of the solution is that it allows you to create API endpoints and that saves a lot of time for data scientists. 

What needs improvement?

The product has come a long way and they've added a lot of things, but in terms of improvement I would like to have features such as MLflow embedded into it.

Additional features I would like to see include, as mentioned, MLflow and ML Pipelines, which offer more feature-rich support for machine learning pipelines, as well as scheduling and visualization of machine learning pipelines.

For how long have I used the solution?

I've been using this solution for about a year.

What do I think about the stability of the solution?

The solution is quite stable. 

What do I think about the scalability of the solution?

The solution is hosted on Amazon so it's quite scalable.

How are customer service and technical support?

The documentation is good so I haven't needed to use technical support. 

Which solution did I use previously and why did I switch?

SageMaker was the first cloud solution I've used, but other vendors, such as Databricks, Google, and Azure, have similar products. These products share common features, but I'd say that SageMaker has more features than Databricks. Azure has other features in addition to Databricks, but SageMaker has provided everything.

How was the initial setup?

Initial setup is quite straightforward. 

What's my experience with pricing, setup cost, and licensing?

The pricing for the Notebook endpoints is a bit high, but generally reasonable. 

What other advice do I have?

I think SageMaker will help anyone using it to automate pipelines, making the process easier than doing it manually. Anyone already on the AWS platform should definitely make use of it.

I would rate this product an eight out of 10. 

Disclosure: My company has a business relationship with this vendor other than being a customer: partner