Databricks Overview

Databricks is the #1 ranked solution in our list of Streaming Analytics tools. It is most often compared to Amazon SageMaker.

What is Databricks?

Databricks offers a Unified Analytics Platform that aims to accelerate innovation by unifying data science, engineering, and business. It uses Apache Spark to help clients with cloud-based big data processing, putting Spark on “autopilot” to significantly reduce operational complexity and management cost. The Databricks I/O module (DBIO) improves the read and write performance of Apache Spark in the cloud, and Databricks’ collaborative workspace is intended to increase productivity.

Databricks is also known as Databricks Unified Analytics, Databricks Unified Analytics Platform, Redash.

Databricks Buyer's Guide

Download the Databricks Buyer's Guide including reviews and more. Updated: May 2021

Databricks Customers

Elsevier, MyFitnessPal, Sharethrough, Automatic Labs, Celtra, Radius Intelligence, Yesware

Pricing Advice

What users are saying about Databricks pricing:
  • "Licensing on site I would counsel against, as on-site hardware issues tend to really delay and slow down delivery."
  • "I am based in South Africa, where it is expensive adapting to the cloud, and then there is the price for the tool itself."
  • "Whenever we want to find the actual costing, we have to send an email to Databricks, so having the information available on the internet would be helpful."
  • "The price is okay. It's competitive."

VP
Data Scientist at an energy/utilities company with 10,001+ employees
Real User
Has a good feature set but it needs samples and templates to help invite users to see results

What is our primary use case?

I am a data scientist here and that is my official role. I own the company. Our team is quite small at this point. We have around five people on the team and we are working with about five different businesses. The projects we get from them are massive undertakings. Each of us on the team takes multiple roles in our company and we use multiple tools to help best serve our clients. We are trying to look at creative ways that different solutions can be integrated and we try to understand what products we can use to create solutions for client companies that will be effective in meeting their…

Pros and Cons

  • "Imageflow is a visual tool that helps make it easier for business people to understand complex workflows."
  • "The product needs samples and templates to help invite users to see results and understand what the product can do."

What other advice do I have?

On a scale from one to ten where one is the worst and ten is the best, I would rate Databricks overall as around a 7 or 7.5. If we had more experience with it and could be sure we had a solid understanding of what it could do and the reliability, I might recommend it with a better score. I do not think I should give it more than a seven for now.
MM
Lead Data Architect at a government with 1,001-5,000 employees
Real User
Good integration with the majority of data sources through Databricks Notebooks using Python, Scala, SQL, and R

What is our primary use case?

We used Databricks in AWS on top of S3 buckets as a data lake. The primary use case was providing consistent, ACID-compliant data sets with full history and time series that could be used for analytics.

Pros and Cons

  • "The initial setup is pretty easy."
  • "Overall it's a good product, however, it doesn't do well against any individual best-of-breed products."

What other advice do I have?

In my current capacity as an architect and end user of Databricks, I would say I have confidence that Databricks can provide a wealth of functionality to start with. My advice to future adopters of Databricks would be to be careful about the overall architectural roadmap for this application and to adopt a flexible, modular, microservices-like architecture whose components could be replaced in the future should they be deemed inadequate to cater for evolving business needs.
Sr. BigData Architect at ITC Infotech
MSP
Top 5
Very elastic, easy to scale, and a straightforward setup

What is our primary use case?

We work with clients in the insurance space mostly. Insurance companies need to process claims. Their claim systems run under Databricks, where we do multiple transformations of the data.

Pros and Cons

  • "It's easy to increase performance as required."
  • "Instead of relying on a massive instance, the solution should offer micro partition levels. They're working on it, however, they need to implement it to help the solution run more effectively."

What other advice do I have?

There isn't really a version, per se. It's a popular service. I'd recommend the solution. The solution is cloud-agnostic right now, so it really can go into any cloud. It's the users who will be leveraging installed environments that can have these services, no matter if they are using Azure or Ubiquiti, or other systems. I don't think you can find any other tool or any other service that is faster than Databricks. I don't see that right now. It's your best option. Overall, I'd rate the solution eight out of ten. The reason I'm not giving it full marks is that it's expensive compared to open…
TB
Data Scientist at iOCO
Real User
Good built-in optimization, easy to use with a great user interface

What is our primary use case?

We are using this solution to run large analytics queries and prepare datasets for SparkML and ML using PySpark. We ran on multiple clusters set up for a minimum of three and a maximum of nine nodes having 16GB RAM each. For one ad hoc requirement, a 32-node cluster was required. Databricks clusters were set for autoscaling and to time out after forty minutes of inactivity. Multiple users attached their notebooks to a cluster. When some workloads required different libraries, a dedicated cluster was spun up for that user.
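
As a rough illustration of the setup described above, the autoscaling range and inactivity timeout could be written as a cluster specification along the following lines. This is a minimal sketch, not the reviewer's actual configuration: the field names follow the shape of a Databricks Clusters API create request, the helper `build_cluster_spec` and the cluster name `shared-analytics` are hypothetical, and treating the reviewer's "nodes" as worker counts is an assumption. A real request would also need a Spark runtime version and node type, which the review does not state.

```python
import json


def build_cluster_spec(name, min_workers, max_workers, idle_minutes):
    """Build a dict shaped like a Databricks cluster-create request.

    Hypothetical helper for illustration only; it does not call any API.
    """
    if not 0 < min_workers <= max_workers:
        raise ValueError("expected 0 < min_workers <= max_workers")
    if idle_minutes <= 0:
        raise ValueError("idle_minutes must be positive")
    return {
        "cluster_name": name,
        # Autoscaling range: the cluster grows and shrinks between these bounds.
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
        # Terminate the cluster after this many minutes of inactivity.
        "autotermination_minutes": idle_minutes,
    }


# Values taken from the review: 3 to 9 nodes, 40-minute inactivity timeout.
spec = build_cluster_spec("shared-analytics", 3, 9, 40)
print(json.dumps(spec, indent=2))
```

A dedicated cluster for a user who needs different libraries would simply be a second specification built the same way with its own name.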

Pros and Cons

  • "The built-in optimization recommendations halved the run time of queries and allowed us to reach decision points and deliver insights very quickly."
  • "The product could be improved by offering an expansion of their visualization capabilities, which currently assist in development in their notebook environment."

What other advice do I have?

By investing in people skilled in data querying, Python coding, and even basic Data Science, a Databricks setup will reward the business. Once the Databricks data flows are established, it is a matter of a few incremental steps to opening up streaming and running up-to-the-minute queries, allowing the business to build its data-driven processes. Databricks continues to advance the state-of-the-art and will be my go-to choice for mission-critical PySpark and ML workflows.
SN
Head of Data & Analytics at a tech services company with 11-50 employees
Real User
Top 5 Leaderboard
Helpful integration with Python and notebooks, but it should be more user-friendly and less complicated to use

What is our primary use case?

We are a consulting house and we employ solutions based on our customers' needs. We don't generally use products internally. I am a certified data engineer from Microsoft and have worked on the Azure platform, which is why I have experience with Databricks. Now that Microsoft has launched Synapse, I think that there will be more use cases.

Pros and Cons

  • "The integration with Python and the notebooks really helps."
  • "Databricks is not geared towards the end-user, but rather it is for data engineers or data scientists."

What other advice do I have?

From a purely technical perspective, I would rate Databricks an eight out of ten. However, there is a failure in terms of user adoption. After I look at other products, including Synapse, those are better. I still feel that Databricks is quite complicated for the average person. I would rate this solution a five out of ten.
Business Intelligence and Analytics Consultant at a tech services company with 201-500 employees
Consultant
Easy to switch loads between clusters and automation is easy using the API

What is our primary use case?

I am a developer and I do a lot of consulting using Databricks. We have been primarily using this solution for ETL purposes. We also do some migration of on-premises data to the cloud.

Pros and Cons

  • "Automation with Databricks is very easy when using the API."
  • "Some of the error messages that we receive are too vague, saying things like "unknown exception", and these should be improved to make it easier for developers to debug problems."

What other advice do I have?

My advice for developers who are interested in working with this solution is to first go through the Spark architecture. I would rate this solution a nine out of ten.
OB
Security Consulting, Manager at a computer software company with 1,001-5,000 employees
MSP
Top 10
A scalable solution to quickly process and analyze streams of information

What is our primary use case?

We are working with Databricks and SMLS in the financial sector for big data and analytics. There are a number of business cases for analysis related to debt there. Several clients are working with it, analyzing data collected over a period of time and planning the next steps in multiple business divisions. My organization is a professional consulting service. We provide services for the other organizations, which implement and use them in a production environment. We manage, implement, and upgrade those services, but we don't use them.

Pros and Cons

  • "Databricks helps crunch petabytes of data in a very short period of time."
  • "Costs can quickly add up if you don't plan for it."

What other advice do I have?

If you're thinking of implementing Databricks, I would recommend working with professionals. It'll help you save time. Also, plan the work and work the plan. Otherwise, it'll be a waste of time and money. On a scale from one to ten, I would give Databricks a nine.
Chief Data-strategist and Director at a consultancy with 11-50 employees
Real User
Top 5 Leaderboard
Flexible, stable, and reasonably priced

What is our primary use case?

We primarily use the solution for retail and manufacturing companies. It allows us to build data lakes.

Pros and Cons

  • "The solution is very easy to use."
  • "The integration of data could be a bit better."

What other advice do I have?

We are customers and end-users. Databricks is on the cloud and, therefore, we're always on the latest version of the solution. It's constantly updated for us so that we have access to the latest updates and upgrades. I'd rate the solution at a nine out of ten. The capability of the product is quite good and we are very satisfied with it overall. I'd recommend the solution to other companies and organizations.