Elizabeth Ho - PeerSpot reviewer
Manager, Customer Journey at a retailer with 10,001+ employees
Real User
You can connect multiple data sources and share work easily
Pros and Cons
  • "I like how easy it is to share your notebook with others. You can give people permission to read or edit. I think that's a great feature. You can also pull in code from GitHub pretty easily. I didn't use it that often, but I think that's a cool feature."
  • "I would like it if Databricks adopted an interface more like RStudio. When I create a data frame or a table, RStudio provides a preview of the data. In RStudio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data."

What is our primary use case?

I use Databricks for customer marketing analytics.

What is most valuable?

Databricks lets you schedule jobs pretty easily, and you can use SQL, Spark SQL, Python, or R. It also allows you to save a table or view. 

I like that you can connect to multiple data sources. Most of our data is stored in the Azure data lake, but my previous company connected to SQL databases or even blob storage. 

They've improved many features. I don't do data engineering, but a couple of years ago, two companies back, I had an issue where it took a long time to read and save tables. I think the new Delta feature helped.

I like how easy it is to share your notebook with others. You can give people permission to read or edit. I think that's a great feature. You can also pull in code from GitHub pretty easily. I didn't use it that often, but I think that's a cool feature.

What needs improvement?

I would like it if Databricks adopted an interface more like RStudio. When I create a data frame or a table, RStudio provides a preview of the data. In RStudio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data.

Because I work in analytics and not data engineering, that's probably the biggest issue for me. The graphing is also limited: you can do a simple graph, but it's not that great. There are better graphical tools, though, and I don't think Databricks can ever stack up to Tableau, so it's probably not worth improving on that.

For how long have I used the solution?

I've been using Databricks for two years.

Buyer's Guide
Databricks
June 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: June 2024.
772,679 professionals have used our research since 2012.

What do I think about the stability of the solution?

Databricks is stable.

What do I think about the scalability of the solution?

Databricks is scalable.

How are customer service and support?

Databricks tech support has been great every time I've dealt with them. Their team is highly knowledgeable. 

How was the initial setup?

Setting up Databricks is easy. I set it up at my previous company. That was on Azure as well, but they utilized a third-party team with expertise in Databricks to ensure everything was optimized. 

What other advice do I have?

I rate Databricks 10 out of 10. I recommend taking advantage of Databricks support or a third-party provider to ensure it's set up optimally. I don't know if it's an additional service you must pay for, but we always had access to Databricks support in my last company. 

I think that's worth the money because there are so many different scenarios with distributed computing. Even people who study analytics may not understand the ins and outs of Spark. It's worth it to have a service contract for support.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sanjay Bheemasenarao - PeerSpot reviewer
Director - Data Engineering expert at Sankir Technologies
Real User
Is user friendly and has great performance, but documentation needs improvement
Pros and Cons
  • "Databricks has a scalable Spark cluster creation process. The creators of Databricks are also the creators of Spark, and they are the industry leaders in terms of performance."
  • "If I want to create a Databricks account, I need to have a prior cloud account such as an AWS account or an Azure account. Only then can I create a Databricks account on the cloud. However, if they can make it so that I can still try Databricks even if I don't have a cloud account on AWS or Azure, it would be great. That is, it would be nice if it were possible to create a pseudo account and be provided with a free trial. This is essential to building a workforce on Databricks. For example, students or corporate staff can then explore and learn Databricks."

What is our primary use case?

I use Databricks to explore new features and to show the companies I work with what Databricks offers in terms of visibility and scalability.

I create proof of concepts for companies. As a consultant, I also create training courses on Databricks. If a company wants to leverage a service provided by Databricks and needs to train people, they use our courses.

What is most valuable?

Databricks has a scalable Spark cluster creation process. The creators of Databricks are also the creators of Spark, and they are the industry leaders in terms of performance.

Databricks has made great strides in terms of performance. 

It is very user friendly. I like the ease of creating a Spark cluster, submitting a job, or creating a notebook.

The UI has also changed for the better compared to what it was two years ago.

What needs improvement?

If I want to create a Databricks account, I need to have a prior cloud account such as an AWS account or an Azure account. Only then can I create a Databricks account on the cloud. However, if they can make it so that I can still try Databricks even if I don't have a cloud account on AWS or Azure, it would be great. That is, it would be nice if it were possible to create a pseudo account and be provided with a free trial. This is essential to building a workforce on Databricks. For example, students or corporate staff can then explore and learn Databricks.

It's a big ask to have people jump through a lot of hoops to get approval to create a Databricks cluster just to explore it, but if they can try it on their own with a free trial without an underlying cloud account it would be more convenient.

Documentation can be improved as well. There are so many versions of documents. For example, when I tried to create a DBU vault and secrets file, I had to go through multiple versions of documents. This could be improved so that the documentation is easy to use.

For how long have I used the solution?

I've been using this solution for about two years.

What do I think about the stability of the solution?

Stability-wise, it's quite okay. In my experience, it doesn't crash.

What do I think about the scalability of the solution?

I have not used autoscaling because it consumes a lot of money and because my experience without it has been alright. In some cases, though, scaling is tied to the quota of the underlying infrastructure. I have not tested the scalability to its fullest extent, but with the workloads I run, it has been fine.

How are customer service and support?

When I wanted to create an AWS account and contacted technical support via email, I never received a response. Recently, however, I think they have improved their support a little bit, and I did get a call in response to my question. Overall, I've not faced any issues with the person I had to contact directly.

How was the initial setup?

The initial setup is not very easy; it's medium in complexity.

What's my experience with pricing, setup cost, and licensing?

Databricks is a very expensive solution. Pricing is an area that could definitely be improved. They could provide a lower end compute and probably reduce the price.

What other advice do I have?

I would rate Databricks at seven on a scale from one to ten. If you compare it to Snowflake, for example, Snowflake doesn't mandate an underlying cloud account. It creates one on its own. That's a subtle convenience that Snowflake has and one that Databricks could also build.

Snowflake's documentation is easy to use in comparison to that of Databricks. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
MILTON FERREIRA - PeerSpot reviewer
Co-founder/Senior Data Scientist at Hence
Real User
Responsive support, integrates and scales well
Pros and Cons
  • "The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production."
  • "The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks, which primarily support Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team."

What is our primary use case?

We are using Databricks for machine learning workloads specifically.

Databricks aligns well with our skillset and overall approach. We sought out their solution specifically for a big data application we are currently working on, as we needed a platform capable of handling large amounts of data and building models. Additionally, the fact that they use open-source software and can integrate data warehouse and data lake systems was particularly appealing, as we have encountered such issues in the past. We determined that Databricks would be an effective solution for our needs.

What is most valuable?

The most valuable feature of Databricks is the integration of the data warehouse and data lake, and the development of the lake house. Additionally, it integrates well with Spark for processing data in production. 

What needs improvement?

The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks, which primarily support Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team.

The most important feature other than the Jupyter interface would be to have the RStudio interface inside Databricks. This would be perfect.

For how long have I used the solution?

We have been using Databricks for approximately one year.

What do I think about the stability of the solution?

The stability of Databricks is good.

I rate the stability of Databricks a nine out of ten.

What do I think about the scalability of the solution?

Databricks is scalable.

I rate the scalability of Databricks a nine out of ten.

How are customer service and support?

I have been receiving responsive answers from Databricks's support. I have been pleased with the support.

I rate the support from Databricks a ten out of ten.

How would you rate customer service and support?

Positive

How was the initial setup?

The initial setup of Databricks is simple. I did not experience any challenges. The time it takes for the deployment is approximately four hours.

I rate the initial setup of Databricks.

What about the implementation team?

We deployed the solution in-house. Three people were involved in the deployment: a data engineer, a data analyst, and a machine learning engineer.

What's my experience with pricing, setup cost, and licensing?

We have only incurred the cost of our AWS cloud services. This is because Databricks provided us with an extended evaluation period, so we have not spent much money yet. We are just starting to incur costs this month, so I will know more about the full cost later.

We only pay standard fees for the solution. 

What other advice do I have?

We use a data engineer, data analyst, and machine learning engineer for the maintenance of the solution.

I rate Databricks a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PankajKumar13 - PeerSpot reviewer
Computer Scientist at Adobe
Real User
Pumps up performance and the processing power; comes with helpful Lakehouse and SQL environments
Pros and Cons
  • "When we have a huge volume of data that we want to process with speed, velocity, and volume, we go through Databricks."
  • "I believe that this product could be improved by becoming more user-friendly."

What is our primary use case?

Our primary use case is for data analytics. Essentially, we use it for the financial reporting for Adobe.

How has it helped my organization?

Databricks has improved my organization by giving us better performance and processing power, which we are usually not able to achieve using traditional data warehouses. When we have a huge volume of data that we want to process with speed, velocity, and volume, we go through Databricks.

What is most valuable?

The features I found most helpful with Databricks are the Lakehouse and SQL environments.

What needs improvement?

I believe that this product could be improved by becoming more user-friendly. 

In the next release, I would like to see more flexibility in the dashboard. It has plenty of features, but it can be enhanced so that it matches other visualization tools, like Power BI and Tableau. Also, the integrations with other tools could be better.

For how long have I used the solution?

I have been using Databricks for the last three years.

What do I think about the stability of the solution?

I would rate the stability of Databricks an eight, on a scale from one to 10, with one being the worst and 10 being the best.

What do I think about the scalability of the solution?

I would rate the scalability of this solution a nine, on a scale from one to 10, with one being the worst and 10 being the best. I would say there are around 2,000 to 3,000 users of this solution in our organization.

How are customer service and support?

I've been in contact with the Databricks support team and received timely support from them. I would rate their support an eight, on a scale from one to 10, with one being the worst and 10 being the best.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Prior to Databricks, we initially used Hadoop. Afterwards, we used SAP HANA and Microsoft SQL Server.

How was the initial setup?

The initial setup was relatively straightforward. I would rate it nine, on a scale from one to 10, with one being the easiest and 10 being the hardest.

There is no need to worry about the deployment as it can be done quickly. It is relatively automated. We used Terraform for auto-deployment, which happens in Azure. There are two options for deployment. Option one is to deploy manually by creating the services. Option two is to automate with Terraform, which works as infrastructure as code: you write the deployment as code.

There were two or three people involved in the deployment of this solution.

What other advice do I have?

The new version of the Databricks solution requires code maintenance. This is done by the platform team.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Trond Jensen - PeerSpot reviewer
Data Analyst at Eviny
Real User
Fast and does what it needs to but customer service should be improved upon
Pros and Cons
  • "It is fast, it's scalable, and it does the job it needs to do."
  • "I would like to see the integration between Databricks and MLflow improved. It is quite hard to train multiple models in parallel in a distributed fashion. You hit rate limits on the clients very fast."

What needs improvement?

I would like to see the integration between Databricks and MLflow improved. It is quite hard to train multiple models in parallel in a distributed fashion. You hit rate limits on the clients very fast.

For how long have I used the solution?

I have been using Databricks for three years.

What do I think about the stability of the solution?

I would rate the stability of this solution a nine out of 10, with one being not stable and 10 being very stable.

What do I think about the scalability of the solution?

I would rate the scalability of this solution an eight out of 10, with one being not scalable and 10 being very scalable.

There are three people using this solution in our organization.

How are customer service and support?

I would rate the available customer service a three. It's worth mentioning that this is Microsoft and not Databricks itself. I haven't spoken to Databricks people directly, but I know the people who have and they have been a lot more pleased.

How would you rate customer service and support?

Negative

What's my experience with pricing, setup cost, and licensing?

I would rate their pricing plan a six (on a scale of one to 10, with one being cheap and 10 being expensive). I think the prices could be lowered a little bit.

What other advice do I have?

Overall, I would rate this solution an eight out of 10, with one being quite poor and 10 being excellent. It is fast, it's scalable, and it does the job it needs to do.

Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Olubisi Akintunde - PeerSpot reviewer
Team Lead at a tech services company with 1,001-5,000 employees
MSP
Top 10
Gives us the ability to write analytics code in multiple languages
Pros and Cons
  • "Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user."
  • "Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics."

What is our primary use case?

We use Databricks for batch data processing and stream data processing.

How has it helped my organization?

Databricks provides a consistent interface for data engineers to work with data in a consistent language on a single integrated platform for ingesting, processing, and serving data to the end user.

What is most valuable?

The flexibility of Databricks is the most valuable feature. It gives us the ability to write analytics code in multiple languages.

There is a single workspace for different data roles like data engineers, machine learning engineers, and the end user, who can connect to the same system. 

Databricks separates compute from storage, so you are not coupled to the underlying data sets, allowing for multiple processes and multiple programs to be written on the same code.

What needs improvement?

I would like to see improvement with the UI. It is functional and useful, but it's a bit clunky at times. It should be more user-friendly.

In future releases, Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics.

For how long have I used the solution?

I have been using Databricks for eight months.

What do I think about the stability of the solution?

Databricks is very stable.

What do I think about the scalability of the solution?

The scalability of this solution is good. In our organization, users include analysts, data engineers, and data scientists.

How are customer service and support?

I would give Databricks service and support a four and a half out of five overall.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

Prior to using Databricks, we used Azure Stream Analytics. We made the switch because of the scalability and integrated platform.

How was the initial setup?

The initial setup of Databricks is somewhat complex. I would rate it a four out of five on the complexity of the setup. It took two days to deploy the solution.

What about the implementation team?

We used a third party for some of the implementations of Databricks. The number of staff required to deploy and maintain this solution depends on the number of processes you have. Due to the cloud nature of the technology, it is easy to deploy and maintain. 

What's my experience with pricing, setup cost, and licensing?

The licensing of Databricks is a tiered licensing regime, so it is flexible. I feel their pricing is a five out of five.

What other advice do I have?

Databricks is a one-stop shop for everything data related, and it can scale with you.

I would rate this solution a 9.5 out of 10 overall.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
IshwarSukheja - PeerSpot reviewer
Sr Manager Data Scientist at Bizmetric
Real User
A user-friendly and customizable solution that offers excellent integration
Pros and Cons
  • "The solution is built from Spark and has integration with MLflow, which is important for our use case."
  • "The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps."

What is our primary use case?

Our use case is confidential, but I can say we use it for a deep learning model for machine learning. 

What is most valuable?

The solution is built from Spark and has integration with MLflow, which is important for our use case. 

Databricks is also user-friendly, providing customizable codes and models that allow people to experiment quickly. 

Integration of Delta Lake is another useful feature.

What needs improvement?

Writing pandas-profiling reports could be easier. 

The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps. 

For how long have I used the solution?

I have been using this product for one and a half years. 

What do I think about the stability of the solution?

For now the solution seems stable. 

What do I think about the scalability of the solution?

The solution is easy to scale horizontally and it has a useful auto-scaling feature. For vertical scaling, you need to bring the system down and make some adjustments.

On my current project I have a team of 30 members under me, including data engineers and data science people. Our data science, engineering, and MLOps projects are expanding, so we are planning to do some vertical scaling to increase the team size to over 100 members. In our company, we are trying to certify more and more people in Databricks because it's cloud-agnostic. 

How are customer service and support?

We have never needed to contact customer support; online resources have been sufficient to solve our problems.

How was the initial setup?

The initial setup of the solution is straightforward; once you understand the UI, it is easy to implement. I would rate Databricks a four out of five for ease of setup.

One migration project took two to three months, including writing all the code and implementing end-to-end pipelines. 

We are planning to deploy the solution in stages over the next 15 months to completely implement MLOps for our organization.

What's my experience with pricing, setup cost, and licensing?

I'm not involved in the financing, but I can say that the solution seemed reasonably priced compared to the competitors. Similar products are usually in the same price range. With five being affordable and one being expensive, I would rate Databricks a four out of five. 

I find that deployed systems work out cheaper than having to operate manually, which appeals to our customers. 

What other advice do I have?

I would rate this solution an eight out of ten. 

There is an issue where clusters are automatically deleted after termination or after 100 days of non-usage. This could be more user-friendly, and they could include an enabler to pin the clusters you want to keep, instead of having to go and research why clusters got deleted after implementing the product. That documentation needs to be right in front of the user to avoid issues.

I definitely recommend this product to other users. 

Which deployment model are you using for this solution?

Hybrid Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
Lead Analytics at a manufacturing company with 10,001+ employees
Real User
Useful machine learning and easy to scale
Pros and Cons
  • "In the manufacturing industry, Databricks can be beneficial to use because of machine learning. It is useful for tasks, such as product analysis or predictive maintenance."
  • "The clusters or instances of Databricks could be much more stable. We've had issues with crashes."

What is our primary use case?

Our team is currently utilizing machine learning for various applications, and a few members are also exploring the use of Databricks for ML operations.

What is most valuable?

In the manufacturing industry, Databricks can be beneficial to use because of machine learning. It is useful for tasks, such as product analysis or predictive maintenance.

For how long have I used the solution?

I have been using Databricks for approximately six months.

What do I think about the stability of the solution?

The clusters or instances of Databricks could be much more stable. We've had issues with crashes.

What do I think about the scalability of the solution?

The scalability of Databricks is good as long as you have a data lake, and it's easy to scale.

We have approximately 50 users using this solution in my company.

How are customer service and support?

We have a different team who handles the support. I do not have contact with Databricks support.

Which solution did I use previously and why did I switch?

I have not used a similar solution to Databricks.

What was our ROI?

I have seen an ROI using Databricks.

What's my experience with pricing, setup cost, and licensing?

I rate the price of Databricks as eight out of ten.

What other advice do I have?

Having a good understanding of physical security in relation to cybersecurity in an OT (Operational Technology) environment would be beneficial, and utilizing an existing data lake prior to implementing a Databricks initiative would greatly aid in its success.

I rate Databricks an eight out of ten.

Disclosure: I am a real user, and this review is based on my own experience and opinions.