Principal at a computer software company with 5,001-10,000 employees
Real User
Top 20
Has advanced modeling and machine-learning features; highly scalable, with no stability issues
Pros and Cons
  • "What I like about Databricks is that it's one of the most popular platforms that give access to folks who are trying not just to do exploratory work on the data but also go ahead and build advanced modeling and machine learning on top of that."
  • "I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement."

What is our primary use case?

I've worked with Databricks primarily in the pharmaceuticals and life sciences space, which means a lot of work on patient-level data and the predictive analytics around that.

Another use case for Databricks is in the manufacturing industry. I'm a consultant, so the use cases for the product vary, but my primary use case for it is in the pharma space.

What is most valuable?

From a data science and applied analytics perspective, what I like about Databricks is that it's probably one of the most popular platforms that lets people not only do exploratory work on the data but also build advanced modeling and machine learning on top of it, and then make the results available for disseminating insights. For example, you can save all data and build out endpoints, so business analysts and users can access that data through a dashboard.

During the process, I also like that Databricks allows you to do version control to keep track of your operations on the data and maintain that lineage to create reproducible results.

The most significant Databricks advantage is that you can do everything within the platform. You don't need to exit the platform because it's a one-stop shop that can help you do all processes.

The solution is top-notch from a data science, applied ML, or advanced analytics perspective.

What needs improvement?

I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement. Still, I am generally unaware of any super-critical issues.

For how long have I used the solution?

My experience with Databricks is two and a half years.

Buyer's Guide
Databricks
May 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: May 2024.
771,212 professionals have used our research since 2012.

What do I think about the stability of the solution?

Databricks stability is an eight out of ten because I never had issues with its stability.

What do I think about the scalability of the solution?

Databricks has high scalability. Most of my work on the solution has been in the pharma space, which has massive data sets, so it's a nine out of ten, scalability-wise.

How are customer service and support?

I've never dealt with the Databricks technical support team.

How was the initial setup?

I don't have experience setting up Databricks because that's generally taken care of by the IT, data, or software engineering team before the data science team comes in and starts leveraging the platform. I have yet to experience setting up the Databricks environment personally. However, I have had experience setting up clusters, which was pretty straightforward. Still, in the overall environment of an enterprise-wide system, I have yet to gain experience setting Databricks up.

What's my experience with pricing, setup cost, and licensing?

The cost for Databricks depends on the use case. I work on it as a consultant, so I'm using the client's Databricks, so it depends on how big the client is. If it's a global organization, that cost varies versus a smaller organization that has just adopted the platform and is trying to onboard a small team of five people. It depends.

What other advice do I have?

I'm a data scientist, so I frequently use Databricks and Domino Data Science Platform.

I'm a consultant, and every client has a different version or runtime of Databricks, so the versions used vary per client.

The deployment for the solution is on the cloud, predominantly on AWS or Azure.

My clients adopted Databricks as the platform of choice, and with different use cases and more teams coming on board, the usage of Databricks will increase. I don't see that going down. It can only go up.

My advice to anyone looking into implementing Databricks is that it should be one of your top choices, especially if you're looking to focus on data processing, standard ETL operations, advanced analytics, or the ML type of work.

I'd rate the solution as nine out of ten. It checks almost all the boxes that modern applications need to have.

My organization is an active partner and implementer of Databricks, but it doesn't resell the solution.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Amazon Web Services (AWS)
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
PraveenS - PeerSpot reviewer
Design Engineer at Cyient Limited
Real User
Top 5
A scalable and cost-effective solution that has excellent translation features and can be used for data analytics
Pros and Cons
  • "It is a cost-effective solution."
  • "The product should provide more advanced features in future releases."

What is our primary use case?

We use the solution for data analytics of industrial data.

What is most valuable?

We extensively use the product’s notebooks, jobs, and triggers. We can create activities. Wherever translation is required, we use Databricks. The product fulfills our customer requirements. It is a cost-effective solution.

What needs improvement?

The product should provide more advanced features in future releases.

For how long have I used the solution?

I have been using the solution for six months.

What do I think about the stability of the solution?

Our data was not too huge. It worked well. It is easily adaptable.

What do I think about the scalability of the solution?

The tool is scalable. We can make it available for a larger audience.

How was the initial setup?

The initial setup is not that difficult. I rate the ease of setup a seven out of ten. The solution is cloud-based. We use native services like Data Factory for orchestration. Sometimes, the customers require us to use Amazon as the cloud provider instead of Azure.

What's my experience with pricing, setup cost, and licensing?

The pricing is average.

What other advice do I have?

There are many services which are coming up. They are still in the preview stage. Overall, I rate the product an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Partner
PeerSpot user
Jeremy Salt - PeerSpot reviewer
Sr. Data Quality Analyst at Seek
Real User
Top 10
Can use different technologies to do data analysis and can quickly get data
Pros and Cons
  • "Databricks makes it really easy to use a number of technologies to do data analysis. In terms of languages, we can use Scala, Python, and SQL. Databricks enables you to run very large queries, at a massive scale, within really good timeframes."
  • "Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present."

What is our primary use case?

We use it for data analysis and testing of high volume web user behavioral data.

What is most valuable?

Databricks makes it really easy to use a number of technologies to do data analysis. In terms of languages, we can use Scala, Python, and SQL. Databricks enables you to run very large queries, at a massive scale, within really good timeframes.

I'm starting to build a solution using Delta Live Tables and Delta Live pipelines, and it is proving to be exceptionally easy to use. I have also been able to quickly implement a pipeline.

What needs improvement?

Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present.

For how long have I used the solution?

I've been using Databricks for a year.

What do I think about the stability of the solution?

It is a stable and reliable solution. I'd rate stability at eight out of ten.

What do I think about the scalability of the solution?

Databricks is absolutely scalable, and I'd rate scalability at eight out of ten. We probably have between 60 and 100 users in our organization, and we hope to increase usage in the future.

How are customer service and support?

The technical support staff we have worked with have been amazing. They helped us initially with our Delta Live pipelines. I would give them a rating of ten out of ten.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I have previously worked with Apache Hadoop, and Databricks is definitely a better product. It's much easier to get data quickly in Databricks. As a result, a lot of the drudgery is taken away, whereas with Hadoop it's a bit trickier to get data together.

What's my experience with pricing, setup cost, and licensing?

We're charged based on the data throughput and the compute time.
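As a rough illustration of how a bill with those two components adds up, here is a minimal sketch. The rates, the `UsageRates` class, and the `estimate_bill` function are hypothetical placeholders for illustration only; they are not Databricks prices or any Databricks API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRates:
    """Hypothetical unit prices; real rates come from your own contract."""
    per_gb_processed: float   # currency units per GB of data throughput
    per_compute_hour: float   # currency units per hour of cluster compute


def estimate_bill(gb_processed: float, compute_hours: float, rates: UsageRates) -> float:
    """Add the throughput charge and the compute-time charge."""
    if gb_processed < 0 or compute_hours < 0:
        raise ValueError("usage figures must be non-negative")
    throughput_charge = gb_processed * rates.per_gb_processed
    compute_charge = compute_hours * rates.per_compute_hour
    return throughput_charge + compute_charge


# Placeholder numbers chosen only to make the arithmetic easy to follow.
example_rates = UsageRates(per_gb_processed=0.5, per_compute_hour=2.0)
print(estimate_bill(gb_processed=100, compute_hours=10, rates=example_rates))
```

With these placeholder figures, the throughput charge is 50 and the compute charge is 20, so either component can dominate depending on the workload.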

What other advice do I have?

I'd strongly recommend giving Databricks a try. We have found it to be a fantastic tool that has accelerated some of our solutions. We're an AI-heavy shop, and there are a lot of data scientists using the MLflow capabilities. I hear a lot of good things from that side as well. From a data analysis point of view, Databricks has been fantastic, and I would rate it at eight on a scale from one to ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
STI Data Leader at grupo gtd
Real User
Top 5 Leaderboard
Easy to use with a free community version and helpful documentation
Pros and Cons
  • "The solution offers a free community version."
  • "We'd like a more visual dashboard for analysis. It needs a better UI."

What is most valuable?

I like the simplicity and ease of use. 

You can deploy the solution to many clouds easily. 

The initial setup is straightforward.

The solution offers a free community version.

What needs improvement?

The auto models can be improved. 

We would like to create auto models as we can in Microsoft Azure Machine Learning. Azure Machine Learning has these features, for example, for building auto models with or without code. Databricks needs this.

We need more connectors between on-premises and the cloud. 

We'd like a more visual dashboard for analysis. It needs a better UI.

For how long have I used the solution?

I've used the solution for one and a half months. 

What do I think about the stability of the solution?

The solution is very stable. There are no bugs or glitches. It doesn't crash or freeze. 

What do I think about the scalability of the solution?

Scalability is no problem. At the beginning, we created a cluster, for example, and if we need more performance in the future, for example, or to accelerate the training, we can change the cluster. It's quite straightforward. 

We have five people using the solution. 

In one or two years, we'd like to promote the solution to clients and increase usage. Right now, the way it is used is limited. I know that some banks and aeronautics companies use it.

How are customer service and support?

In terms of technical support, for now, we use the community. 

Which solution did I use previously and why did I switch?

We are also aware of KNIME, Azure Machine Learning, and Anaconda. In Anaconda, we use many frameworks, for example.

We started with other platforms, like Azure Machine Learning, because AutoML makes it easy to use. However, now that we have more skills, we need other tools or platforms like Databricks. It's a good platform to deploy and develop machine learning in employees.

How was the initial setup?

The implementation is quite easy. It's not complex or difficult. The first time, I did it using a tutorial which was quite helpful. Later, I took a course. I know it quite well. 

The deployment only takes a few days. 

You only need to deploy or maintain the solution. 

What about the implementation team?

We did not need any outside assistance in terms of setting up the solution. 

What's my experience with pricing, setup cost, and licensing?

For us, this product is free. We use the community version.

I am interested in using the enterprise version, however. Whether we use it or not depends on the projects and customers we get.

What other advice do I have?

I work with a solution provider. We are a Databricks customer.

We are not partners of Databricks. We are partnered only with Microsoft Azure and Amazon AWS.

We are using the latest version of the solution. However, I do not know the exact version number. 

I still need time with the solution before providing advice to others. I need to prepare the capacity internally. So far, it's been great.

I'd rate the solution eight out of ten. 

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Elizabeth Ho - PeerSpot reviewer
Manager, Customer Journey at a retailer with 10,001+ employees
Real User
You can connect multiple data sources and share work easily
Pros and Cons
  • "I like how easy it is to share your notebook with others. You can give people permission to read or edit. I think that's a great feature. You can also pull in code from GitHub pretty easily. I didn't use it that often, but I think that's a cool feature."
  • "I would like it if Databricks adopted an interface more like RStudio. When I create a data frame or a table, RStudio provides a preview of the data. In RStudio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data."

What is our primary use case?

I use Databricks for customer marketing analytics.

What is most valuable?

Databricks lets you schedule jobs pretty easily, and you can use SQL, Spark SQL, Python, or R. It also allows you to save a table or view. 

I like that you can connect to multiple data sources. Most of our data is stored in the Azure data lake, but my previous company connected to SQL databases or even blob storage. 

They've improved on many features. I don't do data engineering, but I had an issue a couple of years ago, two companies ago. It took a long time to read and save tables, but I think the new Delta feature helped.

I like how easy it is to share your notebook with others. You can give people permission to read or edit. I think that's a great feature. You can also pull in code from GitHub pretty easily. I didn't use it that often, but I think that's a cool feature.

What needs improvement?

I would like it if Databricks adopted an interface more like RStudio. When I create a data frame or a table, RStudio provides a preview of the data. In RStudio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data.

Because I work in analytics and not data engineering, I think that's probably the biggest one. There are better graphical tools, so I don't think Databricks can compete. You can do a simple graph, and it's not that great. However, I don't think they can ever stack up to Tableau, so it's probably not worth it to improve upon that. 

For how long have I used the solution?

I've been using Databricks for two years.

What do I think about the stability of the solution?

Databricks is stable.

What do I think about the scalability of the solution?

Databricks is scalable.

How are customer service and support?

Databricks tech support has been great every time I've dealt with them. Their team is highly knowledgeable. 

How was the initial setup?

Setting up Databricks is easy. I set it up at my previous company. That was on Azure as well, but they utilized a third-party team with expertise in Databricks to ensure everything was optimized. 

What other advice do I have?

I rate Databricks 10 out of 10. I recommend taking advantage of Databricks support or a third-party provider to ensure it's set up optimally. I don't know if it's an additional service you must pay for, but we always had access to Databricks support in my last company. 

I think that's worth the money because there are so many different scenarios with distributed computing. Even people who study analytics may not understand the ins and outs of Spark. It's worth it to have a service contract for support.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
RichardXu - PeerSpot reviewer
Data Science Lead at a mining and metals company with 10,001+ employees
Real User
Top 10
Scalable and reliable, with helpful support
Pros and Cons
  • "It can send out large amounts of data."
  • "It's not easy to use, and they need a better UI."

What is our primary use case?

We use this solution to build skill and text classification models.

What is most valuable?

The scalability brings value to this solution.

It can send out large amounts of data.

What needs improvement?

The user experience can be improved. 

It's not easy to use, and they need a better UI.

For how long have I used the solution?

I have been dealing with Databricks for more than five years.

We last used this solution five months ago, and we used the most current version at that time.

What do I think about the stability of the solution?

This solution is quite stable. We have not had any issues with stability.

What do I think about the scalability of the solution?

It's a scalable solution. Very few people are using this solution in our organization. Most don't have the skill.

How are customer service and technical support?

We were using the free version which did not have a lot of support.

We didn't really need support at the time. I had one conversation with them and they were very nice. They were helpful.

Which solution did I use previously and why did I switch?

We are using Dataiku for one project and also SageMaker. We have some issues with scalability using SageMaker, which is why we may be going back to Databricks.

SageMaker is a very specific AI tool.

How was the initial setup?

The initial setup was okay.

What's my experience with pricing, setup cost, and licensing?

There are many different versions. 

We used the trial version, which was free.

What other advice do I have?

If you have a lot of data, Databricks is a good choice. 

With the migration of Microsoft and Databricks, they make it easy. It's the direction to go in.

It's a very good tool. I would rate Databricks a nine out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
DevSmita Asthana - PeerSpot reviewer
Strategic Alliances & Ecosystems Manager at an outsourcing company with 501-1,000 employees
MSP
Top 10
Helps to have a good data presence but needs to incorporate learning aspects
Pros and Cons
  • "Databricks has helped us have a good presence in data."
  • "The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with."

What is our primary use case?

The product has helped in data fabrication. 

How has it helped my organization?

Databricks has helped us have a good presence in data. 

What needs improvement?

The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice with.

For how long have I used the solution?

I have been using the product for more than six months. 

What do I think about the stability of the solution?

I rate Databricks' stability an eight out of ten.

What do I think about the scalability of the solution?

I rate the tool's scalability an eight out of ten. 

How was the initial setup?

The transition to Databricks was smooth. 

What's my experience with pricing, setup cost, and licensing?

Databricks' price is high. 

What other advice do I have?

I rate the solution a nine out of ten. 

Disclosure: My company has a business relationship with this vendor other than being a customer:
PeerSpot user
Head of Business Integration and Architecture at Jakala
Real User
Top 5
Highly scalable data platform that offers exceptional performance and valuable data types unique to this solution
Pros and Cons
  • "The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type, and it was invented and implemented by Databricks."
  • "The data visualization for this solution could be improved. They have started to roll out a data visualization tool inside Databricks but it is in the early stages. It's not comparable to a solution like Power BI, Luca, or Tableau."

What is our primary use case?

We use this solution for the Customer Data Platform (CDP). My company works in the MarTech space, and we usually implement custom CDPs.

What is most valuable?

The Delta Lake data type has been the most useful part of this solution. Delta Lake is an open-source data type, and it was invented and implemented by Databricks. It is the most important element of the solution. Databricks also offers exceptional performance and scalability.

What needs improvement?

The data visualization for this solution could be improved. They have started to roll out a data visualization tool inside Databricks but it is in the early stages. It's not comparable to a solution like Power BI, Luca, or Tableau.

In a future release, we would like to have a better ETL designer tool to assist in the way we move data from one place to another.

For how long have I used the solution?

We have been using this solution for four years. 

What do I think about the stability of the solution?

This is a stable solution. 

What do I think about the scalability of the solution?

This is a scalable solution. 

How was the initial setup?

The initial setup is very easy. It is a managed solution inside Azure so you just need to search for Databricks. There are a couple of pages to follow in the setup wizard and Databricks is up and running.

What's my experience with pricing, setup cost, and licensing?

We implement this solution on behalf of our customers who have their own Azure subscription and they pay for Databricks themselves. The pricing is more expensive if you have large volumes of data. 

Which other solutions did I evaluate?

When we first started using Databricks in 2018, there were not many comparable solutions to consider. Right now, there are many solutions to consider, including Snowflake, Azure Synapse, Redshift, and BigQuery.

Databricks continues to be our solution of choice, but Snowflake does have a better user interface and makes data pipelines easier to work with.

What other advice do I have?

I would advise others to first define a strong data strategy and then choose which data platform suits your needs. 

I would rate this solution a nine out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Databricks Report and get advice and tips from experienced pros sharing their opinions.
Updated: May 2024