Axel Richier - PeerSpot reviewer
Tech Lead Consultant | Manager Data Engineering at Ekimetrics
Real User
Top 5
Simple to set up, fast to deploy, and offers helpful technical support
Pros and Cons
  • "We can scale the product."
  • "I would love an integration in my desktop IDE. For now, I have to code on their webpage."

What is our primary use case?

We're using it to provide a unified development experience for all our data experts, including all data engineers, data scientists, and IT engineers. 

What is most valuable?

The shared experience of collaborative notebooks is probably the most useful aspect since, as an expert, it allows me to help my juniors debug their notebooks and their code live. I can do some live coding with them or help them find errors very efficiently.

It's simple to set up.

I love Databricks because we can deploy it in 15 minutes and it's ready to use. That's very nice.

The solution is stable. 

We can scale the product.

What needs improvement?

I would love an integration in my desktop IDE. For now, I have to code on their webpage, through the web interface they provide. I have local software that I use to code for other projects, yet I cannot use it for Databricks, so I lose all my shortcuts and all the benefits of my local IDE. If one day they provided an integration with VS Code, for example, that would be game-changing. Having Databricks in VS Code would be the most amazing feature.

For how long have I used the solution?

I've been using the solution for more than three years now.

Buyer's Guide
Databricks
May 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: May 2024.
770,141 professionals have used our research since 2012.

What do I think about the stability of the solution?

The product is very stable. I've been using it for three years now, and I have projects that have been running for three years without any big issues.

What do I think about the scalability of the solution?

It's very scalable. I have a project that started as a proof of concept on connected cars. We had 100 cars to track at first - just for the proof of concept. Now we have millions of cars that are being tracked. It scales very well. We have terabytes of data every day and it doesn't even flinch.

How are customer service and support?

I've had very good experiences with technical support. They often answer me within a couple of hours. Sometimes it takes a bit longer, usually a matter of days, so it's very good overall.

Even if it took a bit of time, I got my answer. They never left me without an answer or a solution.

How would you rate customer service and support?

Positive

How was the initial setup?

The solution is very simple to set up. That's why we choose it over many other tools.

Usually, we have two to five data engineers handling the maintenance and running of our solutions.

What's my experience with pricing, setup cost, and licensing?

The solution is a bit expensive. That said, it's worth it. I see it as an Apple product. For example, the iPhone is very expensive, yet you get what you pay for.

The cost depends on the size of your data. If you have lots of data, it's going to be more expensive since you pay per compute unit and you will use more of them. My smallest project is around a hundred euros, and my most expensive is just under a thousand euros a week. That is based on terabytes of data processed each month.

Which other solutions did I evaluate?

We looked into Azure Synapse as an alternative, as well as Azure ML and Vertex on GCP. Vertex AI would be the main alternative.

Some people consider Snowflake a competitor; however, we can't deploy Snowflake ourselves the way we deploy Databricks. We use that as an advantage when we sell Databricks to our clients. We say, "If you go with us, we are going to deploy Databricks in your environment in 15 minutes," and they really like it.

What other advice do I have?

We're a partner.

We use the solution on various clouds. Mostly it is Azure. However, we also use Google and AWS.

One of the big advantages is that it works across domains. I'm responsible for a data engineering team, but I work on the same platform as the data scientists, and I'm very close to my IT team, which is in charge of data access and access control and can manage access to all the data assets from a single point. It's very useful for me as a data engineer, and I'm sure my IT director would say it's very useful for him too. They managed to build a solution that very easily crosses responsibilities. It unifies all the challenges in one place and solves most of them.

I'd rate the solution nine out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Alex Tsui - PeerSpot reviewer
Sr. Director at Omnicell
Real User
A stable, scalable solution that simplifies the development process but needs more debuggers and components
Pros and Cons
  • "The simplicity of development is the most valuable feature."
  • "Databricks has a lack of debuggers, and it would be good to see more components."

What is our primary use case?

We use the solution for data engineering. 

How has it helped my organization?

The tool helps us manage large amounts of data. 

What is most valuable?

The simplicity of development is the most valuable feature. 

What needs improvement?

Databricks has a lack of debuggers, and it would be good to see more components. 

Another issue is that the D4 data format keeps changing on our cluster. This doesn't affect me much because I use functions to define it, but it is very frustrating for some more casual users. One day the output will be in a particular format, and then it becomes an object without us changing the cluster configuration. As a small team, we don't have the capacity to dig deeply into the issue, which has been frustrating.

For how long have I used the solution?

We have been using the solution for three years. 

What do I think about the stability of the solution?

The solution's stability is good. 

What do I think about the scalability of the solution?

The product is scalable. We're a small organization with 12 users, and we don't currently have any plans to increase our usage.

What was our ROI?

We see an ROI from Databricks. 

What other advice do I have?

I would rate the solution seven out of ten. 

It's a good solution, particularly for handling large amounts of data. Databricks is better as a batch processing system than as an interactive system. The performance is a little disappointing because the in-memory processing is supposed to be excellent, but it's not as competitive as some other solutions out there in this regard. Even classical databases can respond and process faster.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
RichardXu - PeerSpot reviewer
Data Science Lead at a mining and metals company with 10,001+ employees
Real User
Top 10
Scalable and reliable, with helpful support
Pros and Cons
  • "It can send out large data amounts."
  • "It's not easy to use, and they need a better UI."

What is our primary use case?

We use this solution to build skill and text classification models.

What is most valuable?

The scalability brings value to this solution.

It can send out large amounts of data.

What needs improvement?

The user experience can be improved. 

It's not easy to use, and they need a better UI.

For how long have I used the solution?

I have been dealing with Databricks for more than five years.

We last used this solution five months ago, and we were on the most current version at that time.

What do I think about the stability of the solution?

This solution is quite stable. We have not had any issues with stability.

What do I think about the scalability of the solution?

It's a scalable solution. Very few people are using this solution in our organization. Most don't have the skill.

How are customer service and technical support?

We were using the free version which did not have a lot of support.

We didn't really need support at the time. I had one conversation with them and they were very nice. They were helpful.

Which solution did I use previously and why did I switch?

We are using Dataiku for one project and also SageMaker. We have some issues with scalability using SageMaker, which is why we may be going back to Databricks.

SageMaker is a very specific AI tool.

How was the initial setup?

The initial setup was okay.

What's my experience with pricing, setup cost, and licensing?

There are many different versions. 

We used the trial version, which was free.

What other advice do I have?

If you have a lot of data, Databricks is a good choice. 

With the migration of Microsoft and Databricks, they make it easy. It's the direction to go in.

It's a very good tool. I would rate Databricks a nine out of ten. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Sahil Taneja - PeerSpot reviewer
Principal Consultant/Manager at Tenzing
Real User
Top 5
Processes tremendous data easily
Pros and Cons
  • "The processing capacity is tremendous in the database."
  • "There is room for improvement in the documentation of processes and how it works."

What is our primary use case?

In our project, we deal with geospatial data, which requires a lot of computing resources. A traditional warehouse cannot handle the amount of data we are using, and this is where Databricks comes into the picture.

What is most valuable?

The processing capacity is tremendous in the database. We use Azure for storage and have not faced any challenges. The connectors to different data sources are also valuable. Moreover, it is not a language-dependent tool, so development is faster. It is one of the best features of Databricks.

What needs improvement?

There is room for improvement in the documentation of processes and how it works. I was trying to get one of the certifications, so I saw an area of improvement there. 

For how long have I used the solution?

I have been using Databricks for eight to nine months.

What do I think about the stability of the solution?

It is a stable product for us. We didn't see any challenges. 

What do I think about the scalability of the solution?

There are around 30 to 35 users in our organization. 

How was the initial setup?

The initial setup was easy because the third-party team made the clusters for us. 

What about the implementation team?

A third-party team enabled the cluster to make the setup easy for us. 

What other advice do I have?

I would advise using it based on the use case because it easily handles big data. It is your go-to tool if you are dealing with massive data. 

Overall, I would rate the solution a nine out of ten. The tool performs well in various use cases, its documentation is available online, and it is compatible with major cloud platforms such as GCP, Azure, and AWS.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Lead Data Scientist at a manufacturing company with 10,001+ employees
Real User
Top 5
A great solution that has allowed for collaboration within our organization
Pros and Cons
  • "We have the ability to scale, collaborate and do machine learning."
  • "The product cannot be integrated with a popular coding IDE."

What is our primary use case?

Our primary use case for this solution is research for data scientists. The solution is deployed on cloud.

How has it helped my organization?

It has allowed our data engineers, data scientists, and analysts to collaborate and work on the same platform. 

What is most valuable?

We have the ability to scale, collaborate and do machine learning.

What needs improvement?

The product cannot be integrated with a popular coding IDE.

For how long have I used the solution?

We have been using this solution for approximately three years.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

The solution is scalable. There are five people using it in our organization.

How are customer service and support?

I rate my experience with customer service and support an eight out of ten.

Which solution did I use previously and why did I switch?

We previously used H2O.

How was the initial setup?

The initial setup was straightforward.

What about the implementation team?

Implementation was done in-house.

What was our ROI?

We have seen a return on investment.

What's my experience with pricing, setup cost, and licensing?

Licensing is charged on a yearly basis and costs between 25,000 and 30,000.

Which other solutions did I evaluate?

We evaluated other options but this solution was the best fit for what we required.

What other advice do I have?

I rate this solution nine out of ten. The solution is good but can be improved by integrating with a popular coding IDE.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Anand Sharma - PeerSpot reviewer
Sr Data Engineer at PIMCO
Real User
Supports several coding languages, good performance, and facilitates team collaboration
Pros and Cons
  • "The load distribution capabilities are good, and you can perform data processing tasks very quickly."
  • "In the future, I would like to see Data Lake support. That is something that I'm looking forward to."

What is our primary use case?

Our primary use case is ETL.

How has it helped my organization?

Using Databricks enables us to use the Data Mesh methodology, where every team performs their own ETL.

What is most valuable?

The most valuable feature is the versatility of the ecosystem. You can write code in SQL, Python, or Java.

The load distribution capabilities are good, and you can perform data processing tasks very quickly.

You can save and share notebooks between different teams.

The interface is easy to use.

What needs improvement?

The cost of this solution is on the expensive side.

In the future, I would like to see Data Lake support. That is something that I'm looking forward to.

For how long have I used the solution?

I worked with Databricks for approximately two years in my previous company.

What do I think about the scalability of the solution?

This is a very scalable solution. We have twenty-five data engineers that use it, and we may grow our usage.

How are customer service and support?

The technical support is okay. I would rate them a seven out of ten.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

We did not use another similar solution prior to Databricks.

How was the initial setup?

The cloud-based deployment is simple.

If you use an on-premises deployment then there is more to do.

What about the implementation team?

We deployed it with our in-house team.

There is no maintenance required.

What was our ROI?

We have seen a return on our investment with Databricks.

What's my experience with pricing, setup cost, and licensing?

Price-wise, I would rate Databricks a three out of five.

Which other solutions did I evaluate?

When we looked into Databricks, we evaluated Azure Data Factory and some of the others on the market. We found that Databricks was one of the easiest ones to use.

What other advice do I have?

My advice for anybody that is looking into Databricks is not to use the on-premises deployment. Instead, use the cloud-based setup.

In summary, this is a good product.

I would rate this solution an eight out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
Rupal Sharma - PeerSpot reviewer
Data Architect at Three Ireland (Hutchison) - Infrastructure
Real User
Top 5
Processes large data for data science and data analytics purposes
Pros and Cons
  • "Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours."
  • "There is room for improvement in visualization."

What is our primary use case?

It's mainly used for data science, data analytics, visualization, and industrial analytics.

What is most valuable?

Specifically for data science and data analytics purposes, it can handle large amounts of data in less time. I can compare it with Teradata. If a job takes five hours with Teradata databases, Databricks can complete it in around three to three and a half hours.

So that's why it's quite convenient to use for data science, for training machine learning models. By using more computing power, you can make it even faster.
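
To put the comparison above in relative terms, here is a minimal sketch that converts the quoted runtimes into a percentage reduction and a speedup factor. The function names are illustrative, and the hour figures are the reviewer's rough estimates, not benchmark results.

```python
def runtime_reduction(baseline_hours: float, new_hours: float) -> float:
    """Return the fraction of the baseline runtime that was saved."""
    return (baseline_hours - new_hours) / baseline_hours


def speedup(baseline_hours: float, new_hours: float) -> float:
    """Return how many times faster the new runtime is than the baseline."""
    return baseline_hours / new_hours


# Figures quoted in the review: about 5 hours on Teradata,
# about 3 to 3.5 hours on Databricks for the same job.
baseline = 5.0
for databricks_hours in (3.0, 3.5):
    saved = runtime_reduction(baseline, databricks_hours)
    factor = speedup(baseline, databricks_hours)
    print(f"{databricks_hours} h: {saved:.0%} less time, {factor:.2f}x speedup")
```

For the quoted figures, this works out to a runtime that is roughly 30 to 40 percent shorter, or a speedup of about 1.4x to 1.7x.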

What needs improvement?

There is room for improvement in visualization.

For how long have I used the solution?

I used it for two years. I worked with the latest update. 

What do I think about the stability of the solution?

I would rate the stability a nine out of ten. I didn't face performance drops.

What do I think about the scalability of the solution?

I would rate the scalability an eight out of ten.

How are customer service and support?

Databricks' support is great. If we need any support, they are very quick with it. And they genuinely want you to use Databricks. So, whatever we ask them, they come up with multiple solutions to the problem. That's really good.

Overall, the customer service and support are very good.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

I personally prefer using Databricks. We also considered using Snowflake, but its pricing model was different: it's priced per query.

A data science or data analytics team needs to query the stored data again and again, so that model does not suit a data-heavy organization.

What was our ROI?

From a delivery perspective, Databricks gives a good return on investment. We delivered multiple dashboards. In a small organization, everyone can use Databricks, and cost-wise, it's also good for small organizations.

Which other solutions did I evaluate?

If the company is a startup, Databricks might be suitable. If a big company needs a lot of storage, Teradata might be best for them. It depends on the situation.

What other advice do I have?

Overall, I would rate the solution an eight out of ten. I would definitely recommend this solution for small organizations.

Which deployment model are you using for this solution?

Private Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: I am a real user, and this review is based on my own experience and opinions.
Kevin McAllister - PeerSpot reviewer
Executive Manager at Hexagon AB
Real User
Top 5
Leaderboard
Excellent data transformation but data-serving performance could be better
Pros and Cons
  • "Databricks' most valuable feature is the data transformation through PySpark."
  • "Databricks' performance when serving the data to an analytics tool isn't as good as Snowflake's."

What is our primary use case?

We mainly use Databricks to ingest data and run the ELT processes that get it ready for analytics, and to serve the data to ThoughtSpot, which sends queries to Databricks to get the data.

How has it helped my organization?

We didn't have any good tooling for ELT processing prior to Databricks. We were using Microsoft HDInsight, but it was taking too long to process the data. When we moved our ELT processes over to Databricks, the processing time dropped to a fraction of what HDInsight needed, so we were able to run jobs much faster.

What is most valuable?

Databricks' most valuable feature is the data transformation through PySpark.

What needs improvement?

Databricks' performance when serving the data to an analytics tool isn't as good as Snowflake's. In the next release, Databricks should include a better data-sharing platform to facilitate data sharing between companies.

For how long have I used the solution?

I've been using Databricks for three years.

What do I think about the stability of the solution?

Databricks' stability has been great, and I would rate it eight out of ten.

What do I think about the scalability of the solution?

Databricks is very scalable because it's very easy to spin up multiple clusters, but the cost of doing that is tremendous. I'd rate its scalability nine out of ten, but you'll pay for it.

How are customer service and support?

The technical support has been really bad, but that's because we don't have a direct agreement with Databricks.

How would you rate customer service and support?

Neutral

Which solution did I use previously and why did I switch?

I previously used HDInsight from Microsoft, but it took many hours to process data, so we switched to Databricks.

How was the initial setup?

The initial setup was pretty complex and required three people.

What about the implementation team?

We used an in-house team with some consulting help.

What was our ROI?

We've had a low ROI from Databricks.

What's my experience with pricing, setup cost, and licensing?

I would rate Databricks' pricing seven out of ten.

What other advice do I have?

I would advise anyone thinking of implementing Databricks to know their use case. For example, if you're looking for a big data repository to query data and do ELT processing, I recommend looking at other platforms, like Snowflake. However, if you're going to do AI and machine learning, then Databricks is probably stronger in that area. Overall, I would rate Databricks seven out of ten.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.