Jithin James - PeerSpot reviewer
Financial Analyst 4 (Supply Chain & Financial Analytics) at Juniper Networks
MSP
Top 5
Easy to collaborate with other team members who are working on it
Pros and Cons
  • "Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy."
  • "Databricks could have more collaborative features than it has. It should have some more customization for the jobs."

What is our primary use case?

We use the solution for reliability engineering, where we apply ML and deep learning models to identify failure patterns across different geographies and products.

What is most valuable?

Databricks is hosted on the cloud. It is very easy to collaborate with other team members who are working on it. It is production-ready code, and scheduling the jobs is easy.

What needs improvement?

Databricks could have more collaborative features than it has. It should have some more customization for the jobs. Also, its dashboarding tool is average. They could add advanced features so we don't depend on other BI tools to build a dashboard. We are using Tableau to create dashboards. If Databricks had more advanced features, we could use Databricks exclusively.

For how long have I used the solution?

I have been using Databricks for one year.

Buyer's Guide
Databricks
May 2024
Learn what your peers think about Databricks. Get advice and tips from experienced pros sharing their opinions. Updated: May 2024.
771,212 professionals have used our research since 2012.

What do I think about the stability of the solution?

The product is stable. It has been giving consistent outputs without any major issues.

What do I think about the scalability of the solution?

The solution is hosted on the cloud. It supports high scalability features.

10-20 users are using this solution.

How are customer service and support?

There was a training session from Databricks where they explained how to use it. We never had to contact them because they had already given us proper training on the platform.

Which solution did I use previously and why did I switch?

I have used Alteryx before. We switched to Databricks because it can compute and turn your code into production-ready code in just a few seconds. Also, the stability is relatively high.

How was the initial setup?

The initial setup is easy.

What about the implementation team?

We have a dedicated team for the deployment.

What other advice do I have?

Delta Lake is free to use. We mostly work on the data that we get from Snowflake, and the model outputs from Databricks are written back to Delta Lake. It is easy for us to collaborate using Delta Lake, and the computation speed is also quite high for Delta Lake.

The learning curve for Databricks is not very steep. It's pretty easy, and you will find a lot of materials online. So, if you are comfortable coding in Python, it's very straightforward. There is nothing to worry about when using Databricks.

Overall, I rate the solution a ten out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Head of Referential and Big Data at a financial services firm with 5,001-10,000 employees
Real User
A highly scalable unified data platform that provides data access to any type of user
Pros and Cons
  • "I like cloud scalability and data access for any type of user."
  • "It would be better if it were faster. It is super fast for big data, but it can be slow for small data, where even a sub-second response can be considered slow. In the next release, I would like to have automatic creation of APIs because they don't have it at the moment, and I spend a lot of time building them."

What is our primary use case?

We use Databricks to define tool data and have many use cases to analyze and distribute the data.

How has it helped my organization?

Data is open to everyone; they can access it through many channels, including notebooks or SQL. That on its own democratizes the data.

What is most valuable?

I like cloud scalability and data access for any type of user.

What needs improvement?

It would be better if it were faster. It is super fast for big data, but it can be slow for small data, where even a sub-second response can be considered slow.

In the next release, I would like to have automatic creation of APIs because they don't have it at the moment, and I spend a lot of time building them.

For how long have I used the solution?

I have been using Databricks for roughly one and a half years.

What do I think about the stability of the solution?

Stability is excellent.

What do I think about the scalability of the solution?

Databricks is scalable. You can use the power of the cloud to scale your cluster size, either CPU or memory. The data doesn't work like a standard database: it is based on files, and you don't copy the data. It's super scalable. It's only the computing that you have to scale with the data.

We probably have 40 users with roles like developers, business analysts, and data scientists. We have big plans to increase the usage and have more departments using it.

How are customer service and support?

Technical support has helped us.

On a scale from one to ten, I would give technical support a five.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We used Cloudera before switching to Databricks.

How was the initial setup?

The initial setup was fairly okay. It takes about two minutes to deploy this solution. It's all code, so we click a button, and then it's done.

On a scale from one to five, I would give the initial setup a four.

What about the implementation team?

We set up and deployed this solution.

What was our ROI?

On a scale from one to five, I would give our ROI a three.

What's my experience with pricing, setup cost, and licensing?

We only pay for the Azure compute behind the solution. If you want to compute, you have to have a database layer and Azure below.

On a scale from one to five, I would give their pricing a two.

Which other solutions did I evaluate?

We looked at other options such as Snowflake and Cloudera on the cloud.

What other advice do I have?

I would tell potential users that they need proper cloud engineers and a cloud infrastructure team to use this solution.

On a scale from one to ten, I would give Databricks a nine.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Head of Credit Risk and Data at Cegid Invoice and Financing
Vendor
It's a reasonably priced all-in-one platform that enables us to build a lakehouse framework
Pros and Cons
  • "Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform."
  • "I'm not the one working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one."

What is our primary use case?

We primarily use Databricks for reporting and machine learning.

What is most valuable?

Databricks gives us the ability to build a lakehouse framework and do everything implicit to this type of database structure. We also like the ability to stream events. Databricks covers a broad spectrum, from reporting and machine learning to streaming events. It's important for us to have all these features in one platform.

What needs improvement?

I'm not the one working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one.

Also, this is an all-in-one platform, but it might be preferable if there were an a la carte model where we could select the best tool in each class for reporting, machine learning, etc. I'm not yet sure if this strategy is the best one. 

For how long have I used the solution?

We've been using Databricks since the start of the year.

What do I think about the stability of the solution?

Databricks is quite stable. We haven't had any issues with stability. It's always working perfectly with no downtime.

What do I think about the scalability of the solution?

Databricks is based on Spark, which is written in Scala. These languages aren't easy to handle, and it's challenging to find people who know them well. At the same time, a couple of other vendors that work on top of Databricks are low-code platforms. We have to work around Databricks' lack of scalability by using low-code platforms that work on top of Databricks to give us scalability.

How are customer service and support?

I'll give Databricks support 10 out of 10. They are always prompt even though we didn't buy a support package. They have done an excellent job.

How would you rate customer service and support?

Positive

How was the initial setup?

Setting up Databricks is a bit complex, and the initial deployment took a few days—closer to a week. Of course, not everyone is working full-time on this. There are intervals when people are doing other stuff. 

What was our ROI?

It's too soon to tell what kind of return we're getting because we just started using it, and we're still migrating.

What's my experience with pricing, setup cost, and licensing?

The cost of Databricks is in the lower range compared to other solutions. That was one of the main reasons we chose Databricks over other vendors and platforms.  

We pay as we go, so there isn't a fixed price. It's charged by the unit. I don't have any details about how they measure this, but it should be a mix between processing and quantity of data handled. We ran a simulation based on our use cases, which gave us an estimate. We've been monitoring this, and the costs have met our expectations.
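To illustrate the kind of estimate described above, here is a minimal sketch in Python. It assumes cost is simply billing units consumed per hour times hours run times a price per unit. The workload names, unit rates, and price are hypothetical placeholders, not actual Databricks pricing, and the real metering may differ.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """One recurring use case (all figures are hypothetical)."""
    name: str
    units_per_hour: float   # assumed billing units consumed per hour of compute
    hours_per_month: float  # assumed hours the workload runs each month


def estimate_monthly_cost(workloads, price_per_unit):
    """Sum units * hours * price across workloads for a rough monthly figure."""
    return sum(
        w.units_per_hour * w.hours_per_month * price_per_unit
        for w in workloads
    )


# Hypothetical use cases and a made-up price per billing unit.
workloads = [
    Workload("nightly reporting", units_per_hour=4.0, hours_per_month=30.0),
    Workload("model training", units_per_hour=8.0, hours_per_month=10.0),
]
estimate = estimate_monthly_cost(workloads, price_per_unit=0.5)
print(f"Estimated monthly cost: {estimate:.2f}")
```

Comparing a figure like this against the actual monthly bill is one simple way to monitor whether costs are meeting expectations.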

What other advice do I have?

I give Databricks nine out of 10. The solution has met all our expectations. I'd recommend it to a friend. It's a reasonably priced all-in-one solution that gives us data lake and lakehouse capabilities. Those were the primary reasons we chose Databricks.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Data Engineering Manager at a pharma/biotech company with 10,001+ employees
Real User
A great and easy-to-use platform for data engineers and data scientists who rely on a large dataset to do advanced analytics reporting
Pros and Cons
  • "The most valuable feature is the Spark cluster, which is very fast for heavy loads, big data processing, and PySpark."
  • "It would be great if Databricks could integrate all the cloud platforms."

What is our primary use case?

We use Databricks for data science work in projects that involve creating data pipelines, pre-processing, data wrangling, big data cluster management, and machine learning and deep learning tasks.

How has it helped my organization?

Databricks works very well with the Azure platform and with Dataiku, an enterprise AI tool. Databricks is a new connection to pull the data or connect to the Spark cluster. It is helpful for us to instance it or distribute the load through the Spark cluster, and it is very user-friendly.

What is most valuable?

The most valuable feature is the Spark cluster, which is very fast for heavy loads, big data processing, and PySpark.

What needs improvement?

Databricks as a solution is integrated with Azure, but Google Cloud has some restrictions. I'm not sure about AWS Cloud, but it would be great if Databricks could integrate all the cloud platforms. Regarding additional features, we would like to see them mostly on the data engineering side, where we have a Spark cluster and some inbuilt ML. In addition, pre-processing steps will be useful.

For how long have I used the solution?

We have been using this solution for two years and are using the latest update.

What do I think about the stability of the solution?

It is a stable solution as long as the Microsoft Azure Platform is stable too.

What do I think about the scalability of the solution?

It is a scalable solution, both vertically and horizontally, which is good. My organization is big, and we have a lot of users. In my department, we have about 15 people using Databricks.

How are customer service and support?

We have not escalated any issues to technical support, but we initially struggled with configuration and the settings of Hive metastore, but we resolved it. I rate the technical support a nine out of ten.

How would you rate customer service and support?

Positive

Which solution did I use previously and why did I switch?

We were using EMR (Elastic MapReduce) from AWS before using Databricks. We switched to Databricks because the whole platform changed from AWS to the Azure platform, and Databricks comes as a package.

How was the initial setup?

The initial setup was easy to complete and not complex. It may initially be challenging for a new user, but it improves over time. The CI/CD pipeline works well with the Microsoft Azure platform because continuous integration, development, and deployment come with the Git integration, which makes CI/CD easier with Databricks. The deployment should be improved from the perspective of AutoML functionality, as it doesn't have intensive automated machine learning capability.

We don't use Databricks directly because we work on a data science project. It requires an auto ML and inbuilt machine learning capability. We found capabilities like the large language model using NLP and other deep learning models that are not that intensive. It is meant for data engineering purposes rather than data science purposes. It'll be great if Databricks could be intensive for data science.

We used a third-party platform, Dataiku, for the deployment, where we connected to Databricks and completed the MLOps. We required about three people for deployment, and it is easy to maintain the solution.

What was our ROI?

We have seen an ROI but cannot differentiate because it also comes with the Azure platform.

What's my experience with pricing, setup cost, and licensing?

I do not have details about the pricing.

What other advice do I have?

I rate this solution a nine out of ten. Regarding advice, Databricks is a very good platform, popular and easy to use daily for data engineers and data scientists who rely on a large dataset to do advanced analytics reporting. It's a very good tool.

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Machine Learning Engineer at a mining and metals company with 10,001+ employees
Real User
Highly scalable, stable and good technical support
Pros and Cons
  • "Databricks is a scalable solution. It is the largest advantage of the solution."
  • "The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "No code AI," it is critical that the interface is very good."

What is our primary use case?

We were using Databricks to build an AI solution. We were only evaluating it, and approximately three people tried it out. Later, we chose another solution and did not fully deploy Databricks.

How has it helped my organization?

Before I used Databricks it took me a long time to do some functions and now with Databricks I can do them much quicker. It scales very well.

What needs improvement?

The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "No code AI," it is critical that the interface is very good.

For how long have I used the solution?

I have used Databricks within the last 12 months.

What do I think about the stability of the solution?

The solution is stable.

What do I think about the scalability of the solution?

Databricks is a scalable solution. It is the largest advantage of the solution.

How are customer service and support?

We have been in contact with the technical support of Databricks, and they were good.

Which solution did I use previously and why did I switch?

We have used a lot of different solutions, such as Watson and Dataiku.

How was the initial setup?

The initial setup is easy. However, I do not know much about the implementation because the company does it.

What about the implementation team?

We did the implementation of the solution.

What other advice do I have?

If companies want scalability, they should choose Databricks.

I rate Databricks a nine out of ten.

Which deployment model are you using for this solution?

Public Cloud
Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Tajinder_Singh - PeerSpot reviewer
Senior Software Engineer at a computer software company with 201-500 employees
Real User
Top 5 Leaderboard
Valuable data analysis and engineering features with an easy setup
Pros and Cons
  • "The setup is quite easy."
  • "Can be improved by including drag-and-drop features."

What is our primary use case?

Our primary use case for the solution is data analysis. It provides a Spark cluster environment with a driver to analyze huge amounts of data, gigabytes of it, and we can create notebooks in Databricks. We can write SQL commands, Python code, Scala, or Spark with Python. With Databricks, we get a cluster hosted in the public cloud, and we adjust it based on how much we use it.

What is most valuable?

The most valuable features are data engineering and data science because we can create Notebooks on them. We can use any Python library to build data science models, or we can use libraries like Seaborn or Matplotlib to create charts based on data for data analysis. It is a really valuable capability.

What needs improvement?

Microsoft Azure has its learning environment on the Microsoft website. We can complete certifications, but the Databricks certification is more expensive than Microsoft's. It costs between $2,000 and $2,500, and the knowledge is linked. Users are also charged even if they don't want to analyze large amounts of data. Hence, we would like free capacity for student users so that people can learn and build their professional skills.

For how long have I used the solution?

We have been using the solution for approximately one year.

What do I think about the stability of the solution?

The solution is stable. Microsoft offers a public service, and we can get it from the Databricks website. Additionally, many companies use it to analyze their data or create a Spark cluster to run Python or SQL scripts based on their data. I rate the stability a nine out of ten.

How was the initial setup?

The setup is quite easy, and Databricks has also partnered with Microsoft, so we get this service on Microsoft Azure.

What was our ROI?

We have seen a return on investment.

What's my experience with pricing, setup cost, and licensing?

We have a pay-as-you-go subscription and pay for it based on our usage.

Which other solutions did I evaluate?

We chose this solution because my company uses Microsoft Azure for a project, and my role as a data engineer primarily focuses on data-related services. For storing data, we use Data Lake; similarly, for the data processing engine, we use Spark, which Databricks provides.

What other advice do I have?

I rate the solution an eight out of ten. The solution is good but can be improved by including drag-and-drop features because it can be helpful for users who are unfamiliar with coding. I advise new users to have prior experience with Python or SQL before utilizing this solution if they use it for data science or model building. 

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
MahalaxmanraoChappedi - PeerSpot reviewer
Associate Principal - Data Engineering at a tech services company with 10,001+ employees
Real User
Top 20
It's a unified platform that lets you do streaming and batch processing in the same place
Pros and Cons
  • "I like that Databricks is a unified platform that lets you do streaming and batch processing in the same place. You can do analytics, too. They have added something called Databricks SQL Analytics, allowing users to connect to the data lake to perform analytics. Databricks also enables you to share your data securely. It integrates with your reporting system as well."
  • "Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity."

What is our primary use case?

We build data solutions for the banking industry. Previously, we worked with AWS, but now we are on Azure. My role is to assess the current legacy applications and provide cloud alternatives based on the customers' requirements and expectations.

Databricks is a unified platform that provides features like streaming and batch processing. All the data scientists, analysts, and engineers can collaborate on a single platform. It has all the features you need, so you don't need to go for any other tool.

What is most valuable?

I like that Databricks is a unified platform that lets you do streaming and batch processing in the same place. You can do analytics, too. They have added something called Databricks SQL Analytics, allowing users to connect to the data lake to perform analytics. Databricks also enables you to share your data securely. It integrates with your reporting system as well.

The Unity Catalog provides you with data lineage and metadata capabilities. These are some of the unique features that fulfill all the requirements of the banking domain.

What needs improvement?

Every tool has room for improvement. Normally, a solution will claim it can do ETL and everything else, but you encounter some limitations when you actually start. Then you keep on interacting with the vendor, and they continue to upgrade it. For example, we haven't fully implemented Databricks Unity Catalog, a newly introduced feature. We need to check how it works, and there may be room for improvement there as well.

Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity.

For how long have I used the solution?

I have been using Databricks for a year.

What do I think about the scalability of the solution?

Databricks relies on scalability and performance. Every cloud vendor prioritizes scalability, high availability, performance, and security. These are the most important reasons to move to the cloud.

How was the initial setup?

Deploying Databricks on the cloud is straightforward. It's not like an on-premise solution, where you must create a cluster and all those other prerequisites for big data. 

I don't think it's challenging to maintain, but you need an expert programmer because Databricks isn't GUI-based. With GUI-based tools, building ETLs is drag-and-drop. Databricks entirely relies on coding, so you need skilled programmers to build your code, ETLs, etc.

What's my experience with pricing, setup cost, and licensing?

The price of Databricks is based on the computing volume. You also need to pay storage costs for the cloud where you're hosting Databricks, whether it is AWS, Azure, or Google. 

What other advice do I have?

I rate Databricks nine out of 10. Databricks is one of the best tools on the market.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Microsoft Azure
Disclosure: My company has a business relationship with this vendor other than being a customer: Implementer
PeerSpot user
Oscar Estorach - PeerSpot reviewer
Chief Data-strategist and Director at Theworkshop.es
Real User
Top 10
Flexible, stable, and reasonably priced
Pros and Cons
  • "The solution is very easy to use."
  • "The integration of data could be a bit better."

What is our primary use case?

We primarily use the solution for retail and manufacturing companies. It allows us to build data lakes.

What is most valuable?

The solution is very easy to use. 

The storage on offer is very good. 

The solution is perfect for dealing with big data.

The artificial intelligence on offer is very good.

The product is quite flexible.

We have found the solution to be stable. 

The cloud services on offer are very reasonably priced.

Technical support is very good. They also have very good documentation on offer to help you navigate the product and learn about its offerings. 

What needs improvement?

The solution works very well for us. I can't recall any missing features or anything the solution really lacks. It's very complete. 

It would help if there were different versions of the solution on offer.

The integration of data could be a bit better.

For how long have I used the solution?

I've worked for about 20 to 25 years in business intelligence analytics and have worked with Databricks for about four years at this point. 

What do I think about the stability of the solution?

The stability of the solution is very good. It doesn't crash or freeze. There are no bugs or glitches. Its performance is very good.

What do I think about the scalability of the solution?

The scalability is quite good. A company that needs to expand it can do so with ease.

We only have four people on the solution at this time. The front-end users never use the product directly. The companies aren't that big here. If the economy improves, we'll likely have more of a need for the product.

How are customer service and technical support?

I've dealt with technical support in the past and have found them to be very good. They are helpful and responsive. We are satisfied with their level of service.

Which solution did I use previously and why did I switch?

I work with Databricks, Cloudera and Snowflake.

How was the initial setup?

The solution is on the cloud and therefore there isn't really an installation process that you need to go through. You only really need to configure the clusters. 

Within the clusters, you configure according to how many platforms you need, or if you want to, you can build a cluster for artificial intelligence. You just configure it as required. 

What's my experience with pricing, setup cost, and licensing?

The pricing of the product is very reasonable. The fact that it is on the cloud makes it a less expensive option. Other solutions that are on-premises are quite expensive.

What other advice do I have?

We are customers and end-users. 

Databricks is on the cloud and therefore, we're always on the latest version of the solution. It's constantly updated for us so that we have access to the latest updates and upgrades.

I'd rate the solution at a nine out of ten. The capability of the product is quite good and we are very satisfied with it overall. 

I'd recommend the solution to other companies and organizations.

Which deployment model are you using for this solution?

Public Cloud

If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?

Disclosure: I am a real user, and this review is based on my own experience and opinions.
PeerSpot user
Buyer's Guide
Download our free Databricks Report and get advice and tips from experienced pros sharing their opinions.
Updated: May 2024