Databricks Room for Improvement

JH
Solution Architect at an insurance company with 10,001+ employees

It would be nice to have more guidance on integrations with ETLs and other data quality tools. The solution is not really a product for ETL or data quality so we use other DBT tools. 

SS
Business Architect at YASH Technologies

The solution has some scalability and integration limitations when consolidating legacy systems.

AbhishekGupta - PeerSpot reviewer
Engineering Leader at Walmart

CI/CD needs additional leverage and support. Community forums are helpful for gaining knowledge but the solution should provide specific documentation.

Streaming services such as Flink should be amplified and better supported. 

There are not many connectors to MapReduce.

Sudhendra Umarji - PeerSpot reviewer
Technical Architect at Infosys

Support for Microsoft technology and compatibility with the .NET framework are somewhat missing. There should be reliability between these two. Databricks is based on open source. If Databricks were more in sync with Microsoft technology and its programming languages, it would be better. Python has better languages, but compatibility would be a great help.

I would like to have better support for Microsoft technology and better language components.

With Azure or Cosmos DB, I can store other data links or time series data tables. That would be a great help for analytics in real time.

Nabil Fegaiere1 - PeerSpot reviewer
Chief Executive Officer at dotFIT, LLC

I would like more integration with SQL for using data in different workspaces. We use the user interface for some functionalities, while for others, we have to use SQL to create data sets and grant permissions. For example, when creating a cluster, we have to create it with some API or user interface. Creating a cluster with some properties using SQL grants the possibility of using SQL syntax. Integration with SQL will make Databricks easier to use by people who have experience with databases like Lakehouse, and they would be able to use the data lake and BI. More integration will help have one point of view for everyone using SQL syntax.

Integration with Kubernetes could also be good for minimizing the price because you can use Kubernetes instead of virtual machines. But that won't be easy.

Karan Sharma - PeerSpot reviewer
Data Analyst at Allianz

Scalability is an area with certain shortcomings. The solution's scalability needs improvement.

Avadhut Sawant - PeerSpot reviewer
Consulting Architect at a computer software company with 10,001+ employees

There are some aspects of Databricks, like generative AI, where they are positioning things like Dolly. They're a little bit late to the game, but I think there are some things that they are working on. Generative AI is catching up in areas like data governance and enterprise flavor. Hence, these are places where Databricks has to be faster, and even though they are fast, I'm not sure how they'll catch up and get adopted because there are strong players in the market.

Databricks is coming up with a few good things in terms of integration. But I have to put one point forward that covers multiple aspects, which is the ease of use for the end user while operating this particular tool. For example, a tool like ADS gives you GUI-based development, which is good for the end user who does development or maintenance. Looking at the complexities of data integration, a GUI might not be easy, but Databricks should embrace something on the graphical development front because it is currently notebook-driven. Also, in terms of accessing the data for the end user, Databricks has an SQL interface, similar to earlier tools like SQL Server Management Studio. Since most people are already comfortable with SSMS, Databricks could build integrations with known tools for data access, which would also help, apart from what they're already doing. I would like to see improvements with respect to user enablement, which is a good part of enterprise strategy. I would like to see their integration with a broader ecosystem of products. If you have to do data governance in tools like Microsoft Purview, it's manual and difficult. Now, I'm unsure if that momentum must come from Databricks or Microsoft. But it would be good if Databricks had some open interfaces to share metadata, which could be viewed in tools enabling data governance like Collibra, Purview, or Informatica. The improvement has to do with user and metadata integration for tools.

Axel Richier - PeerSpot reviewer
Tech Lead Consultant | Manager Data Engineering at Ekimetrics

I would love an integration in my desktop IDE. For now, I have to code on their webpage. They provide a web interface to do my code. However, I have my local software to do some coding for other projects, yet I cannot use it for Databricks, and I lose all my shortcuts. I lose all the benefits from my local IDE. If one day they would provide some integrations with VS Code, for example, that would be game-changing. Having Databricks in my VS Code would be the most amazing feature.

Alex Tsui - PeerSpot reviewer
Sr. Director at Omnicell

Databricks has a lack of debuggers, and it would be good to see more components. 

Another issue is that the D4 data format keeps changing on our cluster. This doesn't affect me much because I use functions to define it, but it is very frustrating for some more casual users. One day the output will be in a particular format, and then it becomes an object without us changing the cluster configuration. As a small team, we don't have the capacity to dig deeply into the issue, which has been frustrating.

RichardXu - PeerSpot reviewer
Data Science Lead at a mining and metals company with 10,001+ employees

The user experience can be improved. 

It's not easy to use, and they need a better UI.

Sahil Taneja - PeerSpot reviewer
Principal Consultant/Manager at Tenzing

There is room for improvement in the documentation of processes and how it works. I was trying to get one of the certifications, so I saw an area of improvement there. 

AO
Lead Data Scientist at a manufacturing company with 10,001+ employees

The product cannot be integrated with a popular coding IDE.

Anand Sharma - PeerSpot reviewer
Sr Data Engineer at PIMCO

The cost of this solution is high, on the expensive side.

In the future, I would like to see Data Lake support. That is something that I'm looking forward to.

Rupal Sharma - PeerSpot reviewer
Data Architect at Three Ireland (Hutchison) - Infrastructure

There is room for improvement in visualization.

Kevin McAllister - PeerSpot reviewer
Executive Manager at Hexagon AB

Databricks' performance when serving the data to an analytics tool isn't as good as Snowflake's. In the next release, Databricks should include a better data-sharing platform to facilitate data sharing between companies.

PankajKumar13 - PeerSpot reviewer
Computer Scientist at Adobe

I believe that this product could be improved by becoming more user-friendly. 

In the next release, I would like to see more flexibility in the dashboard. It has plenty of features but it can be enhanced so that it matches with other visualization tools, like Power BI and Tableau. Also, the integrations with other tools could be better.

Shiva Prasad ELLUR - PeerSpot reviewer
Vice President - Data Engineering and Analytics at a financial services firm with 10,001+ employees

This solution only supports queries in SQL and Python, which is a bit limiting. 

This is a fairly expensive solution for any service outside of the basic package, and costs can add up quite quickly if there are large scaling requirements.

RC
Sr. BigData Architect at ITC Infotech

Instead of relying on a massive instance, the solution should offer micro partition levels. They're working on it; however, they need to implement it to help the solution run more effectively.

They're currently coming out with a new feature, which is Data Lake. It will come with a new layer of data compliance.

SA
Principal at a computer software company with 5,001-10,000 employees

I have had some issues with some of the Spark clusters running on Databricks, where the Spark runtime and clusters go up and down, which is an area for improvement. Still, I am generally unaware of any super-critical issues.

PraveenS - PeerSpot reviewer
Design Engineer at Cyient Limited

The product should provide more advanced features in future releases.

Jeremy Salt - PeerSpot reviewer
Sr. Data Quality Analyst at Seek

Databricks has added some alerts and query functionality into their SQL persona, but the whole SQL persona, which is like a role, needs a lot of development. The alerts are not very flexible, and the query interface itself is not as polished as the notebook interface that is used through the data science and machine learning persona. It is clunky at present.

AB
STI Data Leader at grupo gtd

The auto models can be improved. 

We can create auto models like Microsoft Azure Machine Learning. In Azure Machine Learning, they have these features, for example, for auto models or code, or by code. They need this in Databricks. 

We need more connectors between on-premises and the cloud. 

We'd like a more visual dashboard for analysis. It needs a better UI.

Elizabeth Ho - PeerSpot reviewer
Manager, Customer Journey at a retailer with 10,001+ employees

I would like it if Databricks adopted an interface more like R Studio. When I create a data frame or a table, R Studio provides a preview of the data. In R Studio, I can see that it created a table with so many columns or rows. Then I can click on it and open a preview of that data. 

Because I work in analytics and not data engineering, I think that's probably the biggest one. There are better graphical tools, so I don't think Databricks can compete. You can do a simple graph, and it's not that great. However, I don't think they can ever stack up to Tableau, so it's probably not worth it to improve upon that. 

Sanjay Bheemasenarao - PeerSpot reviewer
Director - Data Engineering expert at Sankir Technologies

If I want to create a Databricks account, I need to have a prior cloud account such as an AWS account or an Azure account. Only then can I create a Databricks account on the cloud. However, if they can make it so that I can still try Databricks even if I don't have a cloud account on AWS and Azure, it would be great. That is, it would be nice if it were possible to create a pseudo account and be provided with a free trial. It is very essential to creating a workforce on Databricks. For example, students or corporate staff can then explore and learn Databricks.

It's a big ask to have people jump through a lot of hoops to get approval to create a Databricks cluster just to explore it, but if they can try it on their own with a free trial without an underlying cloud account it would be more convenient.

Documentation can be improved as well. There are so many versions of documents. For example, when I tried to create a DBU vault and secrets file, I had to go through multiple versions of documents. This could be improved so that the documentation is easy to use.

DevSmita Asthana - PeerSpot reviewer
Strategic Alliances & Ecosystems Manager at an outsourcing company with 501-1,000 employees

The product should incorporate more learning aspects. It needs to have a free trial version that the team can practice. 

RM
Head of Business Integration and Architecture at Jakala

The data visualization for this solution could be improved. They have started to roll out a data visualization tool inside Databricks but it is in the early stages. It's not comparable to a solution like Power BI, Luca, or Tableau.

In a future release, we would like to have a better ETL designer tool to assist in the way we move data from one place to another.

GR
Head of Referential and Big Data at a financial services firm with 5,001-10,000 employees

It would be better if it were faster. It can be slow, and it can be super fast for big data. But for small data, sometimes there is a sub-second response, which can be considered slow.

In the next release, I would like to have automatic creation of APIs because they don't have it at the moment, and I spend a lot of time building them.

JH
Head of Credit Risk and Data at Cegid Invoice and Financing

I'm not the one working with Databricks on a daily basis. I'm on the management team. However, my team tells me there are limitations with streaming events. The connectors work with a small set of platforms. For example, we can work with Kafka, but if we want to move to an event-driven solution from AWS, we cannot do it. We cannot connect to all the streaming analytics platforms, so we are limited in choosing the best one.

Also, this is an all-in-one platform, but it might be preferable if there were an a la carte model where we could select the best tool in each class for reporting, machine learning, etc. I'm not yet sure if this strategy is the best one. 

RC
Data Engineering Manager at a pharma/biotech company with 10,001+ employees

Databricks as a solution is integrated with Azure, but Google Cloud has some restrictions. I'm not sure about AWS Cloud, but it would be great if Databricks could integrate all the cloud platforms. Regarding additional features, we would like to see them mostly on the data engineering side, where we have a Spark cluster and some inbuilt ML. In addition, pre-processing steps will be useful.

RX
Machine Learning Engineer at a mining and metals company with 10,001+ employees

The interface of Databricks could be easier to use when compared to other solutions. It is not easy for non-data scientists. The user interface is important: before, we had to write code manually, and as solutions move to "no-code AI," it is critical that the interface is very good.

Tajinder_Singh - PeerSpot reviewer
Senior Software Engineer at a computer software company with 201-500 employees

Microsoft Azure has its learning environment on the Microsoft website. We can complete certifications, but the Databricks certification is more expensive than Microsoft's. It costs between $2,000 and $2,500, and the knowledge is linked. They're also charged based on whether a person doesn't want to analyze large amounts of data. Hence, we want to have the capacity for free student users so that people can learn and build their professional skills.

MahalaxmanraoChappedi - PeerSpot reviewer
Associate Principal - Data Engineering at a tech services company with 10,001+ employees

Every tool has room for improvement. Normally, a solution will claim it can do ETL and everything else, but you encounter some limitations when you actually start. Then you keep on interacting with the vendor, and they continue to upgrade it. For example, we haven't fully implemented Databricks Unity Catalog, a newly introduced feature. We need to check how it works, and then, accordingly, there can be improvements in that as well.

Databricks may not be as easy to use as other tools, but if you simplify a tool too much, it won't have the flexibility to go in-depth. Databricks is completely in the programmer's hands. I prefer flexibility rather than simplicity.

Oscar Estorach - PeerSpot reviewer
Chief Data-strategist and Director at Theworkshop.es

The solution works very well for us. I can't recall any missing features or anything the solution really lacks. It's very complete. 

It would help if there were different versions of the solution on offer.

The integration of data could be a bit better.

MA
Senior Data Engineer at TCS

The query plan is not easy with Databricks' job level. If I want to tune any of the code, it is not easily available in the blogs as well.

MILTON FERREIRA - PeerSpot reviewer
Co-founder/Senior Data Scientist at Hence

The solution could be improved by adding a feature that would make it more user-friendly for our team. The feature is simple, but it would be useful. Currently, our team is more familiar with the language R, but Databricks requires the use of Jupyter Notebooks which primarily supports Python. We have tried using RStudio, but it is not a fully integrated solution. To fully utilize Databricks, we have to use the Jupyter interface. One feature that would make it easier for our team to adopt the Jupyter interface would be the ability to select a specific variable or line of code and execute it within a cell. This feature is available in other Jupyter Notebooks outside of Databricks and in our own IDE, but it is not currently available within Databricks. If this feature were added, it would make the transition to using Databricks much smoother for our team.

The most important feature other than the Jupyter interface would be to have the RStudio interface inside Databricks. This would be perfect.

Trond Jensen - PeerSpot reviewer
Data Analyst at Eviny

I would like to see the integration between Databricks and MLflow improved. It is quite hard to train multiple models in parallel in a distributed fashion. You hit rate limits on the clients very fast.

Olubisi Akintunde - PeerSpot reviewer
Team Lead at a tech services company with 1,001-5,000 employees

I would like to see improvement with the UI. It is functional and useful, but it's a bit clunky at times. It should be more user-friendly.

In future releases, Databricks would benefit from enhanced metrics and tighter integration with Azure's diagnostics.

IshwarSukheja - PeerSpot reviewer
Sr Manager Data Scientist at Bizmetric

Writing pandas-profiling reports could be easier. 

The ability to customize our own pipelines would enhance the product, similar to what's possible using ML files in Microsoft Azure DevOps. 

JK
Lead Architect at Birlasoft India Ltd.

The connectivity with various BI tools could be improved, specifically the performance and real-time integration. There is also some improvement required in the semantic layers to manage the data match as well as the data warehouse features.

In a future release, we would like to have features to better manage all ML development activities.

Jorge Alvarado - PeerSpot reviewer
Owner at a marketing services firm with 1-10 employees

I would like it if Databricks made it easier to set up a project. The use case determines which services we are going to use. You have the application engine, and you generate a potential budget for your workloads, so you can understand what you are going to do, what you are going to use, and what you will invest in.

Because I'm deploying on the Google Cloud Platform, measuring the investment, value, and use case is extremely difficult. So I leave it and move on without the risk. It would be easier if I had one page where you can see three columns: one for the use cases of a specific architecture, a second one for the prices based on the volume of data or machine time, and the third column for the budget. That would make it easier to know if I am using the appropriate architecture for the right solution.

I have seen something like that in Microsoft Azure, but obviously Microsoft Azure costs a lot of money. Amazon has something like that, but it's very complicated to use.

AK
Financial Coordinator at Icatu

Data governance should be addressed. We have some trouble connecting all the governance solutions with Databricks. This means the integrative capabilities are problematic. 

The initial setup is difficult. 

Anirban Bhattacharya - PeerSpot reviewer
Practice Head, Data & Analytics at a tech vendor with 10,001+ employees

In my view, the fundamental approach of implementing Databricks is still very code heavy, more than you find in Azure Data Factory and other technologies like Informatica or SQL Server Integration Service. From my perspective, that could be improved. I'd also like to have the ability to facilitate predictive analytics within the solution. 

Tristan Bergh - PeerSpot reviewer
Data Scientist at a computer software company with 501-1,000 employees

The product could be improved by offering an expansion of their visualization capabilities, which currently assists in development in their notebook environment. Perhaps a few connectors that auto-deploy to a reporting server?

More parallelized Machine Learning libraries would be excellent for predictive analytics algorithms.

Joaquin Marques - PeerSpot reviewer
CEO - Founder / Principal Data Scientist / Principal AI Architect at Kanayma LLC

The area in which this product can be improved is optimization. In the next release, I would like to see more optimization features.

Sarbani Maiti - PeerSpot reviewer
Vice President at a tech services company with 51-200 employees

I'm struggling a little because I wanted to do some POC solutions. I present a lot of projects in various forums and seminars and there aren't a lot of credits and trial options with Databricks. Even if we want to explore, we're not able to and that's a challenge. The solution is quite expensive.

PD
Enterprise Data Architect at a financial services firm with 51-200 employees

The product could include some UI features to improve the ease of use, like drag and drop for a few aggregated functions. Additionally, the Databricks cluster can be improved.

HA
Cloud Administrator at a retailer with 5,001-10,000 employees

The tool should improve its integration with other products.

AM
Global Data Architecture and Data Science Director at FH

Databricks requires writing code in Python or SQL, so if you're a good programmer then you can use Databricks.

KG
Associate Manager at a consultancy with 501-1,000 employees

There should be better integration with other platforms.

MM
Lead Data Architect at a government with 1,001-5,000 employees

The product is quite ambitious. It's trying to become a centralized platform for all data ingestion, transformation, and analytics needs. It may encounter stiff competition from best-of-breed solutions powered by open-source software.

Overall, it's a good product; however, it might get challenged over time by individual best-of-breed products.

For example, in the area of data science, RStudio seems to be the industry standard at the moment. The RStudio IDE is richer, and there are more out-of-the-box functionalities like push-button publishing, etc. It's more difficult to run R within Databricks. Especially when it comes to synchronizing the R packages, it lags behind. It's not even supporting the latest version of R 1.3. I believe eventually all analytics will converge into data science. The analytics of the future will be data science, because predicting the future will be one of the most prevalent use cases. The stuff we used to do before, slicing and dicing, drilling through, trend analysis, etc., will become redundant operations after the analytics toolsets become powered by AI/ML and fully automated. Unless organisations acquire platforms that can cater for machine learning and artificial intelligence, including natural language processing, they will have a hard time surviving.

With Databricks, I would like to see more integration with and accommodation of open-source products. This could be controversial, as it could question the whole configuration and the purpose of the product. I'm pretty sure Microsoft is trying to position it in a monopoly market, as they did with Windows and MS Office, so that they could charge a premium. We are beginning to see a similar product strategy behind Databricks.

YK
Pre-sale Leader, Big Data Enterprise Solutions at Ness Technologies

I have seen better user interfaces, so that is something that can be improved.

It was quite hard to deploy.

OB
Cloud & Infra Security, Group Manager at a tech vendor with 10,001+ employees

Costs can quickly add up if you don't plan for it. 

OB
IT Manager: User Support at a financial services firm with 10,001+ employees

I think we are using a lot of people to manage this solution. I'd like to see the people using this solution sharing their knowledge. 

it_user1050483 - PeerSpot reviewer
CEO at Inosense

Improvements could include the pricing; the product is a little expensive, although I think it is comparable to other similar options. The integration features could be more interesting, more involved. For example, we use the Databricks Notebook, which is not as great as Jupyter Notebook for providing a great user experience. The look and feel are not the same and we've had complaints from some of our users. They say that it's easier and more productive for them to use Jupyter Notebook.

And then there is the integration feature for connecting to data sources, for example, Jupyter Notebook through publishes connect. The problem is that when you do that, you don't get all the Jupyter features which is a shame for us. 

For additional features, having some PyTorch or TensorFlow type features inside would definitely be great. For now, my users are developing for themselves by importing their libraries into their Notebook and then creating models based on TensorFlow or PyTorch. That requires a lot of imports, particularly library imports, something that is now available in the new version of Machine Learning services. These things are very important because the self appliance community has shifted from the traditional way of preparing models to a deeper learning system. It's now more common to have those features.

RD
Data Scientist at a retailer with 5,001-10,000 employees

Since the Databricks community is not that old, there is not a lot of information about some of the issues that we face. We have to go back to the Databricks stream to get some of the issue resolutions from there. 

As time passes and more people start putting more information out there about this technology, it will be helpful.

I think even with the features that we currently have, they're still optimizing some of the clusters and trying to parallelize to better read from other types of data. So, that's going really well in terms of one of the features that they recently came up with to include the data format for data, which was really good, and that speeds up a lot of the processes.

I would like to see more documentation in terms of how an end-user could use it, and users like me can easily try it and implement use cases.

VP
Data Scientist at an energy/utilities company with 10,001+ employees

I think the automatic categorization of variables needs to be improved. The current functionality is not always efficiently identifying the features of the data that is collected. Probably that is the only thing I can think of. Apart from that, I have not explored the product enough yet to go into more depth because there is only one asset project that I have taken on right now. Because I own this company, I have been doing more to run it than to explore this product very deeply. But when you get any form of data inside there, if it could understand what type of variables there are and what features the data has, it would help massively in taking processing to the next step. If it does not exactly identify the variables you may have to modify them a little. Apart from working with Databricks to understand its capabilities, I am also trying to learn Apache Spark right now. Some members of my team want to work with Apache Spark as a solution and at this point, we are evaluating both and we are planning to use Spark or Databricks.  

As far as what might be added, some custom algorithm samples would be useful. All of the other products of this type — Azure, AWS, SageMaker — they all have customizable algorithms. You have the capability to implement a sort of workflow from that by modifying things in the sample and changing it to fit your purposes. Probably that is something that might help in doing some small NDP (Near-Data Processing) development. It might not help in the project directly, but it will help while we work on some NDP development of our own so that we can quickly evaluate how something is going to work. Templates or other samples could make working on things easier.  

That would also help massively in getting people to understand the potential of what the product can actually do. But I also think not many people would strongly agree with this. Many people go to the first solution they can think of that they know very well already in the IT field even if they could imagine that something could be better.  

To get the value out of this technology, people will need to come to accept it. Technical people will accept Databricks more if they understand that this is something that they can use and start working on without a lot of experience. Adopting it will take time for new users who have no experience. But to feel like they can have success with a product, they have to execute something in a very short time and see how it can work. When you talk about AI — or really when you talk about anything new — people do not initially want to invest the time in discovery. These processes do take time to learn, but with templates or samples, you get to see immediately what the possibilities are and what you might get out of it. Then when they try something of their own and are able to get it working in less than a week's time, they will be encouraged to look into the product and the technology some more.  

ZH
Data engineer

The pricing of Databricks could be cheaper. The solution can also improve by providing more intelligence to the coder.

Mullai Selvan - PeerSpot reviewer
Project Manager at MAQ Software

Databricks can improve by making the documentation better.

Natalia Raffo - PeerSpot reviewer
Co - Founder & Chief Data Officer -CDO at Data360

There could be more support for automated machine learning in the database. I would like to see more ways to do analysis so that the reporting is more understandable.

RB
Business Intelligence Coordinator Latam at a construction company with 5,001-10,000 employees

Databricks does not always have clear updates. Often we find an update in the tool but we are not really sure what has changed. We would appreciate better communication from Databricks. It could be in the form of a friendly warning that talks about the updates. 

There would also be benefits if more options were available for workers, or the clusters of the two points.

AP
Chief Research Officer at a consumer goods company with 1,001-5,000 employees

I'd like to see more licensing options for the solution, such as the availability of additional pricing tiers. I understand it's not easy to achieve because it's a kind of platform-as-a-service type of solution. If you wanted to be more specific about the parts, and what you might or might not need, then you could save some money and go for a lower level. Of course, that would then mean you'd have to manage more configurations, which, as a user, would make things more complex, but it would be good to have that option. The pricing is not the cheapest, but it's understandable because it's a very high-end solution and easy to use; there's a lot of complexity masked away.

I would like to see additional monitoring tools and, in general, anything that can improve visualization of data. I know it's not the main point of Databricks and there are other tools that can be used, but anything that facilitates the integration of Databricks with visualization tools could be really useful. Increasing data scalability would also be great. 

View full review »
PG
Data Science Developer at a tech services company with 501-1,000 employees

Databricks should have more libraries for predictive analysis and machine learning.

It should have more compatible and more advanced visualization and machine learning libraries. As it is now, I have to try a custom algorithm in order for things to be compatible.

I would like to see more deep learning analytics.

View full review »
LV
Advanced Analytics Lead at a pharma/biotech company with 1,001-5,000 employees

The solution could improve by providing better automation capabilities. For example, it could work better with a DevOps approach, such as continuous integration. There is a lot of code available from places such as GitHub, but it is not tailored for Databricks. It requires a lot of effort to bring that code to a level where it can be used with Databricks capabilities.

View full review »
AD
Business Intelligence and Analytics Consultant at a tech services company with 201-500 employees

Some of the error messages that we receive are too vague, saying things like "unknown exception", and these should be improved to make it easier for developers to debug problems. As it is now, we have to go into the driver logs to identify the error messages properly. 

There is not much information about Databricks available online, such as pricing. Whenever we want to find the actual cost, we have to send an email to Databricks, so having that information available on the internet would be helpful.

I would like to see integration with Power BI or Tableau for the business users. They may use Databricks to check on things, but it will be a little bit complicated for them. The GUI interfaces for Tableau and Power BI are ones that they are used to, so the integration would help.

View full review »
BG
Data Architect at a tech services company with 201-500 employees

Sometimes we experience issues connecting our database to Databricks. There are very few direct connectors. This should be addressed and corrected in the next release.

Reading past data can also be tricky as there is no data spectrum like you would find with Snowflake and other solutions. 

View full review »
NH
Director of Data (Engineering & Science) at a tech services company with 11-50 employees

The solution can be improved by expanding its integration capabilities and providing the ability to query external vendors directly.

View full review »
SN
Head of Data & Analytics at a tech services company with 11-50 employees

There is definitely room for improvement.

This is the type of solution where you need people with technical expertise to use it. Other products are self-service and can be used by end-users. Databricks is not geared towards the end-user; rather, it is for data engineers or data scientists. I'm not sure whether Databricks is working towards changing that.

It would be nice if it were more user-friendly, so that you don't have to rely on Power BI or another visualization tool. I know that there is integration in the notebook where you can do it, but still, the relationships and semantics make it more difficult. It would be better to do it right in Databricks. If visualizations were available within the portal, I wouldn't have to log out, bring the data into Power BI, and then visualize it.

View full review »
SV
Engineer at a tech services company with 10,001+ employees

The management of the solution needs to be modernized. Managing the radius data is hard.

The solution requires model scoring. There's not a good way of knowing how the models are performing from a data science perspective. The solution needs more model scoring abilities. It doesn't necessarily need more model monitoring, but more model scoring and performance evaluation from a data science perspective.

Databricks is an analytics platform. It should offer more data science. It should have more features for data scientists to work with.

View full review »
SH
Data Science Consultant at Syniti

It would be very helpful if Databricks could integrate with platforms in addition to Azure.

Having an open-source version or having the option to get a trial version of Databricks would be very helpful.

It would be very useful for beginners if there were tutorials and examples on how to write code for PySpark, R, or Scala. Having examples would give people something to refer to and play with.

View full review »
DW
Machine Learning Engineer at a tech vendor with 51-200 employees

The solution could be improved by integrating it with data packets. Right now, the load tables provide a function, like team collaboration. Still, it's unclear whether there's a function to create different branches or more branches. Our team had used data packets before. However, I feel it's difficult to integrate the current data packets with the previous ones.

The support could be improved a bit around the database. When we stream it to Data Lake, some data cannot be loaded. It should be a priority to fix this.

View full review »
SC
Chief Data Scientist at a tech services company with 11-50 employees

Databricks doesn't offer the use of Python scripts by itself and is not connected to GitHub repositories or anything similar. This is something that is missing. If they could integrate with Git tools, it would be an advantage.

Along with having connections to different databases for Git tools, adding libraries for easy access would be a benefit. As data scientists, we connect to different databases and different sources of data, so having a library would be useful.

View full review »
AA
Technical Architect at a tech services company with 10,001+ employees

One area for improvement would be that anyone who doesn't know SQL may find the product difficult to work with. It would also be useful to have a remote support team inside Databricks, which would collect and analyze user feedback.

View full review »
PC
Vice President, Business Intelligence and Analytics at a tech services company with 10,001+ employees

Pricing is one of the things that could be improved.

Also, there could be improvement in the visual analytics space and in the machine learning functions. I haven't explored them, so I don't know which functions and features are there. If they are not there, then I think that's something they should consider including.

View full review »
HL
Business Development Specialist at a tech services company with 51-200 employees

Databricks could improve in some of its functionality.

View full review »