RameshCh, Sr. BigData Architect at ITC Infotech
EzzAbdelfattah, Associate Professor of Statistics at KAU
We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
"We are completely satisfied with the ease of connecting to different sources of data or pocket files in the search"
"Automation with Databricks is very easy when using the API."
"Databricks is based on a Spark cluster and it is fast. Performance-wise, it is great."
"The most valuable aspect of the solution is its notebook. It's quite convenient to use, both in terms of the research and the development and also the final deployment. I can just declare the Spark jobs by the load tables. It's quite convenient."
"I work in the data science field and I found Databricks to be very useful."
"The time travel feature is the solution's most valuable aspect."
"I haven't heard about any major stability issues. At this time I feel like it's stable."
"Imageflow is a visual tool that helps make it easier for business people to understand complex workflows."
"Most of the product features are good but I particularly like the linear regression analysis."
"Some of the most valuable features that we are using with some business models are machine learning algorithms, statistical models given to us by the business, and getting data from the database or text files."
"The best part is that they have an algorithm handbook, so you can open it up and understand how it works, and if it is useful, this is very important."
"You can find a complete algorithm in the solution and use it. You don't need to write your own algorithms for predictive analytics. That's the most valuable feature and the main one we use."
"They have many existing algorithms that we can use and use effectively to analyze and understand how to put our data to work to improve what we do."
"It has the ability to easily change any variable in our research."
"The most valuable feature is the user interface because you don't need to write code."
"In terms of the features I've found most valuable, I'd say the duration, the correlation, and of course the nonparametric statistics. I use it for reliability and survival analysis, time series, regression models in different solutions, and different types of solutions."
"The integration features could be more interesting, more involved."
"Some of the error messages that we receive are too vague, saying things like 'unknown exception', and these should be improved to make it easier for developers to debug problems."
"It should have more compatible and more advanced visualization and machine learning libraries."
"The solution could be improved by integrating it with data packets. Right now, the load tables provide a function, like team collaboration. Still, it's unclear as to if there's a function to create different branches and/or more branches. Our team had used data packets before, however, I feel it's difficult to integrate the current with the previous data packets."
"It would be very helpful if Databricks could integrate with platforms in addition to Azure."
"Databricks is an analytics platform. It should offer more data science. It should have more features for data scientists to work with."
"Pricing is one of the things that could be improved."
"The product needs samples and templates to help invite users to see results and understand what the product can do."
"I think the visualization and charting should be changed and made easier and more effective."
"Technical support needs some improvement, as they do not respond as quickly as we would like."
"The statistics should be more self-explanatory with detailed automated reports."
"Each algorithm could be more adaptable to some industry-specific areas, or, in some cases, adapted for maintenance."
"The product should provide more ways to import data and export results that are user-friendly for high-level executives."
"The design of the experience can be improved."
"This solution is not suitable for use with Big Data."
"Most of the package will give you the fixed value, or the p-value, without an explanation as to whether it is significant or not. Some beginners might need not just the results, but also some explanation for them."
"Whenever we want to find the actual costing, we have to send an email to Databricks, so having the information available on the internet would be helpful."
"I do not exactly know the costs, but one of our clients pays between $100 USD and $200 USD monthly."
"Licensing on site I would counsel against, as on-site hardware issues tend to really delay and slow down delivery."
"We find Databricks to be very expensive, although this improved when we found out how to shut it down at night."
"The pricing depends on the usage itself."
"I am based in South Africa, where it is expensive adapting to the cloud, and then there is the price for the tool itself."
"The price is okay. It's competitive."
"Databricks uses a price-per-use model, where you can use as much compute as you need."
"We think that IBM SPSS is expensive for this function."
"The price of this solution is a little bit high, which was a problem for my company."
"The pricing of the modeler is high and can reduce the utility of the product for those who cannot afford to adopt it."
Databricks creates a Unified Analytics Platform that accelerates innovation by unifying data science, engineering, and business. It utilizes Apache Spark to help clients with cloud-based big data processing. It puts Spark on “autopilot” to significantly reduce operational complexity and management cost. The Databricks I/O module (DBIO) improves the read and write performance of Apache Spark in the cloud. An increase in productivity is ensured through Databricks’ collaborative workplace.
Databricks is ranked 2nd in Data Science Platforms with 23 reviews, while IBM SPSS Statistics is ranked 5th in Data Science Platforms with 16 reviews. Both Databricks and IBM SPSS Statistics are rated 8.0. The top reviewer of Databricks writes "Has a good feature set but it needs samples and templates to help invite users to see results". On the other hand, the top reviewer of IBM SPSS Statistics writes "Offers good Bayesian and descriptive statistics". Databricks is most compared with Microsoft Azure Machine Learning Studio, Amazon SageMaker, Azure Stream Analytics, Alteryx and Dataiku Data Science Studio, whereas IBM SPSS Statistics is most compared with IBM SPSS Modeler, TIBCO Statistica, Weka, MathWorks Matlab and TIBCO Data Science.
We monitor all Data Science Platforms reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.