Anonymous User, Lead Data Architect at a government
Hemant Addal, Senior Vice President at a financial services firm
We asked business professionals to review the solutions they use. Here are some excerpts of what they said:
"We are completely satisfied with the ease of connecting to different sources of data or pocket files in the search"
"Automation with Databricks is very easy when using the API."
"Databricks is based on a Spark cluster and it is fast. Performance-wise, it is great."
"The most valuable aspect of the solution is its notebook. It's quite convenient to use, both terms of the research and the development and also the final deployment, I can just declare the spark jobs by the load tables. It's quite convenient."
"I work in the data science field and I found Databricks to be very useful."
"The time travel feature is the solution's most valuable aspect."
"I haven't heard about any major stability issues. At this time I feel like it's stable."
"Imageflow is a visual tool that helps make it easier for business people to understand complex workflows."
"This open-source product can compete with category leaders in ELT software."
"The visual workflow tools for custom and complex tasks always beat raw coding languages with the agility, speed to deliver, and ease of subsequent changes."
"This solution is easy to use and especially good at data preparation and wrapping."
"It's a coding-less opportunity to use AI. This is the major value for me."
"This solution is easy to use and it can be used to create any kind of model."
"All of the features related to the ETL are fantastic. That includes the connectors to other programs, databases, and the meta node function."
"What I like the most is that it works almost out of the box with Random Forest and other Forest nodes."
"It is very fast to develop solutions."
"The integration features could be more interesting, more involved."
"Some of the error messages that we receive are too vague, saying things like "unknown exception", and these should be improved to make it easier for developers to debug problems."
"It should have more compatible and more advanced visualization and machine learning libraries."
"The solution could be improved by integrating it with data packets. Right now, the load tables provide a function, like team collaboration. Still, it's unclear as to if there's a function to create different branches and/or more branches. Our team had used data packets before, however, I feel it's difficult to integrate the current with the previous data packets."
"It would be very helpful if Databricks could integrate with platforms in addition to Azure."
"Databricks is an analytics platform. It should offer more data science. It should have more features for data scientists to work with."
"Pricing is one of the things that could be improved."
"The product needs samples and templates to help invite users to see results and understand what the product can do."
"The ability to handle large amounts of data and performance in processing need to be improved."
"I would like to see better web scraping because every time I tried, it was not up to par, although you can use Python script."
"It needs more examples, use cases, and MOOC to learn, especially with respect to the algorithms and how to practically create a flow from end-to-end."
"There should be better documentation and the steps should be easier."
"KNIME needs to provide more documentation and training materials, including webinars or online seminars."
"The predefined workflows could use a bit of improvement."
"The documentation is lacking and it could be better."
"There are a lot of tools in the product and it would help if they were grouped into classes where you can select a function, rather than a specific tool."
"Whenever we want to find the actual costing, we have to send an email to Databricks, so having the information available on the internet would be helpful."
"I do not exactly know the costs, but one of our clients pays between $100 USD and $200 USD monthly."
"Licensing on site I would counsel against, as on-site hardware issues tend to really delay and slow down delivery."
"We find Databricks to be very expensive, although this improved when we found out how to shut it down at night."
"The pricing depends on the usage itself."
"I am based in South Africa, where it is expensive adapting to the cloud, and then there is the price for the tool itself."
"The price is okay. It's competitive."
"Databricks uses a price-per-use model, where you can use as much compute as you need."
"KNIME is free as a stand-alone desktop-based platform but if you want to get a KNIME server then you can find the cost on their website."
"The price of KNIME is quite reasonable and the designer tool can be used free of charge."
"It's an open-source solution."
"The price for Knime is okay."
"At this time, I am using the free version of Knime."
"This is an open-source solution that is free to use."
"There is a Community Edition and paid versions available."
Databricks creates a Unified Analytics Platform that accelerates innovation by unifying data science, engineering, and business. It uses Apache Spark to help clients with cloud-based big data processing. It puts Spark on “autopilot” to significantly reduce operational complexity and management cost. The Databricks I/O module (DBIO) improves the read and write performance of Apache Spark in the cloud. Databricks’ collaborative workspace is intended to increase productivity.
Databricks is ranked 2nd in Data Science Platforms with 22 reviews, while KNIME is ranked 3rd in Data Science Platforms with 13 reviews. Databricks is rated 8.0, while KNIME is rated 8.4. The top reviewer of Databricks writes "Has a good feature set but it needs samples and templates to help invite users to see results". On the other hand, the top reviewer of KNIME writes "Has good machine learning and big data connectivity but the scheduler needs improvement". Databricks is most compared with Microsoft Azure Machine Learning Studio, Amazon SageMaker, Azure Stream Analytics, Alteryx and Dremio, whereas KNIME is most compared with Alteryx, RapidMiner, Dataiku Data Science Studio, Weka and Microsoft Azure Machine Learning Studio. See our Databricks vs. KNIME report.
See our list of best Data Science Platforms vendors.
We monitor all Data Science Platforms reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.