We performed a comparison between RapidMiner and Talend Data Quality based on real PeerSpot user reviews.
"The solution is stable."
"The most valuable feature of RapidMiner is that it can read a large number of file formats including CSV, Excel, and in particular, SPSS."
"The best part of RapidMiner is efficiency."
"Using the GUI, I can build models and algorithms by dragging and dropping nodes."
"RapidMiner for Windows is an excellent graphical tool for data science."
"What I like about RapidMiner is its all-in-one nature, which allows me to prepare, extract, transform, and load data within the same tool."
"The most valuable feature is what the product sets out to do, which is extracting information and data."
"Scalability is not really a concern with RapidMiner. It scales very well and can be used in global implementations."
"The file fetch process is impeccable."
"I like the idea of storing the results of Data Quality jobs in a DB and having the ability to run reports in the DB to show a dashboard of quality metrics."
"It is saving a lot of time. Today, we can mask around a hundred million records in 10 minutes. Masking is one of the key pieces that is used heavily by the business and IT folks. Normally in the software development life cycle, before you project anything into the production environment, you have to test it in the test environment to make sure that when the data goes into production, it works, but these are all production files. For example, we acquired a new company or a new state for which we're going to do the entire back office, which is related to claims processing, payments, and member enrollment every year. If you get the production data and process it again, it becomes a compliance issue. Therefore, for any migrations that are happening, we have developed a new capability called pattern masking. This feature looks at those files, masks that information, and processes it through the system. With this, there is no PHI and PII element, and there is data integrity across different systems. It has seamless integration with different databases. It has components using which you can easily integrate with different databases on the cloud or on-premise. It is a drag and drop kind of tool. Instead of writing a lot of Java code or SQL queries, you can just drag and drop things. It is all very pictorial. It easily tells you where the job is failing. So, you can just go quickly and figure out why it is happening and then fix it."
"With its frequency function, we were able to pick a line of business to be addressed first in one of our conversion projects."
"It lowers the amount of time in development from weeks to a day."
"It has definitely streamlined certain processes."
"It offers advanced features that allow you to create custom patterns and use regular expressions to identify data issues."
"Provides a flexible development environment to the coder."
"The server product has been getting updated and continues to be better each release. When I started using RapidMiner, it was solid but not easy to set up and upgrade."
"RapidMiner would be improved with the inclusion of more machine learning algorithms for generating time-series forecasting models."
"If they could include video tutorials, people would find that quite helpful."
"A great product but confusing in some way with regard to the user interface and integration with other tools."
"It would be helpful to have some tutorials on communicating with Python."
"I would like to see all users have access to all of the deep learning models, and that they can be used easily."
"I would appreciate improvements in automation and customization options to further streamline processes."
"One challenge I encountered while implementing RapidMiner was the lack of documentation. Since there aren't as many users, finding resources to learn the tool was initially difficult. To overcome this hurdle, I believe RapidMiner could improve by providing more tutorials tailored for new users."
"If we encounter issues, it’s most likely when using the Talend Open Studio. The studio can be slow, get stuck, or crash. But again, it can be caused by the resources of your machine or your connection with the repository. If we encounter issues with the Studio we restart the Studio. In emergencies, we create and use a new workspace."
"In redundancy analysis, the query is failing to bring non-matched records. This query is an internal script. There is no way (that I know of) to fix this syntax error for future runs."
"It would be more helpful if it offered dynamic dashboards that could be directly used by clients for better analysis."
"I would say that some of the support elements need improvement."
"The performance is one area that Talend Data Quality could improve in because large volumes take a lot of time."
"There are more functions in a non-streamlined manner, which could be refined to arrive at better off-the-shelf functions."
"The ability to change the code when debugging the JavaScript could be improved."
"They don't have any AI capabilities. Talend DQ is specifically for data quality, which only has data profiling. With Talend DQ, I cannot generate any reports today, so I need an ETL tool. It provides general Excel files, or I have to create some views. If instead of buying a new tool, Talend provides a reporting capability or solution, it would be great. It will reduce the development effort for creating these kinds of reports. We also manage the infrastructure for Talend. From the licensing perspective, for cloud, they only have seat licenses where one person is tied to one license, but for on-premise, they have concurrent licenses. It would be really awesome if they can provide concurrent licenses for the cloud so that if one person is not there, somebody else can use that license. Currently, it is not possible unless a person deactivates his or her license and moves the same seat license to someone else. We are one of the biggest customers in the central zone of the US for Talend, and this is the feedback that we have provided them again and again, but they come back and say that they aren't able to provide concurrent licenses on the cloud. In version 7.3, there is a feature for tokenization and de-tokenization of data. This is the feature that we are looking for. It is useful if somebody wants to see what we have masked and how do we demask it. This feature is not there in version 7.1. There are also a few other capabilities on the cloud, but we don't yet have a big footprint in the cloud."
RapidMiner is ranked 3rd in Predictive Analytics with 20 reviews while Talend Data Quality is ranked 4th in Data Quality with 20 reviews. RapidMiner is rated 8.6, while Talend Data Quality is rated 8.0. The top reviewer of RapidMiner writes "A no-code tool that helps to build machine learning models". On the other hand, the top reviewer of Talend Data Quality writes "Saves a lot of time, good ROI, seamless integration with different databases, and stable". RapidMiner is most compared with KNIME, Alteryx, Dataiku, Tableau and Microsoft Azure Machine Learning Studio, whereas Talend Data Quality is most compared with Ataccama DQ Analyzer, Informatica Data Quality, Alteryx, Precisely Trillium and Ataccama ONE Platform.
We monitor all Predictive Analytics reviews to prevent fraudulent reviews and keep review quality high. We do not post reviews by company employees or direct competitors. We validate each review for authenticity via cross-reference with LinkedIn, and personal follow-up with the reviewer when necessary.