We performed a comparison between Actian Pervasive Data Integrator [EOL], Informatica PowerCenter, and StreamSets based on real PeerSpot user reviews.
"There were no concerns with the stability. This product is very good from a stability perspective."
"It has helped us monetize."
"It's a complete package, which is why we use this solution."
"What I like the most is that we have to deal with less while writing the queries."
"It works with any multi-databases, so it works with Sybase, SQL Server. Also, the performance is really good and it is easy to use."
"The most valuable feature of Informatica PowerCenter is data transformation and user-friendliness."
"The most complex task, in this case, was to read and transform BLOB data, and Java transformation in Informatica Power Center was a great solution."
"I like the automated scheduling feature."
"I found the map links, work links, and workflows valuable. They are important features."
"I have used Data Collector, Transformer, and Control Hub products from StreamSets. What I really like about these products is that they're very user-friendly. People who are not from a technological or core development background find it easy to get started and build data pipelines and connect to the databases. They would be comfortable like any technical person within a couple of weeks."
"StreamSets’ data drift resilience has reduced the time it takes us to fix data drift breakages. For example, in our previous Hadoop scenario, when we were creating the Sqoop-based processes to move data from source to destinations, we were getting the job done. That took approximately an hour to an hour and a half when we did it with Hadoop. However, with the StreamSets, since it works on a data collector-based mechanism, it completes the same process in 15 minutes of time. Therefore, it has saved us around 45 minutes per data pipeline or table that we migrate. Thus, it reduced the data transfer, including the drift part, by 45 minutes."
"What I love the most is that StreamSets is very light. It's a containerized application. It's easy to use with Docker. If you are a large organization, it's very easy to use Kubernetes."
"The most valuable feature is the pipelines because they enable us to pull in and push out data from different sources and to manipulate and clean things up within them."
"The Ease of configuration for pipes is amazing. It has a lot of connectors. Mainly, we can do everything with the data in the pipe. I really like the graphical interface too"
"It's very easy to integrate. It integrates with Snowflake, AWS, Google Cloud, and Azure. It's very helpful for DevOps, DataOps, and data engineering because it provides a comprehensive solution, and it's not complicated."
"The most valuable features are the option of integration with a variety of protocols, languages, and origins."
"StreamSets data drift feature gives us an alert upfront so we know that the data can be ingested. Whatever the schema or data type changes, it lands automatically into the data lake without any intervention from us, but then that information is crucial to fix for downstream pipelines, which process the data into models, like Tableau and Power BI models. This is actually very useful for us. We are already seeing benefits. Our pipelines used to break when there were data drift changes, then we needed to spend about a week fixing it. Right now, we are saving one to two weeks. Though, it depends on the complexity of the pipeline, we are definitely seeing a lot of time being saved."
"I am not sure if there are various connectors available in the recent version of Pervasive DI to support the wide range of sources available (e.g., big data, cloud, EME)."
"This product is going to decommission in the next couple of years."
"Informatica PowerCenter could improve on the documentation for the implementation. The documents provided are not very good for a new user."
"We need another tool for monitoring. It would be easier if all the features were consolidated into one tool."
"It should be more cloud-centric than on-prem-centric."
"An issue which should be addressed is that the solution only allows us to do structured, as opposed to unstructured, data processing."
"The real-time database connectivity when getting the real-time data using the VPN is an area that needs improvement."
"In terms of performance improvement and tuning, there should be a bit more guidance and documentation."
"Some of the conversions are done inside the product. We use work tables that are created by the engine itself, but the names of the work tables are very long, and they don't have any meaning, which makes it a bit difficult to understand and follow exactly what is happening inside."
"StreamSets should provide a mechanism to be able to perform data quality assessment when the data is being moved from one source to the target."
"Visualization and monitoring need to be improved and refined."
"I would like to see further improvement in the UI. In addition, upgrades are not automatic and they should be automated. Currently, we have to manually upgrade versions."
"They need to improve their customer care services. Sometimes it has taken more than 48 hours to resolve an issue. That should be reduced. They are aware of small or generic issues, but not the more technical or deep issues. For those, they require some time, generally 48 to 72 hours to respond. That should be improved."
"The data collector in StreamSets has to be designed properly. For example, a simple database configuration with MySQL DB requires the MySQL Connector to be installed."
"The design experience is the bane of our existence because their documentation is not the best. Even when they update their software, they don't publish the best information on how to update and change your pipeline configuration to make it conform to current best practices. We don't pay for the added support. We use the "freeware version." The user community, as well as the documentation they provide for the standard user, are difficult, at best."
"There aren't enough hands-on labs, and debugging is also an issue because it takes a lot of time. Logs are not that clear when you are debugging, and you can only select a single source for a pipeline."
"We create pipelines or jobs in StreamSets Control Hub. It is a great feature, but if there is a way to have a folder structure or organize the pipelines and jobs in Control Hub, it would be great. I submitted a ticket for this some time back."