What do BI analysts and big data professionals need to consider when choosing an ETL tool?
With all the software options on the cloud computing market, which data warehouse tools are users most satisfied with?
At IT Central Station, many users have written cloud data warehouse reviews that address these questions, as well as other prevalent topics among IT users currently engaged in BI analytics, data mining, and cloud-based data at large.
Amazon Redshift
“Our single source of truth for the organization”
Nir Wasserman, BI Manager at a tech vendor with 201-500 employees, writes about the value of a “unified and organized database,” which he says Amazon Redshift provides:
“Since we have lots of data sources and high volumes, we needed a unified and organized DB that can handle these amounts and will be our one single source of truth for the organization. Therefore, Redshift is the best solution.”
Petabyte-scale Data Warehouse
Aju Mathew, Director at a tech company with 1,001-5,000 employees, shares a customer success story that demonstrates the storage capabilities of AWS Redshift:
“One of our existing customers stores more than 500 terabytes of data in an AWS Redshift database and the warehouse performance was good.
We want to highlight that even if the warehouse size increases to petabytes, Redshift would still work fine, and there wouldn’t be any performance issues and would cost less also.”
Query compilation time
Gregor Ratajc, Full Stack Engineer at a tech services company with 11-50 employees, points out the need for Amazon Redshift to improve its query compilation times:
“Query compilation time needs a lot of improvement for cases where you are generating queries dynamically. Also, it would help tremendously to have some more user-friendly, query optimization helper tools.”
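For readers who generate queries dynamically, one general way to keep the number of distinct query shapes small is to pad variable-length IN lists to a few fixed sizes. The sketch below is an illustration only and is not taken from the review. It assumes that compiled code is reused for queries with the same structure, which may or may not hold for a given workload, and the table and column names are hypothetical.

```python
# A minimal sketch: pad IN-list parameters to fixed bucket sizes so that
# dynamically generated queries share a small number of shapes.
# Assumption (not from the review): same-shaped queries can reuse compiled code.

def pad_to_bucket(values, buckets=(1, 2, 4, 8, 16, 32, 64)):
    """Repeat the last value until the list length matches a bucket size."""
    if not values:
        raise ValueError("values must not be empty")
    n = len(values)
    for size in buckets:
        if n <= size:
            return list(values) + [values[-1]] * (size - n)
    raise ValueError("too many values for the configured buckets")


def build_in_query(table, column, values):
    """Return (sql, params) using %s placeholders, as DB-API drivers such as psycopg2 accept."""
    padded = pad_to_bucket(values)
    placeholders = ", ".join(["%s"] * len(padded))
    sql = f"SELECT * FROM {table} WHERE {column} IN ({placeholders})"
    return sql, padded


# Hypothetical table and column names, for illustration only.
sql_a, params_a = build_in_query("events", "user_id", [1, 2, 3])
sql_b, params_b = build_in_query("events", "user_id", [5, 6, 7, 8])
print(sql_a)
print(sql_a == sql_b)  # three and four values produce the same query text
```

Repeating the last value does not change the result of an IN filter, so the padding only affects the query text.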
JSON format support
Wasserman also discusses Amazon Redshift’s support for JSON format, explaining that “you can copy JSON to the column and have it analyzed using simple functions.” He also attributes the cloud computing tool’s value to “the parallel off/on where you can choose if you want it to unload to split files or into one file.”
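The two features Wasserman mentions can be sketched roughly as follows. This example only builds the SQL text: it uses Redshift's json_extract_path_text function to read a key from a JSON string column, and the PARALLEL option of UNLOAD to choose between split files and a single file. The table, column, bucket, and IAM role names are hypothetical placeholders, not details from the review.

```python
# A minimal sketch that builds Redshift SQL strings; it does not connect to a cluster.
# All identifiers (events, payload, example-bucket, ExampleUnloadRole) are hypothetical.

def json_field_query(table, json_column, key):
    """SELECT one top-level key from a column that stores JSON as text."""
    return (
        f"SELECT json_extract_path_text({json_column}, '{key}') AS {key} "
        f"FROM {table}"
    )


def unload_statement(select_sql, s3_prefix, iam_role, parallel=True):
    """Wrap a SELECT in UNLOAD; PARALLEL OFF asks for serial output instead of split files."""
    # Single quotes inside the quoted query must be doubled.
    escaped = select_sql.replace("'", "''")
    mode = "ON" if parallel else "OFF"
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role}' "
        f"PARALLEL {mode};"
    )


query = json_field_query("events", "payload", "country")
print(unload_statement(
    query,
    "s3://example-bucket/exports/events_",
    "arn:aws:iam::123456789012:role/ExampleUnloadRole",
    parallel=False,
))
```

With PARALLEL ON, the output is written as multiple files in parallel. With PARALLEL OFF, it is written serially, which typically yields one file unless the result exceeds the per-file size limit.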
How Redshift differs from other cloud computing tools
Padmanesh NC, Big Data Solution Architect - Spatial Data Specialist at Sciera, Inc., explains why he chose AWS Redshift over other tools:
“I evaluated Hadoop and Spark, along with Redshift. I have no negative comments about those other products. Redshift is flexible in terms of configuration, maintenance, and security, especially VPC configuration, which secures our data a lot…
I have experience working in Hadoop as well. When I compare the two (Redshift & Hadoop), Redshift is more user-friendly in terms of configuration and maintenance.”
Netezza
Analyzing Years of Data
Ranganath Praveen, Lead Consultant at a tech services company with 10,001+ employees, writes about Netezza’s powerful data analysis capabilities:
“Analyzing years of data requires high processing power and storage. IBM PDA has exactly that. Years of processed data (tables) can be queried and retrieved based on management requirements. This can be done in minutes for analysis.
This is extremely important in identifying trends for decision making in higher management to serve customers better in today’s business environment.”
Choosing the right ETL and reporting tool
Valai Gunapalan, QlikView Consultant at a tech services company with 11-50 employees, lends advice to users considering Netezza:
“It is easy to use. Make sure you select the right ETL and reporting tool. Also, select the right tool for the organization to hold it in the long run.
It has a compression engine and FPGA on but you should still analyze your volume of data and decide on the right model and size.”
Regarding Netezza’s compression capabilities, Gunapalan adds: “I can't extend the storage, only up to 6x compress of data. You need to plan this when selecting the right product to buy.”
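The sizing advice above comes down to simple arithmetic. The sketch below estimates how much physical storage a given volume of uncompressed data needs at different compression ratios. The 6x figure comes from Gunapalan's comment; the data volume, the lower ratio, and the appliance capacity are hypothetical numbers chosen for illustration.

```python
# A minimal capacity-planning sketch. Only the 6x ratio comes from the review;
# every other number here is a hypothetical example.

def raw_storage_needed_tb(uncompressed_tb, compression_ratio):
    """Physical storage (TB) needed to hold the data at the given compression ratio."""
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    return uncompressed_tb / compression_ratio


def fits(uncompressed_tb, appliance_raw_tb, compression_ratio):
    """True if the compressed data fits within the appliance's physical capacity."""
    return raw_storage_needed_tb(uncompressed_tb, compression_ratio) <= appliance_raw_tb


data_tb = 120        # hypothetical uncompressed data volume
appliance_tb = 32    # hypothetical physical capacity of the chosen model

for ratio in (6, 3):
    needed = raw_storage_needed_tb(data_tb, ratio)
    print(f"{ratio}x compression: {needed} TB needed, fits={fits(data_tb, appliance_tb, ratio)}")
```

Because the achieved ratio depends on the data, sizing against a lower ratio than the best case leaves headroom when storage cannot be extended later.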
Part of an entire ecosystem
A Member of the Board of Directors at a tech services company with 51-200 employees describes the value that Netezza adds to his organization, emphasizing its integration with the other data platforms in the organization's ecosystem.
This user explains that Netezza “has been the primary driving technology behind the corporate-wide transition to Netezza as a standard data platform. A whole ecosystem is beginning to develop around the product.”
He lists several valuable features:
Ease of use
Lack of performance problems for analytics and massive data systems
Integration with Linux-based ETL and data streaming technologies
Integration with distributed computing platforms
Highly complicated architecture
Praveen also points out a challenge of using Netezza: it is “a highly complicated architecture and only IBM engineers/support, or someone who worked on the hardware side of the system can understand the system architecture completely.”
Read more of the most recent cloud data warehouse reviews on IT Central Station.