What is our primary use case?
We are a service provider and we currently have five clients with active IT implementations that use Amazon Redshift. We also use it ourselves.
My clients primarily use this product for data analytics. They are mostly working with big data and using the machine learning functionality.
What is most valuable?
I like the cost-benefit ratio: it is as easy to use as it is powerful and well-performing. There are only three parameters that you need to understand: the distribution key, the sort key, and the compression (encoding) method. Once you understand these, you can tune the performance.
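As a rough illustration of where those three settings live, here is a small Python sketch that builds a Redshift CREATE TABLE statement. The table name, column names, and the encodings chosen are hypothetical and only show where each setting goes in the DDL; they are not a tuning recommendation.

```python
from dataclasses import dataclass


@dataclass
class Column:
    name: str
    sql_type: str
    encoding: str  # compression encoding, e.g. "az64", "zstd", "raw"


def build_create_table(table, columns, dist_key, sort_keys):
    """Return DDL text that sets the distribution key, sort key and encodings."""
    names = {c.name for c in columns}
    missing = ({dist_key} | set(sort_keys)) - names
    if missing:
        raise ValueError(f"unknown key column(s): {sorted(missing)}")
    col_lines = ",\n".join(
        f"    {c.name} {c.sql_type} ENCODE {c.encoding}" for c in columns
    )
    return (
        f"CREATE TABLE {table} (\n{col_lines}\n)\n"
        f"DISTSTYLE KEY\n"
        f"DISTKEY ({dist_key})\n"
        f"SORTKEY ({', '.join(sort_keys)});"
    )


# Hypothetical example table: distribute by customer, sort by sale time.
ddl = build_create_table(
    "sales",
    [
        Column("customer_id", "BIGINT", "az64"),
        Column("sold_at", "TIMESTAMP", "raw"),
        Column("amount", "DECIMAL(12,2)", "az64"),
    ],
    dist_key="customer_id",
    sort_keys=["sold_at"],
)
print(ddl)
```

The helper only produces text; which column should be the distribution key or sort key depends on how the table is joined and filtered.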
What needs improvement?
I would like a better way to ingest data in real time because there is a bit too much latency.
There are too many limitations with respect to concurrency. It is now possible to auto-scale it, although that is still slow.
It could offer smaller nodes that decouple storage from processing. For the moment, the only nodes that work that way are huge and aimed at large companies.
For how long have I used the solution?
My first implementation of Redshift was three and a half years ago, in 2017.
What do I think about the stability of the solution?
We have not had many issues with stability.
What do I think about the scalability of the solution?
Scalability can be a problem if you don't write your database queries correctly. For example, if you write a Cartesian product in Redshift, you may end up consuming all of the resources. However, it does have features like workload management to prevent this from happening.
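To make the Cartesian product point concrete, here is a toy sketch in plain Python (not Redshift itself) with made-up rows. It compares a join that has no join condition with a join on a key.

```python
from itertools import product

# Made-up sample data: 6 orders spread over 3 customers.
orders = [{"order_id": i, "customer_id": i % 3} for i in range(6)]
customers = [{"customer_id": i, "name": f"c{i}"} for i in range(3)]

# Join with no condition: every order is paired with every customer.
cartesian = [(o, c) for o, c in product(orders, customers)]

# Join on a key: each order is paired only with its own customer.
by_id = {c["customer_id"]: c for c in customers}
keyed = [(o, by_id[o["customer_id"]]) for o in orders]

print(len(cartesian))  # 6 * 3 = 18 rows
print(len(keyed))      # 6 rows
```

The unconditioned join grows with the product of the two table sizes, so two tables of a million rows each would produce a trillion intermediate rows. That is why one badly written query can consume a whole cluster.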
Our clients are mid-sized to very large companies.
How are customer service and technical support?
I have been in touch with Amazon technical support and they are very good. They are efficient and resolve problems quickly. They know what they're doing and they're very professional.
Which solution did I use previously and why did I switch?
I have also used Snowflake and its methods for ingesting real-time data are faster. It also offers a bit more functionality and a bit more flexibility. It's a bit easier to maintain and faster to scale, but more expensive as well.
To me, the big drawback with Snowflake is that the data is not stored in your own AWS account or Azure subscription. They store the data in their own account that they manage for you, which might be a problem for some companies in terms of compliance and legal requirements.
Azure Synapse and Google BigQuery are also competing solutions.
How was the initial setup?
The deployment is very straightforward and it usually takes a couple of minutes. This is one of the reasons I like it.
As long as a person understands the AWS landscape, they can deploy it on their own. Otherwise, without realizing it, they might, for example, deploy a Redshift cluster that is not properly secured. Similarly, it could cost a lot of money if they don't know what they're doing. You don't need very in-depth technical expertise, but you do need to understand how AWS works.
What about the implementation team?
I have a team that provides maintenance for our customers. It is spread between France and Belgium and I have 25 people who report to me, with another 20 who I work with indirectly.
What's my experience with pricing, setup cost, and licensing?
The cost of Redshift ranges from a few hundred dollars a month to thousands of dollars a month, according to the resources that you're going to use, the number of nodes, and the type of nodes.
A very small implementation for one of my customers costs about $500 a month. I also have a customer with a monthly invoice of about $25,000.
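As a back-of-the-envelope sketch of how node count and node type drive the bill, the Python snippet below multiplies nodes by an hourly rate by the hours in a month. The hourly rates are hypothetical placeholders, not actual AWS prices, and the estimate ignores storage, data transfer, and reserved-instance discounts.

```python
HOURS_PER_MONTH = 730  # average month: 8760 hours / 12


def monthly_cost_usd(node_count, hourly_rate_usd, hours=HOURS_PER_MONTH):
    """Estimate the on-demand compute cost of a cluster for one month."""
    if node_count < 1 or hourly_rate_usd < 0:
        raise ValueError("need at least one node and a non-negative rate")
    return node_count * hourly_rate_usd * hours


# Hypothetical rates, for illustration only.
small = monthly_cost_usd(node_count=3, hourly_rate_usd=0.25)
large = monthly_cost_usd(node_count=10, hourly_rate_usd=3.5)

print(small)  # 547.5
print(large)  # 25550.0
```

With these assumed rates, a three-node cluster of small nodes lands in the hundreds of dollars a month, and ten large nodes land in the tens of thousands.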
What other advice do I have?
With the most recent update, we should now be able to decouple storage from processing.
My advice for anybody who is implementing Redshift is to make sure that they are using it for what it is made to do. It's an analytical database, so it's not meant to process transactional data. It's the perfect tool if you use it for the right purpose.
Overall, it is a very stable and robust product. That said, there is still plenty of potential for improvement.
I would rate this solution a seven out of ten.
Which deployment model are you using for this solution?
If public cloud, private cloud, or hybrid cloud, which cloud provider do you use?
Amazon Web Services (AWS)