What is our primary use case?
We are very happy to have a good warehouse solution that we can run on-premises. Next year we will start testing it on the cloud so we can progress to a hybrid solution. We would like to have Vertica in the cloud integrating with our data lake. We could go with another cloud solution but Vertica has been a good solution for us so far and we hope the cloud product is as good and flexible.
On-premises, Vertica would still be used as our data warehouse solution. That keeps us on hardware whose cost we know to be relatively reasonable, even with customers running a large number of reports. At the moment we have customers running many thousands of reports a day, and Vertica gives us real-time insight into our data. The ROI with Vertica on-premises has been substantial.
How has it helped my organization?
Allowing near-real-time reports on vast amounts of data has given us insights into most areas of our business.
What is most valuable?
I think the most valuable thing about the product is the speed and resilience. I would say the strong, well-featured SQL engine with many built-in features, including machine learning, is also a strong point.
When testing alternative columnar or in-memory databases, their query speed may be quicker for one or two users, but their concurrency is not as good as Vertica's. I would have to say Vertica returns queries reliably pretty much all the time, while other products can sometimes run for 10 minutes and still fail.
What needs improvement?
Every product has room for improvement and Vertica is no different in that way.
I think the geospatial functionality is quick, but could be quicker.
I would like to see Vertica continue adding machine-learning code that runs directly on the database.
I also think the ability to link directly to other databases, rather than just data sources and files, would be another improvement.
For new functionality, I think the possibility of adding triggers or programmatic pieces of code might be helpful, depending on the data coming in. It is difficult to say if it would work or cause more issues than it solves.
For how long have I used the solution?
Five-plus years in heavy use.
What do I think about the stability of the solution?
Vertica is extremely stable. In all our years of heavy use, we had one crash, and this was more likely OS-related.
If you have Vertica in the cloud in Eon mode, then you can have different clusters to isolate different types of data usage. So, if it is a situation where you have many small, quick, concurrent queries, you could have one cluster for that. I have one for doing Kafka and batch-loading. You can have one for machine learning as well. But when you are on-premises, up until recently, you had only shared resources, which had to be managed using resource pools.
What do I think about the scalability of the solution?
It is scalable. You just add extra nodes. But like all data solutions and all databases, you always have to consider how the data is stored and how the data is queried. We have Elasticsearch and several other databases. They will perform differently when used for different tasks. The right database for the right task.
Vertica is really quick, but with poor query design a report can take far too long.
Vertica Eon mode allows for true scalability by user or query type, by firing up additional clusters as requirements change.
How are customer service and technical support?
I have contacted the technical support team fairly frequently. I find the support very good, but for more technical questions I end up having to escalate past the first point of contact.
Which solution did I use previously and why did I switch?
I have varied experience in this category of products and I have used a few different ones. We have a legacy Oracle data warehouse, which was just not performing quickly enough, hence the move to Vertica. Since moving to Vertica, all the manual maintenance and speed issues of the Oracle data warehouse have been left behind.
We looked at ParAccel, Vertica and Kognitio when deciding we needed a fast database that could handle large volumes of data. Vertica worked both on-premises and in the cloud. I did extremely in-depth, rigorous testing on it for concurrency, size and volume, and it outperformed all the other databases. The cost was reasonable as well. Some of our larger reports are running on 3.200 billion row tables and they are running in seconds.
How was the initial setup?
Actually, the setup of the product itself was straightforward and really quick.
Once it was up and running and ingesting data, we created our first customer-facing reports very quickly. Scheduled reports that used to take us eight or nine hours on a Sunday, and did not allow any real drill-down, could be executed in milliseconds on Vertica with a lot more flexibility. What takes time is building our dashboards, communicating with the customers, introducing the new products, and documenting the reports.
I found it really straightforward to set up and run. I would say within two months of setting it up, I had all of my ingestion routines in place and everything was formalised. It ended up that it was one of the easiest database products I have ever had to set up.
We deployed without any external help except maybe a few tech support calls.
What was our ROI?
Vertica has had a really excellent ROI for us. Being able to offer real-time reporting opens up possibilities. Democratisation of data and having a front-end reporting tool allow insights into data that would not otherwise have been possible.
Which other solutions did I evaluate?
What differentiates Vertica from other competitors would probably be the ease of setup and ease of management. It can also ingest data from many sources, like Kafka, Parquet, and ORC (Apache Optimized Row Columnar). Vertica is quite flexible in how it handles its data. So mostly the flexibility and the ease of management separate it from most other products.
What other advice do I have?
Advice that I would give others who are considering this solution is what I always say for all products: you should heavily test your use cases and data with the product. There are different versions of software and there are always new software solutions. You should always do exhaustive performance testing on any database product. Better to let the product speak for itself. Test several vendors' products if possible.
To be honest, it is hard to see too many issues with Vertica as a solution. I find a few weaknesses in it, but I think that is the same with all databases. I think when you are doing very large fact table to very large fact table joins, pretty much every database will experience problems. Joining two tables that each have many millions of rows is just a large volume to process.
Honestly, I think the way to do it is to get a proof-of-concept up, run your typical load, and then keep multiplying concurrency and data volume to see how it holds up on concurrency and speed.
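As a rough illustration of that kind of ramp-up test, here is a minimal Python sketch. It is not the harness I actually used, and nothing in it is Vertica-specific: `run_query` is a hypothetical callable that you would replace with something that runs one of your typical report queries through whatever database driver you use, and `fake_query`, `run_step` and `ramp` are names made up for this example. It only covers the concurrency side; data volume you would multiply separately by loading more data between runs.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def run_step(run_query, concurrency, queries_per_worker=5):
    """Run `run_query` from `concurrency` threads at once.

    Returns (latencies_in_seconds, failure_count) for this concurrency level.
    """
    def worker(_worker_id):
        latencies, failures = [], 0
        for _ in range(queries_per_worker):
            start = time.perf_counter()
            try:
                run_query()
                latencies.append(time.perf_counter() - start)
            except Exception:
                # A failed query counts as a failure, not as a latency sample.
                failures += 1
        return latencies, failures

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(worker, range(concurrency)))

    all_latencies = [t for lat, _ in results for t in lat]
    total_failures = sum(f for _, f in results)
    return all_latencies, total_failures


def ramp(run_query, start=1, factor=2, steps=4, queries_per_worker=5):
    """Multiply concurrency by `factor` at each step and summarise each level."""
    report = []
    concurrency = start
    for _ in range(steps):
        latencies, failures = run_step(run_query, concurrency, queries_per_worker)
        report.append({
            "concurrency": concurrency,
            "completed": len(latencies),
            "failed": failures,
            "median_s": median(latencies) if latencies else None,
            "max_s": max(latencies) if latencies else None,
        })
        concurrency *= factor
    return report


def fake_query():
    # Hypothetical stand-in: replace with a call that runs a real report query.
    time.sleep(0.001)


if __name__ == "__main__":
    for row in ramp(fake_query, steps=3):
        print(row)
```

What you are looking for is the point where the median and worst-case times start to climb or queries start failing as concurrency doubles. Running the same ramp against each candidate product gives you a like-for-like comparison.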
Vertica so far has proven to be relatively hassle-free. But, like all databases, you do have to keep an eye on how people use it, particularly when it comes to reporting.
On a scale from one to ten where one is the worst and ten is the best, I would rate Vertica as a nine.
Which deployment model are you using for this solution?