What is our primary use case?
We have a large, enterprise-level VMware virtual infrastructure. We use vROps for private cloud monitoring, capacity management, and audit monitoring. If any issue in the infrastructure crosses a threshold, vROps captures it and triggers an alert. The triggered alerts are sent to our ticketing tool using the REST API, and a ticket is created according to the priority. The respective first-level teams handle those incidents.
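The alert-to-ticket flow could be sketched roughly like this. It is only an illustration, not our actual integration: the criticality names, the priority mapping, the group name, and the ticket fields are all assumptions, and the HTTP call to the ticketing tool is left out.

```python
import json

# Hypothetical mapping from alert criticality to ticket priority.
PRIORITY_BY_CRITICALITY = {
    "CRITICAL": "P1",
    "IMMEDIATE": "P2",
    "WARNING": "P3",
}


def build_ticket(alert: dict) -> dict:
    """Turn an alert (assumed shape) into a ticket payload (assumed shape)."""
    # Anything with an unknown criticality falls back to the lowest priority.
    priority = PRIORITY_BY_CRITICALITY.get(alert["criticality"], "P4")
    return {
        "summary": f"[{priority}] {alert['name']} on {alert['resource']}",
        "priority": priority,
        "assigned_group": "first-level-support",  # hypothetical group name
    }


alert = {
    "name": "Cluster memory usage high",
    "resource": "cluster-01",
    "criticality": "CRITICAL",
}
ticket = build_ticket(alert)
print(json.dumps(ticket, indent=2))
```

In a real setup the payload would be posted to whatever endpoint the ticketing tool exposes, and the mapping would follow the team's own priority scheme.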
How has it helped my organization?
The incidents we deal with are mainly related to capacity management. Over time, the virtual infra keeps growing. We measure when we are going to hit full capacity, and we always set thresholds 30 days ahead of that point. vROps alerts on that, so we can procure more hardware proactively and keep increasing the capacity well in advance.
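The idea behind the 30-day threshold can be shown with a simple linear projection. This is a simplified stand-in with made-up numbers, not the forecasting vROps itself performs; the function names and the constant growth rate are assumptions.

```python
def days_until_full(used_gb: float, total_gb: float, growth_gb_per_day: float) -> float:
    """Linear projection of the days left until usage reaches total capacity."""
    if growth_gb_per_day <= 0:
        return float("inf")  # no growth, so capacity is never reached
    return max(0.0, (total_gb - used_gb) / growth_gb_per_day)


def needs_procurement(used_gb, total_gb, growth_gb_per_day, lead_days=30):
    """True when projected exhaustion falls inside the lead-time window."""
    return days_until_full(used_gb, total_gb, growth_gb_per_day) <= lead_days


# Made-up example: 1000 GB total, 900 GB used, growing by 5 GB per day.
print(days_until_full(900, 1000, 5))
print(needs_procurement(900, 1000, 5))
```

With these numbers the projection lands inside the 30-day window, which is the point at which we would want the alert and start procurement.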
VMware has released a feature called Continuous Availability (CA). We have HA within the data center and CA across data centers. We use both services. For most of the infra we are using HA, meaning that within a given data center we have a master, a master replica, and multiple data nodes. Based on the growth of our virtual infra, or if there is any new deployment, we keep adding data nodes.
It can do analysis and give you beautiful reports. Those reports are very useful for management. What is the status of our memory and CPU? What was the utilization of infra like in the last 30 days? How many workloads were deployed? What are the future requirements? With a simple click we can generate the reports.
It certainly helps us to decrease overall downtime. While we have cluster-level resiliency on the vSphere end, vROps provides an alerting solution. On top of that, we can use workload balancing. vROps will sense that there are multiple clusters running, some heavily utilized and some under-utilized, and it will report that to us. If you let it balance, it applies the plan back to the virtual infra and does all the migrations automatically. Workload balancing is a great feature from vROps. Without vROps, we had 80 to 85 percent uptime. With vROps, we improved that by at least 10 percent and we are close to 98 or 99 percent uptime.
It has also increased VM density on particular clusters. Based on the memory assigned to the workload, the density on the cluster varies. If we have 50 VMs on a particular cluster, but the resource allocation per VM is greater there, that cluster is heavily used. If we have a second cluster with 100 VMs, but each VM is assigned less memory and CPU, we cannot simply say that the density of the first cluster is 50 and that of the second is 100. vROps calculates density based on the demand and allocation model of capacity and resources for the workloads.
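To make the 50-VM versus 100-VM comparison concrete, here is a small sketch that measures how full a cluster is by allocated memory rather than by VM count. The cluster sizes and per-VM memory figures are invented, and the real calculation in vROps also considers demand and CPU, which this ignores.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    name: str
    vm_count: int
    memory_gb_per_vm: int   # assumed uniform per VM, for simplicity
    total_memory_gb: int

    def allocated_fraction(self) -> float:
        """Share of the cluster's memory that is allocated to VMs."""
        return self.vm_count * self.memory_gb_per_vm / self.total_memory_gb


# Invented figures: fewer but larger VMs versus more but smaller VMs.
cluster_a = Cluster("cluster-a", vm_count=50, memory_gb_per_vm=64, total_memory_gb=4096)
cluster_b = Cluster("cluster-b", vm_count=100, memory_gb_per_vm=16, total_memory_gb=4096)

for c in (cluster_a, cluster_b):
    print(c.name, c.vm_count, "VMs,", f"{c.allocated_fraction():.0%} of memory allocated")
```

Here the cluster with half as many VMs is the more heavily allocated one, which is why a raw VM count is a poor measure of density.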
With vROps we have saved on hardware costs by at least 5 percent.
In addition, in general, if I want to see the logs for a particular object, I need to log in to vRealize Log Insight and search by framing a query. But because it is integrated with vROps, when I go to the cluster tree, if I click that object and click on the logs, it will automatically provide the output. It is very simple and I don't need to log in and frame the query.
What is most valuable?
The "what-if" analysis capability is important to us. We can create a report for possible failures. What if we lose one host or two hosts? And if we add two hosts, how does that affect our resources? Or if there is a new project and we need a certain amount of workloads deployed, how many hosts do we need? With the existing capacity, if we add that many workloads what will our remaining capacity be? We can do capacity analysis with this tool.
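The kind of questions we ask in a what-if scenario can be approximated with basic arithmetic. The sketch below is only a rough model with invented numbers: it looks at memory alone, treats all hosts as identical, and the function names are hypothetical rather than anything from the product.

```python
import math


def remaining_capacity_gb(hosts, gb_per_host, used_gb,
                          failed_hosts=0, added_hosts=0, new_workload_gb=0):
    """Free memory after applying a what-if scenario to the cluster."""
    usable_hosts = hosts - failed_hosts + added_hosts
    return usable_hosts * gb_per_host - used_gb - new_workload_gb


def hosts_needed(new_workload_gb, free_gb, gb_per_host):
    """Extra hosts required to fit a new workload into the cluster."""
    shortfall = new_workload_gb - free_gb
    return 0 if shortfall <= 0 else math.ceil(shortfall / gb_per_host)


# Invented baseline: 10 hosts with 512 GB each, 4000 GB already in use.
baseline = remaining_capacity_gb(10, 512, 4000)
lose_two = remaining_capacity_gb(10, 512, 4000, failed_hosts=2)
add_two = remaining_capacity_gb(10, 512, 4000, added_hosts=2)
extra_hosts = hosts_needed(new_workload_gb=2000, free_gb=baseline, gb_per_host=512)

print(baseline, lose_two, add_two, extra_hosts)
```

The tool does this across CPU, memory, and storage together, so treat this only as a picture of the reasoning.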
Policy tuning and the SDDC Management Pack for health monitoring are also important.
It gives us visibility into the virtual infrastructure, and even the physical infrastructure, and into the workloads running. We have visibility even at the level of the appliance services. We can monitor everything. We can also create dependency reports, so if a service is down, we can see what depends on it and keep it from impacting other things. It gives us those dependencies brilliantly.
What needs improvement?
When it comes to policies, they need to fine-tune things to make it easier. Setting up policies is a bit difficult.
For how long have I used the solution?
We have been using VMware vRealize Operations for six years. We started with version 6.x. We keep upgrading and now we are running on the latest version, 8.1.
What do I think about the stability of the solution?
With the HA feature it was a stable product, but with the newer Continuous Availability service we have seen some issues and we are not recommending it. CA is a great feature, and it is working fine for other customers, but when we implemented it as soon as it was available, we learned that it creates a lot of issues for our infrastructure. VMware tried to help us, and their solution was to move back to HA, so we are re-deploying that infra as high-availability.
But stability-wise, it's good. It won't create any issues. If there is an issue, a simple services restart will fix it. Mostly we have seen disk space consumption increase as we keep provisioning and expanding. But that works fine and the product's stability is very good.
What do I think about the scalability of the solution?
We can scale up the infra without any downtime. There have been no issues.
How are customer service and technical support?
If there is any issue, they will pitch in and help, based on the severity. They're very helpful and very knowledgeable. We get good support from them. No issues. Their support has been brilliant.
Which solution did I use previously and why did I switch?
We started applying vROps in parallel with the inception of our VMware infra.
How was the initial setup?
The solution is very user-friendly. In one step it is ready to deploy. We don't need to configure anything at the OS level. You just deploy it and power it on. We only need to configure, in vCenter, which infra we are monitoring. Once we start to onboard, it's very simple to manage. Anybody can deploy and configure it. There are a lot of publicly available articles that we can refer to. There was a great article on end-to-end setup.
Based on the virtual infrastructure size, we decide which appliance size is needed. Do we need to go for tiny, medium, large, or extra-large? The decision is based on our environment's capacity: how many objects we have within the virtual infra. We first deploy the master, then the master replica, and then the data nodes. We can run with one master node, but if we deploy a master, a replica, and data nodes, it gives us more resilience. So even if we have a failure on the master, the master replica makes it a high-availability solution.
Deployment takes just 15 minutes, and we can have vROps up and running in 30 minutes.
There are five members on our team and everyone has knowledge of vROps. Everyone is certified. There is no segregation of roles. Everyone takes care of the entire product life cycle, whether it's upgrading, troubleshooting, or streamlining. We use it day in and day out. Our key job is tracking vROps' health and monitoring alerts, to make sure it's running fine. It's part of our daily work.
What's my experience with pricing, setup cost, and licensing?
They forecast our pricing based on the objects we deploy, but I'm not involved much with that. The licensing part is a bit complicated.
Which other solutions did I evaluate?
We have not evaluated other solutions since this one is from VMware itself. We prefer to use the proprietary solution.
What other advice do I have?
It provides proactive monitoring, but it is not real-time monitoring. It polls every five minutes. If there is an issue in the first minute, but polling happens at the fifth minute, there is a gap of four minutes. It will capture that failure and alert in the fifth minute. It is more reactive monitoring, in that sense. But at least we know there is an issue.
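The four-minute gap follows directly from the polling interval. As a small sketch, assuming polls land exactly on multiples of five minutes (an assumption for illustration; the function name is hypothetical):

```python
POLL_INTERVAL_MIN = 5


def detection_delay(issue_minute: int, interval: int = POLL_INTERVAL_MIN) -> int:
    """Minutes between an issue starting and the next poll that can see it.

    Assumes polls happen at minutes 0, 5, 10, and so on.
    """
    return (-issue_minute) % interval


for minute in (1, 3, 5, 6):
    print(f"issue at minute {minute}: seen after {detection_delay(minute)} min")
```

An issue that starts just after a poll waits almost the full interval before it is seen, so the worst case is close to five minutes.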
Overall, vROps is maturing, year by year. New versions have a lot of scope. We are not fully utilizing it, but if you understand the product features correctly, it will save you a lot of cost and reduce manual efforts. I would recommend it. If someone is looking for virtual monitoring, vROps is the best solution.