What is our primary use case?
We primarily use it as a cost reduction tool for our cloud spend in Azure, and also for performance optimization and awareness. We use Turbonomic to identify opportunities where we can optimize our environments from a cost perspective, leveraging the utilization metrics to validate that resources are right-sized and to avoid overprovisioning public cloud workloads. We also use Turbonomic to identify workloads that require additional resources to avoid performance constraints.
We use the tool to help orchestrate Turbonomic-generated decisions so we can incorporate those decisions through automation policies. This spares us the long man-hours of having someone available after hours or on a weekend to actually perform an action. In the majority of cases, the actions from those decisions are scheduled for a specific date and time. They are executed without anyone standing by to click a button. Some of those automated orchestrations are performed without us even having to review the decision, based on constraints that we have configured. So, the tool identifies a resource with a decision that either addresses a performance issue or takes a cost-saving optimization, then it automatically implements that decision at the specific times that we have defined within the business to minimize impact as much as possible.
There are some cases where we might have to take a quick look at them manually and see if it makes sense to implement that action at a specific date and time. We then place the recommendation into a schedule that orchestrates the automation so we are not tying up essential IT people to take those actions. We take these actions for our public cloud offering within Azure. We don't use it so much for on-prem workloads. We don't have any other public cloud offerings, like AWS or GCP.
We do have it monitor our on-prem workloads, but we do not really have much of an interest in on-prem because we're in the process of a lift-and-shift migration, moving all workloads into the cloud. So, we are not really doing too much with the on-prem stuff. We do use it for some migration planning and cost optimization to see what a workload would look like once we have migrated it into the cloud.
From our on-prem perspective, we do use it for some of the migration planning and cost planning. However, most of our implementations with this are for optimization and performance in the public cloud.
It provides application metrics and estimates the impact of taking a suggested action from two aspects:
- It shows you what that impact is from the financial aspect in a public cloud offering. So, it will show you if that action will end up costing you more money or saving you money. Then, it also will show you what that action will look like from a performance and resource utilization perspective. It will tell you, if you make the change, what the resource utilization will look like as a percentage, whether you will be consuming more or fewer resources, and whether you will have enough resource overhead for performance spikes.
- It will give you the ability to forecast what the utilization is going to be in the future. So, you can gauge how the action that you're taking now is going to look and work for you in the long term.
How has it helped my organization?
In our organization, optimizing application performance is a continuous process that is beyond human scale. We see tremendous value in Turbonomic to help us close that gap as much as possible within our organization. Essentially Turbonomic will provide us with a recommendation on how to address a workload in real-time based on its actual utilization. Then, we have pre-defined time slots where those actions can be implemented with minimal impact to the business because some of the changes may require rebooting the server. So, we don't want to reboot the server at 2:00 in the afternoon when everyone is using it, but we might have a dedicated time slot that says, "After 5:00 today or 2:00 in the morning when no one is using it, this server can be rebooted to take the action."
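To make the time-slot idea concrete, here is a minimal Python sketch of deferring a disruptive action to the next allowed window. It is an illustration only, not how Turbonomic implements its schedules. The function names, the 17:00 to 06:00 window, and the rule that non-disruptive actions run immediately are all assumptions made for the example.

```python
from datetime import datetime, time, timedelta


def next_allowed_start(now: datetime, window_start: time, window_end: time) -> datetime:
    """Return the earliest moment at or after `now` inside the daily window.

    The window may wrap past midnight (for example 17:00 to 06:00).
    """
    def in_window(t: time) -> bool:
        if window_start <= window_end:
            return window_start <= t < window_end
        # Wrapping window: valid late in the evening or early in the morning.
        return t >= window_start or t < window_end

    if in_window(now.time()):
        return now
    candidate = datetime.combine(now.date(), window_start)
    if candidate <= now:
        # Today's window has already opened and closed; use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


def schedule_action(now: datetime, requires_reboot: bool,
                    window_start: time, window_end: time) -> datetime:
    """Hypothetical policy: only actions that need a reboot wait for the window."""
    if not requires_reboot:
        return now
    return next_allowed_start(now, window_start, window_end)


# A resize that needs a reboot, recommended at 2:00 in the afternoon,
# is held until the window opens at 5:00 PM the same day.
afternoon = datetime(2024, 1, 10, 14, 0)
run_at = schedule_action(afternoon, True, time(17, 0), time(6, 0))
```

The point of the sketch is the separation of concerns: the recommendation is produced whenever the data supports it, while the execution time is decided by a business-defined window.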
We have leveraged Turbonomic not only to ingest workload utilization data and come up with performance-driven decisions, but also to help orchestrate and initiate those actions automatically for a very large portion of our organization without us having to be involved at all. For some more sensitive workloads, we look at the decisions and coordinate with the business on whether we will take action at another date and time.
We primarily use it in the public cloud for servers. We also monitor storage and databases within Azure. This is another added benefit that we like about Turbonomic. When we look at a decision, we look at what is driving it: from a storage perspective, the IOPS being driven to a specific storage solution within our public cloud offering; from a database perspective, the specific DTU utilization; or a percentage of memory or CPU consumption. It takes all those aspects into account and never puts us in a position where we take a decision or action without accommodating these other pieces, which could then negatively impact us.
That level of monitoring is what has given us the confidence to allow Turbonomic to implement actions automatically without having IT oversight micromanage decisions, because it provides that holistic view, takes into account all those aspects, and ensures that a decision that is implemented never puts you into a point of contention or concern. We have the confidence to allow the appliance or the software solution to take actions with little to no IT oversight.
Turbonomic has identified areas within our public cloud where we had storage that was not being used at all. So, it provided us with insight into what that unused storage was so we could delete the unused storage and save on the recurring consumption cost. That was very helpful.
We have identified numerous workloads which had been overprovisioned by an administrator. We were able to right-size those workloads to use fewer resources, e.g., a configuration with less memory or less CPU than originally configured, which costs us less money in our public cloud offering. That helps us reduce our cloud consumption significantly.
In addition to ensuring that workloads are right-sized correctly, we have been able to save even more with our public cloud consumption by identifying workloads where we could purchase reserved instances, essentially long-term contracts for specific workload sizes. This allows us, on average, to save an additional 33% or more on our server run rates.
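As a rough worked example of that saving, the Python sketch below compares a pay-as-you-go monthly cost with a reserved rate discounted by 33%. The hourly rate of 0.20 and the 730-hour month are hypothetical figures chosen for illustration; only the 33% comes from our experience, and actual reserved instance pricing depends on the provider, term, and workload size.

```python
def monthly_cost(hourly_rate: float, hours_per_month: float = 730.0) -> float:
    """Pay-as-you-go cost for a workload that runs all month."""
    return hourly_rate * hours_per_month


def reserved_monthly_cost(hourly_rate: float, discount: float = 0.33,
                          hours_per_month: float = 730.0) -> float:
    """Same workload under a long-term commitment with a flat discount."""
    return monthly_cost(hourly_rate, hours_per_month) * (1.0 - discount)


# Hypothetical server at 0.20 per hour.
on_demand = monthly_cost(0.20)
reserved = reserved_monthly_cost(0.20)
saving = on_demand - reserved
print(f"on-demand {on_demand:.2f}, reserved {reserved:.2f}, saved {saving:.2f} per month")
```

The commitment only pays off if the workload is already right-sized and expected to keep running, which is why we look at reserved instances after right-sizing rather than before.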
Turbonomic provides a proactive approach to avoiding performance degradation. It has allowed us to detect issues before they have actually become issues. Traditionally, in IT, we would not be aware of an issue until someone from the business came to us with an issue, then we would investigate the issue. In some cases, we would spend a couple hours trying to figure out what the issue was, then determine if something needed more resources, like more memory. Since Turbonomic, we have been able to almost immediately identify that our system needs more resources and take the action right then and there. Or, Turbonomic has identified there is an issue and we take an action, then notify the business that an action was taken in order to preemptively avoid a business impact.
Previously, a business impact use case would potentially take us hours. With Turbonomic, whenever we run into a business impact use case now, before we even log into a system to initially troubleshoot it, the first thing we do is go to Turbonomic and see, "What is Turbonomic telling us? What is the workload like now? What has it looked like in the last 24 hours or week? Do we see any trends to help guide us towards identifying where we should go from a troubleshooting perspective?" From that aspect, Turbonomic has definitely helped guide our path to resolution.
What is most valuable?
The ability to look at a workload from an actual consumption perspective for the resources that it's consuming internally is particularly valuable. For instance, when we have a server in the public cloud, we might provision a certain amount of memory resources to it and CPU, e.g., two processors and 24GB of memory. The tool provides the ability to look at the consumption utilization over a period of time and determine if we need to change that resource allocation based on the actual workload consumption, as opposed to how IT has configured it. Therefore, we have come to realize that a lot of our workloads are overprovisioned, and we are spending more money in the public cloud than we need to.
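The kind of reasoning involved can be sketched in a few lines of Python. This is a deliberately simplified illustration and not Turbonomic's actual analysis: the use of peak utilization, the 25% headroom, and the list of available memory sizes are all assumptions made for the example.

```python
from typing import Sequence


def recommend_memory_gb(provisioned_gb: float,
                        utilization_samples: Sequence[float],
                        headroom: float = 0.25,
                        sizes_gb: Sequence[int] = (4, 8, 16, 24, 32, 64)) -> int:
    """Pick the smallest available size that covers observed peak use plus headroom.

    `utilization_samples` are fractions (0.0 to 1.0) of provisioned memory in use,
    collected over the observation period.
    """
    if not utilization_samples:
        raise ValueError("need at least one utilization sample")
    peak_used_gb = max(utilization_samples) * provisioned_gb
    target_gb = peak_used_gb * (1.0 + headroom)
    for size in sorted(sizes_gb):
        if size >= target_gb:
            return size
    # Nothing in the catalog is large enough; fall back to the largest size.
    return max(sizes_gb)


# A server provisioned with 24GB that never uses more than 30% of it.
suggested = recommend_memory_gb(24, [0.20, 0.25, 0.30])
print(suggested)
```

Here the server peaks at about 7.2GB in use, so a 16GB configuration would still leave room for spikes. The same function also works in the other direction: a workload running close to its limit would be pushed up to a larger size.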
This solution gives us the data to make business decisions without being concerned about whether we are going to impact the business negatively by taking the wrong action. We actually have the analytical data to back decisions. This helps us have discussions with the business about whether it's the right decision to make or not.
Turbonomic has the ability to manage the full application stack. We have not plugged in all aspects of our application stacks, but it does provide that. One of the things that we love about Turbonomic is that we're not only ingesting the data into Turbonomic and reviewing the decisions that Turbonomic is providing, but Turbonomic is also essentially providing us a single pane of glass to implement those actions. So, if there is an action that we would like to take, whether it is someone manually clicking a button or the action being initiated automatically by Turbonomic, that is all done from within the appliance. We don't have to go and log in somewhere else or log into our public cloud offering to take that action. It can all be done from a single management pane. We can look at our supply chain for a specific application or workload and see if one specific part of the solution is causing a problem, as opposed to having a bunch of people on a bridge call, each looking at the aspect of the solution that they are more intimate with. Turbonomic shows us, from a service chain perspective, how things fit together and helps us identify the single point or bottleneck causing the impact. We have used it from that perspective.
It provides the ability for us to create customized dashboards and custom reports to help showcase info to key stakeholders. We have leveraged the custom reporting for things, like SAP, that we have running in the public cloud to show how SAP is running, both from a performance aspect as well as from a cost perspective.
What needs improvement?
There is an opportunity for improvement with some of Turbonomic's internal permissions for role-based access control. We would like the ability to come up with some customized permissions or to scope permissions a bit differently than the product provides. We are trying to get broader use of the product within our teams globally. The only thing making mass global adoption hard is the question, "How do we provide access to Turbonomic and give people the ability to do what they need to do without impacting others that might be using Turbonomic?" because we have a shared appliance. I also feel that the scenario I'm describing is, in a way, somewhat unique to our organization. It might be something that some others run into. But, predominantly, most organizations that use or adopt Turbonomic probably don't run into the concerns or scenarios that we're trying to overcome in terms of delegating permission access to multiple teams in Turbonomic.
For how long have I used the solution?
It has been somewhere between two and a half and three years since we started our relationship with them.
What do I think about the stability of the solution?
The stability is very good. We have not had to open any support tickets for the product to troubleshoot or recover the appliance. It has been running just fine. We haven't had to redeploy or recover anything with it, surprisingly, in the two and a half years that we have had it. The code updates are pretty easy to perform as well. Ongoing maintenance is really simple, and our account team helps us with the code updates. They put a meeting invite together, and the update takes less than 10 minutes, but they are there every step of the way.
What do I think about the scalability of the solution?
It is pretty scalable, in terms of any concerns that we would have. Right now, we are using on-prem appliances. However, if we needed to, they offer the ability to port to a SaaS-based offering, which would help some of our sister companies adopt it faster, because access would not be restricted to the network within this particular data center. We could leverage the same licensing from a SaaS perspective, and they wouldn't have to use a VPN to connect to the appliance to use it.
There are situations from a scalability perspective where we have to take into account things like GDPR. Where GDPR or data sovereignty comes into play, scalability becomes a bit of a concern because you can only keep the appliance within that specific region. You need separate instances of Turbonomic, but the team allows us to tackle that from a licensing perspective. This is a pretty minimal concern. We handle GDPR or data sovereignty by simply deploying an instance of Turbonomic within that specific country or region.
How are customer service and technical support?
If we have any questions or concerns, the account team as well as the product support team are always there and very accommodating. With any problems that we have, even if they concern something that is not built into the product, we have worked with them to give feedback on the product and on how we would like it to work. They have worked with us to help incorporate some of that functionality into the product so it is available, not just for us, but for other customers who use the product as well.
How was the initial setup?
The initial setup was relatively straightforward. It was a pretty easy setup. I wouldn't say it was any more difficult than any other tool that we set up or have used in our environment. It is pretty easy to deploy, then probably just as easy to configure once it was deployed.
What was our ROI?
It helps us gauge our return on investment for the purchase of Turbonomic, based on the overall actions that we've taken and how much money we have saved by taking those actions over a period of time.
In the last year, Turbonomic has reduced our cloud costs by $94,000. It has identified a lot more cost saving areas, but we haven't taken advantage of those.
The number of tickets that we have had come in for performance issues has amounted to almost nothing in the calendar year. I don't know what we had before, but now it is fewer than 10 to 12 tickets a year for performance issues.
It has definitely provided a huge benefit in the area of man-hours saved. Without the tool, we would be flying blind on that and would probably be spending a lot of man-hours trying to formulate in-house strategies on how to reduce costs. Our company is a very lean company, in terms of headcount for IT resources as well as cloud skillset awareness. Having a tool like Turbonomic has allowed us to adopt and implement strategies like this, like cost saving measures with the public cloud, probably making us exponentially faster than we could have been without them.
We touched on how it ingests workload performance data to provide performance-driven analysis and recommendations on whether a workload should be scaled up or down. One side effect of ingesting this data, and of the business decisions coming out of Turbonomic, is that it has helped us identify workloads which are really not being used at all. By identifying those workloads, we are able to go through our lifecycle management faster and more efficiently than we would have in the past. We have been able to decommission servers, essentially deleting them from our public cloud and eliminating the operational cost of that workload altogether. So, it is not just ensuring that the VM is right-sized or locking in a commitment, but also identifying workloads whose utilization is so low that they may not be needed.
We are able to go back to the business and have a discussion with them based on the utilization of that VM over the period of time for which we have data, then have the justification and communication with the business to say, "Yeah, it doesn't make sense to have this workload in the environment anymore. Let's delete it." or, "Yeah, it's something that isn't used at all. Let's go ahead and delete it." It is allowing us to identify areas to save cost, but it's also helping us say, "This workload is costing us this much money. Is it really worth spending this much money every month or so for this solution that is running in the public cloud? Is it generating enough revenue for the business to warrant the run rate? Is the solution providing a service to the business that justifies the operational consumption on a monthly basis?" We are able to have these internal discussions within the business based on the data that Turbonomic is providing. This is a side effect of the product: the product is not making these decisions and implementing them, but it is providing us the data to have these discussions and reach these decisions as an outcome. Then, this ends up saving money in our public cloud offering.
Which other solutions did I evaluate?
We did try some other solutions as PoCs before we worked with Turbonomic. Unfortunately, I am not aware of who those companies were because that was before I came onboard with the team. The big thing that it always came down to was whether we were going to adopt the entire implementation setup and configuration aspect. For example:
- How much work was it going to take to deploy the appliance?
- How many man-hours would it take to configure it?
- What was the continuous configuration and management going to be?
- Was it really saving us time and money in the long run?
Other solutions always fell flat because of how much involvement it would require from IT to deploy and work it, but also because of the ongoing configuration and maintenance of the appliance.
What other advice do I have?
It doesn't pick up the entire supply chain automatically. It requires minimal effort in configuration. We have to show a relationship in a sense that this workload is associated with another workload. However, once that relationship is established, the solution helps us manage our business-critical applications by understanding the underlying supply chain of resources.
Our capital expenses are relatively flat. We are not purchasing any new equipment. We are actually in a consolidation process. Everything is getting moved to the public cloud. From an operational perspective, with our workloads being in the public cloud, it has provided us:
- The ability to identify what we have running in the public cloud and how much it will actually cost us.
- How we can reduce public cloud operational costs, e.g., what actions can we do to help reduce operational expenses in the public cloud?
It identifies areas where we can delete storage that is not being used. We can address right-sizing workloads that are overprovisioned in the public cloud, as well as lock in long-term commitments for workloads in the public cloud, saving, on average for us, 33% or more on those instances, as opposed to just paying the pay-as-you-go hourly rate with the provider.
Try to look at things, not just from a cost savings perspective, but also from a performance issue avoidance perspective. We looked at: How do we quantify our spend in the public cloud and how do we avoid spend in the public cloud? But we always forgot that there were workloads out there that do have performance impacts. So, we thought of this as a cost savings and cost optimization tool, but it became so much more than that.
We developed a crawl, walk, run approach. We took some workloads in our public cloud and looked at the business decisions. We took the decisions, then we tested to see what the outcomes were with them. As we went through those actions manually, gained the confidence on how those actions were being made, and what the post impact of that was, that allowed the business to become more confident in the tool. We no longer needed to have meetings to discuss why we were doing what we were doing.
It then became a point of communication. An action would be taken because Turbonomic said it was the right thing to do. Nowadays, it's not even questioned. When I talk to people about trying out Turbonomic and looking at how to adopt it for their workloads, I say to look at areas which are current pain points in your environment and see where Turbonomic would fit into that, instead of trying to come up with workloads or use cases to plug into Turbonomic. Instead of trying to figure out what you have or seeing where you could put Turbonomic in your environment, see where your environment fits into Turbonomic. That was the way that we were able to drive adoption much faster and use it, not just as a reporting tool, but also as an orchestration tool.
They have some room to grow. I wouldn't give them a perfect 10. I would probably give them an eight and a half or nine (as a whole number).
Which deployment model are you using for this solution?
Which version of this solution are you currently using?