What is our primary use case?
My company is a data center service provider. We host and manage IT for all types of companies, using TrueSight to manage and monitor the health, performance, and availability of all our customers' environments: networks, servers, databases, websites, and all their back-end IT.
Right now, the focus is pushing DevOps and AIOps in our more traditional data center management. We are not using it in the cloud space today. Therefore, the focus is the traditional data center space, but for us, that is a very large space.
How has it helped my organization?
One case that we like to use a lot: We have a customer who uses F5 load balancers, and they were managing them with CA products. Those load balancers were generating around 11,000 tickets a month. Just moving them from CA to TrueSight, and replicating the same rules, they went from 11,000 tickets a month to 400 tickets a month. TrueSight did a much better job of doing the same thing. Then from there, we were able to tune it. We got it down to about 40 tickets a month. While this is an extreme example (I don't usually see this type of improvement), it shows the power that is there.
We are able to identify problems more quickly and get an engineer on them to restart services, etc. It is not fixing the customer's bugs. They've got buggy apps that go down all the time. It is just that we can get them back online faster.
What is most valuable?
- Its breadth. It covers so many different technologies, which can all roll up into a single console.
- The noise reduction for ticketing works much better than we have seen in a lot of other companies.
- We're starting to get into the machine learning pieces to further enhance the intelligence of events.
What needs improvement?
Continue to improve the maturity of the product overall.
I definitely would like to see more improvement in the self-diagnostics. I need to know when anything is not working or collecting, long before our customer finds it.
I would like to see continued improvement in the integration with some of their partners. We use a lot of Intuity software. While the connections are good, they could be better. We use App Visibility as part of the TrueSight suite. Previously, we were a big BMC TMRT customer. They gave up a lot of features of TMRT to get App Visibility in, features that our customers used. They still complain about this weekly: When are we going to get this report or view back?
When we took this issue back to BMC, they said, "It wasn't an upgrade from TMRT. It's a brand new product. It just happens to be serving the same market." From my user standpoint, we went from BMC TMRT to BMC App Visibility, giving up all these features. For us, it was an upgrade that we lost features on. I need that stuff back, at the end of the day, as a service provider. The customers need to feel comfortable that the data is there. They need to have accurate SLA-type reports. The SLA reports that we get on TrueSight today are unfortunately worthless. They round to the whole integer, so they all show 100 percent, when we've got contracts at 99.996 percent. If we were at 99.95 percent, that's an SLA miss, but the report would still show 100. Things like this are a problem. We have to do all this manually on the side. We can't roll this back, as the versions that we used to use are long out of support.
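The rounding problem described above can be illustrated with a minimal sketch. The function names and the 30-day month are assumptions made for illustration; this is not how TrueSight computes or stores its SLA figures.

```python
from decimal import Decimal, ROUND_HALF_UP

MINUTES_PER_MONTH = Decimal(30 * 24 * 60)  # assumption: a 30-day month


def whole_percent(availability_pct: Decimal) -> Decimal:
    """Round an availability percentage to a whole integer, as the reports do."""
    return availability_pct.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


def meets_sla(availability_pct: Decimal, target_pct: Decimal) -> bool:
    """Compare the unrounded measurement against the contracted target."""
    return availability_pct >= target_pct


def downtime_minutes(availability_pct: Decimal) -> Decimal:
    """Downtime per month implied by an availability percentage."""
    return (Decimal("100") - availability_pct) / Decimal("100") * MINUTES_PER_MONTH


target = Decimal("99.996")   # contracted SLA from the review
measured = Decimal("99.95")  # the example miss from the review

print("report shows:", whole_percent(measured), "percent")
print("actually meets SLA:", meets_sla(measured, target))
print("downtime allowed (min):", downtime_minutes(target))
print("downtime incurred (min):", downtime_minutes(measured))
```

Comparing the unrounded figure is the only way to see the miss: once the value is rounded to a whole percent, a month with over 20 minutes of downtime looks identical to one that stayed within the contract's budget of under two minutes.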
The biggest issue is probably the gaps in the reporting that I need for my end customers. It is very public and embarrassing to have to say, "I can't give you the report that you need." Also, the reliability of the ISNs needs improving. Having a customer find a machine that stopped collecting before we do is not what you want when you're a service provider.
For how long have I used the solution?
We have been a BMC client since 2001. We've been through many generations of the product.
What do I think about the stability of the solution?
The stability has a bit more maturing to do. There is still room for improvement. Overall, it's pretty good, depending on which layer you're looking at. At the highest level, which is the presentation server, we find that we have to restart it every two months or so, just because it stops responding. I would like it to be a bit better. We don't have any real understanding of what's causing that. The next layer down is the infrastructure manager level. That's probably about the same: every couple of months it stops responding. Farther down is the data collection layer, the ISN level. Those aren't as stable as they need to be. They will go for six months fine, then fail three times in a row in two weeks. A failure doesn't give us a good alarm, and unfortunately, we've missed an event. Then the customer notices that something didn't pass its events. So, a little more maturity is needed here.
What do I think about the scalability of the solution?
It's scaling fairly nicely, but not as large as we would like. We are not seeing the type of scalability that BMC claims. For example, they say that you can run 900 agents against an ISN. We find the ISN stability goes down when you hit 500 or 600. So, you're only at about two-thirds of the capacity. I forget how many millions of things the TSIM was supposed to be able to handle. We are nowhere near that capacity. We're spinning up more TSIMs because it's just not scaling as advertised.
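To put the gap between claimed and observed ISN capacity in concrete terms, here is a rough sizing sketch. The 10,000-agent fleet is a hypothetical figure, and the function name is made up for illustration; only the 900, 600, and 500 agents-per-ISN numbers come from the experience above.

```python
def isns_needed(agent_count: int, agents_per_isn: int) -> int:
    """Smallest number of ISNs that keeps every node at or under its limit."""
    if agent_count < 0 or agents_per_isn <= 0:
        raise ValueError("agent_count must be >= 0 and agents_per_isn > 0")
    # Integer ceiling division, avoiding floating point entirely.
    return -(-agent_count // agents_per_isn)


FLEET = 10_000  # hypothetical number of agents to be served

for label, limit in [
    ("claimed (900 per ISN)", 900),
    ("observed upper bound (600 per ISN)", 600),
    ("observed lower bound (500 per ISN)", 500),
]:
    print(f"{label}: {isns_needed(FLEET, limit)} ISNs")
```

Sizing against the observed stability limit rather than the advertised one means planning for noticeably more nodes for the same fleet, which is the extra infrastructure described above.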
How are customer service and technical support?
Technical support is a mixed bag. Some tickets go in and are handled very quickly and well. However, we have had tickets which go in and have been out there for months, and some of them were fairly complex. They will go up to Tier 2 or Tier 3, then park. I'm assuming that we're running into a software bug, or something, but those tickets that stall out are frustrating.
How was the initial setup?
It was complex. I wish we had put Professional Services into the deal. Being a service provider, we are attached to companies all over the world with very strict auditing and security requirements. Therefore, designing the architecture to work in that environment was fairly complex. I was just talking to a product owner about the problems that we still have.
Once we had the architecture, the deployment went fairly smoothly. The policy creation and management were much more complex than in their previous products. It is probably more powerful, but not as easy to administer.
They have rolled what used to be multiple separate products into a single product. They've had to do some consolidation, or adjustments, to be able to merge them quickly to get their product to ship. This left some things missing. Some features that used to be there, features that we used, are gone. So, there are pain points as we figure out how to work around the new gaps.
What about the implementation team?
We did it ourselves.
Globally, I've got six engineers and 12 operators who worked on the deployment. This is a sizable group. However, I'm currently supporting global operations of a couple hundred clients, and they're major clients.
What was our ROI?
TrueSight has helped reduce IT operations costs. From a software standpoint, I have been able to eliminate a lot of other tools, saving approximately half a million dollars a year in other maintenance costs. That is easy savings. The more important one is the labor savings: more reliable, simplified tickets.
The time savings are recognized by the operations teams, not my team. Therefore, it's hard to know the time savings, but if an operations person takes at least 15 minutes to analyze a ticket and their ticket volume is reduced by 10,000 a month, then TrueSight does save time.
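As a back-of-the-envelope check on the figures above, the arithmetic can be sketched as follows. The function name is hypothetical, and the 15 minutes per ticket is the stated minimum, so the result is a lower bound rather than a measured saving.

```python
def monthly_hours_saved(tickets_avoided: int, minutes_per_ticket: float) -> float:
    """Operator hours no longer spent analyzing tickets that were never raised."""
    return tickets_avoided * minutes_per_ticket / 60


# Figures from the review: 10,000 fewer tickets a month, at least 15 minutes each.
hours = monthly_hours_saved(10_000, 15)
print(f"{hours:,.0f} operator hours per month")
```

Under those two assumptions the reduction works out to 2,500 operator hours a month, which is why the labor savings matter more than the software maintenance savings.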
We've been reducing ticket noise five to ten percent every year, and it has been cumulative. This means fewer tickets, less noise, and less operator intervention.
What's my experience with pricing, setup cost, and licensing?
It is a large, complex product, so there is a commitment of manpower to deploy it. It is also not a cheap product.
We license per named endpoint for most of the products: servers, network devices, databases, etc. You pay for the initial license and maintenance. The way that my company looks at it is we figure out our monthly cost over five years, and right now, we are between five and six dollars. We need to get that down to about four dollars. That figure includes the maintenance.
There is a big upfront cost when you buy the license, then there is annual maintenance. We ask: if I bought a license and paid for maintenance for five years, then averaged it out, what would my monthly cost be? We have had some of the competing tools come in around four dollars. This is coming in at a premium, which is why I don't have it deployed as widely as I would like. Therefore, we're in negotiations right now. If I can get it down to the four dollar range, I will triple my deployment in a year and a half. If they could get me to the right price point, there are 10,000 to 15,000 servers that I would install it on.
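The five-year averaging described above can be sketched like this. The license and maintenance amounts are hypothetical placeholders chosen only to land in the five-to-six dollar range; they are not actual BMC prices, and the function names are made up for illustration.

```python
def monthly_cost_per_endpoint(
    license_fee: float, annual_maintenance: float, years: int = 5
) -> float:
    """Upfront license plus maintenance, averaged over every month of the term."""
    total = license_fee + annual_maintenance * years
    return total / (years * 12)


def max_total_for_target(target_monthly: float, years: int = 5) -> float:
    """Largest total spend per endpoint that still hits the monthly target."""
    return target_monthly * years * 12


# Hypothetical figures: a 150 dollar license and 36 dollars a year in maintenance.
current = monthly_cost_per_endpoint(license_fee=150.0, annual_maintenance=36.0)
budget = max_total_for_target(4.0)

print(f"current: {current:.2f} dollars per endpoint per month")
print(f"five-year budget per endpoint at a 4 dollar target: {budget:.2f} dollars")
```

Framing the negotiation as a total five-year budget per endpoint makes it easy to compare a vendor with a high license fee and low maintenance against one priced the other way around.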
Which other solutions did I evaluate?
As we've acquired other companies, we've picked up pretty much every other tool set out there: CA, IBM, SolarWinds, etc. We have played with pretty much everything. The BMC TrueSight platform wins probably 80 percent of the time if you look feature by feature. It's a good, strong platform. Its ability to run on all the OSs that I've got is a huge thing. We do a lot with IBM iSeries, and a lot of vendors don't cover that. So, this is a big positive on the platform.
Being able to roll everything up to a single database and have a single feed out for reporting are both very big positives. The same type of consolidation rules that didn't work when we wrote them in CA just work when we write them in BMC. Things like that make BMC great.
What other advice do I have?
You really want to plan out your policy and architecture in great detail before you start any deployments. It is a complex product. You don't want to have to go redo it. Pick a small environment, test out your plan, test it out a second time, beat it up, and once you're happy with it, then go nuts by deploying it everywhere. It's great once it's there. You just have to get past that design hurdle, because there are things that aren't necessarily intuitive.
I have a mixed bag impression of the usability. The end user experience is mostly good, as it's a very clean interface. There are some quibbles with it. You have to drill into a lot of layers to get into the data that you want. However, when you hit "Back", it takes you all the way back out of the tree. Then, you have to redrill into all those layers. That is a bit of an annoyance for end users. From an administration side, it is still sort of heavy, and policies are very complex. Therefore, it takes a fairly senior level engineer to build it and get it to work well. But, once it's working well, I can monitor tens of thousands of things.
Definitely get multiple client references for each vendor, since all salesmen lie. They all promise the best possible scenario, and I have found that, depending on the client, you get very different experiences. So, the claims that the BMC sales guys have made are all achievable in a perfect environment. No one has a perfect environment.
Claims from CA, I have found to be outright fabrications, such as, "We can do this." Then, we buy the product. "Oh well, you actually need Professional Services, and you're going to need like three years of custom coding." Millions of dollars down the drain with them.
Other vendors overpromise to different degrees. They all come in very rosy, and sometimes too much so. So, talk to people who have really done it. Take their advice. Don't assume that they didn't know what they were doing. There are a lot of good engineers out there. If the company is struggling, assume you will also struggle.