Stonebranch Universal Automation Center Review

We throw a lot at it from a resiliency perspective and it stays up, reducing maintenance costs significantly


What is our primary use case?

We started off with replacing mainframe batch scheduling for some of our distributed applications, and then it grew into not just batch but workflows and file transfers.

The volumes that we throw at it are in excess of 15 million tasks per month.

How has it helped my organization?

Our biggest relief was the file-transfer piece: the way they do it securely, the way they do that handshake, and the way they farm out that dependency so that users - rather than admins - have the ability to control that little subsection within the environment. We probably would have needed a team of 20 people to centrally admin, manage, create schedules, do file transfers, and support all that stuff. Instead, we have a team of two.

My biggest pain point was the agent. When it comes to the controller, it's one point of failure. You monitor it, it's just another application. You take a look at it. You know what items you need to keep an eye on: memory, log files, entries. You can be proactive because it's one important single point. When you look at all your endpoints, it becomes a management nightmare if you have to monitor every single one of them. Past experience has been that people want to run their batches. They don't want to care about the scheduler. They want to just set it and forget it. They tend to run the machines very hot. When the endpoints are resource-starved, because people's scripts are taking over all the capacity on the box, the agents fail and their workflow gets impacted and you take an SLA hit. We have yet to see that with any of the Stonebranch agents.

Regarding digital transformation, we were already down that road before we even looked at Stonebranch. I wouldn't say that it was the reason why we did it. It does help in the journey, where you're looking at a mainframe scheduler and you think, "Oh my God, I don't think we're going to be able to use this." Going digital, everything is software-defined where you say, "Alright, APIs, plugin, off we go."

This solution replaced an existing legacy system and the benefit was in the area of the support staff supporting those aging systems. We're no longer a bottleneck or a risk. We have a lot of those folks retiring right now, and it is tough to get that expertise on the market nowadays.

Finally, it helped us save money, and that was one of the drivers for getting it in. What I can share with you is it did 90 percent more than whatever solutions we had. We ended up saving a considerable amount every year. We got more for less. It has also saved us a lot of man-hours in support and maintenance. We were able to go down by three FTEs by implementing the solution. We went from less to more with half the staff that we had.

What is most valuable?

It's very feature-rich, but our focus has been mostly around resolving the file transfer problem: We did not have a standard way of transferring files internally. That was a plus. I don't think anybody in the market does it like they do. When it came down to our standards and compliance and hardening down systems, it was the most secure solution.

We also lean a lot on the multi-tenancy that they offer within the product, the ability to get other people to self-manage their estate, versus having a central team do all the scheduling. That's what we lean on the most.

Regarding the Universal Controller, to give you a bit of history without getting into the details of it, we've tried multiple solutions across the years. The one thing that we wanted to get rid of was the lack of resiliency of all the solutions that we had. What I liked about the agent at the time, before we got into the scheduler, was how robust it was. It just does not go down easily. When we looked at the resiliency of the scheduler, it was on par. It wasn't something that was developed in a basement somewhere. It was top of the class. We throw a lot at it from a resiliency perspective. It stays up. That is a major focus for us. It has reduced the amount of time we have to throw into keeping it up and running, which is translating into a lot of dollars. We host it on-prem.

When it comes to agent technology and compatibility with other vendors, from a platform perspective it was the one vendor that fit all the platforms that we have, from your old platforms - mainframe, NSK, IBM i - to the new ones, going into cloud and containers, etc. It is able to work across the entire suite of technologies, and it works very well with our core, which is the Windows and Unix platforms. It fit what we needed it to do. Other, bigger companies tend to forget one or more of those platforms, because they're in competition with each other, so they do not support some platforms. Stonebranch is very platform-agnostic, so if a customer uses it, they will support it.

What needs improvement?

There is a component called the OMS, which is the message broker. We rely on infrastructure resiliency and availability for that piece. If that could change to be highly available just as a software component, so that we don't have to provide the highly available storage, etc. for it, that would be a plus. It would just be cheaper to run.

For how long have I used the solution?

More than five years.

What do I think about the stability of the solution?

The stability has been pretty good. It's been the best out of all the solutions that I've had to deal with.

What do I think about the scalability of the solution?

What I like about it is the configuration that they allow you to get to, how granular it can get. Something that we used to struggle with - because we farm out the work to the applications and say, "You run this, this is just distributed cron for you," - was that people would run their scripts and sometimes do something silly like send their debug to standard out, and standard output is two gigs. Usually, our old tools would go capture that and send it back to the controller. That two-gig amount of data is huge. It's going to break either the agent or the transfer or take the controller down when it gets there. Stonebranch lets you tweak that stuff to say things along the lines of, "How much of the standard output do you want? Do you want 100k, 100 lines, 2k?" You decide. Scalability depends on that. If you want to run 100 million tasks a day, you have to figure out how much data you want to retain, and that's the power of this tool. Other tools don't let you do that.
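To make the idea of capping captured output concrete, here is a minimal sketch in Python. It is not Stonebranch's implementation or configuration syntax; the function name `capture_limited` and its `max_bytes` and `max_lines` parameters are hypothetical, and keeping the start of the output (rather than the end) is an assumption made only for illustration.

```python
import io
from typing import BinaryIO, Optional


def capture_limited(
    stream: BinaryIO,
    max_bytes: Optional[int] = None,
    max_lines: Optional[int] = None,
) -> bytes:
    """Keep only the start of a job's output, up to a line and/or byte cap.

    Hypothetical helper: a limit of None means "no cap" for that dimension.
    """
    kept = bytearray()
    lines_kept = 0
    # Iterating a binary stream yields one line at a time, so the full
    # output is not loaded at once (a single huge line would still be).
    for line in stream:
        if max_lines is not None and lines_kept >= max_lines:
            break
        if max_bytes is not None and len(kept) + len(line) > max_bytes:
            # Take only the part of this line that still fits, then stop.
            kept += line[: max_bytes - len(kept)]
            break
        kept += line
        lines_kept += 1
    return bytes(kept)


if __name__ == "__main__":
    noisy = io.BytesIO(b"debug 1\ndebug 2\ndebug 3\n")
    print(capture_limited(noisy, max_lines=2))
```

The point of the sketch is that whatever is sent back to the controller is bounded by the cap, not by how much the script happened to print.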

How are customer service and technical support?

Stonebranch is one of the best support vendors. They leverage their expertise on the mainframe and IBM i. I could not find that anywhere else in the market. That is something that we really needed. Their Unix knowledge is impeccable. They've always helped us. They're always able to do deep dives easily; same thing with Windows. They're quick to get to the solution. They're quick in helping us recover from outages if there are any. They're always quick to escalate up the chain on their side of the house if they need to. If the level-one person is looking at a problem and says, "You know what? It's been 30 or 40 minutes. I don't see it," they will get someone from level-two or a developer to take a look.

If you previously used a different solution, which one did you use and why did you switch?

We have many other solutions. I don't think I can mention those solutions as we do have NDAs with all our vendors on that side of the house.

How was the initial setup?

When we ran our proof of concept, there were two larger companies, three-letter names, that came in and their installs took us a few days just to set them up. When Stonebranch came in, it was 40 minutes. In fact, I had to triple check it when Colin came in and I had to ask, "Are you sure? Did you half-ass it? I need to take a look. Is it running? Can we go through the components?" It took us longer to verify that it was up and running than it took to install.

After getting it installed, implementation is nice and slow because we're a pretty big organization and converting the things that application teams tend to have takes a while. We plan two years in advance. This is technically an infrastructure initiative, where we have to go and get people's time. To get it started, it took us 12 months - just to get started and scheduled. From a migration perspective, it's very cookie-cutter with their Professional Services. They'll come in, look at what you have, and say, "Here's the format we need to convert things to," and they'll do it really quickly.

In terms of an implementation strategy, at that time, we were scheduling applications based on their availability. We had 110 apps and we had in excess of 100,000 definitions. We broke it down by application and scheduled them in waves when our resources on our side of the house were available to do the conversion, to throw it in there, and get them to test. We had a whole workflow planned out between the work that we had to do on the infrastructure side and on the application side, and we organized it in a dependent, wave-by-wave approach. The vendor was here. They converted. Then:

  • we threw it into the dev, app tested, made changes
  • promoted it to QA, app tested
  • promoted it to production, and then we shut down the old stuff in the old schedulers.

On average, it took an application three to four weeks to get to production. That was not that long based on our size. I've seen it take longer with a lot of other tools. The step-by-step approach on the resourcing that we had bottlenecked us so that we could probably only have four of those running in parallel.
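As a rough back-of-envelope illustration of what that bottleneck implies, the sketch below multiplies out the figures quoted in this review (110 apps, about four in parallel, three to four weeks each). The totals are not figures I measured; the function names are hypothetical, and the model assumes strict waves with no overlap or idle time between them, which real scheduling around application availability would not follow.

```python
import math


def waves_needed(num_apps: int, parallel: int) -> int:
    """Number of waves if at most `parallel` apps are migrated at once."""
    return math.ceil(num_apps / parallel)


def total_weeks(num_apps: int, parallel: int, weeks_per_app: int) -> int:
    """Elapsed weeks under the simplifying assumption of back-to-back waves."""
    return waves_needed(num_apps, parallel) * weeks_per_app


if __name__ == "__main__":
    apps, parallel = 110, 4  # figures quoted in the review
    for weeks in (3, 4):     # quoted range per application
        print(f"{weeks} weeks per app: about {total_weeks(apps, parallel, weeks)} weeks total")
```

Under those assumptions the whole estate lands somewhere in the range of a year and a half to two years, which is why resourcing, not the tool, set the pace.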

What about the implementation team?

We used Stonebranch Professional Services to come in and help us. We did the majority of the design because that's what we do. We depend on the way our business runs, and we schedule with the business. Then we brought Professional Services in and said to them, "Here's how we're going to be able to do this. You guys tell us what the technical capabilities are and help us through it."

What was our ROI?

The way we run the shop is that Infrastructure has a specific budget. I don't think we did a business case to see how this would improve the business at all. We just looked at what we spend a year and decided, let's spend money on this. It's less work for us, so we went ahead and did it.

What's my experience with pricing, setup cost, and licensing?

Outside of licensing fees, there aren't any other costs.

Which other solutions did I evaluate?

We did evaluate many other options.

What other advice do I have?

Look at having the database solution be HA as well, because the product is highly available and you can stretch it to also be your BCP, where you just fail over from one data center to the other. We suffer because our database solution is not. I would urge everybody to go down that path and set it and forget it. If you lose a part of your data center, this thing will stay up.

The universal task is something that we started dabbling with. We haven't used it fully yet.

We don't rely on the Stonebranch Marketplace a lot. It was something that we discussed with Stonebranch over a period of time. It's something that we, as a culture, need to look into internally as a company. We tend to trust the things that we write, versus looking into things like a marketplace where we can extract thoughts or automation or universal tasks that other people have put out there. If it breaks, we need to be able to call somebody when it does.

At last count we had around 650 defined users, and around 50 logged in at once.

Right now, to do the scheduling and maintain the environment, it's two bodies, and we have one to help support the file-transfer piece. Those three bodies are responsible for administering the environment. If somebody needs to be onboarded, that's all automated. You come in, AD groups are created, the security stuff is in, it's all automated via ServiceNow. All that those three guys do, from an admin perspective, is troubleshoot production issues. If something breaks and the app goes down, they sit down with the application team and explain why it broke. The other roles that we have are operators, schedulers, and the read-only users. The applications are broken into dev and production teams. Dev teams usually have access to schedule and promote to production. Operators only have access to production, and they do the operations role. The scheduler basically has read, write, delete, update access to everything. The operator has that access only on the tasks, so operators are able to rerun, stop, that type of thing. Those are the four roles that we have defined.

I would estimate that ten percent of the business uses this product. Are we going to expand it? Anybody is welcome to use it. It's slowly growing by itself. As soon as you mention the file transfer solution to people, they say, "Okay, I'm on board. Let's go." Are we going to make it a strategic tool that everybody has to use? It's just one of the many tools that we have in the toolbox. I think within our organization, we probably have in excess of 500 tools.

I would rate Stonebranch at nine out of ten. I would never give anybody a perfect ten. I always want people to work harder. I'd give them a nine because, if you deal with all the other vendors, you're used to a sales guy coming in with an agenda - that he needs to maximize the sale. I didn't get that from this vendor. It was very weird dealing with them because all the other vendors act a certain way, except them. They show up, very transparent, very honest, and they're always willing to negotiate.

Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor.