What is our primary use case?
The primary use case for Rubrik is backup and restore and as an archival system. We use it for backup and restore for virtual machines, managed volumes that are mounted and which we use for snapshots from Rubrik to endpoint devices, NAS shares from our Isilon system, as well as SQL databases, Oracle Databases, and Windows and Linux. We may have some failover cluster Windows shares going to it. It's for backups and restores of pretty much everything. It works really well in concert with Pure Storage technology. We have a really large Pure Storage environment and they play really well together.
The solution is on-prem and it's protecting environments that are on-premises exclusively at this point, although we have plans to push towards the cloud. Most of it is virtual but I'd be surprised if we didn't have at least half-a-dozen physical machines connected to it.
How has it helped my organization?
I don't think that we have had an instance where we needed to recover en masse, like from a ransomware attack, but we have disaster recovery in production, and as part of our strategy we back up things that are also in test. Because it's test, sometimes things get configured wrong, and that's the whole point of the test. You figure out what works for you and the company and what solves the problem. But you break things in the process sometimes. It's really great to be able to do all of your testing and all of your work in testing without a great fear of really losing data or losing progress very much. We've had phone calls where they say, "Hey, I need XYZ restored, or I need this entire drive restored, or I need this entire VM restored." At the click of a button, five seconds later, it's back. It took longer for them to tell me what they needed back than it did to get it returned.
We try to run a disaster recovery test at least once a year. We want to make sure things are working, especially since here in Louisiana we have hurricanes. There could be a storm that comes in and we have to migrate our data and everything. The great thing is, we've got our data in both our production data center and our DR center. We're frequently doing those tests, we're frequently replicating between our different data centers. We do get a report about the replication status. With the exception of that, I don't think we really do much restore testing. But when we have restored, it has always worked. I never restored things before we had Rubrik so it's hard to know whether it reduces the time spent on recovery testing.
I know from my previous role in higher education that if the user on the phone said, "I know exactly the file and directory I need restored and I know exactly the day that I need it restored from," then I could probably do it in under 20 minutes. With Rubrik, I can do that in about 35 seconds, if I am already logged in. And in my previous job, that 20 minutes was only if the file was actually being backed up. Frequently, we encountered issues where the file wasn't even being backed up. Sometimes there were issues that we didn't get an email about and, as a result, we weren't backing things up. We only found out things weren't getting backed up because somebody needed some data. Overall, it's a huge reduction in time, if we're going from 20 minutes down to under a minute.
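To put rough numbers on that, here is a quick back-of-the-envelope calculation using the two figures quoted above (about 20 minutes before, about 35 seconds with Rubrik). Both are estimates from memory, so treat the result as an order of magnitude rather than a measurement.

```python
# Figures quoted in the review: roughly 20 minutes before, roughly 35 seconds now.
before_seconds = 20 * 60
after_seconds = 35

speedup = before_seconds / after_seconds
saved_fraction = (before_seconds - after_seconds) / before_seconds

print(f"about {speedup:.0f}x faster, {saved_fraction:.0%} less time per restore")
```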
And if Rubrik doesn't back up, we know it. If there's some issue where a system goes down and it can't take a snapshot, we know it. And that's good. It's not that I want to get those emails, but those are the emails that make you confident in your system. It has detected a problem and it immediately lets me know about it. And it tells me, "This is exactly the problem." I know exactly what I'm looking for.
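The kind of check that catches a silent failure can be sketched in a few lines. This is only an illustration of the idea, not how Rubrik does it internally: the resource names, the 24-hour threshold, and the `find_stale` helper are all made up for the example.

```python
from datetime import datetime, timedelta

def find_stale(last_success, now, max_age):
    """Return names whose last good snapshot is missing or older than max_age."""
    stale = []
    for name, taken_at in sorted(last_success.items()):
        if taken_at is None or now - taken_at > max_age:
            stale.append(name)
    return stale

# Hypothetical inventory: resource name -> time of last successful snapshot.
now = datetime(2019, 9, 1, 12, 0)
last_success = {
    "vm-app01": now - timedelta(hours=3),
    "vm-db01": now - timedelta(days=2),   # missed its window
    "nas-share01": None,                  # never backed up
}

stale = find_stale(last_success, now, max_age=timedelta(hours=24))
print(stale)
```

Anything that shows up in that list is something you want an email about before a user calls asking for a file.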
It's great whenever I get a call that says, "Hey, I need this restored," because it's like I get to be a superhero. The person on the other end thinks their stuff is gone. They know, "Oh yeah, we have backups. And they might work, unless something happened and we don't have them." Whenever you give them their file back, and it's the last version that they edited, and their work is safe, it's really awesome. That's our validation. I have a lot of confidence in the system.
Regarding my team's overall productivity, here's the thing that's really great about Rubrik. It's really great that I could have someone who doesn't do this for a living. Provided that permissions were set up right, I could have a normal user, who is in charge of just his own data, go in and participate in restore operations. Rubrik is that much of a seamless, easy-to-use system. That's not just productivity for my team, a team full of people who do this every day. Users know they don't even have to ask. They can log in, they can get to what they're looking for because it's very easy to find, and they can restore it. Even though I may be one of the primary people to configure and deal with the nuts and bolts of it, that doesn't mean I'm the only one who can actually restore and get files back.
There's also the aspect that, whenever they commit a change or do something, as long as we're within our SLA snapshot time, they know that their changes are secure and that their changes will be there. So if they need to walk back or change something, they know they'll be able to. Again, confidence and trust in the system is fantastic.
What is most valuable?
The restore and backup agent is really great. It takes the load off of vSphere or vCenter or any of our ESXi hosts. It makes things just a dream to manage.
The fact that the API is so available to us with the playground (there's an internal and a public playground) is also valuable. Although I'm sure there's a way we could hurt the data, we write our API calls with a lot of certainty that we won't be destroying anything. We write these API calls using really easy mechanisms and generate automation a lot faster. We can integrate with other systems in ways that might not be as easy with other solutions. We can integrate Rubrik into our systems very easily because they give us the tools to do so.
Also, the web interface is really great. The design, from a user-experience standpoint, is really straightforward and easy to use. Sometimes you go to websites and you can immediately tell, "This is going to be a pain to use." The buttons are in weird placements or when you click on something it doesn't load very quickly. I don't know if Rubrik got it right on the first try or if they went through a lot of user testing, or maybe they hired some people that did user experience in the past. But they nailed it. Usually, from the very first panel, the dashboard that you land on after login, you've got most of your functionality right around where you need it to be. You've got your new items on the left, you've got your support on the top right. Nothing really seems out of place or just stuck in someplace.
Generally, within three or four clicks, I can get anywhere I need to be, whether that's restoring a snapshot or creating a new host. It's really fast. And from a technical standpoint, you can get to the interface from any of the nodes within the Rubrik cluster. You don't just explicitly have to go to the cluster host's name at the top level. You can go to any of the nodes that make up the cluster. So let's just say networking is hard, systems sometimes are hard and things can break. That's just a thing that happens with computers, they're not perfect — I wouldn't have a job if things were perfect. Let's just say something happens where you don't have access to the cluster. You can go to any of the cluster resources, any of the nodes in the cluster, and you can access virtually the same interface.
That's awesome, because usually, in the past, if something was down and it affected the cluster endpoint, the primary website, you would have to SSH in, you would have to go into command line, and reboot the server. There's no need to do that here. You have to lose your entire environment for it to go down.
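From a scripting point of view, "any node will do" means a client can simply walk a list of addresses until one answers. Here is a minimal sketch of that fallback, with made-up host names and a stand-in probe function instead of a real HTTPS check.

```python
def first_reachable(endpoints, is_reachable):
    """Return the first endpoint that answers, or None if all are down."""
    for endpoint in endpoints:
        if is_reachable(endpoint):
            return endpoint
    return None

# Hypothetical names; the probe is a stand-in for a real HTTPS health check.
endpoints = [
    "cluster.example.internal",
    "node1.example.internal",
    "node2.example.internal",
]
down = {"cluster.example.internal"}  # pretend the cluster name is unreachable

chosen = first_reachable(endpoints, lambda host: host not in down)
print(chosen)
```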
In terms of SLA-based policy automation, I don't know what they were doing before Rubrik. I have to imagine there was a similar SLA system. For me, personally, I had a very static, flat rate of four weeks and that was it. If I wanted to have a separate set of SLAs, a separate 15-day SLA or a separate 20-day SLA, I had to stand up a completely separate version of that system and point things to that. Instead of having multiple SLAs in the same system, I had multiple systems that were exactly one SLA, which is a big management headache. There's a lot of overhead to that. You have to have another machine to run it, you have another cluster to run it. I don't know if this is a normal thing in the industry or if it's just a thing that all of a sudden I've seen, but of course you would do it this way. Everyone should do it this way.
For me, it was a really big eye-opener, being able to say for each resource, "You're going to be a 15-day at this time. That's every snapshot that you're going to have." It's continuous protection. It's really awesome that I get to work with a product that does that and does it well. I saw videos when I was learning about Rubrik. Other places have these features too, but they might not work as well. Frequently they don't. That's really one of the big selling points of the system.
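To make the "many SLAs in one system" idea concrete, here is a small model of it. It is a sketch under assumptions, not Rubrik's data model: the policy names, the resource names, and the `expired` helper are invented, and only the 15-day and four-week retention periods come from the text above.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass(frozen=True)
class Sla:
    name: str
    retention_days: int

# Hypothetical policies and assignments, all held in one system.
SLAS = {"short": Sla("short", 15), "standard": Sla("standard", 28)}
ASSIGNMENTS = {"vm-test01": "short", "sql-prod01": "standard"}

def expired(resource, snapshot_dates, today):
    """Return the snapshot dates that have aged out under the resource's SLA."""
    sla = SLAS[ASSIGNMENTS[resource]]
    cutoff = today - timedelta(days=sla.retention_days)
    return [d for d in snapshot_dates if d < cutoff]

today = date(2019, 9, 30)
snaps = [date(2019, 9, 1), date(2019, 9, 10), date(2019, 9, 20)]
print(expired("vm-test01", snaps, today))   # 15-day policy
print(expired("sql-prod01", snaps, today))  # four-week policy
```

The same snapshots age out differently depending on which policy the resource is assigned to, and changing a resource's retention is a one-line change to the assignment rather than a second backup system.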
Rubrik's archival functionality is a no-brainer. It doesn't require a ton of thought. I don't have to over-engineer different policies to validate what I think it's doing. If it says it's doing it, it's doing it, and it's really easy to click a button and say, "Now it's done." It's a very convenient piece of tech and I absolutely love it.
Regarding API support for integration with other solutions, we have not used it directly with any of the other hardware except Pure Storage. Pure Storage and Rubrik really go together well. We use a batch management control, which is like a job-controller. It's a modern solution, but it doesn't feel like a modern solution. The developers of it went in a different way, so it accepts command line and PowerShell, but with Rubrik's PowerShell modules and their API at a raw level, we're able to integrate it into pretty much anything. We're able to control when and where snapshots fire off and how to lock the different volumes to write- and read-only, depending on what we want to do. We're able to control that with our seemingly legacy — it's not actually legacy — system, even though there's not a direct integration.
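The lock, snapshot, unlock sequence is the part worth getting right in any job controller, because a failed snapshot must not leave a volume stuck read-only. Below is a minimal sketch of that pattern. The `set_mode` and `take_snapshot` functions are hypothetical stand-ins that only record what would be called; they are not Rubrik or Pure Storage calls.

```python
from contextlib import contextmanager

@contextmanager
def read_only(volume, set_mode):
    """Hold a volume read-only for the duration of the block, then restore it."""
    set_mode(volume, "read-only")
    try:
        yield
    finally:
        # Runs even if the snapshot step raises, so the volume is never left locked.
        set_mode(volume, "read-write")

# Hypothetical stand-ins that just record what would have been called.
log = []

def set_mode(volume, mode):
    log.append(f"{volume}:{mode}")

def take_snapshot(volume):
    log.append(f"{volume}:snapshot")

with read_only("vol-finance", set_mode):
    take_snapshot("vol-finance")

print(log)
```

In a real job each stand-in would be replaced by whatever command-line or PowerShell step the controller accepts.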
It's the same thing with Isilon. Via a script mentality, and in concert with Adam Fox over at Rubrik, we're able to work with him and push all of our Isilon endpoints, all of our network shares from Isilon, into Rubrik, without having to go through the GUI. In our case, we had quite a lot of Isilon hosted storage. We were able to push that to Rubrik relatively seamlessly and simply because they had an API out there for us to use.
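The shape of a bulk push like that is roughly the loop below. This is a sketch only: `register` stands in for whatever API call actually adds a share, `fake_register` and the share paths are made up for the example, and nothing here reflects Rubrik's real endpoints.

```python
def bulk_register(shares, register):
    """Try to register every share; collect failures instead of stopping."""
    added, failed = [], []
    for share in shares:
        try:
            register(share)
            added.append(share)
        except Exception as exc:  # keep going so one bad share does not block the rest
            failed.append((share, str(exc)))
    return added, failed

def fake_register(share):
    """Hypothetical stand-in for the real API call; rejects odd-looking paths."""
    if not share.startswith("/ifs/"):
        raise ValueError("unexpected export path")

shares = ["/ifs/data/finance", "/ifs/data/hr", "data/misc"]
added, failed = bulk_register(shares, fake_register)
print(added)
print(failed)
```

Collecting failures rather than aborting matters when there are a lot of shares: you get one list to fix up afterwards instead of rerunning the whole import.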
We have a lot of DBAs who are interested in Rubrik because, whenever you're a database administrator, I can't imagine that you'd have a lot of fun. You're always worried about mitigating loss. You have your database, and your replication of your database, and your backups for your database, and additional backups for your database, and then you need validation on those backups. The great thing is that Rubrik does most of that. It's not replication for databases, but it backs up the database and it's very seamless. It's very fast.
There are different settings that you can have on those backups to get a varying range of SLAs, where it's up-to-the-minute, or day or hour. You can get that continuous data protection, which is really great.
What needs improvement?
I joke around, every time we meet our SE, and say they could use a dark theme for the user experience. Everything else has a dark theme now, so it'd be cool if it had a dark theme.
But on the serious side, I have a personal want which might not necessarily make sense with Rubrik as a company or Rubrik as a software, but it would be really nice if they could also handle things like item-level backups and restores of Active Directory objects and DNS and DHCP objects.
In Active Directory there's a recycling bin where something goes if you delete it. I don't know if it's there for a static amount of time, like 90 days, or if it's until we hold 1,000 objects, so if you delete more things, the oldest ones go from the recycling bin. It would be really nice to have an additional layer of convenience, where if it's been in Active Directory for at least a day, and we're within our snapshot time, in addition to the machine itself, we have the actual objects in the Active Directory database so we can back that up. And similarly for DNS: all the records, all the zones, DHCP.
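The two retention behaviours described above (a fixed time window versus a fixed number of objects) can be compared with a small model. It is purely illustrative: the object names, the 90-day window, and the cap are example values, and it makes no claim about which rule Active Directory actually applies.

```python
from collections import deque
from datetime import date, timedelta

def purge_by_age(deleted, today, keep_days):
    """Time-based: keep objects deleted within the last keep_days days."""
    cutoff = today - timedelta(days=keep_days)
    return [(name, when) for name, when in deleted if when >= cutoff]

def purge_by_count(deleted, max_items):
    """Count-based: keep only the most recently deleted max_items objects."""
    return list(deque(deleted, maxlen=max_items))

# Hypothetical deletions, oldest first.
deleted = [
    ("user-a", date(2019, 7, 1)),
    ("user-b", date(2019, 8, 1)),
    ("user-c", date(2019, 9, 1)),
]
today = date(2019, 9, 15)

by_age = purge_by_age(deleted, today, keep_days=90)
by_count = purge_by_count(deleted, max_items=2)
print([name for name, _ in by_age])
print([name for name, _ in by_count])
```

Under the time-based rule all three objects are still recoverable, while the count-based rule has already dropped the oldest one. That gap is exactly where an item-level backup would add a safety net.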
It would also be really great for DSS if they could somehow integrate it with Microsoft's technologies at a modular level. In general, I would like to see more integration with Microsoft at an item level. It already backs up the machine itself. We have the virtual machine which contains the database with DNS or the DHCP or Active Directory, but restore operations from a bare-metal restore like that are technically very cumbersome. I don't know if it would just be a lot of built-in PowerShell scripting where it exports the data, saves that export in Rubrik somehow, and then imports it back in using a reverse method, but I think it would be really helpful if it could.
At one point I thought it would be really great to use it almost like SEPM where you could have modules or files where, instead of restoring back to its original location, you could distribute it to all of your restore points. I've walked back on that somewhat. I think that's a little too outside of the focus for Rubrik.
For how long have I used the solution?
I personally have been using Rubrik for almost five months. It was deployed before I was working at my place of occupation. I used to work in higher education and I did the backups and the disaster recovery at that organization, amongst other things. When I came onboard at my present occupation, they said, "Here's the backup you're going to use, here's the system that we bought into, it's this thing called Rubrik." I said, "Cool. You've seen one system, you've seen them all. They all work." I believe the company has been using it for about a year.
What do I think about the stability of the solution?
Rubrik is incredibly stable. I'm getting out of that mode of thinking with Rubrik, "Well, maybe it won't work this time. Maybe it'll be down." It's never been down, it's never been inaccessible. If I can't connect to it, I'm typing the URL wrong. That's it.
We have had other systems, some homegrown and some purchased. I don't know if there were technical aspects that were outside of our control, or that we just weren't mitigating or managing very well, with them. But as far as Rubrik is concerned, I've never had an issue accessing that on-prem system, and that's true even for our DR system which is technically on-prem but "over there," very far away. That includes nodes, the cluster. It's just been very good.
What do I think about the scalability of the solution?
We have a lot of Rubrik, a lot of "bricks". If we needed more, we'd just buy more. The horizontal scaling is really great. I don't think we need anything immediately. But I could definitely imagine a moment in the past — not that I know that this happened — where we had ten nodes instead of the 50-something nodes we now have at each site, and we needed more and we put in more. I could totally see it all just working. It would just all of a sudden get better.
If we were ever pressed and at a point where we needed something better, where we needed more, I would imagine Rubrik would have a solution for us and it would work 100 percent. Whether that would be to PoC some new hardware and verify that it would actually improve our situation, or tweak a setting, or do a site survey to figure out what we're using and how to help, they would either get what we are using right now to work better, or they would figure out what we need to make it better moving forward.
That's scalability in a lot of ways. That's technical scalability in being consistent and stable and being able to improve and evolve. And that's stability and scalability and not having to plan your business processes around what should be a no-brainer issue. It's something that shouldn't drive your business. It should allow your business to be driven in whatever direction it needs to go. It should be something that just works, and so far I've seen it just works.
We have over 2,000 employees, and every one of those employees has some form of a computer and some have multiple: a laptop, or a laptop and a virtual machine, or just a virtual machine, or a laptop and two virtual machines. It's a big environment. We have hundreds of Windows Servers and about 100 Linux servers, if not more. We have pretty extensive Microsoft SQL environments which are either always-on clusters or a combination of always-on clusters and available clusters, and then we have some Oracle Databases as well.
I don't remember the exact number of what we're currently supporting in Rubrik, but I know it is a lot. We've integrated it in such a way (and this is a fairly normal process, but it's great) that whenever we put a machine online, part of the workflow is to get it to back up into Rubrik. Whenever we decommission things, part of the workflow is to remove those backups 90 days after we remove the physical or virtual server. We keep backups for a period after we remove the machine, just in case, and how long depends on what our data retention policy is.
It's ingrained. We're invested. We made the jump.
How are customer service and technical support?
While this might not count as a "tool," the support methodology with Rubrik is really interesting. When we need to do anything that is "invasive," if I have a question about how many upgrade-blocking things are in place, I open a support window, a ticket, and usually within ten minutes I'm contacted by someone, a real person, not just an automated system, at Rubrik.
It's really good. In my previous job, I never really had an experience where the first response that I got back wasn't just an automated robo-caller saying, "We've received your ticket, we will call you in a moment," and then two days later they would call. With Rubrik, you do get an email saying, "We've received your ticket and someone's going to call you." But usually within ten minutes, and very rarely any longer than 30 minutes, there is a real person on the phone calling me, who knows my name and is very aware of the situation. They're not asking me for a ton of information that I've already given in the ticket. They're really top-notch. And the support is integrated really well into the product.
That's not to say that we need support because things are broken. The support is there as an aid, as a tool for us.
We upgraded a month ago to the version we're on. We're planning on upgrading to the latest version, which I think is 5.03. The great thing is that we're really close with support. They work well with us. We don't upgrade to beta or anything like that, but whenever something big is coming down, they'll usually let us know. We'll talk to them about it and they'll tell us "Hey, this is a cool thing that maybe you guys can utilize."
Which solution did I use previously and why did I switch?
We replaced Avamar with Rubrik and it assumed the exact same role that Avamar had. I never got to use Avamar. It was decommissioned before I got to my current company.
When I worked in higher education, because we didn't have a lot of money to buy solutions, a lot of it was open-source. So I was the support and I was the deployment and I was the debugger and I was the guy that had to code all the integration. It was hard for me to have a vision of, and architect, how we were going to use things. Back then, we needed to use something and I needed to make it happen.
So in a lot of ways, Rubrik is my first big, differentiating factor in backup and restore software. It's not like we weren't able to do it at my previous organization, but this is a completely different realm. It's a totally different level with Rubrik. I'm not saying that Avamar wouldn't have been a similar feeling. But I hear what other people on the team who were using Avamar before are saying, and I get the feeling that Rubrik is leaps and bounds better in terms of validating that the backups actually happened and that they're there.
How was the initial setup?
In terms of deployment of the solution, it was vendor-aided. Rubrik helped through our SEs. If I had to guess, it would probably be less than half-a-dozen people who were a big part of the deployment, data center access and data center deployment notwithstanding. Some people had to go and plug and rack things.
We aren't interested in lagging behind as far as updates go. We're pretty good about updating to the latest version. The only reason we haven't done so right now is because it's in use. We continue to use it and the organization I work for is big. There are a lot of teams using it. So it's hard finding the time in the day where we can disconnect everything, upgrade the system, and then reconnect everything. That's on our side where we're trying to juggle all the teams that are making use of the product.
What was our ROI?
I believe our company has seen return on investment by going with Rubrik, although I can't talk about it in detail. I'm not a finance guy. But from the way I hear people talk about previous products we were using, and from my personal experience of wasted time in managing and deploying and supporting free or open-source software, I believe there is ROI. We've definitely done whatever was necessary to make the cost worthwhile.
What's my experience with pricing, setup cost, and licensing?
I remember hearing that we purchased a multi-year, contractual agreement. I don't know if we purchased the hardware outright or if it's a lease-to-own scenario.
What other advice do I have?
My advice would be in the form of a question. If you have the money to purchase Rubrik, the real question is do you want success? Do you want it to work? Because if you want it to work and you want it to be easy — I don't know if Rubrik has won awards for support and service, although I feel like they should have — if you want that support for the few times that you need it, then you go with Rubrik.
It's a really good, seamless system. It's a no-brainer, sometimes. It just works.
I met up with them at VMworld and I actually got to talk to one of the people who was writing the PowerShell modules that I was using for an automation piece that I was writing. I got to ask that person, developer-to-developer, why did you make this decision? I asked a couple of very in-depth questions, and I don't get to do that with a lot of other companies, the companies that are just a logo or just a payment box and a data center. I don't feel Rubrik is a payment box and a data center. It's more than that, it's bigger than that, and that's really good. There are communities out there for Rubrik and I can speak with other developers and other teams that have implemented Rubrik, and that's awesome. It's not a support portal and it's not a place where you go to air your grievances. You go there to have fun, you go there to learn.
I don't know that I've ever used a product that's been quite like it. There are a couple of products that are similar. You definitely get a lot out of Pure Storage, which is very much the same thing, but that's storage, not backup and restore. The advice I would give is: It's not charity software, it's not "for-free" software. It does cost, but what you're buying is a solution that will actually work. It will carry whatever weight you want to give it. And you're also getting the team that helped make it great.
We have not needed to use Rubrik's ransomware recovery yet. Thankfully we've been spared from having to utilize that component. But when I was at VMworld 2019 recently and I was watching a class on ransomware recovery, it was one of those things where I thought, "Wow, I didn't even really know we had this." But we totally have this. We have Rubrik, and this is neat. I ended up talking to one of our SEs about it after the fact, and he said, "Yeah, well, you haven't needed it and hopefully you never will."
I believe some of our application developer teams are using Rubrik. They might not realize they're using it though, because a lot of the integration we put in is to back up the machines that they do work on, but they don't realize that we're backing them up. That's kind of sneaky. We're devious like that. We try to protect our users even from themselves sometimes.
For day-to-day maintenance there are only two or three people. I'm one of them, and I have another member on my team who is involved. We also have one of the database administrators who plays a big role in it. My passion, and where I fit perfectly in the team, is doing a lot of scripting. I'm a general-purpose solutions engineer with a focus in PowerShell, Active Directory, and Microsoft integration.
I typically don't like giving tens, because that says there's no room for improvement. But functionally, it's a 9.99999, which rounds up to a ten.