What is our primary use case?
For the most part, we're using it to move data off-prem. We have the ability to do mirrors from on-prem to Cloud Volumes ONTAP and we also have both single-node instances and HA instances. We are running it in both AWS and Azure.
We're using all of the management tools that go along with it. We're using both OnCommand Cloud Manager and OnCommand Unified Manager, which means we can launch System Manager as well.
Unified Manager is what monitors the environment. OnCommand Cloud Manager allows you to deploy and it does have some monitoring capabilities, but it's not like Unified Manager. And from OnCommand Cloud Manager you can launch System Manager, which gives you the lower-level details of the environment.
Cloud Manager will allow you to create volumes, do CIFS shares, NFS mounts, and create aggregates. But the rest of the networking components and other work for the SVMs and doing other configurations are normally done at that lower level. System Manager is where you would do that, whereas Unified Manager allows you to monitor the entire environment.
Say I have 30 instances running out there. Unified Manager allows me to monitor all 30 instances for things like volume-full alerts, near volume-full alerts, inodes full, network components being offline, paths, back-end storage paths, and aggregates full. All those items that you would want to monitor for a healthy environment are handled through Unified Manager.
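As a rough illustration of the kind of checks described above, here is a minimal Python sketch. It does not use Unified Manager's API. The `VolumeStatus` fields, the instance names, and the 85 and 95 percent thresholds are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical thresholds; real alert levels are set in the monitoring tool.
NEARLY_FULL_PCT = 85.0
FULL_PCT = 95.0


@dataclass
class VolumeStatus:
    instance: str          # storage instance name (illustrative)
    volume: str            # volume name (illustrative)
    used_pct: float        # percentage of space used
    inode_used_pct: float  # percentage of inodes used
    online: bool           # whether the volume is reachable


def alerts_for(status: VolumeStatus) -> list[str]:
    """Return human-readable alerts for one volume."""
    alerts = []
    prefix = f"{status.instance}/{status.volume}"
    if not status.online:
        alerts.append(f"{prefix}: offline")
    if status.used_pct >= FULL_PCT:
        alerts.append(f"{prefix}: volume full")
    elif status.used_pct >= NEARLY_FULL_PCT:
        alerts.append(f"{prefix}: volume nearly full")
    if status.inode_used_pct >= FULL_PCT:
        alerts.append(f"{prefix}: inodes full")
    return alerts


def scan(statuses: list[VolumeStatus]) -> list[str]:
    """Collect alerts across every monitored volume."""
    found = []
    for status in statuses:
        found.extend(alerts_for(status))
    return found


fleet = [
    VolumeStatus("cvo01", "vol1", 50.0, 10.0, True),
    VolumeStatus("cvo02", "vol2", 90.0, 96.0, True),
    VolumeStatus("cvo03", "vol3", 99.0, 5.0, False),
]
for line in scan(fleet):
    print(line)
```

The point of the sketch is only that one scan covers every instance, so the number of instances does not change how many people have to watch them.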
How has it helped my organization?
We're sitting at multiple petabytes of storage on our NetApp infrastructure. We're talking hundreds of thousands of shares across thousands of volumes. Even with that size of infrastructure, it's being supported by three people. And it's not like we're working 24/7. It gives us the ability to do a lot, to do more with less. Those three people manage our entire NAS environment. I've got two intermediate and one senior storage engineer in our environment who handle things. They're handling those multiple petabytes of on-prem and I'm just starting to get them involved in the cloud version, Cloud Volumes ONTAP. So, for the most part, it's just me on the Cloud Volume side.
In terms of the storage efficiency reducing our storage footprint, the answer I'd like to say is "yes." The problem I have is that nobody ever wants to delete anything. We have terabytes of data on-prem in multiple locations, in both primary and DR backed-up. And now, we're migrating it to the cloud. But eventually, the answer will be yes.
What is most valuable?
I'm very familiar with working from the command line, but Unified Manager, System Manager, and Cloud Manager are all GUI-based. It's easy for somebody who has not been exposed to this for years to pick it up and work with it. Personally, for the most part, I like to get in with SecureCRT and do everything from the command line.
We do a lot of DR testing of our environment, so we're using a couple of components. We use Unified Manager to link with WFA, Workflow Automation, and we do scripted cut-overs to build out. We use the mirroring to mirror our volumes to our DR location. We also create snapshots for backups. We can create a specific snapshot so that we can do a DR test without disrupting our standard mirrors. That means we can create a point-in-time snapshot, then use the ability of FlexClones to make a writeable volume to test with, and then blow it away after the DR test.
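To make the snapshot-then-clone idea concrete, here is a toy Python model. It is not how ONTAP is implemented and it calls no NetApp API. The `Volume` class, its methods, and the block contents are hypothetical. It only shows that a clone built from a point-in-time snapshot can be written to and thrown away without touching the source volume or the snapshot.

```python
class Volume:
    """Toy volume: a mapping from block number to block contents."""

    def __init__(self, blocks=None):
        # Copy the mapping so this volume has its own block map.
        self.blocks = dict(blocks or {})
        self.snapshots = {}

    def write(self, block_no, data):
        self.blocks[block_no] = data

    def snapshot(self, name):
        # Record what every block held at this instant.
        # Only the references are copied, not the data itself.
        self.snapshots[name] = dict(self.blocks)

    def clone_from(self, name):
        # The clone starts from the snapshot's block map and is writable.
        return Volume(self.snapshots[name])


source = Volume({0: "ledger-v1", 1: "config-v1"})
source.snapshot("dr_test")

clone = source.clone_from("dr_test")
clone.write(0, "scribbled during DR test")  # test writes land in the clone only
source.write(1, "config-v2")                # production keeps changing

print(source.blocks[0])  # still "ledger-v1"
print(clone.blocks[1])   # still "config-v1", the snapshot's view

del clone  # discard the clone once the DR test is over
```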
We could also do that in an actual disaster. All we would do is quiesce and break our mirrors, our volumes would become writeable, and then we would deploy our CIFS shares and our NFS mounts. We would have a full working environment in a different geographic location. Whether you're doing it on-prem or in the cloud, those capabilities are there. But that's all done at a lower level.
The data protection provided by the Snapshot feature is a crucial part of being able to maintain our environment. We stopped doing tape-based backups to our NAS systems. We do 35 days of snapshots. We keep four "hourlies," two dailies, and 35 nightly snapshots. This gives us the ability to recover any data that's been accidentally deleted or corrupted, from an application perspective, and to pull it out as a snapshot. And then there are the point-in-time snapshots, being able to create one at a given point in time. If I want to use a FlexClone, which is just pointers to the back-end data, to get at data right now and use it as a writeable volume without interrupting my backup and DR capabilities, those point-in-time snapshots are crucial.
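The retention counts above (four hourlies, two dailies, 35 nightlies) can be sketched as a simple pruning rule. This is a minimal Python illustration, not ONTAP's snapshot policy engine. The schedule labels and the `prune` function are assumptions made for the example.

```python
from datetime import datetime, timedelta

# Retention counts taken from the text; the schedule names are illustrative labels.
RETENTION = {"hourly": 4, "daily": 2, "nightly": 35}


def prune(snapshots):
    """Split (schedule, created_at) pairs into (keep, delete) lists.

    The newest RETENTION[schedule] snapshots of each schedule are kept.
    Snapshots with an unknown schedule are ignored by this sketch.
    """
    keep, delete = [], []
    for schedule, limit in RETENTION.items():
        matching = sorted(
            (s for s in snapshots if s[0] == schedule),
            key=lambda s: s[1],
            reverse=True,  # newest first
        )
        keep.extend(matching[:limit])
        delete.extend(matching[limit:])
    return keep, delete


now = datetime(2020, 1, 1)
snaps = [("hourly", now - timedelta(hours=i)) for i in range(6)]
snaps += [("nightly", now - timedelta(days=i)) for i in range(40)]

keep, delete = prune(snaps)
print(len(keep), len(delete))  # 39 kept (4 hourly + 35 nightly), 7 deleted
```

With all three schedules fully populated, the rule holds at most 4 + 2 + 35 = 41 snapshots per volume.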
The user can go and recover the file himself so we don't have to have a huge number of people working on recovering things. The user has the ability to get to that snapshot location to recover the file and go however many days back. Because it's a read-only file to the user community, users can get at that data, as long as they have proper rights to that file. Somebody else could not get to a file for which they don't have rights. There's no security breach or vulnerability. It just provides the ability for a user who owns that data to get to a backup copy of that data, to recover it, in case they've deleted a file or had a file corrupted.
We also use their File Services Solutions in the cloud, CIFS and NFS. It works just as well as on-prem. The way we configure an environment, we have the ability to talk back to our domain controllers, and then it uses the standard AD credentials and DNS from our on-prem environments.
Cloud Volumes ONTAP in the cloud and Data ONTAP on-prem are the exact same product. If you have systems on-prem that you're migrating to the cloud, you won't have to retrain your workforce because they'll be used to everything that they'll be doing in the cloud as a result of what they've been doing on-prem. In that sense, Cloud Volumes ONTAP is the exact same product, unless you're using a really old version of Data ONTAP on-prem. Then there's the standard change between Data ONTAP versions.
What needs improvement?
Some of the licensing is a little kludgy. We just created an HA environment in Azure and their licensing for SVMs per node is a little kludgy. They're working on it right now. We're working with them on straightening it out.
We're moving a grid environment to Azure and the way it was set up is that we have eight SVMs, which are virtual environments. Each of those has its own CIFS server and all of its own CIFS shares and NFS mounts. The reason they're independent of one another is that different groups of business got pulled together, so they had specific CIFS share names and you can't have the same name in the same server more than once on the network. You can't have two CIFS shares called "Data" in the same SVM. We have eight SVMs because of the way the data was labeled in the paths. God forbid you change a path because that breaks everything in every application all down the line. It gives you the ability to port existing applications from on-prem into cloud and/or from on-prem into fibre infrastructure.
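The naming constraint can be pictured with a toy Python registry. The `ShareRegistry` class, the SVM names, and the paths are hypothetical and only model the rule that a share name must be unique within one SVM but can repeat across SVMs.

```python
class ShareRegistry:
    """Toy registry in which share names must be unique within each SVM."""

    def __init__(self):
        self._shares = {}  # (svm, share_name) -> path

    def add(self, svm, share_name, path):
        key = (svm, share_name)
        if key in self._shares:
            raise ValueError(f"share {share_name!r} already exists in {svm}")
        self._shares[key] = path

    def lookup(self, svm, share_name):
        return self._shares[(svm, share_name)]


registry = ShareRegistry()
registry.add("svm_a", "Data", "/vol/a_data")
registry.add("svm_b", "Data", "/vol/b_data")  # same name, different SVM: allowed

try:
    registry.add("svm_a", "Data", "/vol/other")  # same name, same SVM: rejected
except ValueError as err:
    print(err)
```

Because each business group keeps its own SVM, both groups keep their existing `Data` share name and no application path has to change.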
But that ability wasn't there in Cloud Volumes ONTAP because they assumed that it was going to be a new market and they licensed it for a single SVM per instance built out in the cloud. They were figuring: New market and new people coming to this, not people porting these massive old-volume infrastructures. In our DR infrastructure we have 60 SVMs. That's not how they build out the new environments.
We're working with them to improve that and they're making strides. The licensing is the only thing that I can see they can improve on and they're working on it, so I wouldn't even knock them on that.
For how long have I used the solution?
I've been using it since its inception. Prior to it being called Cloud Volumes ONTAP, it was named a couple of different things as it went along. I've been working with the on-prem Data ONTAP for about 16 years now. When they first offered the Cloud Volumes ONTAP, I started testing that out in a Beta program. It's been a few years now with Cloud Volumes ONTAP. I'm our lead storage engineer, but I'm also on a couple of our cloud teams and I'm a cloud administrator for our organization. We started looking at it when AWS first started coming on the scene, at what we could do in the cloud. And as a company direction, we're implementing cloud-first, where available.
What do I think about the stability of the solution?
What do I think about the scalability of the solution?
In an HA environment, it will scale up to 358 terabytes. That's not bad per-system. We've had no difficulties.
We will be moving more stuff off-prem into the cloud. Right now it's at about 15 percent of our entire environment, and we plan on at least 10 percent, or more, per quarter, over the next few years.
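For a sense of what that pace means, here is a small Python calculation. It assumes "10 percent per quarter" means ten percentage points of the whole environment each quarter, at a constant rate, which the text does not spell out. The target values are illustrative, not stated goals.

```python
import math

current_pct = 15.0      # share of the environment already in the cloud (from the text)
per_quarter_pct = 10.0  # assumed: percentage points of the whole environment per quarter


def quarters_to_reach(target_pct, start_pct=current_pct, step_pct=per_quarter_pct):
    """Number of whole quarters needed to reach target_pct at a constant pace."""
    if target_pct <= start_pct:
        return 0
    return math.ceil((target_pct - start_pct) / step_pct)


print(quarters_to_reach(50))  # 4 quarters
print(quarters_to_reach(75))  # 6 quarters
```

At the minimum pace, half of the environment would be in the cloud after about a year, which fits "over the next few years" for the rest.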
We'll be doing the tiering and using the Cloud Sync as well. We're a financial and insurance company, so some things have to remain on-prem, and some things, from a PCI perspective, have a lot of different requirements around them. And because we're across multiple countries worldwide, there are all sorts of HIPAA and other types of legal and financial ramifications from a security perspective. In the UK and in Europe there are the privacy components. There are different things in Hong Kong and Singapore, in Spain, etc. Each country unit requires different types of policies to be adhered to. Everything we have is encrypted at rest, as well as encrypted in-flight.
Cloud Volumes ONTAP will also support doing data encryption at a volume level, a software encryption. But from a PCI perspective, we use the NSE drives, which give us hardware encryption. So they're double encrypted. They are hardware encrypted. We're having to use a management appliance to keep and maintain the encryption keys, and we do quarterly encryption-key replacement. But there are also the volumes that are encrypted as well. We also use TLS for transporting the data, doing encryption in-flight. There are all sorts of things that it supports which allow you to be compliant.
Another feature it has is disk sanitize, a destruction component which allows you to do a DoD wipe of the data. Once you've decommissioned an environment, it is completely wiped so nobody can get access to the data that was there previously. That's all built into Data ONTAP, including Cloud Volumes.
NSE drives are a little different because you are not getting physical drives in the cloud environment, so you couldn't do that. But you can do the volume encryption, from Cloud Volumes. In terms of a DoD wipe, you wouldn't be doing that on Azure's or AWS's environments because it's a virtual disk.
How are customer service and technical support?
I've rarely used tech support. I've got so much experience deploying these environments that it's like breathing. It's second nature. And when they first came out with OnCommand Cloud Manager, I was doing beta testing and debugging with the group out of Israel to build the product.
How was the initial setup?
The initial setup was very straightforward. If you use OnCommand Cloud Manager to deploy it into AWS or Azure, it's point-and-click stupid-simple. It takes less than 15 minutes, depending upon your connectivity and bandwidth. That 15 minutes is to build out a brand-new filer and create CIFS shares on it. It automatically deploys it for you: the back-end storage, the EC2 instances, if you're in AWS. In Azure, it creates the Blob space. It creates the VMs.
It's all done for you with just a couple of screens. You tell it what you want to call it, you tell it what account or subscription you're using, depending upon whether it's AWS or Azure. You tell it how big you want the device to be, how much storage you want it to have, and what volumes you want it to create; CIFS shares, etc. You click next, next, next. As long as you have the ability to provision what you've gone into, whether it's AWS or Azure, and turned on programmatic deployment, it gives you the access. The only thing you have to do outside Cloud Volumes ONTAP under OnCommand Cloud Manager is turn it on to allow it to run. It picks up everything else. It'll pick up what VPC you have, what subnet you have. You just tell it what security group you want it to use. It's fairly simple.
If somebody hasn't utilized or isn't familiar with how to deploy anything in either AWS or Azure, it might be a tad more complicated because they'd need to get that information to begin with. You have to have at least moderate experience with your infrastructure to know which VPC and subnet and security group to specify.
What was our ROI?
In my opinion, we're getting a good return on investment.
Which other solutions did I evaluate?
I always try new products. I've used the SoftNAS product, and a couple of other generic NAS products. They don't even compare. They're not on the same page. They're not even in the same universe. I might be a little biased but they're not even close.
I have looked at Azure NetApp Files, which is another product that NetApp is putting out. Instead of Cloud Volumes it's cloud files. You don't have to deploy an entire NetApp infrastructure. It gives you the ability to do CIFS at file level without having to manage any of the overhead. That's pre-managed for you.
What other advice do I have?
For somebody who's never used it before, the biggest thing is ease of use. In terms of advice, as long as you design your implementation correctly, it should be fine. I would do the due diligence on the front-end to determine how you want to utilize it before you deploy.
We have over 3,000 users of the solution who have access to snapshots, etc. but only to their own data. We have multiple SVMs per business unit and a locked-down security on that. Only individuals who own data have access to it. We are officially like a utility. We give them storage space. We give them the ability to use it and then they maintain their data. From an IT perspective, we can't really discern what is business-critical and what isn't to a specific business unit. We're global, we're not just U.S., we're all over the world.
We've gone into doing HA. It's the same as what's on-prem, and HA on-prem is something we've always done. When we would buy a filer for on-premise, we'd always buy a two-node HA filer with a switch back-end to be able to maintain the environment. The other nice thing, from an on-prem perspective with a switched environment, is that we can inject and eject nodes. We can do a zero-downtime lifecycle. We can inject new nodes and mirror the data to the new nodes. Once everything's on those new nodes, eject the old nodes and we will have effectively lifecycled the environment, without having to take any downtime. Data ONTAP works really well for that. The only thing to be aware of is that to inject new nodes into an existing cluster, they have to be at the same version of Data ONTAP.
In terms of provisioning, we keep that locked down because we don't want them running us out of space. We have a ticketing system where users request storage allocation and the NAS team, which supports the NetApp infrastructure, will allocate the space with the shares, to start out. After that, our second-level support teams, our DSC (distributed service center) will maintain the volumes from a size perspective. If something starts to get near-full, they will automatically allocate additional space. The reason we have that in place is that if it tries to grow rapidly, like if there's an application that's out of control and just keeps spinning up and eating more and more of the utilization, it gives us the ability to stop that and get with the user before they go from using a couple of hundred gigs to multiple terabytes, which would cost them X amount. There is the ability to auto-grow. We just don't use it in our environment.
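The reasoning behind keeping a person in the loop, rather than turning on auto-grow, can be sketched as a small decision rule. This is a Python illustration only. The `VolumeUsage` fields and both thresholds are hypothetical values, not settings from our environment or from ONTAP.

```python
from dataclasses import dataclass

# Hypothetical thresholds chosen for illustration.
NEAR_FULL_PCT = 90.0        # usage level that triggers a review
MAX_GROWTH_GB_PER_DAY = 50  # growth above this is treated as out of control


@dataclass
class VolumeUsage:
    size_gb: float
    used_gb: float
    used_gb_yesterday: float


def decide(usage: VolumeUsage) -> str:
    """Return 'ok', 'grow', or 'hold' for one volume."""
    used_pct = 100.0 * usage.used_gb / usage.size_gb
    if used_pct < NEAR_FULL_PCT:
        return "ok"
    growth = usage.used_gb - usage.used_gb_yesterday
    if growth > MAX_GROWTH_GB_PER_DAY:
        # Rapid growth: stop and contact the data owner before adding space.
        return "hold"
    # Near-full with normal growth: allocate additional space.
    return "grow"


print(decide(VolumeUsage(200, 100, 90)))   # ok
print(decide(VolumeUsage(200, 185, 180)))  # grow
print(decide(VolumeUsage(200, 190, 100)))  # hold
```

Plain auto-grow would behave like the "grow" branch every time, which is exactly the case a runaway application would exploit.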
In terms of the data protection provided by the solution's disaster recovery technology, we use that a lot. Prior to clustered ONTAP, going back to 7-Mode, there was the ability to auto-DR with a single command. That gave us the ability to do a cut-over to another environment and automatically fail over. We're currently using WFA to do that because, when they first came out with cluster mode, they didn't have the ability to auto-DR. I have not looked into whether they've made auto-DR a feature in these later versions of Data ONTAP.
OnCommand Cloud Manager doesn't allow you to do DR-type stuff. There are other things within the suite of the cloud environment that you can do: There's Cloud Sync which allows you to create a data broker and sync between CIFS shares or NFS mounts into an S3 bucket back-end. There's a lot of stuff that you can do there, but that's getting into the other product lines.
As for using it to deploy Kubernetes, we are working through that right now. That process is going well. We've really just started getting through it and it hasn't been overly complicated. Cloud Volumes ONTAP's capabilities for deploying Kubernetes mean it's been fairly easy.
In terms of the cloud, one thing that has made things a little easier is that previously, within the AWS environment, we used to have to create a virtual filer in each of our subscriptions or accounts because we really wanted the filer to be close to the database instances or the servers within that same account, without traversing VPCs. Now, since they have given us the ability to do VPC peering, we can create an overarching primary account and then have it talk to all the instances within that storage account, or subscription in Azure, without having to have one spun up in every single subscription or account. We have a lot of accounts so it has allowed us to reel that back by creating larger HA components in a single account and then give access through VPCs to the other accounts. All that traffic stays within Azure or AWS. That saves money because we don't have to pay them for multiple subscriptions of Cloud Volumes ONTAP and/or additional virtual filers.
For my use, Cloud Volumes ONTAP is a ten out of ten.