Dell EMC PowerMax NVMe Review

CloudIQ ensures that all our arrays are properly communicating so we can see performance and storage capacities

What is our primary use case?

We are a very large customer of Dell EMC. We have several different deployments or installations. The biggest use case is probably a multi-tenant or shared environment where we provide many petabytes of storage for multiple customers who utilize that same infrastructure. We are a managed services provider in the cloud sector so we have to deliver high performance storage for thousands of customers who have to be up all the time.

There are a lot of different use cases. In general, we need large quantities of storage that are always available, so uptime is important, as is performance. As a service provider, we deliver storage on demand for our customers. This is important because we can adjust storage on a per-customer basis. Whether it is an increase or a decrease in storage, this platform allows us to do that very easily.

We are using the latest release.

How has it helped my organization?

As a service provider, we have to deliver the best possible service that is backed by SLAs. The NVMe performance is fantastic for our customers and the features of the PowerMax are fantastic. We have seen improvements in performance, which means fewer customer support tickets. The ease of management frees up resources for our storage teams so they can focus on other problems with other platforms, etc. This is such a self-sufficient beast of a platform that it has really freed up a lot of time so they can focus on other stuff besides storage.

There is no management overhead involved in optimizing performance. It does it so well on its own. We don't have to manage much at all. It really is like a set it and forget it solution. My storage engineers love the system. It is a lot less work than our previous systems, which weren't bad by any means. There is not nearly as much management as before. So, we are saving dozens of hours per month for our storage team, and that is a real cost in our business.

There are different ways to look at security and availability. We take advantage of array-level encryption, but that is a behind-the-scenes thing. We tend to focus on the availability part, because high uptime and performance are important to us. In regards to data security and availability, the data is secure if it is encrypted, and availability means that it is always up. We have a very good opinion of the security features in both single-tenant and multi-tenant deployments.

There is also the security concept regarding access to data. What we are seeing is that the PowerMax is so consistently dependable that it gives us a very solid comfort level in terms of level of trust. There is data security and protection, keeping your data from the bad guys. On the other hand, there is security knowing that your data is always available. PowerMax provides both of those.

What is most valuable?

We use the solution's CloudIQ features for what we call fleet management. We manage hundreds of devices. We use this to make sure that all our arrays are properly communicating so we can see performance, storage capacities, etc. We can also generate reports on usage and performance. Our customers with dedicated solutions rely on CloudIQ for reports. We also have a lot of homegrown internal tools that give us the same features, so we don't use it as much as our customers do, but we use it occasionally.

CloudIQ is definitely helpful for our customers who use it, but our teams are using internal tools that we've trusted for years. CloudIQ is very helpful for helping to manage storage for customers who need the tools but don't have their own.

In regards to efficiency and performance, we don't have escalations to the vendor at all because it works so well. These devices are a beast. Historically, before the PowerMax came out, we would sometimes experience storage performance bottlenecks because there were a lot of customers in the shared or multi-tenant environment. So, we have a lot of customers requesting a lot of data. We do things at an enterprise-level at scale. Therefore, we would see performance bottlenecks. The efficiency of the system has now just proven that it works phenomenally. It can allocate resources to different storage tiers, like a Gold, Silver, or Bronze tier. If Gold is busy, it can go and request resources from the Silver or Bronze layer as we have defined them. We no longer see performance issues because the system just runs really well and handles a lot of scaling in both directions. 

There is an underlying QoS-type functionality behind the scenes, where we provide storage with an SLA based on tiers (Gold, Silver, or Bronze). For example, if the Gold tier does not hit its minimum required performance, the QoS functionality kicks in: it will reach out to the other storage tiers and consume more bandwidth, if needed. However, in our experience, the system works so well that we don't actually have to use that feature. On the very rare occasions that we need to, we just go click a button in the background.
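The tier behavior described above can be pictured with a toy model. This is only a sketch of the general idea and not how PowerMax implements it: the `Tier` class, the `rebalance` function, the use of IOPS as the unit, and the rule that a lower tier gives up only the budget it is not currently using are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Tier:
    name: str
    priority: int   # lower number = higher priority (hypothetical convention)
    allocated: int  # performance budget currently assigned, e.g. in IOPS
    demand: int     # performance currently being requested


def rebalance(tiers):
    """Let busy higher-priority tiers borrow unused budget from lower tiers."""
    ordered = sorted(tiers, key=lambda t: t.priority)
    for i, tier in enumerate(ordered):
        shortfall = tier.demand - tier.allocated
        if shortfall <= 0:
            continue  # this tier already has what it needs
        # Walk the lower-priority tiers in order and take only their spare budget.
        for donor in ordered[i + 1:]:
            spare = donor.allocated - donor.demand
            if spare <= 0:
                continue
            moved = min(spare, shortfall)
            donor.allocated -= moved
            tier.allocated += moved
            shortfall -= moved
            if shortfall == 0:
                break
    return {t.name: t.allocated for t in ordered}


tiers = [
    Tier("Gold", priority=1, allocated=50, demand=70),
    Tier("Silver", priority=2, allocated=30, demand=20),
    Tier("Bronze", priority=3, allocated=20, demand=5),
]
print(rebalance(tiers))  # {'Gold': 70, 'Silver': 20, 'Bronze': 10}
```

In this example Gold is 20 units short, so it takes the 10 spare units from Silver and 10 of the 15 spare units from Bronze. Neither lower tier drops below its own current demand.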

It works great. We don't ever have to escalate to the vendor. PowerMax is really a game changer for us. Historically, we would have bottlenecks on older, spinning disk gear, but this NVMe technology is really solid. Now, it works phenomenally. Therefore, storage is not a problem for us. The performance that we are experiencing changes the customer's conversation from talking about I/O to response times or latency. We used to have to worry about disk and how quickly your data could go in and out. Now, things are so dang fast that we just want to know how quickly we can connect to it, so the latency is pretty cool. We don't have any issues with performance or efficiency at all.

What needs improvement?

The improvements made to the product line over the generations have made PowerMax a gem. Nothing is perfect, though, and the improvements that come to mind are not specific to the physical product but are instead on the support and management side.

Support of the product can be slow and an administrative challenge: planning, scheduling, and overseeing data center access for a Dell EMC rep. One improvement could be to enable a self-maintenance option. The requirements that we go through to get Dell EMC onsite to replace failed drives, power supplies, and other small redundant parts can be unnecessarily complex. If simplified, they could send us the parts, then we could replace them much faster, more easily, and truly within the SLA parameters.

We have had performance/availability issues in the past with the management server/application, Unisphere. Upgrades to the platform could also be difficult and even fail. However, the most recent version, released last month, was the first in a long time to upgrade successfully. Therefore, we are hopeful those past software issues have been addressed.

For how long have I used the solution?

We have been using the solution since it rolled out, along with the previous hardware iterations prior to NVMe.

What do I think about the stability of the solution?

PowerMax is an absolute must have - 100%. At Rackspace, we have had PowerMax since its initial launch. Prior to PowerMax, we had the VMAX3. We also had VMAX2s. We even started with the original VMAX (VMAX1). All told, we have been working with the entire Dell EMC product line for 10 to 11 years now. In that time, we have literally had just six minutes of downtime over 11 years. 

There was one single outage across that entire 10- to 11-year window. While no one likes outages, the nice thing about this one was that when it was down, there was zero data loss and zero data corruption. This single six-minute outage was caused by a legitimate bug in the system. The system invoked a safety mechanism to protect data, but the mechanism itself glitched. It immediately recovered, restored, booted back up, and picked up right where it left off. This happened in the middle of the day. Very few customers even noticed. That has been it for more than 10 years of service across hundreds of devices supporting double-digit quantities of petabytes of storage, which is pretty impressive. Based on our experience, Dell EMC could very easily offer a 100% uptime guarantee on an annual basis. It is that good of a system.
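To put the six minutes in perspective, here is a rough back-of-the-envelope calculation. It is a sketch under stated assumptions: an 11-year window, a 365.25-day year, and the six minutes counted against one always-on service rather than per array. The function names are illustrative only.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes (assumes a 365.25-day year)


def availability(downtime_minutes, years):
    """Fraction of the window during which the service was up."""
    total_minutes = years * MINUTES_PER_YEAR
    return 1 - downtime_minutes / total_minutes


def allowed_downtime_per_year(target):
    """Minutes of downtime per year permitted by an availability target."""
    return (1 - target) * MINUTES_PER_YEAR


# Six minutes of downtime across an assumed 11-year window.
print(f"{availability(6, 11):.6%}")                   # 99.999896%
# Downtime budget for "five nines" (99.999%), per year and over 11 years.
print(f"{allowed_downtime_per_year(0.99999):.2f}")       # 5.26
print(f"{allowed_downtime_per_year(0.99999) * 11:.2f}")  # 57.86
```

Under these assumptions, six minutes over 11 years is roughly a tenth of the 58 minutes or so that a five-nines target would allow over the same window.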

Based on the feedback from our engineers, the system could not be more stable than it is. It is incredibly stable and very dependable. This is Dell EMC’s flagship product line. It has been a very stable product for many years and easily achieves the five nines of uptime that they guarantee. Outside of the normal hardware failure here and there, we have only encountered a couple of bugs that affected attached hosts, and these were very rapidly resolved by Dell EMC’s engineering teams with software or firmware patches. The only significant (downtime) event we have ever encountered was on a previous generation unit, where Dell EMC’s engineering team responded and resolved the issue very swiftly by identifying the bug and immediately writing a patch to prevent future occurrences.

What do I think about the scalability of the solution?

The system scales as far as you want to take it.

In a large shared infrastructure environment, we are regularly adding storage or taking storage down as our customers' needs change. This used to take hundreds of hours every quarter. Now, with this new technology, it is faster and more efficient. It gets the work done quickly, which means less time that my storage engineers have to spend on it. This applies to adding new storage as well as expanding existing storage for our customers. Now, the customer says, "I need 1,000 GB." We say, "PowerMax, give me 1,000 GB." Then, it is done. If the customer says, "Wait, I need 2,000," we can scale that up without any of the busy work on the back-end that we had to do with previous systems. The PowerMax system is getting our storage team out of the business of having to manage these micro-interactions while letting the team focus on storage maintenance and management.

We have dozens of storage engineers on our team and thousands of customers who use the solution as part of our service. Because we are a service company, we deliver the best technology home for applications and data. Our customers are eCommerce (banks, medical, and retailers). We service businesses of all sizes and every vertical who are using the storage service that we deliver for them. We have a very competent, modest-sized team managing tens of petabytes for thousands of customers very easily.

We hope to increase usage in the future. When we get more customers, they buy more storage.

How are customer service and technical support?

Our support teams work with the actual Dell EMC support team. We are not engaging Dell EMC tech support a whole lot, unless we are escalating a serious bug issue.

We regularly meet with the Dell EMC product teams. They are getting our feedback constantly. They are asking us questions or being proactive on things that we have noticed, whether it's feature requests or bugs that we find. We have a clear communication path with Dell EMC.

Our storage team is very familiar with the trend analysis tools, the monitoring and management tools, etc. In fact, our storage team meets with the CloudIQ developer team every quarter or two to go over feature sets and give them feedback on our use cases. The CloudIQ team actually relies on Rackspace to provide them some input on the product, and as far as fleet management goes, to see what we have done. We have done some beta testing for them and had some sneak peeks at new features. We have a really tight relationship with Dell EMC, which we have had for a couple of decades now. So, we are definitely influencing the CloudIQ feature set and helping the team out the best we can.

Which solution did I use previously and why did I switch?

Here is a nice use case in regards to storage provisioning. In other words, how do we deploy storage for customers? At Rackspace, we are providing a large shared infrastructure environment where we are adding storage or taking it down constantly for customers. We are seeing savings of hundreds of hours per fiscal quarter (three months). Before NVMe and these versions came out, we had to do a lot of storage work manually to make changes for our customers. We would deal with a storage volume and the subcomponents below that storage volume. So, we would create slivers of a volume, then package those together to make a single volume and present that to the customer's hosts.

By provisioning within the PowerMax systems, we no longer have to go and create individual pieces, and say, "I need all the things needed for a 1,000 GB LUN." Now, they can just go there, and say, "I need 1,000 GB. Give it to me." There is no provisioning subwork or extra work needed. It is just there. If I say I'm done with it, I can turn it off. If I want to go from 1,000 to 500, it just happens. A lot of the former busy work that was required for everyday storage support in that location goes away. It literally saves us hundreds of hours per quarter.

How was the initial setup?

Our team knows Dell EMC really well. I don't think they had any issues with the initial setup.

Follow the manufacturer's instructions once you get it deployed. In many ways, it is a set it and forget it technology.

What about the implementation team?

We work hand in hand with Dell EMC. The implementation strategy is just providing the best possible quality of storage equipment with the features that our customers need. The features that they need constantly change so we need the ability to adapt. Our implementation strategy is to work with a platform that is dependable and flexible, and we have been successful with Dell EMC.

What was our ROI?

You can save provisioning time and focus on mission-critical issues as well as problem solving. It is really helpful for businesses of all sizes.

The labor savings and support have been significant. If we're talking 100 hours of labor every three months, that is the cost of 100 hours of a database engineer's time. There are performance latency numbers as well as costs associated with recovering data that gets lost, and this system doesn't lose data. You can look at numbers around the cost of downtime, if data is not available. This system doesn't go down. Everyone's ROI is going to be unique, but the dependability and performance of the system combined with its ease of operation will definitely save businesses of all sizes money.

Which other solutions did I evaluate?

We have been with Dell EMC since the beginning of business. We adopted them from a server perspective, then we adopted their storage lines. 

What other advice do I have?

The solution keeps getting better. When you go with trusted vendors and time tested technology, things are going to go well for you.

I would rate this solution as 10 out of 10.

Disclosure: IT Central Station contacted the reviewer to collect the review and to validate authenticity. The reviewer was referred by the vendor, but the review is not subject to editing or approval by the vendor. The reviewer's company has a business relationship with this vendor other than being a customer: Partner
Learn what your peers think about Dell EMC PowerMax NVMe. Get advice and tips from experienced pros sharing their opinions. Updated: September 2021.