Nutanix Valuable Features

Samuel Rothenbuehler
CTO Enterprise Cloud at Amanox Solutions (S&T Group)
Some years ago, when we started working with Nutanix, the product was essentially a stable, user-friendly hyper-converged solution offering a less feature-rich version of what is now called the Distributed Storage Fabric. This is what competing solutions typically offer today, and for many customers it isn't easy to understand the added value Nutanix offers in comparison to other approaches (I would argue these additions should in fact be a requirement). Over the years Nutanix has added lots of enterprise functionality such as deduplication, compression, erasure coding, snapshots, (a)sync replication and so on. These features are very useful, scale extremely well on Nutanix and offer VM-granular configuration (if you don't care about granularity, configure them cluster-wide by default). But it is other, maybe less obvious features, or I should say design principles, which should interest most customers a lot:

Upgradeable with a single click
This was introduced a while ago, I believe around version 4 of the product. At first it was mainly used to upgrade the Nutanix software (Acropolis OS or AOS), but today we use it for pretty much anything from the hypervisor to the system BIOS and the disk firmware, and also to upgrade subcomponents of the Acropolis OS. There is, for example, a standardized system check (around 150 checks) called NCC (Nutanix Cluster Check) which can be upgraded throughout the cluster with a single click, independent of AOS. The One-Click process also allows you to apply a granular hypervisor upgrade such as an ESXi offline bundle (which could be a patch release). The Nutanix cluster will then take care of the rolling reboot, vMotion etc. in a fully hyper-converged fashion (e.g. it doesn't reboot multiple nodes at the same time). Compared to a traditional three-tier architecture (including converged generation 1), you get a much simpler and well-tested workflow, and it is the one you use by default.
And yes, it runs automatic prechecks and also ensures that what you are updating is on the Nutanix compatibility matrix. It is also worth mentioning that upgrading AOS (the complete Nutanix software layer) doesn't require a host reboot, since AOS isn't part of the hypervisor but is installed as a VSA (a regular VM). It also doesn't require any VMs to migrate away from the node/host during or after the upgrade (I love that fact, since bigger clusters tend to have some hiccups when using vMotion and similar techniques, especially if you have 100 VMs on a host), not to mention the network impact.

Linearly scalable
Nutanix has several unique capabilities to ensure linear scalability. The key ingredients are data locality, a fully distributed metadata layer and granular data management. The first is especially important when you grow your cluster. It is true that 10G networks offer very low latency, but the overhead counts towards every single read IO, so you should consider the sum of them (and there are a lot of read IOs you get out of every single Nutanix node!). If you look at the development currently ongoing in the field of persistent flash storage, you will see that the network overhead will only become more important going forward. The second key point is the fully distributed metadata database. Every node holds a part of the database (for the most part the metadata belonging to its currently local data, plus replica information from other nodes). All metadata is stored on at least three nodes for redundancy (each node writes to its neighbor nodes in a ring structure; there are no metadata master nodes). No matter how many nodes your cluster holds (or will hold), there is always a defined number of nodes (three or five) involved when a metadata update is performed (a lookup/read is typically local).
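The ring placement described above can be illustrated with a toy model. This is only a sketch under assumptions, not Nutanix's actual implementation: the function name replica_nodes, the node names and the use of a SHA-256 hash to pick the owning node are all made up for illustration. It shows the one property that matters for scaling: the number of nodes touched by a metadata update stays fixed at the replication factor, whether the ring has 4 nodes or 64.

```python
import hashlib


def replica_nodes(key, ring, replication_factor=3):
    """Return the nodes that would store metadata for `key` in this toy model.

    The owning node is picked by hashing the key (an assumption made for
    illustration); the replicas go to its successors on the ring.
    """
    if replication_factor < 1 or replication_factor > len(ring):
        raise ValueError("replication factor must be between 1 and the ring size")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(ring)
    # Walk the ring from the owner, wrapping around at the end.
    return [ring[(start + i) % len(ring)] for i in range(replication_factor)]


small_ring = [f"node-{i}" for i in range(4)]
large_ring = [f"node-{i}" for i in range(64)]

for ring in (small_ring, large_ring):
    touched = replica_nodes("vdisk-42/extent-7", ring)
    # The count of touched nodes is the same for both cluster sizes.
    print(f"{len(ring):>2} nodes in cluster, update touches {len(touched)}: {touched}")
```

Because no node acts as a master in this model, adding nodes lengthens the ring without adding participants to any single update.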
I like to describe this architecture using Big O notation: in this case you can think of it as O(1), because the number of nodes involved in a metadata update stays constant no matter how large the cluster grows, and since there are no master nodes there aren't any bottlenecks at scale. The last key point is the fact that Nutanix acts as an object store (you work with so-called vDisks), but the objects are split into small pieces (called extents) and distributed throughout the cluster, with one copy residing on the local node and each replica residing on other cluster nodes. If your VM writes three blocks to its virtual disk, they will all end up on the local SSD, and the replicas (for redundancy) will be spread out in the cluster for fast replication (they can go to three different nodes in the cluster, avoiding hot spots). If you move your VM to another node, data locality (for read access) will automatically be built again (of course only for the extents your VM currently uses). You might now think that you don't want to migrate those extents from the previous node to the now-local node, but the extent has to be fetched anyhow, so why not save it locally and serve it directly from the local SSD going forward, instead of discarding it and reading it over the network every single time? This is possible because the data structure is very granular. If you had to migrate the whole vDisk (e.g. a VMDK) because this is the way your storage layer saves its underlying data, then you simply wouldn't do it (imagine vSphere DRS migrating your VMs around and your cluster constantly having to migrate the whole VMDKs). If you wonder how all this matters when a rebuild (disk failure, node failure) is required, then there is good news too! Nutanix immediately starts self-healing (rebuilding lost replica extents) whenever a disk or node is lost. During a rebuild all nodes are potentially used as source and target to rebuild the data. Since extents are used (not big objects), data is evenly spread out within the cluster.
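A back-of-the-envelope model shows why spreading extents evenly matters for rebuilds. The function rebuild_hours and all numbers below are hypothetical, and the model assumes every surviving node contributes the same rebuild throughput with no other bottleneck, which a real cluster will not match exactly.

```python
def rebuild_hours(lost_tb, surviving_nodes, per_node_mb_s):
    """Estimate rebuild time when all surviving nodes rebuild in parallel.

    Idealized: lost data is spread evenly, so the aggregate throughput is
    simply surviving_nodes * per_node_mb_s.
    """
    if lost_tb < 0 or surviving_nodes < 1 or per_node_mb_s <= 0:
        raise ValueError("need non-negative data, >= 1 node and positive throughput")
    total_mb = lost_tb * 1_000_000  # decimal terabytes to megabytes
    seconds = total_mb / (surviving_nodes * per_node_mb_s)
    return seconds / 3600


# Hypothetical figures: 8 TB of lost replicas, 200 MB/s of rebuild
# throughput per surviving node.
for nodes in (3, 7, 15, 31):
    print(f"{nodes:>2} surviving nodes: {rebuild_hours(8, nodes, 200):.2f} h")
```

In this model the rebuild time is inversely proportional to the number of surviving nodes, which is the effect the evenly distributed extents are meant to enable.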
A bigger cluster will increase the probability of a disk failure, but the rebuild is faster since a bigger cluster has more participating nodes. Furthermore, a rebuild of cold data (on SATA) will happen directly on all remaining SATA drives within the cluster (it doesn't use your SSD tier), since Nutanix can directly address all disks (and disk tiers) within the cluster.

Predictable
Thanks to data locality, a large portion of your IOs (all reads, which can be 70% or more) are served from local disks and therefore only impact the local node. While writes are replicated for data redundancy, the replica writes have second priority behind the local writes of the destination node(s). This gives you a high degree of predictability: you can plan with a certain number of VMs per node and be confident that this will be reproducible when adding new nodes to the cluster. As I mentioned above, the architecture doesn't constantly read all data over the network, nor does it use metadata master nodes to track where everything is stored. With other hyper-converged architectures you won't get that kind of assurance, especially when you scale your infrastructure and the network can't keep up with all the read IOs and metadata updates going over it. With Nutanix, a VM can't take over the whole cluster's performance. It will have an influence on other VMs on the local node since they share the local hot tier (SSD), but that's much better than today's noisy-neighbor and IO-blender issues with external storage arrays. If you have too little local hot storage (SSD), your VMs are allowed to consume remote SSD with secondary priority behind the other node's local VMs. This means giving up data locality, but it is better than accessing local SATA instead. Once you move some VMs away or the load on the VM gets smaller, you automatically get your data locality back.
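The effect of data locality on network traffic can be estimated with a simple model. Everything here is an assumption made for illustration: the function remote_ops_per_io, the 70% read share, a replication factor of 2, and the comparison case in which data is placed uniformly across nodes with no preference for the local one.

```python
def remote_ops_per_io(read_fraction, replication_factor, nodes, data_locality):
    """Expected number of network transfers caused by one VM IO (toy model)."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    if replication_factor < 1 or nodes < replication_factor:
        raise ValueError("need 1 <= replication_factor <= nodes")
    write_fraction = 1.0 - read_fraction
    if data_locality:
        # Reads come from the local copy; a write stores one copy locally
        # and sends the remaining copies to other nodes.
        remote_reads = 0.0
        remote_writes = write_fraction * (replication_factor - 1)
    else:
        # Any given copy sits on a remote node with probability (nodes - 1) / nodes.
        p_remote = (nodes - 1) / nodes
        remote_reads = read_fraction * p_remote
        remote_writes = write_fraction * replication_factor * p_remote
    return remote_reads + remote_writes


for nodes in (4, 8, 32):
    local = remote_ops_per_io(0.7, 2, nodes, data_locality=True)
    spread = remote_ops_per_io(0.7, 2, nodes, data_locality=False)
    print(f"{nodes:>2} nodes: {local:.3f} remote ops/IO with locality, {spread:.3f} without")
```

With locality the per-IO network cost in this model does not depend on the cluster size, while without it the cost rises as the cluster grows, which is one way to read the predictability argument.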
As described further down, Nutanix can tell you exactly which virtual disk uses how much local (and possibly remote) data, so you get full transparency there as well.

Extremely fast
I think it is well known that hyper-converged systems offer very high storage performance. There is not much to add here except that it is indeed extremely fast compared to traditional storage arrays. And yes, an all-flash Nutanix cluster is as fast as (if not faster than) an external all-flash storage array, with the added benefit that you read from your local SSD and don't have to traverse the network/SAN to get the data (that and of course all other hyper-convergence benefits). Performance was the area where Nutanix had the most focus when releasing 4.6 earlier this year. The great flexibility of working with small blocks (extents) rather than the whole object on the storage layer comes at the price of much greater metadata complexity, since you need to track all these small entities throughout the cluster. To my understanding, Nutanix invested a great deal of engineering to make their metadata layer extremely efficient, to the point of even beating the performance of an object-based implementation. As a partner we regularly conduct IO tests in our lab and at our customers, and it was very impressive to see how all existing customers could benefit from 30-50% better performance by simply applying the latest software (using one-click upgrade, of course).

Intelligent
Since Nutanix has full visibility into every single virtual disk of every single VM, it also has lots of ways to optimize how it deals with our data. This is not only the simple random vs. sequential way of processing data; it also makes it possible, to name one example, to prevent one application from taking over all system performance and letting others starve. During a support case we can see all sorts of crazy information (I have a storage background, so I can get pretty excited about this), like where exactly your application consumes its resources (local or remote disks), what block size is used, whether access is random or sequential, the working set size (hot data) and lots more, all with single virtual disk granularity. At some point they were even thinking about making a tool which would look inside your VM and tell you which files (actually at sub-file level) are currently hot, because the data is there and just needs to be visualized.

Extensible
If you take a look at the up...
reviewer1376286
Consulting Solutions Architect at a tech services company with 5,001-10,000 employees
Nutanix has several feature sets that we like. For example, everything is centralized in the UI. You don't have multiple interfaces that you have to jump between like in some other solutions. It's more integrated for the overall management of the infrastructure. The other part which is very attractive is the fact that they provide an option if you're not leveraging your OEM hypervisor like VMware or Hyper-V. That was a significant cost saver for us, as well as enabling us to look at alternatives to the VMware tax.
Mohammed Alakhouch
Direction Générale des Impôts at a sports company with 201-500 employees
I think that the most interesting features are the replication and redundancy. It is a good solution and it is easy to work on a Nutanix platform.
Learn what your peers think about Nutanix. Get advice and tips from experienced pros sharing their opinions. Updated: April 2020.
426,265 professionals have used our research since 2012.
GILLES THEAUDIERE
Consultant at a tech services company with 10,001+ employees
The fact that there is only one interface to deploy a complete solution for maximum storage is fantastic.
Emmanuel Onen
Enterprise Technical Consultant at Datacentrix (Pty) Ltd
The most valuable feature would be the ease of deployment. That is the most significant feature for me because I've worked with multiple vendors and it's always been very complicated to install the software and get everything running. It has better software to get everything up and running. It has a simple interface.
Samuel Rothenbuehler
CTO Enterprise Cloud at Amanox Solutions (S&T Group)
* Very easy management (e.g. daily tasks and also major upgrades)
* Simple and fast to implement as a partner
* Very mature and stable, with outstanding Nutanix support if needed (we are an L1 and L2 support partner as well)
* Potential to replace 80-90% of all customer use cases we see in Switzerland
Danny Tseng
IT Director at Elite Semiconductor Memory
The solution offers impressive performance. We don't have to pay for extras. The interface of the solution is good. I like the UI. The stability of the solution is quite decent.
Manzeel Uprety
Co-Founder at Mero Reading Room
* The level of statistical performance data that it can provide in real time is extremely useful. I can see what my VM hosts and guests are doing from a single pane of glass and identify issues before they would otherwise become apparent.
* The informatics support for this graphical software is up-to-date with the latest infographic web trends.
Mohamed Shatla
Technical Head at eSky IT
Features related to performance, like data locality, compression, and deduplication, are all very important.
SystemsEng
System Engineer at a tech services company with 501-1,000 employees
Their Google operating system is more mature, so their results are much better.
Dominique Pasquier
Directeur Technique at INFONECS
Karbon is a must-have as it drastically simplifies the deployment of Kubernetes.
Luca De Vincenzi
IT Project Manager with 5,001-10,000 employees
* One-click upgrade
* Data locality for performance
* Prism Central administration console
* Metro Availability
Shannon Jones
Storage Analyst at Fidelity National Information Services, Inc.
SRM capabilities for replication have been proven reliable and very useful for our organization.
Peter Cheng
Owner at SIS International HK Limited
The most valuable feature is its data locality, as well as the solution's performance.