Hyper-Converged (HCI) Forum

Miriam Tover
Content Specialist
IT Central Station
Aug 30 2019
A lot of community members are trying to decide between hyper-convergence and traditional server architectures. How do you decide between traditional IT infrastructure, Converged Infrastructure (CI), and Hyperconverged Infrastructure (HCI)? What advice do you have for your peers? It's really hard to cut through all of the vendor hype about these solutions. Thanks for sharing and helping others make the right decision.
Shawn Saunders: This is truly a TCO decision, but not TCO as some do it: a comprehensive TCO that includes the cost of the new CI/HCI plus installation, training, and staffing, and the difference in operational costs over the life of the solution. It should also consider the points from Scott and Werner about consistency in support and compatibility of the components. The best opportunity I see is an environment with substantial IT debt, or one where you can align the refreshes of the various components; that helps the TCO conversation dramatically.

Keep away from the "shiny object" argument. Just because "everyone" is doing it is not the right reason, nor does it make sense just because all your vendors are pushing it or your technical team is pushing it. Again, steer clear of the "shiny object".

I recommend getting demos from 3-5 HCI vendors and capturing the capabilities they provide. Then spend some time with your business and technical teams to understand what their requirements really are in terms of capabilities. Separate the must-haves from the should-haves and the nice-to-haves. Build a cost model of what you are currently spending to support your environments (understand your current TCO), then build your requirements document accordingly. Release this as an RFP to find a solution that can meet your TCO requirements. Your goal should be a better TCO than the status quo, unless there are specific business benefits that outweigh a simple TCO; in that case you may need to talk with the business about funding the difference so they can have that specific business value proposition.
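As a rough illustration of the comprehensive TCO comparison described in the post above, here is a minimal Python sketch. Every figure, category name, and lifetime below is a made-up placeholder, not vendor data; the point is only the shape of the calculation (one-time costs plus recurring costs over the life of the solution, compared against the status quo).

```python
# Illustrative TCO comparison: status quo vs. a candidate HCI proposal.
# All numbers are hypothetical placeholders; substitute figures from your
# own cost model and from the vendor responses to your RFP.

def total_cost_of_ownership(acquisition, installation, training,
                            annual_staffing, annual_operations, years):
    """One-time costs plus recurring costs over the life of the solution."""
    one_time = acquisition + installation + training
    recurring = (annual_staffing + annual_operations) * years
    return one_time + recurring

LIFETIME_YEARS = 5

status_quo = total_cost_of_ownership(
    acquisition=0,              # existing kit is already owned
    installation=0,
    training=0,
    annual_staffing=400_000,    # separate storage, server and network admins
    annual_operations=300_000,  # maintenance renewals, power, cooling, licences
    years=LIFETIME_YEARS,
)

hci_proposal = total_cost_of_ownership(
    acquisition=900_000,        # new HCI nodes, switches and software
    installation=50_000,
    training=30_000,
    annual_staffing=250_000,    # smaller combined operations team
    annual_operations=180_000,
    years=LIFETIME_YEARS,
)

print(f"Status quo 5-year TCO  : ${status_quo:,.0f}")
print(f"HCI proposal 5-year TCO: ${hci_proposal:,.0f}")
print(f"Difference             : ${status_quo - hci_proposal:,.0f}")
```

If the difference comes out negative, that is the gap the business would need to fund for a specific business value proposition, as the post suggests.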
John Barnhart: HCI lowers operating and capital costs because it depends on the integration of commoditized hardware and certified/validated software, such as Dell Intel-based x86 servers with "SANSymphony" from DataCore, or vSAN from VMware. The idea is to reduce cost and complexity by converging networking, compute, storage and software in one system, thereby avoiding technology silos and providing a "cloud-like" cost model and deployment/operation/maintenance/support experience for the admin, developer, and tenant/end-users. HCI platforms can also help with scalability, flexibility, and reduction of single points of failure, as well as HA, BURA, security, compliance, and more.

HCI provides a unified resource pool for running applications more efficiently and with better performance, because the platform is more rack dense and the different technologies are converged into one solution. Placing the technology inside the same platform is beneficial if for no other reason than the physical one: data and electronic signals have less distance to travel. For instance, using internal flash to support a read/write-intensive database is a good idea, because a separate external array is no longer necessary. It is no different from how far water from a glacier has to travel down the river after it melts in order to reach the ocean: the closer the original source is to the final destination, the better.

The above is a simple answer, given without knowing the current needs/use case of the business, and is only intended to provide a very simple perspective on WHY HCI. You must consider your IT organization, goals and business needs very carefully before choosing any solution. I always advise considering the following:
1. Is "your world" changing?
2. Why?
3. What benefits do you expect to achieve from making incremental changes?
4. What happens if you do nothing?
Bob Whitcombe: Should I or shouldn't I – that is the HCI question. Not to wax poetic over a simple engineering decision, but HCI is about understanding the size and scale of your application space. Currently, most HCI implementations are limited to 32 nodes. This can make for a very powerful platform, assuming you make the right choices at the outset.

What does HCI do that is fundamentally different from the traditional client-server architecture? HCI manages the entire site as a single cluster – whether it has 4 nodes, 14, 24 or 32. It does this by trading ultimate granularity for well-defined Lego blocks of storage, network and compute. When seeking to modify a traditional architecture, you need to coordinate between three separate teams – storage, server and network – to add, move, update or change anything. With HCI, if you need more capacity, you add another block of storage, compute and network. You trade the ultimate "flexibility" of managing every detail of disk size, CPU type and networking in a traditional architecture for a standard module in an HCI environment that is replicated as you scale. Later, you can scale by adding a standard block or one that is compute- or storage-centric. With that constraint, you don't have to worry about the complexity of managing, scaling or increasing system performance. But you do need to pick the right-sized module – which means that for multiple needs, you may end up with different HCI clusters, each based on a different starting block.

The decision to use HCI or traditional servers comes down to scale. For most needs today – general virtualization, DevOps and many legacy apps that have been virtualized – HCI is more cost-effective. To handle EPIC for a large hospital chain or a global SAP implementation for a major multinational operation, you probably need a traditional architecture. If you need more than 32 servers to run an application today, that application will need to be cut up to fit in most HCI platforms. The trend is to use containers and virtualization to parse legacy applications into discrete modules that fit on smaller, cheaper platforms, but any time you even think of starting a conversation with "We just need the software dogs to ..." you're barking up the wrong tree.

Let's look at three examples for HCI clusters: replacing a legacy application platform where maintenance is killing you, building out a new virtualization cluster for general application use, and a DevOps environment for remote and local developer teams.

A legacy application – say, running logistics for a distribution or manufacturing operation – is typically pretty static. It does what it does, but it is constrained by the size of the physical plant, so it will probably not grow much beyond its current sizing. In that case, I would scope the requirements and, if it is under 100TB, probably opt for a hybrid HCI solution where each node has 2 or 4 SSDs acting as a data cache for a block of hard drives – typically 2-10 2.4TB 2.5" 10K rpm units – in a 2U chassis. You need to decide how many cores are needed for the application, then add the number of cores the HCI software requires for its management overhead, which can range from 16-32. You start with dual-socket 8-core CPUs and move up as needed: 12, 16, 18, 22, 24, etc. Most systems use Intel CPUs, which are well characterized for performance, so selection of CPU is no different from today – other than the need to accommodate the incremental cores for the HCI software overhead.
Most IT groups have standardized CPUs, either because core selection is constrained by software license costs tied to core counts, or because they go full meal deal on core counts and frequencies to get the most from their VMware ELAs. For HCI networking, since the network is how the disks and inter-process communications are handled, you go with commoditized 10Gb switching. Most server nodes in HCI platforms will have 4-port 10Gb cards to provide up to 40Gb of bandwidth per node, tied together by a pair of 10Gb commodity switches. If your legacy application is a moderate-sized transaction-processing engine, simply move from a hybrid system using a flash cache and spinning rust to an all-flash environment. You trade ~120 IOPS HDDs for 4,000 IOPS SSDs, and then the network becomes your limiting factor.

If I were to build out a new virtualization platform for general applications, I would focus on the types of applications and look at their IOPS requirements, but in general I would propose all-flash, just as we are doing today in traditional disk arrays. As noted earlier, the base performance of an HCI cluster is tied to the disk IOPS of the core building blocks, then network latency and bandwidth. The current trend is for flash densities and performance to grow while HDDs have plateaued. While spinning-disk costs are fairly stable, the prices on flash are falling. If I build a cluster today for my virtualization environment that needs 5,000 IOPS now and 10,000 IOPS next year as it doubles in size, I will get better performance from the system today, and the future SSD price will fall, allowing me to increase performance by adding nodes while the price per node drops. Don't forget that as the utility of a compute service increases, so does the number of users, and maintaining user response times as more are added is about lower latency. Read: SSD.

For a DevOps environment, I want those teams on an all-flash, high-core-count CPU design limited to 6-8 nodes, depending on how many developers there are and the size of the applications they operate on. I would insist on a separate dev/test environment that mimics production for them to deploy and test against, to verify performance and application response times before anything is deployed to the tender mercies of the user community.

Obviously, I have made extensive generalizations in these recommendations, but like the Pirate Code, they are just guidelines. Traditional IT architectures are okay, but HCI is better for appropriately sized applications. Push for all-flash HCI; there will be fewer issues from users in the future. When the budget dogs come busting in to disturb the serenity of the IT tower, go hybrid. Measure twice, scale over and over, and document carefully. Have the user and budget dogs sign off, because they will be the first to complain when the system is suddenly asked to scale rapidly and cannot deliver incremental performance as user counts grow. User support is tied to IOPS, and when your core building block is 120 IOPS per drive, you have roughly 30 times less potential than when you use SSDs. Once the software dogs catch up and the applications all live in modular containers that can be combined over high-speed networks, there will be no "if HCI" – just do.
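To make the hybrid-versus-all-flash arithmetic in the post above concrete, here is a minimal sketch using the per-drive figures it mentions (~120 IOPS for a 10K HDD, ~4,000 IOPS for an SSD). The node and drive counts are hypothetical, and the calculation deliberately ignores replication overhead, cache hits and the network, so treat it as a comparative ceiling, not a performance prediction.

```python
# Back-of-the-envelope raw disk IOPS for an HCI building block,
# hybrid vs. all-flash, using the rough per-drive figures from the post.
# Ignores replication factor, caching and network latency by design.

HDD_IOPS = 120     # ~10K rpm 2.5" drive, per the post
SSD_IOPS = 4_000   # per the post

def raw_cluster_iops(nodes, drives_per_node, iops_per_drive):
    return nodes * drives_per_node * iops_per_drive

nodes, drives_per_node = 4, 8   # hypothetical 4-node starter block

hybrid    = raw_cluster_iops(nodes, drives_per_node, HDD_IOPS)
all_flash = raw_cluster_iops(nodes, drives_per_node, SSD_IOPS)

print(f"Hybrid (HDD capacity tier): {hybrid:,} IOPS")
print(f"All-flash                 : {all_flash:,} IOPS")
print(f"Ratio                     : {all_flash // hybrid}x")  # ~33x, the "30 times" in the post
```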
Ariel Lindenfeld
Sr. Director of Community
IT Central Station
Jul 19 2019
There are a lot of vendors offering HCI solutions. What is the #1 most important criterion to look for when evaluating solutions? Help your peers cut through the vendor hype and make the best decision.
SamuelMcKoy: In my opinion, the most important criterion when assessing HCI solutions, other than the obvious one of performance, is how the solution scales – in other words, how one adds storage and compute resources to it. Without understanding how the solution scales, one can easily request resources without understanding how and why the overall costs have ballooned. Costs can balloon not only because you are adding nodes to your HCI cluster for the extra storage and compute resources you need, but also because additional compute nodes added to the cluster require additional licensing for whichever hypervisor the HCI solution depends upon, usually on a per-compute-node basis. For example, some HCI architectures allow admins to add only storage to the HCI cluster when additional storage is needed, without requiring the purchase of any additional hypervisor licensing. On the other hand, some HCI architectures require you to add a compute node along with the additional storage, even if you don't need the compute resources that come with it; that compute node will then need to be properly licensed as well. This type of architecture can, and usually does, force its consumers to spend more money than the circumstances initially dictated. So for me, how the HCI solution scales is most important, because it can ultimately determine how cost-effective the HCI solution really is.
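A minimal sketch of the point in the post above: an architecture that forces you to buy a fully licensed compute node for every capacity increment grows in cost much faster than one that allows storage-only expansion. All prices, node capacities and licence costs below are invented placeholders for illustration.

```python
# Illustrative scale-out cost comparison: an architecture that permits
# storage-only expansion nodes vs. one requiring a licensed compute node
# per capacity increment. All figures are hypothetical placeholders.

NODE_HW_COST       = 40_000   # hypothetical cost of one full HCI node
HYPERVISOR_LICENCE = 12_000   # hypothetical per-compute-node licence
STORAGE_ONLY_NODE  = 30_000   # hypothetical storage-only expansion node
TB_PER_NODE        = 20       # hypothetical usable capacity per node

def cost_compute_node_growth(extra_tb):
    nodes = -(-extra_tb // TB_PER_NODE)           # ceiling division
    return nodes * (NODE_HW_COST + HYPERVISOR_LICENCE)

def cost_storage_only_growth(extra_tb):
    nodes = -(-extra_tb // TB_PER_NODE)
    return nodes * STORAGE_ONLY_NODE              # no extra hypervisor licences

for extra_tb in (20, 60, 120):
    print(f"+{extra_tb} TB: compute-node growth ${cost_compute_node_growth(extra_tb):,}"
          f" vs. storage-only growth ${cost_storage_only_growth(extra_tb):,}")
```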
Bharat Bedi: While there is a long list of features/functions we can look at for HCI, in my experience of creating HCI solutions and selling them to multiple customers, here are some of the key things most customers boil it down to:

1) Shrink the data center: This is one of the key customer pitches all the big giants have for you: "We will help you reduce the carbon footprint with Hyperconverged Infrastructure." It is worth understanding how much reduction they are actually offering. Can 10 racks come down to two, fewer or more? With the many reduction technologies included, and compute + storage residing in those nodes, that kind of consolidation is possible, especially if you are sitting on legacy infrastructure.

2) Ease of running it: The other point of buying and running HCI is "set it and forget it". Look not only at how easy the system is to set up and install, but also at how long it takes to provision new VMs/storage, etc. It is worth probing your vendors to find out what they do about QoS, centralized policy management, etc. Remember that most HCI companies' portfolios differ at the software layer, and some of the features mentioned above are bundled in their code and work differently with different vendors.

3) Performance: This could be an architecture-level difference. In the race to shrink the hardware footprint, you could face performance glitches. Here is an example: when you switch on de-duplication and compression, how much effect does it have on CPU, and thereby on the VMs? Ask your vendors how they deal with it. I know some of them offload such operations to a separate accelerator card. (See the rough capacity sketch after this post.)

4) Scaling up + scaling out: How easy is it to add nodes, both for compute and for storage? How long does adding nodes take, and is there a disruption in service? What technologies do the vendors use to create a multi-site cluster, and can the cluster include remote sites too? Can you add "storage only" or "compute only" nodes if needed? All of the above have cost implications in the long run.

5) No finger pointing: Remember point number two? Most HCI products are based on other vendors' hardware, wrapped with the HCI vendor's own software to make it behave in a specific way. If something goes wrong, is your vendor willing to take full accountability and not ask you to speak with the hardware vendor? It is a good idea to look for a vendor with a bigger customer base (not just for HCI but for compute and storage in general), making them a single point of contact with more resources to help you in case anything goes wrong.
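Following on from points 1 and 3 above, here is a minimal sketch of how claimed data-reduction ratios translate into effective capacity, which is worth checking against each vendor's own numbers. The ratios, raw capacity and replication overhead below are hypothetical, and the sketch says nothing about the CPU cost of achieving the reduction, which you should probe separately.

```python
# Rough effective-capacity estimate from claimed data-reduction ratios.
# All inputs are hypothetical; validate real ratios with your own data
# during a POC and ask how much CPU the reduction features consume.

def effective_capacity_tb(raw_tb, dedupe_ratio, compression_ratio,
                          usable_fraction=0.7):
    """usable_fraction approximates replication/parity overhead on raw capacity."""
    usable = raw_tb * usable_fraction
    return usable * dedupe_ratio * compression_ratio

raw_tb = 100  # hypothetical raw cluster capacity
for dedupe, compression in [(1.5, 1.3), (2.0, 1.5), (3.0, 1.8)]:
    eff = effective_capacity_tb(raw_tb, dedupe, compression)
    print(f"dedupe {dedupe}x, compression {compression}x -> ~{eff:.0f} TB effective")
```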
Bart Heungens: For me, an HCI solution should provide:
- Ease of management: one console does it all, no experts needed, a cloud experience but with on-premises guarantees
- Invisible IT: no need to care about the underlying hardware, one stack
- Built-in intelligence based on AI for monitoring and configuration
- Guaranteed performance for any workload, also when failures occur
- Data efficiency with always-on dedupe and compression
- Data protection including backup and restore
- Scalability: ease of adding resources independently of each other (scale up & out)
- A single line of support
Nurit Sherman
Content Specialist
IT Central Station
Jul 08 2019
We all know that it's important to conduct a trial and/or proof-of-concept as part of the buying process. Do you have any advice for the community about the best way to conduct a trial or POC? How do you conduct a trial effectively? Are there any mistakes to avoid?
Manish Bhatia: I would say, gather and understand the requirements, share and check them with vendors, and invite the vendors to propose a solution with a POC in your environment. Ask for use cases, and for any legacy application/hardware ask for the compatibility matrix; then you will have an idea of the capabilities of that solution and vendor.
Anush Santhanam: Hi, when evaluating HCI it is absolutely essential to run a trial/POC to evaluate the system against the candidate workloads it will be expected to run in production. However, there are quite a few things to watch out for. Here is a short list:

1. Remember that most HCI depends on a distributed architecture, which means it is NOT the same as a standard storage array. That means that if you want to do any performance benchmarking with tools such as IOMeter, you need to be extremely careful about the way you create your test VMs and how you provision disks. Vendors such as Nutanix have their own tool, X-Ray. However, I would still stick to a more traditional approach.

2. Look at the list of apps you will be looking to run. If you are going to go for a KVM type of hypervisor solution, you need to see if the apps are certified. More importantly, keep an eye on OS certification. While HCI vendors will claim they can and will run anything and everything, you need the certification to come from the app/OS OEM.

3. Use industry-standard benchmarking tools. Remember, unless you are using a less "standard" type of hypervisor such as KVM or Xen, you really don't need to waste your time on the hypervisor part, as VMware is the same anywhere.

4. Your primary interest should be the storage layer, without question, and the distributed architecture. Remember, with HCI the compute does not change and the hypervisor (assuming VMware) does not change; what changes is the storage. Next there are the ancillary elements such as management, monitoring and other integration pieces. Look at these closely.

5. Use workload-specific testing tools. Examples include LoginVSI, jMeter, and Paessler/Bad boy for web server benchmarking.

6. Remember to look at best practices on a per-app basis. The reason I suggest this is the following: you may have been running an app like Oracle in your environment for ages in a monolithic way, but when you try the same app out on HCI it may not give you the performance you want. This has to do with the way the app has been configured/deployed, so looking at app best practices is something to note.

7. If you are looking at DR/backup etc., evaluate your approaches. Are you using a native backup or replication capability, or an external tool? Evaluate these accordingly, and remember your RTO/RPO. Not all HCI will support synchronous replication.

8. If you are looking at native HCI capabilities around data efficiency (inline de-dupe and compression), you will need to design testing for these carefully.

9. Lastly, if you are looking at multiple HCI products, ensure you use a common approach across products; otherwise your comparison will be apples to oranges. Hope this helps.
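On point 9 above (a common approach across products), one practical way to keep the comparison apples-to-apples is to collect the same metrics from every candidate and summarize them in one place. The sketch below assumes each benchmark run (IOMeter, LoginVSI, jMeter, or whatever tool you chose) used an identical VM layout and workload profile on every product, and that you exported per-run results into the simple records shown; the product names, field names and numbers are hypothetical placeholders.

```python
# Sketch of an apples-to-apples summary across HCI POC candidates.
# Assumes identical workload profiles and VM layouts were run on each product
# and results were exported as the simple records below (all values invented).

from statistics import mean

runs = [
    {"product": "Vendor A", "workload": "oltp-70/30", "iops": 48_000, "p99_latency_ms": 4.1},
    {"product": "Vendor A", "workload": "oltp-70/30", "iops": 46_500, "p99_latency_ms": 4.4},
    {"product": "Vendor B", "workload": "oltp-70/30", "iops": 52_000, "p99_latency_ms": 6.8},
    {"product": "Vendor B", "workload": "oltp-70/30", "iops": 50_500, "p99_latency_ms": 7.2},
]

def summarise(runs, workload):
    """Print mean IOPS and mean p99 latency per product for one workload profile."""
    for product in sorted({r["product"] for r in runs}):
        subset = [r for r in runs if r["product"] == product and r["workload"] == workload]
        print(f"{product}: mean IOPS {mean(r['iops'] for r in subset):,.0f}, "
              f"mean p99 latency {mean(r['p99_latency_ms'] for r in subset):.1f} ms "
              f"over {len(subset)} runs")

summarise(runs, "oltp-70/30")
```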
MohamedMostafa1: There are several ways to evaluate HCI solutions before buying. Customers need to contact HCI vendors, or one of the local resellers who propose the same technology. Both HCI vendors and resellers will be able to demonstrate the technology in three different scenarios:

1 – Conduct a cloud-based demo, in which the presenter illustrates product features and characteristics on a ready-made environment and can also demonstrate daily administration activities and reports.

2 – Conduct a hosted POC, in which the presenter works with the customer to build a dedicated environment for them and simulate their current infrastructure components.

3 – Conduct a live POC, in which the presenter ships appliances to the customer's data center, deploys the solution, migrates/creates VMs for testing purposes, and evaluates performance, manageability and reporting.

If the vendor or a qualified reseller is doing the POC, there should be no mistakes, because it's a straightforward procedure.