Cisco UCS B-Series Review

It changed our mindset to abstract the server, making it a stateless object for workloads.


What is most valuable?

Why pick a UCS blade over a Dell, HPE, or Lenovo system? The answer depends on what application I need to run. If I want a small-scale, three-to-four-server application space in a localized area, I want rack-mount servers for their price advantage. If I need a larger-scale virtualized environment, I prefer blades; and for the lowest OpEx as I scale out, I find Cisco's UCS lets me manage a larger footprint with fewer people.

How has it helped my organization?

Previously, we focused on CPUs and servers, relying on the Intel cadence for change. With Cisco UCS, we became network-centric and changed our mindset to abstract the server, making it a stateless object for workloads. Managing blade servers logically lets us take full advantage of Moore's law, which started us at 640 cores per fabric and now provides 5,760 cores of B200 M4 blades in our standard 20-chassis pods: more workloads per pod, and fewer people to manage them. This has significantly reduced our OpEx.
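As a back-of-the-envelope check on those numbers, here is a minimal sketch. Only the chassis counts and the 640 and 5,760 totals come from our pods; the blades-per-chassis and per-socket core counts are assumptions chosen to reproduce those totals.

```python
# Rough core arithmetic for a UCS fabric/pod. The chassis counts and the
# 640 / 5760 totals are from the review; blades-per-chassis and per-blade
# core counts are assumptions that make the numbers line up.

BLADES_PER_CHASSIS = 8                           # half-width blades (assumed)

# 2009: fabrics initially supported 10 chassis; dual quad-core blades (assumed)
cores_2009 = 10 * BLADES_PER_CHASSIS * 2 * 4     # -> 640 cores per fabric

# Now: 20-chassis pods of B200 M4 blades with dual 18-core sockets (assumed)
cores_m4 = 20 * BLADES_PER_CHASSIS * 2 * 18      # -> 5760 cores per pod

print(cores_2009, cores_m4)                      # 640 5760
```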

What needs improvement?

Cisco lags other vendors in SSD qualifications and the options allowed, but that is in keeping with its philosophy of a stateless working environment. If I add a unique storage attribute to my blades, I encumber them with state that requires manual intervention to move workloads around.

SSD evolution is coming hard and fast, with higher-density, lower-cost options popping up each quarter. New form factors and technologies such as M.2, U.2, multi-terabyte drives, NVMe, and now early signs of Optane are emerging across a range of price points, turning the once-stolid server domain into the Wild West. Dell and HPE have field-qualification processes with vendors such that new products are available for use in their servers very soon after they begin shipping.

The process is slower for UCS because Cisco must perform extensive validation to assure compatibility with UCS Manager: does the device respond in time to the blade controller logic, and are there timeout issues that might raise type 1 or type 2 fault errors in UCS Manager? Hence, the array of new SSD products is fuller for HPE and Dell than for Cisco.

This goes to the core difference in architectural philosophy between the legacy server vendors and Cisco, whose approach calls for a stateless environment leveraging networked storage, so that any workload can readily be moved to a new server when a more powerful system is deployed or a fault occurs on the old one. If an HPE blade boots locally from a new 1 TB SSD, then you cannot move that workload remotely to a new 2-socket, 36-core blade. A technician has to go on site to physically pull the boot SSD from the older blade, insert it into the new blade, and confirm they got the right one. This adds labor cost and slows down the upgrade process, increasing the OpEx of managing the legacy infrastructure.
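To make the stateful-versus-stateless distinction concrete, here is a hypothetical toy model. This is not the real UCS Manager API; it just illustrates the idea that a service profile carries the workload's identity and boot policy, so it can be re-associated with a new blade remotely, while local boot media pins the workload to a physical slot.

```python
# Hypothetical toy model (not the real UCS Manager API) of why a stateless
# service profile moves freely between blades while a local boot SSD does not.

from dataclasses import dataclass

@dataclass
class Blade:
    slot: str
    cores: int
    local_boot_ssd: bool = False   # stateful: boot media lives in this blade

@dataclass
class ServiceProfile:
    """Identity (MACs, WWNs, boot policy) abstracted away from the hardware."""
    name: str
    boot_from_san: bool = True

def migrate(profile: ServiceProfile, old: Blade, new: Blade) -> str:
    if not profile.boot_from_san and old.local_boot_ssd:
        # State is trapped in the old blade: a technician must move the SSD.
        return f"{profile.name}: manual intervention, pull SSD from {old.slot}"
    # Stateless: re-associate the profile; the workload boots on the new blade.
    return f"{profile.name}: re-associated to {new.slot} remotely"

stateless = ServiceProfile("web-tier")
stateful = ServiceProfile("db-tier", boot_from_san=False)
old = Blade("chassis-1/slot-3", cores=16, local_boot_ssd=True)
new = Blade("chassis-2/slot-5", cores=36)
print(migrate(stateless, old, new))   # moves remotely
print(migrate(stateful, old, new))    # needs a site visit
```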

For how long have I used the solution?

We have used this solution since its inception in 2009.

What was my experience with deployment of the solution?

The change in mindset, from building stateful servers to stateless devices managed across an intelligent fabric with logical abstraction, took about a month for operations to come up to speed on; there has been no looking back since.

What do I think about the stability of the solution?

We went through the usual teething pains of any new system. In particular, once we had our operational epiphany about its potential, we were limited by how fast features could be added to UCS Manager. With the XML extensions, UCS Central (a manager of managers), and UCS Director (automation), we now have enough on our plate.
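For a sense of what those XML extensions enable, here is a minimal sketch using Cisco's ucsmsdk Python SDK, which wraps the UCS Manager XML API. The host, credentials, and the choice of attributes to print are illustrative assumptions, not our environment.

```python
# Minimal sketch of querying UCS Manager through its XML API via Cisco's
# ucsmsdk SDK. Host and credentials below are placeholders (assumptions).

from ucsmsdk.ucshandle import UcsHandle

handle = UcsHandle("ucsm.example.com", "admin", "password")  # hypothetical
handle.login()

# List every blade object the domain knows about by its XML class ID.
for blade in handle.query_classid("ComputeBlade"):
    print(blade.dn, blade.num_of_cores, blade.oper_state)

handle.logout()
```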

What do I think about the scalability of the solution?

Early on, we encountered scalability issues: UCS was supposed to support 40 chassis, but it only did 10, later increased to 20. Twenty chassis (160 servers) is more than enough, as Moore's law, increased CPU core counts, and higher network bandwidth all made it possible to place more workloads in a pod than we were comfortable with. So, it rapidly caught up.

How are customer service and technical support?

Customer Service:

Customer service is excellent.

Technical Support:

Technical support is excellent. Cisco understands what is needed, and it plays to their networking strengths. Ironically, most of my previous rack-system problems came down to network constraints, as we ran into switch-domain boundaries, VLAN mapping issues, and so forth: basic blocking and tackling for Cisco.

Which solutions did we use previously?

We previously used HPE. They had a good blade system and good racks, but their iLO is expensive and gets very complex at scale.

How was the initial setup?

Initial setup was straightforward. More time was spent educating us on UCS Manager, the logical tooling, service profiles, and the other elements of automated provisioning than on physical connectivity, which is child's play.

What about the implementation team?

We bought through a vendor, who showed us how to set things up and shared some tricks of the trade to short-circuit the learning process. After a few months, we were cruising at scale.

What was our ROI?

ROI is not something we share, but I will note that we now use two people to manage 1,600 servers in two remote data centers. These span 25 domains that can all be seen at once and, as alerts come in, drilled into and addressed from a web console.
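For scale, here is the back-of-the-envelope math behind that staffing claim; the 160-servers-per-domain ceiling comes from the scalability discussion above, and the rest are the figures just quoted.

```python
# Rough math on the staffing and domain figures quoted above. The 160
# servers-per-domain ceiling is from the earlier scalability discussion.

servers, admins, domains = 1600, 2, 25
print(servers // admins)     # 800 servers per administrator
print(servers // domains)    # 64 servers per domain on average
print(domains * 160)         # 4000-server ceiling across 25 domains
```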

Which other solutions did I evaluate?

Before choosing, we also evaluated HPE, Dell, and IBM. We found that, aside from physical differences, they all shared the same architecture and OpEx profile: external management, local switch infrastructure in each chassis, complex routing rules when scaling domains, and challenges in provisioning new units. Once we learned the "UCS Way," we were more efficient.

Disclosure: My company has a business relationship with this vendor other than being a customer: My company and Cisco are partners.
1 Comment
Juan Dominguez, Consultant

Cisco UCS is definitely a system that overcomes the competition from many angles. Its single-pane management and policy-driven format are at the top of the field. I have built and deployed HP and Dell systems; by far, Cisco UCS is the most flexible and scalable, in my opinion. Excellent content in your write-up.

21 February 17