What is our primary use case?
We have a Genomics lab at Clemson and we've been working with Cisco to use CCP to deploy Kubernetes clusters to run our genomics workflows on. We'll create a cluster and then run our workflows on that cluster.
There are two options with the allocation I'm using. There's public cloud (AWS, GCP, etc.), and you can deploy on that. But all of my experience has been on-prem using Cisco's vSphere allocation.
How has it helped my organization?
As a system administrator, it definitely makes it easier for me to manage clusters without having to switch platforms. I don't have to log in to different things and remember different commands or different ways to do things between the different providers, because they all have slightly different platforms. CCP simplifies all that and allows me to become really familiar with just one platform.
The beauty of Kubernetes clusters is that they're pretty much the same wherever they are. So whatever platform you use to deploy them, Kubernetes clusters are pretty much going to be the same as far as the user-experience goes. It doesn't affect the actual usage of Kubernetes but it definitely makes the deployment of clusters pretty straightforward.
From my experience, CCP definitely gives you a few more options in terms of networking configurability. Those options may be present with GKE; I may not have explored them well enough. But CCP allows you to enable some pretty advanced networking stuff and other management services pretty easily.
What is most valuable?
The most valuable feature is definitely the fact that you can use a single platform to deploy to different resource providers. Right now, the version I'm using has vSphere and AWS, but I know in the future they're planning on adding more. The ability to deploy clusters on-prem or to any number of public cloud providers is really valuable because you don't need to relearn or switch platforms to switch resource providers. That is one really cool feature of CCP. It gives us the ability to have a single platform to deploy clusters to multiple resource providers and then manage all of those clusters from that same platform.
There are some other smaller things I like. For example, creating a cluster itself is pretty straightforward. It's very familiar. If you've ever filled out a web form for anything, you can do this.
What needs improvement?
One thing I have not really had the chance to explore too much is the Cisco Container Platform command-line interface. I've been told that it exists and that it's functional, but I'm not sure if it's really made for end-users. It might just be for admins or developers.
One thing that is a little bit annoying about Cisco Container Platform is that for each cluster you create you have to go through the same web form each time. If you're creating two identical clusters, you still have to go through that web form twice. What's really nice about most platforms is that they have command-line interfaces where you can just copy a single command which has all the flags with all the configurations you want and put that in a text file. Then, when you want to create another cluster you can just paste that in and edit one or two flags if you want to. You don't have to go through a web form every time and that is a feature that I would like to see in the future with CCP. It would be nice, at the end, once you create a cluster using the web form, if it would give you a single command that you could copy and put somewhere and then paste it, in the future, to create an identical cluster or an almost identical cluster. I would like the ability to save cluster configurations to CCP.
I've provided that feedback to the development team. There might even be a version that is out which already has that functionality integrated into it. I think it's safe to say that at some point in the future that feature will be provided.
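As a rough illustration of the saved-configuration idea described above, here is a minimal sketch in Python. It does not use CCP's API or CLI, which this review does not describe. The field names (`name`, `provider`, `worker_nodes`, `kubernetes_version`), their values, and the helpers `save_config` and `load_config` are all hypothetical, chosen only to show the pattern of saving one configuration and reusing it with a couple of edits.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical cluster settings; these field names and values are
# illustrative only and are not taken from CCP.
BASE_CONFIG = {
    "name": "genomics-01",
    "provider": "vsphere",
    "worker_nodes": 4,
    "kubernetes_version": "example-version",
}


def save_config(config, path):
    """Write a cluster configuration to a JSON file."""
    Path(path).write_text(json.dumps(config, indent=2))


def load_config(path, **overrides):
    """Load a saved configuration and override a few existing settings."""
    config = json.loads(Path(path).read_text())
    unknown = set(overrides) - set(config)
    if unknown:
        # Reject misspelled setting names instead of silently adding them.
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    config.update(overrides)
    return config


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "cluster.json"
        save_config(BASE_CONFIG, path)
        # An almost identical second cluster: only the name changes.
        second = load_config(path, name="genomics-02")
        print(second)
```

The point of the design is that the saved file plays the role of the "single command" in a text file: it is written once and then reused, with only the one or two settings that differ passed as overrides.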
For how long have I used the solution?
We've been using it for over a year.
What do I think about the stability of the solution?
The stability of CCP has been fine. We've had no outages, no issues. We're using an on-prem resource pool. If there was a data center failure wherever those resources might be located, I'm sure there would be issues. But there have not been any issues for us.
What do I think about the scalability of the solution?
Our lab, as a whole, works with tons of data: petabytes and petabytes of it. But the clusters we're creating using CCP are pretty robust. The largest experiment I've run on CCP had about 1.6 terabytes of intermediate workflow data. Our lab works with a lot more. Given more resources, we could definitely scale up to much larger data sets.
Since we're using on-prem, there are obviously limits to the amount of hardware that's on that on-prem resource store. We haven't needed to exceed that. We have other platforms that we can run larger datasets on, but as far as scalability goes, we haven't tried to scale up too far.
In the near future our plans to increase usage are not in the size of a single cluster. We're doing a demo with Cisco at Internet2's TechEx conference in New Orleans this year. We're going to train 15 to 20 network engineers how to use CCP and how to deploy clusters. We're going to be using a single CCP environment to deploy 15 to 20 clusters at the same time. As far as compute goes, our current resource pool will be able to handle that. The main thing that is going to be a stress test for CCP is having 15 clusters all trying to deploy at once and getting all the networking stuff configured. That will be a really cool test for CCP.
I think it should handle it. All the Cisco points of contact have told me it's good to go. To my knowledge, it should be fine, as long as there are enough allocatable IP addresses. That's one of the main limits: making sure you have enough IP addresses for all those clusters.
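The IP address limit mentioned above can be checked with simple arithmetic before a large batch of deployments. The sketch below uses Python's standard `ipaddress` module. The address block `192.0.2.0/24` (a documentation-only range) and the per-cluster address counts are assumptions made up for illustration, not figures from our environment or from CCP, and the helper names are hypothetical.

```python
import ipaddress


def usable_addresses(cidr):
    """Count host addresses in an IPv4 block, excluding network and broadcast."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - 2, 0)


def pool_is_sufficient(cidr, clusters, addresses_per_cluster):
    """Return True if the pool covers every cluster's address requirement."""
    return usable_addresses(cidr) >= clusters * addresses_per_cluster


if __name__ == "__main__":
    pool = "192.0.2.0/24"  # assumed pool; a /24 has 254 usable addresses
    # 20 clusters at an assumed 10 addresses each need 200 addresses.
    print(pool_is_sufficient(pool, clusters=20, addresses_per_cluster=10))
    # At an assumed 15 addresses each they need 300, more than the pool holds.
    print(pool_is_sufficient(pool, clusters=20, addresses_per_cluster=15))
```

How many addresses a cluster actually needs depends on its node count and networking setup, so the per-cluster figure is the number to confirm with whoever manages the address pool.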
Which solution did I use previously and why did I switch?
We've been given credits by Google, so we've been using Google Kubernetes Engine to deploy Kubernetes clusters that we do research on. The other major thing that we use is the Pacific Research Platform. It's a pretty robust Kubernetes cluster that was originally based out of California. A bunch of schools in California decided to link up nodes and create this big cluster. Some people are calling it the National Research Platform now because it has spread across the country. There is no cost, so it's really nice for us to use.
How was the initial setup?
I do not have experience actually installing CCP on the resources themselves. I got on a call with some guys from Cisco and they showed me the software stack that supports CCP. They use VMware's vSphere and, under that, UCS, but I can't comment on how easy it is to deploy CCP.
I'm sure it's possible for just one person to deploy it.
In terms of maintenance, within our lab I'm the cloud architect. I'm the main point of contact with Cisco as far as engineering development with CCP and the partnership with Clemson University go. I'm the only one who is creating and managing these clusters. Everyone else is focusing on the genomics, on the science, and on that genetics research. There are about six people who are using these clusters. They're the people who are actually running the workflows, pulling the data, and trying to analyze it to figure out gene interactions that contribute to causing cancer.
What was our ROI?
We haven't really made an investment because it is a partnership between Clemson University and Cisco. We've invested time into learning how to use CCP and we've had a very beneficial return on that investment. But Clemson, and our lab, have not directly invested any money in this platform. In terms of time savings and the ability to use this resource pool that Cisco provided to us, it has definitely been a very beneficial return on investment.
Because we're an academic institution, we have access to a lot of free, nationally available cyber infrastructure that we can do research on at no cost. So CCP hasn't directly saved us any money.
Which other solutions did I evaluate?
It depends on where you put your value. If you put your value in time savings as far as management and administration go, you will get a lot of value from Cisco Container Platform because it allows you to manage multiple clusters from multiple resource providers from one location. That will definitely save you a lot of time, in the long run, if you're an admin who is responsible for a lot of different clusters on a lot of different resources.
But certain public cloud platforms do have features that will save you money in terms of compute billing that I haven't seen yet with CCP. An example of that would be preemptible virtual machines. I know AWS and Google, and perhaps Microsoft, all offer deployment of Kubernetes clusters with virtual machines that can be preempted. So if another process needs it, it can take your node from you. That's fine because Kubernetes is built to handle node preemption. It's a very robust system. It actually saves you a ton of money on compute. I don't think CCP offers preemptible VMs with public cloud yet. But I'm not totally sure about the newest version or editions.
Another thing that will save you a lot of money on other public cloud platforms is auto-scaling. If you're using the Google Kubernetes Engine or Amazon's EKS, it will automatically either scale up the nodes if your platform is getting used more or scale down the nodes if they aren't getting used. That, combined with preemptible VMs, will save you a lot of money in your compute billing.
What other advice do I have?
To use CCP you have to be familiar with Kubernetes and how it works. I've obviously learned a lot about how to use CCP, which I think will be very valuable in the future, but I haven't really learned anything directly about Kubernetes. I did learn the value of certain features that public providers have, like autoscaling and preemptible VMs. I saw why those are so awesome.
My advice would be to be familiar with what you want to do before you try to implement it. I would suggest utilizing CCP if you intend to deploy and manage a lot of clusters from different resource providers.
I've never built a Kubernetes cluster completely from source. I have deployed Kubernetes clusters using AWS and GKE. I see them as pretty much the same. I don't think any one of AWS, GKE, and CCP is clearly easier than the others.
We don't have to deploy clusters that often. I either change or deploy the cluster on CCP about once a month. It's not like it's saving me tons of time every day. But compared to deploying a cluster on, let's say, Google, it's probably about the same if you're using the actual graphical interface on their respective websites. It doesn't really save me too much time.
Right now, we only use one resource provider and it is on-prem. But let's say I had a cluster on-prem and a cluster on AWS that I created with CCP. That's where the time savings would be in terms of managing the clusters from the same platform. That would definitely save me a lot of time.
We do have a couple of genomic workflows that are GPU compatible, but we have not had a chance to use any with CCP. That's because the resources we have don't include GPUs.
I'm still getting into it. We don't really need anything too advanced. For running our genomics workflows, we just need to have a Kubernetes cluster that can access the internet. There are no really complex management services or networking services that we use.
CCP has a lot of room to grow. I don't think the version I'm using is by any means its final version. I'm not sure how many official releases they've done or what version they intend to put into full production, as a product. But I would rate the version I have, as a development version, as an eight out of ten. If it becomes what I want it to become, what I think it's going to become in the coming months and years, I would rate it a ten.