Kubernetes: the story of innovation


What is Kubernetes?

Kubernetes, also known as K8s, is an open-source system for automating the deployment, scaling, and management of containerized applications. It rolls out changes to your application or its configuration progressively while monitoring application health, so that it doesn’t kill all your instances at the same time. If something goes wrong, Kubernetes will roll back the change for you. You can also take advantage of a growing ecosystem of deployment solutions.
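To make the progressive rollout concrete, here is a minimal sketch in Python that builds a Deployment manifest using the `RollingUpdate` strategy. The application name `web`, the image `example.com/web:1.1`, the `/healthz` path, and port 8080 are all hypothetical placeholders, and the replica and surge numbers are illustrative rather than recommended values.

```python
import json


def rolling_update_deployment(name, image, replicas=4, max_unavailable=1, max_surge=1):
    """Build a Deployment manifest (as a plain dict) that rolls out changes gradually."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "strategy": {
                "type": "RollingUpdate",
                "rollingUpdate": {
                    # At most this many pods may be unavailable during the update.
                    "maxUnavailable": max_unavailable,
                    # At most this many extra pods may be created above `replicas`.
                    "maxSurge": max_surge,
                },
            },
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            # Hypothetical health endpoint: a new pod only counts as
                            # ready (and lets the rollout proceed) once this succeeds.
                            "readinessProbe": {
                                "httpGet": {"path": "/healthz", "port": 8080},
                                "periodSeconds": 5,
                            },
                        }
                    ]
                },
            },
        },
    }


manifest = rolling_update_deployment("web", "example.com/web:1.1")
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with `kubectl apply -f`. If a rollout misbehaves, `kubectl rollout undo deployment/web` returns the Deployment to its previous revision.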

Use cases

Babylon

The story of Babylon is one of the great tales of innovation in the medical AI sector, and Kubernetes is what supported it. Many of Babylon’s products use machine learning models, but the company’s in-house computing power was inadequate to sustain large-scale experimentation.

This called for automation in the form of Kubeflow, a machine learning toolkit for Kubernetes used for deploying machine learning models. Adopting it brought landmark improvements in service: client validation came in faster, and teams no longer had to wait for hours before they could deploy computation.

Based on that experience, Vallée’s team was tasked with building a self-service platform to help Babylon’s AI teams become more efficient, and by extension help get products to market faster. The main requirements: (1) the ability to give researchers and engineers access to the compute they needed, regardless of the size of the experiments they may need to run; (2) a way to provide teams with the best tools that they needed to do their work, on demand and in a centralized way; and (3) the training platform had to be close to the data that was being managed, because of the company’s expansion into different countries.
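Requirement (1) maps onto how Kubernetes lets each workload declare the resources it needs. The sketch below is not Babylon's platform, which the article describes only at a high level. It is a hypothetical illustration of how one helper could describe both a small and a large experiment as Kubernetes Jobs. The job names, image, and command are made up, and the `nvidia.com/gpu` resource assumes a cluster with the NVIDIA device plugin installed.

```python
def training_job(name, image, command, cpu="4", memory="16Gi", gpus=0):
    """Build a Job manifest (as a plain dict) for a single training run."""
    resources = {
        "requests": {"cpu": cpu, "memory": memory},
        "limits": {"cpu": cpu, "memory": memory},
    }
    if gpus:
        # Assumption: GPUs are exposed under this extended resource name.
        resources["limits"]["nvidia.com/gpu"] = gpus
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Do not retry a failed experiment automatically.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "train",
                            "image": image,
                            "command": command,
                            "resources": resources,
                        }
                    ],
                }
            },
        },
    }


# Hypothetical experiments: the same helper covers a small and a large run.
small = training_job("exp-small", "example.com/train:latest", ["python", "train.py"])
large = training_job(
    "exp-large", "example.com/train:latest", ["python", "train.py"],
    cpu="32", memory="256Gi", gpus=4,
)

for job in (small, large):
    limits = job["spec"]["template"]["spec"]["containers"][0]["resources"]["limits"]
    print(job["metadata"]["name"], limits)
```

The scheduler then places each Job on a node that can satisfy its requests, so researchers state what an experiment needs rather than where it should run.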

Once the team decided to build the Babylon AI Research platform on top of Kubernetes, they referred to the Cloud Native Landscape to build out the stack: Prometheus and Grafana for monitoring; an Istio service mesh to control the network on the training platform and control what access all of the workflows would have; Helm to deploy the stack; and Flux to manage the GitOps part of the pipeline.

