Observing Enterprise Kubernetes Clusters At Scale

Observing Kubernetes clusters at scale is difficult. While most companies operate a small number of Kubernetes clusters, Giant Swarm is responsible for many more, in multiple regions. This scale makes maintaining a responsible level of observability harder.

Our infrastructure benefits from what we have learned operating at this level: we have built tooling that automatically manages Prometheus for on-demand Kubernetes clusters, and written new Prometheus exporters to address hard-to-monitor problems.

This talk presents what we have learned about handling observability at scale, with in-depth examples from our infrastructure.


Audience requirements:

Some level of Kubernetes, observability and/or operations experience

Objective of the talk:

To present what we have learned about handling observability of enterprise Kubernetes clusters at scale.

You can view Joe’s slides below:

CLL19 Joe Salisbury



Track 3
Location: Abbey
Date: May 15, 2019
Time: 2:30 pm - 3:15 pm
Joe Salisbury, Giant Swarm