Kubernetes didn’t just appear overnight.
It embodies 15 years of Google’s hard-earned practical know-how, accumulated while running billions of containers.

What this article covers
- The background and core philosophy behind Google’s internal cluster management system, Borg
- The experimental attempts of Omega, designed to overcome Borg’s limitations
- How Kubernetes, released to the world as open source, became the cloud standard
- A comparison of the key differences between the three systems
Introduction / Background: Why did Google create this?
Have you ever considered that the services we use today (Gmail, YouTube, Google Search) handle hundreds of millions of requests simultaneously?
In the early 2000s, Google faced an enormous problem. As services exploded, manually managing thousands of servers became physically impossible. A single mistake in server configuration could cause outages, resources were wasted, and engineers suffered through sleepless nights responding to failures.
From this pain emerged Google’s trilogy of cluster management systems: Borg → Omega → Kubernetes.
What is a Cluster? A collection of multiple servers (nodes) grouped into a single pool of computing resources. The concept is to treat the entire cluster as one giant computer, rather than managing individual servers.

1st Generation: Borg, “Google’s Secret Weapon” (2003~)
What is Borg?
Borg is a large-scale cluster management system that Google has developed and operated internally since 2003. It automatically deploys and manages hundreds of thousands of jobs across thousands of machines.
Its name comes from the alien species ‘Borg’ in Star Trek. Just as the Borg assimilate individual beings into a Collective, the philosophy was to operate thousands of individual servers as one giant collective.
Borg’s Core Concepts
Borg classifies jobs into two types:
- Prod (Production): Latency-sensitive services. Real-time services like Gmail and Search. Always top priority.
- Non-Prod (Batch): Batch jobs like large-scale data processing. Lower priority.
This classification was key. Borg minimized resource waste by bin packing Prod and Non-Prod jobs together on the same physical server. Batch jobs opportunistically used CPU that Prod jobs left idle, and immediately vacated those resources when Prod jobs needed them.
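The mixed placement described above can be sketched as a toy single-machine scheduler. This is a minimal illustration and not Borg’s actual algorithm: the `Task` and `Machine` classes, the CPU numbers, and the eviction rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    cpu: float   # CPU cores requested (hypothetical unit)
    prod: bool   # True for latency-sensitive Prod, False for batch


@dataclass
class Machine:
    capacity: float
    tasks: list = field(default_factory=list)

    def free(self) -> float:
        return self.capacity - sum(t.cpu for t in self.tasks)


def place(machine: Machine, task: Task) -> list:
    """Place a task and return the batch tasks evicted to make room.

    Prod tasks may evict batch tasks; batch tasks only use spare capacity.
    Raises RuntimeError if the task cannot fit.
    """
    evicted = []
    if task.prod:
        # Evict batch tasks one at a time until the Prod task fits.
        batch = [t for t in machine.tasks if not t.prod]
        while machine.free() < task.cpu and batch:
            victim = batch.pop()
            machine.tasks.remove(victim)
            evicted.append(victim)
    if machine.free() < task.cpu:
        # Put evicted tasks back so a failed placement changes nothing.
        machine.tasks.extend(evicted)
        raise RuntimeError(f"{task.name} does not fit")
    machine.tasks.append(task)
    return evicted


m = Machine(capacity=4.0)
place(m, Task("search", cpu=2.0, prod=True))
place(m, Task("log-crunch", cpu=2.0, prod=False))   # batch fills the idle CPU
evicted = place(m, Task("gmail", cpu=1.5, prod=True))
print([t.name for t in evicted])   # the batch task gives way to Prod
```

In a real cluster the evicted batch task would be rescheduled elsewhere rather than simply dropped; the sketch only shows the priority rule on one machine.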
What Borg solved
| Problem | Borg’s Solution |
| --- | --- |
| Manual management of thousands of servers | Central scheduler for automatic placement |
| Manual recovery during service failures | Automatic restart, rescheduling |
| Resource waste | Maximized utilization with mixed Prod/Batch placement |
| Lack of deployment automation | Declarative deployment with Job definition files |
### Borg’s Architecture
Borg consists of a BorgMaster (central control plane) and Borglet (agent on each node).
[BorgMaster] ─── scheduling decisions
     │
     ├── [Borglet] → server node 1
     ├── [Borglet] → server node 2
     └── [Borglet] → server node N
When BorgMaster instructs “run this job on that server,” Borglet actually executes it and reports its status.
Borg’s Limitations
However, Borg had structural problems:
- Monolithic BorgMaster: All decisions are made by a single BorgMaster. As the system grew, bottlenecks became severe.
- Job-centric design: Borg’s basic unit is a ‘Job’ containing ‘Tasks’. Expressing relationships between jobs was awkward.
- IP address sharing: Tasks on the same server shared IP addresses, leading to frequent port conflicts.
- Legacy accumulation: Because Borg had been used internally for so long, backward-compatibility requirements made structural improvements difficult.
2nd Generation: Omega, “Redesigning from Scratch” (2008~)
Purpose of Omega’s Birth
Feeling Borg’s limitations, Google engineers began an experimental project called Omega around 2008. It started with the question, “What if we designed it correctly from the beginning, instead of fixing Borg?”
Omega was indeed operated internally at Google, but it couldn’t completely replace Borg. Borg was too deeply rooted in Google’s infrastructure.
Omega’s Innovation: Shared State Architecture
Borg’s biggest problem was its centralized scheduler. A single BorgMaster making all decisions limited scalability.
Omega solved this with a Shared State approach:
Existing Borg:
[BorgMaster] → all decisions → [Borglets]
(bottleneck occurs)
Omega:
[Scheduler A] ──┐
[Scheduler B] ──┼── [shared state store] → [nodes]
[Scheduler C] ──┘
(optimistic concurrency control)
Multiple schedulers simultaneously observe the cluster state and make scheduling decisions independently. Conflicts are resolved using Optimistic Concurrency Control.
π‘ Optimistic Concurrency Control: A method that “assumes conflicts are rare and proceeds with operations, retrying if a conflict is later detected.” It offers higher throughput than pessimistic methods (which acquire locks first).
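The retry-on-conflict idea can be sketched in a few lines. This is a simplified illustration, not Omega’s implementation: the `SharedState` class, the single whole-state version counter, and the "most free CPU" placement rule are assumptions chosen to keep the example short.

```python
class Conflict(Exception):
    """Raised when a commit is based on a stale view of the cluster state."""


class SharedState:
    """Toy shared cluster state: free CPU per node plus a version counter."""

    def __init__(self, free_cpu):
        self.free_cpu = dict(free_cpu)
        self.version = 0

    def snapshot(self):
        # Schedulers work on a private copy; no lock is held while they decide.
        return self.version, dict(self.free_cpu)

    def commit(self, based_on, node, cpu):
        # Validate at commit time: reject if the state changed since the snapshot.
        if based_on != self.version:
            raise Conflict("state changed since snapshot")
        self.free_cpu[node] -= cpu
        self.version += 1


def schedule(state, cpu, max_retries=5):
    """Pick the node with the most free CPU, retrying on conflict."""
    for _ in range(max_retries):
        version, view = state.snapshot()
        node = max(view, key=view.get)
        if view[node] < cpu:
            return None  # nothing fits in the current view
        try:
            state.commit(version, node, cpu)
            return node
        except Conflict:
            continue  # another scheduler won the race; re-read and retry
    return None


state = SharedState({"node-1": 4, "node-2": 2})
stale_version, _ = state.snapshot()        # scheduler A takes a snapshot
schedule(state, cpu=3)                     # scheduler B commits first
try:
    state.commit(stale_version, "node-1", 3)   # A's commit is now stale
except Conflict:
    retried = schedule(state, cpu=2)       # A retries against fresh state
print(retried, state.free_cpu)
```

A single version counter rejects any concurrent change, even to unrelated nodes. It keeps the sketch small, but a production system would want finer-grained conflict detection.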
### Omega’s Legacy
While Omega itself didn’t replace Borg, its core ideas significantly influenced Kubernetes:
- Treating resources as first-class objects: Explicitly specifying and tracking CPU, memory
- Scheduler decoupling: Breaking away from single-scheduler dependency
- Enhanced Declarative API: A system that adjusts to the desired state declared by the user
3rd Generation: Kubernetes, “The Open Source that Changed the World” (2014~)
Why did Google open source it?
With the emergence of Docker in 2013, container technology began to gain widespread adoption. Google realized, “If we open source the know-how we’ve accumulated over a decade, we can establish the standard for the cloud ecosystem.”
In June 2014, Google released Kubernetes as open source. The name means ‘helmsman’ (the person who steers a ship) in Greek, which is why its logo is a ship’s wheel (helm).
Applying Lessons from Borg to Kubernetes
In their 2016 paper “Borg, Omega, and Kubernetes,” Google engineers detailed how Borg’s operational experience influenced Kubernetes’ design:
Borg’s Mistakes → Kubernetes’ Solutions
| Borg’s Problem | Kubernetes’ Solution |
| --- | --- |
| Job/Task-centric structure, difficulty expressing relationships | Introduction of the Pod concept. Redefining the unit as a container group |
| Tasks sharing IPs, port conflicts | Assigning a unique IP to each Pod |
| Monolithic BorgMaster | Decoupled components (API Server, Scheduler, Controller Manager) |
| Internal-only design | Open API centric, extensible plugin structure |
### Kubernetes’ Core Architecture
[Control Plane]
├── API Server → gateway for all requests
├── Scheduler → decides which node each Pod is placed on
├── Controller Mgr → keeps the current state matching the desired state
└── etcd → stores the entire cluster state (distributed KV store)
[Worker Nodes]
├── kubelet → same role as Borglet; runs and manages containers
├── kube-proxy → manages network rules
└── Container Runtime (Docker, containerd, etc.)
### Declarative API: Kubernetes’ Philosophy
Kubernetes’ most powerful philosophy is Declarative Management.
# Declare the desired state
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-webserver
spec:
  replicas: 3 # Always run 3 instances
  selector:
    matchLabels:
      app: webserver
  template:
    metadata:
      labels:
        app: webserver
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          resources:
            requests:
              cpu: "250m" # 0.25 CPU request
              memory: "128Mi" # 128MB memory request
            limits:
              cpu: "500m" # Max 0.5 CPU
              memory: "256Mi" # Max 256MB
When this YAML is applied, Kubernetes automatically runs 3 nginx Pods and automatically recovers if one dies. Engineers only need to declare “what” they want, not “how.”
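The "declare what, not how" behavior rests on a control loop that repeatedly compares desired and observed state. The sketch below is a toy version of that idea, not the real Deployment controller: the `reconcile` function, the in-memory `pods` set, and the `start`/`stop` callbacks are hypothetical stand-ins for the cluster.

```python
import itertools


def reconcile(desired_replicas, running, start_pod, stop_pod):
    """One pass of a toy control loop: move observed state toward desired."""
    diff = desired_replicas - len(running)
    if diff > 0:
        # Too few Pods: start the missing ones.
        for _ in range(diff):
            running.add(start_pod())
    elif diff < 0:
        # Too many Pods: stop the surplus.
        for name in sorted(running)[:-diff]:
            stop_pod(name)
            running.discard(name)
    return running


counter = itertools.count(1)
pods = set()


def start():
    # Hypothetical stand-in for asking a node to launch a container.
    return f"my-webserver-{next(counter)}"


def stop(name):
    # Hypothetical stand-in for terminating a container.
    pass


reconcile(3, pods, start, stop)     # initial rollout: 3 Pods
pods.discard("my-webserver-2")      # simulate one Pod dying
reconcile(3, pods, start, stop)     # the loop notices and replaces it
print(sorted(pods))
```

The caller never says "restart Pod 2"; it only restates that three replicas should exist, and the loop works out the difference on each pass.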
The Ecosystem Created by Kubernetes
What makes Kubernetes truly great is not just the technology itself. It created an ecosystem.
- 2015: CNCF (Cloud Native Computing Foundation) established. Google donated Kubernetes.
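- 2016: Helm (package manager) emerged, and managed Kubernetes services from cloud vendors began to spread (GKE first, with AKS and EKS following in 2017 and 2018).
- 2017: Vendors of competing orchestrators such as Docker Swarm and Mesos added Kubernetes support, effectively recognizing Kubernetes as the de facto standard.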
- 2018 onwards: Hundreds of tools like Service Mesh (Istio), GitOps (ArgoCD), Serverless (Knative) operate on top of Kubernetes.
Cautions / Common Mistakes
1. Kubernetes is not an open-source version of Borg. Kubernetes is a system redesigned from scratch based on lessons learned from Borg. It’s not Borg’s code open-sourced as-is.
2. Omega did not replace Borg either. Many of Omega’s ideas, such as multiple schedulers, were folded back into Borg, and Borg still underpins Google’s core infrastructure.
3. Kubernetes is more than just container orchestration. Many people understand it only as a “tool for managing multiple Docker containers,” but Kubernetes is a platform for declaratively managing the entire cloud-native infrastructure. It encompasses networking, storage, security policies, and service discovery.
Summary / Conclusion
| System | Period | Public Status | Key Innovation | Limitation |
| --- | --- | --- | --- | --- |
| Borg | 2003~ | Internal Only | Large-scale automation, mixed Prod/Batch | Monolithic, scalability limits |
| Omega | 2008~ | Internal Only | Shared state, multiple schedulers | Failed to replace Borg |
| Kubernetes | 2014~ | Open Source | Declarative API, plugin ecosystem | High learning curve |
Google’s 15-year journey is more than a simple technological evolution. It is the hard-won lessons of operating billions of containers, released to the world in the form of Kubernetes.
When you execute a single line of `kubectl apply -f deployment.yaml`, the wisdom accumulated by Google engineers over 15 years is at work behind it.
For the next steps, studying Kubernetes’ scheduling mechanisms, etcd’s Raft consensus algorithm, and Service Mesh (Istio) will help you better understand the depth of this ecosystem.