The ISOLATION: Masterclass
1. The Core Philosophy: What is a Container?
A container is not a virtual machine. There is no hypervisor, and there is no guest operating system. A container is simply a standard Linux process that the kernel has placed inside an invisible box. The kernel lies to the process about what it can see and what it can use.
This “box” is built using two distinct Linux kernel features:
- Namespaces: Control what a process can see (Isolation).
- Cgroups (Control Groups): Control what a process can use (Resource limits like CPU and RAM).
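
Both halves of the box are visible from user space. As a minimal sketch, assuming a Linux host with `/proc` mounted, the following prints the namespace identifiers and cgroup membership of the current shell. The helper name `show_isolation` is made up for this illustration.

```shell
#!/bin/sh
# show_isolation <pid>: print the namespaces and cgroup of a process.
# (Hypothetical helper for illustration; it only reads files under /proc.)
show_isolation() {
    for ns in pid net ipc mnt uts user; do
        # Each entry is a symlink such as "pid:[4026531836]". Two processes
        # share a namespace exactly when these identifiers are equal.
        printf '%-5s %s\n' "$ns" "$(readlink "/proc/$1/ns/$ns")"
    done
    # The cgroup path decides which CPU and memory limits apply.
    if [ -r "/proc/$1/cgroup" ]; then
        cat "/proc/$1/cgroup"
    fi
}

show_isolation "$$"
```

Running the same function inside a container and on the host should show different identifiers for every namespace the container does not share.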
2. Linux Namespaces & Docker Equivalents (Deep Dive)
Docker is essentially a high-level wrapper around these kernel features. By default, Docker gives every container its own new set of namespaces (user namespaces are the exception and must be enabled separately). However, you can use flags to intentionally break this isolation and share namespaces.
A. PID Namespace (Process ID)
Isolates the process tree. In an isolated PID namespace, the container’s first process sees itself as PID 1 (the init process).
- Isolated (Default): `docker run -d nginx` (Nginx is PID 1 inside the container).
- Host Sharing (`--pid=host`): The container sees every process running on the host machine.
  - Command: `docker run -it --pid=host alpine top`
  - Use Case: Running a monitoring agent (like Datadog or Prometheus Node Exporter) as a container so it can watch the host’s processes.
- Container Sharing (`--pid=container:<name>`): Container B sees all processes inside Container A.
  - Command: `docker run -it --pid=container:web-app alpine ps aux`
  - Use Case: Attaching a debugging container to a broken production container to run `strace` or kill hanging processes without installing debugging tools in the production image. (Depending on the host’s security settings, `strace` may also need `--cap-add=SYS_PTRACE` on the debugging container.)
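
One way to check which of these modes a running container is in is to compare namespace identifiers from the host. The sketch below assumes a container named `web-app` (the name used in the example above), a local Docker daemon, and root privileges to read another process’s `/proc` entries. `container_ns` and `same_pid_ns` are hypothetical helpers, not Docker commands.

```shell
#!/bin/sh
# container_ns <container> <ns>: print a namespace identifier of the
# container's main process, as seen from the host.
container_ns() {
    # .State.Pid is the host-side PID of the process that is PID 1 inside.
    cns_pid=$(docker inspect --format '{{.State.Pid}}' "$1") || return 1
    readlink "/proc/$cns_pid/ns/$2"
}

# same_pid_ns <container>: succeed if the container shares the PID
# namespace of the calling shell (the case with --pid=host).
same_pid_ns() {
    [ "$(container_ns "$1" pid)" = "$(readlink "/proc/$$/ns/pid")" ]
}
```

Run as root on the host, `same_pid_ns web-app && echo shared || echo isolated` would be expected to report `isolated` for a default container and `shared` for one started with `--pid=host`.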
B. NET Namespace (Network)
Isolates network interfaces, routing tables, and IP addresses.
- Isolated (Default): Container gets its own private IP on the default Docker bridge network (`172.17.x.x`).
- Host Sharing (`--net=host`): Bypasses the Docker bridge entirely. The container uses the host’s network stack.
  - Command: `docker run -d --net=host nginx`
  - Result: Nginx binds directly to port 80 of the host (here, an EC2 instance). No port mapping (`-p`) is needed. This avoids the bridge and NAT overhead, but causes port conflicts if two containers want port 80.
- Container Sharing (`--net=container:<name>`): Container B uses Container A’s IP address and network stack.
  - Command: `docker run -d --net=container:my-db my-app`
  - Result: Both containers share the exact same IP. `my-app` can talk to `my-db` simply by calling `localhost`.
C. IPC Namespace (Inter-Process Communication)
Isolates System V IPC objects (including shared memory segments) and POSIX message queues.
- Isolated (Default): Container has its own private `/dev/shm`.
- Host Sharing (`--ipc=host`): Container shares IPC objects and shared memory directly with the host operating system.
- Container Sharing (`--ipc=container:<name>`): Two containers share the same IPC namespace and `/dev/shm`. On daemons whose default IPC mode is `private`, the container being joined must have been started with `--ipc=shareable`.
  - Command: `docker run -d --ipc=container:postgres cache-layer`
  - Use Case: Extremely high-throughput data processing where routing data through the network stack (even localhost) is too slow. Both containers read and write the same shared-memory segments.
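
As a rough sketch of container sharing, the following starts a donor container that writes into `/dev/shm` and a second container that joins its IPC namespace to read the file back. The container name `ipc-donor`, the file `/dev/shm/msg`, and the helper `ipc_demo` are hypothetical. The `--ipc=shareable` flag on the donor matters when the daemon’s default IPC mode is `private`.

```shell
#!/bin/sh
# ipc_demo: share /dev/shm between two containers (illustrative only).
ipc_demo() {
    # Donor: owns the IPC namespace and allows other containers to join it.
    docker run -d --name ipc-donor --ipc=shareable alpine \
        sh -c 'echo hello > /dev/shm/msg; sleep 300' || return 1
    # Give the donor a moment to start and write the file.
    sleep 1
    # Joiner: enters the donor's IPC namespace and reads the same /dev/shm.
    docker run --rm --ipc=container:ipc-donor alpine cat /dev/shm/msg
}
```

Call `ipc_demo` on a host with Docker, then clean up with `docker rm -f ipc-donor`.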
D. MNT Namespace (Mount / Filesystem)
Isolates the filesystem mount points. This is why a container cannot see your host’s `/etc` or `/home` directories.
- Sharing mechanism: Volumes and Bind Mounts.
  - Command: `docker run -v /host/path:/container/path nginx`
E. UTS Namespace (UNIX Timesharing System)
Isolates the hostname and domain name.
- Sharing mechanism: Docker lets you set the hostname explicitly, or share the host’s UTS namespace with `--uts=host`.
  - Command: `docker run --hostname=my-custom-server alpine`
F. User Namespace (UID/GID)
Maps a user inside the container to a different user outside the container.
- Logic: You can run a process as `root` (UID 0) inside the container, but the kernel maps it to an unprivileged user (for example, UID 10000) outside the container on the host. If the process escapes the container, it has no real host privileges.
  - Command: Configured at the Docker daemon level via `--userns-remap`.
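
The kernel exposes the mapping in `/proc/<pid>/uid_map`, where each line holds three numbers: the first UID inside the namespace, the first UID outside it, and the length of the range. As a minimal sketch, the hypothetical helper `host_uid` below translates a container UID into the host UID using such a mapping. The sample range starting at 100000 is an assumed value, not one Docker guarantees.

```shell
#!/bin/sh
# host_uid <uid-inside> [uid_map-file]: translate a UID seen inside a user
# namespace into the UID the host kernel uses for it.
# Exits with status 1 if the UID is not covered by any mapped range.
host_uid() {
    awk -v uid="$1" '
        # Columns: first UID inside, first UID outside, length of the range.
        $1 <= uid && uid < $1 + $3 { print $2 + (uid - $1); found = 1; exit }
        END { if (!found) exit 1 }
    ' "${2:-/proc/self/uid_map}"
}

# Sample mapping (assumed values): container UIDs 0-65535 map to host
# UIDs 100000-165535. "-" tells awk to read the mapping from stdin.
printf '0 100000 65536\n' | host_uid 0 -      # prints 100000
printf '0 100000 65536\n' | host_uid 1000 -   # prints 101000
```

Called with only a UID, the helper reads the mapping of the current process, which is a quick way to see whether a shell is running inside a remapped user namespace.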
3. Kubernetes Isolation Logic
Kubernetes (K8s) takes Docker’s concepts and scales them up. However, the Linux kernel has no idea what a “Pod” is. A Pod is a K8s abstraction.
A. K8s Namespaces vs. Linux Namespaces
Do not confuse these.
- Linux Namespaces: Kernel-level isolation of processes (PID, NET, IPC).
- K8s Namespaces: Cluster-level logical isolation (e.g., `dev-namespace`, `prod-namespace`). They are just a naming scope recorded in the K8s database (etcd), used to group resources and apply permissions (RBAC). They do not provide kernel-level security isolation by default.
B. How a Pod is Implemented (The “Pause” Container)
To create a “Pod”, Kubernetes spins up a hidden, extremely lightweight container called the Pause Container (or Infrastructure Container).
- K8s creates the Pause container. The kernel assigns it a Network namespace and an IPC namespace.
- K8s then spins up your actual application containers (Backend, Frontend).
- K8s injects your containers into the Pause container’s namespaces using the equivalent of `--net=container:pause` and `--ipc=container:pause`.
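
The same wiring can be approximated with plain Docker commands. This is only a sketch of the idea, not what the kubelet actually runs: the image tag `registry.k8s.io/pause:3.9`, the container names, and the helper `fake_pod` are assumptions for illustration.

```shell
#!/bin/sh
# fake_pod: imitate a two-container Pod with plain Docker (illustrative only).
fake_pod() {
    # 1. The pause container owns the network and IPC namespaces.
    docker run -d --name pause --ipc=shareable registry.k8s.io/pause:3.9 || return 1
    # 2. The application container joins both namespaces.
    docker run -d --name backend \
        --net=container:pause --ipc=container:pause nginx || return 1
    # Give nginx a moment to start listening.
    sleep 2
    # 3. A second container in the same "Pod" reaches nginx over localhost.
    docker run --rm --net=container:pause --ipc=container:pause \
        alpine wget -qO- http://127.0.0.1:80
}
```

After calling `fake_pod`, remove the two long-running containers with `docker rm -f backend pause`.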
C. Isolation: Container vs. Container (Inside the same Pod)
Because they share the Pause container’s namespaces, containers in the same Pod:
- Network: Share the exact same IP address and MAC address. They talk to each other via `localhost`. (Port conflicts will happen if both try to use port 80).
- IPC: Share the same memory segments for high-speed communication.
- Filesystem (MNT): Isolated. Container A cannot see Container B’s files unless you explicitly mount an `emptyDir` K8s Volume into both containers.
- Process (PID): Isolated (by default). Container A cannot see Container B’s processes. (Note: K8s allows you to change this by setting `shareProcessNamespace: true` in the Pod spec).
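
A Pod manifest that opts into both kinds of sharing might look like the sketch below. The Pod name, container names, images, and mount path are hypothetical; `shareProcessNamespace` and `emptyDir` are the only parts taken from the text above. The script only writes the file, so applying it is left as a commented-out step.

```shell
#!/bin/sh
# Write an example manifest; all names and images are placeholders.
cat > shared-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: shared-pod
spec:
  shareProcessNamespace: true   # containers can see each other's processes
  volumes:
    - name: scratch
      emptyDir: {}              # exists for the lifetime of the Pod
  containers:
    - name: backend
      image: nginx
      volumeMounts:
        - name: scratch
          mountPath: /scratch
    - name: debugger
      image: alpine
      command: ["sleep", "3600"]
      volumeMounts:
        - name: scratch
          mountPath: /scratch
EOF

# kubectl apply -f shared-pod.yaml
# kubectl exec shared-pod -c debugger -- ps aux   # should also list nginx
```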
D. Isolation: Pod vs. Pod in same Node
- Strictly isolated at the namespace level.
- Pod A and Pod B have completely different Pause containers, meaning completely different Network, IPC, and PID namespaces.
- They must communicate over the Pod network provided by the CNI plugin (e.g., Flannel, Calico) using their respective IP addresses. By default every Pod can reach every other Pod; Kubernetes NetworkPolicies restrict that traffic when the network plugin enforces them.
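
As an illustration of the last point, a default-deny ingress policy for one namespace could be written as below. The policy name and the `prod-namespace` target are placeholders, and the policy only has an effect if the cluster’s network plugin enforces NetworkPolicies.

```shell
#!/bin/sh
# Write an example policy; the name and namespace are placeholders.
cat > default-deny-ingress.yaml <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: prod-namespace
spec:
  podSelector: {}      # empty selector: applies to every Pod in the namespace
  policyTypes:
    - Ingress          # no ingress rules listed, so inbound traffic is denied
EOF

# kubectl apply -f default-deny-ingress.yaml
```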
The Matrix
| Scenario | Network (IP/Ports) | IPC (Shared RAM) | Filesystem | Processes (PID) |
|---|---|---|---|---|
| Standard `docker run` | Isolated | Isolated | Isolated | Isolated |
| Docker `--net=host` | Shared with Host | Isolated | Isolated | Isolated |
| Docker `--pid=host` | Isolated | Isolated | Isolated | Shared with Host |
| K8s: Same Pod | Shared (localhost) | Shared | Isolated (needs Volumes) | Isolated (configurable) |
| K8s: Different Pods | Isolated | Isolated | Isolated | Isolated |