Every Kubernetes user has faced the difficulty of managing Pods with multiple containers. Modern Pods often have a main application container, a sidecar container for extra tasks (like monitoring or managing secrets), and an init container for a one-time initial process. However, all these containers were tied to the same restart rule for the whole Pod, which was often frustrating.
Previously, a single restartPolicy applied to the whole Pod, so one failing container could force the entire Pod to be restarted or rescheduled, which was inefficient. Kubernetes 1.34 introduces per-container restart policies, allowing smarter control and faster recovery. This article shows how the feature works and why it matters.
With the release of Kubernetes 1.34, the way container lifecycles are managed changes significantly. The new feature, called Container Restart Policy and Rules and currently in alpha, moves restart control down to the container level instead of only the Pod level. This is a big change that aligns much better with modern multi-container architectures.
This feature has two main parts: a container-level restartPolicy field and a list of restartPolicyRules that make restart decisions based on exit codes.
💡 To enable this feature, a cluster administrator must turn on the ContainerRestartRules feature gate.
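For a quick local experiment, one option is a kind test cluster, whose top-level featureGates setting propagates the gate to the cluster components. The configuration below is a minimal sketch; the node layout is illustrative:

# kind-config.yaml - disposable test cluster with the alpha gate enabled
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  ContainerRestartRules: true
nodes:
- role: control-plane
- role: worker

Create the cluster with kind create cluster --config kind-config.yaml. On a managed or production cluster, the gate has to be enabled on the relevant components by the platform administrator instead.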
The following table highlights the differences:
| Category | Pod restartPolicy (before K8s 1.34) | Container restartPolicy (since K8s 1.34, alpha) |
| --- | --- | --- |
| Scope | Whole Pod. One rule for all containers. | Each container can have its own rule. |
| Available rules | Always, OnFailure, Never. | Always, OnFailure, Never at the container level, plus exit-code-based rules. |
| Failure handling | No control based on exit codes; only a failed exit (OnFailure) or every exit (Always). | restartPolicyRules allow restarts based on specific exit codes. |
| Flexibility | Low. Requires complex workarounds for Pods with many containers. | High. Allows a cleaner Pod design. |
To use this feature, you can add restartPolicy and restartPolicyRules to the containers and initContainers sections in the Pod's configuration file.
Technically, restartPolicyRules is a list of rules. Each rule has two parts:
- action: The action to take when the container exits. The current options are Restart, which restarts the container, and DoNotRestart, which prevents a restart and leaves the container stopped.
- exitCodes: The condition that triggers the action. It consists of an operator (In or NotIn) and values (a list of exit codes).
This system works best when your application is designed to output clear exit codes for different types of failures.
💡 Pro Tip: Design your application to produce specific, well-documented exit codes. This will maximize the usefulness of the restartPolicyRules feature.
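For example, a container entrypoint can map failure classes to documented exit codes. The script below is a hypothetical sketch: validate-config and run-app stand in for your own commands, and the code values (1, 20) are arbitrary examples, not a Kubernetes convention.

#!/bin/sh
# Hypothetical entrypoint: translate failure classes into documented exit codes
# so restartPolicyRules can act on them (codes are arbitrary examples).
if ! ./validate-config; then
  echo "fatal: invalid configuration" >&2
  exit 1    # configuration error: restarting will not help
fi
if ! ./run-app; then
  echo "transient: upstream unreachable" >&2
  exit 20   # network error: safe to restart in place
fi
exit 0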
This feature opens the door to more resilient and intelligent Pod designs. Here are a few examples:
- Old Problem: If a database migration initContainer failed, the entire Pod would repeatedly restart, which could lead to a corrupted database.
- New Solution: You can set restartPolicy: Never for the initContainer.
- Result: If the migration fails, the Pod stops, preventing the main application from running on an incomplete or corrupted database. The main service container can still have a restartPolicy: Always for normal operation.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-migration
  labels:
    app: user-service
    version: v1.0.0
spec:
  # Pod-level restartPolicy is still required by the API, but individual container policies override it.
  restartPolicy: Always
  initContainers:
  - name: db-migrator
    image: my-company/db-migration:1.0.0
    restartPolicy: Never  # A single, one-time run. If it fails, the whole Pod fails.
    env:
    - name: DB_HOST
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: host
    command: ["/migrate", "--target-version=v1.0.0"]
  containers:
  - name: user-service
    image: my-company/web-app:1.0.0
    restartPolicy: Always  # This service should always be running.
    ports:
    - containerPort: 8080
    livenessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 30
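Once the Pod is applied, you can confirm how the migration container ended with a jsonpath query (a small sketch; the field stays empty until the init container has terminated):

kubectl get pod app-with-migration -o jsonpath='{.status.initContainerStatuses[0].state.terminated.exitCode}'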
- Old Problem: For long-running AI/ML jobs, restarting the entire Pod is very expensive in terms of time and resources.
- New Solution: You can configure a container to restart only for specific, fixable errors, such as a memory issue or network problem.
- Result: The Pod can recover from temporary, retriable failures without the need to reschedule the entire job. For fatal errors like data corruption, you can use DoNotRestart rules to stop the Pod entirely.
apiVersion: v1
kind: Pod
metadata:
  name: ml-training-worker
  labels:
    job: nlp-classification
    role: worker
spec:
  restartPolicy: Never  # The Pod as a whole should not restart on failure
  containers:
  - name: training-worker
    image: ml-platform.io/trainer:v1.0.0
    restartPolicy: Never  # The container also does not restart by default
    restartPolicyRules:
    - action: Restart
      exitCodes:
        operator: In
        values: [42]  # CUDA out of memory - reduce batch size and retry
    - action: Restart
      exitCodes:
        operator: In
        values: [43]  # Network timeout during gradient sync
    - action: DoNotRestart
      exitCodes:
        operator: In
        values: [1, 2]  # Data corruption or invalid hyperparameters
...
- Old Problem: Production microservices often bundle the core application with sidecar containers for logging and monitoring. If one of these sidecars had an issue, it could affect the entire Pod's restart behavior.
- New Solution: Each container in the Pod can have its own restartPolicy. For example, the main application can be set to Always restart, while the log forwarder can be set to OnFailure.
- Result: You can ensure the core application remains highly available while giving monitoring sidecars a different restart behavior, and even use exit codes to manage specific sidecar issues, such as Elasticsearch connection problems.
apiVersion: v1
kind: Pod
metadata:
  name: payment-service-stack
  labels:
    app: payment-api
spec:
  restartPolicy: Never
  containers:
  - name: payment-api
    image: payments.io/api-server:v1.2.3
    ports:
    - containerPort: 9000
    restartPolicy: Always
    ...
  - name: prometheus-exporter
    image: prom/node-exporter:v1.3.1
    ports:
    - containerPort: 9100
    restartPolicy: OnFailure
    ...
  - name: log-forwarder
    image: fluent/fluent-bit:1.9.3
    restartPolicy: OnFailure
    restartPolicyRules:
    - action: Restart
      exitCodes:
        operator: In
        values: [1, 2]  # Elasticsearch connection issues
    - action: DoNotRestart
      exitCodes:
        operator: In
        values: [125]  # Configuration syntax error
...
This table provides an example of suggested exit code patterns and corresponding restart actions:
| Error Type | Example Exit Codes | Description & Context | Recommended restartPolicyRules |
| --- | --- | --- | --- |
| Configuration | 1, 2, 3 | Incorrect configuration, missing variables. Fatal; cannot be fixed by restarting. | DoNotRestart, because the Pod will never succeed. |
| Resources | 10, 11, 12 | Out of memory, disk full, CPU too busy. Can be retried. | Restart, because a restart can resolve this temporary issue. |
| Network | 20, 21, 22 | Connection timed out, DNS failure. Often temporary and retriable. | Restart, so the container can recover its connection without rescheduling the Pod. |
| Application | 30, 31, 32 | Logic error, data validation failure. Can be fatal or not, depending on the application. | Depends on the application's design. |
| System | 40, 41, 42 | Container forced to stop, node low on resources. | Restart, because the problem is outside the application and can be retried. |
For informed restart decisions, applications should exit with codes that accurately reflect the cause of failure.
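As a sketch of how the table could translate into a manifest, the fragment below retries resource and network errors in place and gives up on configuration errors. The image name is hypothetical and the exit codes follow the illustrative convention above, not any standard:

# Illustrative container fragment; exit codes follow the example convention above
- name: worker
  image: example.com/worker:1.0.0   # hypothetical image
  restartPolicy: Never
  restartPolicyRules:
  - action: Restart
    exitCodes:
      operator: In
      values: [10, 11, 12, 20, 21, 22]   # resource and network errors: retry in place
  - action: DoNotRestart
    exitCodes:
      operator: In
      values: [1, 2, 3]                  # configuration errors: restarting will not help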
This feature is not just a small fix, but a big step forward in operational efficiency.
More Efficient
The biggest benefit is better efficiency. When a container in a Pod fails, it can restart in-place, without needing to reschedule the entire Pod. Rescheduling a Pod takes time and resources to pull the image and mount new volumes.
With in-place restarts, recovery time can be much faster. Restart times can be reduced from the typical 30-60 seconds to just 5-15 seconds.
In-place restarts ensure that volumes and network configuration remain intact. This allows for a quicker recovery without losing important data. The feature also simplifies the architecture by removing the need to split functionality into different Pods.
Although the Container Restart Policy and Rules feature is very useful, you should use it carefully because it is still in the alpha stage.
⚠️ This feature is still in the alpha stage in Kubernetes 1.34. Use it carefully and avoid deploying it on mission-critical production applications until it becomes stable.
Keep the following considerations in mind when adopting the new feature: debugging, resource planning, monitoring, and alerting.
With more control comes more complex debugging: you need to check container logs and container-level restart metrics to understand why a container did or did not restart.
# Check container-specific restart counts
kubectl get pod my-pod -o jsonpath='{.status.containerStatuses[*].restartCount}'
# Examine per-container exit codes and restart counts
kubectl describe pod my-pod | grep -E "Exit Code|Restart Count"
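If you want a per-container breakdown in one line, a jsonpath range expression works as well (my-pod is a placeholder; the exit code column is empty for containers that have never terminated):

# Print each container's name, restart count, and last termination exit code
kubectl get pod my-pod -o jsonpath='{range .status.containerStatuses[*]}{.name}{"\t"}{.restartCount}{"\t"}{.lastState.terminated.exitCode}{"\n"}{end}'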
Restart policies also change how we think about resources. Containers with Always restart policies need steady resources available, while those that Never restart can usually run with tighter limits.
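As a rough illustration (the container names and numbers below are placeholders, not recommendations), a long-lived Always container might reserve steadier resources than a run-once Never helper:

containers:
- name: main-app                # hypothetical long-lived service container
  restartPolicy: Always
  resources:
    requests:
      cpu: "250m"
      memory: "256Mi"
    limits:
      cpu: "500m"
      memory: "512Mi"
- name: one-shot-helper         # hypothetical run-once helper
  restartPolicy: Never
  resources:
    requests:
      cpu: "50m"
      memory: "64Mi"
    limits:
      cpu: "100m"
      memory: "128Mi"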
When using container-level restart policies, monitoring becomes more important.
# Example: expose the container and Pod names as env vars for Prometheus metric labels
- name: main-app
  restartPolicy: Always
  env:
  - name: CONTAINER_NAME
    value: "main-app"
  - name: POD_NAME
    valueFrom:
      fieldRef:
        fieldPath: metadata.name
Keep an eye on how often each container restarts, not just the Pod. Set up alerts for cases where a container fails to restart as planned, or keeps restarting when it shouldn’t.
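For example, a Prometheus rule on the kube-state-metrics restart counter can flag a container that keeps restarting. The rule below is a sketch that assumes kube-state-metrics is installed; the threshold and window are illustrative:

# prometheus-rules.yaml - hypothetical alerting rule (thresholds are illustrative)
groups:
- name: container-restarts
  rules:
  - alert: ContainerRestartingTooOften
    expr: increase(kube_pod_container_status_restarts_total[15m]) > 5
    for: 5m
    labels:
      severity: warning
    annotations:
      summary: "Container {{ $labels.container }} in Pod {{ $labels.pod }} restarted more than 5 times in 15 minutes"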
Kubernetes Container Restart Policy and Rules advance container management by offering granular control over Pods. This feature addresses long-standing design issues, enabling more resilient and efficient applications. Managing individual container lifecycles and smart recovery logic based on exit codes reduces latency, saves resources, and simplifies Pod architecture.
However, it’s important to note that this feature only controls restart behavior; Pod readiness semantics do not change. A Pod is marked Ready only when all of its containers are ready. For example, if a sidecar crashes while the main application container keeps running, direct communication with the main container (e.g., via the Pod IP) still works, but traffic through a Service is affected because the Pod is reported as not ready and removed from the load-balancing pool.