Production Kubernetes deployments face specific security challenges that static analysis and build-time controls cannot address. Kubernetes runtime security adds active defense mechanisms to your clusters by monitoring and controlling container behavior during execution. It operates at multiple layers, from kernel-level system calls to container runtime interactions and Kubernetes API server events.
For instance, when a container attempts to modify read-only system directories or establish connections to known malicious IP addresses, runtime security can immediately block these actions and alert security teams. This real-time protection fills gaps left by image scanning and admission controls, which cannot detect threats that emerge after containers start running.
This article covers how to build a practical runtime security monitoring setup for your Kubernetes environments. It will help you implement security controls that detect threats while maintaining the performance and manageability of your infrastructure.
Summary of key Kubernetes runtime security concepts
| Concept | Description |
| --- | --- |
| Runtime security architecture fundamentals | Security controls operate at multiple levels—from kernel system calls to container behaviors—protecting running workloads by monitoring system calls, resource usage, and process behaviors, blocking malicious activities before they cause damage. |
| Security tooling and integration | Security monitoring systems collect and analyze data from multiple sources across containers and clusters, creating a comprehensive view of potential threats and suspicious activities. |
| Proactive security controls | Techniques such as system call filtering and resource isolation establish security boundaries before threats emerge, restricting container capabilities to reduce potential attack surfaces. |
| Future-ready security architecture | Distributed telemetry processing at the edge handles security data from large-scale environments efficiently, using modern technologies like eBPF to enable real-time threat detection without centralized bottlenecks. |
Runtime security architecture fundamentals
Kubernetes runtime security architecture consists of three core layers that protect running containers: kernel-level security, container runtime security, and security telemetry. Each layer monitors a different aspect of container execution, including low-level system calls, container lifecycle events, and cluster-wide security telemetry. Together, these layers detect and block threats such as privilege escalation, unauthorized file system modification, and resource abuse.
Kernel-level security mechanisms
Kernel modules provide the first layer of runtime monitoring through system call controls. These features include Linux security modules (LSMs) and system call filtering mechanisms, which enforce strict access controls and restrict how containers interact with the kernel. For instance, LSMs like AppArmor or SELinux enforce policies on what resources containers can access, while system call filters like Seccomp limit the types of interactions containers can have with the kernel, protecting against malicious or misconfigured behavior.
Let's examine three key mechanisms that Kubernetes integrates with to enforce security at the kernel level.
Seccomp (secure computing mode) is not an LSM but complements it by filtering system calls. When containers interact with the kernel (e.g., opening files, creating network connections), Seccomp allows administrators to create profiles that specify which system calls are allowed and which are blocked. By restricting unnecessary system calls, Seccomp reduces the potential attack surface for containers, preventing malicious actions such as privilege escalation or unauthorized network access.
For example, a Seccomp profile for a web application might configure:
- Which system calls are allowed/denied
- What action to take when a blocked call occurs

{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": ["accept", "bind", "listen", "socket", "read", "write"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
AppArmor is another Linux kernel security module that Kubernetes integrates with. While Seccomp filters system calls, AppArmor creates security profiles that specify which resources (e.g., files, network ports, capabilities) containers can access. AppArmor works like a gatekeeper, monitoring every attempt by a container to access sensitive resources.
For example, this profile restricts a MySQL database container to only access its designated data paths:
#include <tunables/global>

profile mysql-server flags=(attach_disconnected) {
  #include <abstractions/base>
  #include <abstractions/mysql>

  /var/lib/mysql/ r,
  /var/lib/mysql/** rwk,
  /var/log/mysql/ r,
  /var/log/mysql/* rw,

  deny /etc/shadow rwklx,
}
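Once a profile like this has been loaded on the node (for example, with apparmor_parser), it can be attached to a pod. The sketch below uses the beta AppArmor annotation supported by older Kubernetes releases (newer releases also offer a securityContext.appArmorProfile field); the pod and container names here are illustrative assumptions:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mysql-pod
  annotations:
    # Attach the node-loaded "mysql-server" profile to the "mysql" container
    container.apparmor.security.beta.kubernetes.io/mysql: localhost/mysql-server
spec:
  containers:
  - name: mysql
    image: mysql:8.0
```

If the referenced profile is not loaded on the node where the pod is scheduled, the pod will fail to start, so profiles are typically distributed to nodes ahead of time (e.g., via a DaemonSet).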
SELinux (Security-Enhanced Linux) takes a different approach to security in Kubernetes by assigning security labels to every process, file, and network port. These labels form a security context that controls which processes can interact with each other, creating strict boundaries between containers and the host system.
apiVersion: v1
kind: Pod
metadata:
  name: secured-pod
spec:
  securityContext:
    seLinuxOptions:
      level: "s0:c123,c456" # MCS level for isolation
      user: "system_u"      # SELinux user
      role: "object_r"      # SELinux role
      type: "container_t"   # SELinux type
  containers:
  - name: app
    image: app:1.0
This SELinux configuration applies several security controls to the pod. The level: "s0:c123,c456" setting assigns multi-category security (MCS) labels that isolate containers with different security requirements. The user, role, and type parameters establish the security context for the container, defining what resources it can access.
Container runtime security
A container runtime is a core component of each Kubernetes node responsible for running containers. Popular options include containerd and CRI-O, which manage container lifecycle operations, including image pulling, starting, and stopping containers. While runtimes offer basic namespace and cgroup isolation, they require explicit security configuration to enforce tighter controls.
Let's look at the key security configurations for containerd:
Enabling image signature verification:
containerd:
  defaults:
    runtime:
      options:
        verify_image_signatures: true
Setting strict resource isolation:
containerd:
  runtimes:
    runc:
      options:
        SystemdCgroup: true
        NoNewPrivileges: true
        RestrictSUIDSGID: true
Setting runtime monitoring rules:
containerd:
  plugins:
    cri:
      containerd:
        runtimes:
          runc:
            monitor:
              interval: "10s"
              threshold: 5
Security telemetry flow
Security telemetry in Kubernetes refers to the collection and analysis of data related to security events in the cluster. This data flows from various sources, including system calls, container lifecycle events, and Kubernetes API activities. Security telemetry is crucial because it allows security teams to continuously monitor the cluster for threats, helping to quickly detect and mitigate issues like privilege escalation, unauthorized access, or abnormal behavior across containers. The data flows from several sources as follows:
The kernel generates audit logs about system calls and resource access.
Container runtimes log container lifecycle events (starts, stops, crashes).
The Kubernetes API server records who does what in your cluster.
The combination of tools and processes that collect and process this data creates a comprehensive security monitoring system. Tools like Falco or Tracee collect and send this data through a processing pipeline. The pipeline might use tools like Elasticsearch for storage and analysis. This pipeline standardizes the data format and adds context, for example, linking a suspicious network connection to the pod that made it.
Security tooling and integration
While Kubernetes provides core security features, specialized security tools are required to gain a deeper level of protection. These tools monitor container behavior, system events, and network activities to detect threats that Kubernetes’ built-in security mechanisms might miss.
Open source security tools
There are three powerful open-source tools that you can implement in your clusters to detect and prevent security threats:
Falco monitors container behavior at the node level, detecting common attack patterns like privilege escalation and suspicious network connections. After deployment, you can start with Falco's default ruleset, which covers common attack patterns. These rules detect privilege escalation, sensitive file access, or suspicious network connections. As you better understand your application's behavior, you can create custom rules specific to your security requirements.
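As an illustration of what a Falco rule looks like, the sketch below is modeled on Falco's rule syntax; the condition macros (spawned_process, container) ship with Falco's default ruleset, though the exact rule shown here is a simplified example rather than a verbatim default rule:

```yaml
- rule: Terminal shell in container
  desc: A shell with an attached terminal was spawned inside a container
  condition: spawned_process and container and proc.name in (bash, sh) and proc.tty != 0
  output: Shell in container (user=%user.name container=%container.name cmd=%proc.cmdline)
  priority: WARNING
```

Rules like this are evaluated against the live system call stream, so the alert fires the moment the shell starts rather than after log analysis.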
Tracee leverages eBPF to monitor system events with minimal overhead, providing real-time visibility into container activities. You can filter these events based on your security needs, for example, focusing on file system modifications in critical paths or watching for container escape attempts.
KubeArmor extends Kubernetes network policies with application-level controls, enforcing rules for network connections between containers. For instance, you might allow only your application's primary method to make external API calls while blocking other processes.
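That "only the primary binary may talk to the network" pattern can be sketched as a KubeArmorPolicy. The label selector and binary path below are assumptions for a hypothetical web-app workload, not values from a real deployment:

```yaml
apiVersion: security.kubearmor.com/v1
kind: KubeArmorPolicy
metadata:
  name: restrict-egress-binaries
  namespace: default
spec:
  selector:
    matchLabels:
      app: web-app
  network:
    matchProtocols:
      # Only this (hypothetical) binary may open TCP connections;
      # with an Allow action, other processes are denied by default
      - protocol: tcp
        fromSource:
          - path: /usr/local/bin/app
  action: Allow
```

Because KubeArmor enforces Allow rules as an allow-list, any process other than the listed binary that attempts a TCP connection in matching pods is blocked and audited.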
Tool integration patterns
These three security tools output event data in standardized formats (e.g., JSON) and protocols with consistent fields like timestamp, source, event type, and context. This standardization enables security teams to correlate events across different tools.
For instance, Falco monitors runtime behavior at the node level by watching system calls like process executions and file access patterns. At the same time, Tracee uses eBPF to capture detailed system events and container operations. KubeArmor adds network security through LSM hooks and process-level filtering.
All these tools send their events in JSON format to a node-level collector that enriches the data with Kubernetes context (like pod and namespace information). This collected data flows into a centralized security pipeline where Elasticsearch or OpenSearch stores and indexes it, enabling the SIEM system to correlate events and detect threats.
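The enrichment and correlation step can be illustrated with a small sketch. Assuming events have already been normalized to a common shape (the timestamp, source, event_type, and pod field names here are illustrative, not the tools' actual schemas), a collector might group them into per-pod timelines:

```python
from collections import defaultdict

def correlate_by_pod(events):
    """Group normalized security events by pod, ordered by timestamp."""
    by_pod = defaultdict(list)
    for event in events:
        by_pod[event["pod"]].append(event)
    for pod_events in by_pod.values():
        # Ordering by time lets analysts see attack progression per pod
        pod_events.sort(key=lambda e: e["timestamp"])
    return dict(by_pod)

# Events as different tools might report them, already normalized
events = [
    {"timestamp": 3, "source": "falco", "event_type": "shell_spawned", "pod": "web-1"},
    {"timestamp": 1, "source": "tracee", "event_type": "ptrace_detected", "pod": "web-1"},
    {"timestamp": 2, "source": "kubearmor", "event_type": "egress_blocked", "pod": "db-0"},
]

timeline = correlate_by_pod(events)
# web-1's timeline now shows the ptrace event before the shell spawn
```

In a real pipeline, this grouping key would come from the Kubernetes metadata added at the node-level collector, which is what makes cross-tool correlation possible in the first place.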
Finally, the alert management system routes security alerts based on severity, creates incidents, and notifies relevant teams, creating a streamlined flow from event detection to response.
Proactive security controls
Proactive security controls in Kubernetes prevent security incidents by establishing boundaries that limit container capabilities and monitoring their behavior for deviations. For instance, Pod Security Admission (the built-in replacement for the deprecated PodSecurityPolicy) can restrict which workloads run with elevated privileges, while security contexts can ensure that containers don't run as root.
Resource isolation strategies
PodSecurityContext is key to ensuring that containers are properly isolated from each other and from the host system. An essential PodSecurityContext prevents containers from running as root, reduces privilege escalation, and drops unnecessary Linux capabilities.
For example, a web application pod might use this security context:
apiVersion: v1
kind: Pod
metadata:
  name: web-app
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000
  containers:
  - name: web-app
    image: app:1.0
    securityContext:
      # These fields are only valid at the container level
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
        add: ["NET_BIND_SERVICE"]
This configuration ensures that the container runs as user ID 1000 and group 3000, prevents privilege escalation, removes all capabilities except the one needed to bind to port 80/443, and sets file system permissions through fsGroup.
Network policies complement security contexts by controlling pod communication. For example, a policy can restrict a database pod to accept connections only from specific application pods in the same namespace:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: database-policy
  namespace: db
spec:
  podSelector:
    matchLabels:
      app: database
  ingress:
  - from:
    - podSelector:
        matchLabels:
          role: backend
System call filtering
Seccomp profiles restrict which system calls containers can make, reducing the attack surface. The process starts by running containers in audit mode to identify necessary system calls. These profiles are stored on each node and referenced in pod configurations, helping to prevent malicious system calls during runtime.
A typical web application might include network operations, file access, and fundamental process management while blocking dangerous operations like module loading or system time changes.
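The audit phase mentioned above can use a profile whose default action logs rather than blocks. A minimal sketch:

```json
{
  "defaultAction": "SCMP_ACT_LOG"
}
```

With SCMP_ACT_LOG, every system call is permitted but recorded in the kernel audit log, so you can compile the allow-list from observed behavior before switching the default action to SCMP_ACT_ERRNO.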
Example Seccomp profile for a web application:
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": [
        "read",
        "write",
        "open",
        "close",
        "socket",
        "bind",
        "listen",
        "accept",
        "connect"
      ],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
This profile implements a deny-by-default approach as follows:
- defaultAction: blocks all system calls unless explicitly allowed
- architectures: the profile applies only to x86_64 systems
- syscalls: lists the allowed system calls a web application needs to handle files (read, write, open, close) and manage network connections (socket, bind, listen, accept, connect)
This profile must be stored on each node (for example, at /var/lib/kubelet/seccomp/profiles/container-hardening.json), then referenced in the pod's security context:
apiVersion: v1
kind: Pod
metadata:
  name: secure-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/container-hardening.json
  containers:
  - name: app-container
    image: app:1.0
The localhostProfile path is relative to the node's seccomp profile root (/var/lib/kubelet/seccomp/ by default). When the container starts, the containerd runtime applies these kernel-level restrictions.
Runtime protection
Runtime protection establishes standard behavior patterns for containers based on historical data, predefined security rules, or expected operational patterns. It starts by observing container behavior during regular operation—what files they access, which network connections they make, and what processes they run. This creates a baseline for detecting suspicious activity.
Here’s an example runtime security policy using Tracee:
apiVersion: tracee.aquasec.com/v1beta1
kind: Policy
metadata:
  name: runtime-protection
spec:
  scope:
    - workload:
        name: web-app
  rules:
    - event: anti_debugging
      filters:
        - args.syscall=ptrace
    - event: crypto_mining
    - event: dropped_executable
      filters:
        - args.file.path=/usr/bin/*
    - event: illegitimate_shell
      filters:
        - args.process.name=bash|sh|ash
This policy monitors for:
Anti-debugging attempts using ptrace
Crypto mining signatures
Dropped executable files in system directories
Unexpected shell executions
These settings balance security and functionality, allowing legitimate application behavior while catching potential security incidents.
Future-ready security architecture
In large production clusters, deployments generate massive amounts of security data: system calls, API server events, container logs, and runtime activities. Processing this security telemetry at scale requires an architecture that can handle high-volume data streams while catching critical security events in real time.
Real-time data processing
Security events in Kubernetes, such as container privilege escalation, unauthorized API access, and suspicious network connections, require instant detection to mitigate potential damage. Onum’s platform processes security telemetry in real-time as the data is transmitted, enabling immediate identification of threats before they can affect the infrastructure. This proactive approach allows security teams to detect and respond to threats faster than traditional post-storage analysis.
Consider a container attempting to modify system binaries. Instead of waiting for logs to be stored and analyzed, Onum detects this behavior in the data stream itself, enabling immediate response before the attack can progress.
Telemetry optimization
Kubernetes security tools generate huge volumes of data. Debug logs, routine syscalls, and API server events can flood security storage and analytics. Most of this data is irrelevant for threat detection. Onum optimizes this data by filtering security events at the source; it strips out routine operations, keeps critical security events, and adds context where needed. This targeted filtering cuts storage costs and lets security teams focus on real threats.
Onum's platform routes different types of security data where they belong—critical alerts to incident response, routine events to compliance storage. You get actionable alerts without wading through irrelevant data.
Data protection and compliance
Security logs often contain sensitive information like credentials, personal data, or internal system details. Regulations like HIPAA and PCI-DSS require careful handling of this data. Onum preserves original security events for compliance while creating sanitized versions for analysis.
This dual approach helps security teams maintain compliance while having the data they need for incident investigation. The platform automatically tracks data handling, simplifying audit requirements and reducing manual documentation.
Scaling architecture
The Onum platform distributes security processing across your infrastructure. Critical security alerts flow to real-time analysis engines, while compliance-related events route to audit systems. This distributed architecture and targeted routing reduce infrastructure costs and improve detection speed.
For organizations running multiple Kubernetes clusters, this means efficient security monitoring without the overhead of centralized processing. Security teams get the necessary visibility without managing complex data pipelines or numerous security tools.
Conclusion
Runtime security in Kubernetes requires multiple security layers working together. Each layer builds upon the previous, creating comprehensive security coverage. Deploy kernel-level controls first; seccomp profiles block dangerous system calls, while AppArmor rules limit resource access. Then, you can add Tracee for real-time threat detection through eBPF monitoring.
For practical implementation, begin in your development environment. Configure one security control at a time and observe its impact. Watch for application disruptions and adjust policies accordingly. This methodical approach helps you understand how security controls affect your specific workloads before moving to production.
Onum’s platform helps streamline this process by allowing you to observe the impact of each security control and adjust as needed. By filtering out irrelevant data at the source, Onum enables your teams to focus on relevant security events and reduce unnecessary overhead.
The key is balance—enough security to stop attacks without drowning in alerts or breaking applications. With appropriate planning and tools, you can implement and scale Kubernetes security monitoring efficiently, without overwhelming your teams or infrastructure.