Sysdig | Nigel Douglas https://sysdig.com/blog/author/nigel-douglas/ Tue, 06 Aug 2024

Kubernetes 1.31 – What’s new? https://sysdig.com/blog/whats-new-kubernetes-1-31/ Fri, 26 Jul 2024

The post Kubernetes 1.31 – What’s new? appeared first on Sysdig.

Kubernetes 1.31 is nearly here, and it’s full of exciting major changes to the project! So, what’s new in this upcoming release?

Kubernetes 1.31 brings a plethora of enhancements, including 37 line items tracked as ‘Graduating’ in this release. From these, 11 enhancements are graduating to stable, including the highly anticipated AppArmor support for Kubernetes, which includes the ability to specify an AppArmor profile for a container or pod in the API, and have that profile applied by the container runtime. 

34 new alpha features are also making their debut, with a lot of eyes on the initial design to support pod-level resource limits. Security teams will be particularly interested in tracking the progress on this one.

Watch out for major changes such as the ingress connectivity reliability improvement for kube-proxy, which now supports connection draining on terminating Nodes for load balancers that support it.

Further enhancing security, Pod-level resource limits move from Net New to Alpha, allowing resource constraints to be set for a Pod as a whole rather than only per container, balancing operational efficiency with robust security.

There are also numerous quality-of-life improvements that continue the trend of making Kubernetes more user-friendly and efficient, such as a randomized algorithm for Pod selection when downscaling ReplicaSets.

We are buzzing with excitement for this release! There’s plenty to unpack here, so let’s dive deeper into what Kubernetes 1.31 has to offer.

Editor’s pick:

These are some of the changes that look most exciting to us in this release:

#2395 Removing In-Tree Cloud Provider Code

Probably the most exciting advancement in v1.31 is the removal of all in-tree integrations with cloud providers. Since v1.26 there has been a large push to help Kubernetes truly become a vendor-neutral platform. This externalization process removes all cloud-provider-specific code from the k8s.io/kubernetes repository with minimal disruption to end users and developers.

Nigel Douglas, Sr. Open Source Security Researcher

#2644 Always Honor PersistentVolume Reclaim Policy

I like this enhancement a lot, as it finally allows users to honor the PersistentVolume Reclaim Policy through a deletion protection finalizer. HonorPVReclaimPolicy is now enabled by default. Finalizers can be added on a PersistentVolume to ensure that PersistentVolumes with a Delete reclaim policy are deleted only after the backing storage is deleted.


The newly introduced finalizers kubernetes.io/pv-controller and external-provisioner.volume.kubernetes.io/finalizer are only added to dynamically provisioned volumes within your environment.

Pietro Piutti, Sr. Technical Marketing Manager

#4292 Custom profile in kubectl debug


I’m delighted to see that they have finally introduced a new custom profile option for the kubectl debug command. This feature addresses a challenge teams regularly faced when debugging applications built on shell-less base images. By allowing the mounting of data volumes and other resources within the debug container, this enhancement provides a significant security benefit for most organizations, encouraging the adoption of more secure, shell-less base images without sacrificing debugging capabilities.

Thomas Labarussias, Sr. Developer Advocate & CNCF Ambassador


Apps in Kubernetes 1.31

#3017 PodHealthyPolicy for PodDisruptionBudget

Stage: Graduating to Stable
Feature group: sig-apps

Kubernetes 1.31 introduces the PodHealthyPolicy for PodDisruptionBudget (PDB). PDBs currently serve two purposes: ensuring a minimum number of pods remain available during disruptions and preventing data loss by blocking pod evictions until data is replicated.

The current implementation has issues. Pods that are Running but not Healthy (Ready) may not be evicted even if their number exceeds the PDB threshold, hindering tools like cluster-autoscaler. Additionally, using PDBs to prevent data loss is considered unsafe and not their intended use.

Despite these issues, many users rely on PDBs for both purposes. Therefore, changing the PDB behavior without supporting both use-cases is not viable, especially since Kubernetes lacks alternative solutions for preventing data loss.
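In the API as shipped, this policy surfaces as the unhealthyPodEvictionPolicy field on the PodDisruptionBudget spec. A minimal sketch, using a hypothetical app: my-app workload label, that lets Running-but-not-Ready pods be evicted so autoscaling tools are not blocked:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb            # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app             # hypothetical workload label
  # Allow eviction of pods that are Running but not Ready, so tools
  # like cluster-autoscaler are not blocked by unhealthy pods.
  unhealthyPodEvictionPolicy: AlwaysAllow
```

The default, IfHealthyBudget, preserves the older behavior of only evicting healthy pods when the budget allows.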

#3335 Allow StatefulSet to control start replica ordinal numbering

Stage: Graduating to Stable
Feature group: sig-apps

The goal of this feature is to enable the migration of a StatefulSet across namespaces, clusters, or in segments without disrupting the application. Traditional methods like backup and restore cause downtime, while pod-level migration requires manual rescheduling. Migrating a StatefulSet in slices allows for a gradual and less disruptive migration process by moving only a subset of replicas at a time.
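The start ordinal is controlled by the .spec.ordinals.start field. A sketch, with hypothetical names, where replicas begin at ordinal 5, which is useful while the slice 0–4 still runs in the source cluster:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                   # hypothetical StatefulSet name
spec:
  serviceName: web
  replicas: 3
  ordinals:
    start: 5                  # creates pods web-5, web-6, web-7
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.27
```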

#3998 Job Success/completion policy

Stage: Graduating to Beta
Feature group: sig-apps

We are excited about the improvement to the Job API, which now allows setting conditions under which an Indexed Job can be declared successful. This is particularly useful for batch workloads like MPI and PyTorch that need to consider only leader indexes for job success. Previously, an indexed job was marked as completed only if all indexes succeeded. Some third-party frameworks, like Kubeflow Training Operator and Flux Operator, have implemented similar success policies. This improvement will enable users to mark jobs as successful based on a declared policy, terminating lingering pods once the job meets the success criteria.
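The policy is expressed under .spec.successPolicy. A sketch of an Indexed Job that is declared successful once its leader (index 0) succeeds; the job name and worker image are hypothetical:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: leader-worker-job     # hypothetical name
spec:
  completions: 4
  parallelism: 4
  completionMode: Indexed
  successPolicy:
    rules:
    - succeededIndexes: "0"   # job succeeds once the leader index succeeds
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: example.com/mpi-worker:latest   # hypothetical image
```

Once the rule is met, lingering worker pods are terminated rather than left running to completion.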

CLI in Kubernetes 1.31

#4006 Transition from SPDY to WebSockets

Stage: Graduating to Beta
Feature group: sig-cli

This enhancement proposes adding a WebSocketExecutor to the kubectl CLI tool, using a new subprotocol version (v5.channel.k8s.io), and creating a FallbackExecutor to handle client/server version discrepancies. The FallbackExecutor first attempts to connect using the WebSocketExecutor, then falls back to the legacy SPDYExecutor if unsuccessful, potentially requiring two request/response trips. Despite the extra roundtrip, this approach is justified because modifying the low-level SPDY and WebSocket libraries for a single handshake would be overly complex, and the additional IO load is minimal in the context of streaming operations. Additionally, as releases progress, the likelihood of a WebSocket-enabled kubectl interacting with an older, non-WebSocket API Server decreases.

#4706 Deprecate and remove kustomize from kubectl

Stage: Net New to Alpha
Feature group: sig-cli

This update was deferred from the Kubernetes 1.31 release. Kustomize was initially integrated into kubectl to enhance declarative support for Kubernetes objects. However, with the development of various customization and templating tools over the years, kubectl maintainers now believe that promoting one tool over others is not appropriate. Decoupling Kustomize from kubectl will allow each project to evolve at its own pace, avoiding issues with mismatched release cycles that can lead to kubectl users working with outdated versions of Kustomize. Additionally, removing Kustomize will reduce the dependency graph and the size of the kubectl binary, addressing some dependency issues that have affected the core Kubernetes project.

#3104 Separate kubectl user preferences from cluster configs

Stage: Net New to Alpha
Feature group: sig-cli

Kubectl, one of the earliest components of the Kubernetes project, upholds a strong commitment to backward compatibility. We aim to let users opt into new features (like delete confirmation), which might otherwise disrupt existing CI jobs and scripts. Although kubeconfig has an underutilized field for preferences, it isn’t ideal for this purpose. New clusters usually generate a new kubeconfig file with credentials and host details, and while these files can be merged or specified by path, we believe server configuration and user preferences should be distinctly separated.

To address these needs, the Kubernetes maintainers proposed introducing a kuberc file for client preferences. This file will be versioned and structured to easily incorporate new behaviors and settings for users. It will also allow users to define kubectl command aliases and default flags. With this change, we plan to deprecate the kubeconfig Preferences field. This separation ensures users can manage their preferences consistently, regardless of the --kubeconfig flag or $KUBECONFIG environment variable.
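Since this is an alpha proposal, the exact schema may still change, but a kuberc file could look roughly like the following sketch; all field names and values here are illustrative, based on the KEP, not a final API:

```yaml
apiVersion: kubectl.config.k8s.io/v1alpha1
kind: Preference
# Command aliases, e.g. `kubectl getn` expands to `kubectl get nodes`
aliases:
- name: getn
  command: get
  appendArgs:
  - nodes
# Default flag values applied per command
defaults:
- command: delete
  options:
  - name: interactive
    value: "true"
```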

Kubernetes 1.31 instrumentation

#2305 Metric cardinality enforcement

Stage: Graduating to Stable
Feature group: sig-instrumentation

Metrics turning into memory leaks pose significant issues, especially when they require re-releasing the entire Kubernetes binary to fix. Historically, we’ve tackled these issues inconsistently. For instance, coding mistakes sometimes cause unintended IDs to be used as metric label values. 

In other cases, we’ve had to delete metrics entirely due to their incorrect use. More recently, we’ve either removed metric labels or retroactively defined acceptable values for them. Fixing these issues is a manual, labor-intensive, and time-consuming process without a standardized solution.

This stable update should address these problems by enabling metric dimensions to be bound to known sets of values independently of Kubernetes code releases.

Network in Kubernetes 1.31

#3836 Ingress Connectivity Reliability Improvement for Kube-Proxy

Stage: Graduating to Stable
Feature group: sig-network

This enhancement finally introduces a more reliable mechanism for handling ingress connectivity for endpoints on terminating nodes and nodes with unhealthy Kube-proxies, focusing on eTP:Cluster services. Currently, Kube-proxy’s response is based on its healthz state for eTP:Cluster services and the presence of a Ready endpoint for eTP:Local services. This KEP addresses the former.

The proposed changes are:

  1. Connection Draining for Terminating Nodes:
    Kube-proxy will use the ToBeDeletedByClusterAutoscaler taint to identify terminating nodes and fail its healthz check to signal load balancers for connection draining. Other signals like .spec.unschedulable were considered but deemed less direct.
  2. Addition of /livez Path:
    Kube-proxy will add a /livez endpoint to its health check server to reflect the old healthz semantics, indicating whether data-plane programming is stale.
  3. Cloud Provider Health Checks:
    While not aligning cloud provider health checks for eTP:Cluster services, the KEP suggests creating a document on Kubernetes’ official site to guide and share knowledge with cloud providers for better health checking practices.

#4444 Traffic Distribution to Services

Stage: Graduating to Beta
Feature group: sig-network

To enhance traffic routing in Kubernetes, this KEP proposes adding a new field, trafficDistribution, to the Service specification. This field allows users to specify routing preferences, offering more control and flexibility than the earlier topologyKeys mechanism. trafficDistribution will provide a hint for the underlying implementation to consider in routing decisions without offering strict guarantees.

The new field will support values like PreferClose, indicating a preference for routing traffic to topologically proximate endpoints. The absence of a value indicates no specific routing preference, leaving the decision to the implementation. This change aims to provide enhanced user control, standard routing preferences, flexibility, and extensibility for innovative routing strategies.
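A minimal Service sketch opting into topology-aware routing; the name, selector, and ports are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service            # hypothetical name
spec:
  selector:
    app: my-app               # hypothetical label
  ports:
  - port: 80
    targetPort: 8080
  # Hint: prefer endpoints topologically close to the client (e.g., same zone)
  trafficDistribution: PreferClose
```

Because the field is a hint rather than a guarantee, traffic may still cross zones when no close endpoint is available.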

#1880 Multiple Service CIDRs

Stage: Graduating to Beta
Feature group: sig-network

This proposal introduces a new allocator logic using two new API objects: ServiceCIDR and IPAddress, allowing users to dynamically increase available Service IPs by creating new ServiceCIDRs. The allocator will automatically consume IPs from any available ServiceCIDR, similar to adding more disks to a storage system to increase capacity.

To maintain simplicity, backward compatibility, and avoid conflicts with other APIs like Gateway APIs, several constraints are added:

  • ServiceCIDR is immutable after creation.
  • ServiceCIDR can only be deleted if no Service IPs are associated with it.
  • Overlapping ServiceCIDRs are allowed.
  • The API server ensures a default ServiceCIDR exists to cover service CIDR flags and the “kubernetes.default” Service.
  • All IPAddresses must belong to a defined ServiceCIDR.
  • Every Service with a ClusterIP must have an associated IPAddress object.
  • A ServiceCIDR being deleted cannot allocate new IPs.

This creates a one-to-one relationship between Service and IPAddress, and a one-to-many relationship between ServiceCIDR and IPAddress. Overlapping ServiceCIDRs are merged in memory, with IPAddresses coming from any ServiceCIDR that includes that IP. The new allocator logic can also be used by other APIs, such as the Gateway API, enabling future administrative and cluster-wide operations on Service ranges.
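In 1.31 these objects live in the networking.k8s.io/v1beta1 group. A sketch that adds a second range of Service IPs; the object name and CIDR value are illustrative:

```yaml
apiVersion: networking.k8s.io/v1beta1
kind: ServiceCIDR
metadata:
  name: extra-service-ips     # hypothetical name
spec:
  cidrs:
  - 10.100.0.0/16             # illustrative range; must be routable and unused
```

Once created, the allocator can hand out ClusterIPs from this range in addition to the default Service CIDR.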

Kubernetes 1.31 nodes

#2400 Node Memory Swap Support

Stage: Graduating to Stable
Feature group: sig-node

The enhancement should now integrate swap memory support into Kubernetes, addressing two key user groups: node administrators for performance tuning and app developers requiring swap for their apps. 

The focus was to facilitate controlled swap use on a node level, with the kubelet enabling Kubernetes workloads to utilize swap space under specific configurations. The ultimate goal is to enhance Linux node operation with swap, allowing administrators to determine swap usage for workloads, initially not permitting individual workloads to set their own swap limits.
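Swap behavior is configured per node through the kubelet configuration. A sketch enabling limited swap, assuming a node that already has swap provisioned and is running cgroup v2:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Allow the kubelet to start on a node that has swap enabled
failSwapOn: false
memorySwap:
  # LimitedSwap: only Burstable pods may use swap, within limits
  # computed by the kubelet; workloads cannot set their own swap limits.
  swapBehavior: LimitedSwap
```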

#4569 Move cgroup v1 support into maintenance mode

Stage: Net New to Stable
Feature group: sig-node

The proposal aims to transition Kubernetes’ cgroup v1 support into maintenance mode while encouraging users to adopt cgroup v2. Although cgroup v1 support won’t be removed immediately, its deprecation and eventual removal will be addressed in a future KEP. The Linux kernel community and major distributions are focusing on cgroup v2 due to its enhanced functionality, consistent interface, and improved scalability. Consequently, Kubernetes must align with this shift to stay compatible and benefit from cgroup v2’s advancements.

To support this transition, the proposal includes several goals. First, cgroup v1 will receive no new features, marking its functionality as complete and stable. End-to-end testing will be maintained to ensure the continued validation of existing features. The Kubernetes community may provide security fixes for critical CVEs related to cgroup v1 as long as the release is supported. Major bugs will be evaluated and fixed if feasible, although some issues may remain unresolved due to dependency constraints.

Migration support will be offered to help users transition from cgroup v1 to v2. Additionally, efforts will be made to enhance cgroup v2 support by addressing all known bugs, ensuring it is reliable and functional enough to encourage users to switch. This proposal reflects the broader ecosystem’s movement towards cgroup v2, highlighting the necessity for Kubernetes to adapt accordingly.

#24 AppArmor Support

Stage: Graduating to Stable
Feature group: sig-node

Adding AppArmor support to Kubernetes marks a significant enhancement in the security posture of containerized workloads. AppArmor is a Linux kernel module that allows system admins to restrict certain capabilities of a program using profiles attached to specific applications or containers. By integrating AppArmor into Kubernetes, developers can now define security policies directly within an app config.

The initial implementation of this feature would allow for specifying an AppArmor profile within the Kubernetes API for individual containers or entire pods. This profile, once defined, would be enforced by the container runtime, ensuring that the container’s actions are restricted according to the rules defined in the profile. This capability is crucial for running secure and confined applications in a multi-tenant environment, where a compromised container could potentially affect other workloads or the underlying host.
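With the stable API, the profile is set via securityContext.appArmorProfile at the pod or container level. A sketch referencing a hypothetical Localhost profile, which must already be loaded on the node:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-demo
spec:
  securityContext:
    appArmorProfile:
      type: Localhost
      # Hypothetical profile name; load it on the node (e.g., via apparmor_parser) first
      localhostProfile: k8s-deny-write
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "sleep 3600"]
```

The type field also accepts RuntimeDefault and Unconfined for the runtime’s default profile or no confinement.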

Scheduling in Kubernetes 1.31

#3633 Introduce MatchLabelKeys and MismatchLabelKeys to PodAffinity and PodAntiAffinity

Stage: Graduating to Beta
Feature group: sig-scheduling

This was tracked for Code Freeze as of July 23rd. This enhancement finally introduces MatchLabelKeys (and MismatchLabelKeys) for PodAffinityTerm to refine PodAffinity and PodAntiAffinity, enabling more precise control over Pod placements during scenarios like rolling upgrades.

By allowing users to specify the scope for evaluating Pod co-existence, it addresses scheduling challenges that arise when new and old Pod versions are present simultaneously, particularly in saturated or idle clusters. This enhancement aims to improve scheduling effectiveness and cluster resource utilization.
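In a Deployment’s pod template this typically uses the controller-managed pod-template-hash label, so the rule is evaluated only among pods from the same rollout revision. A sketch of the relevant pod-template fragment, with a hypothetical app label:

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: my-app              # hypothetical label
      # Scope anti-affinity to pods of the same ReplicaSet revision,
      # so old and new pods during a rollout don't block each other.
      matchLabelKeys:
      - pod-template-hash
      topologyKey: kubernetes.io/hostname
```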

Kubernetes 1.31 storage

#3762 PersistentVolume last phase transition time

Stage: Graduating to Stable
Feature group: sig-storage

The Kubernetes maintainers plan to update the API server to support a new timestamp field for PersistentVolumes, which will record when a volume transitions to a different phase. This field will be set to the current time for all newly created volumes and those changing phases. While this timestamp is intended solely as a convenience for cluster administrators, it will enable them to list and sort PersistentVolumes based on the transition times, aiding in manual cleanup and management.

This change addresses issues experienced by users with the Delete retain policy, which led to data loss, prompting many to revert to the safer Retain policy. With the Retain policy, unclaimed volumes are marked as Released, and over time, these volumes accumulate. The timestamp field will help admins identify when volumes last transitioned to the Released phase, facilitating easier cleanup. 

Moreover, the generic recording of timestamps for all phase transitions will provide valuable metrics and insights, such as measuring the time between Pending and Bound phases. The goals are to introduce this timestamp field and update it with every phase transition, without implementing any volume health monitoring or additional actions based on the timestamps.
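Since the field is exposed at .status.lastPhaseTransitionTime, an administrator could, for example, list volumes ordered by how long ago they changed phase to spot stale Released volumes; this one-liner is illustrative:

```shell
# Sort PersistentVolumes by when they last changed phase (oldest first)
kubectl get pv \
  --sort-by=.status.lastPhaseTransitionTime \
  -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,SINCE:.status.lastPhaseTransitionTime
```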

#3751 Kubernetes VolumeAttributesClass ModifyVolume

Stage: Graduating to Beta
Feature group: sig-storage

The proposal introduces a new Kubernetes API resource, VolumeAttributesClass, along with an admission controller and a volume attributes protection controller. This resource will allow users to manage volume attributes, such as IOPS and throughput, independently from capacity. The current immutability of StorageClass.parameters necessitates this new resource, as it permits updates to volume attributes without directly using cloud provider APIs, simplifying storage resource management.

VolumeAttributesClass will enable specifying and modifying volume attributes both at creation and for existing volumes, ensuring changes are non-disruptive to workloads. Conflicts between StorageClass.parameters and VolumeAttributesClass.parameters will result in errors from the driver. 

The primary goals include providing a cloud-provider-independent specification for volume attributes, enforcing these attributes through the storage, and allowing workload developers to modify them non-disruptively. The proposal does not address OS-level IO attributes, inter-pod volume attributes, or scheduling based on node-specific volume attributes limits, though these may be considered for future extensions.
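A sketch of a class and a PVC referencing it; the driver name and parameter keys are driver-specific and purely illustrative here:

```yaml
apiVersion: storage.k8s.io/v1beta1
kind: VolumeAttributesClass
metadata:
  name: gold-tier
driverName: example.csi.vendor.com   # hypothetical CSI driver
parameters:                          # keys and units are defined by the driver
  iops: "16000"
  throughput: "600"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc                     # hypothetical name
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 100Gi
  # Can be changed later to point at another class, modifying the
  # volume's attributes without disrupting the workload.
  volumeAttributesClassName: gold-tier
```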

#3314 CSI Differential Snapshot for Block Volumes

Stage: Net New to Alpha
Feature group: sig-storage

This enhancement was removed from the Kubernetes 1.31 milestone. It aims to enhance the CSI specification by introducing a new optional CSI SnapshotMetadata gRPC service. This service allows Kubernetes to retrieve metadata on allocated blocks of a single snapshot, or the changed blocks between snapshots of the same block volume. Implemented by the community-provided external-snapshot-metadata sidecar, this service must be deployed by a CSI driver. Kubernetes backup applications can access snapshot metadata through a secure TLS gRPC connection, which minimizes load on the Kubernetes API server.

The external-snapshot-metadata sidecar communicates with the CSI driver’s SnapshotMetadata service over a private UNIX domain socket. The sidecar handles tasks such as validating the Kubernetes authentication token, authorizing the backup application, validating RPC parameters, and fetching necessary provisioner secrets. The CSI driver advertises the existence of the SnapshotMetadata service to backup applications via a SnapshotMetadataService CR, containing the service’s TCP endpoint, CA certificate, and audience string for token authentication.

Backup applications must obtain an authentication token using the Kubernetes TokenRequest API with the service’s audience string before accessing the SnapshotMetadata service. They should establish trust with the specified CA and use the token in gRPC calls to the service’s TCP endpoint. This setup ensures secure, efficient metadata retrieval without overloading the Kubernetes API server.

The goals of this enhancement are to provide a secure CSI API for identifying allocated and changed blocks in volume snapshots, and to efficiently relay large amounts of snapshot metadata from the storage provider. This API is an optional component of the CSI framework.

Other enhancements in Kubernetes 1.31

#4193 Bound service account token improvements

Stage: Graduating to Beta
Feature group: sig-auth

The proposal aims to enhance Kubernetes security by embedding the bound Node information in tokens and extending token functionalities. The kube-apiserver will be updated to automatically include the name and UID of the Node associated with a Pod in the generated tokens during a TokenRequest. This requires adding a Getter for Node objects to fetch the Node’s UID, similar to existing processes for Pod and Secret objects.

Additionally, the TokenRequest API will be extended to allow tokens to be bound directly to Node objects, ensuring that when a Node is deleted, the associated token is invalidated. The SA authenticator will be modified to verify tokens bound to Node objects by checking the existence of the Node and validating the UID in the token. This maintains the current behavior for Pod-bound tokens while enforcing new validation checks for Node-bound tokens from the start.

Furthermore, each issued JWT will include a UUID (JTI) to trace the requests made to the apiserver using that token, recorded in audit logs. This involves generating the UUID during token issuance and extending audit log entries to capture this identifier, enhancing traceability and security auditing.
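Putting these pieces together, the decoded payload of a node-bound service account token might look roughly like the following sketch. All names, UIDs, timestamps, and the audience are hypothetical, and the claim layout is illustrative rather than authoritative:

```json
{
  "aud": ["https://kubernetes.default.svc"],
  "iss": "https://kubernetes.default.svc",
  "jti": "c2f3a1d8-6b0e-4b2a-9f3d-1a2b3c4d5e6f",
  "sub": "system:serviceaccount:default:my-sa",
  "kubernetes.io": {
    "namespace": "default",
    "serviceaccount": { "name": "my-sa", "uid": "0f1e2d3c-1111-2222-3333-444455556666" },
    "pod": { "name": "my-app-7c9d8", "uid": "aaaa1111-bbbb-2222-cccc-3333dddd4444" },
    "node": { "name": "worker-1", "uid": "99990000-aaaa-bbbb-cccc-ddddeeeeffff" }
  }
}
```

The jti claim gives auditors a stable identifier to correlate apiserver requests with the token issuance event.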

#3962 Mutating Admission Policies

Stage: Net New to Alpha
Feature group: sig-api-machinery

Continuing the work started in KEP-3488, the project maintainers have proposed adding mutating admission policies using CEL expressions as an alternative to mutating admission webhooks. This builds on the API for validating admission policies established in KEP-3488. The approach leverages CEL’s object instantiation and Server Side Apply’s merge algorithms to perform mutations.

The motivation for this enhancement stems from the simplicity needed for common mutating operations, such as setting labels or adding sidecar containers, which can be efficiently expressed in CEL. This reduces the complexity and operational overhead of managing webhooks. Additionally, CEL-based mutations offer advantages such as allowing the kube-apiserver to introspect mutations and optimize the order of policy applications, minimizing reinvocation needs. In-process mutation is also faster compared to webhooks, making it feasible to re-run mutations to ensure consistency after all operations are applied.

The goals include providing a viable alternative to mutating webhooks for most use cases, enabling policy frameworks without webhooks, offering an out-of-tree implementation for compatibility with older Kubernetes versions, and providing core functionality as a library for use in GitOps, CI/CD pipelines, and auditing scenarios.
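Because this API was still alpha-stage design work at the time, the exact shape may differ, but a policy that adds a label through a CEL ApplyConfiguration might be sketched as follows; the policy name and label are hypothetical:

```yaml
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: MutatingAdmissionPolicy
metadata:
  name: add-env-label          # hypothetical policy
spec:
  matchConstraints:
    resourceRules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      operations: ["CREATE"]
      resources: ["pods"]
  failurePolicy: Ignore
  reinvocationPolicy: IfNeeded
  mutations:
  - patchType: ApplyConfiguration
    applyConfiguration:
      # CEL object instantiation, merged into the incoming object
      # using Server Side Apply semantics
      expression: >
        Object{
          metadata: Object.metadata{
            labels: {"environment": "sandbox"}
          }
        }
```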

#3715 Elastic Indexed Jobs

Stage: Graduating to Stable
Feature group: sig-apps

Also graduating to Stable, this feature allows mutating spec.completions on Indexed Jobs, provided it matches and is updated together with spec.parallelism. The success and failure semantics remain unchanged for jobs that do not alter spec.completions. For jobs that do, failures always count against the job’s backoffLimit, even if spec.completions is scaled down and the failed pods fall outside the new range. The status.Failed count will not decrease, but status.Succeeded will update to reflect successful indexes within the new range. If a previously successful index is out of range due to scaling down and is then brought back into range by scaling up, the index will restart.


Happy 10th Birthday Kubernetes! https://sysdig.com/blog/10-years-of-kubernetes/ Thu, 06 Jun 2024

The post Happy 10th Birthday Kubernetes! appeared first on Sysdig.

As Kubernetes celebrates its 10th anniversary, it’s an opportune moment to reflect on the profound impact Kubernetes has had on the cloud technology landscape. Since its inception, Kubernetes has revolutionized the way we deploy, manage, and scale containerized applications, becoming the de facto orchestration platform for today’s cloud-native ecosystem. This milestone not only highlights Kubernetes’ success as an open-source project but also the vibrant community that has grown around it, driving continuous innovation and collaboration.

In parallel, Sysdig’s journey has been deeply intertwined with Kubernetes’ evolution. From its early days as a pioneering observability solution for containerized workloads, Sysdig has continually evolved to address the growing security needs of the cloud-native world. As we celebrate Kubernetes’ achievements, we also recognize Sysdig’s contributions, particularly through open-source projects like Falco, which enhance the security and resilience of Kubernetes environments. Join me as we walk down memory lane and take a look at Sysdig’s evolution and its significant contributions to the ecosystem over the past decade.

Early Beginnings: Open Source Visibility

Sysdig was founded in 2013 by Loris Degioanni, leveraging his experience as a co-creator of Wireshark, a widely-used network protocol analyzer. Initially, Sysdig focused on providing deep visibility into containerized environments. The creation of open source Sysdig Inspect was a significant milestone in providing visibility into containers and Kubernetes environments. Sysdig Inspect utilized the same deep packet inspection principles from Wireshark, extending them to modern cloud-native applications.

Addressing Security Needs: Introduction of Falco

As Kubernetes rapidly gained traction following its launch in 2014, the need for comprehensive security solutions for containerized workloads became increasingly evident. Recognizing this demand, Sysdig introduced Falco in 2016, an open source project focused on runtime security and threat detection for Kubernetes, containers, and cloud environments. This was a crucial time when Kubernetes was solidifying its position as the outright standard for container orchestration, and tools like Falco played a significant role in enhancing its security posture.

Falco quickly became an essential component of the security toolkit, capable of detecting unexpected behaviors and potential threats in real-time by monitoring system calls. Its significance was further underscored in 2018 when Falco was donated to the Cloud Native Computing Foundation (CNCF), the same organization that had been nurturing Kubernetes since its early days. This move not only highlighted Falco’s importance but also ensured its continued development within this blooming cloud-native ecosystem.

In 2020, the CNCF and The Linux Foundation introduced the Certified Kubernetes Security Specialist (CKS) certification, aimed at professionals who had already obtained the Certified Kubernetes Administrator (CKA) certification and wanted to showcase their expertise in Kubernetes security. Falco was a core component of the CKS certification spec, further highlighting Falco’s integral role in securing Kubernetes environments.

By 2024, as Kubernetes celebrated a decade of revolutionizing application deployment and management, Falco graduated from the CNCF. This milestone marked it as a mature and stable project, ready for widespread adoption in production environments, paralleling Kubernetes’ own journey to maturity and broad acceptance in the industry.

Expanding the Open Source Ecosystem

Following Falco’s success, the community continued to innovate by developing complementary tools to enhance the overall security posture of cloud-native environments:

  • falcosidekick: A companion project that extends Falco’s alerting capabilities by providing a flexible mechanism to forward Falco alerts to various outputs such as Slack, email, or SIEM systems, improving incident response and introspection.
  • falcoctl: A tool designed to simplify the deployment, management, and operation of Falco. It helps streamline security workflows and integrates seamlessly with existing CI/CD pipelines.
  • Promcat: A curated catalog of vetted Prometheus integrations (exporters, dashboards, and alert rules), born from our journey to provide a scalable Prometheus experience. It gives companies a quick answer to the question: “How can I monitor X, Y and Z in my cluster?”
  • Falco Talon: Introduced as a dedicated threat mitigation engine for Kubernetes, Falco Talon enables automated responses to detected threats. It uses Kubernetes primitives to take actions like labeling workloads, terminating suspicious pods, and enforcing network policies, thus mitigating threats in real-time.

The Market Evolution: From Disparate Toolsets to CNAPP

The cloud-native security ecosystem has evolved significantly over the past decade. Initially, organizations relied on various separate tools to secure their cloud environments, each addressing specific needs such as protecting workloads, managing permissions, and ensuring compliance. However, the complexity and fragmented nature of these tools led to a growing demand for a more integrated approach to security.

This shift has given rise to the concept of the Cloud-Native Application Protection Platform (CNAPP), which aims to provide comprehensive security by combining the capabilities of these individual tools into a unified platform. Sysdig has been at the forefront of this evolution, continually enhancing its open-source offerings to deliver end-to-end security solutions that cover the entire lifecycle of cloud-native applications. By integrating workload protection, permission management, and posture management into a single platform, Sysdig simplifies security operations, improves visibility, and enhances the overall security posture of Kubernetes.

Conclusion

Sysdig’s journey over the past 10 years mirrors the rapid evolution of the cloud-native ecosystem. Sysdig has consistently driven innovation to meet the growing demands of modern, containerized environments. As we celebrate Kubernetes’ 10th anniversary, it’s clear that Sysdig’s contributions have been instrumental in shaping the future of cloud-native security, ensuring that organizations can confidently adopt and secure their cloud-native applications.

If you want to see how far Kubernetes has come over the past 10 years, James Spurin from DiveInto shared an interactive, hands-on, in-browser version of the very first version of Kubernetes (v1.0.0). The lab is accessible through his GitHub repository and you can run it absolutely for free! Kubernetes has come a long way since the first official release of the project, and this is a fun way to celebrate its evolution.

The post Happy 10th Birthday Kubernetes! appeared first on Sysdig.

]]>
Optimizing Wireshark in Kubernetes https://sysdig.com/blog/optimizing-wireshark-in-kubernetes/ Tue, 21 May 2024 17:00:00 +0000 https://sysdig.com/?p=89616 In Kubernetes, managing and analyzing network traffic poses unique challenges due to the ephemeral nature of containers and the layered...

The post Optimizing Wireshark in Kubernetes appeared first on Sysdig.

]]>
In Kubernetes, managing and analyzing network traffic poses unique challenges due to the ephemeral nature of containers and the layered abstraction of Kubernetes structures like pods, deployments, and services. Traditional tools like Wireshark, although powerful, struggle to adapt to these complexities, often capturing excessive, irrelevant data – what we call “noise.”

The Challenge with Traditional Packet Capturing

The ephemerality of containers is one of the most obvious issues. By the time a security incident is detected and analyzed, the container involved may no longer exist. When a pod dies in Kubernetes, it is designed to be recreated instantly, and the replacement carries new context, such as a new IP address and pod name. As a starting point, we need to look past the static context of legacy systems and instead perform forensics based on Kubernetes abstractions such as network namespaces and service names.

It’s worth highlighting that Wireshark has some clear contextual limitations in cloud-native environments. Tools like Wireshark are not inherently aware of Kubernetes abstractions, and this disconnect makes it hard to relate network traffic directly back to specific pods or services without significant manual configuration and contextual stitching. Thankfully, Falco carries Kubernetes context in its rule detections. Pairing Wireshark with Falco bridges the gap between raw network data and the intelligence provided by the Kubernetes audit logs, so each network capture now carries associated metadata from the Falco alert.

Finally, there’s the challenge of data overload associated with PCAP files. Traditional packet capture strategies, such as those employed by AWS VPC Traffic Mirroring or GCP Traffic Mirroring, often result in vast amounts of data, most of which is irrelevant to the actual security concern, making it harder to isolate important information quickly and efficiently. Comparatively, options like AWS VPC Flow Logs or Azure’s Virtual Network TAP, although less complex, still incur significant costs in data transfer and storage.

When’s the appropriate time to start a capture? How do you know when to end it? Should it be pre-filtered to reduce the file size, or should we capture everything and then filter out noise in the Wireshark GUI? We might have a solution to these concerns that bypasses the complexities and costs of cloud services.


Introducing a New Approach with Falco Talon

Organizations have long dealt with security blind spots related to Kubernetes alerts. Falco and Falco Talon address these shortcomings through a novel approach that integrates Falco, a cloud-native detection engine, with tshark, the terminal version of Wireshark, for more effective and targeted network traffic analysis in Kubernetes environments.

Falco Talon’s event-driven, API-based approach to threat response is the best way to initiate captures in real time. It is also the most stable approach available with the existing state of the art in cloud-native security – notably, Falco.

Step-by-Step Workflow:

  • Detection: Falco, designed specifically for cloud-native environments like Kubernetes, monitors the environment for suspicious activity and potential threats. It is finely tuned to understand Kubernetes context, making it adept at spotting Indicators of Compromise (IoCs). Let’s say, for example, it triggers a detection for specific anomalous network traffic to a Command and Control (C2) server or botnet endpoints.
  • Automating Tshark: Upon detection of an IoC, Falco sends a webhook to the Falco Talon backend. Talon has many no-code response actions, but one of these actions allows users to trigger arbitrary scripts. This trigger can be context-aware from the metadata associated with the Falco alert, allowing for a tshark command to be automatically initiated with metadata context specific to the incident.
  • Contextual Packet Capturing: Finally, a PCAP file is generated for a few seconds with more tailored context. In the event of a suspicious TCP traffic alert from Falco, we can filter a tshark command for just TCP activity. In the case of a suspicious botnet endpoint, let’s see all traffic to that botnet endpoint. Falco Talon, in each of these scenarios, initiates a tshark capture tailored to the exact network context of the alert. This means capturing traffic only from the relevant pod, service, or deployment implicated in the security alert.
  • Improved Analysis: Finally, the captured data is immediately available for deeper analysis, providing security teams with the precise information needed to respond effectively to the incident. This is valuable for Digital Forensics & Incident Response (DFIR) efforts, but also in maintaining regulatory compliance by logging context specific to security incidents in production.
Wireshark in Kubernetes

This targeted approach not only reduces the volume of captured data, making analysis faster and more efficient, but also ensures that captures are immediately relevant to the security incidents detected, enhancing response times and effectiveness.
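To make the workflow concrete, the sketch below shows a Talon-style response rule that launches a short, filtered tshark capture when Falco fires. The layout is simplified and the rule, action, and variable names are illustrative rather than the exact Falco Talon schema:

```yaml
# Illustrative only: field names are simplified, not the exact Falco Talon schema.
- name: capture-c2-traffic
  match:
    rules:
      - "Outbound Connection to C2 Servers"   # assumed Falco rule name
  action:
    name: run-script
    parameters:
      script: |
        # Capture 30 seconds of TCP traffic to the flagged endpoint,
        # then stop and write a small, incident-scoped PCAP file.
        tshark -i eth0 \
          -a duration:30 \
          -f "host ${C2_SERVER_IP} and tcp" \
          -w /captures/${POD_NAME}-$(date +%s).pcap
```

The tshark flags themselves are standard: `-a duration:30` stops the capture automatically, `-f` applies a BPF capture filter so noise never reaches the file, and `-w` writes the PCAP for later analysis.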

Collaboration and Contribution

We believe this integrated approach marks a significant advancement in Kubernetes security management. If you are interested in contributing to this innovative project or have insights to share, feel free to contribute to the GitHub project today.

This method aligns with the needs of modern Kubernetes environments, leveraging the strengths of both Falco and Wireshark to provide a nuanced, powerful tool for network security. By adapting packet capture strategies to the specific demands of cloud-native architectures, we can significantly improve our ability to secure and manage dynamic containerized applications.

Open source software (OSS) is the only approach with the agility and broad reach to set up the conditions to meet modern security concerns, well-demonstrated by Wireshark over its 25 years of development. Sysdig believes that collaboration brings together expertise and scrutiny, and a broader range of use cases, which ultimately drives more secure software.

This proof-of-concept involves three OSS technologies (Falco, Falco Talon, and Wireshark). While the scenario was specific to Kubernetes, there is no reason why it cannot be adapted to standalone Linux systems, Internet of Things (IoT) devices, and edge computing in the future.


]]>
The Race for Artificial Intelligence Governance https://sysdig.com/blog/the-race-for-artificial-intelligence-governance/ Mon, 13 May 2024 14:00:00 +0000 https://sysdig.com/?p=89286 As AI adoptions become increasingly integral to all aspects of society worldwide, there is a heightened global race to establish...

The post The Race for Artificial Intelligence Governance appeared first on Sysdig.

]]>
As AI adoptions become increasingly integral to all aspects of society worldwide, there is a heightened global race to establish artificial intelligence governance frameworks that ensure their safe, private, and ethical use. Nations and regions are actively developing policies and guidelines to manage AI’s expansive influence and mitigate associated risks. This global effort reflects a recognition of the profound impact that AI has on everything from consumer rights to national security. 

Here are seven AI security regulations from around the world that are either in progress or have already been implemented, illustrating the diverse approaches taken across different geopolitical landscapes. For example, China and the U.S. have prioritized safety and governance, while the EU has prioritized regulation and fines as a way to ensure organizational readiness.


In March 2024, the European Parliament adopted the Artificial Intelligence Act, the world’s first extensive horizontal legal regulation dedicated to AI. 

1. China: New Generation Artificial Intelligence Development Plan

Status: Established


Overview: Launched in 2017, China’s Artificial Intelligence Development Plan (AIDP) outlines objectives for China to lead global AI development by 2030. It includes guidelines for AI security management, use of AI in public services, and promotion of ethical norms and standards. China has since also introduced various standards and guidelines focused on data security and the ethical use of AI.

The AIDP aims to harness AI technology for enhancing administrative, judicial, and urban management, environmental protection, and addressing complex social governance issues, thereby advancing the modernization of social governance.

However, the plan lacks enforceable regulations, as there are no provisions for fines or penalties regarding the deployment of high-risk AI workloads. Instead, it places significant emphasis on research aimed at fortifying the existing AI standards framework. In November 2023, China entered a bilateral AI partnership with the United States. However, Matt Sheehan, a specialist in Chinese AI at Carnegie Endowment for International Peace, remarked to Axios that there’s a prevailing lack of comprehension on both sides — neither country fully grasps the AI standards, testing, and certification systems being developed by the other.

The Chinese initiative advocates for upholding principles of security, availability, interoperability, and traceability. Its objective is to progressively establish and enhance the foundational aspects of AI, encompassing interoperability, industry applications, network security, privacy protection, and other technical standards. To foster an effective artificial intelligence governance dialogue in China, officials must delve into specific priority issues and address them comprehensively.

2. Singapore: Model Artificial Intelligence Governance Framework

Status: Established

Overview: Singapore’s framework stands out as one of the first in Asia to offer comprehensive and actionable guidance on ethical AI governance practices. On Jan. 23, 2019, Singapore’s Personal Data Protection Commission (PDPC) unveiled the first edition of the Model AI Governance Framework (Model Framework) to solicit broader consultation, adoption, and feedback. Following its initial release and feedback received, the PDPC published the second edition of the Model Framework on Jan. 21, 2020, further refining its guidance and support for organizations navigating the complexities of AI deployment.

The Model Framework delivers specific, actionable guidance to private sector organizations on addressing key ethical and governance challenges associated with deploying AI solutions. It includes resources such as the AI Governance Testing Framework and Toolkit, which help organizations ensure that their use of AI is aligned with established ethical standards and governance norms.

The Model Framework seeks to foster public trust and understanding of AI technologies by clarifying how AI systems function, establishing robust data accountability practices, and encouraging transparent communication.

3. Canada: Directive on Automated Decision-Making

Status: Established


Overview: This directive governs the use of automated decision-making systems within the Canadian government. Part of it took effect as early as April 1, 2019, with the compliance portion kicking in a year later.

This directive includes an Algorithmic Impact Assessment tool (AIA), which Canadian federal institutions must use to assess and mitigate risks associated with deploying automated technologies. The AIA is a compulsory risk assessment tool, structured as a questionnaire, designed to complement the Treasury Board’s Directive on Automated Decision-Making. The assessment evaluates the impact level of automated decision systems based on 51 risk assessment questions and 34 mitigation questions.

Non-compliance with this directive could lead to measures deemed appropriate by the Treasury Board under the Financial Administration Act, depending on the specific circumstances. The nature of such discipline is corrective rather than punitive; its purpose is to motivate employees to accept the rules and standards of conduct that are desirable or necessary to achieve the goals and objectives of the organization. For detailed information on the potential consequences of non-compliance with this artificial intelligence governance directive, you can consult the Framework for the Management of Compliance.

4. United States: National AI Initiative Act of 2020

Status: Established


Overview: The National Artificial Intelligence Initiative Act (NAIIA) was signed to promote and coordinate a national AI strategy. It includes efforts to ensure the United States is a global leader in AI, enhance AI research and development, and protect national security interests at a domestic level. While it’s less focused on individual AI applications, it lays the groundwork for the development of future AI regulations and standards.

The NAIIA states its goal is to “modernize governance and technical standards for AI-powered technologies, protecting privacy, civil rights, civil liberties, and other democratic values.” With the NAIIA, the U.S. government intends to build public trust and confidence in AI workloads through the creation of AI technical standards and risk management frameworks.

5. European Union: AI Act

Status: In progress


Overview: The European Union’s AI Act is one of the world’s most comprehensive attempts to establish artificial intelligence governance. It aims to manage risks associated with specific uses of AI and classifies AI systems according to their risk levels, from minimal to unacceptable. High-risk categories include critical infrastructure, employment, essential private and public services, law enforcement, migration, and justice enforcement.

The EU AI Act, still under negotiation, reached a provisional agreement on Dec. 9, 2023. The legislation categorizes AI systems with significant potential harm to health, safety, fundamental rights, and democracy as high risk. This includes AI that could influence elections and voter behavior. The Act also lists banned applications to protect citizens’ rights, prohibiting AI systems that categorize biometric data based on sensitive characteristics, perform untargeted scraping of facial images, recognize emotions in workplaces and schools, implement social scoring, manipulate behavior, or exploit vulnerable populations.

Comparatively, the United States NAIIA office was established as part of the NAIIA Act to predominantly focus efforts on standards and guidelines, whereas the EU’s AI Act actually enforces binding regulations, violations of which would incur significant fines and other penalties without further legislative action.

6. United Kingdom: AI Regulation Proposal

Status: In progress


Overview: Following its exit from the EU, the UK has begun to outline its own regulatory framework for AI, separate from the EU AI Act. The UK’s approach aims to be innovation-friendly, while ensuring high standards of public safety and ethical considerations. The UK’s Centre for Data Ethics and Innovation (CDEI) is playing a key role in shaping these frameworks.

In March 2023, the CDEI published their AI regulation white paper, setting out initial proposals to develop a “pro-innovation regulatory framework” for AI. The proposed framework outlined five cross-sectoral principles for the UK’s existing regulators to interpret and apply within their remits:

  • Safety, security and robustness.
  • Appropriate transparency and explainability.
  • Fairness.
  • Accountability and governance.
  • Contestability and redress.

This proposal also appears to lack clear repercussions for organizations that abuse trust or compromise civil liberties with their AI workloads.

While this in-progress proposal is still weak on taking action against general-purpose AI abuse, it does provide clear intentions to work closely with AI developers, academics and civil society members who can provide independent expert perspectives. The UK’s proposal also mentions an intention to collaborate with international partners leading up to the second annual global AI Safety Summit in South Korea in May 2024.

7. India: AI for All Strategy

Status: In progress

Overview: India’s national AI initiative, known as AI for All, is dedicated to promoting the inclusive growth and ethical usage of AI in India. This program primarily functions as a self-paced online course designed to enhance public understanding of Artificial Intelligence across the country.

The program is intended to demystify AI for a diverse audience, including students, stay-at-home parents, professionals from any sector, and senior citizens — essentially anyone keen to learn about AI tools, use cases, and security concerns. Notably, the program is concise, consisting of two main parts: “AI Aware” and “AI Appreciate,” each designed to be completed within about four hours. The course focuses on making use of AI solutions that are both secure and ethically aligned with societal needs.

It’s important to clarify that the AI for All approach is neither a regulatory framework nor an industry-recognized certification program. Rather, its existence is to help unfamiliar citizens take the initial steps towards embracing an AI-inclusive world. While it does not aim to make participants AI experts, it provides a foundational understanding of AI, empowering them to discuss and engage with this transformative technology effectively.

Conclusion

Each of these initiatives reflects a broader global trend towards creating frameworks that ensure AI technologies are developed and deployed in a secure, ethical, and controlled manner, addressing both the opportunities and challenges posed by AI. Additionally, these frameworks continue to emphasize a real need for robust governance — be it through enforceable laws or comprehensive training programs — to safeguard citizens from the potential dangers of high-risk AI applications. Such measures are crucial to prevent misuse and ensure that AI advancements contribute positively to society without compromising individual rights or safety.


]]>
How Businesses Can Comply with the EU’s Artificial Intelligence Act https://sysdig.com/blog/how-businesses-can-comply-with-the-eus-artificial-intelligence-act/ Tue, 30 Apr 2024 13:48:00 +0000 https://sysdig.com/?p=88107 On March 13, 2024, the European Parliament marked a significant milestone by adopting the Artificial Intelligence Act (AI Act), setting...

The post How Businesses Can Comply with the EU’s Artificial Intelligence Act appeared first on Sysdig.

]]>
On March 13, 2024, the European Parliament marked a significant milestone by adopting the Artificial Intelligence Act (AI Act), setting a precedent with the world’s first extensive horizontal legal regulation dedicated to AI. 

Encompassing EU-wide regulations on data quality, transparency, human oversight, and accountability, the AI Act introduces stringent requirements that carry significant extraterritorial impacts and potential fines of up to €35 million or 7% of global annual revenue, whichever is greater. This landmark legislation is poised to influence a vast array of companies engaged in the EU market. The official document of the AI Act adopted by the European Parliament can be found here.

Originating from a proposal by the European Commission in April 2021, the AI Act underwent extensive negotiations, culminating in a political agreement in December 2023, detailed here. The AI Act is on the cusp of becoming enforceable, pending final approval and publication, initiating a crucial preparatory phase for organizations to align with its provisions.

AI adoption has quickly gone from a nice-to-have to a global disruption, and now there is a global race to ensure it happens ethically and safely.

Here are seven AI security regulations from around the world.

Risk-Based Reporting

The AI Act emphasizes a risk-based regulatory approach and targets a broad range of entities, including AI system providers, importers, distributors, and deployers. It distinguishes between AI applications by the level of risk they pose, from unacceptable and high-risk categories that demand stringent compliance, to limited and minimal-risk applications with fewer restrictions. 

The EU’s AI Act website features an interactive tool, the EU AI Act Compliance Checker, designed to help users determine whether their AI systems will be subject to new regulatory requirements. However, as the AI Act has not yet entered into force, the tool currently serves only as a preliminary guide to estimate potential legal obligations under the forthcoming legislation.

Meanwhile, businesses are increasingly deploying AI workloads with potential vulnerabilities into their cloud-native environments, exposing them to attacks from adversaries. Here, an “AI workload” refers to a containerized application that includes any of the well-known AI software packages, including but not limited to:

  • “transformers”
  • “tensorflow”
  • “NLTK”
  • “spaCy”
  • “OpenAI”
  • “keras”
  • “langchain”
  • “anthropic”

Understanding Risk Categorization

Key to the AI Act’s approach is the differentiation of AI systems based on risk categories, introducing specific prohibitions for AI practices deemed unacceptable based on their threat to fundamental human or privacy rights. In particular, high-risk AI systems are subject to comprehensive requirements aimed at ensuring safety, accuracy, and cybersecurity. The Act also addresses the emergent field of generative AI, introducing categories for general-purpose AI models based on their risk and impact.

General-purpose AI systems are versatile, designed to perform a broad array of tasks across multiple fields, often requiring minimal adjustments or fine-tuning. Their commercial utility is on the rise, fueled by an increase in available computational resources and innovative applications developed by users. Despite their growing prevalence, there is scant regulation to prevent these systems from accessing sensitive business information, potentially violating established data protection laws like the GDPR.

Thankfully, this pioneering legislation does not stand in isolation but operates in conjunction with existing EU laws on data protection and privacy, including the GDPR and the ePrivacy Directive. The AI Act’s enactment will represent a critical step toward establishing a balanced legislation that encourages AI innovation and technological advancements while fostering trust and protecting the fundamental rights of European citizens.

GenAI Adoption has created Cyber Security Opportunities

For organizations, particularly cybersecurity teams, adhering to the AI Act involves more than mere compliance; it’s about embracing a culture of transparency, responsibility, and continuous risk assessment. To effectively navigate this new legal landscape, organizations should consider conducting thorough audits of their AI systems, investing in AI literacy and ethical AI practices, and establishing robust governance frameworks to manage AI risks proactively. 

According to Gartner, “AI assistants like Microsoft Security Copilot, Sysdig Sage, and CrowdStrike Charlotte AI exemplify how these technologies can improve the efficiency of security operations. Security TSPs can leverage embedded AI capabilities to offer differentiated outcomes and services. Additionally, the need for GenAI-focused security consulting and professional services will arise as end users and TSPs drive AI innovation.1

AI compliance

Conclusion

Engaging with regulators, joining industry consortiums, and adhering to best practices in AI security and ethics are crucial steps for organizations to not only comply with the AI Act, but also foster a reliable AI ecosystem. Sysdig is committed to assisting organizations on their journey to secure AI workloads and mitigate active AI risks. We invite you to join us at the RSA Conference on May 6 – 9, 2024, where we will unveil our strategy for real-time AI Workload Security, with a special focus on our AI Audit capabilities that are essential for adherence to forthcoming compliance frameworks like the EU AI Act.

  1. Gartner, “Quick Answer: How GenAI Adoption Creates Cybersecurity Opportunities,” Mark Wah, Lawrence Pingree, Matt Milone.


]]>
What’s New in Kubernetes  1.30? https://sysdig.com/blog/whats-new-in-kubernetes-1-30/ Mon, 15 Apr 2024 15:00:00 +0000 https://sysdig.com/?p=87182 Kubernetes 1.30 is on the horizon, and it’s packed with fresh and exciting features! So, what’s new in this upcoming...

The post What’s New in Kubernetes  1.30? appeared first on Sysdig.

]]>
Kubernetes 1.30 is on the horizon, and it’s packed with fresh and exciting features! So, what’s new in this upcoming release?

Kubernetes 1.30 brings a plethora of enhancements, including a blend of 58 new and improved features. From these, several are graduating to stable, including the highly anticipated Container Resource Based Pod Autoscaling, which refines the capabilities of the Horizontal Pod Autoscaler by focusing on individual container metrics. New alpha features are also making their debut, promising to revolutionize how resources are managed and allocated within clusters.
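For example, with Container Resource Based Pod Autoscaling, an HPA can track a single container’s CPU usage instead of the pod-wide average. A minimal manifest might look like this (the workload and container names are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                    # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: ContainerResource      # scale on one container's CPU, not the pod average
      containerResource:
        name: cpu
        container: app             # the container whose metrics drive scaling
        target:
          type: Utilization
          averageUtilization: 60
```

With this, sidecar containers in the pod no longer skew the scaling signal for the application container.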

Watch out for major changes such as the introduction of Structured Parameters for Dynamic Resource Allocation, enhancing the previously introduced dynamic resource allocation with a more structured and understandable approach. This ensures that Kubernetes components can make more informed decisions, reducing dependency on third-party drivers.

Further enhancing security, the support for User Namespaces in Pods moves to beta, offering refined isolation and protection against vulnerabilities by supporting user namespaces, allowing for customized UID/GID ranges that bolster pod security.
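Opting a pod into its own user namespace is a single field on the pod spec; the sketch below assumes a cluster and container runtime with the beta feature enabled:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo      # illustrative name
spec:
  hostUsers: false       # run the pod in a dedicated user namespace
  containers:
    - name: app
      image: nginx
```

Inside the pod, processes can still appear to run as root, but those UIDs map to unprivileged ranges on the host, limiting the blast radius of a container escape.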

There are also numerous quality-of-life improvements that continue the trend of making Kubernetes more user-friendly and efficient, such as updates in pod resource management and network policies.

We are buzzing with excitement for this release! There’s plenty to unpack here, so let’s dive deeper into what Kubernetes 1.30 has to offer.

Kubernetes 1.30 – Editor’s pick

These are the features that look most exciting to us in this release:

#2400 Memory Swap Support

This enhancement sees the most significant overhaul, improving system stability by modifying swap memory behavior on Linux nodes to better manage memory usage and system performance. By optimizing how swap memory is handled, Kubernetes can ensure smoother operation of applications under various load conditions, thereby reducing system crashes and enhancing overall reliability.

Nigel Douglas, Sr. Open Source Security Advocate (Falco Security)
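On a Linux node with swap enabled, this behavior is driven by the kubelet configuration; a minimal sketch, assuming the NodeSwap feature gate is on:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false             # allow the kubelet to start on a node with swap enabled
memorySwap:
  swapBehavior: LimitedSwap   # Burstable pods may swap, up to a calculated limit
```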

#3221 Structured Authorization Configuration

This enhancement also hits beta, streamlining the creation of authorization chains with enhanced capabilities like multiple webhooks and fine-grained control over request validation, all configured through a structured file. By allowing for complex configurations and precise authorization mechanisms, this feature significantly enhances security and administrative efficiency, making it easier for administrators to enforce policy compliance across the cluster.

Mike Coleman, Staff Developer Advocate – Open Source Ecosystem
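The structured configuration file takes roughly the shape below; the field names follow the beta API, while the webhook name, kubeconfig path, and CEL expression are invented for illustration:

```yaml
apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthorizationConfiguration
authorizers:
  - type: Node
    name: node
  - type: Webhook
    name: policy-webhook             # illustrative webhook
    webhook:
      timeout: 3s
      subjectAccessReviewVersion: v1
      matchConditionSubjectAccessReviewVersion: v1
      failurePolicy: NoOpinion       # fall through to the next authorizer on error
      connectionInfo:
        type: KubeConfigFile
        kubeConfigFile: /etc/kubernetes/authz-webhook.kubeconfig
      matchConditions:
        - expression: "request.resourceAttributes.namespace != 'kube-system'"
  - type: RBAC
    name: rbac
```

The API server is pointed at this file via the --authorization-config flag, replacing the flat --authorization-mode chain.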

#3488 CEL for Admission Control


The integration of Common Expression Language (CEL) for admission control introduces a dynamic method to enforce complex, fine-grained policies directly through the Kubernetes API, enhancing both security and governance capabilities. This improvement enables administrators to craft policies that are not only more nuanced but also responsive to the evolving needs of their deployments, thereby ensuring that security measures keep pace with changes without requiring extensive manual updates.

Thomas Labarussias, Sr. Developer Advocate & CNCF Ambassador
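As a small example of what CEL admission control enables, the policy below rejects Deployments that exceed a replica budget (the policy name and limit are illustrative):

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: replica-limit              # illustrative name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    - expression: "object.spec.replicas <= 5"
      message: "deployments are limited to 5 replicas"
```

A policy only takes effect once a ValidatingAdmissionPolicyBinding references it and sets validationActions (e.g., Deny) for the resources it should cover.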




Apps in Kubernetes 1.30

#4443 More granular failure reason for Job PodFailurePolicy

Stage: Net New to Alpha
Feature group: sig-apps

The current approach of assigning a general “PodFailurePolicy” reason to a Job’s failure condition could be enhanced for specificity. One way to achieve this is by adding a customizable Reason field to the PodFailurePolicyRule, allowing for distinct, machine-readable reasons for each rule trigger, subject to character limitations. This method, preferred for its clarity, would enable higher-level APIs utilizing Jobs to respond more precisely to failures, particularly by associating them with specific container exit codes.
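For context, a pod failure policy rule keyed on exit codes already looks like the sketch below; the alpha work would let each such rule also carry its own machine-readable reason, so the field layout may still change before release:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: batch-demo           # illustrative name
spec:
  backoffLimit: 3
  podFailurePolicy:
    rules:
      - action: FailJob      # fail the whole Job when this rule matches
        onExitCodes:
          containerName: main
          operator: In
          values: [42]       # e.g., an unrecoverable application error
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: main
          image: busybox
          command: ["sh", "-c", "exit 42"]
```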

#3017 PodHealthyPolicy for PodDisruptionBudget

Stage: Graduating to Stable
Feature group: sig-apps

Pod Disruption Budgets (PDBs) are utilized for two main reasons: to maintain availability by limiting voluntary disruptions and to prevent data loss by avoiding eviction until critical data replication is complete. However, the current PDB system has limitations. It sometimes prevents eviction of unhealthy pods, which can impede node draining and auto-scaling. 

Additionally, the use of PDBs for data safety is not entirely reliable and could be considered a misuse of the API. Despite these issues, the dependency on PDBs for data protection is significant enough that any changes to PDBs must continue to support this requirement, as Kubernetes does not offer alternative solutions for this use case. The goals are to refine PDBs to avoid blocking evictions due to unhealthy pods and to preserve their role in ensuring data safety.
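In the stable API, this surfaces as the unhealthyPodEvictionPolicy field on the PodDisruptionBudget spec. For example, AlwaysAllow lets node drains evict pods that are running but not yet Ready (names are illustrative):

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb              # illustrative name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
  unhealthyPodEvictionPolicy: AlwaysAllow   # evict not-Ready pods even when
                                            # the budget would otherwise block it
```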

#3998 Job Success/completion policy

Stage: Net New to Alpha
Feature group: sig-apps

This Kubernetes 1.30 enhancement offers an extension to the Job API, specifically for Indexed Jobs, allowing them to be declared as successful based on predefined conditions. This change addresses the need in certain batch workloads, like those using MPI or PyTorch, where success is determined by the completion of specific “leader” indexes rather than all indexes. 

Currently, a job is only marked as complete if every index succeeds, which is limiting for some applications. By introducing a success policy, which is already implemented in third-party frameworks like the Kubeflow Training Operator, Flux Operator, and JobSet, Kubernetes aims to provide more flexibility. This enhancement would enable the system to terminate any remaining pods once the job meets the criteria specified by the success policy.
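
Sketched as a manifest, a leader-based success policy might look like this (image name is illustrative; the successPolicy field is alpha and sits behind the JobSuccessPolicy feature gate):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: mpi-job
spec:
  completionMode: Indexed
  completions: 4
  parallelism: 4
  successPolicy:
    rules:
    # Declare the Job successful once the leader (index 0) succeeds;
    # any remaining worker pods are then terminated.
    - succeededIndexes: "0"
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: registry.example.com/mpi-worker:latest
```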

CLI in Kubernetes 1.30

#4292 Custom profile in kubectl debug

Stage: Net New to Alpha
Feature group: sig-cli

This merged enhancement adds the --custom flag to kubectl debug, letting users customize their debug resources. The enhancement of the ‘kubectl debug’ feature is set to significantly improve the security posture of operations teams. 

Historically, the absence of a shell in base images posed a challenge for real-time debugging, which discouraged some teams from using these secure, minimalistic containers. Now, with the ability to attach data volumes within a debug container, end-users are enabled to perform in-depth analysis and troubleshooting without compromising on security. 

This capability promises to make the use of shell-less base images more appealing by simplifying the debugging process.
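
As a sketch, a custom profile is a partial container spec stored in a file and passed via the flag (the file name, volume name, and exact accepted format, JSON versus YAML, vary by kubectl version and are assumptions here):

```yaml
# debug-profile.yaml -- partial container spec merged into the debug container
securityContext:
  runAsNonRoot: true
volumeMounts:
- name: data          # mount an existing pod volume for inspection
  mountPath: /data
```

It would then be applied with something like kubectl debug mypod -it --image=busybox --custom=debug-profile.yaml.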

#2590 Add subresource support to kubectl

Stage: Graduating to Stable
Feature group: sig-cli

The proposal introduces a new --subresource=[subresource-name] flag for the kubectl commands get, patch, edit, and replace. 

This enhancement will enable users to access and modify status and scale subresources for all compatible API resources, including both built-in resources and Custom Resource Definitions (CRDs). The output for status subresources will be displayed in a formatted table similar to the main resource. 

This feature follows the same API conventions as full resources, allowing expected reconciliation behaviors by controllers. However, if the flag is used on a resource without the specified subresource, a ‘NotFound’ error message will be returned.
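
In practice this looks like the following (assuming a Deployment named nginx exists):

```shell
# Read the status subresource; output is a table like the parent resource
kubectl get deployment nginx --subresource=status

# Patch the scale subresource directly instead of editing spec.replicas
kubectl patch deployment nginx --subresource=scale --type=merge -p '{"spec":{"replicas":3}}'
```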

#3895 Interactive flag added to kubectl delete command

Stage: Graduating to Stable
Feature group: sig-cli

This proposal suggests introducing an interactive mode for the kubectl delete command to enhance safety measures for cluster administrators against accidental deletions of critical resources. 

The kubectl delete command is powerful and permanent, presenting risks of unintended consequences from errors such as mistyping or hasty decisions. To address the potential for such mishaps without altering the default behavior due to backward compatibility concerns, the proposal recommends a new interactive (-i) flag. 

This flag would prompt users for confirmation before executing the deletion, providing an additional layer of protection and decision-making opportunity to prevent accidental removal of essential resources.
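
Usage is straightforward (the resource name is illustrative):

```shell
# Lists the resources about to be deleted and asks for confirmation
# before anything is removed:
kubectl delete deployment nginx --interactive
```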

Instrumentation

#647 API Server tracing

Stage: Graduating to Stable
Feature group: sig-instrumentation

This Kubernetes 1.30 enhancement aims to improve debugging through enhanced tracing in the API Server, utilizing OpenTelemetry libraries for structured, detailed trace data. It seeks to facilitate easier analysis by enabling distributed tracing, which allows for comprehensive insight into requests and context propagation. 

The proposal outlines goals to generate and export trace data for requests, alongside propagating context between incoming and outgoing requests, thus enhancing debugging capabilities and enabling plugins like admission webhooks to contribute to trace data for a fuller understanding of request paths.

#2305 Metric cardinality enforcement

Stage: Graduating to Stable
Feature group: sig-instrumentation

This enhancement addresses the issue of unbounded metric dimensions causing memory problems in instrumented components by introducing a dynamic, runtime-configurable allowlist for metric label values. 

Historically, the Kubernetes community has dealt with problematic metrics through various inconsistent approaches, including deleting offending labels or metrics entirely, or defining a retrospective set of acceptable values. These fixes are manual, labor-intensive, and time-consuming, lacking a standardized solution. 

This enhancement aims to remedy this by allowing metric dimensions to be bound to a predefined set of values independently of code releases, streamlining the process and preventing memory leaks without necessitating immediate binary releases.

#3077 Contextual Logging

Stage: Graduating to Beta
Feature group: sig-instrumentation

This contextual logging proposal introduces a shift from using a global logger to passing a logr.Logger instance through functions, either via a context.Context or directly, leveraging the benefits of structured logging. This method allows callers to enrich log messages with key/value pairs, specify names indicating the logging component or operation, and adjust verbosity to control the volume of logs generated by the callee. 

The key advantage is that this is achieved without needing to feed extra information to the callee, as the necessary details are encapsulated within the logger instance itself. 

Furthermore, it liberates third-party components utilizing Kubernetes packages like client-go from being tethered to the klog logging framework, enabling them to adopt any logr.Logger implementation and configure it to their preferences. For unit testing, this model facilitates isolating log output per test case, enhancing traceability and analysis. 

The primary goal is to eliminate klog’s direct API calls and its mandatory adoption across packages, empowering function callers with logging control, and minimally impacting public APIs while providing guidance and tools for integrating logging into unit tests.

Network in Kubernetes 1.30

#3458 Remove transient node predicates from KCCM’s service controller

Stage: Graduating to Stable
Feature group: sig-network

To mitigate hasty disconnection of services and to minimize the load on cloud providers’ APIs, a new proposal suggests a change in how the Kubernetes cloud controller manager (KCCM) interacts with load balancer node sets. 

This enhancement aims to discontinue the practice of immediate node removal when nodes temporarily lose readiness or are being terminated. Instead, by introducing the StableLoadBalancerNodeSet feature gate, it would promote a smoother transition by enabling connection draining, allowing applications to benefit from graceful shutdowns and reducing unnecessary load balancer re-syncs. This change is aimed at enhancing application reliability without overburdening cloud provider systems.

#3836 Ingress Connectivity Reliability Improvement for Kube-Proxy

Stage: Graduating to Beta
Feature group: sig-network

This Kubernetes 1.30 enhancement introduces modifications to the Kubernetes cloud controller manager’s service controller, specifically targeting the health checks (HC) used by load balancers. These changes aim to improve how these checks interact with kube-proxy, the service proxy managed by Kubernetes. There are three main improvements: 

1) Enabling kube-proxy to support connection draining on terminating nodes by failing its health checks when nodes are marked for deletion, particularly useful during cluster downsizing scenarios; 

2) Introducing a new /livez health check path in kube-proxy that maintains traditional health check semantics, allowing uninterrupted service during node terminations; 

3) Advocating for standardized health check procedures across cloud providers through a comprehensive guide on Kubernetes’ official website. 

These updates seek to ensure graceful shutdowns of services and improve overall cloud provider integration with Kubernetes clusters, particularly for services routed through nodes marked for termination.

#1860 Make Kubernetes aware of the LoadBalancer behavior

Stage: Graduating to Beta
Feature group: sig-network

This enhancement is a modification to the kube-proxy configurations for handling External IPs of LoadBalancer Services. Currently, kube-proxy implementations, including ipvs and iptables, automatically bind External IPs to each node for optimal traffic routing directly to services, bypassing the load balancer. This process, while beneficial in some scenarios, poses problems for certain cloud providers like Scaleway and Tencent Cloud, where such binding disrupts inbound traffic from the load balancer, particularly health checks. 

Additionally, features like TLS termination and the PROXY protocol implemented at the load balancer level are bypassed, leading to protocol errors. The enhancement suggests making this binding behavior configurable at the cloud controller level, allowing cloud providers to disable or adjust this default setting to better suit their infrastructure and service features, addressing these issues and potentially offering a more robust solution than current workarounds.
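
Concretely, this surfaces as an ipMode field in the Service status, set by the cloud provider rather than by users. A sketch of what a populated status might look like:

```yaml
status:
  loadBalancer:
    ingress:
    - ip: 203.0.113.10
      # Proxy tells kube-proxy not to bind this IP on the node, so traffic
      # keeps flowing through the load balancer (preserving health checks,
      # TLS termination, and the PROXY protocol). The default, VIP,
      # preserves the old bypass behavior.
      ipMode: Proxy
```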

Kubernetes 1.30 Nodes

#3960 Introducing Sleep Action for PreStop Hook

Stage: Graduating to Beta
Feature group: sig-node

This Kubernetes 1.30 enhancement introduces a ‘sleep’ action for the PreStop lifecycle hook, offering a simpler, native option for managing container shutdowns. 

Instead of relying on scripts or custom solutions for delaying termination, containers could use this built-in sleep to gracefully wrap up operations, easing transitions in load balancing, and allowing external systems to adjust, thereby boosting Kubernetes applications’ reliability and uptime.
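
A minimal sketch (image name is illustrative; the sleep action sits behind the PodLifecycleSleepAction feature gate, enabled by default in beta):

```yaml
containers:
- name: web
  image: registry.example.com/web:latest
  lifecycle:
    preStop:
      sleep:
        seconds: 5   # give load balancers time to drain before SIGTERM arrives
```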

#2400 Node Memory Swap Support

Stage: Major Change to Beta
Feature group: sig-node

The enhancement integrates swap memory support into Kubernetes, addressing two key user groups: node administrators for performance tuning and application developers requiring swap for their apps. 

The focus is to facilitate controlled swap use on a node level, with the kubelet enabling Kubernetes workloads to utilize swap space under specific configurations. The ultimate goal is to enhance Linux node operation with swap, allowing administrators to determine swap usage for workloads, initially not permitting individual workloads to set their own swap limits.
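
On the node side, this is configured through the kubelet. A sketch of a kubelet configuration enabling limited swap (values are illustrative):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false            # permit the kubelet to start on a node with swap enabled
featureGates:
  NodeSwap: true
memorySwap:
  # LimitedSwap restricts swap to Burstable QoS pods, proportional to
  # their memory requests; workloads cannot request swap individually.
  swapBehavior: LimitedSwap
```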

#24 AppArmor Support

Stage: Graduating to Stable
Feature group: sig-node

Adding AppArmor support to Kubernetes marks a significant enhancement in the security posture of containerized workloads. AppArmor is a Linux kernel module that allows system admins to restrict certain capabilities of a program using profiles attached to specific applications or containers. By integrating AppArmor into Kubernetes, developers can now define security policies directly within an app config.

The initial implementation of this feature would allow for specifying an AppArmor profile within the Kubernetes API for individual containers or entire pods. This profile, once defined, would be enforced by the container runtime, ensuring that the container’s actions are restricted according to the rules defined in the profile. This capability is crucial for running secure and confined applications in a multi-tenant environment, where a compromised container could potentially affect other workloads or the underlying host.
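
With the fields that graduated into the built-in API, a profile can be set at the pod or container level. A minimal sketch:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hello-apparmor
spec:
  securityContext:
    appArmorProfile:
      type: RuntimeDefault   # or Localhost plus localhostProfile: <profile-name-on-node>
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
```

This replaces the older container.apparmor.security.beta.kubernetes.io/&lt;container&gt; annotations.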

Scheduling

#3633 Introduce MatchLabelKeys to Pod Affinity and Pod Anti Affinity

Stage: Graduating to Beta
Feature group: sig-scheduling

This Kubernetes 1.30 enhancement introduces MatchLabelKeys for PodAffinityTerm to refine PodAffinity and PodAntiAffinity, enabling more precise control over Pod placements during scenarios like rolling upgrades. 

By allowing users to specify the scope for evaluating Pod co-existence, it addresses scheduling challenges that arise when new and old Pod versions are present simultaneously, particularly in saturated or idle clusters. This enhancement aims to improve scheduling effectiveness and cluster resource utilization.
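
A sketch of an anti-affinity term using matchLabelKeys (labels are illustrative; while not yet stable, the behavior sits behind the MatchLabelKeysInPodAffinity feature gate):

```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - topologyKey: kubernetes.io/hostname
      labelSelector:
        matchLabels:
          app: web
      # Only pods from the same ReplicaSet revision are considered, so old
      # and new pods do not repel each other during a rolling update.
      matchLabelKeys:
      - pod-template-hash
```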

#3902 Decouple TaintManager from NodeLifecycleController

Stage: Graduating to Stable
Feature group: sig-scheduling

This enhancement separated the NodeLifecycleController duties into two distinct controllers. Currently, the NodeLifecycleController is responsible for both marking unhealthy nodes with NoExecute taints and evicting pods from these tainted nodes. 

The proposal introduces a dedicated TaintEvictionController specifically for managing the eviction of pods based on NoExecute taints, while the NodeLifecycleController will continue to focus on applying taints to unhealthy nodes. This separation aims to streamline the codebase, allowing for more straightforward enhancements and the potential development of custom eviction strategies. 

The motivation behind this change is to untangle the intertwined functionalities, thus improving the system’s maintainability and flexibility in handling node health and pod eviction processes.

#3838 Mutable Pod scheduling directives when gated

Stage: Graduating to Stable
Feature group: sig-scheduling

The enhancement introduced in #3521, PodSchedulingReadiness, aimed at empowering external resource controllers – like extended schedulers or dynamic quota managers – to determine the optimal timing for a pod’s eligibility for scheduling by the kube-scheduler. 

Building on this foundation, the current enhancement seeks to extend that flexibility by allowing mutability in a pod’s scheduling directives, specifically node selector and node affinity, under the condition that such updates further restrict the pod’s scheduling options. This capability enables external resource controllers not just to decide the timing of scheduling but also to influence the specific placement of the pod within the cluster. 

This approach fosters a new pattern in Kubernetes scheduling, encouraging the development of lightweight, feature-specific schedulers that complement the core functionality of the kube-scheduler without the need for maintaining custom scheduler binaries. This pattern is particularly advantageous for features that can be implemented without the need for custom scheduler plugins, offering a streamlined way to enhance scheduling capabilities within Kubernetes ecosystems.
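
The building block here is the schedulingGates field introduced by #3521. A sketch (gate and image names are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gated-pod
spec:
  # The pod stays in Pending until its controller removes every gate.
  # While gated, nodeSelector and node affinity may still be updated,
  # but only in ways that further narrow where the pod can land.
  schedulingGates:
  - name: example.com/quota-check
  containers:
  - name: app
    image: registry.example.com/app:latest
```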

Kubernetes 1.30 storage

#3141 Prevent unauthorized volume mode conversion during volume restore

Stage: Graduating to Stable
Feature group: sig-storage

This enhancement addresses a potential security gap in Kubernetes’ VolumeSnapshot feature by introducing safeguards against unauthorized changes in volume mode during the creation of a PersistentVolumeClaim (PVC) from a VolumeSnapshot. 

It outlines a mechanism to ensure that the original volume mode of the PVC is preserved, preventing exploitation through kernel vulnerabilities, while accommodating legitimate backup and restore processes that may require volume mode conversion for efficiency. This approach aims to enhance security without impeding valid backup and restore workflows.

#1710 Speed up recursive SELinux label change

Stage: Net New to Beta
Feature group: sig-storage

This enhancement details improvements to SELinux integration with Kubernetes, focusing on enhancing security measures for containers running on Linux systems with SELinux in enforcing mode. The proposal outlines how SELinux prevents escaped container users from accessing host OS resources or other containers by assigning unique SELinux contexts to each container and labeling volume contents accordingly. 

The proposal also seeks to refine how Kubernetes handles SELinux contexts, offering the option to either set these manually via PodSpec or allow the container runtime to automatically assign them. Key advancements include the ability to mount volumes with specific SELinux contexts using the -o context= option during the first mount to ensure the correct security labeling, as well as recognizing which volume plugins support SELinux. 

The motivation behind these changes includes enhancing performance by avoiding extensive file relabeling, preventing space issues on nearly full volumes, and increasing security, especially for read-only and shared volumes. This approach aims to streamline SELinux policy enforcement across Kubernetes deployments, particularly in securing containerized environments against potential security breaches like CVE-2021-25741.

#3756 Robust VolumeManager reconstruction after kubelet restart

Stage: Graduating to Stable
Feature group: sig-storage

This enhancement addresses the issues with kubelet’s handling of mounted volumes after a restart, where it currently loses track of volumes for running Pods and attempts to reconstruct this state from the API server and the host OS – a process known to be flawed. 

It proposes a reworking of this process, essentially a substantial bugfix that impacts significant portions of kubelet’s functionality. Due to the scope of these changes, they will be implemented behind a feature gate, allowing users to revert to the old system if necessary. This initiative builds on the foundations laid in KEP 1790, which previously went alpha in v1.26. 

The modifications aim to enhance how kubelet, during startup, can better understand how volumes were previously mounted and assess whether any changes are needed. Additionally, it seeks to address issues like those documented in bug #105536, where volumes fail to be properly cleaned up after a kubelet restart, thus improving the overall robustness of volume management and cleanup.

Other enhancements

#1610 Container Resource based Pod Autoscaling

Stage: Graduating to Stable
Feature group: sig-autoscaling

This enhancement outlines improvements to the Horizontal Pod Autoscaler’s (HPA) functionality, specifically allowing it to scale resources based on the usage metrics of individual containers within a pod. Currently, HPA aggregates resource consumption across all containers, which may not be ideal for complex workloads with containers whose resource usage does not uniformly scale. 

With the proposed changes, HPA would have the capability to scale more precisely by assessing the resource demands of each container separately.
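
With this in place, an HPA can target a single container’s utilization via the ContainerResource metric type. A sketch (names are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: ContainerResource
    containerResource:
      name: cpu
      container: app        # scale on this container's CPU, ignoring sidecars
      target:
        type: Utilization
        averageUtilization: 60
```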

#2799 Reduction of Secret-based Service Account Tokens

Stage: Graduating to Stable
Feature group: sig-auth

This improvement outlines measures to minimize the reliance on less secure, secret-based service account tokens following the general availability of BoundServiceAccountTokenVolume in Kubernetes 1.22. With this feature, service account tokens are acquired via the TokenRequest API and stored in a projected volume, making the automatic generation of secret-based tokens unnecessary. 

This aims to cease the auto-generation of these tokens and remove any that are unused, while still preserving tokens explicitly requested by users. The suggested approach includes modifying the service account control loop to prevent automatic token creation, promoting the use of the TokenRequest API or manually created tokens, and implementing a purge process for unused auto-generated tokens.
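
The projected-volume mechanism this relies on looks roughly like the following in a pod spec (volume name and audience are illustrative):

```yaml
volumes:
- name: api-token
  projected:
    sources:
    - serviceAccountToken:
        path: token
        expirationSeconds: 3600   # short-lived; the kubelet rotates it automatically
        audience: api             # token is only valid for this audience
```

For ad-hoc use, kubectl create token &lt;serviceaccount&gt; requests a token through the same TokenRequest API.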

#4008 CRD Validation Ratcheting

Stage: Graduating to Beta
Feature group: sig-api-machinery

This proposal focuses on improving the usability of Kubernetes by advocating for the “shift left” of validation logic, moving it from controllers to the frontend when possible. Currently, the process of modifying validation for unchanged fields in a Custom Resource Definition (CRD) is cumbersome, often requiring version increments even for minor validation changes. This complexity hinders the adoption of advanced validation features by both CRD authors and Kubernetes developers, as the risk of disrupting user workflows is high. Such restrictions not only degrade user experience but also impede the progression of Kubernetes itself. 

For instance, KEP-3937 suggests introducing declarative validation with new format types, which could disrupt existing workflows. The goals of this enhancement are to eliminate the barriers that prevent CRD authors and Kubernetes from both broadening and tightening value validations without causing significant disruptions. The proposal aimed to automate these enhancements for all CRDs in clusters where the feature is enabled, maintaining performance with minimal overhead and ensuring correctness by preventing invalid values according to the known schema.

If you liked this, you might want to check out our previous ‘What’s new in Kubernetes’ editions:

Get involved with the Kubernetes project:

And if you enjoy keeping up to date with the Kubernetes ecosystem, subscribe to our container newsletter, a monthly email with the coolest stuff happening in the cloud-native ecosystem.

The post What’s New in Kubernetes 1.30? appeared first on Sysdig.

The Hidden Economy of Open Source Software https://sysdig.com/blog/hidden-economy-of-open-source-software/ Fri, 12 Apr 2024 14:00:00 +0000 https://sysdig.com/?p=86937 The recent discovery of a backdoor in XZ Utils (CVE-2024-3094), a data compression utility used by a wide array of...

The post The Hidden Economy of Open Source Software appeared first on Sysdig.

The recent discovery of a backdoor in XZ Utils (CVE-2024-3094), a data compression utility used by a wide array of various open-source, Linux-based computer applications, underscores the importance of open-source software security. While it is often not consumer-facing, open-source software is a critical component of computing and internet functions, such as secure communications between machines.

Open source software (abbreviated as OSS) has become a cornerstone of the tech industry, influencing everything from small startups to global corporations. Despite its ubiquitous presence and foundational role in driving innovation, the true economic value of OSS has remained largely uncharted territory—until now. A groundbreaking study entitled “The Value of Open Source Software” by researchers Manuel Hoffmann, Frank Nagle, and Yanuo Zhou at Harvard Business School delves into this unexplored domain, revealing the astonishing economic impact of OSS throughout industry.

A Priceless Foundation with a Trillion-Dollar Impact

The study begins by addressing a fundamental paradox: How do you measure the value of something that is freely available? Traditionally, economic value is calculated by multiplying the price of a product by the quantity sold. However, this formula hits a snag when it comes to OSS—there’s no price tag on something that’s free, and tracking its usage is a Herculean task due to the decentralised nature of OSS distribution.

Leveraging unique global data sources and a novel approach, the research estimates the “supply-side” value (the cost to recreate the most widely used OSS) at $4.15 billion. But the true eye-opener is the “demand-side” value, pegged at a staggering $8.8 trillion. This figure represents the hypothetical cost that companies would face if they had to develop equivalent software internally, highlighting the immense savings and efficiency gains OSS provides to the global economy.

For instance, Falco, an open-source, cloud-native security tool, boasts contributions from 190 individuals dedicated to enhancing the software and ensuring it meets the evolving threats in cloud computing. If an organisation attempted to develop a custom threat detection engine in Go from scratch, it would be financially impractical to employ 190 staff members to continuously develop and maintain the tool. Although most of the 190 contributors likely engage with Falco as a side project rather than their primary employment, acknowledging the number of people actively committing to the project offers valuable insight into its collective human investment.

Cloud attacks are happening faster than ever before. The OWASP Top 10 for Kubernetes is a set of security risks specific to Kubernetes environments to address in order to ensure the security of cloud-native applications.

Read Our Guide

The Unsung Heroes of OSS

One of the most intriguing findings of the study is the concentration of value creation within the OSS community. A mere 5% of OSS developers are responsible for 96% of its demand-side value. This elite group of contributors has a disproportionate impact on the software landscape, emphasising the need for support and recognition from both the tech industry and policymakers.

Sticking to the topic of the recent XZ Utils backdoor, to prevent incidents like that from recurring, policymakers and software vendors must take proactive steps to enhance the security and integrity of existing OSS projects. Many OSS maintainers work on these projects voluntarily, without compensation, and often in addition to their regular employment. This can lead to overwork and burnout, creating vulnerabilities that adversaries can exploit to compromise software. 

Without adequate safeguards and support systems, these maintainers operate in an environment that undervalues their crucial contributions and exposes them to significant risks. To address these challenges, there is a pressing need for policy interventions that recognise and financially support OSS development, along with industry-wide adoption of rigorous security practices. By implementing measures such as funding OSS projects, offering security training for maintainers, and developing comprehensive review processes, policymakers and vendors can protect maintainers from undue pressures and enhance the security of OSS.

The Programming Languages That Power the Economy

Digging deeper, the study finds that the lion’s share of OSS value is actually generated by a few key programming languages, with Go, JavaScript, and Java leading the pack. These languages are not just popular among developers; they are instrumental in creating billions of dollars in value, further emphasizing the strategic importance of investing in and nurturing the OSS ecosystem.

The notion of organisations opting to create proprietary programming languages rather than leveraging existing open-source options like JavaScript or Python libraries does not hold practical merit, considering the extensive resources and expertise required for such an endeavor. 

Constructing a new programming language from scratch involves not just the immense initial development effort but also the continuous maintenance, development of libraries, tools, and community support to make it viable for production use. Moreover, the existing ecosystems around popular languages such as JavaScript and Python are the result of years of collective effort and contributions from a global community, encompassing vast libraries and frameworks that facilitate rapid development and deployment of applications.

These widely-used languages, however, are not without their vulnerabilities, including known Common Vulnerabilities and Exposures (CVEs) that pose significant security risks if left unpatched. Addressing these vulnerabilities often falls beyond the capacity of individual organisations, especially considering the breadth of open-source dependencies modern applications rely on. This scenario underscores the crucial role of large software vendors in enhancing the security infrastructure of the open-source ecosystem. 

By contributing to the security of these languages and libraries, either through direct code contributions, funding, or the provision of advanced security tools and services, these vendors can significantly reduce the potential attack surface for organisations worldwide. Such collaborative efforts between individual maintainers, organisations, and large vendors are essential in bolstering the overall security posture of the open-source software that underpins much of today’s digital infrastructure.

How is the Falco project staying secure?

The Falco project emphasizes its commitment to maintaining vendor independence and the collective effort to bolster its security posture. A foundational pillar of Falco’s philosophy is its vendor-neutral stance, ensuring that the project benefits from a wide array of contributions without being tethered to any single company’s interests. This approach has fostered a diverse and robust community, with significant engineering resources dedicated by several leading companies.

Proving the project’s maturity and reliability, Falco successfully graduated from Cloud Native Computing Foundation (CNCF) incubating status. This achievement was marked by a rigorous due diligence process conducted by the CNCF Technical Oversight Committee (TOC), including a comprehensive third-party security audit. The graduation not only demonstrated Falco’s growth and sustainability, but also solidified its position as a leader in the open-source runtime security ecosystem.


Reflecting its commitment to an inclusive development environment, Falco boasts contributions from 17 organizations actively committing to the project. Notably, approximately 38% of contributions originated from diverse committers affiliated with renowned organizations such as Amazon, Cisco, Chainguard, Clastix, IBM, Microsoft, RedHat, SecureWorks, among others, alongside many individual contributors. This collective effort also demonstrates how Falco’s mission to foster a broad-based and resilient security tool is being realized.

Governance practices further cement Falco’s dedication to vendor neutrality, with specific measures to prevent any single entity from dominating the project’s direction. A key governance rule caps any organization’s eligible votes at 40%, ensuring balanced representation and decision-making within the project community.

Towards a Sustainable Future for OSS

Harvard’s study revelations are a clear call to action to organisations to reflect on the value of OSS in their business, while also highlighting how many of those projects are taking appropriate steps to audit their projects. The paper further highlights the vital role of OSS in driving technological innovation and economic efficiency. 

However, this digital commons, much like its physical counterparts, is vulnerable to overuse and underinvestment – as seen with the XZ Utils backdoor. The findings advocate for a concerted effort to support OSS development, ensuring its sustainability and continued contribution to the global economy.

The “Value of Open Source Software” study shines a spotlight on the hidden economic powerhouse that is OSS. By quantifying its value, the research not only celebrates the contributions of the OSS community but also highlights the critical need for strategic investment and support to secure its future. As we move forward in the digital era, the true value of OSS cannot be overstated—it is an indispensable resource that fuels innovation, drives efficiency, and shapes the technology landscape.


Container Drift Detection with Falco https://sysdig.com/blog/container-drift-detection-with-falco/ Tue, 27 Feb 2024 15:30:00 +0000 https://sysdig.com/?p=85013 DIE is the notion that an immutable workload should not change during runtime; therefore, any observed change is potentially evident...

The post Container Drift Detection with Falco appeared first on Sysdig.

DIE (Distributed, Immutable, Ephemeral) is the notion that an immutable workload should not change during runtime; therefore, any observed change is potential evidence of malicious activity, commonly referred to as drift. Container Drift Detection provides an easy way to prevent attacks at runtime by simply following the security best practice of immutability and ensuring containers aren’t modified after deployment in production.

Getting ahead of drift in container security

According to the Sysdig 2024 Cloud-Native Security & Usage Report, approximately 25% of Kubernetes users receive alerts on drift behavior. On the other hand, only about 4% of teams fully leverage drift control policies by automatically blocking unexpected executions. To prevent drift, you first need to detect it in real time, and that is where Falco’s rich system call collection and analysis comes in. We will highlight how Falco rules can detect drift in real time and provide some practical drift control advice.


Container drift detection when files are open and written

This Falco rule is rather rudimentary, but it still achieves its intended purpose. It looks for the following event types: open, openat, openat2, and creat. This approach works, but it relies on fairly ambiguous kernel signals and requires Falco Engine version 6 or higher. As the maturity_sandbox tag and enabled: false setting below indicate, the rule is disabled by default and sits in the sandbox rules feed of the Falco Rules Maturity Framework.

- rule: Container Drift Detected (open+create)
  desc: Detects new executables created within a container as a result of open+create.
  condition: >
    evt.type in (open,openat,openat2,creat) 
    and evt.rawres>=0
    and evt.is_open_exec=true 
    and container 
    and not runc_writing_exec_fifo 
    and not runc_writing_var_lib_docker 
    and not user_known_container_drift_activities 
  enabled: false
  output: Drift detected (open+create), new executable created in a container (filename=%evt.arg.filename name=%evt.arg.name mode=%evt.arg.mode evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty %container.info)
  priority: ERROR
  tags: [maturity_sandbox, container, process, filesystem, mitre_execution, T1059]

To see which Falco rules are in which status of the Falco Rules Maturity Framework, check out this link.

The maturity_stable tag indicates that a rule has undergone thorough evaluation by experts with hands-on production experience. These practitioners have determined that the rules embody best practices and exhibit optimal robustness, making it more difficult for attackers to bypass Falco detection.

Container Drift Detection through chmod

In Unix and similar operating systems, the chmod command and system call are utilized to modify the access rights and specific mode flags (such as setuid, setgid, and sticky flags) for file system entities, including both files and directories.
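To make the detection concrete, here is a minimal, hypothetical sequence an attacker might run inside a container after dropping a payload; the chmod call sets the S_IXUSR bit that the rule below matches on (file names here are placeholders):

```shell
# Hypothetical attacker workflow: drop a payload, then mark it executable.
payload=$(mktemp /tmp/payload.XXXXXX)
printf '#!/bin/sh\necho pwned\n' > "$payload"

# chmod u+x adds S_IXUSR to the file mode -- one of the signals the
# "Container Drift Detected (chmod)" rule looks for.
chmod u+x "$payload"
ls -l "$payload"

rm -f "$payload"
```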

- rule: Container Drift Detected (chmod)
  desc: Detects when new executables are created in a container as a result of chmod.
  condition: >
    chmod 
    and container 
    and evt.rawres>=0 
    and ((evt.arg.mode contains "S_IXUSR") or
         (evt.arg.mode contains "S_IXGRP") or
         (evt.arg.mode contains "S_IXOTH"))
    and not runc_writing_exec_fifo 
    and not runc_writing_var_lib_docker 
    and not user_known_container_drift_activities 
  enabled: false
  output: Drift detected (chmod), new executable created in a container (filename=%evt.arg.filename name=%evt.arg.name mode=%evt.arg.mode evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty %container.info)
  priority: ERROR
  tags: [maturity_sandbox, container, process, filesystem, mitre_execution, T1059]

While this Falco rule can generate significant noise, chmod usage is frequently linked to dropping and executing malicious implants. The rule is therefore disabled by default and placed within the "Sandbox" rules feed of the maturity matrix; however, it can be fine-tuned to better suit your environment.

The newer rule "Drop and execute new binary in container" provides more precise detection of this TTP using unambiguous kernel signals, and it is the recommended option. The chmod rule may still be useful for auditing in some environments, such as when chmod is used on files within the /tmp folder.

Detect drift when a new binary is dropped and executed

It's ideal to detect when an executable that does not belong to a container's base image is executed. This drop-and-execute pattern is observed very often after an attacker has gained an initial foothold. Note that the proc.is_exe_upper_layer filter field only applies to container runtimes that use OverlayFS as a union mount filesystem.

- rule: Drop and execute new binary in container
  desc: Detects if an executable not belonging to a container base image is executed.
  condition: >
    spawned_process
    and container
    and proc.is_exe_upper_layer=true 
    and not container.image.repository in (known_drop_and_execute_containers)
  output: Executing binary not part of base image (proc_exe=%proc.exe proc_sname=%proc.sname gparent=%proc.aname[2] proc_exe_ino_ctime=%proc.exe_ino.ctime proc_exe_ino_mtime=%proc.exe_ino.mtime proc_exe_ino_ctime_duration_proc_start=%proc.exe_ino.ctime_duration_proc_start proc_cwd=%proc.cwd container_start_ts=%container.start_ts evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty exe_flags=%evt.arg.flags %container.info)
  priority: CRITICAL
  tags: [maturity_stable, container, process, mitre_persistence, TA0003, PCI_DSS_11.5.1]

Adopters can utilize the provided template list known_drop_and_execute_containers containing allowed container images known to execute binaries not included in their base image. Alternatively, you could exclude non-production namespaces in Kubernetes settings by adjusting the rule further. This helps reduce noise by applying application and environment-specific knowledge to this rule. Common anti-patterns include administrators or SREs performing ad-hoc debugging.
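For illustration, tuning that list is just a matter of redefining it in a custom rules file; the image names below are placeholders, not recommendations:

```yaml
# Hypothetical override in a custom rules file (e.g., falco_rules.local.yaml):
# images listed here are allowed to execute binaries not in their base image.
- list: known_drop_and_execute_containers
  items: [docker.io/example/ci-runner, docker.io/example/debug-toolbox]
```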


Enforcing Container Drift Prevention at Runtime

Detecting container drift in real time is critical to reducing the risk of data theft or credential access in running workloads. According to Sysdig's report, activating preventive drift control measures in production should reduce the number of potentially malicious events requiring incident response intervention by approximately 9%. That's where Falco Talon comes to the rescue.


Falco Talon is a response engine for managing threats in your Kubernetes environment. It complements the detection capabilities of Falco with a no-code, tailor-made response layer for Falco rules. With easy-to-configure response actions, you can respond to indicators of compromise in milliseconds.

- action: Terminate Pod
  actionner: kubernetes:terminate
  parameters:
    ignoreDaemonsets: false
    ignoreStatefulsets: true

- rule: Drift Prevention in a Kubernetes Pod
  match:
    rules:
      - Drop and execute new binary in container
  actions:
    - action: Terminate Pod
      parameters:
        gracePeriods: 2

As you can see in the above Talon rule, Drift Prevention in a Kubernetes Pod, we have configured a response actionner for the Falco rule Drop and execute new binary in container. When a user attempts to alter a running container, we can instantly and gracefully terminate the pod, upholding the cloud-native principle of immutability. It's crucial to keep the DIE concept in mind here: if workloads are regularly modified at runtime in ways that don't align with DIE, enabling drift prevention could lead to alert overload or significant system disruption from frequent workload interruptions.


If you do not intend to shut down the workload in response to container drift detections, you could alternatively run a shell script within the container to remove the recently dropped binary, or enforce a Kubernetes network policy to cut off network traffic to a suspected C2 server.
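As a sketch of that network-isolation option, a Talon rule along these lines could fence off the offending pod instead of killing it. Treat the actionner name and its parameters as assumptions to verify against the Falco Talon documentation for your release:

```yaml
# Hypothetical Talon rule: isolate rather than terminate.
# The kubernetes:networkpolicy actionner and its parameters are assumptions --
# check your Falco Talon version's documentation before using.
- action: Isolate Pod
  actionner: kubernetes:networkpolicy
  parameters:
    allow_cidr: []   # deny all egress, severing a suspected C2 channel

- rule: Drift Isolation in a Kubernetes Pod
  match:
    rules:
      - Drop and execute new binary in container
  actions:
    - action: Isolate Pod
```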

Conclusion

Drift control in running containers is not an optional feature, but rather a necessity when we talk about runtime security. When we look back at the DIE philosophy, we need a real-time approach, as seen in Falco, to protect immutable cloud-native workloads in Kubernetes. By leveraging Falco rules to monitor for unauthorized changes, such as file modifications or unexpected binary executions, organizations can detect and automatically mitigate potential security breaches through Falco Talon. This proactive approach to container security, emphasizing immutability and continuous surveillance, not only fortifies defenses against malicious activities but also aligns with best practices for maintaining the integrity and security of modern cloud-native applications.

Moreover, the adaptability of Falco’s rules to specific operational environments, through customization and the application of context-aware filters, enhances their effectiveness while minimizing false positives. This tailored approach ensures that security measures are both stringent and relevant, avoiding unnecessary alerts that could lead to alert fatigue among security teams. The journey towards a secure containerized environment is ongoing and requires vigilance, collaboration, and a commitment to security best practices.

The post Container Drift Detection with Falco appeared first on Sysdig.

]]>
Ephemeral Containers and APTs https://sysdig.com/blog/ephemeral-containers-and-apts/ Mon, 19 Feb 2024 16:00:00 +0000 https://sysdig.com/?p=84522 The Sysdig Threat Research Team (TRT) published their latest Cloud-Native Security & Usage Report for 2024. As always, the research...

The post Ephemeral Containers and APTs appeared first on Sysdig.

]]>
The Sysdig Threat Research Team (TRT) published their latest Cloud-Native Security & Usage Report for 2024. As always, the research team managed to shed additional light on critical vulnerabilities inherent in current container security practices. This blog post delves into the intricate balance between convenience, operational efficiency, and the rising threats of Advanced Persistent Threats (APTs) in the world of ephemeral containers – and what we can do to prevent those threats in milliseconds.

Attackers Have Adapted to Ephemeral Containers

A striking revelation from the Sysdig report is the increasingly transient life of containers. Approximately 70% of containers now have a lifespan of less than five minutes. While this ephemeral nature can be beneficial for resource management, it also presents unique security challenges. Attackers, adapting to these fleeting windows, have honed their methods to conduct swift, automated reconnaissance. The report highlights that a typical cloud attack unfolds within a mere 10 minutes, underscoring the need for real-time response actions.

How to prevent data exfiltration in ephemeral containers

Many organizations have opted to use open-source Falco for real-time threat detection in cloud-native environments. In cases where the adversary opts to use an existing tool such as kubectl cp to copy artifacts from a container’s file system to a remote location via the Kubernetes control plane, Falco can trigger a detection within milliseconds.
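For context, an exfiltration attempt of this kind might look like the following; the namespace, pod, and destination names are hypothetical. Under the hood, kubectl cp spawns a tar process inside the container, which is the signal the rule below keys on:

```shell
# Hypothetical: copy a service account token out of a running pod.
kubectl cp demo-ns/victim-pod:/var/run/secrets/kubernetes.io/serviceaccount/token \
  ./stolen-token
```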

- rule: Exfiltrating Artifacts via Kubernetes Control Plane
  desc: Detect artifacts exfiltration from a container's file system using kubectl cp.
  condition: >
    open_read 
    and container 
    and proc.name=tar 
    and container_entrypoint 
    and proc.tty=0 
    and not system_level_side_effect_artifacts_kubectl_cp
  output: Exfiltrating Artifacts via Kubernetes Control Plane (file=%fd.name evt_type=%evt.type user=%user.name user_uid=%user.uid user_loginuid=%user.loginuid process=%proc.name proc_exepath=%proc.exepath parent=%proc.pname command=%proc.cmdline terminal=%proc.tty)
  priority: NOTICE
  tags: [maturity_incubating, container, filesystem, mitre_exfiltration, TA0010]

This Falco rule can identify potential exfiltration of application secrets from ephemeral containers' file systems, potentially revealing the outcomes of unauthorized access and control plane misuse via stolen identities (such as stolen Kubernetes service account tokens). In cases where an attack can start and complete its goal in less than five minutes, a quick response action is critical. Unfortunately, this Falco rule alone will only notify users of the exfiltration attempt; we need an additional add-on to stop the action entirely.

Preventing Data Exfiltration with Falco Talon

Falco Talon was recently designed as an open-source response engine for isolating threats, specifically in the container orchestration platform Kubernetes. It enhances the detection engine Falco with a no-code solution. Developer operations and security teams can seamlessly author simple Talon rules that respond to existing Falco detections in real time. Notice how the below Talon rule gracefully terminates a workload flagged by the aforementioned "Exfiltrating Artifacts via Kubernetes Control Plane" Falco rule.

- name: Prevent control plane exfiltration
  match:
    rules:
      - "Exfiltrating Artifacts via Kubernetes Control Plane"
  action:
    name: kubernetes:terminate
    parameters:
      ignoreDaemonsets: true
      ignoreStatefulsets: true
      grace_period_seconds: 0


In the above example, the action uses Kubernetes' existing primitives for graceful termination via "kubernetes:terminate". It's important that your application handles termination gracefully so that there is minimal impact on the end user and time-to-recovery is as fast as possible, unlike SIGKILL, which is much more forceful.

In practice, this terminate action means your pod will receive the SIGTERM signal and begin shutting down. This involves saving state, closing down network connections, and finishing any work that is left.
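That shutdown sequence can be sketched with a plain shell trap. This is a minimal illustration of graceful SIGTERM handling, not a production entrypoint:

```shell
#!/bin/sh
# Minimal sketch: a "pod main process" that traps SIGTERM and shuts down cleanly.
worker() {
  trap 'echo "SIGTERM received: saving state, closing connections"; exit 0' TERM
  while :; do sleep 0.2; done   # simulate ongoing work
}

worker &            # start the simulated pod process
pid=$!
sleep 1
kill -TERM "$pid"   # what the kubelet sends when the pod is terminated
wait "$pid"
echo "worker exit code: $?"
```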

In Falco Talon, the "grace_period_seconds" parameter specifies the duration in seconds before the pod is deleted; a value of zero means delete immediately. Configured this way, the attacker is instantly kicked out of the session and therefore unable to exfiltrate data.


The Threat of Quick and Agile Attackers

The agility of attackers in the cloud should not be underestimated. Once they gain access, they rapidly acquire an understanding of the environment, poised to advance their malicious objectives. This rapid adaptation means that even short-lived, vulnerable workloads can expose organizations to significant risks. Traditional security models, which rely on longer response times, are proving inadequate against these fast-paced threats.

Conclusion

The insights from the Sysdig report unequivocally call for a strategic reevaluation of security approaches in Kubernetes environments. In response to the challenges posed by limited visibility and the need for effective security controls in ephemeral containers and workloads, projects like the Cloud Native Computing Foundation’s (CNCF) Falco, and its latest open-source companion Falco Talon, have emerged as vital tools. Designed to tackle the intricacies of short-lived (less than 5 minutes) containers, these solutions offer real-time security monitoring and continuous scanning, transitioning from recommended practices to essential components in a Kubernetes security arsenal.

Organizations must find a balance between leveraging the convenience of cloud-native technologies and enforcing stringent security protocols. As attackers increasingly exploit the ephemeral nature of containers, the organizational response must be both dynamic and proactive. Tools like Falco and Falco Talon exemplify the kind of responsive, advanced security measures necessary to navigate this landscape. They provide the much-needed visibility and control to detect and respond to threats in real-time, thereby enhancing the security posture in these fast-paced environments.

Ensuring robust cybersecurity in the face of sophisticated threats is undoubtedly challenging, but with the right tools and strategies, it is within reach. The integration of solutions like Falco and Falco Talon into Kubernetes environments is key to safeguarding against today’s advanced threats, ensuring a secure, efficient, and resilient cloud-native ecosystem for tomorrow.

The post Ephemeral Containers and APTs appeared first on Sysdig.

]]>
Resource Constraints in Kubernetes and Security https://sysdig.com/blog/resource-constraints-in-kubernetes-and-security/ Mon, 12 Feb 2024 15:15:00 +0000 https://sysdig.com/?p=84236 The Sysdig 2024 Cloud‑Native Security and Usage Report highlights the evolving threat landscape, but more importantly, as the adoption of...

The post Resource Constraints in Kubernetes and Security appeared first on Sysdig.

]]>
The Sysdig 2024 Cloud-Native Security and Usage Report highlights the evolving threat landscape, but more importantly, as the adoption of cloud-native technologies such as containers and Kubernetes continues to increase, not all organizations are following best practices. This ultimately hands attackers an advantage when it comes to exploiting container resources in orchestration platforms such as Kubernetes.

Balancing resource management with security is not just a technical challenge, but also a strategic imperative. Surprisingly, Sysdig's latest research report found that fewer than half of Kubernetes environments have alerts for CPU and memory usage, and the majority lack maximum limits on these resources. This trend isn't just an overlooked security practice; it reflects prioritizing availability and development agility over potential security risks.

The security risks of unchecked resources

Unlimited resource allocation in Kubernetes pods presents a golden opportunity for attackers. Without constraints, malicious entities can exploit your environment, launching cryptojacking attacks or initiating lateral movements to target other systems within your network. The absence of resource limits not only escalates security risks but can also lead to substantial financial losses due to unchecked resource consumption by these attackers.


A cost-effective security strategy

In the current economic landscape, where every penny counts, understanding and managing resource usage is as much a financial strategy as it is a security one. By identifying and reducing unnecessary resource consumption, organizations can achieve significant cost savings – a crucial aspect in both cloud and container environments.

Enforcing resource constraints in Kubernetes

Implementing resource constraints in Kubernetes is straightforward yet impactful. To apply resource constraints to an example atomicred tool deployment in Kubernetes, users can simply modify their deployment manifest to include resource requests and limits.

Here’s how the Kubernetes project recommends enforcing those changes:

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: atomicred
  namespace: atomic-red
  labels:
    app: atomicred
spec:
  replicas: 1
  selector:
    matchLabels:
      app: atomicred
  template:
    metadata:
      labels:
        app: atomicred
    spec:
      containers:
      - name: atomicred
        image: issif/atomic-red:latest
        imagePullPolicy: "IfNotPresent"
        command: ["sleep", "3560d"]
        securityContext:
          privileged: true
        resources:
          requests:
            memory: "64Mi"
            cpu: "250m"
          limits:
            memory: "128Mi"
            cpu: "500m"
      nodeSelector:
        kubernetes.io/os: linux
EOF

In this manifest, we set both requests and limits for CPU and memory as follows:

  • requests: Amount of CPU and memory that Kubernetes will guarantee for the container. In this case, 64Mi of memory and 250m CPU (where 1000m equals 1 CPU core).
  • limits: The maximum amount of CPU and memory the container is allowed to use.
    If the container tries to exceed these limits, it will be throttled (CPU) or killed and possibly restarted (memory). Here, it’s set to 128Mi of memory and 500m CPU.

This setup ensures that the atomicred tool is allocated enough resources to function efficiently while preventing it from consuming excessive resources that could impact other processes in your Kubernetes cluster. Those request constraints guarantee that the container gets at least the specified resources, while limits ensure it never goes beyond the defined ceiling. This setup not only optimizes resource utilization but also guards against resource depletion attacks.

Monitoring resource constraints in Kubernetes

To check the resource constraints of a running pod in Kubernetes, use the kubectl describe command. The command provided will automatically describe the first pod in the atomic-red namespace with the label app=atomicred.

kubectl describe pod -n atomic-red $(kubectl get pods -n atomic-red -l app=atomicred -o jsonpath="{.items[0].metadata.name}")

What happens if we abuse these limits?

To test CPU and memory limits, you can run a container that deliberately tries to consume more resources than allowed by its limits. However, this can be a bit complex:

  • CPU: If a container attempts to use more CPU resources than its limit, Kubernetes will throttle the CPU usage of the container. This means the container won’t be terminated but will run slower.
  • Memory: If a container tries to use more memory than its limit, it will be terminated by Kubernetes once it exceeds the limit. This is known as an Out Of Memory (OOM) kill.

Creating a stress test container

You can create a new deployment that intentionally stresses the resources.
For example, you can use a tool like stress to consume CPU and memory deliberately:

kubectl apply -f - <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: resource-stress-test
  namespace: atomic-red
spec:
  replicas: 1
  selector:
    matchLabels:
      app: resource-stress-test
  template:
    metadata:
      labels:
        app: resource-stress-test
    spec:
      containers:
      - name: stress
        image: polinux/stress
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
        command: ["stress"]
        args: ["--vm", "1", "--vm-bytes", "150M", "--vm-hang", "1"]
EOF

The deployment specification defines a single container using the polinux/stress image, which is commonly used for generating load in stress testing. Under the resources section, we define the container's limits: the stress command will attempt to allocate 150M of memory, but the maximum memory threshold is fixed at a 128Mi limit.

The command run inside the container tells stress to allocate a 150 MB virtual memory workload and hang for one second, a common way to perform stress testing with this image.

As you can see from the below screenshot, the OOMKilled output appears, meaning the container was killed for exceeding its memory limit. If an attacker were running a cryptomining binary within the pod at the time of the OOMKilled action, they would be kicked out, and the pod would be recreated in its original state, effectively removing any instance of the mining binary.
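You can also confirm the OOM kill from the command line: the container's last termination reason is recorded in the pod status. This assumes the resource-stress-test deployment above is running in the atomic-red namespace:

```shell
# Read the last termination reason of the stress container (expects "OOMKilled").
kubectl get pods -n atomic-red -l app=resource-stress-test \
  -o jsonpath='{.items[0].status.containerStatuses[0].lastState.terminated.reason}'
```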

Alerting on pods deployed without resource constraints

You might be wondering whether you have to describe every pod to ensure it has proper resource constraints in place. While you could do that, it's not exactly scalable. You could, of course, ingest Kubernetes audit log data into Prometheus and report on it accordingly. Alternatively, if you already have Falco installed in your Kubernetes cluster, you could apply the below Falco rule to detect instances where a pod is successfully deployed without resource constraints.

- rule: Create Pod Without Resource Limits
  desc: Detect pod created without defined CPU and memory limits
  condition: kevt and pod and kcreate 
             and not ka.target.subresource in (resourcelimits)
  output: Pod started without CPU or memory limits (user=%ka.user.name pod=%ka.resp.name resource=%ka.target.resource ns=%ka.target.namespace images=%ka.req.pod.containers.image)
  priority: WARNING
  source: k8s_audit
  tags: [k8s, resource_limits]

- list: resourcelimits
  items: ["limits"]

Please note: depending on how your Kubernetes workloads are set up, this rule might generate some false positive detections for legitimate pods that are intentionally deployed without resource limits. In those cases, you may need to fine-tune the rule or implement exceptions to minimize false positives. Even so, implementing such a rule can significantly enhance your monitoring capabilities, ensuring that best practices for resource allocation in Kubernetes are adhered to.
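One way to cut that noise is to exclude namespaces where unconstrained pods are expected. The sketch below uses Falco's exceptions mechanism; the namespace names are placeholders, and the append-override syntax should be verified against your Falco version's rules documentation:

```yaml
# Hypothetical tuning: ignore selected namespaces for this rule.
# Verify the exceptions/override syntax against your Falco release.
- rule: Create Pod Without Resource Limits
  exceptions:
    - name: allowed_namespaces
      fields: ka.target.namespace
      comps: in
      values: [kube-system, dev-sandbox]
  override:
    exceptions: append
```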

Sysdig’s commitment to open source

The lack of enforced resource constraints in Kubernetes in numerous organizations underscores a critical gap in current security frameworks, highlighting the urgent need for increased awareness. In response, we contributed our findings to the OWASP Top 10 framework for Kubernetes, addressing what was undeniably an example of insecure workload configuration. Our contribution, recognized for its value, was duly incorporated into the framework. Leveraging the inherently open source nature of the OWASP framework, we submitted a Pull Request (PR) on GitHub, proposing this novel enhancement. This act of contributing to established security awareness frameworks not only bolsters cloud-native security but also enhances its transparency, marking a pivotal step towards a more secure and aware cloud-native ecosystem.

Bridging Security and Scalability

The perceived complexity of maintaining, monitoring, and modifying resource constraints can often deter organizations from implementing these critical security measures. Given the dynamic nature of development environments, where application needs can fluctuate based on demand, feature rollouts, and scalability requirements, it’s understandable why teams might view resource limits as a potential barrier to agility. However, this perspective overlooks the inherent flexibility of Kubernetes’ resource management capabilities, and more importantly, the critical role of cross-functional communication in optimizing these settings for both security and performance.

The art of flexible constraints

Kubernetes offers a sophisticated model for managing resource constraints that does not inherently stifle application growth or operational flexibility. Through the use of requests and limits, Kubernetes allows for the specification of minimum resources guaranteed for a container (requests) and a maximum cap (limits) that a container cannot exceed. This model provides a framework within which applications can operate efficiently, scaling within predefined bounds that ensure security without compromising on performance.

The key to leveraging this model effectively lies in adopting a continuous evaluation and adjustment approach. Regularly reviewing resource utilization metrics can provide valuable insights into how applications are performing against their allocated resources, identifying opportunities to adjust constraints to better align with actual needs. This iterative process ensures that resource limits remain relevant, supportive of application demands, and protective against security vulnerabilities.
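In practice, that review loop can start as simply as comparing live usage against the configured limits; with the metrics-server addon installed, for example:

```shell
# Requires the metrics-server addon; shows current CPU/memory per container,
# which you can compare against the requests and limits in your manifests.
kubectl top pods -n atomic-red --containers
```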

Fostering open communication lines

At the core of successfully implementing flexible resource constraints is the collaboration between development, operations, and security teams. Open lines of communication are essential for understanding application requirements, sharing insights on potential security implications of configuration changes, and making informed decisions on resource allocation.

Encouraging a culture of transparency and collaboration can demystify the process of adjusting resource limits, making it a routine part of the development lifecycle rather than a daunting task. Regular cross-functional meetings, shared dashboards of resource utilization and performance metrics, and a unified approach to incident response can foster a more integrated team dynamic. 

Simplifying maintenance, monitoring, and modification

With the right tools and practices in place, resource management can be streamlined and integrated into the existing development workflow. Automation tools can simplify the deployment and update of resource constraints, while monitoring solutions can provide real-time visibility into resource utilization and performance.

Training and empowerment, coupled with clear guidelines and easy-to-use tools, can make adjusting resource constraints a straightforward task that supports both security posture and operational agility.

Conclusion

Setting resource limits in Kubernetes transcends being a mere security measure; it’s a pivotal strategy that harmoniously balances operational efficiency with robust security. This practice gains even more significance in the light of evolving cloud-native threats, particularly cryptomining attacks, which are increasingly becoming a preferred method for attackers due to their low-effort, high-reward nature.

Reflecting on the 2022 Cloud-Native Threat Report, we observe a noteworthy trend. The Sysdig Threat Research team profiled TeamTNT, a notorious cloud-native threat actor known for targeting both cloud and container environments, predominantly for crypto-mining purposes. Their research underlines a startling economic imbalance: cryptojacking costs victims an astonishing $53 for every $1 an attacker earns from stolen resources. This disparity highlights the financial implications of such attacks, beyond the apparent security breaches.

TeamTNT’s approach reiterates why attackers are choosing to exploit environments where container resource limits are undefined or unmonitored. The lack of constraints or oversight of resource usage in containers creates an open field for attackers to deploy cryptojacking malware, leveraging the unmonitored resources for financial gain at the expense of its victim.

In light of these insights, it becomes evident that the implementation of resource constraints in Kubernetes and the monitoring of resource usage in Kubernetes are not just best practices for security and operational efficiency; they are essential defenses against a growing trend of financially draining cryptomining attacks. As Kubernetes continues to evolve, the importance of these practices only escalates. Organizations must proactively adapt by setting appropriate resource limits and establishing vigilant monitoring systems, ensuring a secure, efficient, and financially sound environment in the face of such insidious threats.

The post Resource Constraints in Kubernetes and Security appeared first on Sysdig.

]]>