Kubernetes & Container Security | Sysdig

Kubernetes 1.31 – What’s new?
https://sysdig.com/blog/whats-new-kubernetes-1-31/ | Fri, 26 Jul 2024

Kubernetes 1.31 is nearly here, and it’s full of exciting major changes to the project! So, what’s new in this upcoming release?

Kubernetes 1.31 brings a plethora of enhancements, including 37 line items tracked as ‘Graduating’ in this release. From these, 11 enhancements are graduating to stable, including the highly anticipated AppArmor support for Kubernetes, which includes the ability to specify an AppArmor profile for a container or pod in the API, and have that profile applied by the container runtime. 

34 new alpha features are also making their debut, with a lot of eyes on the initial design to support pod-level resource limits. Security teams will be particularly interested in tracking the progress on this one.

Watch out for major changes such as the improved ingress connectivity reliability for kube-proxy, which now supports connection draining on terminating Nodes for load balancers that support it.

Further enhancing security, Pod-level resource limits move from Net New to Alpha, offering a capability similar to existing container-level resource constraints while balancing operational efficiency with robust security.

There are also numerous quality-of-life improvements that continue the trend of making Kubernetes more user-friendly and efficient, such as a randomized algorithm for Pod selection when downscaling ReplicaSets.

We are buzzing with excitement for this release! There’s plenty to unpack here, so let’s dive deeper into what Kubernetes 1.31 has to offer.

Editor’s pick:

These are some of the changes that look most exciting to us in this release:

#2395 Removing In-Tree Cloud Provider Code

Probably the most exciting advancement in v1.31 is the removal of all in-tree integrations with cloud providers. Since v1.26 there has been a large push to help Kubernetes truly become a vendor-neutral platform. This externalization process removes all cloud-provider-specific code from the k8s.io/kubernetes repository with minimal disruption to end users and developers.

Nigel Douglas, Sr. Open Source Security Researcher

#2644 Always Honor PersistentVolume Reclaim Policy

I like this enhancement a lot as it finally allows users to honor the PersistentVolume Reclaim Policy through a deletion protection finalizer. HonorPVReclaimPolicy is now enabled by default. Finalizers can be added on a PersistentVolume to ensure that PersistentVolumes with a Delete reclaim policy are deleted only after the backing storage is deleted.


The newly introduced finalizers kubernetes.io/pv-controller and external-provisioner.volume.kubernetes.io/finalizer are only added to dynamically provisioned volumes within your environment.

Pietro Piutti, Sr. Technical Marketing Manager

#4292 Custom profile in kubectl debug


I’m delighted to see that they have finally introduced a new custom profile option for the kubectl debug command. This feature addresses a challenge teams regularly face when debugging applications built on shell-less base images. By allowing the mounting of data volumes and other resources within the debug container, this enhancement provides a significant security benefit for most organizations, encouraging the adoption of more secure, shell-less base images without sacrificing debugging capabilities.

Thomas Labarussias, Sr. Developer Advocate & CNCF Ambassador


Apps in Kubernetes 1.31

#3017 PodHealthyPolicy for PodDisruptionBudget

Stage: Graduating to Stable
Feature group: sig-apps

Kubernetes 1.31 introduces the PodHealthyPolicy for PodDisruptionBudget (PDB). PDBs currently serve two purposes: ensuring a minimum number of pods remain available during disruptions and preventing data loss by blocking pod evictions until data is replicated.

The current implementation has issues. Pods that are Running but not Healthy (Ready) may not be evicted even if their number exceeds the PDB threshold, hindering tools like cluster-autoscaler. Additionally, using PDBs to prevent data loss is considered unsafe and not their intended use.

Despite these issues, many users rely on PDBs for both purposes. Therefore, changing the PDB behavior without supporting both use-cases is not viable, especially since Kubernetes lacks alternative solutions for preventing data loss.
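
To make the behavior concrete, here is a minimal sketch of a PodDisruptionBudget using the field this KEP covers, as it exists in the policy/v1 API; the name and label are hypothetical. AlwaysAllow lets pods that are Running but not yet Ready be evicted even when the budget would otherwise block them, while the default IfHealthyBudget preserves the old behavior.

```yaml
# Illustrative PDB: AlwaysAllow tells the eviction API that pods which are
# Running but not yet Ready never block eviction, unblocking node drains.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb            # hypothetical name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app             # hypothetical label
  unhealthyPodEvictionPolicy: AlwaysAllow
```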

#3335 Allow StatefulSet to control start replica ordinal numbering

Stage: Graduating to Stable
Feature group: sig-apps

The goal of this feature is to enable the migration of a StatefulSet across namespaces, clusters, or in segments without disrupting the application. Traditional methods like backup and restore cause downtime, while pod-level migration requires manual rescheduling. Migrating a StatefulSet in slices allows for a gradual and less disruptive migration process by moving only a subset of replicas at a time.
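
A minimal sketch of what a migrated slice could look like, assuming a hypothetical web StatefulSet: spec.ordinals.start shifts the starting replica number so the pods created here (web-5 through web-9) do not collide with replicas 0 through 4 still running in the source cluster or namespace.

```yaml
# Illustrative StatefulSet slice that numbers its replicas starting at 5.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web                   # hypothetical name
spec:
  serviceName: web
  replicas: 5
  ordinals:
    start: 5                  # pods get ordinals 5..9 instead of 0..4
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: registry.k8s.io/nginx-slim:0.8   # example image
```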

#3998 Job Success/completion policy

Stage: Graduating to Beta
Feature group: sig-apps

We are excited about the improvement to the Job API, which now allows setting conditions under which an Indexed Job can be declared successful. This is particularly useful for batch workloads like MPI and PyTorch that need to consider only leader indexes for job success. Previously, an indexed job was marked as completed only if all indexes succeeded. Some third-party frameworks, like Kubeflow Training Operator and Flux Operator, have implemented similar success policies. This improvement will enable users to mark jobs as successful based on a declared policy, terminating lingering pods once the job meets the success criteria.
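
A minimal sketch of the idea, assuming a hypothetical leader/worker training job and the beta JobSuccessPolicy field names: the job is declared successful as soon as index 0 (the leader) succeeds, and any lingering worker pods are then terminated.

```yaml
# Illustrative Indexed Job that succeeds once the leader index completes.
apiVersion: batch/v1
kind: Job
metadata:
  name: leader-worker-job     # hypothetical name
spec:
  completionMode: Indexed
  completions: 4
  parallelism: 4
  successPolicy:
    rules:
      - succeededIndexes: "0" # success is decided by the leader index only
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: worker
          image: registry.example.com/trainer:1.0   # hypothetical image
```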

CLI in Kubernetes 1.31

#4006 Transition from SPDY to WebSockets

Stage: Graduating to Beta
Feature group: sig-cli

This enhancement proposes adding a WebSocketExecutor to the kubectl CLI tool, using a new subprotocol version (v5.channel.k8s.io), and creating a FallbackExecutor to handle client/server version discrepancies. The FallbackExecutor first attempts to connect using the WebSocketExecutor, then falls back to the legacy SPDYExecutor if unsuccessful, potentially requiring two request/response trips. Despite the extra roundtrip, this approach is justified because modifying the low-level SPDY and WebSocket libraries for a single handshake would be overly complex, and the additional IO load is minimal in the context of streaming operations. Additionally, as releases progress, the likelihood of a WebSocket-enabled kubectl interacting with an older, non-WebSocket API Server decreases.

#4706 Deprecate and remove kustomize from kubectl

Stage: Net New to Alpha
Feature group: sig-cli

The update was deferred from the Kubernetes 1.31 release. Kustomize was initially integrated into kubectl to enhance declarative support for Kubernetes objects. However, with the development of various customization and templating tools over the years, kubectl maintainers now believe that promoting one tool over others is not appropriate. Decoupling Kustomize from kubectl will allow each project to evolve at its own pace, avoiding issues with mismatched release cycles that can lead to kubectl users working with outdated versions of Kustomize. Additionally, removing Kustomize will reduce the dependency graph and the size of the kubectl binary, addressing some dependency issues that have affected the core Kubernetes project.

#3104 Separate kubectl user preferences from cluster configs

Stage: Net New to Alpha
Feature group: sig-cli

Kubectl, one of the earliest components of the Kubernetes project, upholds a strong commitment to backward compatibility. The maintainers aim to let users opt into new features (like delete confirmation) that might otherwise disrupt existing CI jobs and scripts. Although kubeconfig has an underutilized field for preferences, it isn’t ideal for this purpose. New clusters usually generate a new kubeconfig file with credentials and host details, and while these files can be merged or specified by path, server configuration and user preferences should be distinctly separated.

To address these needs, the Kubernetes maintainers proposed introducing a kuberc file for client preferences. This file will be versioned and structured to easily incorporate new behaviors and settings for users. It will also allow users to define kubectl command aliases and default flags. With this change, the maintainers plan to deprecate the kubeconfig Preferences field. This separation ensures users can manage their preferences consistently, regardless of the --kubeconfig flag or $KUBECONFIG environment variable.

Kubernetes 1.31 instrumentation

#2305 Metric cardinality enforcement

Stage: Graduating to Stable
Feature group: sig-instrumentation

Metrics turning into memory leaks pose significant issues, especially when they require re-releasing the entire Kubernetes binary to fix. Historically, we’ve tackled these issues inconsistently. For instance, coding mistakes sometimes cause unintended IDs to be used as metric label values. 

In other cases, we’ve had to delete metrics entirely due to their incorrect use. More recently, we’ve either removed metric labels or retroactively defined acceptable values for them. Fixing these issues is a manual, labor-intensive, and time-consuming process without a standardized solution.

This stable update should address these problems by enabling metric dimensions to be bound to known sets of values independently of Kubernetes code releases.

Network in Kubernetes 1.31

#3836 Ingress Connectivity Reliability Improvement for Kube-Proxy

Stage: Graduating to Stable
Feature group: sig-network

This enhancement finally introduces a more reliable mechanism for handling ingress connectivity for endpoints on terminating nodes and nodes with unhealthy Kube-proxies, focusing on eTP:Cluster services. Currently, Kube-proxy’s response is based on its healthz state for eTP:Cluster services and the presence of a Ready endpoint for eTP:Local services. This KEP addresses the former.

The proposed changes are:

  1. Connection Draining for Terminating Nodes:
    Kube-proxy will use the ToBeDeletedByClusterAutoscaler taint to identify terminating nodes and fail its healthz check to signal load balancers for connection draining. Other signals like .spec.unschedulable were considered but deemed less direct.
  2. Addition of /livez Path:
    Kube-proxy will add a /livez endpoint to its health check server to reflect the old healthz semantics, indicating whether data-plane programming is stale.
  3. Cloud Provider Health Checks:
    While not aligning cloud provider health checks for eTP:Cluster services, the KEP suggests creating a document on Kubernetes’ official site to guide and share knowledge with cloud providers for better health checking practices.

#4444 Traffic Distribution to Services

Stage: Graduating to Beta
Feature group: sig-network

To enhance traffic routing in Kubernetes, this KEP proposes adding a new field, trafficDistribution, to the Service specification. This field allows users to specify routing preferences, offering more control and flexibility than the earlier topologyKeys mechanism. trafficDistribution will provide a hint for the underlying implementation to consider in routing decisions without offering strict guarantees.

The new field will support values like PreferClose, indicating a preference for routing traffic to topologically proximate endpoints. The absence of a value indicates no specific routing preference, leaving the decision to the implementation. This change aims to provide enhanced user control, standard routing preferences, flexibility, and extensibility for innovative routing strategies.
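
For illustration, this is what the hint looks like on a Service; the name, selector, and ports are hypothetical, and PreferClose is only a preference that the implementation may honor, not a guarantee.

```yaml
# Illustrative Service asking the dataplane to prefer topologically close endpoints.
apiVersion: v1
kind: Service
metadata:
  name: my-service            # hypothetical name
spec:
  selector:
    app: my-app               # hypothetical label
  ports:
    - port: 80
      targetPort: 8080
  trafficDistribution: PreferClose   # routing hint, not a strict guarantee
```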

#1880 Multiple Service CIDRs

Stage: Graduating to Beta
Feature group: sig-network

This proposal introduces a new allocator logic using two new API objects: ServiceCIDR and IPAddress, allowing users to dynamically increase available Service IPs by creating new ServiceCIDRs. The allocator will automatically consume IPs from any available ServiceCIDR, similar to adding more disks to a storage system to increase capacity.

To maintain simplicity, backward compatibility, and avoid conflicts with other APIs like Gateway APIs, several constraints are added:

  • ServiceCIDR is immutable after creation.
  • ServiceCIDR can only be deleted if no Service IPs are associated with it.
  • Overlapping ServiceCIDRs are allowed.
  • The API server ensures a default ServiceCIDR exists to cover service CIDR flags and the “kubernetes.default” Service.
  • All IPAddresses must belong to a defined ServiceCIDR.
  • Every Service with a ClusterIP must have an associated IPAddress object.
  • A ServiceCIDR being deleted cannot allocate new IPs.

This creates a one-to-one relationship between Service and IPAddress, and a one-to-many relationship between ServiceCIDR and IPAddress. Overlapping ServiceCIDRs are merged in memory, with IPAddresses coming from any ServiceCIDR that includes that IP. The new allocator logic can also be used by other APIs, such as the Gateway API, enabling future administrative and cluster-wide operations on Service ranges.
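
As a sketch, adding Service IP capacity then becomes a matter of creating another ServiceCIDR object; the name and range below are hypothetical, and the API group/version shown reflects the beta API and may differ in your cluster.

```yaml
# Illustrative extra ServiceCIDR that expands the pool of available ClusterIPs.
apiVersion: networking.k8s.io/v1beta1
kind: ServiceCIDR
metadata:
  name: extra-service-cidr    # hypothetical name
spec:
  cidrs:
    - 10.97.0.0/24            # hypothetical additional range for Service IPs
```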

Kubernetes 1.31 nodes

#2400 Node Memory Swap Support

Stage: Graduating to Stable
Feature group: sig-node

This enhancement integrates swap memory support into Kubernetes, addressing two key user groups: node administrators for performance tuning and app developers requiring swap for their apps.

The focus was to facilitate controlled swap use on a node level, with the kubelet enabling Kubernetes workloads to utilize swap space under specific configurations. The ultimate goal is to enhance Linux node operation with swap, allowing administrators to determine swap usage for workloads, initially not permitting individual workloads to set their own swap limits.
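
At the node level this is driven by kubelet configuration. Here is a minimal sketch: failSwapOn must be disabled for the kubelet to start on a node with swap enabled, and LimitedSwap restricts how much swap workloads may use.

```yaml
# Illustrative kubelet configuration fragment enabling limited swap for workloads.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false             # let the kubelet start on a node that has swap enabled
memorySwap:
  swapBehavior: LimitedSwap   # workloads may swap only up to a proportional limit
```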

#4569 Move cgroup v1 support into maintenance mode

Stage: Net New to Stable
Feature group: sig-node

The proposal aims to transition Kubernetes’ cgroup v1 support into maintenance mode while encouraging users to adopt cgroup v2. Although cgroup v1 support won’t be removed immediately, its deprecation and eventual removal will be addressed in a future KEP. The Linux kernel community and major distributions are focusing on cgroup v2 due to its enhanced functionality, consistent interface, and improved scalability. Consequently, Kubernetes must align with this shift to stay compatible and benefit from cgroup v2’s advancements.

To support this transition, the proposal includes several goals. First, cgroup v1 will receive no new features, marking its functionality as complete and stable. End-to-end testing will be maintained to ensure the continued validation of existing features. The Kubernetes community may provide security fixes for critical CVEs related to cgroup v1 as long as the release is supported. Major bugs will be evaluated and fixed if feasible, although some issues may remain unresolved due to dependency constraints.

Migration support will be offered to help users transition from cgroup v1 to v2. Additionally, efforts will be made to enhance cgroup v2 support by addressing all known bugs, ensuring it is reliable and functional enough to encourage users to switch. This proposal reflects the broader ecosystem’s movement towards cgroup v2, highlighting the necessity for Kubernetes to adapt accordingly.

#24 AppArmor Support

Stage: Graduating to Stable
Feature group: sig-node

Adding AppArmor support to Kubernetes marks a significant enhancement in the security posture of containerized workloads. AppArmor is a Linux kernel module that allows system admins to restrict certain capabilities of a program using profiles attached to specific applications or containers. By integrating AppArmor into Kubernetes, developers can now define security policies directly within an app config.

The initial implementation of this feature would allow for specifying an AppArmor profile within the Kubernetes API for individual containers or entire pods. This profile, once defined, would be enforced by the container runtime, ensuring that the container’s actions are restricted according to the rules defined in the profile. This capability is crucial for running secure and confined applications in a multi-tenant environment, where a compromised container could potentially affect other workloads or the underlying host.
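
With the fields now in the securityContext API, a profile can be declared directly on the pod or per container. A minimal sketch follows; the pod name, image, and local profile name are hypothetical, and a Localhost profile must already be loaded on the node.

```yaml
# Illustrative Pod: runtime default AppArmor profile pod-wide, with one container
# pinned to a stricter profile preloaded on the node.
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-demo         # hypothetical name
spec:
  securityContext:
    appArmorProfile:
      type: RuntimeDefault    # applies to all containers unless overridden
  containers:
    - name: app
      image: registry.example.com/app:1.0   # hypothetical image
      securityContext:
        appArmorProfile:
          type: Localhost
          localhostProfile: k8s-deny-write  # hypothetical profile loaded on the node
```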

Scheduling in Kubernetes

#3633 Introduce MatchLabelKeys and MismatchLabelKeys to PodAffinity and PodAntiAffinity

Stage: Graduating to Beta
Feature group: sig-scheduling

This was tracked for code freeze as of July 23rd. This enhancement finally introduces MatchLabelKeys (and MismatchLabelKeys) for PodAffinityTerm to refine PodAffinity and PodAntiAffinity, enabling more precise control over Pod placements during scenarios like rolling upgrades. 

By allowing users to specify the scope for evaluating Pod co-existence, it addresses scheduling challenges that arise when new and old Pod versions are present simultaneously, particularly in saturated or idle clusters. This enhancement aims to improve scheduling effectiveness and cluster resource utilization.
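
A minimal sketch of how this is typically used with a Deployment (names and image are hypothetical): adding pod-template-hash to matchLabelKeys scopes the anti-affinity term to pods from the same ReplicaSet revision, so old and new pods in a rolling upgrade do not block each other.

```yaml
# Illustrative Deployment whose anti-affinity only considers pods of the same revision.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - topologyKey: kubernetes.io/hostname
              labelSelector:
                matchLabels:
                  app: my-app
              matchLabelKeys:
                - pod-template-hash   # evaluate the term per rollout revision
      containers:
        - name: app
          image: registry.example.com/app:1.0   # hypothetical image
```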

Kubernetes storage

#3762 PersistentVolume last phase transition time

Stage: Graduating to Stable
Feature group: sig-storage

The Kubernetes maintainers plan to update the API server to support a new timestamp field for PersistentVolumes, which will record when a volume transitions to a different phase. This field will be set to the current time for all newly created volumes and those changing phases. While this timestamp is intended solely as a convenience for cluster administrators, it will enable them to list and sort PersistentVolumes based on the transition times, aiding in manual cleanup and management.

This change addresses issues experienced by users with the Delete reclaim policy, which led to data loss, prompting many to revert to the safer Retain policy. With the Retain policy, unclaimed volumes are marked as Released, and over time, these volumes accumulate. The timestamp field will help admins identify when volumes last transitioned to the Released phase, facilitating easier cleanup. 

Moreover, the generic recording of timestamps for all phase transitions will provide valuable metrics and insights, such as measuring the time between Pending and Bound phases. The goals are to introduce this timestamp field and update it with every phase transition, without implementing any volume health monitoring or additional actions based on the timestamps.
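
For illustration, this is roughly what the new field looks like on a Released volume as reported by the API server; the timestamp lives in status and is maintained by the control plane, not set by users (the volume name and backing store below are hypothetical).

```yaml
# Illustrative PersistentVolume excerpt showing the phase transition timestamp.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data-01            # hypothetical name
spec:
  capacity:
    storage: 10Gi
  accessModes: [ReadWriteOnce]
  persistentVolumeReclaimPolicy: Retain
  hostPath:
    path: /mnt/data           # hypothetical backing store
status:
  phase: Released
  lastPhaseTransitionTime: "2024-07-20T10:15:00Z"   # set by the control plane
```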

#3751 Kubernetes VolumeAttributesClass ModifyVolume

Stage: Graduating to Beta
Feature group: sig-storage

The proposal introduces a new Kubernetes API resource, VolumeAttributesClass, along with an admission controller and a volume attributes protection controller. This resource will allow users to manage volume attributes, such as IOPS and throughput, independently from capacity. The current immutability of StorageClass.parameters necessitates this new resource, as it permits updates to volume attributes without directly using cloud provider APIs, simplifying storage resource management.

VolumeAttributesClass will enable specifying and modifying volume attributes both at creation and for existing volumes, ensuring changes are non-disruptive to workloads. Conflicts between StorageClass.parameters and VolumeAttributesClass.parameters will result in errors from the driver. 

The primary goals include providing a cloud-provider-independent specification for volume attributes, enforcing these attributes through the storage provider, and allowing workload developers to modify them non-disruptively. The proposal does not address OS-level IO attributes, inter-pod volume attributes, or scheduling based on node-specific volume attribute limits, though these may be considered for future extensions.
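
A minimal sketch of the resource and a claim that references it; the class name, driver, and parameter keys are hypothetical and driver-specific, and the API group/version shown reflects the beta API and may differ in your cluster.

```yaml
# Illustrative VolumeAttributesClass and a PVC that selects it.
apiVersion: storage.k8s.io/v1beta1
kind: VolumeAttributesClass
metadata:
  name: gold                  # hypothetical performance tier
driverName: csi.example.com   # hypothetical CSI driver
parameters:                   # driver-specific keys
  iops: "8000"
  throughput: "500"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim            # hypothetical name
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 100Gi
  volumeAttributesClassName: gold   # can be changed later to modify IOPS/throughput
```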

#3314 CSI Differential Snapshot for Block Volumes

Stage: Net New to Alpha
Feature group: sig-storage

This enhancement was removed from the Kubernetes 1.31 milestone. It aims to enhance the CSI specification by introducing a new optional CSI SnapshotMetadata gRPC service. This service allows Kubernetes to retrieve metadata on allocated blocks of a single snapshot, or the changed blocks between snapshots of the same block volume. Implemented by the community-provided external-snapshot-metadata sidecar, this service must be deployed by a CSI driver. Kubernetes backup applications can access snapshot metadata through a secure TLS gRPC connection, which minimizes load on the Kubernetes API server.

The external-snapshot-metadata sidecar communicates with the CSI driver’s SnapshotMetadata service over a private UNIX domain socket. The sidecar handles tasks such as validating the Kubernetes authentication token, authorizing the backup application, validating RPC parameters, and fetching necessary provisioner secrets. The CSI driver advertises the existence of the SnapshotMetadata service to backup applications via a SnapshotMetadataService CR, containing the service’s TCP endpoint, CA certificate, and audience string for token authentication.

Backup applications must obtain an authentication token using the Kubernetes TokenRequest API with the service’s audience string before accessing the SnapshotMetadata service. They should establish trust with the specified CA and use the token in gRPC calls to the service’s TCP endpoint. This setup ensures secure, efficient metadata retrieval without overloading the Kubernetes API server.

The goals of this enhancement are to provide a secure CSI API for identifying allocated and changed blocks in volume snapshots, and to efficiently relay large amounts of snapshot metadata from the storage provider. This API is an optional component of the CSI framework.

Other enhancements in Kubernetes 1.31

#4193 Bound service account token improvements

Stage: Graduating to Beta
Feature group: sig-auth

The proposal aims to enhance Kubernetes security by embedding the bound Node information in tokens and extending token functionalities. The kube-apiserver will be updated to automatically include the name and UID of the Node associated with a Pod in the generated tokens during a TokenRequest. This requires adding a Getter for Node objects to fetch the Node’s UID, similar to existing processes for Pod and Secret objects.

Additionally, the TokenRequest API will be extended to allow tokens to be bound directly to Node objects, ensuring that when a Node is deleted, the associated token is invalidated. The SA authenticator will be modified to verify tokens bound to Node objects by checking the existence of the Node and validating the UID in the token. This maintains the current behavior for Pod-bound tokens while enforcing new validation checks for Node-bound tokens from the start.

Furthermore, each issued JWT will include a UUID (JTI) to trace the requests made to the apiserver using that token, recorded in audit logs. This involves generating the UUID during token issuance and extending audit log entries to capture this identifier, enhancing traceability and security auditing.

#3962 Mutating Admission Policies

Stage: Net New to Alpha
Feature group: sig-api-machinery

Continuing the work started in KEP-3488, the project maintainers have proposed adding mutating admission policies using CEL expressions as an alternative to mutating admission webhooks. This builds on the API for validating admission policies established in KEP-3488. The approach leverages CEL’s object instantiation and Server Side Apply’s merge algorithms to perform mutations.

The motivation for this enhancement stems from the simplicity needed for common mutating operations, such as setting labels or adding sidecar containers, which can be efficiently expressed in CEL. This reduces the complexity and operational overhead of managing webhooks. Additionally, CEL-based mutations offer advantages such as allowing the kube-apiserver to introspect mutations and optimize the order of policy applications, minimizing reinvocation needs. In-process mutation is also faster compared to webhooks, making it feasible to re-run mutations to ensure consistency after all operations are applied.

The goals include providing a viable alternative to mutating webhooks for most use cases, enabling policy frameworks without webhooks, offering an out-of-tree implementation for compatibility with older Kubernetes versions, and providing core functionality as a library for use in GitOps, CI/CD pipelines, and auditing scenarios.

#3715 Elastic Indexed Jobs

Stage: Graduating to Stable
Feature group: sig-apps

Also graduating to Stable, this feature allows mutating spec.completions on Indexed Jobs, provided it matches and is updated together with spec.parallelism. The success and failure semantics remain unchanged for jobs that do not alter spec.completions. For jobs that do, failures always count against the job’s backoffLimit, even if spec.completions is scaled down and the failed pods fall outside the new range. The status.Failed count will not decrease, but status.Succeeded will update to reflect successful indexes within the new range. If a previously successful index is out of range due to scaling down and then brought back into range by scaling up, the index will restart.


SANS Cloud-Native Application Protection Platforms (CNAPP) Buyers Guide
https://sysdig.com/blog/sans-cnapp-buyers-guide/ | Thu, 25 Jul 2024

The SANS Cloud-Native Application Protection Platform (CNAPP) Buyers Guide gives companies a deep dive into what to look for in a CNAPP solution. As organizations continue to shift towards integrated platform-based solutions for their cloud security needs, it becomes critical to evaluate whether a CNAPP solution meets all the requirements across use cases like posture management, permissions management, vulnerability management, and threat detection and response. Ideally, teams will be able to unify these capabilities within a single, comprehensive platform to manage risk and defend against attacks.

The SANS CNAPP Buyers Guide provides an in-depth look at what criteria to consider when purchasing a CNAPP solution, as well as a checklist of required and desired capabilities for the security platform. By utilizing this guide as a resource to navigate the buying process, you can ensure your security platform provides a unified cloud and container security experience with no blind spots. Download the full guide here.

Why Purchase a CNAPP?

The explosive growth of cloud and containers has created an expanded and dynamic attack surface that security teams need to defend. As more developers deploy containerized microservices and utilize cloud services and infrastructure, monitoring and protecting them becomes more complex. Security teams now have dynamic workloads with 10–100x more containerized compute instances, large volumes of cloud assets with dynamic activity to track, and messy and overly permissive identity and access management (IAM) permissions to manage. This rapid expansion of the attack surface in cloud-native applications has led to many vulnerabilities, misconfigurations, and security weaknesses to manage, and security teams need a tool that provides full visibility across cloud and containers.

As weaknesses in security posture have increased, security and operations teams have become overwhelmed by the number of alerts and vulnerabilities they face, leaving organizations with long exposure windows to critical vulnerabilities. As the adoption of cloud services and containers/Kubernetes increases the sources of data to analyze, you need a way to process all this data into insights that can be applied to remediating security issues. Without significant additional context on your cloud workloads and infrastructure, it is difficult for teams to prioritize which of these alerts actually present significant risks and which are just noise. An effective CNAPP will use knowledge of which containers and packages are actually running to provide actionable insights that security and DevOps teams can use to prioritize the most critical risks.

The move to the cloud has also led to an evolution in the threat landscape to take advantage of the security gaps in cloud-native applications. Bad actors have adapted their tactics and techniques to quickly compromise cloud environments with valid credentials, find and exploit vulnerabilities, and move laterally across workloads and clouds to extract maximum return from any breach. The changes to the threat landscape call for a complete solution that can detect these modern threats throughout your cloud-native infrastructure.

Traditional Tools Fall Short

Many traditional security tools are not suited to cloud workloads, environments, and the threats that have evolved to take advantage of their weaknesses. Tools like endpoint detection and response (EDR) solutions lack critical visibility into cloud services, workloads, and Kubernetes, and create blind spots that can easily be exploited. Traditional tools also often send many alerts and signals, but lack the context needed to rapidly and effectively respond to threats in cloud-based applications and workloads. The dynamic nature of software development and deployment, as well as the ephemeral nature of containerized environments, only add to the complexity, and security and DevOps teams need a security tool specifically designed to handle cloud-native environments.

Further, point solutions don’t work. Often organizations must choose from among multiple solutions, or even choose vendors that stitch together a workflow from multiple acquisitions. These tools don’t communicate with each other or share context, resulting in a reactive approach of dealing with disparate vulnerability findings, posture violations, and threats as they become a problem. This approach leaves teams without the insights they need to prioritize issues based on their impact.

What to Look for in a CNAPP Solution

Security and DevOps teams need comprehensive visibility into workloads, cloud activity, and user behavior in real time. The number of signals that teams have to make sense of is exploding, and a comprehensive CNAPP solution needs to help users focus on the most critical risks in their cloud-native infrastructure.

This is where having deep knowledge of what’s running right now can help you shrink the list of things that need attention first. Simply put, knowledge of what’s running (or simply what’s in use) is the necessary context needed by security and DevOps teams to take action on the most critical risks first. Ultimately, this context can be fed back early in the development lifecycle to make “shift-left” better with actionable prioritization. With all the sources of data that a CNAPP has to ingest and analyze, an effective CNAPP solution needs runtime insights to help teams focus on the risks that really matter. For example, by filtering on vulnerabilities in packages active at runtime, you can reduce vulnerability noise by up to 95%.

With the SANS CNAPP Buyers Guide, you can make sure your organization is focused on the most critical risks in your cloud infrastructure. The guide includes a detailed checklist of important capabilities and features to look for in a CNAPP solution. While there are too many to list here in full, the capabilities of an effective CNAPP solution fall into these areas.

User Experience: Many solutions today are not intuitive and may be difficult to work with. Effective CNAPP solutions should offer unified security and risk dashboards, as well as aggregated security findings and remediation suggestions through simple interfaces. They should also be simple to deploy.

Cloud Workload Protection (CWP): A CNAPP solution should protect workloads across the software lifecycle, with capabilities in vulnerability management, configuration management for containers/Kubernetes, and runtime security/incident response. The ability to prioritize the most critical vulnerabilities or configurations based on in-use risk exposure is key. The tool should integrate with CI/CD tools, provide rich context to investigate alerts, and give suggestions to fix at the source.

Cloud Security Posture Management (CSPM): Continuous visibility, detection, and remediation of cloud security misconfigurations is key for a CNAPP solution. The solution should offer capabilities in cloud vulnerability management, configuration management, and permissions/entitlement management (e.g., CIEM).

Cloud Detection and Response (CDR): Detection and response capabilities related to cloud-centric threats are critical. Effective CNAPP solutions should expand beyond just workload runtime security and address the cloud control plane to detect suspicious activities across users and services.

Enterprise-grade Platform: Effective CNAPP solutions often have enhancements and additional features that integrate and align with API use, scripting and automation functionality, auditing and logging, and support for large-scale deployments.

Want to see the full list of capabilities? Download the full SANS Cloud-Native Application Protection Platform (CNAPP) Buyers Guide now for all the details.

Introducing Layered Analysis for enhanced container security
https://sysdig.com/blog/layered-analysis-for-enhanced-container-security/ | Tue, 23 Jul 2024

Containerized applications deliver exceptional speed and flexibility, but they also bring complex security challenges, particularly in managing and mitigating vulnerabilities within container images. To tackle these issues, we are excited to introduce Layered Analysis — an important enhancement that provides precise and actionable security insights.

What’s new: Layered Analysis capabilities

Layered Analysis enhances our container security toolkit by offering a granular view of container images, breaking them down into their constituent layers. This capability enables more accurate identification of vulnerabilities and more efficient remediation workflows, since each finding can be traced to the layer that introduced it.

Key benefits

  • Enhanced accuracy and reduced time to fix: Identify vulnerabilities at each container image layer, pinpointing the specific package and instruction responsible, thereby reducing fix time.
  • Facilitate attribution and ownership: Discern whether vulnerabilities belong to the base image or the application layers, aiding in proper team assignment and resolution.
  • Actionable insights: Receive practical, contextual recommendations to expedite and prioritize vulnerability resolution.

Detailed insights with Layered Analysis

Container images are constructed in layers, with each change or instruction during the build process creating a new layer. Layered Analysis helps detect and display vulnerabilities and packages associated with each image layer, identifying different remediation actions and ownership depending on the layer introducing the vulnerabilities.


For example, vulnerabilities in the base OS layer, such as an end-of-life (EOL) Alpine version, can be remediated by updating the base image version, a task typically performed by the security team. In contrast, vulnerabilities in the application or non-OS layers, such as outdated Go libraries like Gin or Echo, can be addressed by updating the versions of libraries and dependencies, tasks that fall to the development teams.


How to enable and use Layered Analysis

Layered Analysis is now generally available and requires the following components for full functionality:

  • Cluster and Registry Scanners: Automatically supported with platform scanning.
  • CLI Version 1.12.0 or Higher: Ensure you are using the latest CLI version.
  • CLI Enhancements: Utilize new flags (--separate-by-layer and --separate-by-image) to modify output and view image hierarchy or layer information.
  • JSON Outputs: Updated to include new fields for detailed layer information.

Exploring the image hierarchy

Understanding the image hierarchy is key to Layered Analysis.

This view shows the difference between base images and application layers, helping you quickly identify where vulnerabilities come from:

  • All layers: Shows the total number of vulnerabilities in the final image, including both application and OS layers. If a vulnerability is fixed in an intermediate layer, it won’t be included in the total count.
  • Base Images (prefixed with FROM): Display vulnerabilities present in the base image, including those inherited from parent images.
  • Application layers: Only show vulnerabilities introduced in the application layers, excluding those from base images.

Actionable recommendations

Layered Analysis doesn’t just identify vulnerabilities; it also provides recommendations to fix them. You’ll receive suggestions to upgrade base images, address the worst vulnerabilities in application layers, and fix problematic packages. 

These actionable insights help streamline the remediation process, ensuring that vulnerabilities are addressed efficiently and effectively.

Full visibility of image history

Layered Analysis also offers full visibility into the history of your container image. You can see packages that existed in previous layers but were removed in subsequent layers. 

While these packages no longer pose a security issue, having this historical view is invaluable for understanding the evolution of your image and ensuring comprehensive security management. 

This helps teams trace back through changes, making it easier to collaborate and maintain a secure container environment.

Investigate single layers

Another powerful feature of Layered Analysis is the ability to investigate single layers of your container image. You can see exactly what packages exist in each layer and identify any vulnerabilities introduced at that specific stage. 

This granular investigation capability allows teams to pinpoint the source of security issues and understand the impact of each layer’s changes. By isolating and analyzing single layers, you can more effectively manage and remediate vulnerabilities.

Leveraging Layered Analysis for better security

Layered Analysis empowers security and development teams by providing a clear and actionable view of container image vulnerabilities. By enhancing the precision of vulnerability identification and optimizing remediation workflows, teams can effectively reduce risks and improve overall security.

With Layered Analysis, teams can pinpoint exactly where a vulnerability was introduced, identifying the specific layer responsible. This capability is particularly useful in large organizations where multiple teams are involved in the containerized application lifecycle, from building images to deploying and monitoring them: infrastructure engineers create and curate base images, developers package applications, and everyone works together to keep workloads as secure and vulnerability-free as possible and to apply security patches promptly. By tracing vulnerabilities back to their source, teams can determine responsibility and ensure accountability.

By clearly distinguishing between base image and application layer vulnerabilities, Layered Analysis enables more efficient routing of remediation tasks. Security teams can focus on updating base images to mitigate inherited vulnerabilities, while development teams handle issues within the application layers. This structured approach not only streamlines the remediation process but also enhances the overall security posture of containerized environments.

Want to learn more? Reach out to your Sysdig representative, or book a demo here!

CVE-2024-6387 – Shields Up Against RegreSSHion
https://sysdig.com/blog/cve-2024-6387/ | Thu, 04 Jul 2024

On July 1st, the Qualys security team announced CVE-2024-6387, a remotely exploitable vulnerability in the OpenSSH server. This critical vulnerability is nicknamed “regreSSHion” because the root cause is an accidental removal of code that fixed a much earlier vulnerability, CVE-2006-5051, back in 2006. The race condition affects the default configuration of sshd (the OpenSSH daemon).

OpenSSH versions older than 4.4p1 (unless patched for the earlier CVE-2006-5051 and CVE-2008-4109) and versions from 8.5p1 up to, but not including, 9.8p1 are impacted. The general guidance is to update to a fixed version; Ubuntu users can download the updated packages.

According to the researchers, this vulnerability may be difficult to exploit: under lab conditions, the attack requires, on average, 6-8 hours of continuous connection attempts until the maximum number of connections the server accepts is reached.

Why is CVE-2024-6387 significant? 

This vulnerability allows an unauthenticated attacker to gain root-level privileges and remotely access glibc-based Linux systems, where syslog() (a system logging function) itself calls async-signal-unsafe functions via the SIGALRM handler. Researchers believe that OpenSSH on OpenBSD, a notable exception, is not vulnerable by design, because its SIGALRM handler calls syslog_r(), an async-signal-safe version of syslog().

What is the impact?

OpenSSH researchers believe the attacks will improve over time, thanks to advancements in deep learning, and may impact other operating systems, including non-glibc systems. The net effect of exploiting CVE-2024-6387 is full system compromise and takeover, enabling threat actors to execute arbitrary code with the highest privileges, subvert security mechanisms, steal data, and maintain persistent access. The team at Qualys has already identified no fewer than 14 million potentially vulnerable OpenSSH server instances exposed to the internet.

How to find vulnerable OpenSSH packages with Sysdig

You can use your inventory workflows to get visibility into resources and security blind spots across your cloud (GCP, Azure, and AWS), Kubernetes, and container images. Besides patching, you should also limit SSH access to your critical assets.

Here’s how you can look for the vulnerable OpenSSH package within your environment using Sysdig Secure:

  • Navigate to the Inventory tab
  • In the Search bar, enter the following query: 
Package contains openssh

The results show all the resources across your cloud estate that have the vulnerable package. Sysdig provides an overview of all the blind spots that may have gone unchecked within your environment. You can interact with the filters and further reduce your investigation timelines from within a single unified platform.


The need for stateful detections

Exploitation of regreSSHion involves multiple attempts (thousands, in fact) executed in a fixed period of time. This exploit complexity is what downgrades the CVE from a “Critical” to a “High” risk vulnerability.

Using Sysdig, we can detect drift from baseline sshd behaviors. In this case, stateful detections would track the number of failed attempts to authenticate with the sshd server. Falco rules alone detect the potential Indicators of Compromise (IoCs). By pulling this into a global state table, Sysdig can better detect the spike of actual, failed authentication attempts for anonymous users, rather than focus on point-in-time alerting. 

At the heart of Sysdig Secure lies Falco’s unified detection engine. This cutting‑edge engine leverages real‑time behavioral insights and threat intelligence to continuously monitor the multi‑layered infrastructure, identifying potential security incidents. 

Whether it’s anomalous container activities, unauthorized access attempts, supply chain vulnerabilities, or identity‑based threats, Sysdig ensures that organizations have a unified and proactive defense against evolving threats.

References:

https://thehackernews.com/2024/07/new-openssh-vulnerability-could-lead-to.html

https://blog.vyos.io/cve-2024-6387-regresshion

https://www.openssh.com/releasenotes.html

https://github.com/acrono/cve-2024-6387-poc

Happy 10th Birthday Kubernetes!
https://sysdig.com/blog/10-years-of-kubernetes/ | Thu, 06 Jun 2024

As Kubernetes celebrates its 10th anniversary, it’s an opportune moment to reflect on the profound impact Kubernetes has had on the cloud technology landscape. Since its inception, Kubernetes has revolutionized the way we deploy, manage, and scale containerized applications, becoming the de facto orchestration platform for today’s cloud-native ecosystem. This milestone not only highlights Kubernetes’ success as an open-source project but also the vibrant community that has grown around it, driving continuous innovation and collaboration.

In parallel, Sysdig’s journey has been deeply intertwined with Kubernetes’ evolution. From its early days as a pioneering observability solution for containerized workloads, Sysdig has continually evolved to address the growing security needs of the cloud-native world. As we celebrate Kubernetes’ achievements, we also recognize Sysdig’s contributions, particularly through open-source projects like Falco, which enhance the security and resilience of Kubernetes environments. Join me as we walk down memory lane and take a look at Sysdig’s evolution and its significant contributions to the ecosystem over the past decade.

Early Beginnings: Open Source Visibility

Sysdig was founded in 2013 by Loris Degioanni, leveraging his experience as a co-creator of Wireshark, a widely-used network protocol analyzer. Initially, Sysdig focused on providing deep visibility into containerized environments. The creation of open source Sysdig Inspect was a significant milestone in providing visibility into containers and Kubernetes environments. Sysdig Inspect utilized the same deep packet inspection principles from Wireshark, extending them to modern cloud-native applications.

Addressing Security Needs: Introduction of Falco

As Kubernetes rapidly gained traction following its launch in 2014, the need for comprehensive security solutions for containerized workloads became increasingly evident. Recognizing this demand, Sysdig introduced Falco in 2016, an open source project focused on runtime security and threat detection for Kubernetes, containers, and cloud environments. This was a crucial time when Kubernetes was solidifying its position as the outright standard for container orchestration, and tools like Falco played a significant role in enhancing its security posture.

Falco quickly became an essential component of the security toolkit, capable of detecting unexpected behaviors and potential threats in real-time by monitoring system calls. Its significance was further underscored in 2018 when Falco was donated to the Cloud Native Computing Foundation (CNCF), the same organization that had been nurturing Kubernetes since its early days. This move not only highlighted Falco’s importance but also ensured its continued development within this blooming cloud-native ecosystem.

In 2020, the CNCF and The Linux Foundation introduced the Certified Kubernetes Security Specialist (CKS) certification, aimed at professionals who had already obtained the Certified Kubernetes Administrator (CKA) certification and wanted to showcase their expertise in Kubernetes security. Falco was a core component of the CKS certification spec, further highlighting Falco’s integral role in securing Kubernetes environments.

By 2024, as Kubernetes celebrated a decade of revolutionizing application deployment and management, Falco graduated from the CNCF. This milestone marked it as a mature and stable project, ready for widespread adoption in production environments, paralleling Kubernetes’ own journey to maturity and broad acceptance in the industry.

Expanding the Open Source Ecosystem

Following Falco’s success, the community continued to innovate by developing complementary tools to enhance the overall security posture of cloud-native environments:

  • falcosidekick: A companion project that extends Falco’s alerting capabilities by providing a flexible mechanism to forward Falco alerts to various outputs such as Slack, email, or SIEM systems, improving incident response and introspection.
  • falcoctl: A tool designed to simplify the deployment, management, and operation of Falco. It helps streamline security workflows and integrates seamlessly with existing CI/CD pipelines.
  • Promcat: A curated catalog of vetted Prometheus observability integrations. On our journey to provide a scalable Prometheus experience, we found that companies need a reliable toolbox of observability integrations to succeed. In addition to scale and security controls, they need a quick answer to the question: “How can I monitor X, Y, and Z in my cluster?”
  • Falco Talon: Introduced as a dedicated threat mitigation engine for Kubernetes, Falco Talon enables automated responses to detected threats. It uses Kubernetes primitives to take actions like labeling workloads, terminating suspicious pods, and enforcing network policies, thus mitigating threats in real-time.

The Market Evolution: From Disparate Toolsets to CNAPP

The cloud-native security ecosystem has evolved significantly over the past decade. Initially, organizations relied on various separate tools to secure their cloud environments, each addressing specific needs such as protecting workloads, managing permissions, and ensuring compliance. However, the complexity and fragmented nature of these tools led to a growing demand for a more integrated approach to security.

This shift has given rise to the concept of the Cloud-Native Application Protection Platform (CNAPP), which aims to provide comprehensive security by combining the capabilities of these individual tools into a unified platform. Sysdig has been at the forefront of this evolution, continually enhancing its open-source offerings to deliver end-to-end security solutions that cover the entire lifecycle of cloud-native applications. By integrating workload protection, permission management, and posture management into a single platform, Sysdig simplifies security operations, improves visibility, and enhances the overall security posture of Kubernetes.

Conclusion

Sysdig’s journey over the past 10 years mirrors the rapid evolution of the cloud-native ecosystem. Sysdig has consistently driven innovation to meet the growing demands of modern, containerized environments. As we celebrate Kubernetes’ 10th anniversary, it’s clear that Sysdig’s contributions have been instrumental in shaping the future of cloud-native security, ensuring that organizations can confidently adopt and secure their cloud-native applications.

If you want to see how far Kubernetes has come over the past 10 years, James Spurin from DiveInto shared an interactive, hands-on, in-browser version of the very first version of Kubernetes (v1.0.0). The lab is accessible through his GitHub repository and you can run it absolutely for FREE! Kubernetes has come a long way since the first ever official release of the project, and this is a cool way to celebrate the evolution of the project.

Wireshark: Ethereal Network Analysis for the Cloud SOC
https://sysdig.com/blog/wireshark-ethereal-network-analysis-for-the-cloud-soc/ | Wed, 05 Jun 2024

Remember Wireshark from the good old days of your IT degree or early engineering adventures? Well, guess what? It’s still kicking and just as relevant today as it was back then, and guess what else? It is still open source! Do your engineering or security teams use it? There’s a good chance they do if you’re on-premises. Believe it or not, Wireshark isn’t just for the land of wires and cables anymore. With some help from Falco and Kubernetes, it has a place in the cloud SOC.

In case you’re wondering what we are talking about, let us explain. Wireshark is a high caliber detective tool that allows users to both capture and scan traffic running on a network, in real time. It is useful for:

  • Network Monitoring: Wireshark allows you to monitor network traffic in real time, helping to detect anomalies or suspicious activities.
  • Network Forensics: It enables the inspection of captured traffic for investigating security incidents, identifying attack patterns, and understanding the scope of a breach.
  • Protocol Analysis: Wireshark provides deep insight into network protocols, aiding in understanding how systems communicate and identifying vulnerabilities or misconfigurations.
  • Security Auditing: By analyzing network traffic, Wireshark can help in auditing network security policies, ensuring compliance, and identifying potential weaknesses in the network infrastructure.

The cloud security game is about speed these days. Doesn’t this sound like something useful for security teams looking to find and respond to threats faster? A SOC for an on-premises environment can still use Wireshark as it has for 20 years. However, we want a cloud SOC to be able to use Wireshark. Real-time network detection, analysis, and response is necessary in the cloud. A packet capture file (PCAP) is still relevant for cloud-native environments, as it holds a plethora of information. You just need to know how to generate the file using Kubernetes and containers. 

Using Wireshark and the open source threat management tool Falco Talon, your SOC can automatically receive contextualized packet capture details related to a detection alert. This speeds up the investigation process, which was, and can still be, quite tedious. But in reality, it shouldn’t take long at all because you only have minutes to investigate a cloud attack. The faster the investigation, the sooner remediation can take place. 

Wireshark’s versatility and cost (it’s free!) makes it a valuable asset in the arsenal of security professionals, providing deep visibility into network traffic and aiding in maintaining a secure and resilient network infrastructure.

Want more technical details or want to share with your team? We have a technical explanation and workflow for practitioners.

Is your team looking at packets? Join us at SharkFest this year and take your skills to a whole other level!

What’s New in Sysdig – May 2024
https://sysdig.com/blog/whats-new-in-sysdig-may-2024/ | Thu, 30 May 2024

The post What’s New in Sysdig – May 2024 appeared first on Sysdig.

]]>
“What’s New in Sysdig” is back with the May 2024 edition! My name is Dustin Krysak. I’m a Customer Solutions Engineer based in Vancouver, BC, and I’m excited to share our latest updates.

The Sysdig Threat Research Team (TRT) has been busy recently investigating and analyzing new security threats. Their research has uncovered notable vulnerabilities and attack vectors, which they’ve shared insights about through the Sysdig blog. These blog posts include an in-depth look at RUBYCARP, a long-running botnet, and LLMjacking, a technique that can leverage large language models for malicious purposes.

This month, we also announced our latest initiative, the Runtime Insights Partner Ecosystem. If interested, you can check out our blog post and the official press release.

Sysdig Secure

RBAC Permissions Available in Vulnerability Management

Administrators can now create RBAC roles and define which roles can access the Vulnerability Management, Policy, Reporting, and Risk Acceptance functions. For more information, see Custom Roles.

New Version Releases

Stay up-to-date with the latest releases for our scanning tools. May’s updates bring improved functionality, bug fixes, and security enhancements. 

Sysdig CLI Scanner V1.10.0

Runtime Scanner V1.7.0

Host Scanner V0.10.0

Upgrading is easy, but feel free to reach out if you have any questions.

Sysdig Monitor

Alert Editor

When creating alerts, the Alert Editor automatically displays the optimal time window for your alert rule, and every data point in the alert preview now corresponds to an evaluation of the alert rule. You can also Explore Historical Data for Metric alerts.

Sysdig Agents

13.20.0: Enhanced coverage and visibility

Our latest agent update adds support for SUSE Linux and increases visibility into JMX metrics and non-interactive commands.

SUSE Linux Enterprise Server Support

You can now install the Sysdig Agent on SLES 12 and SLES 15.

Capture Non-Interactive Commands in Activity Audit

Activity audit can now capture and report non-interactive commands.

Support for Adding Labels to JMX Metrics

Sysdig added support for labels on JMX metrics collected by the agent. For more information, see Collect JMX Labels.

Defect Fixes

We have several fixes for our agent that landed in May. The complete list can be seen in the release notes.

SDK, CLI, and Tools

Terraform Provider V1.26.0

  • Adds the ability to create, update, and delete posture policies.

For more information, see our Terraform Provider docs.

Sysdig Cloud Connector V0.16.66

  • Makes secure_api_token optional in cluster-shield

Admission Controller v3.9.45

This release is available under helm chart 0.16.2.

  • Makes secure_api_token optional in cluster-shield

Sysdig Secure Jenkins Plugin v2.3.1 

  • Bump embedded scanner to 1.9.2
  • Bug fixes:
    • Ensure that all the logs from the embedded scanner have been written to file for proper retrieval by the trailer
    • Increase the waiting time before stopping the logs trailer to 2s
    • Ensure proper management of vuln-list inside result json
    • Use imageTag (if available) when all policy evaluations pass

Prometheus Integration v1.29.0

  • APPLY changes over PromQL labels on cluster status dashboards
  • ADD restarted pods toplist panel to cluster status dashboard
  • New version mysql-exporter fixing HIGH vulnerabilities
  • New version php-fpm_exporter fixing HIGH vulnerabilities

Open Source

Falco

Falco 0.37.1 is the latest stable release.

New Website Resources

Blogs 

Webinars

Sysdig Training

Kraken Discovery Labs

Attacks no longer take days—they take minutes. Cloud security requires a modern detection and response benchmark. The 555 benchmark specifies that you have 5 seconds to detect, 5 minutes to triage, and 5 minutes to respond.

In this 60-minute workshop, you’ll execute actual cloud attacks like SCARLETEEL and then assume the role of the defender, leveraging threat-hunting strategies to detect and respond immediately in the cloud.

You can sign up for this lab on our website.

Instructor Led Training

We have a new Azure-specific Cloud Security Posture Management (CSPM) lab available for ILT (Instructor Led Training) delivery. This ILT content includes the concepts of zones and Infrastructure as Code, integrated with source control using GitHub or GitLab.

If you are interested in learning more about how to schedule an ILT workshop, please contact your account team.

The post What’s New in Sysdig – May 2024 appeared first on Sysdig.

]]>
Next-Gen Container Security: Why Cloud Context Matters https://sysdig.com/blog/next-gen-container-security-why-cloud-context-matters/ Thu, 30 May 2024 14:30:00 +0000 https://sysdig.com/?p=89779 Container security has experienced significant transformation over the past decade. From the emergence of foundational tools like Docker to the...

The post Next-Gen Container Security: Why Cloud Context Matters appeared first on Sysdig.

]]>
Container security has experienced significant transformation over the past decade. From the emergence of foundational tools like Docker to the maturation of orchestration platforms such as Kubernetes, the container security landscape looks different than it did even a few years ago. With Gartner predicting 95% of organizations will be running containerized applications in production by 2028, it’s clear that container security is going to be a key priority for most organizations moving forward. The rapid evolution of technology has not only driven advancements in containerization but has also created opportunities for attacks targeting containers and cloud-native infrastructure. Attackers are able to automate their reconnaissance and other tactics due to the uniformity of cloud providers’ APIs and architectures, executing attacks in less than 10 minutes. Organizations need to rethink their approach to cloud container security and workload protection or risk being outpaced by these attacks.

A New Normal Brings New Challenges

In modern application development, containers have quickly become a popular tool for developers, providing numerous advantages including improved agility and scalability. They give developers the flexibility to update a specific container or microservice instead of the entire application, greatly speeding the pace of innovation. The convergence of cloud migration and widespread adoption of DevOps practices has made containerization a prevailing trend, empowering organizations to streamline their operations and increase the pace of new releases.

While adoption increases every year, containers are still a relatively young technology, with many companies still in the early stages of their containerization journey. The ever-evolving technology ecosystem surrounding containers, including Kubernetes, introduces constant shifts and updates, and development teams and infrastructure have expanded faster than security teams. As a result, there is a general scarcity of cloud-native security talent and expertise needed to effectively secure these environments. We have also seen developers increasingly shoulder security responsibilities as organizations embrace DevSecOps strategies. Containers offer many advantages for innovation and agility, but they also expand the potential attack surface, posing a challenge for security teams trying to balance security and speed.

Two Sides Of Container Security

As container technology continues to mature, two key security trends have emerged over the last few years. The first revolves around key risks getting obscured by the endless noise and alerts created by many security tools. Under the DevSecOps model, developers are often responsible for fixing vulnerabilities in the code packages they deploy but find themselves overwhelmed by the sheer volume of alerts. Our research found that of cloud workloads with critical or high-severity vulnerabilities, only 1.2% are exploitable, have a fix, and are actually in use by the application. The number of new cloud-related CVEs increased by nearly 200% in 2023, and the sharing of open source container images has left security teams facing a large number of critical and high-severity container vulnerabilities. The challenge many organizations face lies in discerning which of these risks actually have a high chance of exploitation and which can be deprioritized. The last thing any developer or security team wants is to waste valuable time sifting through a long list of security findings, only to discover that many are inconsequential.

The second major trend is the impressive speed at which cloud attacks now move. As more companies have shifted to cloud-native applications, attackers have adapted to leverage the architecture upon which these apps are built. After finding an exploitable asset, malicious actors need only minutes to execute an attack and start causing damage. The initial stages of cloud attacks can be heavily automated, and attackers are coming up with all sorts of sophisticated techniques to disguise their presence. In just the past year, we’ve observed numerous attacks where a malicious actor gained initial access through a vulnerability in a container image or open source software dependency, including the well-known SSHD backdoor in XZ Utils. Once infiltrating the environment, attackers can easily move laterally – whether from workload to cloud or vice versa – hunting for credentials or sensitive data to exploit for profit.

A Modern Approach To Cloud Container Security And Workload Protection

As the container security landscape evolves, organizations are looking to strike a balance between prevention and defense. Initially, many utilized different tools to secure their containers than they used for other parts of their cloud infrastructure. However, container threats now often cross cloud domains, making this segmented approach slow and outdated. The lack of communication between these tools results in viewing container security in isolation. While an isolated tool might detect a malicious actor breaching a vulnerable container, the post-escape attack path remains obscured. A more robust approach is to use a unified platform that connects the dots across your broad cloud infrastructure to thwart and respond to threats with agility. Already, numerous enterprises have started this journey towards consolidating cloud security. In the 2023 Gartner® Market Guide for Cloud-Native Application Protection Platforms (CNAPP), Gartner predicts this trend will continue, forecasting that by 2025, 60% of enterprises will have consolidated cloud workload protection platform (CWPP) and cloud security posture management (CSPM) capabilities to a single vendor or CNAPP. Container security falls squarely into this category of CWPP and security leaders and practitioners will need to keep up with this change as the boundaries between domains across the cloud begin to blur.

To adapt to this new normal, organizations need to rethink their approach to container security. Despite the evolving threat landscape, the fundamental challenge remains the same: security and developer teams must catch vulnerabilities in container images and detect threats at runtime. But now, they must approach this challenge with a different lens. In the modern environment, container security and workload protection need cloud context to be truly effective. Correlating container findings with context across the cloud is essential to getting the full picture of how an attacker can exploit your environment. Armed with this context, teams can focus on active real-time risk in their organization and view containers as part of a larger story.

Container security and workload protection typically encompass use cases like threat detection and response, vulnerability management, and Kubernetes security posture management (KSPM). These elements remain critical, but this new approach integrates them with findings like real-time configuration changes, risky identity behavior, and cloud log detections. These other findings are usually associated with CSPM but are becoming relevant for container security. Combining these factors with real-time contextual insights on vulnerabilities and container threats paints a comprehensive picture of potential attack paths throughout a user's environment. Solely focusing on containers may reveal an initial breach but fails to unveil the extent of damage or anticipate the attacker's next move. As long as your organization has workloads running in the cloud, this additional cloud context provides great value.

Bringing The Best Of Agent And Agentless To Workload Protection

The best way to achieve this balance of security and speed combines agent-based and agentless strategies. There is an ongoing debate over whether agent-based or agentless approaches are more effective, with agentless instrumentation becoming a popular approach due to its ease of deployment and rapid time to value. For this reason, many security teams prefer to implement an agentless approach wherever possible. While there are benefits to both approaches, the most effective solutions will integrate both for comprehensive visibility. For containers, agents provide deeper runtime visibility and real-time detection for faster time to discovery. Unfortunately, it is not always possible to deploy them universally due to resource constraints.

In these cases, leveraging agentless instrumentation to supplement agents ensures full breadth of coverage across your infrastructure. For container security, deploying agents strategically allows you to prioritize vulnerabilities based on in-use packages and detect threats in real time – capabilities that are not possible with a solely agentless approach. Supplementing this with agentless deployments enables quick basic vulnerability scanning across all containers. As previously highlighted, integrating cloud context into workload protection – often achieved through agentless means – is a great way to anticipate and combat live attacks. This approach not only tackles the traditional challenges associated with container security and workload protection but also provides a full picture and rich context to address the most significant risks. Both approaches bring clear benefits to container security, but this new approach of implementing agentless where possible to supplement the deeper insights from agents brings the best of both worlds.

Security Must Continue To Adapt

The rise of containerization and cloud-native applications, along with the advances made by attackers, has brought us to a challenging point for workload protection. In this constant chess game, security teams must remain proactive and adaptable, continuously evolving their defenses or risk being breached by emerging threats.

Ultimately, organizations that adapt the quickest will be best equipped to detect attacks that strike without warning in a matter of minutes. As boundaries between cloud domains continue to become less defined and the market moves towards consolidation, the ability to connect events across your entire cloud infrastructure will be key to protecting your assets and mitigating risk.

The post Next-Gen Container Security: Why Cloud Context Matters appeared first on Sysdig.

]]>
Optimizing Wireshark in Kubernetes https://sysdig.com/blog/optimizing-wireshark-in-kubernetes/ Tue, 21 May 2024 17:00:00 +0000 https://sysdig.com/?p=89616 In Kubernetes, managing and analyzing network traffic poses unique challenges due to the ephemeral nature of containers and the layered...

The post Optimizing Wireshark in Kubernetes appeared first on Sysdig.

]]>
In Kubernetes, managing and analyzing network traffic poses unique challenges due to the ephemeral nature of containers and the layered abstraction of Kubernetes structures like pods, deployments, and services. Traditional tools like Wireshark, although powerful, struggle to adapt to these complexities, often capturing excessive, irrelevant data – what we call “noise.”

The Challenge with Traditional Packet Capturing

The ephemerality of containers is one of the most obvious issues. By the time a security incident is detected and analyzed, the container involved may no longer exist. When a pod dies in Kubernetes, its controller is designed to replace it almost instantly, and the replacement comes up with new context, such as a new IP address and pod name. As a starting point, we need to look past the static context of legacy systems and try to do forensics based on Kubernetes abstractions such as network namespaces and service names.

It's worth highlighting that there are some clear contextual limitations of Wireshark in cloud-native environments. Tools like Wireshark are not inherently aware of Kubernetes abstractions. This disconnect makes it hard to relate network traffic directly back to specific pods or services without significant manual configuration and contextual stitching. Thankfully, Falco carries Kubernetes context in its rule detections. Wireshark with Falco bridges the gap between raw network data and the intelligence provided by the Kubernetes audit logs, giving us associated metadata from the Falco alert for the network capture.
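
To make that concrete, here is a simplified sketch of the kind of Falco rule that carries Kubernetes context into its alert output. The rule name and list contents are illustrative only, and the outbound and container macros are assumed to come from Falco's default ruleset:

  - list: c2_server_addresses
    items: ["203.0.113.10", "203.0.113.11"]   # illustrative placeholder addresses

  - rule: Outbound Traffic to Suspected C2 Server
    desc: Detect outbound connections from a container to a suspected C2 address
    condition: outbound and container and fd.sip in (c2_server_addresses)
    output: >
      Outbound connection to suspected C2 server (pod=%k8s.pod.name
      ns=%k8s.ns.name image=%container.image.repository dest=%fd.sip:%fd.sport)
    priority: CRITICAL

The pod name, namespace, and image in that output are exactly the context a raw packet capture can't give you on its own.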

Finally, there's the challenge of data overload associated with PCAP files. Traditional packet capture strategies, such as those employed by AWS VPC Traffic Mirroring or GCP Traffic Mirroring, often result in vast amounts of data, most of which is irrelevant to the actual security concern, making it harder to isolate important information quickly and efficiently. Comparatively, options like AWS VPC Flow Logs or Azure's virtual network TAP, although less complex, still incur significant data transfer and storage costs.

When’s the appropriate time to start a capture? How do you know when to end it? Should it be pre-filtered to reduce the file size, or should we capture everything and then filter out noise in the Wireshark GUI? We might have a solution to these concerns that bypasses the complexities and costs of cloud services.

Introducing a New Approach with Falco Talon

Organizations have long dealt with security blind spots related to Kubernetes alerts. Falco and Falco Talon address these shortcomings by integrating Falco, a cloud-native detection engine, with tshark, the terminal version of Wireshark, for more effective and targeted network traffic analysis in Kubernetes environments.

Falco Talon's event-driven, API-based approach to threat response is the best way we've found to initiate captures in real time. It's also the most stable approach given the existing state of the art in cloud-native security – notably, Falco.

Step-by-Step Workflow:

  • Detection: Falco, designed specifically for cloud-native environments like Kubernetes, monitors the environment for suspicious activity and potential threats. It is finely tuned to understand Kubernetes context, making it adept at spotting Indicators of Compromise (IoCs). Let’s say, for example, it triggers a detection for specific anomalous network traffic to a Command and Control (C2) server or botnet endpoints.
  • Automating Tshark: Upon detection of an IoC, Falco sends a webhook to the Falco Talon backend. Talon ships many no-code response actions, and one of them lets users trigger arbitrary scripts. That trigger can use the metadata associated with the Falco alert, so a tshark command is initiated automatically with context specific to the incident (see the sketch after this list).
  • Contextual Packet Capturing: A PCAP file is then generated over a few seconds with tailored context. For a suspicious TCP traffic alert from Falco, the tshark command can filter for just TCP activity; for a suspicious botnet endpoint, it can capture all traffic to that endpoint. In each scenario, Falco Talon initiates a tshark capture tailored to the exact network context of the alert, capturing traffic only from the relevant pod, service, or deployment implicated in the security alert.
  • Improved Analysis: Finally, the captured data is immediately available for deeper analysis, providing security teams with the precise information needed to respond effectively to the incident. This is valuable for Digital Forensics & Incident Response (DFIR) efforts, as well as for maintaining regulatory compliance by logging context specific to security incidents in production.
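
What might the triggered script look like? Here is a rough, hypothetical sketch: the POD, NAMESPACE, and C2_IP variables are assumed to be populated by Falco Talon from the alert metadata, and nicolaka/netshoot is simply one convenient image that ships with tshark:

  #!/bin/sh
  # Hypothetical response script kicked off by a Falco Talon script action.
  # POD, NAMESPACE, and C2_IP are assumed to be injected from the Falco alert.
  POD="${POD:?pod name from alert}"
  NAMESPACE="${NAMESPACE:?namespace from alert}"
  C2_IP="${C2_IP:?suspect destination from alert}"

  # Attach an ephemeral debug container that shares the pod's network namespace
  # and capture 30 seconds of traffic to the suspect endpoint only.
  kubectl debug -n "$NAMESPACE" "$POD" --image=nicolaka/netshoot -- \
    tshark -i eth0 -a duration:30 -f "host $C2_IP" -w "/tmp/${POD}-c2.pcap"
  # In practice the resulting PCAP would then be shipped to object storage
  # or wherever the SOC picks up its evidence.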

This targeted approach not only reduces the volume of captured data, making analysis faster and more efficient, but also ensures that captures are immediately relevant to the security incidents detected, enhancing response times and effectiveness.
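
Once that smaller, targeted PCAP exists, even a quick pass with tshark display filters can answer the first triage questions. The file name and filters below are just examples:

  # Summarize which endpoints the pod talked to during the capture window
  tshark -r checkout-c2.pcap -q -z conv,tcp

  # Pull out DNS lookups and new TCP connections, with timestamps
  tshark -r checkout-c2.pcap -Y "dns || tcp.flags.syn == 1" \
    -T fields -e frame.time -e ip.src -e ip.dst -e _ws.col.Info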

Collaboration and Contribution

We believe this integrated approach marks a significant advancement in Kubernetes security management. If you are interested in this innovative project or have insights to share, feel free to contribute to the GitHub project today.

This method aligns with the needs of modern Kubernetes environments, leveraging the strengths of both Falco and Wireshark to provide a nuanced, powerful tool for network security. By adapting packet capture strategies to the specific demands of cloud-native architectures, we can significantly improve our ability to secure and manage dynamic containerized applications.

Open source software (OSS) is the only approach with the agility and broad reach to set up the conditions to meet modern security concerns, well-demonstrated by Wireshark over its 25 years of development. Sysdig believes that collaboration brings together expertise and scrutiny, and a broader range of use cases, which ultimately drives more secure software.

This proof-of-concept involves three OSS technologies (Falco, Falco Talon, and Wireshark). While the scenario was specific to Kubernetes, there is no reason why it cannot be adapted to standalone Linux systems, Internet of Things (IoT) devices, and edge computing in the future.

The post Optimizing Wireshark in Kubernetes appeared first on Sysdig.

]]>
Accelerating AI Adoption: AI Workload Security for CNAPP https://sysdig.com/blog/ai-workload-security-for-cnapp/ Tue, 30 Apr 2024 13:45:00 +0000 https://sysdig.com/?p=88105 When it comes to securing applications in the cloud, adaptation is not just a strategy but a necessity. We’re currently...

The post Accelerating AI Adoption: AI Workload Security for CNAPP appeared first on Sysdig.

]]>
When it comes to securing applications in the cloud, adaptation is not just a strategy but a necessity. We’re currently experiencing a monumental shift driven by the mass adoption of AI, fundamentally changing the way companies operate. From optimizing efficiency through automation to transforming the customer experience with speed and personalization, AI has empowered developers with exciting new capabilities. While the benefits of AI are undeniable, it is still an emerging technology that poses inherent risks for organizations trying to understand this changing landscape. That’s where Sysdig comes in to secure your organization’s AI development and keep the focus on innovation.

Today, we are thrilled to announce the launch of AI Workload Security to identify and manage active risk associated with AI environments. This new addition to our cloud-native application protection platform (CNAPP) will help security teams see and understand their AI environments, identify suspicious activity on workloads that contain AI packages, and prioritize and fix issues fast.

Skip ahead to the launch details!

AI has changed the game

The explosive growth of AI in the last year has reshaped the way many organizations build applications. AI has quickly become a mainstream topic across all industries and a focus for executives and boards. Advances in the technology have led to significant investment in AI, with more than two-thirds of organizations expected to increase their AI investment over the next three years across all industries. GenAI specifically has been a major catalyst of this trend, driving much of this interest. The Cloud Security Alliance’s recent State of AI and Security Survey Report found that 55% of organizations are planning to implement GenAI solutions this year. Sysdig’s research also found that since December 2023, the deployment of OpenAI packages has nearly tripled.

With more companies deploying GenAI workloads, Kubernetes has become the deployment platform of choice for AI. Large language models (LLMs) are a core component of many GenAI applications that can analyze and generate content by learning from large amounts of text data. Kubernetes has numerous characteristics that make it an ideal platform for LLMs, providing advantages in scalability, flexibility, portability, and more. LLMs require significant resources to run, and Kubernetes can automatically scale resources up and down, while also making it simple to export LLMs as container workloads across various environments. The flexibility when deploying GenAI workloads is unmatched, and top companies like OpenAI, Cohere, and others have adopted Kubernetes for their LLMs. 

From opportunity to risk: security implications of AI

AI continues to advance rapidly, but the widespread adoption of AI deployment creates a whole new set of security risks. The Cloud Security Alliance survey found that 31% of security professionals believe AI will be of equal benefit to security teams and malicious third parties, with another 25% believing it will be more beneficial to malicious parties. Sysdig’s research also found that 34% of all currently deployed GenAI workloads are publicly exposed, meaning they are accessible from the internet or another untrusted network without appropriate security measures in place. This increases the risk of security breaches and puts the sensitive data leveraged by GenAI models in danger.

Sysdig found that 34% of all currently deployed GenAI workloads are publicly exposed.

Another development that highlights the importance of AI security in the cloud is the set of forthcoming guidelines and increasing pressure to audit and regulate AI, as proposed by the Biden administration's October 2023 Executive Order and the subsequent recommendations from the National Telecommunications and Information Administration (NTIA) in March 2024. The European Parliament also adopted the AI Act in March 2024, introducing stringent requirements on risk management, transparency, and other issues. Ahead of this imminent AI legislation, organizations should assess their own ability to secure and monitor AI in their environments.

Many organizations lack experience securing AI workloads and identifying risks associated with AI environments. Just like the rest of an organization’s cloud environment, it is critical to prioritize active risks tied to AI workloads, such as vulnerabilities in in-use AI packages or malicious actors trying to modify AI requests and responses. Without full understanding and visibility of AI risk, it’s possible for AI to do more harm than good.

Mitigate active AI risk with AI Workload Security

We're excited to unveil AI Workload Security in Sysdig's CNAPP to help our customers adopt AI securely. AI Workload Security allows security teams to identify and prioritize workloads in their environment with leading AI engines and software packages, such as OpenAI and TensorFlow, and detect suspicious activity within these workloads. With these new capabilities, your organization can get real-time visibility of the top active AI risks, enabling your teams to address them immediately. Sysdig helps organizations manage and control their AI usage, whether it's official or deployed without proper approval, so they can focus on accelerating innovation.

Sysdig’s AI Workload Security ties into our Cloud Attack Graph, the neural center of the Sysdig platform, integrating with our Risk Prioritization, Attack Path Analysis, and Inventory features to provide a single view of correlated risks and events.

AI Workload Security in action

The introduction of real-time AI Workload Security helps companies prioritize the most critical risks associated with AI environments. Sysdig’s Risks page provides a stack-ranked view of risks, evaluating which combinations of findings and context need to be addressed immediately across your cloud environment. Publicly exposed AI packages are highlighted along with other risk factors. In the example below, we see a critical risk with the following findings:

  1. Publicly exposed workload
  2. Contains an AI package
  3. Has a critical vulnerability with an exploit on an in-use package
  4. Contains a high confidence event

Based on the combination of findings, users can determine the severity of the risk that exposed AI workloads create. They can also gather more context around the risk, including which packages on the workload are running AI and whether vulnerabilities on these packages can be fixed with a patch.

Digging deeper into these risks, users can also get a more visual representation of the exploitable links across resources with Attack Path Analysis. Sysdig uncovers potential attack paths involving workloads with AI packages, showing how they fit with other risk factors like vulnerabilities, misconfigurations, and runtime detections on these workloads. Users can see which AI packages running on the workload are in use and how vulnerable packages can be fixed. With the power of AI Workload Security, users can quickly identify critical attack paths involving their AI models and data, and correlate with real-time events.

Sysdig also gives users the ability to identify all of the resources in their cloud environment that have AI packages running. AI Workload Security empowers Sysdig's Inventory, enabling users to view a full list of resources containing AI packages with a single click, as well as identify risks on these resources.

Want to learn more?

Armed with these new capabilities, you’ll be well equipped to defend against active AI risk, helping your organization realize the full potential of AI’s benefits. These advancements provide an additional layer of security to our top-rated CNAPP solution, stretching our coverage further across the cloud. Click here to learn more about Sysdig’s leading CNAPP.

See Sysdig in action

Sign up for our Kraken Discovery Lab to execute real cloud attacks and then assume the role of the defender to detect, investigate, and respond.

The post Accelerating AI Adoption: AI Workload Security for CNAPP appeared first on Sysdig.

]]>