Why CNAPP Needs Runtime Insights to Shift Left and Shield Right https://sysdig.com/blog/cnapp-runtime-insights-shift-left-shield-right/ Fri, 17 Mar 2023
There’s an important shift happening in the cloud security industry: organizations are looking for an integrated platform that connects the dots between several key security use cases from source through production. Whether it is for tool consolidation, consistent end-to-end experience, or “one throat to choke,” customers are increasingly choosing a platform-based approach to address critical cloud security risks.


This line of thinking is why Sysdig has been laser-focused on providing a unified cloud and container security experience for our customers. From our perspective, with the introduction of the 2023 Gartner®  Market Guide for Cloud-Native Application Protection Platforms (CNAPPs), this trend is finally becoming the mainstream approach.

What Is a CNAPP?

Cloud Native Application Protection Platforms (CNAPPs) combine functionality for Cloud Security Posture Management (CSPM), Cloud Workload Protection (CWP), Cloud Infrastructure Entitlement Management (CIEM), and Cloud Detection and Response (CDR) security into one security platform. These integrated capabilities allow DevOps to ship applications fast without security becoming a bottleneck while also allowing security teams to manage risk and defend against attacks. 

Why Do Security and DevOps Teams Need a CNAPP?

  • Visibility gap when moving to cloud and containers: Empowered developers are configuring infrastructure at will and deploying containerized microservices with the click of a button. Now you have dynamic workloads with 10–100X more containerized compute instances, large volumes of cloud assets with dynamic activity to track, and messy, overly permissive IAM permissions to manage. Without a single tool that analyzes these data sources, blind spots emerge and risk abounds.

  • Point solutions don’t work: Customers often must choose among multiple solutions, or vendors that stitch together a workflow from multiple acquisitions. Either way, these tools don’t communicate with each other or share context. Teams are stuck wading through disparate vulnerability findings, posture violations, and threats, handling them as one-off issues instead of working from a single list stack-ranked by risk and impact.

  • Talent shortage: Development teams and infrastructure have expanded faster than security teams, and there is a shortage of cloud-native security talent. Customers are looking to partner with a trusted leader in this space, one that can provide an opinionated workflow to address these challenges.

Why CNAPP Needs Runtime Insights 

A CNAPP is, by definition, a data platform that ingests and analyzes multiple data sources. Data volume explodes as you factor in the adoption of microservices built on containers and Kubernetes, quickly producing a gargantuan stream of both high- and low-fidelity signals. This ultimately raises the question: how do I focus on the most critical risks in my cloud-native infrastructure?

This is where having deep knowledge of what’s running right now can help you shrink down the list of things that need attention first. Simply put, knowledge of what’s running (a.k.a. runtime insights) is the necessary context needed by security and DevOps teams to take action on the most critical risks first. Ultimately, this context can be fed back early in the development lifecycle to make “shift-left” use cases of CNAPP better with actionable prioritization.

In addition, many customers are starting to see detection and response as a first-class citizen within CNAPP. Their needs are expanding beyond workload runtime security to the cloud control plane, analyzing cloud logs to detect suspicious activity across users and services. This subset of CNAPP is seen as cloud detection and response, and it will evolve further to fill the gaps left by EDR and the native capabilities of cloud and platform providers.


Recommendations for Security Professionals When Evaluating a CNAPP

In the Market Guide for Cloud-Native Application Protection Platforms (CNAPP), Gartner® shares several recommendations for security and risk management leaders. Based on our understanding from the report, we’ve provided several questions to help you navigate the buying process.

Do they address a broad set of security use cases from source to production? This includes capabilities such as:

  • IaC security
    • Scanning IaC manifests to identify misconfigurations and security risks before deployment while preventing drift 
  • Vulnerability management / Supply chain security
    • Identifying, prioritizing, and fixing vulnerabilities across your software supply chain (SCM, CI/CD, registry and runtime environments) 
  • Configuration and access management
    • Hardening posture by managing misconfigurations and excessive permissions across cloud environments (cloud resources, users and even ephemeral services like Lambda)
  • Threat detection and response across cloud workloads, users and services
    • A multi-layered detection approach that combines rules- and ML-based policies, enhanced with threat intelligence, along with a detailed audit trail for forensics/IR
  • Compliance
    • Meeting compliance standards (PCI, NIST, HIPAA, etc.) for dynamic cloud/container environments
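As a concrete illustration of the IaC security and configuration checks above, here is a hypothetical Kubernetes manifest with the kind of misconfigurations an IaC scanner would flag before deployment (the names are invented for illustration):

```yaml
# Hypothetical pod spec — an IaC scanner would flag several issues here.
apiVersion: v1
kind: Pod
metadata:
  name: risky-pod
spec:
  containers:
  - name: app
    image: myapp:latest          # mutable tag instead of a pinned digest
    securityContext:
      privileged: true           # container gets host-level privileges
      runAsUser: 0               # runs as root
    # no resource limits set — unconstrained CPU/memory usage
```

Catching a spec like this in the repository or CI pipeline is far cheaper than detecting the resulting privileged container at runtime.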

Can they accurately prioritize what matters? Prioritizing the most critical vulnerabilities, configuration mistakes, and access issues based on in-use risk exposure is key. For example:

  • Understanding which packages are in use at runtime helps you prioritize the most critical vulnerabilities to fix. Our research shows that 87% of container images have high or critical vulnerabilities, but only 15% of vulnerabilities are actually tied to packages loaded at runtime.
  • Real-time cloud activity helps you immediately spot the anomalous behavior and posture drift that are most risky
  • Runtime access patterns help to highlight the excessive permissions to fix first. 

Also evaluate the ability to provide remediation guidance that helps teams make informed decisions directly where it matters most: at the source.
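As a thought experiment, the in-use filtering described above can be sketched in a few lines of Python (the package names, CVE IDs, and data shapes are invented for illustration; in practice the in-use set comes from runtime instrumentation):

```python
# Hypothetical scanner findings for one container image.
findings = [
    {"cve": "CVE-2023-0001", "package": "openssl", "severity": "critical"},
    {"cve": "CVE-2023-0002", "package": "imagemagick", "severity": "high"},
    {"cve": "CVE-2023-0003", "package": "glibc", "severity": "high"},
]

# Runtime insights: packages actually loaded by running containers.
in_use = {"openssl", "glibc"}

# Prioritize findings whose vulnerable package is in use at runtime.
prioritized = [f for f in findings if f["package"] in in_use]

for f in prioritized:
    print(f'{f["cve"]} ({f["severity"]}): {f["package"]} is loaded at runtime')
```

The same shrinking effect applies to posture findings and permissions: the runtime context is what turns a long list into a short, actionable one.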

Can they maximize coverage but also give deep visibility? Evaluate whether CNAPP vendors provide deep visibility and insights across your entire multi-cloud footprint, including IaaS and PaaS, extending across VM, container, and serverless workloads. This often includes agentless approaches for breadth of visibility and control, as well as deep runtime visibility based on instrumentation such as eBPF.

Are they truly getting a consolidated view of risk? Some vendors acquire multiple companies to check the box, and this results in a poor, disjointed experience. Look for a CNAPP vendor that tightly integrates the source-to-production use cases, replacing multiple point products with a comprehensive picture of risk across configurations, assets, user permissions, and workloads.

Are they allowing customizations? Every organization is different. The ability to customize policies, filter results and accept risk based on the organization’s unique environment is key to successfully adopting a solution.

Are they tightly integrated with the DevOps and security ecosystem? The CNAPP tool must integrate with CI/CD tools to scan for misconfigurations and vulnerabilities pre-deployment, and with SIEM/notification tools to trigger alerts and forward events so teams can act immediately. Guidance on how to fix is key: the tool needs to map a violation back to the IaC file, provide situational awareness through rich context when investigating an alert, and suggest fixes (in the form of a pull request, for example) where it matters: at the source.

Sysdig’s strength at runtime manifests as real-time visibility for detection and response and provides rich context that is required to prioritize what matters. Without it, organizations are left blind and overwhelmed, and ultimately less secure. You can download the complimentary report, and review the full set of recommendations for yourself.

Gartner, Market Guide for Cloud-Native Application Protection Platforms, Neil MacDonald, Charlie Winckless, Dale Koeppen, 14 March 2023.

GARTNER is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

AWS ECR Scanning with Sysdig Secure https://sysdig.com/blog/aws-ecr-scanning/ Tue, 23 Nov 2021
As container adoption in AWS takes off, ECR scanning is the first step towards delivering continuous security and compliance. You need to ensure you are scanning the images you pull from AWS ECR for both vulnerabilities and misconfigurations, so that you avoid deploying exploitable applications on AWS.

Sysdig Secure embeds security and compliance across all stages of the Kubernetes lifecycle. Leveraging 15+ threat feeds, Sysdig Secure provides a single workflow to detect vulnerabilities and security or compliance-related misconfigurations.

Understanding Image Scanning

[Image: image scanning schema]

Image scanning refers to the process of analyzing the contents and the build process of a container image in order to detect security issues, vulnerabilities, or bad practices. There are several ways to leverage image scanning depending on the use case:

  • Integrate scanning into development lifecycle: As your teams build applications, Sysdig prevents vulnerable images from being pushed through your CI/CD pipelines (Jenkins, Bamboo, Gitlab, or AWS CodePipeline) and identifies new vulnerabilities in production. Sysdig Secure is part of the Sysdig Secure DevOps Platform, which lets you confidently run cloud-native workloads in production. This is primarily requested by development teams.

  • You can also launch inline scanning manually: Read more in the official documentation or have a look at some examples. This is mostly used by security operators. It is also possible to trigger backend scanning manually for images hosted in ECR or any V2 registry by using the “Scan Image” button (this is the old-fashioned way and is no longer recommended).

  • If you are using AWS ECR, you can get automatic image scanning on push, so that every image is secured from the very first step, as soon as it enters the registry. This is the approach we cover in this article.

If you would like to know more about image scanning, we strongly recommend following the best practices.

AWS ECR scanning with Sysdig Secure

[Image: diagram of AWS ECR and Sysdig integration]

Amazon Elastic Container Registry (ECR) is a fully-managed Docker container registry that makes it easy for developers to store, manage, and deploy Docker container images. Amazon ECR is integrated with AWS container services like ECS and EKS, simplifying your development to production workflow.

Sysdig Secure provides additional ECR scanning capabilities on top of ECR’s default image scanning (based on Clair), such as scanning for non-OS vulnerabilities (third-party libraries), misconfigurations, and compliance checks.


You only need to deploy Sysdig Secure for Cloud on your AWS account, using either CloudFormation templates or Terraform, and it provides ECR scanning integration out of the box without adding any address or credential manually. You can read more about ECR integration in the official documentation.

[Image: AWS quick-create stack]

After the installation, you can check that your services have been correctly installed by reviewing this checklist on how to know that your services are working.

Next time an image is pushed into any of your ECR registries, it will be scanned automatically, providing visibility into:

  • Official OS package vulnerabilities
  • Configuration checks:
    • Exposing SSH in a Dockerfile
    • Users running as root
  • Vulnerabilities in third-party libraries:
    • JavaScript NPM modules
    • Python PiP
    • Ruby GEM
    • Java JAR archives
  • Secrets and credentials, like tokens, certificates, and other sensitive data
  • Known vulnerabilities and available updates
  • Metadata, such as image size
  • Compliance checks for several frameworks

[Image: docker push example]

These artifacts are then stored and evaluated against custom scanning policies that can be scoped to a particular registry, repository, or image tag. These policies help detect vulnerabilities, misconfigurations, or compliance issues within your images and generate pass/fail results directly in the UI.

[Image: scan results dashboard in Sysdig Secure]

The report details any OS and non-OS vulnerabilities discovered. For each vulnerability, Sysdig Secure shows the package version found in the image (which is affected by that vulnerability), as well as the version number that includes the fix for that issue.

To remove the vulnerability, you’ll need to rebuild the container image to include a version of that package that has a fix available.

For example, the affected package might be in the base image itself. In this case, it is best to update the base images directly. In other cases, the package might be installed on top of the base image by a command in the Dockerfile. For example, it’s common to see package manager commands like apt or yum specified in the Dockerfile. If these specify the version of the affected package, you’ll need to edit the Dockerfile.
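For example, if the affected package version is pinned by a package manager command in the Dockerfile, the fix is to bump the pin to a version the scanner reports as fixed (the package and version numbers below are purely illustrative):

```dockerfile
FROM ubuntu:22.04

# Before (flagged by the scanner):
#   RUN apt-get update && apt-get install -y openssl=3.0.1-1
# After — install the fixed version reported in the scan results:
RUN apt-get update && \
    apt-get install -y openssl=3.0.2-0ubuntu1 && \
    rm -rf /var/lib/apt/lists/*
```

After rebuilding and pushing the image, the next automatic scan confirms whether the vulnerability is gone.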

Now that we have discussed ECR scanning, let us talk about ECR vulnerability reporting and ECR vulnerability alerting.

AWS ECR vulnerability reporting and alerting

Application security teams often need to ensure they address any high-severity CVE that has a fix within 30 days.

With Sysdig Secure, you can bring traditional patch management processes to containers. Teams can set up policies for vulnerability reporting on images both in ECR and/or running in a particular AWS cluster or region. You can then query for specific vulnerabilities by conditions like CVE ID, severity, fix availability, age, or other criteria.

For example, if a new CVE has been announced, you may want to report on the images in ECR that are vulnerable to it:

[Image: report list in Sysdig Secure]

After scanning the images currently in ECR, the next question is typically: what were the image scanning results for all of that service’s past builds? Vulnerability management teams need reports on all the scans that have happened against a specific repo in ECR over time.

With Sysdig Secure, you can query by policy and apply a specific scope that answers your question in fewer than three clicks:

[Image: report detail in Sysdig Secure]

Finally, it is easy to set up vulnerability alerting for ECR. You can set alerts for your team if a new image is analyzed in ECR or a CVE gets an updated score. You can create downstream notifications via Slack, AWS SNS, etc. or create your own custom webhooks to take specific actions.

[Image: create repository alert in Sysdig Secure]

Conclusion

Hopefully you see how easy it is to get up and running with both AWS Elastic Container Registry and Sysdig Secure. You can also dig deeper into Sysdig’s container and Kubernetes image scanning capabilities, or read more about how Sysdig extends security services across AWS container services (EKS, ECS).

Or go to www.sysdig.com and contact us for a custom demo!


Sysdig Secure will help you add one more layer of security to your AWS ECR. You’ll be set in only a few minutes. Try it today!

Getting started with Kubernetes audit logs and Falco https://sysdig.com/blog/kubernetes-audit-log-falco/ Tue, 09 Feb 2021
As Kubernetes adoption continues to grow, Kubernetes audit logs are a critical information source to incorporate in your Kubernetes security strategy. They give security and DevOps teams full visibility into all events happening inside the cluster.

The Kubernetes audit logging feature was introduced in Kubernetes 1.11. It’s a key feature in securing your Kubernetes cluster, as the audit logs capture events like creating a new deployment, deleting namespaces, starting a node port service, etc.

In this article, you will learn what the Kubernetes audit logs are, what information they provide, and how to integrate them with Falco (open-source runtime security tool) to detect suspicious activity in your cluster.


The benefits of Kubernetes audit logs

By integrating Kubernetes audit logs, you can answer questions like:

  • What happened? What new pod was created?
  • Who did it? The user, user groups, or service account.
  • When did it happen? The event timestamp.
  • Where did it occur? The namespace that the pod was created in.

In a Kubernetes cluster, it is the kube-apiserver that performs the auditing. When a request arrives (for example, to create a namespace), it’s sent to the kube-apiserver.

If an attacker, or a careless administrator, screws with the cluster, the actions will be registered in the audit log. Your security tools can parse the events in this log and alert you on suspicious activity.

[Diagram: processing Kubernetes audit logs with Falco. Any action taken on the cluster via the API server is recorded in the audit log and sent to Falco, which applies rules to decide whether an event is suspicious and generates an alert.]

Each request can be recorded with an associated stage. The defined stages are:

  • RequestReceived: The event is generated as soon as the request is received by the audit handler without processing it.
  • ResponseStarted: Once the response headers are sent, but before the response body is sent. This stage is only generated for long-running requests (e.g., watch).
  • ResponseComplete: The event is generated when a response body is sent.
  • Panic: Event is generated when panic occurs.

Now, let’s see this in practice. We’ll show how to introduce a Kubernetes audit policy and enable Kubernetes auditing.

Kubernetes audit policy: An example

A Kubernetes cluster is full of activity, so it’s neither feasible nor practical to record all of it. An audit policy allows you to filter the events and record only the ones you need.

With security in mind, we’ll create a policy that filters requests related to pods, kube-proxy, secrets, configurations, and other key components.

Such a policy would look like:

apiVersion: audit.k8s.io/v1 # This is required.
kind: Policy
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
  - "RequestReceived"
rules:
  # Log pod changes at RequestResponse level
  - level: RequestResponse
    resources:
    - group: ""
      # Resource "pods" doesn't match requests to any subresource of pods,
      # which is consistent with the RBAC policy.
      resources: ["pods"]
  # Log "pods/log" and "pods/status" at the Metadata level
  - level: Metadata
    resources:
    - group: ""
      resources: ["pods/log", "pods/status"]
  # Don't log watch requests by the "system:kube-proxy" on endpoints or services
  - level: None
    users: ["system:kube-proxy"]
    verbs: ["watch"]
    resources:
    - group: "" # core API group
      resources: ["endpoints", "services"]
  # Don't log authenticated requests to certain non-resource URL paths.
  - level: None
    userGroups: ["system:authenticated"]
    nonResourceURLs:
    - "/api*" # Wildcard matching.
    - "/version"
  # Log the request body of configmap changes in kube-system.
  - level: Request
    resources:
    - group: "" # core API group
      resources: ["configmaps"]
    # This rule only applies to resources in the "kube-system" namespace.
    # The empty string "" can be used to select non-namespaced resources.
    namespaces: ["kube-system"]
  # Log configmap and secret changes in all other namespaces at the Metadata level.
  - level: Metadata
    resources:
    - group: "" # core API group
      resources: ["secrets", "configmaps"]
  # A catch-all rule to log all other requests at the Metadata level.
  - level: Metadata
    # Long-running requests like watches that fall under this rule will not
    # generate an audit event in RequestReceived.
    omitStages:
      - "RequestReceived"

One way to apply these rules, as well as to enable Kubernetes auditing, is to pass the --audit-policy-file flag when starting kube-apiserver, along with the audit policy file you defined.

As you can see, you can configure multiple audit rules in a single Kubernetes audit Policy.

The fields that define each audit rule are:

  • level: The audit level defining the verbosity of the event.
  • resources: The object under audit (e.g., “ConfigMaps”).
  • nonResourceURLs: Non-resource URL paths that are not associated with any resource.
  • namespaces: Specific objects within a namespace that are under audit.
  • verbs: Specific operations to audit – create, update, delete.
  • users: Authenticated user that the rule applies to.
  • userGroups: Authenticated user group the rule applies to.
  • omitStages: Skips generating events on given stages.

The audit level defines how much of the event should be recorded. There are four audit levels:

  • None: Don’t log events that match this rule.
  • Metadata: Logs request metadata (requesting user, timestamp, resource, verb, etc.) but does not log the request or response bodies.
  • Request: Log event metadata and request body but not response body. This does not apply for non-resource requests.
  • RequestResponse: Log event metadata, request, and response bodies. This does not apply for non-resource requests.

Be aware that when an event is processed, it is compared against the audit policy rules in order, and the first matching rule sets the audit level of the event. This is unlike Kubernetes RBAC policies, where the rules combine and the most restrictive one applies.
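To make the first-match semantics concrete, here is a deliberately simplified model (not the kube-apiserver implementation — matching is reduced to user and resource checks, and the rules are invented):

```python
# Simplified audit rules, in order; the first match sets the level.
rules = [
    {"level": "None", "users": ["system:kube-proxy"]},
    {"level": "RequestResponse", "resources": ["pods"]},
    {"level": "Metadata"},  # catch-all
]

def audit_level(request):
    for rule in rules:
        if "users" in rule and request["user"] not in rule["users"]:
            continue
        if "resources" in rule and request["resource"] not in rule["resources"]:
            continue
        return rule["level"]  # first matching rule wins
    return "None"  # no rule matched: don't log

print(audit_level({"user": "alice", "resource": "pods"}))                   # -> RequestResponse
print(audit_level({"user": "system:kube-proxy", "resource": "endpoints"}))  # -> None
print(audit_level({"user": "alice", "resource": "secrets"}))                # -> Metadata
```

Note how moving the catch-all rule to the top would silence every more specific rule below it — ordering matters.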

Configuring Falco as an audit backend to receive events

You cannot react to threats on time if the logs are just sitting there, waiting for a forensic investigation.

Integration with a security tool, like a host intrusion detection system (HIDS), will unleash the audit log’s full potential.

You can integrate the Kubernetes audit log with security tools by sending the events in one of two ways:

  • Log backend: Writes the events to the filesystem. If your security tool is installed on the same machine, it can parse the files. You can also process the files manually with a JSON parser like jq and build up queries.
  • Webhook backend: Sends the events to an external HTTP API. Your security tool then doesn’t need access to the filesystem, and a single security tool instance can protect several kube-apiservers.
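With the log backend, each line of the file is one JSON audit event. A jq one-liner works well for ad hoc queries; the same idea looks like this as a small Python sketch (the sample events are invented and trimmed to the relevant fields):

```python
import json

# Two sample audit events, as they would appear line by line in the log file.
lines = [
    '{"kind":"Event","verb":"delete","user":{"username":"alice"},'
    '"objectRef":{"resource":"namespaces","name":"dev"}}',
    '{"kind":"Event","verb":"create","user":{"username":"bob"},'
    '"objectRef":{"resource":"pods","name":"web"}}',
]

# Flag destructive operations and record who performed them.
deletes = []
for line in lines:
    event = json.loads(line)
    if event["verb"] == "delete":
        ref = event["objectRef"]
        deletes.append(f'{event["user"]["username"]} deleted {ref["resource"]}/{ref["name"]}')

print(deletes)  # -> ['alice deleted namespaces/dev']
```

A real security tool applies the same pattern continuously, with far richer rules, instead of one-off queries after the fact.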

For the following examples we’ll use the webhook backend, sending the events to Falco.

Falco, the open-source cloud-native runtime security project, is the de facto Kubernetes threat detection engine. Falco was created by Sysdig in 2016 and is the first runtime security project to join CNCF as an incubation-level project. Falco detects unexpected application behavior and alerts on threats at runtime. By tapping into Sysdig open-source libraries through Linux system calls, it can run in high performance production environments.

Falco, as a webhook backend, will ingest Kubernetes API audit events and provide runtime detection and alerting for our orchestration activity. Falco ships with a comprehensive set of out-of-the-box rules that will help you get started detecting the most common threats. You can easily customize those rules or create your own to adapt Falco to your organization’s needs.

[Diagram: processing Kubernetes audit logs with Falco. Events are received on the embedded web server. The rule engine processes those events, along with events from the kernel module and the Sysdig libraries, and uses the rules to decide whether events are suspicious and generate alerts.]

The following example kube-apiserver configuration would set Falco as a backend webhook:

apiVersion: v1
kind: Config
clusters:
- name: falco
  cluster:
    server: http://$FALCO_SERVICE_CLUSTERIP:8765/k8s_audit
contexts:
- context:
    cluster: falco
    user: ""
  name: default-context
current-context: default-context
preferences: {}
users: []

The URL endpoint in the server field is the remote endpoint that the audit events will be sent to.

To enable the webhook backend, you need to set the --audit-webhook-config-file flag with the webhook configuration file.
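Putting the two flags together, the relevant excerpt of the kube-apiserver command line might look like this (the file paths are illustrative; on kubeadm clusters these flags typically live in the static pod manifest for the apiserver):

```shell
# Excerpt of the kube-apiserver command line (other flags omitted):
kube-apiserver \
  --audit-policy-file=/etc/kubernetes/audit-policy.yaml \
  --audit-webhook-config-file=/etc/kubernetes/audit-webhook.yaml
```

The policy file decides which events are recorded; the webhook config decides where they are sent.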

On versions 1.13 to 1.18 of Kubernetes, the webhook backend can also be configured dynamically via the AuditSink object. Note that this feature was removed in 1.19.

End-to-end example

Now that we have the audit policy defined and our webhook backend is configured to send Kubernetes audit logs to Falco, let’s give them life!

We can post a configmap, like the example below, into the Kubernetes API:

apiVersion: v1
data:
  ui.properties: |
    color.good=purple
    color.bad=yellow
    allow.textmode=true
  access.properties: |
    aws_access_key_id = MY-ID
    aws_secret_access_key = MY-KEY
kind: ConfigMap
metadata:
  name: my-config
  namespace: default

Running:

> kubectl create -f badconfigmap.yaml

This triggers the following out-of-the-box rule (you can check the full list of rules on GitHub):

- rule: Create/Modify Configmap With Private Credentials
  desc: >
     Detect creating/modifying a configmap containing a private credential (aws key, password, etc.)
  condition: kevt and configmap and kmodify and contains_private_credentials
  exceptions:
    - name: configmaps
      fields: [ka.target.namespace, ka.req.configmap.name]
  output: K8s configmap with private credential (user=%ka.user.name verb=%ka.verb configmap=%ka.req.configmap.name config=%ka.req.configmap.obj)
  priority: WARNING
  source: k8s_audit
  tags: [k8s]

If we look now at the Falco log (your pod name will be different):

> kubectl logs -f falco-b4ck9

We should see the following alert/detection:

Warning Kubernetes configmap with private credential (user=minikube-user verb=create configmap=my-config config={"access.properties":"aws_access_key_id=MY-ID aws_secret_access_key=MY-KEY","ui.properties":"color.good=purple color.bad=yellow allow.textmode=true"})

Conclusion

To enhance your Kubernetes security strategy, it is important to stay attentive to new features and improvements, and to incorporate those, like Kubernetes audit logging, that give you visibility into suspicious events or misconfigurations.

The information gathered in these logs can be very useful to understand what is going on in our cluster, and can even be required for compliance purposes.

Tuning the rules with care and using less verbose mode when required can also help us lower costs when using a SaaS centralized logging solution.

But what really makes a difference here is the use of Falco as a threat detection engine. Choosing it to be your webhook backend is the first step towards enforcing Kubernetes security best practices, detecting misuse, and filling the gap between what you think the cluster is running and what’s actually running.

If you would like to find out more about Falco, check out the project documentation and community resources.

With Sysdig Secure, we extend Falco with out-of-the-box rules and other open-source projects, making it even easier to work with Falco and manage Kubernetes security. Register for our free 30-day trial and see for yourself!

Sysdig 2020 Container Security Snapshot: Key image scanning and configuration insights https://sysdig.com/blog/sysdig-2020-container-security-snapshot/ Mon, 17 Aug 2020
Today, we are excited to share our Sysdig 2020 Container Security Snapshot, which provides a sneak peek into our upcoming 2020 Container Usage Report.

As containers and Kubernetes adoption continue to increase, cloud teams are realizing they need to adopt a new workflow that embeds container security into their DevOps processes.

Secure DevOps, a variation of DevSecOps, embeds security and monitoring throughout the application lifecycle, from development through production. This sets you up to deliver applications that are secure, stable, and high performing. This workflow plugs into your existing toolchain and provides a single source of truth across DevOps, developer, and security teams to maximize efficiency.

Dig in to learn about security insights we’ve uncovered from observing real customer environments running containers in production.

What are the top image distros?

Modern software is assembled, not built from scratch. Developers typically use open source base images from various Linux distributions when building their containerized applications.

Sysdig’s data, collected from more than 100,000 scanned images, highlights that Alpine is the most popular image used by developers today. This makes sense because Alpine is known for its minimal footprint. Ironically, though, the largest running image we found was also Alpine-based, weighing in at 10GB!

How big are images?

Although image size depends on the application, based on our data, the average image size observed is 376 MB. The large 10GB Alpine image seems to be an outlier, as it is not a good practice to have a large image unless absolutely necessary. Large images not only take longer to deploy, slowing down release velocity, but they also expose more opportunities for attack.

How many layers are part of an image?

Docker images are composed of several immutable layers, and each one is associated with a unique hash ID. Each layer corresponds to certain instructions in your Dockerfile. These layers are generated when the commands in the Dockerfile are executed during the Docker image build.

What we see is that the average number of layers in a given image is roughly 9.5, while the maximum observed in the wild was 107! Adding extra layers (for example, via RUN/ADD/COPY commands in a Dockerfile) can slow down your build process and make images harder to debug.
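As a rough illustration, you can estimate how many filesystem layers a Dockerfile will produce by counting its layer-creating instructions. This sketch is a simplification (it assumes only RUN, COPY, and ADD create filesystem layers, which holds for modern Docker builds, and it ignores line continuations and multi-stage builds):

```python
# Rough estimate of filesystem layers from a Dockerfile's text.
# Assumption: only RUN, COPY, and ADD create filesystem layers; metadata
# instructions like ENV or LABEL do not.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}

def count_layer_instructions(dockerfile_text):
    count = 0
    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blanks and comments
        if stripped.split()[0].upper() in LAYER_INSTRUCTIONS:
            count += 1
    return count

dockerfile = """\
FROM ubuntu:focal
RUN apt-get update && apt-get install -y curl
COPY main /
ENTRYPOINT ["/main"]
"""
print(count_layer_instructions(dockerfile))  # -> 2
```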

What are the top 3rd-party libraries?

Developers also pull in code from open source third-party libraries (non-OS packages) to save time when building and deploying applications. Our data shows npm is the most popular open-source non-OS package.

Where are images typically pulled from?

Images from public sources can be risky, as few are checked for security vulnerabilities. Docker Hub, for example, certifies less than 1% of its nearly three million hosted images. However, we found that 40% of images are pulled from public sources.

What types of vulnerabilities matter?

OS vulnerability snapshot: We noticed that 4% of OS vulnerabilities are high or critical severity. Although this may seem low, an exploited OS vulnerability can compromise your entire image and bring down your applications. This is also why there is a heavy focus on scanning for OS vulnerabilities, especially by cloud providers that offer this capability as part of registry scanning (e.g., ECR, GCR).

Non-OS vulnerability snapshot: What many teams don’t check for are vulnerabilities in third-party libraries. We found that 53% of non-OS packages have high or critical level severity vulnerabilities. Developers might be unknowingly pulling in vulnerabilities from these non-OS open source packages, like Python PIP, Ruby Gem, etc., and introducing security risk.

How common are risky configurations?

While teams understand the need to scan for vulnerabilities, they may not be scanning for common configuration mistakes. What we see is that 58% of images are running as root, allowing for privileged containers that can be compromised.

From talking to our customers, in practice, even if risky configurations are detected at runtime, teams don’t stop containers in order to continue deploying quickly. Instead, they run within a grace period and continuously monitor for deviations from security policies.

Conclusion

When reviewing this year’s data, it’s apparent that security needs to be top of mind for DevOps teams, especially given how widely open-source components are adopted.

Key data highlights the importance of scanning early in the CI/CD pipeline for both OS package vulnerabilities and non-OS vulnerabilities, 53% of which are high or critical severity. With 58% of images also running as root, teams need to continuously monitor their containers at runtime and enforce security mechanisms that detect and prevent such risky configurations.

Teams ultimately need to adopt a secure DevOps workflow to ensure they address security across the container lifecycle. To learn more, you can read how to get started with secure DevOps, or sign up for a 30 day free trial with Sysdig. Also, check out our recent cloud native and security reports.

The post Sysdig 2020 Container Security Snapshot: Key image scanning and configuration insights appeared first on Sysdig.

]]>
12 Container image scanning best practices to adopt in production https://sysdig.com/blog/image-scanning-best-practices/ Tue, 21 Jul 2020 15:00:27 +0000 https://sysdig.com/?p=26700 Don’t miss out on these 12 image scanning best practices, whether you are starting to run containers and Kubernetes in...

The post 12 Container image scanning best practices to adopt in production appeared first on Sysdig.

]]>
Don’t miss out on these 12 image scanning best practices, whether you are starting to run containers and Kubernetes in production, or want to embed more security into your current DevOps workflow.

One of the main challenges your teams face is how to manage container security risk without slowing down application delivery. A way to address this early is by adopting a Secure DevOps workflow.

Secure DevOps, also known as DevSecOps, embeds security and monitoring throughout the application lifecycle, from development through production. This sets you up to deliver applications that are secure, stable, and high performing. This workflow plugs into your existing toolchain and provides a single source of truth across DevOps, developer, and security teams to maximize efficiency.

The five essential workflows for Secure DevOps are image scanning, runtime security, compliance, Kubernetes and container monitoring, and application and cloud services monitoring.

Image scanning is a key function to embed into your Secure DevOps workflow. As one of your first lines of defense, it can also help you detect and block vulnerabilities before they are exploited.

Fortunately, image scanning is easy to implement and automate. In this blog, we will cover many image scanning best practices and tips that will help you adopt an effective container image scanning strategy.

What is container image scanning?

Image scanning refers to the process of analyzing the contents and the build process of a container image in order to detect security issues, vulnerabilities or bad practices.

Tools typically gather Common Vulnerabilities and Exposures (CVEs) information from multiple feeds (NVD, Alpine, Canonical, etc.) to check if images are vulnerable. Some also provide out-of-the-box scanning rules to look for the most common security issues and bad practices.

Image scanning can be easily integrated into several steps of your secure DevOps workflow. For example, you could integrate it in a CI/CD pipeline to block vulnerabilities from ever reaching a registry, in a registry to protect from vulnerabilities in third-party images, or at runtime to protect from newly discovered CVEs.

When automated and following the best practices, image scanning ensures your teams are not slowed down from deploying applications.

Let’s deep dive into the 12 image scanning best practices you can implement today.

1: Bake image scanning into your CI/CD pipelines

When building container images, you should be extra careful and scan them before publishing.

You can leverage the CI/CD pipelines you are already building for your DevOps workflow and add one extra step to perform image scanning.

The basics of image scanning in a CI/CD pipeline are as follows: once the code is pushed, a CI/CD pipeline is triggered, and an image is built and sent to a staging repository. The image scanner then scans the image and sends the results back to the CI/CD pipeline. If the image complies with the configured security policies, it is pushed to a production image repository.

Once your code is tested and built, instead of pushing the images to the production repository, you can push images to a staging repository. Then, you can run your image scanning tool. These tools usually return a report listing the different issues found, assigning different severities to each one. In your CI/CD pipeline you can check these image scanning results and fail the build if there is any critical issue.

Keep in mind, automation is key. This is a core concept for DevOps, right? The same applies to securing DevOps.

By automating security into your CI/CD pipelines, you can catch vulnerabilities before they enter your registry without giving people the chance to be affected by these issues, or the issues to reach production.
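The gating logic of such a pipeline step usually reduces to a few lines. Here is a minimal sketch; the report format is hypothetical, as real scanners emit richer JSON, but the decision in a CI step generally looks like this:

```python
# Hypothetical scan report: a list of findings, each with a severity.
# Real scanners emit richer JSON, but the CI gate usually reduces to this.
FAIL_ON = {"critical", "high"}

def build_should_fail(findings):
    """Fail the build if any finding reaches the configured severities."""
    return any(f["severity"].lower() in FAIL_ON for f in findings)

report = [
    {"cve": "CVE-2020-0001", "severity": "Medium"},
    {"cve": "CVE-2020-0002", "severity": "Critical"},
]
if build_should_fail(report):
    print("critical findings: blocking image promotion")
    # in a real pipeline you would exit non-zero here to fail the build
```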

2: Adopt inline scanning to keep control of your privacy

In the previous step we saw how image scanning in a CI/CD pipeline traditionally involves a staging registry. But what if your image contains some credentials by mistake? They could reach the wrong hands and end up being leaked.

Going a step further, you can implement inline image scanning, scanning your images directly from your CI/CD pipeline without needing that staging repository.

With inline image scanning, only the scan metadata is sent to your scanning tool, helping you keep control of your privacy.

With inline image scanning, the images are scanned inside the CI/CD pipeline itself. Once the code is pushed, a CI/CD pipeline is triggered, and an image is built and scanned without leaving the pipeline. Only the image metadata is sent to the image scanner, which sends the results back to the CI/CD pipeline. If the image complies with the configured security policies, it is pushed to a production image repository.

We’ve prepared some guides on how to implement inline image scanning with the most common CI/CD tools, like GitLab, GitHub Actions, AWS CodePipeline, Azure Pipelines, CircleCI, Jenkins, Atlassian Bamboo, and Tekton.

3: Perform image scanning at registries

When you start implementing image scanning, you should include it with your registry as one of the first steps.

Diagram of the repositories you may be using on your deployment. You usually have a private repository where you publish your images, and then some public repositories where you download the images from third parties. You need to scan the images from both.

All of the images you deploy will be pulled from a registry. By scanning images there, you at least know that they have been scanned before running.

4: Leverage Kubernetes admission controllers

Even if you block your vulnerable images in your CI/CD pipelines, nothing is stopping them from being deployed in production. And also, even if they are scanned at a registry, who blocks the images from external developers?

Ideally, you would like Kubernetes to check the images before scheduling them, blocking unscanned or vulnerable images from being deployed onto the cluster.

You can leverage admission controllers to implement this policy.

Kubernetes admission controllers are a powerful Kubernetes-native feature that helps you define and customize what is allowed to run on your cluster. An admission controller intercepts and processes requests to the Kubernetes API after the request is authenticated and authorized, but prior to the persistence of the object.

Diagram of an image scanner triggered by an admission controller. Once a deployment creation request is sent to Kubernetes, the admission controller calls a webhook and sends the image metadata. The image scanner sends back the scan results to the admission controller, who will only persist the deployment if the scan passed.

Scanning tools usually offer a validating webhook that can trigger an image scanning on demand and then return a validation decision.

An admission controller can call this webhook before scheduling an image. The security validation decision returned by the webhook will be propagated back to the API server, which will reply to the original requester and only persist the object in the etcd database if the image passed the checks.

However, the decision is made by the image scanner without any context on what is happening in the cluster. You could improve this solution by using OPA.

Open Policy Agent (OPA) is an open-source, general-purpose policy engine that uses a high-level declarative language called Rego. One of the key ideas behind OPA is decoupling decision-making from policy enforcement.

With OPA, you can make the admission decision in the Kubernetes cluster instead of the image scanner. This way, you can use cluster information in the decision making, like namespaces, pod metadata, etc. An example would be having one policy for the “dev” namespace with more permissive rules, and then another very restrictive policy for “production.”

5: Pin your image versions

Sometimes, the image you scan is not the same one you deploy in your Kubernetes cluster. This can happen when you use mutable tags, like “latest” or “staging.” Such tags are constantly updated with newer versions, making it hard to know if the latest scan results are still valid.

A tag ":latest" pointing to the latest version of an image. Every version has been scanned except the latest one.

Using mutable tags can cause containers with different versions to be deployed from the same image. Beyond the security concerns from the scan results, this can cause problems that are difficult to debug.

For example, instead of using ubuntu:focal, you should enforce the use of immutable tags like ubuntu:focal-20200423 when possible.

Keep in mind that version tags (for some images) tend to be updated with minor, non-breaking changes. So, although it looks a bit verbose, the only option to ensure repeatability is using the actual image digest:

ubuntu@sha256:d5a6519d9f048100123c568eb83f7ef5bfcad69b01424f420f17c932b00dea76

You might be starting to think that this is an issue that goes beyond image scanning best practices. And you would be right – you need a set of Dockerfile best practices. This not only affects the FROM command in Dockerfiles, but also Kubernetes deployment files and pretty much everywhere you put image names.

What can you do from an image scanning perspective?

You can enforce this policy via the combination of a Kubernetes admission controller, your image scanner, and the OPA Engine, the system described in the previous point.
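As an illustration of what such a policy check might evaluate, here is a minimal sketch that classifies an image reference as digest-pinned or not. The regex is simplified and does not cover every valid reference form (registry ports, for instance):

```python
import re

# A digest-pinned reference looks like "repo@sha256:<64 hex chars>".
# Simplified pattern: ignores registry ports and other edge cases.
DIGEST_RE = re.compile(r"^[\w./-]+@sha256:[0-9a-f]{64}$")

def is_pinned(image_ref):
    """True if the reference is pinned by digest rather than a mutable tag."""
    return bool(DIGEST_RE.match(image_ref))

print(is_pinned("ubuntu:latest"))  # -> False: mutable tag
print(is_pinned(
    "ubuntu@sha256:"
    "d5a6519d9f048100123c568eb83f7ef5bfcad69b01424f420f17c932b00dea76"))  # -> True
```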

6: Scan for OS vulnerabilities

As a general image scanning best practice, keep this thought in mind: “The lighter the image, the better.” A lighter image means faster builds, faster scans, and fewer dependencies with potential vulnerabilities.

New Docker images are usually built off of, or adding a layer over, an existing base image. This base image is defined by the FROM statement in the image Dockerfile. The result is a layered architecture design that saves a lot of time in the most common tasks. For example, when it comes to image scanning, you only need to scan a base image once. If a parent image is vulnerable, any other images built on top of that one will be vulnerable too.

Diagram of an image hierarchy. WordPress and PHP images are based on Apache, which is based on the Ubuntu image. If there is a vulnerability on Apache, both WordPress and PHP images are also vulnerable.

Even if you didn’t introduce a new vulnerability in your image, it will be susceptible to those in the base image.

That’s why your scanning tool should actively track vulnerability feeds for known vulnerable images and notify you if you’re using one of those.

7: Make use of distroless images

Introducing distroless images, base images that don’t contain package managers, shells, or any other programs you would expect to find in a standard Linux distribution. Distroless images allow you to package only your application and its dependencies in a lightweight container image.

Restricting what’s in your runtime container to precisely what’s necessary minimizes the attack surface. It also improves the signal-to-noise ratio of scanners (e.g., for CVEs) and reduces the burden of establishing provenance to just what you need.

The example below shows the Dockerfile for a Go “Hello world” app, running on Ubuntu vs. distroless.

FROM ubuntu:focal
COPY main /
ENTRYPOINT ["/main"]
Scan results of an image based on a distro base image. The image size is around 80 megabytes, and there are 53 warnings and 24 vulnerabilities.

After scanning it, we found 24 OS vulnerabilities, two of which are of Medium severity. The image size is also rather big for such a simple app: 77.98MB.

Now, the same app based on a distroless image:

FROM gcr.io/distroless/static-debian10
COPY main /
ENTRYPOINT ["/main"]
Scan results of an image based on a distroless base image. The image size is only 7 megabytes and there are only 10 warnings and 2 vulnerabilities.

This time, we only found two negligible-severity OS vulnerabilities, and the image size was reduced to only 6.93MB, which is much more appropriate for this app.

This shows that the distroless container didn’t have any unnecessary packages that could lead to more vulnerabilities, and thus exploits, being identified.

8: Scan for vulnerabilities in third-party libraries

Applications use a lot of libraries; so many, in fact, that they often contribute more lines of code than the code your team actually writes. This means you need to be aware not only of the vulnerabilities in your own code, but also of those in all of its dependencies.

Luckily, those vulnerabilities are well tracked in the same vulnerability feeds that your scanner uses to warn you about OS vulnerabilities. Not all tools go as deep as to scan the libraries in your images, so make sure your image scanner digs deep enough and warns you about those vulnerabilities.
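To make the idea concrete, here is a toy sketch of checking pinned Python dependencies against a list of known-vulnerable versions. The VULNERABLE set is illustrative only; real scanners consult full vulnerability feeds and handle version ranges, not just exact pins:

```python
# Toy check of pinned Python dependencies against known-vulnerable versions.
# The VULNERABLE set is illustrative; real scanners consult full CVE feeds.
VULNERABLE = {("urllib3", "1.24.1"), ("pyyaml", "5.1")}

def vulnerable_deps(requirements_text):
    hits = []
    for line in requirements_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue  # only exact-pinned entries are checked in this sketch
        name, version = line.split("==", 1)
        if (name.lower(), version) in VULNERABLE:
            hits.append(line)
    return hits

print(vulnerable_deps("requests==2.23.0\nurllib3==1.24.1\n"))  # -> ['urllib3==1.24.1']
```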

9: Optimize layer ordering

You can further optimize your images if you’re careful with the RUN commands in the Dockerfile. The order of RUN commands can have a big impact on the final image, as it dictates the order of the layers that make up the image.

You can optimize the Docker cache usage by placing bigger layers (usually invariant) first, and the most variable files (i.e., your compiled application) at the end. This will favor reusing existing layers, accelerating building the image, and indirectly speeding up the image scanning as well.
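As a sketch of this ordering (the base image, package names, and paths are illustrative), a Dockerfile might place the rarely-changing layers first and the application code last:

```dockerfile
# Rarely-changing layers first, so Docker's build cache can reuse them.
FROM python:3.9-slim

# OS packages change rarely: this layer stays cached across most builds.
RUN apt-get update && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Dependencies change occasionally: copy only the manifest before installing.
COPY requirements.txt /app/
RUN pip install --no-cache-dir -r /app/requirements.txt

# Application code changes on every build: keep it in the last layers.
COPY . /app/
```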

10: Scan for misconfigurations in your Dockerfile

As we’ve just seen, the Docker image building process follows instructions from a manifest, the Dockerfile.

There are several Dockerfile best practices you can follow to detect common security misconfigurations:

  • Running as a privileged (root) user, with access to more resources than needed.
  • Exposing insecure ports, like the ssh port 22 that shouldn’t be open on containers.
  • Including private files accidentally via a bad “COPY” or “ADD” command.
  • Including (leaking) secrets or credentials instead of injecting them via environment variables or mounts. It’s also a good practice to allow users to pass options to the Entrypoint and CMD, like in this java example.
  • Many other things defined by your specific policies, such as blocked software packages, allowed base images, whether a SUID file has been added, etc.

In a Dockerfile like this one:

FROM ubuntu:focal-20200423
USER root
RUN curl http://my.service/?auth=ABD54F0356FA005432D45D0056AF5400
EXPOSE 80/tcp
EXPOSE 22/tcp
ENTRYPOINT ["/main"]

Our image scanning could automatically detect several issues:

USER root

We are running as root.

EXPOSE 22/tcp

Here, we are exposing port 22, which is commonly used for SSH, a tool no container should include. We are also exposing port 80, but that should be fine, as it is the standard port for HTTP servers.

RUN curl http://my.service/?auth=ABD54F0356FA005432D45D0056AF5400

This command uses an auth key that anyone could abuse to cause us harm. We should be injecting it via some kind of variable instead. Keys like this can be detected using regular expressions, not only in the Dockerfile, but in any file present in the image. As an extra measure, you could also check for file names that are known to store credentials.
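A minimal sketch of such regular-expression-based detection, using the auth key from the example above. The patterns are illustrative only; production secret scanners combine many known token formats with entropy heuristics:

```python
import re

# Illustrative patterns; production secret scanners combine many known
# token formats with entropy heuristics.
SECRET_PATTERNS = [
    re.compile(r"auth=[0-9A-F]{16,}", re.IGNORECASE),        # bare auth keys in URLs
    re.compile(r"(?i)(?:password|secret|token)\s*=\s*\S+"),  # key=value style secrets
]

def find_secrets(text):
    hits = []
    for pattern in SECRET_PATTERNS:
        hits.extend(m.group(0) for m in pattern.finditer(text))
    return hits

line = "RUN curl http://my.service/?auth=ABD54F0356FA005432D45D0056AF5400"
print(find_secrets(line))  # -> ['auth=ABD54F0356FA005432D45D0056AF5400']
```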

These are a lot of things to keep in mind, and that can be overwhelming. Fortunately, most of these best practices are covered by security standards like NIST or PCI, and many image scanning tools provide out-of-the-box policies that have been mapped to specific compliance controls.

11: Flag vulnerabilities quickly across Kubernetes deployments

An image that passed a scan is not completely secure. Imagine that you scan and deploy an image, and right after doing so a new vulnerability is found in it. Or suppose that you strengthen your security policies at a given moment, but what happens with those images that are already running?

Timeline of a vulnerable image deployed in production. You scan the image, no vulnerabilities are found, and you deploy it. Some time later, a vulnerability is found in that image, meaning it has been vulnerable all this time. You should deploy a fix as soon as possible.

It’s an image scanning best practice to continuously scan your images to:

  • Detect new vulnerabilities and adapt to your policy changes.
  • Report those findings to the appropriate teams so they can fix the images as soon as possible.

Of course, implementing runtime scanning can help you mitigate the effect of those vulnerabilities. Let’s take CVE-2019-14287 as an example; you could easily write some Falco rules to detect if that vulnerability is being exploited. However, having to write specific rules for each vulnerability is a time-consuming effort and should be used as a last line of defense.

So, how do you continuously scan the images that are running in your cluster? Security tools use different strategies to achieve this, the simplest being to rescan all images every few hours. Ideally, you want to rescan affected images as soon as the vulnerability feeds are updated. Some tools can also store the image metadata and detect new vulnerabilities without needing a full rescan.

Then, once a vulnerability is found in a running container, you should fix it as soon as possible. The key here is having an effective reporting of vulnerabilities, so each person can focus on the relevant information for them.

One way to achieve this is with a queryable vulnerability database that allows DevOps and security teams to put some order in their vast catalog of images, packages, and CVEs. They will want to search for parameters like the CVE age, whether a fix is available, software version, etc. Finally, it’s useful if these reports can be downloaded and shared (PDF/CSV) with vulnerability management teams, CISOs, etc.

Let’s illustrate this with an example. Imagine a query like the following: show me all the vulnerabilities in the prod namespace where the severity is high or above, the CVE is more than 30 days old, and a fix is available.

A report of vulnerabilities found on running images with a fix available.

With such vulnerability reporting capabilities, teams can easily identify the vulnerable images they can actually fix and start working on solutions before vulnerabilities can be exploited.
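As a sketch, that kind of query could be expressed over exported scan data roughly like this. The record format, field names, and severity ordering are hypothetical:

```python
from datetime import date, timedelta

# Severity ordering and record fields are hypothetical.
SEVERITY_RANK = {"negligible": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}

def actionable(vulns, namespace="prod", min_severity="high",
               min_age_days=30, today=None):
    """Filter to fixable, sufficiently severe, sufficiently old CVEs."""
    today = today or date.today()
    cutoff = today - timedelta(days=min_age_days)
    return [
        v for v in vulns
        if v["namespace"] == namespace
        and SEVERITY_RANK[v["severity"]] >= SEVERITY_RANK[min_severity]
        and v["published"] <= cutoff      # CVE is old enough to expect a fix
        and v["fix_available"]
    ]

vulns = [
    {"cve": "CVE-2020-1111", "namespace": "prod", "severity": "critical",
     "published": date(2020, 1, 10), "fix_available": True},
    {"cve": "CVE-2020-2222", "namespace": "prod", "severity": "high",
     "published": date(2020, 6, 25), "fix_available": False},
]
print([v["cve"] for v in actionable(vulns, today=date(2020, 7, 1))])  # -> ['CVE-2020-1111']
```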

12: Choose a SaaS-based scanning solution

Choosing a SaaS-based scanning service over an on-prem solution has many benefits:

On-demand and scalable resources: You can start with scanning a few images at first and grow as your container applications scale; without worrying about backend data management.

Fast implementation: You can embed scanning into your CI/CD pipelines and get up and running in minutes, unlike on-premises applications that require more time to install and setup.

Easy upgrades and maintenance: The SaaS provider handles patches and rolls out new feature updates that don’t require you to manually upgrade.

No infrastructure or staff costs: You avoid paying for in-house hardware and software licenses with perpetual ownership. You also don’t need on-site staff to maintain and support the application.

Conclusion

Image scanning is the first line of defense in your Secure DevOps workflow. By automating it, you can maximize its potential and detect issues before they have a chance to become a problem. Following image scanning best practices will help you embed security into your workflow without slowing you down.

Also, image scanning is not something you apply once, but rather a continuous checkpoint in several moments of your workflow, including when building, on registries, before deploying, and once your containers are already running.

Choosing the right tool is key. Sysdig includes advanced image scanning features like inline image scanning, continuous scanning, and vulnerability reporting, and with its guided onboarding you will be set up in less than 5 minutes. Try it today!

The post 12 Container image scanning best practices to adopt in production appeared first on Sysdig.

]]>
PCI Compliance for Containers and Kubernetes https://sysdig.com/blog/container-pci-compliance/ Tue, 31 Mar 2020 15:00:36 +0000 https://sysdig.com/?p=23062 In this blog, we will cover the various requirements you need to meet to achieve PCI compliance, as well as...

The post PCI Compliance for Containers and Kubernetes appeared first on Sysdig.

]]>
In this blog, we will cover the various requirements you need to meet to achieve PCI compliance, as well as how Sysdig Secure can help you continuously validate PCI compliance for containers and Kubernetes.


What is PCI DSS Compliance?

Hackers are getting better at stealing credit card data, costing companies billions of dollars in fines every year. To prevent these types of attacks, the Payment Card Industry (PCI) security standard was created, along with a set of requirements to meet in order to mitigate risk.

Many of your applications are now starting to run on containers in the cloud, where there are many more attack vectors. So how can you validate container compliance for your cloud applications?

Why is PCI compliance different in containers and Kubernetes?

As applications migrate to the cloud, there are three key attributes of containerized environments that make PCI container compliance challenging:

  1. Container sprawl
  2. Container lifespan
  3. Open-source packaging

Container sprawl: When virtualization took off, the concept of Virtual Machine (VM) sprawl emerged as a result of the increased density of VMs. As more applications get containerized, container sprawl is essentially VM sprawl on steroids: your microservices are now distributed across thousands of nodes, supported by perhaps millions of containers. Containers spin up and down and IP addresses are constantly changing, making any form of PCI audit extremely difficult.

Container lifespan: Sysdig’s 2019 container usage report found that 52% of containers live for five minutes or less, while only 6% live longer than a week. Meeting PCI-DSS requirements can be complex in fast-changing container environments where some containers last a long time while others come and go quickly. In addition, with 39% of containers living for a minute or less, you must establish a way to record detailed container activity as proof of compliance after the container has disappeared. In the event of a compliance violation, you’ll want to know what processes were spawned, what connections were made, what files were modified, and so on. You also need to be able to correlate this system activity with user activity and understand who accessed what. With access to this granular data, you can effectively provide an audit trail to third-party assessors.

Open-source packaging: Software today is assembled, not built from scratch. The fastest way to drive innovation and speed while lowering cost is adopting open source. Your developers will pull open-source base images and leverage third-party libraries to build and scale containerized applications. However, using open source also requires that companies remain as diligent about updating their open-source dependencies as they are about updating their own code. If a vulnerability is inherited from the base image or from libraries like Java JAR, Python PIP, etc., then the risk belongs to the organization. And when you have hundreds of thousands of containers supporting thousands of microservices, how can you prevent vulnerabilities from entering production? Preventing known vulnerabilities and flagging newly identified ones not only reduces risk, it is a step toward passing PCI audits. In addition, misconfigurations in your Dockerfile, such as exposed ports or embedded access keys and tokens, can easily lead to compliance violations. According to Gartner, through 2023 at least 99% of cloud security failures will be the customer’s fault.

Traditional compliance tools don’t work

Managing the process with traditional tools does not work for containers and Kubernetes; they can’t see inside containers or assess their behavior. Most container traffic is east-west in nature – versus north-south – meaning traditional security controls never see most container activity. They also don’t have relevant context about the cloud and Kubernetes environment, which means they can’t tie vulnerabilities back to applications and namespaces. Finally, these legacy tools are not built for DevOps, and are designed to be applied post application deployment.

PCI compliance cost implications and consequences

Validating compliance is the number one blocker for faster application delivery. Regulators are increasingly enforcing financial penalties for failure to comply. 

Studies have shown that:

  • Annual cost of non-compliance to businesses runs an average of $14.8 million*
  • The cost of compliance, on the other hand, was found to average $5.5 million*

Kubernetes is a dynamic environment in which it’s difficult to detect when assets fall out of PCI compliance. Without a clear mapping of PCI guidelines to this new environment, your teams won’t be able to prove they meet compliance requirements. As a result, meeting a PCI audit becomes an expensive fire drill, slowing down application delivery for your cloud teams. Kubernetes compliance requires a new approach.

Your teams need to have a clear mapping of controls to their containerized workloads and the ability to continuously track compliance over time. This will let them be confident in their ability to manage security risk and pass security audits. Ultimately, you need to ensure compliance is not blocking cloud adoption, so your business can ship cloud applications faster.

*Ponemon study 

Sysdig Secure helps you validate PCI compliance

Sysdig Secure helps you validate PCI compliance across all stages of the container and Kubernetes lifecycle, ensuring that compliance is not a blocker for cloud adoption. A few examples of how we address PCI:

Out of the box policies –  PCI 1.1.6 and 6.1 Requirements

Sysdig provides default PCI scanning policies and also lets you customize policies based on the scope that is relevant to your PCI controls. These policies provide a single workflow for detecting vulnerabilities and misconfigurations in registries, containers, and Kubernetes.

Sysdig secure has out of the box policies for PCI compliance

Kubernetes Network Topology Maps – PCI 1.1.2 Requirement

Sysdig dynamically generates topology maps of all hosts, containers, and processes in your infrastructure and maps any network connection they make inside or outside your network. These topology maps can also be customized to show the logical services and how they’re connected.

Sysdig secure will generate network topology maps to help you meet PCI compliance

Asset Inventory Management – PCI 2.4 Requirement

Sysdig comes with an Explore view that shows all hosts, containers, and processes running on your system, grouped by metadata. You can use this table to slice and dice all system components however you choose.

Sysdig secure will help you meet PCI compliance with a list of hosts, containers and processes

Access control of cardholder data – PCI 7.1 Requirement

Sysdig analyzes the requirements of the Pod spec in your Deployment definition and creates the least-privilege PSP for your application. This controls whether you allow privileged pods, which users can run the container, which volumes can be mounted, etc.

PSP advisor will help you validate your pod permissions

Kubernetes Audit Trail – PCI 10.1, 10.2 Requirement

Sysdig provides a continuous audit of all container infrastructure events to facilitate incident response and PCI-DSS compliance. Use this as proof of compliance for your third-party auditors, even after the container is gone.

Activity audit will help you meet PCI-DSS compliance

To dig deeper into how Sysdig Secure can map and address many more PCI controls, check out some of these resources below:

The post PCI Compliance for Containers and Kubernetes appeared first on Sysdig.

]]>
What’s new in Sysdig Secure: January 2020 https://sysdig.com/blog/sysdig-secure-january-2020/ Tue, 11 Feb 2020 17:00:06 +0000 https://sysdig.com/?p=21790 We’ve been busy this New Year (a rather warm one in San Francisco) to bring you exciting new ways to...

The post What’s new in Sysdig Secure: January 2020 appeared first on Sysdig.

]]>
We’ve been busy this New Year (a rather warm one in San Francisco) to bring you exciting new ways to secure your DevOps journey. Highlights include:

  • Vulnerability management is dramatically simplified with a new feature that elegantly tracks vulnerability changes across different image versions
  • Auditing gets more comprehensive with the introduction of file-activity as a stream in the Activity Audit feature
  • And much more!

Read on for the details and see how you can put them to use! For a demo sign up here.

Image Vulnerability Diff Reports

The challenge for every vulnerability management team is handling and prioritizing very large vulnerability reports. Manually navigating a list of tens of thousands of vulnerabilities in an Excel report is a security team's worst nightmare.

Sysdig Secure’s latest feature enhancements in image scanning allow vulnerability management teams to compare vulnerabilities across different versions of an image and discover new, fixed or shared vulnerabilities quickly.

When you upgrade your image, how do you know whether your image risk posture is better or worse? Are you confident that you have not introduced new vulnerabilities? Do you know which existing vulnerabilities have been addressed? With Sysdig Secure, there is a clear way to compare vulnerabilities for different tags directly from the UI. By diffing the new release against another tag (e.g., latest vs. an older tag in the same repository), you can quickly discover new and fixed vulnerabilities without combing through tens of thousands of lines in Excel reports.
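Conceptually, the diff boils down to set arithmetic over the CVE lists of the two scans. A minimal sketch (the CVE IDs below are illustrative):

```python
def diff_vulns(old_scan, new_scan):
    """Compare CVE sets from two image tags and report what is new,
    what was fixed, and what both versions share."""
    old, new = set(old_scan), set(new_scan)
    return {
        "new": new - old,        # introduced by the upgrade
        "fixed": old - new,      # addressed by the upgrade
        "shared": old & new,     # present in both tags
    }

v1 = ["CVE-2019-0001", "CVE-2019-0002"]   # e.g., the older tag
v2 = ["CVE-2019-0002", "CVE-2020-0003"]   # e.g., the latest tag
result = diff_vulns(v1, v2)
```

An empty `"new"` set after an upgrade is a quick signal that the image's risk posture has not regressed.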


New Image Scanning Policies for File Integrity Monitoring

File attributes can now be verified as part of the image scan analysis. With this new policy, you can:

  1. Check whether a file exists and trigger an alert based on the condition
  2. Validate a specific file against its SHA256 hash
  3. Validate file permissions (for example, flag an alert if a file has the executable bit set)
  4. Check filenames against a regex
  5. Inspect file contents (e.g., malware signatures, exposed passwords, credential leaks)
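Several of these checks are straightforward to express on their own. The sketch below shows the existence, hash, permission, and filename checks using only the standard library; it is an illustration of the check types, not Sysdig's scanner:

```python
import hashlib
import os
import re
import stat

def sha256_of(path):
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def check_file(path, expected_sha256=None, allow_executable=True,
               name_pattern=None):
    """Return a list of alert strings for one file, mirroring the
    exists / hash / permission / regex checks described above."""
    alerts = []
    if not os.path.exists(path):
        return [f"{path}: file does not exist"]
    if expected_sha256 and sha256_of(path) != expected_sha256:
        alerts.append(f"{path}: SHA256 mismatch")
    mode = os.stat(path).st_mode
    if not allow_executable and mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        alerts.append(f"{path}: executable bit set")
    if name_pattern and not re.search(name_pattern, os.path.basename(path)):
        alerts.append(f"{path}: filename does not match policy")
    return alerts
```

Content inspection (check 5) would additionally scan the file body for signatures or secret patterns, which is omitted here for brevity.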


File Data type part of Activity Audit

Sysdig’s Activity Audit speeds incident response and enables audit for Kubernetes. Sysdig captures relevant information like:

  • executed commands inside the container
  • network connections
  • Kubernetes API events, like users executing kubectl exec

By correlating this information with Kubernetes application context, the SOC team can spot abnormal activity and more quickly understand what happened during a security incident: who did what.

With Activity Audit 1.5, we’ve added a new “file” data type. You can now filter the audit trail by file type or specific file attributes:

  • File name
  • Directory
  • Command (used to access the file)
  • Access mode

This feature enables further File Integrity Monitoring (FIM) capabilities and the ability to audit tampering with sensitive files, such as:

  • Container binaries (/usr/sbin/nginx, /usr/bin/java)
  • Configuration files (/etc/passwd, /etc/shadow, /etc/ssh/sshd_config)
  • Kubernetes secrets injected in the pod (/var/run/secrets)

Activity Audit allows you to recreate a full audit trail correlating this action with the command that executed it, the user inside the container that launched the command and even the actual Kubernetes user and external IP that initiated the connection from outside your cluster.
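Filtering the audit trail by those file attributes amounts to scanning the correlated records. A toy sketch, where the event records and their field names are hypothetical stand-ins for the audit data:

```python
def filter_audit(events, directory=None, access_mode=None):
    """Filter file-type audit events by directory prefix and access mode,
    mirroring the file name / directory / command / access mode filters."""
    out = []
    for e in events:
        if e["type"] != "file":
            continue
        if directory and not e["name"].startswith(directory):
            continue
        if access_mode and e["access_mode"] != access_mode:
            continue
        out.append(e)
    return out

# Hypothetical audit trail records.
trail = [
    {"type": "file", "name": "/etc/shadow", "command": "cat", "access_mode": "r"},
    {"type": "file", "name": "/var/run/secrets/token", "command": "vi", "access_mode": "w"},
    {"type": "net", "name": "10.0.0.1:443"},
]

# e.g., "who wrote to injected Kubernetes secrets?"
writes_to_secrets = filter_audit(trail, directory="/var/run/secrets", access_mode="w")
```

Each matching record still carries the command that touched the file, which is what lets the trail answer "who did what."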

Usability and visual improvements

We also introduced an update to capture files as part of the Kubernetes forensics capabilities in Sysdig Secure. This allows for:

  • The ability to see if a capture was triggered manually or from a policy
  • Search across all capture files

Additional features

CRI-O support for the stop and pause actions associated with Policy Events (in addition to the currently supported Docker containers). This is part of our efforts to extend our integration with Red Hat OpenShift.

To learn more, visit our page on Kubernetes security or sign up for a free trial

The post What’s new in Sysdig Secure: January 2020 appeared first on Sysdig.

]]>
RBAC support with Sysdig Secure https://sysdig.com/blog/rbac-sysdig-secure/ Fri, 07 Feb 2020 13:14:41 +0000 https://sysdig.com/?p=21774 We often hear from our customers that to adopt a container and Kubernetes security tool in any mid sized or...

The post RBAC support with Sysdig Secure appeared first on Sysdig.

]]>
We often hear from our customers that to adopt a container and Kubernetes security tool in any mid-sized or large organization, separation of duties and least-privilege access via RBAC is a must. Admin roles cannot be granted to all teams unnecessarily. If users or groups are routinely granted these elevated privileges, account compromises or mistakes can result in security and compliance violations.


Why is RBAC a requirement for Secure DevOps?

The asks of DevOps teams that use Sysdig can be summarized in a few use cases:

  • I want developers to only have access to their own cluster/namespace/application.
  • I want the security team to access every component except the account administration and billing sections.
  • I want to enable external auditors to perform a full security posture assessment, which requires separate access controls for third parties.
  • I want programmatic API access to grant exactly the same level of privileges as UI access, with a unique API key that validates user identity.

How RBAC works in Sysdig Secure

We have just released the ability to support RBAC and provide federated access control across different teams within your organization. Sysdig Secure RBAC supports four different types of users in your organization in addition to the admin role:

  1. View Only: Read access to every Secure feature within the team scope. A View Only user cannot modify runtime policies, image scanning policies, or any other content.
  2. Standard User: Can push container images to the scanning queue and view the image scanning reports. Standard Users can also display the runtime security events within the team scope. They cannot access the Benchmarks, Activity Audit, or Policy definition sections of the product.
  3. Advanced User: Can access every Sysdig Secure feature within the team scope in read and write mode. Advanced Users can create, delete, or update runtime policies, image scanning policies, compliance checks or any other security policies. The Advanced User cannot manage other users.
  4. Team Manager: Same permissions as the Advanced User + ability to add/delete team members or change team member permissions.
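The four roles form a set of increasingly permissive permission sets, which can be modeled directly. The permission names below are illustrative, not Sysdig's actual API:

```python
# Hypothetical permission names; the role hierarchy mirrors the list above.
VIEW = {"view_events", "view_scans", "view_benchmarks"}
STANDARD = {"view_events", "view_scans", "submit_images"}   # no Benchmarks access
ADVANCED = VIEW | STANDARD | {"edit_runtime_policies", "edit_scan_policies"}
TEAM_MANAGER = ADVANCED | {"manage_team_members"}

ROLES = {
    "view_only": VIEW,
    "standard": STANDARD,
    "advanced": ADVANCED,
    "team_manager": TEAM_MANAGER,
}

def can(role, permission):
    """Check whether a role within a team scope grants a permission."""
    return permission in ROLES[role]
```

Enforcing access checks through a single function like `can()` is what keeps separation of duties auditable: each role's grant set is explicit.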

How to setup RBAC

The admin has full access to the management capabilities of the platform. Sysdig is an enterprise-ready platform with access management functionality (RBAC, SSO, LDAP, etc.). As an admin, you can log in to the Sysdig Secure admin console, easily create users and teams, and assign those users specific roles and access levels inside the teams (view only, standard, advanced, or team manager). You can also control the global account and billing, agent installation, and so on.

Teams settings in Sysdig Secure

View Only User

This role is meant for a team member or a third party that doesn't require any write access and has limited access to the functionality in Sysdig. It is useful when you need to carve out a role for external assessments and audits. Below, you can see that the options for editing policies, importing rules, and so on have all been removed. For example, a View Only user can:

  • Evaluate your scan results of the image scanning policies
  • View the compliance benchmark scores and level of acceptance
  • View runtime security events for their applications and examine captures

Policy events in Sysdig Secure

Standard User

This user has restricted access to certain functionality in the platform. This access is scoped to a namespace specific to the user, so a developer cannot pry into other applications or see sensitive cluster-level data. A standard user can:

  • Submit new images to the scanning queue
  • Display image scanning results
  • View policy events in the context of their scope (specific namespace, cluster etc)

Image Scanning in Sysdig Secure

Advanced User

This user has the responsibility of managing the security posture in your organization. As a result, they have read/write access to all functionalities. They can:

  • Create scanning policies, view scan results for the images in their scope, query and report on vulnerabilities
  • Schedule compliance assessments
  • Create runtime detection and prevention (PSP) policies
  • Analyze correlated events across system and k8s api server level in Activity Audit
  • Dig into incident response workflows in the policy events section

A runtime policy in Sysdig Secure

Team Manager

This user has advanced user access but also has the ability to add/delete team members or change team member permissions.

Users in Sysdig Secure

Conclusion

RBAC is an essential functionality that now enhances Sysdig Secure’s enterprise readiness and allows for separation of duties and least privilege access across teams.

To learn more, visit our page on Kubernetes security or sign up for a free trial

The post RBAC support with Sysdig Secure appeared first on Sysdig.

]]>
Sysdig Secure 3.0 introduces native prevention and incident response for Kubernetes https://sysdig.com/blog/sysdig-secure-3-0/ Wed, 13 Nov 2019 07:02:25 +0000 https://sysdig.com/?p=19533 Today, we are excited to announce the launch of Sysdig Secure 3.0! Sysdig Secure is the industry’s first security tool...

The post Sysdig Secure 3.0 introduces native prevention and incident response for Kubernetes appeared first on Sysdig.

]]>
Today, we are excited to announce the launch of Sysdig Secure 3.0! Sysdig Secure is the industry’s first security tool to bring both prevention and incident response to Kubernetes.

This release has three main features:

  1. Kubernetes Policy Advisor prevents threats at runtime using Kubernetes Pod Security Policies.
  2. Falco Tuning optimizes Falco rules to reduce false positives and alert fatigue.
  3. Activity Audit speeds incident response and enables audit by correlating container and Kubernetes activity.

As DevOps teams ramp up Kubernetes in production, their responsibilities expand beyond monitoring, capacity management, and troubleshooting to include security and compliance. Teams are looking to merge these two functions into a single secure DevOps workflow. So let's dive deeper into how the new features of Secure 3.0 streamline security responsibilities for DevOps teams, so they can focus on maximizing the availability of their Kubernetes platform.

Sysdig Secure 3.0 features

Kubernetes Policy Advisor provides native threat prevention

Pod Security Policies (PSPs) are a powerful security control

Kubernetes Pod Security Policies provide a framework to ensure that Pods run with appropriate privileges and can only access the required resources.

Pod Security Policy can limit Pod behavior to improve security. For example:

  • Prevent privileged pods from starting and control privilege escalation
  • Restrict Pod’s access to specific filesystems, host and network namespaces
  • Restrict the users/groups a Pod can run as
  • Limit the volumes a Pod can access
  • Restrict other parameters like runtime profiles (AppArmor, SELinux, etc) or read-only root filesystems

Pod Security Policies are, in effect, a threat prevention mechanism. Their security constraints let you enforce a least-privilege model that prevents attacks from spreading across the cluster and blocks typical container breakout approaches.

Pod Security Policies are part of the Kubernetes platform, so when they validate Pod configuration and enforce runtime permissions, they don’t impact performance of the application workload. Other tools that tamper with the container infrastructure, or modify the host binaries and container images, can introduce security risks and potentially impact performance.

Challenges of implementing PSPs

Although PSPs are a powerful tool, there are some barriers to implementing them broadly:

  1. PSPs are easy to misconfigure: DevOps teams spend a lot of time manually configuring PSPs. It is challenging to find the right balance, so you don’t have a policy that is too restrictive and breaks the application or a policy that is too permissive and leaves pods exposed. Creating appropriate policies is often complicated and time consuming.
  2. Validating PSPs pre-production is hard: Since PSP configuration is complex, there is a risk of making mistakes. There is also no standard way to test PSPs prior to enforcement, which makes it hard to confidently adopt PSPs at scale.

Kubernetes Policy Advisor allows you to deploy PSPs confidently in production

Sysdig’s Kubernetes Policy Advisor provides a simple and scalable mechanism to safely implement Kubernetes Pod Security Policies (PSP) in production.

Kubernetes Policy Advisor

Kubernetes Policy Advisor creates a three step workflow to easily implement PSPs:

  1. Generate: Sysdig Secure auto-generates a restrictive PSP from the pod specs in the deployment definition of your YAML file. This process significantly decreases the time spent configuring security policies.
  2. Validate: Policy Advisor validates the policies prior to enforcement to ensure they do not break application functionality. Comparing the PSP against the application runtime behavior, teams can tweak the policy to be more or less permissive. This iterative process gives confidence in the expected pod behavior in production.
  3. Prevent: Sysdig leverages Kubernetes-native controls to handle enforcement. This streamlined approach doesn’t tamper with the container infrastructure and has no performance impact.
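The "Generate" step above can be pictured as deriving restrictive PSP fields from what the pod spec actually requests. This is a heavily simplified sketch of that idea, not Sysdig's actual generator; the pod spec fields follow Kubernetes conventions but the output shape is illustrative:

```python
def generate_psp(pod_spec):
    """Derive restrictive PSP-style settings from a pod spec: only what
    the pod actually requests is allowed (simplified sketch)."""
    containers = pod_spec.get("containers", [])
    privileged = any(
        c.get("securityContext", {}).get("privileged", False) for c in containers
    )
    # Allow only the volume types the spec actually uses.
    volume_types = sorted({key
                           for v in pod_spec.get("volumes", [])
                           for key in v if key != "name"})
    return {
        "privileged": privileged,
        "allowPrivilegeEscalation": privileged,
        "volumes": volume_types or ["none"],
        "hostNetwork": pod_spec.get("hostNetwork", False),
    }

spec = {
    "containers": [{"name": "web", "securityContext": {"privileged": False}}],
    "volumes": [{"name": "cfg", "configMap": {"name": "web-cfg"}}],
}
psp = generate_psp(spec)
```

Because the generated policy only grants what the spec requests, a pod that later tries to escalate (e.g., mount a hostPath volume) is rejected at admission.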

For a deep dive into how Sysdig Secure prevents threats using native Kubernetes controls read: Pod Security Policies in production with Sysdig’s Kubernetes Policy Advisor.

Falco Tuning generates fewer false positives by tuning runtime policies

Falco is the open source Kubernetes runtime security project originally started by Sysdig. Since October 2018 it has been a CNCF® Sandbox Project.

Using Falco, DevOps teams can create custom runtime rules based on container behavior or Kubernetes audit-based events. Managing security policies at scale, across clusters and clouds, can be challenging. Sysdig Secure extends Falco by easing the burden of creating and updating runtime Falco policies with centralized management, a flexible policy editor and automated profiling.

DevOps teams can reduce false positives generated by security policies when they leverage Sysdig Secure’s Falco Tuning capabilities. Falco Tuning analyzes recurring events and suggests changes to policies that minimize redundant alerts.

For example, suppose a web service in Kubernetes writes logs to a specific path. If the application is upgraded and the new version writes logs to a different path, the policy could generate a high volume of false alerts. Falco Tuning observes the runtime events and automatically suggests including the new path as a condition filter in the rule. This helps reduce false positives and noisy alerts.
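The tuning logic can be thought of as frequency analysis over recurring events: a (rule, path) pair that fires constantly is a candidate exception. A toy sketch with a hypothetical threshold, not Sysdig's actual tuner:

```python
from collections import Counter

def suggest_exceptions(events, threshold=10):
    """Suggest rule exceptions for (rule, path) pairs that fire repeatedly,
    on the assumption that high-volume repeats are likely false positives."""
    counts = Counter((e["rule"], e["fd_name"]) for e in events)
    return [
        f"append '{path}' to allowed paths of rule '{rule}'"
        for (rule, path), n in counts.items()
        if n >= threshold
    ]

# The upgraded web service now logs to a new path, firing the same rule 25 times.
events = [{"rule": "Write below etc", "fd_name": "/var/log/web/v2.log"}] * 25
suggestions = suggest_exceptions(events)
```

A human still reviews the suggestion before the rule changes, so a genuinely malicious repeated write isn't silently whitelisted.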

Activity Audit is the first Kubernetes-native tool for incident response

Security teams don’t have an audit trail for Kubernetes

Understanding all the changes in your cluster generated by a Kubernetes user or service is nearly impossible. Without the ability to map system activity to users or services, security teams have no way to uncover malicious behavior and misconfigurations within Kubernetes.

Existing tools provide this information as disparate data points that are not correlated. SOC teams need to be able to analyze an endless list of different scenarios:

  1. Show all outbound connections from my billing namespace to an unknown IP address
  2. Trace a kubectl exec user interaction and list all the command and network activity that happened inside the pod
  3. Show every tcpdump command execution that has happened in a host or Kubernetes deployment
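Once container activity is correlated with Kubernetes context, queries like scenario 3 reduce to a scan over the correlated records. A toy sketch; the record fields are hypothetical stand-ins for the audit data:

```python
def find_command(records, command, scope=None):
    """Return audit records for a given executed command, optionally
    restricted to one Kubernetes deployment."""
    return [
        r for r in records
        if r["command"] == command
        and (scope is None or r["deployment"] == scope)
    ]

# Hypothetical correlated audit records.
records = [
    {"command": "tcpdump", "deployment": "billing", "user": "dev-1"},
    {"command": "nginx", "deployment": "frontend", "user": "root"},
    {"command": "tcpdump", "deployment": "frontend", "user": "dev-2"},
]

# e.g., "show every tcpdump execution in the billing deployment"
hits = find_command(records, "tcpdump", scope="billing")
```

The value is in the correlation: each hit already names the deployment and user, so no manual cross-referencing of data sources is needed.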

Activity Audit speeds incident response and enables audit in Kubernetes

Sysdig’s Activity Audit speeds incident response and enables audit for Kubernetes. Sysdig captures relevant information like:

  • executed commands inside the container
  • network connections
  • Kubernetes API events like users executing kubectl exec

By correlating this information with Kubernetes application context, the SOC team can spot abnormal activity. For example, they can review a kubectl exec into a pod in Kubernetes and trace the chain of activity.

Sysdig Activity Audit

Activity Audit provides teams with an audit trail, a common requirement for compliance standards like SOC 2, PCI, ISO, and HIPAA, and allows them to investigate Kubernetes user activity even if the container no longer exists.

Activity Audit complements the existing Sysdig Secure forensics capabilities (Captures), by recording all the pre and post attack container activity. This allows teams to analyze everything that happened – not only after an incident, but also before, so you can understand the full sequence of activity. Sysdig Secure is the only Kubernetes audit and incident response solution available today.

For a deep dive into incident response and audit for Kubernetes read: Incident response in Kubernetes with Sysdig Activity Audit.

Conclusion

Sysdig Secure 3.0 is the industry’s first tool to provide enterprises with threat prevention at runtime using Kubernetes-native Pod Security Policies (PSP). This launch also includes the first incident response and audit tool specific to Kubernetes environments.

Sysdig Secure embeds security and compliance into the build, run and respond stages of the Kubernetes lifecycle. With Sysdig, enterprises can embed security, maximize availability, and validate compliance. Sysdig Secure integrates into your secure DevOps workflow, giving you the confidence to run Kubernetes in production.

The post Sysdig Secure 3.0 introduces native prevention and incident response for Kubernetes appeared first on Sysdig.

]]>
Sysdig Secure 2.4 introduces runtime profiling for anomaly detection + new policy editor for enhanced security. https://sysdig.com/blog/sysdig-secure-2-4/ Tue, 06 Aug 2019 13:00:36 +0000 https://sysdig.com/?p=17502 Today, we are excited to announce the launch of Sysdig Secure 2.4! With this release, Sysdig adds runtime profiling to...

The post Sysdig Secure 2.4 introduces runtime profiling for anomaly detection + new policy editor for enhanced security. appeared first on Sysdig.

]]>
Today, we are excited to announce the launch of Sysdig Secure 2.4! With this release, Sysdig adds runtime profiling to enhance anomaly detection and introduces brand-new interfaces that improve runtime security policy creation and vulnerability reporting. These features focus on improving the experience of creating the security policies that detect threats and attacks against your infrastructure and applications.

Back in April, we announced the industry's first unified Cloud-Native Visibility and Security Platform, which provides both monitoring (via Sysdig Monitor) and security (via Sysdig Secure) at massive enterprise scale, across both multi-cloud and hybrid-cloud environments.

Alright, let's dig deeper into Sysdig Secure 2.4, which focuses on runtime detection and vulnerability management!

I. Runtime profiling

Sysdig's approach to runtime defense in large-scale environments is to automatically model runtime behavior by analyzing the activity inside containers. By analyzing syscalls, traversing the kernel with eBPF technology, and enriching the data with metadata such as Kubernetes and cloud provider labels, Sysdig creates a truly comprehensive container runtime profile. This reduces the effort required to manually create and update profiles.

Secure 2.4 Runtime profiling

Sysdig uses its syscall-level understanding to gain deep insights into container runtime behavior, such as:
  1. Spawned processes – which processes and binaries are running?
  2. Network traffic – what TCP/UDP ports does this application communicate on?
  3. File system activity – what files are being read? And written?
  4. System calls – what system calls are executed?
Secure 2.4 - Auto generated profile

After the learning is complete, Sysdig Secure provides visibility into auto-generated runtime profiles through confidence levels. These three confidence levels (low, medium, high) reflect how much we know about the specific container's runtime behavior. Security teams gain more transparency and assurance when they have full control over what is inside a runtime profile, rather than applying black-box auto-generated profiles. After the profile is built, a user can simply snapshot it into a runtime policy that can be applied directly to different scopes using metadata: containers, hosts, or Kubernetes resources.
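The idea of an observation-derived profile with a confidence level can be sketched in a few lines. This is a toy model: the event shapes and the observation-count thresholds are hypothetical, not how Sysdig actually scores confidence:

```python
def build_profile(observations, low=5, high=20):
    """Aggregate observed (kind, value) events into a runtime profile and
    assign a confidence level from how much activity was observed."""
    profile = {"processes": set(), "ports": set(), "files": set()}
    for kind, value in observations:
        profile[kind].add(value)
    n = len(observations)
    confidence = "high" if n >= high else "medium" if n >= low else "low"
    return profile, confidence

# Hypothetical events observed during the learning period.
obs = [("processes", "nginx"), ("ports", 80), ("files", "/var/log/nginx")] * 7
profile, confidence = build_profile(obs)
```

Snapshotting the resulting profile into a policy then means: anything outside these observed sets becomes a candidate anomaly.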

II. Falco Rule Builder

Our goal with this release is to ease the burden on security teams of creating their container security policy. In addition to auto-generating runtime profiles, creating, editing, and managing your Kubernetes security policy is now much easier.

Create and customize advanced security policies

Sysdig Secure also provides the simplicity and flexibility to create custom runtime rules based on open source Falco. The new security policies contain a mix of runtime profiles, UI-built rules, and advanced Falco rules. Sysdig Secure has developed a new interface called Falco Rule Builder that lets you visually interact with the Falco engine under the hood and create powerful rules via a flexible UI. Runtime rules can be scoped and filtered to any aspect of your environment (such as a particular Kubernetes namespace, deployment, or pod) and managed at scale across multiple clusters, cloud providers, and data centers.

Runtime policies on Secure 2.4

Users can also create new policies without needing to know in-depth Falco expressions and filtering commands. Although Sysdig Secure focuses on containers, these custom rules can be applied to bare metal and virtual instance hosts as well.

Runtime rules library

Leveraging community and framework-driven policies via Rules Library

Taking it a step further, users can easily leverage existing security compliance frameworks such as MITRE ATT&CK and use container runtime security rules that adhere to eight key MITRE categories. These can now be implemented via the Rules Library as part of the container security profile.

Filtering by tag on the rules library

Thanks to open source contributions, a wide variety of community-sourced rules are available to enhance the rules repository and let other users benefit from a community-driven security approach. For example, FIM rules can easily be pulled in via the Rules Library in the Sysdig Secure platform.

Creating a new security policy

Security ops teams can apply these community- or framework-driven policies from the Rules Library and gain more assurance in their container runtime security posture. Read more about the new Sysdig Secure policy editor.

III. Sysdig Secure vulnerability reporting

Sysdig Secure now provides a unified console for easy vulnerability management, including reporting and advanced querying. Sysdig connects the dots between your image scanning vulnerability database and what is currently running on your platform, so each person can focus on the information relevant to them and build complex queries to understand what the status was at any point in time.

Let's illustrate this with an example. Imagine a query like: show me all the vulnerabilities in the prod namespace where the severity is greater than high, the CVE is more than 30 days old, and a fix is available. This question is now easily answered with Sysdig.

Secure 2.4: Vulnerability report

With the new vulnerability reporting capabilities, DevOps and security teams can easily query across a catalog of images, packages, and CVEs, as well as check advanced conditions like CVE age, fix availability, software version, and so on. Finally, these reports can be downloaded and shared (PDF/CSV) with vulnerability management teams, CISOs, and others.
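The example query (severity at or above a threshold, CVE older than 30 days, fix available) maps onto a simple filter over scan findings. A sketch under hypothetical field names, not Sysdig's query API:

```python
from datetime import date, timedelta

SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

def query_vulns(vulns, min_severity="high", min_age_days=30,
                fix_required=True, today=None):
    """Filter scan findings the way the example query does: severity at or
    above a threshold, CVE older than N days, and a fix available."""
    today = today or date.today()
    cutoff = today - timedelta(days=min_age_days)
    return [
        v for v in vulns
        if SEVERITY_RANK[v["severity"]] >= SEVERITY_RANK[min_severity]
        and v["published"] <= cutoff
        and (not fix_required or v["fix_available"])
    ]

# Hypothetical scan findings.
findings = [
    {"cve": "CVE-2019-1234", "severity": "critical",
     "published": date(2019, 1, 1), "fix_available": True},
    {"cve": "CVE-2019-9999", "severity": "medium",
     "published": date(2019, 1, 1), "fix_available": True},
]
urgent = query_vulns(findings, today=date(2019, 8, 1))
```

Ranking severities numerically is what makes "severity at or above high" a single comparison instead of a string match.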

New advanced alert configuration

We also added a new advanced alert configuration to notify you of changes in images, policies, or CVE exposure via Slack, PagerDuty, email, etc. This is important because your vulnerability management teams can be alerted if:
  1. A new image tag is pushed to a registry
  2. A specific policy change was triggered (pass or fail)
  3. A change in CVE information for a specific image (example nginx:latest)
Secure 2.4: repository alerts

This limits security alerts to only the people the information is relevant to, based on the applications they are responsible for. It enables security teams to fix issues faster while also reducing alert fatigue.

Improved scan results UI

This feature provides the ability to view a summary of all the policies an image was evaluated against, understand exactly what failed (and passed), see any vulnerabilities including specific OS and non-OS package checks, and inspect the image contents.

Sysdig Secure 2.4: Image scan results

With the new scan results UI, teams have:
  • Interactive and sortable scan results, including filtering by CVE severity (critical, high, medium, etc.).
  • An understanding of how an image has performed against the different audit policies that have been put in place.
Sysdig Secure 2.4 expands on its previous runtime security and vulnerability management capabilities with the addition of runtime profiling and the new policy editor. It is now easier than ever for security teams to create their container security policy, and they can now better understand, at any point in time, whether a vulnerability exists within their containers. Sysdig is the only platform that addresses key security challenges across the entire container lifecycle, including image scanning and vulnerability management, compliance, runtime security, and incident response and forensics, with deeper insights and a better understanding of all container and environment activity.

The post Sysdig Secure 2.4 introduces runtime profiling for anomaly detection + new policy editor for enhanced security. appeared first on Sysdig.

]]>