Scanning for vulnerabilities is a best practice and a must-have step in your application lifecycle to prevent security attacks. Implementing a strong Cloud Native Application Protection Platform (CNAPP) certainly includes this step. It is also important where this step is performed, but why? Let’s dig into the details of vulnerability scanning with Sysdig.
An application's lifecycle involves a number of steps, from the developer workstation, where fine art takes the shape of lines of code, to the final production environment where customers use a web application, mobile application, or anything else. Vulnerabilities can be introduced in any of those steps, so it is highly recommended to put barriers in place to prevent them from ruining your environment.
The “defense in depth” concept recommends performing automatic vulnerability scanning on different steps of the application lifecycle (sometimes even overlapping them). This will reduce the number of vulnerabilities introduced in your production environment. Sysdig Secure can help.
Sysdig’s vulnerability management is distributed and provides flexible integration across the whole application lifecycle while offering centralized governance to define the policies or create reports. It also provides a constant feedback loop of vulnerabilities, with all the context needed to fix them in a developer-friendly way (which packages and versions need to be updated).
Development
Let’s start from the beginning, the developer workstation.
As a developer, you have your own tools which you are comfortable with: your IDE, your CLI tools, your headphones, and your preferred music. You are creating some amazing new applications using your preferred language. In this creative process, you are not starting from scratch (it doesn’t make sense); you rely on third-party frameworks or libraries that, in turn, rely on other third-party frameworks or libraries, which rely on yet other third-party frameworks, and on, and on.
Once you finish the piece of code you are working on, you probably want to package it as a container, as this is the standard way to deploy applications nowadays. But before committing your code and submitting your PR (pull request), you want to try a local deployment, just in case.
sysdig-cli-scanner is a binary that you can download and run on your workstation (either x86_64 or arm64, Linux or OSX!) and it will scan your container image for known vulnerabilities in its dependencies. It’s as simple as:
OS=$(uname -s | tr "[A-Z]" "[a-z]")
VERSION=$(curl -L -s https://download.sysdig.com/scanning/sysdig-cli-scanner/latest_version.txt)
ARCH="arm64"
curl -sL "https://download.sysdig.com/scanning/bin/sysdig-cli-scanner/${VERSION}/${OS}/${ARCH}/sysdig-cli-scanner" -o ~/bin/sysdig-cli-scanner
pushd ~/bin/
shasum -a 256 -c <(curl -sL "https://download.sysdig.com/scanning/bin/sysdig-cli-scanner/${VERSION}/${OS}/${ARCH}/sysdig-cli-scanner.sha256")
popd
chmod +x ~/bin/sysdig-cli-scanner
SECURE_API_TOKEN=<your-api-token> ~/bin/sysdig-cli-scanner --apiurl <sysdig-api-url> <image-name>
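The commands above hard-code ARCH="arm64". A minimal sketch of detecting it instead is shown here. The function name `detect_arch` is made up for this example, and the mapping assumes the download URL uses `amd64` and `arm64` as path segments, so check the download site before relying on it.

```shell
#!/bin/sh
# Map the machine name reported by `uname -m` to the architecture segment
# of the download URL.
# ASSUMPTION: the URL uses "amd64" and "arm64"; verify this against the
# download site. detect_arch is a hypothetical helper, not part of the scanner.
detect_arch() {
  machine="${1:-$(uname -m)}"
  case "$machine" in
    x86_64|amd64)  echo "amd64" ;;
    aarch64|arm64) echo "arm64" ;;
    *) echo "unsupported architecture: $machine" >&2; return 1 ;;
  esac
}
```

With this in place, `ARCH=$(detect_arch)` could replace the hard-coded value, and the same script would work on both Intel and ARM workstations.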
For example, scanning the mariadb:latest image with Sysdig on M1 (arm64) Apple hardware running OSX 12 would look like:
SECURE_API_TOKEN="xxx" ~/bin/sysdig-cli-scanner mariadb:latest --apiurl https://eu1.app.sysdig.com
2023-01-27T11:48:33+01:00 Starting analysis with Sysdig scanner version 1.3.3
2023-01-27T11:48:33+01:00 Retrieving MainDB...
2023-01-27T11:48:33+01:00 Done, using cached DB
2023-01-27T11:48:33+01:00 Loading MainDB...
2023-01-27T11:48:33+01:00 Done
2023-01-27T11:48:33+01:00 Retrieving image...
2023-01-27T11:48:45+01:00 Done
2023-01-27T11:48:45+01:00 Scan started...
2023-01-27T11:48:45+01:00 Uploading result to backend...
2023-01-27T11:48:45+01:00 Done
2023-01-27T11:48:45+01:00 Total execution time 12.069945833s

Type: dockerImage
ImageID: sha256:a748acbaccae4dc8152ded948fa5a304df7b0888b4cea9116385e5e3bd812bfc
Digest: mariadb@sha256:8c15c3def7ae1bb408c96d322a3cc0346dba9921964d8f9897312fe17e127b90
BaseOS: ubuntu 22.04
PullString: mariadb:latest

42 vulnerabilities found
0 Critical (0 fixable)
1 High (1 fixable)
25 Medium (23 fixable)
5 Low (0 fixable)
11 Negligible (4 fixable)

PACKAGE                         TYPE    VERSION                  SUGGESTED FIX           CRITICAL  HIGH  MEDIUM  LOW  NEGLIGIBLE  EXPLOIT
github.com/opencontainers/runc  golang  v1.0.1                   v1.1.2                  0         1     1       0    0           0
libmysqlclient21                os      8.0.31-0ubuntu0.22.04.1  8.0.32-0buntu0.22.04.1  0         0     18      0    0           0
libgssapi-krb5-2                os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libk5crypto3                    os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libkrb5-3                       os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libkrb5support0                 os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libpam-modules                  os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0
libpam-modules-bin              os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0
libpam-runtime                  os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0
libpam0g                        os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0

POLICIES EVALUATION
Policy: Sysdig Best Practices FAILED (1 failures - 0 risks accepted)

Policies evaluation FAILED at 2023-01-27T11:48:45+01:00
Full image results here: https://eu1.app.sysdig.com/secure/#/scanning/assets/results/173e24c1b551bf2190c7491c5dda6070/overview (id 173e24c1b551bf2190c7491c5dda6070)
Let’s highlight a couple of facts:
- The total execution time took 12 seconds, but the scan took barely a second:
2023-01-27T11:48:45+01:00 Scan started...
2023-01-27T11:48:45+01:00 Uploading result to backend...
2023-01-27T11:48:45+01:00 Done
- All the information about which packages are vulnerable, along with the suggested fixes, is available right there:
PACKAGE                         TYPE    VERSION                  SUGGESTED FIX           CRITICAL  HIGH  MEDIUM  LOW  NEGLIGIBLE  EXPLOIT
github.com/opencontainers/runc  golang  v1.0.1                   v1.1.2                  0         1     1       0    0           0
libmysqlclient21                os      8.0.31-0ubuntu0.22.04.1  8.0.32-0buntu0.22.04.1  0         0     18      0    0           0
libgssapi-krb5-2                os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libk5crypto3                    os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libkrb5-3                       os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libkrb5support0                 os      1.19.2-2                 1.19.2-2ubuntu0.1       0         0     1       0    0           0
libpam-modules                  os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0
libpam-modules-bin              os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0
libpam-runtime                  os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0
libpam0g                        os      1.4.0-11ubuntu2          1.4.0-11ubuntu2.1       0         0     0       0    1           0
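Since the table is plain text, it is easy to post-process. The following is a rough sketch, not a supported interface, that lists the packages with at least one Critical or High finding together with their suggested fix. The function name `urgent_packages` is hypothetical, and the parsing assumes the column layout shown above stays as is.

```shell
#!/bin/sh
# Print "package suggested-fix" for every table row with at least one
# Critical or High finding. Severity columns are counted from the right,
# so a row with an empty SUGGESTED FIX column does not shift them.
# ASSUMPTION: the last six columns are CRITICAL HIGH MEDIUM LOW NEGLIGIBLE
# EXPLOIT, as in the output shown above. urgent_packages is a hypothetical name.
urgent_packages() {
  awk '
    NF >= 9 && $(NF-5) ~ /^[0-9]+$/ && $(NF-4) ~ /^[0-9]+$/ {
      if ($(NF-5) + $(NF-4) > 0) {
        fix = (NF >= 10) ? $4 : "-"
        print $1, fix
      }
    }
  '
}
```

If the scanner output was saved to a file, `urgent_packages < scan-output.txt` would print only the rows worth looking at first. For anything more robust than a quick check, a machine-readable output format is a better basis than parsing the console table.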
The scan results can also be viewed at the Sysdig URL printed at the end of the output, where you can see the same results but with more detail (and pretty colors!):
The Sysdig UI shows a detailed view of the vulnerabilities found:
The packages and versions affected:
The policies evaluation:
And some detail about the particular image:
linux/arm64 container images supported.
CI/CD
Let’s assume you already fixed all those vulnerabilities by updating the library dependencies and the PR has been submitted. The next step in the build chain is running a CI/CD pipeline to build the application, build the container image, run some tests… and check for vulnerabilities again. Wait, what? Why again?
- Who can guarantee the vulnerability scan has been done religiously by all the developers on their workstations before submitting the pull request?
- What if the developer performed the scan a couple of days ago and a new vulnerability has been found?
Additionally, as the vulnerability scan takes only a few seconds, it makes sense to run it again.
But not just that: the CI/CD scan can use different policies than the ones used in previous steps (read more about vulnerability policies). What about container image best practices, such as not running as root? Or PCI Audit policies? What about the “My company baseline” policy?
You can perform vulnerability scanning with Sysdig at the development stage and enforce stricter policies in your CI/CD as a gate for production workloads.
You can decouple the different policies and run them on different steps of the software supply chain! Using the sysdig-cli-scanner is just as easy as adding a --policy flag for the policies you want to check against.
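To illustrate the decoupling, here is a small sketch that picks a policy per lifecycle stage and builds the matching scanner command line without running it. The policy names `dev-baseline` and `company-baseline`, the `SYSDIG_API_URL` variable, and both function names are hypothetical placeholders; only `--apiurl` and `--policy` come from the text above.

```shell
#!/bin/sh
# Pick the policy to evaluate for a given lifecycle stage.
# ASSUMPTION: the policy names are hypothetical placeholders; use the
# names defined in your own Sysdig Secure account.
policy_for_stage() {
  case "$1" in
    dev) echo "dev-baseline" ;;
    ci)  echo "company-baseline" ;;
    *)   echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Print the scanner invocation for a stage and image without running it,
# so the command line can be reviewed first.
# ASSUMPTION: SYSDIG_API_URL is a hypothetical variable holding the API URL.
scan_command() {
  stage="$1"; image="$2"
  policy=$(policy_for_stage "$stage") || return 1
  printf '%s\n' "sysdig-cli-scanner --apiurl \$SYSDIG_API_URL --policy $policy $image"
}
```

For example, `scan_command ci myapp:1.0` prints a command that evaluates the stricter policy, while `scan_command dev myapp:1.0` keeps the lighter one for local runs.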
As explained before, the sysdig-cli-scanner can be executed in virtually any CI/CD system out there, as it is just a single binary. If you want to learn more, we have some examples on how to integrate it with some popular tools such as Jenkins (hint: for Jenkins there is an official plugin already), GitHub Actions, GitLab CI/CD, or Azure Pipelines.
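For a CI/CD system without a ready-made integration, a gate can be as small as a shell function. The sketch below assumes the scanner exits with a non-zero status when the policy evaluation fails (check the scanner documentation for the exact exit codes). `SCANNER`, `SYSDIG_API_URL`, and `SCAN_POLICY` are hypothetical variable names chosen for this example.

```shell
#!/bin/sh
# Run the scanner as a pipeline gate: a non-zero exit status from the
# scanner fails the step.
# ASSUMPTIONS: SCANNER, SYSDIG_API_URL and SCAN_POLICY are hypothetical
# variable names; the scanner is assumed to exit non-zero when the policy
# evaluation fails. SECURE_API_TOKEN must be set in the environment.
scan_gate() {
  image="$1"
  scanner="${SCANNER:-$HOME/bin/sysdig-cli-scanner}"
  if "$scanner" --apiurl "$SYSDIG_API_URL" --policy "$SCAN_POLICY" "$image"; then
    echo "scan passed: $image"
  else
    status=$?
    echo "scan failed (exit $status): $image" >&2
    return "$status"
  fi
}
```

Calling `scan_gate "$IMAGE" || exit 1` as a pipeline step stops the build before the image is pushed. Keeping the scanner path in a variable also makes the gate easy to dry-run with a stub.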
Registry
The policies have been enforced already in the CI/CD pipeline, so, essentially, the last step of the CI/CD should be pushing the container image to the container registry. Then, why scan container images at the registry level again?
The idea is to follow the zero trust security model, which says “never trust, always verify.” What if the CI/CD was bypassed and someone pushed the container image directly? What about those images that were scanned weeks ago? Were new vulnerabilities discovered since that last scan?
But there is more. What about third-party container images? It is pretty common in enterprise environments to have air-gapped architectures where the container images needed for an application, and provided by third parties (such as SQL databases, event streaming platforms, in-memory databases, web servers, etc.), are mirrored into internal container registries. Those images bypass your CI/CD pipeline but you should still perform vulnerability scanning on them.
Sysdig supports container registry scanning and it scans all the container images periodically or based on events, such as pushing a new container image to the registry.
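The reasoning behind periodic rescans can be expressed as a simple age check. This is only an illustration of the idea, not how Sysdig schedules its registry scans, and the function name `needs_rescan` is made up for the example.

```shell
#!/bin/sh
# Succeed when the last scan is older than the allowed maximum age.
# Arguments: last scan time and current time (epoch seconds), then the
# maximum age in seconds.
# ASSUMPTION: illustration only; this is not Sysdig's scheduling logic.
needs_rescan() {
  last_scan="$1"; now="$2"; max_age="$3"
  [ $(( now - last_scan )) -gt "$max_age" ]
}
```

With a one-week limit (604800 seconds), an image last scanned two weeks ago qualifies for a rescan, while one scanned an hour ago does not. An event-based trigger, such as a push to the registry, covers the other case: brand-new images that have never been scanned.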
The scan results can be observed in the Sysdig web interface under the Vulnerabilities -> Registries section:
Registry Scanning is currently in a “Controlled Availability” phase. Please contact your Sysdig representative if you want to try it.
Runtime
All the guardrails are in place, the container image is stored safely in the registry, and it is time to run it in a production environment. The Sysdig agent also performs vulnerability scanning of the container images running in the Kubernetes cluster. The question is why?
First and foremost, having insights into the container images running in the production environment is very useful. A vulnerability affecting a container image that is not running is less urgent to fix than one affecting a container image running wild in production.
But there is more.
Prioritization based on “in-use” exposure
The Sysdig agent performs kernel level instrumentation (via a kernel module or an eBPF application) to observe every single Linux syscall. This means it is able to identify everything that happens under the hood, including the running processes, the files opened, or the network connections, so it can determine which processes are actively running in your container or which libraries are being used. Connecting that awesome feature with the vulnerability scanning capabilities means it can identify the vulnerable packages that are being used, so ideally they are fixed first. This is what we call “Risk Spotlight” and based on customer feedback, we have discovered that up to 95% of vulnerabilities are considered noise.
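The prioritization idea can be sketched in a few lines: findings on packages seen in use go first, then everything is ordered by severity. The three-column input format (`package severity in_use`) is made up for this example and is not an export format of Sysdig Secure; the function name `prioritize` is hypothetical too.

```shell
#!/bin/sh
# Order findings so that packages observed in use at runtime come first,
# then by severity. Input: one "package severity in_use" row per line,
# where in_use is "yes" or "no".
# ASSUMPTION: this input format is hypothetical, made up for the sketch.
prioritize() {
  awk '
    BEGIN { rank["Critical"]=0; rank["High"]=1; rank["Medium"]=2;
            rank["Low"]=3; rank["Negligible"]=4 }
    { used = ($3 == "yes") ? 0 : 1          # in-use rows sort first
      sev  = ($2 in rank) ? rank[$2] : 9    # unknown severities sort last
      print used, sev, $0 }
  ' | sort -k1,1n -k2,2n | cut -d" " -f3-
}
```

Note that a Medium finding on an in-use package ends up above a Critical one on a package that is never loaded, which is exactly the trade-off the "in-use" approach makes.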
The Sysdig runtime scanner is deployed by default with the new Vulnerability Management engine (using the nodeAnalyzer.secure.vulnerabilityManagement.newEngineOnly=true parameter), and “Risk Spotlight” can be enabled easily by following the official documentation and setting the nodeAnalyzer.runtimeScanner.settings.eveEnabled=true parameter when you deploy the Sysdig agent using the Helm charts.
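As a sketch, the two parameters could be passed to Helm as follows. The function prints the command instead of running it. The release name, the namespace, and the `sysdig` repository alias are assumptions to adapt to your installation; the parameter paths are copied from the text above and should be checked against the values of the chart version you deploy.

```shell
#!/bin/sh
# Print the Helm command that sets the two parameters named above.
# ASSUMPTIONS: the release name "sysdig-agent", the namespace
# "sysdig-agent" and the repo alias "sysdig" are hypothetical defaults;
# --reuse-values assumes an existing release whose other values should be kept.
risk_spotlight_helm_command() {
  release="${1:-sysdig-agent}"
  namespace="${2:-sysdig-agent}"
  printf '%s\n' "helm upgrade --install $release sysdig/sysdig-deploy \
--namespace $namespace --reuse-values \
--set nodeAnalyzer.secure.vulnerabilityManagement.newEngineOnly=true \
--set nodeAnalyzer.runtimeScanner.settings.eveEnabled=true"
}
```

Review the printed command first; once it matches your environment, it can be executed as is.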
Host scanning
A compromised container image is bad. Depending on the vulnerability or the skill of the threat actor, the attacker can move laterally to other containers or, worst case, to the host itself. Fortunately, all of the security isolation mechanisms on containers make this scenario rare (but not impossible). However, directly compromising a host is even worse. If the threat actor is able to compromise the host (depending on the level of compromise, of course), they have access to virtually all the containers running on top of it.
That’s why scanning the container host is also important (remember “zero trust”?). The Sysdig agent performs this procedure every 12 hours to avoid stale information. As demonstrated before, the vulnerability scanning procedure only takes a few seconds, so running it is a no-brainer.
Sysdig vulnerability management supports the most common Linux operating systems, even cloud or image-based/immutable operating systems (such as Google Container-Optimized OS (COS), RHCOS or Flatcar). Under the hood, these still use a packaging system (rpm-ostree for RHCOS or Gentoo’s ebuilds for Flatcar), which means that there is an SBOM to check against. Also, a number of package types are supported, such as Java, Golang, or Python packages.
Host scanning is deployed by default using the sysdig-deploy Helm chart version 1.5.0+ and HostScanner container version 0.3.0+, and exposed on the same Sysdig web interface under the Vulnerabilities -> Runtime section by filtering the result by host asset type:
Reporting
Using Sysdig reporting capabilities, the security team can quickly find which containers or hosts are vulnerable and where they are running. Reports are generated by applying filters to focus on what matters the most and are useful to understand which vulnerabilities affect different teams or images so they can be fixed.
The following screenshot shows a report of running images that are vulnerable to the infamous Log4j CVE-2021-44228, which is sent to our inbox at 9 a.m. every day.
And this one is for Debian hosts in the “demo-kube-aws” cluster containing vulnerabilities whose CVSS score is greater than 7:
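The same kind of filter is easy to reproduce on exported data. The sketch below keeps only the findings above a CVSS threshold. The CSV layout (`cve,cvss_score,host`) and the function name `above_cvss` are hypothetical, chosen for the example; they do not describe a Sysdig report format.

```shell
#!/bin/sh
# Print the CVE IDs whose CVSS score is strictly above a threshold.
# Input: CSV rows of "cve,cvss_score,host" on standard input.
# ASSUMPTION: this CSV layout is hypothetical, not a Sysdig report format.
above_cvss() {
  awk -F, -v min="$1" '$2 + 0 > min + 0 { print $1 }'
}
```

For instance, `above_cvss 7 < findings.csv` would list only the entries that match the report filter described above.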
Accepting risks
What if you are aware of a vulnerability but it is a low priority to fix (it is not being used, or the dependency is planned for removal soon)? You can “Accept risk” for it!
Accepting Risk makes an exception to the Vulnerability Policy; it doesn’t make the CVE disappear. It still shows it in the list, but voids the policy violation associated with that CVE.
You can accept risk based on different contexts, such as:
- Global: the CVE is accepted globally
- Container image or host: the CVE is accepted for that particular container image or host
- Package: the CVE is accepted for a particular package (or package + version)
Be careful with the accepted scope or context; overly broad exceptions can create false negatives.
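The scoping rules can be modeled with a short lookup. This is a sketch of the logic only: the row format (`cve scope target`), the scope keywords, and the function name `is_risk_accepted` are invented for the example, since Sysdig Secure manages acceptances in its own backend.

```shell
#!/bin/sh
# Succeed when a finding is covered by an accepted risk.
# Arguments: CVE ID, asset (image or host) name, package name.
# Acceptance rows are read from stdin as "cve scope target", where scope
# is global, asset, or package ("-" as the target for global rows).
# ASSUMPTION: this row format is hypothetical, made up for the sketch.
is_risk_accepted() {
  awk -v cve="$1" -v asset="$2" -v pkg="$3" '
    $1 == cve && $2 == "global"                 { found = 1 }
    $1 == cve && $2 == "asset"   && $3 == asset { found = 1 }
    $1 == cve && $2 == "package" && $3 == pkg   { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

An acceptance scoped to one asset does not cover the same CVE anywhere else, whereas a global row matches every asset and package, which is why broad scopes deserve extra care.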
Accepting risks is as simple as selecting “Accept risk” and filling in the form with the details:
Then, a shield icon will be placed to indicate it:
The Policies -> Risk Acceptance section shows all the risk acceptances, including the ones that have already expired.
Neat!
Conclusions
Sysdig Secure vulnerability management provides a single pane of glass across the entire application lifecycle: from the developer workstation, through the CI/CD pipeline and the registries where images are stored, to the final production environment. Vulnerabilities can be introduced in any of those steps and at any time, so it is highly recommended to add as many layers of security as you can to prevent them from ruining your environment.