Sysdig Blog | Loris Degioanni (https://sysdig.com/blog/author/loris/)

The Urgency of Securing AI Workloads for CISOs
https://sysdig.com/blog/urgency-of-securing-ai-workloads-for-cisos/ | Tue, 21 May 2024

Media attention on various forms of generative artificial intelligence (GenAI) underscores a key challenge that CISOs, CIOs, and security leaders face – keeping current with the fast pace of technological change and the risk factors that this change brings to the enterprise. Whether it's blockchain, microservices in the cloud, or GenAI workloads, security leaders are not just tasked with keeping their organizations secure and resilient; they are also the key players in understanding and managing the risks associated with new technologies and new business models. While each novel technology brings new considerations and risks to evaluate, there are a handful of constants that the security profession must address proactively.

Temporal considerations

Our businesses and the applications that underpin them run at network and machine speed. Web services, APIs, and other interconnections are designed for near-instantaneous response. It's not only lawyers who note that "time is of the essence"; it's every colleague we support and the business applications and services that we collectively use to run the organization. The focus on speed and response times permeates business transactions and the application development environments they rely upon. The rush to respond and deliver has undermined more traditional risk assessments and evaluations that were effectively point-in-time analyses. Security today demands real-time context and actionable insights delivered at machine speed. Runtime telemetry and runtime insights are required to speed up our security operations.

Automation

The evening news is awash in stories suggesting that AI systems will displace workers with machines and applications that do the job more effectively than we sentient beings can. Automation is not new. Almost every industry has invested in automation. We see robots building cars, kiosks at banks and retail outlets, and automation within the cybersecurity profession. We will witness new forms of automation as GenAI tools are rolled out to support businesses. We already see this with system, code, and configuration reviews within infrastructure, operations, and security programs. Automation should be welcomed within our security programs and treated as integral to the program's target operating model.

Algorithms and mathematical models

The third constant we witness with technological change is the use of algorithms and mathematical models to contextualize and distill data. We live in an algorithmic economy. Data and information drive our businesses. Algorithms inform business models and decision-making. Like the other constants of speed and automation, algorithms are also used in our cybersecurity profession. Algorithms evaluate processes, emails, network traffic, and many other data sets to determine whether behavior is malignant or benign. A notable challenge with algorithms is that, in most cases, they are considered the manufacturer's intellectual property. Algorithms and transparency are at odds. Consequently, addressing the fidelity and assurance of an algorithmic outcome is less science and more a leap of faith. We assume the results are fine, but there's no guarantee that two plus two does not equal 4.01 after executing the algorithm.

How to assess new technologies

This context of speed, automation, and algorithmic use should be front and center for CISOs as they evaluate how their organization will deploy AI tools for both their business and the security of its operations. Having a methodology to contextualize new technologies, like GenAI, and their commensurate risks is integral to the CISO and CIO roles. Technology leaders must effectively operate their respective programs and support the business while governed by these constants of speed, automation, and the widespread use of algorithms for decision-making and data analysis. 

A methodological approach to rapidly assessing new technologies is required to avoid being caught flat-footed by technological change and the inherent risks that this change brings to the business. While each business will have its own approach to evaluating risk, some effective techniques should be part of the methodology. Let’s take a quick look at some important elements that can be used to evaluate the impacts of GenAI. 

Engage the business

New technologies like GenAI have pervasive organizational impacts. Ensure that you solicit feedback and insights from key organizational stakeholders including IT, lines of business, HR, general counsel, and privacy. CISOs who routinely meet with their colleagues throughout the business can avoid being blindsided by the new tools and applications that these colleagues employ. CISOs should be asking their colleagues and counterparts within the security community how they are using AI currently and/or how they intend to use AI for specific functions within the organization. 

Conduct a baseline threat model using STRIDE and DREAD 

Basic threat modeling complements more traditional risk assessments and penetration tests, and can be done informally when expediency is required. CISOs and their staff should go through prospective AI use cases and ask questions to evaluate how user activity within an AI application could be spoofed, how information could be tampered with, how transactions could be repudiated, where information disclosure could occur, how services could be denied, and how privileges could be elevated within the environment. The security team should take an inquisitive approach to these questions and should think like a threat actor when trying to exploit a given system or application. A basic STRIDE model ensures that key risks are not omitted from the analysis. DREAD looks at impact and complements the STRIDE analysis. The CISO and security team should evaluate the potential damage that may result if an AI workload or service were compromised, how easy it would be to reproduce the attack against the system, the degree of skill and tooling required to exploit the given system, who the affected users and systems would be, and how hard it would be to discover the attack.
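To make the DREAD half of this exercise concrete, a lightweight worksheet entry can be kept alongside each AI use case. The sketch below is illustrative only; the use case, threat, scores, and 1–10 averaged scale are assumptions rather than a prescribed format.

```yaml
# Illustrative DREAD worksheet entry for a hypothetical GenAI use case.
# Scores use an assumed 1-10 scale; the overall rating is a simple average.
use_case: internal-genai-support-assistant
threat: prompt injection leading to disclosure of customer records
stride_category: Information Disclosure
dread:
  damage_potential: 8        # customer data could be exposed
  reproducibility: 7         # crafted prompts are easy to replay
  exploitability: 6          # requires access to the chat interface only
  affected_users: 5          # subset of customers whose data is indexed
  discoverability: 6         # prompt injection techniques are widely published
overall_score: 6.4           # (8 + 7 + 6 + 5 + 6) / 5
decision: mitigate           # e.g., add output filtering and retrieval scoping
```

Keeping entries this small makes it practical to repeat the exercise every time a new AI workload is proposed, rather than treating threat modeling as a one-off project.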

Evaluate telemetry risks

Newer applications and technologies, like the current forms of GenAI, may lack some of the traditional telemetry of more mature technologies. The CISO and security team members must ask basic questions about the AI service. A simple open-ended question may start the process – "What is it that we don't see that we should see with this application, and why don't we see it?" Delve a bit deeper and ask, "What is it that we don't know about this application that we should know, and why don't we know it?" Lean into these questions from the runtime, workload, and configuration perspectives. These types of open-ended questions have led to significant improvements in application security. If questions like these were not being asked, security professionals would not have seen the risks applications encounter at runtime, that service accounts are too often over-permissioned, or that third-party code may introduce vulnerabilities requiring remediation or additional controls.

Use a risk register for identified risks

CISOs and their teams should document concerns about using GenAI applications and how these risks should be mitigated. GenAI may present many forms of risk, including issues related to the fidelity and assurance of responses; data and intellectual property loss that may occur when this information is fed into the application; the widespread use of deepfakes and sophisticated phishing attacks against the organization; and polymorphic malware that quickly contextualizes the environment and attacks accordingly. GenAI dramatically expands the proverbial attack surface of an organization in that these large language models (LLMs) can quickly create organization-specific attacks based on a dossier of the organization's employees and publicly available information. In effect, while the algorithms that these AI tools use are obfuscated, the data they use is in the public domain and can be quickly synthesized for both legitimate and nefarious purposes. Use a risk register to document all of these potential risks when using AI tools and applications. Ultimately, the business will decide if the upside benefits of using a specific AI function or application outweigh any identified risks. Risk treatment should remain with the business. Our job as security leaders is to ensure that our colleagues in the C-suite are aware of the risks, the potential remediations, and the resources required.
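A register entry does not need to be elaborate to be useful. The sketch below shows one possible shape for an entry; the field names, identifiers, owner, and treatment values are hypothetical and should be adapted to whatever register format your organization already maintains.

```yaml
# Hypothetical risk register entry for GenAI adoption; field names are illustrative.
- id: AI-2024-007
  title: Sensitive data loss via prompts to a third-party GenAI service
  description: >
    Employees may paste source code or customer data into an external
    LLM application, placing intellectual property outside our control.
  category: data-loss
  likelihood: medium
  impact: high
  identified_by: security-team
  risk_owner: line-of-business-owner   # risk treatment stays with the business
  treatment: mitigate
  mitigations:
    - publish an acceptable-use policy for GenAI tools
    - route approved use cases through a gateway that redacts sensitive fields
  review_date: 2024-11-01
```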

Focus on training and critical thinking

AI has the potential to fundamentally change our economy, just as the internet modernized business operations through ubiquitous connectivity and near real-time access to information. The proverbial genie is out of the AI bottle. New and creative uses of AI are being developed at breakneck speed, and there is no fighting market forces and innovation. As security professionals, we must proactively embrace this change, evaluate sources of risk, and make prudent recommendations to remediate risks without interrupting or slowing the business down. This is not an easy charge for our profession and our teams. However, by adopting a proactive approach, ensuring that our colleagues are well trained in critical thinking, and exploring how services may be targeted, we can make our organizations more resilient as they embrace what AI may bring to the enterprise.

As AI's presence in our enterprises and the economy expands, new business models and derivative technologies will undoubtedly emerge. CISOs and security leaders will need to use this context to evaluate the efficacy of their current and future security practices and tooling. Our adversaries are highly skilled and use automated techniques to compromise our organizations. These adversaries are already using nefarious forms of GenAI to create new zero-day exploits and other highly sophisticated attacks, frequently using social engineering to target key roles and stakeholders. In short, our adversaries continue to up their game. As security leaders, it's incumbent upon us to do the same. We know that the pace of our security operations must improve to confront risks executed at runtime and at network speed.

The Urgent Need for Real-time Cloud Detection & Response
https://sysdig.com/blog/urgent-need-real-time-cloud-detection-and-response/ | Thu, 14 Mar 2024

It is impressive how explosively the cloud security market has embraced detection and response in recent months.

The industry, including both users and vendors, is rapidly acknowledging the complexity of modern cloud attacks. Facilitated by automation and APIs, these attacks cannot be effectively countered with traditional solutions that lack cloud context or that focus solely on posture.

Sysdig has been aware of this for quite some time. Sysdig began this journey by creating Falco, the open source standard for cloud-native threat detection and response. More recently, our Threat Research Team uncovered key cloud attacks (SCARLETEEL, AMBERSQUID, SSH-Snake) and determined that it takes less than 10 minutes for bad actors to inflict damage. This work led us to develop a benchmark that security teams must achieve to keep pace with cloud attacks. We call it the 5/5/5 Benchmark for Cloud Security.

Recent events indicate that a growing number of people share our perspective:

  • Falco has achieved graduation status within the Cloud Native Computing Foundation (CNCF). This milestone underscores the cloud native community’s recognition of the critical role of detection and response.
  • During the most recent earnings call, CrowdStrike’s leadership highlighted that the speed of cloud attacks is dramatically accelerating.
  • Finally, Wiz announced the acquisition of Gem Security, an attempt to add detection and response to a posture-centric product.

The cloud demands a comprehensive strategy

The reason for this urgency is clear: traditional, posture-based approaches, while important, are inadequate for addressing modern cloud attacks. Similarly, threat detection and response solutions built for endpoints and on-premises networks lack the rich cloud and DevOps context needed to thwart cloud attacks and "zero days."

The solution is becoming evident and requires a comprehensive strategy:

  • Integrating posture management with detection and response is crucial. Only a holistic view of risk that correlates misconfigurations and supply chain vulnerabilities with actual attack behaviors can eliminate blind spots and prevent the overwhelming flood of irrelevant findings.
  • Operating on a multitude of data sources is essential. This includes configuration information from agentless scans, workload signals from agents, trails from cloud services, and logs from applications like Okta and GitHub. All this data, rich in detail, must be not only collected but also accurately correlated.
  • Being truly real time is non-negotiable. If logs must be sent to a SIEM or a legacy detection and response platform, by the time the data is ingested and indexed, it’s already too late.
  • A combination of agentless and agent-based telemetry is required to truly understand cloud attacks.

Simply put: cloud security requires runtime insights. Preventative measures alone will never detect, contain, and manage zero-day attacks.

Sysdig’s vision for the past several years has been a runtime-powered CNAPP. We’ve been committed to creating the best integrated platform, where the utilization of runtime insights is a fundamental design principle, not an afterthought. Our platform aims to deliver the best insights in the shortest time possible.

Our users are embracing this vision aggressively: just a few weeks after releasing our Okta and GitHub integrations, dozens of customers have deployed them, leveraging them to detect and respond to advanced threats.

The Sysdig vision is to create a world where enterprises can safely innovate in the cloud. To do that, they must have a CNAPP solution that has real-time detection and response at the center. If you’re a security leader looking for a rich, powerfully-integrated solution that leverages detection and response to protect against the most sophisticated cloud threats, you have two options:

  • Wait for other vendors to define their strategies and complete their acquisitions. Then pause to give them time to successfully integrate their new product into their existing platform so it is a seamless experience.
  • Choose Sysdig now.

As we all know, your adversaries won’t wait.

Learn more about the Sysdig 5/5/5 benchmark here.

Celebrating Falco's Journey to CNCF Graduation
https://sysdig.com/blog/falco-cncf-graduation/ | Thu, 29 Feb 2024

Celebrating Falco Graduation

Today, we are proud to celebrate Falco's graduation within the Cloud Native Computing Foundation (CNCF). Graduation marks an important milestone for a journey that began in 2018 when Sysdig contributed Falco to the CNCF. It's a significant accomplishment for the industry at large, enhancing the security of modern computing platforms, and it has only been made possible by a huge community effort by developers from many companies and a constellation of adopters worldwide. To understand the impact that Falco has made on the industry, it's important to understand its origin story.

In 2014, when we started writing the first lines of code of what would ultimately become the Falco drivers, I could hardly have imagined what Falco would become, and its significance to modern computing platforms. The journey has been fun and long, starting even earlier than 2014: Falco’s origins trace back to network packets.

The Journey from Packets to Security Instrumentation in the Cloud

In the late 1990s, the rapid expansion of computer networks highlighted the need for affordable network visibility tools. The Berkeley Packet Filter (BPF) emerged as a significant advancement, enabling packet capture and filtering within the BSD operating system. BPF is the precursor of today’s widely used eBPF, and was originally released together with an accompanying library, libpcap. libpcap was used as the base for tools like tcpdump and Wireshark (originally Ethereal), which became standard tools for packet analysis.

In the following years, the utility of network packets quickly extended beyond troubleshooting to security. A good example is Snort, an open-source intrusion detection system released in 1998. Snort, leveraging packet data and a flexible rule engine, offered real-time detection of threats coming through the network.

With the evolution of computing architectures, packet-based signals were becoming harder to collect and decode. Tools like tcpdump, Wireshark and Snort remained extremely popular, but trends like containerization, encryption and the transition to the cloud made them substantially less effective. 

That is why, after over a decade spent working on these tools, a small group of people decided to rethink what security-focused instrumentation would look like if you could design it from the ground up to support cloud native infrastructures. We decided to focus on the Linux kernel, and in particular on its system call layer, as the instrumentation layer, and we included support for containers and Kubernetes from day 1. Using system calls, we could offer the same workflows of packet-based tools (detailed captures, filters, trace files…), but in a way that was tailored to the modern paradigms. 

The Falco instrumentation components, which we creatively called Falco libs, were released in 2014, together with the command line sysdig tool, which you can think of as tcpdump for system calls.

Runtime Security is Born

Falco was released in 2016. It put together syscall capture and a rich rule engine, allowing users to flexibly create detections for both containers and hosts. The community immediately took notice, and runtime security was born.
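To give a flavor of what that rule engine looks like in practice, here is a condensed rule in the spirit of Falco's default ruleset, flagging an interactive shell spawned inside a container. It is a simplified sketch (the shipped rules rely on shared macros and cover more shells and exceptions), not a copy of the official rule.

```yaml
# Simplified sketch of a Falco rule: alert when an interactive shell
# starts inside a container. The default ruleset expresses this with
# shared macros and a longer list of shells and exceptions.
- rule: Terminal shell in container (sketch)
  desc: Detect a shell spawned with a TTY attached inside a container
  condition: >
    evt.type = execve and evt.dir = < and
    container.id != host and
    proc.name in (bash, sh, zsh) and
    proc.tty != 0
  output: >
    Shell spawned in a container (user=%user.name container=%container.name
    shell=%proc.name parent=%proc.pname cmdline=%proc.cmdline)
  priority: NOTICE
  tags: [container, shell]
```

The condition is evaluated against every matching syscall event as it happens, which is what makes this a runtime detection rather than a point-in-time scan.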

Falco grew in two dimensions: instrumentation technology and richness of detections. On the first front, we pioneered the use of eBPF to collect security signals. Using eBPF for security is something that is obvious to anyone in the industry today, but in 2018, when we released our eBPF driver, it was unheard of. Actually, it was impossible to imagine: we had to work with the Linux kernel community to address some outstanding issues in eBPF before we could make it functional.

On the second front, Falco gradually became more and more modular, including support for data sources like Kubernetes audit logs, cloud trails, third-party applications like Okta and GitHub, and many more. Our vision is that, as all software becomes cloud software, runtime security requires much more than the collection of kernel signals. Threats are complex and can originate inside your containers, but they can also come from your control plane, your infrastructure, your identities, your data stores, and your cloud services. Falco offers a unified and correlated view that can be used to detect many types of attacks and track them as they move across your infrastructure.

The Power of Community

Contributing Falco to the Cloud Native Computing Foundation (CNCF) in 2018 was a major step for the project. It was based on the belief that runtime security is a key component of the modern computing stack based on Kubernetes, and that it needs to become a default piece of the stack. We also believed that only a community approach, where the good guys work together, gives all of us a real chance against bad actors. 

Falco’s graduation is the culmination of a long journey, and is a great example of open source innovation, where contributions build upon past achievements, connecting diverse communities and technologies. It means that Falco is tested, validated and deployed enough that you can trust it in the most demanding scenarios. Reaching this point wouldn’t have been possible without the contributions of many people: early adopters, developers, core maintainers, sponsors, the community of users, the Cloud Native Computing Foundation. We cannot thank each of them here, but we want to make sure they know we appreciate what they did.

As for Falco as a project, we are delighted to reach such a milestone, but we think this is just the beginning. There are many features we want to add, but even more importantly, we want to make sure Falco is easy to deploy, lightweight, and always able to detect the latest threats. That way, we hope, we can help you run your cloud software confidently and safely.

More than an Assistant – A New Architecture for GenAI in Cloud Security
https://sysdig.com/blog/genai-cloud-security/ | Tue, 25 Jul 2023

There is no question that cybersecurity is on the brink of an AI revolution. The cloud security industry, for example, with its complexity and chronic talent shortage, has the potential to be radically impacted by AI. Yet the exact nature of this revolution remains uncertain, largely because the AI-based future of cybersecurity is still being invented, step by step.

Today, Sysdig takes a significant leap forward in shaping this future. We’re excited to announce Sysdig Sage, the AI security assistant specializing in cloud security. This blog post aims to describe what Sysdig Sage is; what it can do for you (with examples!); and more importantly, what sets Sysdig Sage apart. Moreover, I’ll outline Sysdig’s perspective on the present and future role of AI in cybersecurity.

Up until now, industry attempts to harness large language models (LLMs) in cybersecurity have mainly fallen into two categories:

  • Context enrichment: Here, AI performs relatively simple tasks that support user workflows. For example, you can feed a compliance violation event to ChatGPT, which can then suggest AWS commands for use in the remediation process. This stateless approach is useful but fairly basic.
  • Query building: This entails providing natural language interfaces to repositories of security events and logs, such as security information and event management (SIEM) back-ends or extended detection and response (XDR) tools. LLMs excel at formulating queries and interpreting small data sets, offering valuable support to both novice and advanced users. Modern LLMs can also retain context across multiple questions, providing effective “chatbot” functionality.

Sysdig Sage is based on a more ambitious and comprehensive approach, striving to be as indistinguishable as possible from a cybersecurity expert, with deep cloud security expertise and the ability to skillfully assist you with the Sysdig Secure cloud-native application protection platform (CNAPP). With this powerful combination, you can gain a clearer picture of your security posture, meet compliance requirements more quickly, and stop cloud attacks more confidently.

In developing Sysdig Sage, we are focusing on these properties:

  • Advanced, multistep reasoning: In a complex field like cloud security, questions rarely have straightforward answers. Often, you need to investigate and iterate before finding a solution. Sysdig Sage is designed to undertake multiple investigative steps before delivering an answer.
  • Integrating multiple domains: Cloud security comprises numerous data sources, each with its own formats and semantics – vulnerabilities, compliance violations, runtime events, and continuous integration/continuous delivery (CI/CD) security. A true assistant must understand and correlate these domains, treating them as parts of a larger puzzle rather than a collection of acronyms and subcategories.
  • Exercising judgment: Sysdig Sage is smart enough to aid in risk assessment, prioritization, and decision-making. It can help you understand the scope of an attack, separate the needle from the haystack, and identify correlations.
  • Proactivity: Sysdig Sage understands what you are doing and interjects with helpful insights at the appropriate moments. It also guides you toward problem resolution.
  • Action-taking capability: Sysdig Sage can guide you through the UI when you need help, modify a noisy runtime rule, or send a summary on Slack.

One of the most impressive aspects of Sysdig Sage is that it’s supercharged by Falco, the open source standard for runtime security from the Cloud Native Computing Foundation. The collective knowledge of Falco’s community is integrated into Sysdig Sage right out of the box. This is because most LLMs are trained on publicly available data, which of course encompasses all knowable information (and every discussion!) about Falco. Consequently, Sysdig Sage is extra-effective at detecting, triaging, and responding to runtime threats.

Architecturally, Sysdig Sage is powered by what we call the “LLM controller”. This component, based on a state-of-the-art generative AI architecture and infused with Sysdig’s unique secret sauce, mediates the interaction between the user and the AI. The controller offers expert guidance, validates the accuracy of the responses (therefore mitigating hallucinations), and can perform actions in the product on behalf of the LLM. This not only enhances the scope and effectiveness of the ML models but “steers” the LLM toward specific areas using hierarchical prompting. The controller also safeguards the user’s sensitive data (for example, it is capable of anonymizing the messages that the LLM receives) and mitigates privacy issues.

Our investment in Sysdig Sage stems from our firm belief that generative AI is the most significant revolution the security industry has ever seen. Sysdig is dedicated to leading this revolution, aiming to deliver not just the first, but more importantly, the best AI for cloud security. We have been working tirelessly to create Sysdig Sage, and are confident that it will transform the way you approach cloud security.

Want to learn more? Sysdig Sage is currently accepting candidates for early access later this year. Sign up here for more information.

Run Faster, Runtime Followers
https://sysdig.com/blog/run-faster-runtime-followers/ | Thu, 04 May 2023

Recently, there has been a flurry of announcements claiming to offer what we call Runtime Insights: the ability to prioritize vulnerabilities based on what is actually in use in production. Here are two examples:

  • Datadog Press Release: Datadog Expands Application Security Capabilities To Automatically Uncover Vulnerabilities In Production Code 
  • Lacework Blog: Finally, a reason for your developers to want an agent

I can confirm that this approach works, and it works very well. It substantially decreases the number of vulnerabilities that a team has to manage, sometimes by a factor of 100 or more!

How do I know it? Because Sysdig invented this approach.

Run Faster, Runtime Followers 

We identified this unmet need after talking with many companies who were trying to implement shift left strategies but struggling to make it work in practice. We heard the overwhelming frustration of chasing endless software vulnerabilities, and we realized that we could use Runtime Insights (aka what’s in use in a production environment) to revolutionize the lives of security and developer teams. We delivered this capability over a year ago and have since worked with forward-looking partners like Snyk to integrate it into their solutions.

Our confidence in the effectiveness of this approach comes from observing its impact across our diverse user base, which includes many large-scale global production environments. The results have been remarkably consistent.

When a leader comes up with a technology that pushes the envelope, it’s only natural for followers to adapt and emulate it. We welcome this, as it ultimately serves as a net positive for users in the long run.

However, Sysdig hasn’t rested on its laurels. In addition to refining our implementation multiple times, we have moved forward and expanded the application of Runtime Insights to several other critical areas, like Identity and Access Management (IAM) and Infrastructure as Code (IaC) security. For example, with Sysdig, you can easily restrict your users’ privileges to precisely what they need to perform their jobs effectively. (By the way, by monitoring and modeling actual cloud access patterns, we found that 90% of granted permissions are not in use). Additionally, you can reduce the surface area of your IaC definitions according to the behavior of your applications at runtime. And we are not done, as there are multiple innovations that we plan to unveil in this space. 

Our unwavering commitment to you is that we will continue to be the ones who advance the state of the art in cloud security. From shift left to shield right, you can’t secure the cloud without deep Runtime Insights. Sysdig is and will continue to be unmatched at that.

Real-Time Threat Detection in the Cloud
https://sysdig.com/blog/real-time-threat-detection-cloud/ | Wed, 09 Mar 2022

Organizations have moved business-critical apps to the cloud and attackers have followed. 2020 was a tipping point: the first year in which we saw more breaches and incidents involving cloud assets than on-premises ones. We know bad actors are out there; if you're operating in the cloud, how are you detecting threats?

Cloud is different. Services are no longer confined in a single place with one way in or one way out.

Traditionally, services have been deployed in data centers on servers that were close to each other, interconnected physically—and data had only one way in or out of that data center. Security was based around a perimeter; our realm was easily protected through firewalls like a medieval town surrounded by high, thick walls, limiting traffic and attacks through solid doors and defended through thin arrow slits.

[Image: Aerial view of Monteriggioni (https://en.wikipedia.org/wiki/Monteriggioni), by Maurizio Moro5153, July 14, 2020, Creative Commons BY 3.0 license.]

Nowadays, services are distributed and operate in environments with limited perimeters. Developers, operators and users are located all around the globe. Having all those users access services through a single location will impact productivity and the user experience. Your services are no longer confined to a single place where there’s only one way in or out. If before we compared our infrastructure to a medieval town, now it is more like an amusement park.

[Image: Amusement park rides at a fun fair, public domain (CC0), https://creativecommons.org/publicdomain/zero/1.0/deed.en]

An amusement park is full of attractions, with multiple entrances and exits and many more chances for actors to behave unexpectedly. A distributed infrastructure based on cloud technologies requires detecting threats from myriad sources. So many actors interacting in so many different ways increases the number of potential events and the amount of information that needs to be monitored.

Threat Detection: A Delicate Balance

A common approach to threat detection starts with shipping logs into a centralized repository, then searching for indicators of suspicious behavior or configuration changes that increase risk. That takes time—it’s like trying to identify a moving target. Copying logs outside the cloud and storing an extra copy can be an operational pain in the neck, and it’s expensive. And, most importantly, this approach delays the ability to detect and respond to threats.

Obviously, the closer monitoring tools are positioned to the source of an event, the better the response time can be. This could, however, add complexity and increase costs. Besides, there are still too many steps involved in this pipeline. Couldn’t this be improved somehow?

Inspect Logs in Real-Time with Stream Detection

What if instead of trying to guard a fortified town with a few well-defined entry points, we start thinking about how to watch the activity inside that amusement park? Imagine smart security cameras constantly on alert and looking for anomalous behavior, reacting accordingly and triggering alarms when necessary. Translated to cloud infrastructures, that means the more accurate way to monitor security in the cloud is through stream detection.

Stream detection is a continuous process that collects, analyzes and reports on data in motion. With a streaming detection process, logs are inspected in real-time. This real-time detection allows you to identify unexpected changes to permissions and services access rights as well as unusual activity that can indicate the presence of an intruder or, worst case, exfiltration of data. Based on that idea, the open source community offers a solution: Falco.

Falco is an open source runtime security tool, often described as a security camera for modern cloud infrastructure. Falco is an incubation-level project hosted by the Cloud Native Computing Foundation (CNCF). It was originally designed to watch workloads, so Falco focuses on collecting system calls from running endpoints, like hosts or containers running applications, gathering granular data at the source and understanding the details of what those applications do.

Obviously, not everything in our infrastructure is hosts and containers. Organizations also benefit from numerous external services offered by their cloud provider(s). Thankfully, the cloud provider also facilitates sharing of valuable information generated by each service, and that information is useful for monitoring. Here is where Falco behaves differently from conventional alternatives: Since the open source project is able to consume additional types of data, it can ingest and digest that information in real-time to generate alerts in the moment.

Consider how this works in an Amazon Web Services (AWS) environment, for example. Almost anything happening in AWS is tracked and logged in AWS’ version of cloud logs, called CloudTrail. By monitoring logs from CloudTrail, you can detect unexpected behavior, configuration changes, intrusions and data theft, not only from existing services but also from newly released ones. Connecting Falco to CloudTrail gives you the flexibility to manage your rules in one single place. Not having to store your logs externally also reduces bandwidth and storage costs.
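As an illustration of what such a detection can look like, the sketch below is written in the style of rules for the Falco CloudTrail plugin, alerting when someone stops a trail from logging. The `ct.*` field names assume that plugin is configured as an event source; treat the rule as a starting point rather than a drop-in copy of the official ruleset.

```yaml
# Illustrative rule in the style of the Falco CloudTrail plugin:
# alert when CloudTrail logging is stopped, a common step in covering tracks.
# Assumes the cloudtrail plugin is loaded and configured as an event source.
- rule: CloudTrail Logging Disabled (sketch)
  desc: A StopLogging API call was made against a CloudTrail trail
  condition: ct.name = "StopLogging" and not ct.error exists
  output: >
    CloudTrail logging disabled
    (user=%ct.user region=%ct.region source_ip=%ct.srcip)
  priority: WARNING
  source: aws_cloudtrail
  tags: [cloud, aws]
```

Because the rule is evaluated as the events stream in, the alert fires moments after the call is made, rather than hours later when a batch of logs is finally indexed elsewhere.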

Respond to Threats Faster

Time is a critical factor in reducing risk. Parsing logs in real-time and leveraging stream detection allows you to immediately detect suspicious activity and trigger an alert for further investigation.

The whole point of stream detection—and of projects like Falco—is to align an effective security approach with the realities of modern architectures. Doing so enables faster time to detection and equips security teams with evolving, community-driven innovation, all of which are critical for effective security with modern stacks. The reality is that the cloud is a new beast, and it enables a new approach to everything, including security. This is your opportunity to get it right!


Originally published on Security Boulevard

Sysdig Welcomes Gerald and the Wireshark Community
https://sysdig.com/blog/wireshark-creator-joins-sysdig/ | Thu, 13 Jan 2022

Today, I’m excited to announce that Gerald Combs, the original creator and lead maintainer of Wireshark, has joined Sysdig. In addition, Sysdig is becoming the primary sponsor of Wireshark.

As founder and CTO of Sysdig, I am involved in announcements and press releases on a daily basis. This one, however, has a special meaning for me. Gerald and I have been friends for a long time, starting when Wireshark was called Ethereal. At that time, a capture library that I developed while I was a university student in Italy, WinPcap, was used to port Ethereal to Windows. That was my first contribution to the project. A few years later, in 2006, after I moved to the United States, Gerald joined my first company, CACE Technologies. Together, we renamed Ethereal as Wireshark and created a business around it.

After CACE was successfully acquired, Gerald continued devoting his life to the mission of growing Wireshark and leading its incredible community. On my side, I shifted my attention to security, containers, and the cloud. I started Sysdig, working with a talented team on the creation of another open source tool, Falco. Since the beginning, my work at Sysdig has been heavily inspired by the “packet capture stack” that Gerald and I helped define: Wireshark, tcpdump, libpcap, BPF. One of the reasons why Sysdig’s instrumentation is universally considered the most accurate, rich, and scalable is that we built it on top of the ideas behind that stack, adapting them to the modern world of cloud and containers. Countless times, during Sysdig’s early days, we were inspired by Gerald’s work.

And now Gerald has joined us! This is, on one hand, a great pleasure, like a reunion of old friends. On the other hand, it opens up a universe of possibilities. Wireshark is an incredibly important tool. Its UI is part of the muscle memory of every software professional. Its feature set has saved our butts countless times. At the same time, the world is changing quickly. Software today runs in the cloud, orchestrated by Kubernetes. With the help of Gerald, Sysdig wants to invest in making Wireshark even more useful in modern cloud environments. We’ll work on expanding its feature set and make sure it remains the cornerstone of troubleshooting and security investigation, even when software is containerized and runs in the cloud.

We will do this with the highest respect for the community of developers, contributors, and users who are the true soul of Wireshark. Sysdig is committed to providing continuity to the project and to contributing to its ecosystem, starting with supporting the Sharkfest conference.

For the moment, Gerald, I want to welcome you to Sysdig. I look forward to revolutionizing another industry with you. :-)

To meet Gerald, take a look at his blog post for the Wireshark community.

How to Secure Kubernetes, the OS of the Cloud
https://sysdig.com/blog/how-to-secure-kubernetes-os-cloud/ | Thu, 30 Dec 2021

As infrastructures and workloads transition to the cloud and teams adopt a CI/CD development process, there is a new paradigm shift: infrastructure is becoming code. This approach of treating infrastructure as code (IaC) is incredibly powerful, brings us many advantages, and enables transformative concepts like immutability. We define infrastructures in a declarative way and version them using the same source code control tools (in particular git) that we use for our application code. Spinning up a new instance and deploying an operating system (i.e., spinning up a new container) now happen declaratively in git, and the result is deployed within the boundaries of the Kubernetes cluster.

This trend also puts more responsibility in developers' hands, as whoever controls Kubernetes and git controls the infrastructure, application, and policy. And since developers usually own the git repository and define their services as Yamls, in practice it has become their responsibility. All that is required is a change to a Yaml in a GitOps repo, and the pipeline will take it from there.

However, it also introduces a new category of risks. When our infrastructure is code, it can include bugs and oversights that become vulnerabilities, and it can also enable malicious actors to perform dangerous actions. Adopting IaC does not automatically mean more security; on the contrary, it could mean a bigger attack surface. As a result, security needs to evolve and take this new paradigm into account.

To start, we need to recognize that protecting our infrastructure definitions is imperative, just like protecting our code. And this requires a "shift left" in how we secure our infrastructures, identifying risks as they enter the git repository, ideally by enforcing a pull request check before the change is merged.
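One lightweight way to enforce that check is a CI job that scans IaC definitions on every pull request and fails on policy violations, blocking the merge. The sketch below uses GitHub Actions and Checkov purely as examples, with assumed directory paths; any pipeline and scanner pair that can block a merge implements the same idea.

```yaml
# Illustrative pull-request gate that scans IaC definitions before merge.
# GitHub Actions, Checkov, and the watched paths are example choices.
name: iac-security-check
on:
  pull_request:
    paths:
      - "k8s/**"
      - "terraform/**"
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Scan IaC definitions
        run: |
          pip install checkov
          checkov -d . --compact   # a failed check fails the job and blocks the PR
```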

Kubernetes Is the Operating System of the Cloud; Security Must Shift Further Right

Another undeniable industry trend is the use of containers and Kubernetes. Organizations big and small are adopting Kubernetes as the platform to run their applications. As the Linux operating system orchestrates processes on a physical or virtual piece of hardware, Kubernetes orchestrates services on a distributed computing pool.

As Linux grew into powering pretty much everything, from tiny IoT devices to supercomputers, we expect Kubernetes to gradually power all of the distributed software applications in the world. And, since the world is going to the cloud, we like to define Kubernetes as the operating system of the cloud.

Kubernetes offers a rich, sophisticated environment that makes scaling applications easy. At the same time, applications running on Kubernetes tend to be composed of tens, hundreds, or sometimes thousands of services that depend not only on each other, but on many different cloud resources.

This increases complexity and, with it, risk. This type of risk is not easily containable with “shift left” security tools that focus on code scanning and image scanning, because these static techniques cannot completely capture the explosion of combinations and the interdependent nature of microservice-based apps. Securing Kubernetes requires deep runtime security and strong runtime service context. It requires that we shift security right.

Shift Left? Shift Right?

So where does this leave us? Trends like IaC point to the need to shift left and validate infrastructure definitions as soon as they are committed to a git repository. The complexity of Kubernetes requires deep visibility and context at runtime. What is the correct approach?

The right answer, of course, is shift left and shift right!

With this approach, you have a single source of truth across IaC and runtime. When we talk about “source,” we are referring to both the physical source file stored in the git repo (Yamls, Terraforms, Helm charts, and/or Kustomizations), and the source for the runtime configurations eventually deployed to the Kubernetes cluster. Securing the former will have the desired direct impact on the latter, resulting in a more secure runtime environment. However, you must have runtime security in place to spot drift from the IaC source configuration file and detect anomalous activity.

Marrying IaC and runtime security creates opportunities to strengthen security even more. The context collected while the application runs (e.g., Which services talk to each other? Which users access the production environment?) can be used to inform policy creation. Policies can be automatically enforced across the infrastructure by leveraging gitops, making the runtime more secure. Manual work and human error are removed, and the attack surface is reduced at the same time.
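As a small example of that loop, the observed fact that only a frontend ever talks to a payments service can be turned into a declarative policy and committed to the same GitOps repo. The NetworkPolicy below is illustrative; the namespace, labels, and port are assumptions standing in for whatever the runtime context actually shows.

```yaml
# Illustrative policy derived from observed runtime behavior:
# only pods labeled app=frontend were seen connecting to the payments
# workload on port 8443, so that is all the policy allows.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-allow-frontend-only
  namespace: shop            # assumed namespace
spec:
  podSelector:
    matchLabels:
      app: payments          # assumed label from runtime observation
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 8443
```

Committing a policy like this to the GitOps repo closes the loop: runtime observation informs the declarative source of truth, and the pipeline enforces it everywhere.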

Security from Source Through Production

A truly complete platform for modern secure DevOps must combine these dimensions in a comprehensive and balanced way. Shifting security to the early stages of the development pipeline is a mandatory and widely accepted practice today, and we need to aggressively expand it to IaC. But this must be supported by strong and precise protection at runtime.

This is a virtuous cycle that can only happen when shift left and shift right support each other, which we are convinced is the future of security.


Originally published on The New Stack

Digging into AWS Fargate runtime security approaches: Beyond ptrace and LD_PRELOAD
https://sysdig.com/blog/aws-fargate-runtime-security-ptrace-ld_preload/ | Tue, 11 May 2021

Fargate offers a great value proposition to AWS users: forget about virtual machines and just provision containers. Amazon will take care of the underlying hosts, so you will be able to focus on writing software instead of maintaining and upgrading a fleet of Linux instances. Fargate brings many benefits to the table, including small maintenance overhead, lower attack surface, and granular pricing.

However, as with any cloud asset, leaving your AWS Fargate tasks unattended can lead to nasty surprises. If they are compromised, you may get a huge bill after someone uses your infrastructure for crypto mining. Or an attacker could perform cloud lateral movement to gain full access to your data.

Detecting issues like these requires deep visibility into the activity of your Fargate workloads, exactly the same visibility you would get when running applications in traditional infrastructures.

Unfortunately, lack of access to the operating system makes it very challenging to reach such a level of visibility in Fargate. Fortunately, Sysdig came up with the perfect solution to the problem!

But before talking about the solution, let’s describe the problem.

Current deep instrumentation approaches

When you run a runtime security product, commercial or Open Source, with sufficient granularity and support for containers, it’s almost guaranteed to employ one of these three main instrumentation techniques: ptrace, LD_PRELOAD, and kernel instrumentation.

These techniques make it possible to observe the system activity (file I/O, network, etc.) with a high degree of granularity (e.g., single access to disk, network payload content) and are necessary for true security.

When comparing these techniques, there are two important factors to consider: accuracy and overhead. Let’s learn more about the three techniques and how they compare with regards to these factors.

1. LD_PRELOAD

This technique consists of using an environment variable, LD_PRELOAD, to direct the operating system to load a different version of the libc dynamic library. Libc is used by the majority of programs to invoke kernel functions like open() or connect(). The version loaded with LD_PRELOAD may contain additional instrumentation to record every call to these functions and their parameters.

LD_PRELOAD is relatively efficient, but it's not accurate. That's because many programs, such as anything written in Go, don't use the libc dynamic library and call the kernel directly. LD_PRELOAD will be completely blind to these types of programs. Additionally, since LD_PRELOAD is a user-level technique, it can be fooled by a motivated attacker, for example by making system calls directly instead of going through libc.

Finally, replacing system libraries is fairly risky and can cause serious stability issues in the instrumented application.

2. ptrace & friends

ptrace is one of the most advanced (and complicated!) pieces of functionality exported by the Linux kernel. It makes it possible for a process with the right privileges to pause, introspect, and control another process. Also, it should be noted that in this context, when we use the term ptrace we include other user level introspection techniques, like inotify or fanotify. They have the same privilege requirements and overhead profile of ptrace. Several tools you use on a regular basis are based on ptrace. One of them, for example, is gdb, the GNU debugger.

ptrace and the other similar techniques are very accurate, but they are not efficient. They are accurate because they are language and stack independent (i.e., they can see the activity of Go programs). Also, the information is produced by the operating system kernel and the kernel can’t be fooled, at least not easily. However, all of these techniques involve multiple context switches to hand the data to the collection process, which makes them very slow.

When something is too slow, the solution is to collect less data to reduce overhead, but this makes it less accurate. Thus, ptrace is either too slow or not very accurate.

3. Kernel instrumentation

This technique consists of going down to the kernel, either with traditional (and more invasive) methods based on kernel modules, or with more modern, less invasive methods like eBPF, which is based on a virtual machine that can run code safely inside the kernel.

Kernel instrumentation is the most efficient and accurate way to collect system activity. It’s efficient because collecting the actions in the kernel guarantees the lowest overhead. It’s accurate because, as we know, “the kernel never lies.”

For this reason, Falco, Sysdig Secure, and Sysdig Monitor have been based on this technique. It is dramatically superior to any other approach, in terms of both accuracy and performance.

The problem with Fargate

The problem with Fargate instrumentation is easy to summarize: with no access to the host, kernel instrumentation is not available, so we have to choose between the LD_PRELOAD and ptrace-based approach. As a consequence, we have to sacrifice either accuracy, performance, or both. Or do we?

Introducing Sysdig’s advanced Fargate instrumentation

With the release of Fargate support in Sysdig Secure and Sysdig Monitor, Sysdig is coming to the rescue with a new patented technology that builds on top of the Open Source pdig framework and adds powerful optimizations that focus on the nature of serverless applications. The focus during the development of this solution was zero compromises: we wanted something that could offer the same data precision as kernel capture, full support for any type of executable (including Go programs!), and an overhead comparable to what you would experience in traditional containerized infrastructures.

The result? Let’s have the numbers tell us the story! The chart below shows the time it takes to run an I/O intensive workload (over 100k IOPS) in a Fargate container, with full accuracy (every I/O operation is recorded).

We can observe that, even in this demanding scenario, kernel instrumentation causes a minimal slowdown in the instrumented app. On the other hand, at least at full accuracy, ptrace’s impact is very substantial, with a slowdown of over 4X.

The Sysdig advanced Fargate instrumentation keeps the instrumentation overhead close to that of kernel instrumentation, while simultaneously capturing every I/O operation and supporting languages like Go.

Conclusions

If you are serious about Fargate and its security, Sysdig’s advanced Fargate instrumentation brings you the deep visibility you need to confidently run and secure your workloads without having to accept compromises. Sysdig just announced AWS Fargate runtime security based on this technology. Check it out by requesting a 30-day free trial. You’ll be set in just minutes!

Sysdig contributes Falco's kernel module, eBPF probe, and libraries to the CNCF
https://sysdig.com/blog/sysdig-contributes-falco-kernel-ebpf-cncf/ | Wed, 24 Feb 2021

Today, I’m excited to announce the contribution of the sysdig kernel module, eBPF probe, and libraries to the Cloud Native Computing Foundation. The source code of these components will move into the Falco organization and be hosted in the falcosecurity github repository.

These components are at the base of Falco, the CNCF tool for runtime security and de facto standard for threat detection in the cloud. They are also at the base of sysdig, the broadly adopted Open Source tool for container forensics and incident response.

This move is an important milestone. It means that from now on, all of the core components of the Falco stack will be part of the CNCF.

What is this exactly?

Let’s start with a diagram showing the main components at the base of Falco and Open Source sysdig:

Falco and sysdig operate on top of the same data source: system calls. This data source is collected using either a kernel module or an eBPF probe. The two methods are equivalent in functionality, but the kernel module is a tiny bit more efficient, while the eBPF approach is safer and more modern.

Before being consumed by sysdig or Falco, the collected data needs to be enriched (e.g., a file descriptor number needs to be resolved into a file name or an IP address). This is accomplished by two libraries, libsinsp and libscap.

The green boxes in the diagram above identify what was previously owned by the Cloud Native Computing Foundation. This includes the components that make Falco work, but not the ones that collect data. The reason for this separation is that the data collection modules were originally developed for sysdig and they stayed in its repository, while Falco (and other tools) treated them as external dependencies.

As a consequence of this donation, the diagram is changing this way:

libsinsp, libscap, the kernel module, and the eBPF probe have been relicensed and are now owned by the CNCF. They will live in an independent repository under the falcosecurity organization, and their licensing and governance will be guided by the CNCF community principles.

Why is this important?

Sysdig has been committed to Open Source since its inception and strongly believes the future of security is open. The core technologies behind our products have been available as Open Source software from day one. Today, we’re taking another major step toward ensuring that we live by our principles. In particular, we want to make sure that Falco is fully free and owned by the community.

This contribution completes the effort. It took a bit of time (Falco joined the CNCF in Oct. 2018) because it involved separating the components that were originally part of sysdig and making them independent.

What makes me really excited is that we are taking a set of extremely powerful building blocks and we're putting them in the hands of the Cloud Native community. Among other things, we are talking about the kernel module, the eBPF probe, and the libsinsp and libscap libraries.

These components are the perfect foundation for tools in runtime security, troubleshooting, incident response, forensics, and many other areas. I am convinced that the community will embrace them and will come up with some really cool tools on top of them. I, myself, have some interesting ideas that I want to show to the community soon. ;-)

How can you hack with this?

If you would like to find out more and hack on Falco, the falcosecurity GitHub organization is the place to start.

I hope you will have as much fun using it as we had building it.

And remember: PRs are welcome!
