Vulnerabilities are only one part of the cloud security story. Misconfigurations are still the biggest contributor to security incidents and should therefore be one of the greatest causes for concern in organizations. Addressing them is a key component of a strong Cloud Native Application Protection Platform (CNAPP).
According to Gartner®, “By 2023, 75% of security failures will result from inadequate management of identities, access, and privileges, up from 50% in 2020.” [1] Although many organizations are talking about zero trust principles, such as enforcing least privilege, our data shows little evidence of action. With this information, some questions begin to arise:
- Why do many companies still have misconfigurations today?
- Is it really that complicated for organizations to correctly configure access controls in the cloud?
- If frameworks and best practices exist, why are they not followed?
If you want to know more about IAM misconfigurations, vulnerability prioritization, supply chain attacks, or how many resources and how much money you are wasting right now in Kubernetes, don’t miss the opportunity to read Sysdig’s Cloud-Native Security and Usage Report.
What is the principle of least privilege?
Depending on your role, it is easy to assume that you will only have specific permissions that are necessary to perform your tasks. For example, if you are a developer within a project, you should only be able to view or edit code but not create a new project.
That’s what we call the principle of least privilege: any developer, security architect, or compliance expert should be able to do their work without blockers but should not be able to go beyond that scope. Don’t forget non-human accounts, which have their own authorizations and must also follow least privilege.
But keep calm. Cloud providers give you all the resources and services to maintain protected accounts, such as IAM, Policies, Roles, Users, Groups, and so on. There are also best practices that, if followed, will minimize risk as much as possible.
So where’s the problem?
The least privilege-cloud implementation disconnect
Why is there such a disconnect between intended secure design with least privilege and an actual cloud implementation? There is no simple answer, or answers, since many access control elements are involved and implementations vary between organizations. We’ve grouped the main causes into three categories for simplicity, in no particular order of importance.
Every developer or employee of the company is an exception
It can be hard to know who needs what permissions in an organization to effectively do their job.
The basics of IAM stress that you should group your users based on parameters, such as job title or department, and then attach only the access policies with the exact rules they need to perform their work. Typically, this is explained as role-based access control (RBAC), but the theory clashes with the realities of complex organizational structures where individuals do not work in isolated teams.
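As a minimal sketch of that grouping idea, the Python below maps roles to permission sets and resolves what a user may do from the roles they hold. The role names, permission strings, and helper functions are hypothetical and chosen only for illustration; they are not any cloud provider's API.

```python
# Hypothetical role-to-permission mapping; names are illustrative only.
ROLE_PERMISSIONS = {
    "developer": {"code:read", "code:write"},
    "project-admin": {"code:read", "project:create"},
}


def effective_permissions(roles):
    """Return the union of the permissions attached to each role."""
    perms = set()
    for role in roles:
        # Unknown roles contribute nothing (deny by default).
        perms |= ROLE_PERMISSIONS.get(role, set())
    return perms


def is_allowed(roles, permission):
    """True only if one of the user's roles grants the permission."""
    return permission in effective_permissions(roles)
```

With this mapping, a user holding only the `developer` role can edit code but cannot create a project, which mirrors the developer example given earlier.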
Think about cross-team efforts: an individual or group that contributes to various engineering teams, interacts with sales, and provides assistance to developers. Each of these efforts comes with permissions that ideally should be granted temporarily, only for as long as the project or support line exists. However, granting and revoking them every time requires fine-grained changes and adds a huge workload, leading to situations where permissions are given globally or permanently for simplicity’s sake.
In summary, users need some adjustments to their permissions to do their work, which has an unintended side effect of producing over-permissioned users.
In the end, DevOps and security teams tend to grant more permissions than needed, so teams can perform work without being inhibited and focus on business goals. Functionality and availability are usually paramount, sometimes at the expense of robust security. There is intense pressure on security teams to steer away from the “culture of no” as part of the DevOps movement. Scrutinizing why someone or something needs elevated permissions slows down business processes and application releases.
Evidence of this business reality and the resulting impact can be seen in the usage report, where Sysdig found that 90% of granted permissions are not used.
Furthermore, cloud vendors and their offerings are evolving incredibly fast year-over-year. The continuous addition of services, as well as changes to existing ones, complicates permissions further, leading to scaling challenges.
Identity and access management is challenging to scale
The first root cause leads us to the second. When everyone in the company becomes an exception to your perfect and compliant configuration because they need specific permissions to accomplish their work, policies change constantly, and managing those changes can require an entire team. Oftentimes, only the largest organizations have the luxury of staffing dedicated IAM teams, and even for them it can be a losing battle.
Managing permissions at the level of individual users and groups is too granular to sustain. As the number of users grows, the number of identities and access rules to maintain multiplies. Cloud resources add further complexity, and maintaining all of this manually makes it impossible to scale.
And it’s not only human or traditional users that must be addressed. Applications, cloud services, commercial tools, and many other entities (or machine identities) must be authenticated and authorized appropriately as well. Similar to how applications on your cell phone request permissions to your contacts, photos, camera, microphone, and more, these machine identities request and require permissions to your environment. For this reason, we must also consider access management for these non-human entities.
Challenging these types of accounts for authentication is different from challenging traditional users, and doing so can break system integrations or automations.
It is clear that companies face this challenge when scaling their identity management: they must be precise yet flexible in allowing activity while maintaining the principle of least privilege across every account, group, and policy they manage. These constraints are also foundational to the zero trust architecture that many organizations are pursuing.
Visibility and analysis of access controls are poor
Why do so many granted permissions never get used? At the very least, organizations need visibility into user accounts, non-human identities, and their relevant permissions.
- More than 98% of permissions granted to non-human identities have not been used for at least 90 days.
Oftentimes, these unused permissions are granted to orphaned identities, such as expired test accounts or third-party accounts. This can also happen because there is no direct 1:1 mapping with classic identity managers, such as LDAP, when migrating to the cloud, or because there are multiple authentication systems. In addition, many systems mix discretionary (DAC), mandatory (MAC), and role-based (RBAC) access control models that are not aligned with one another.
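One way to get basic visibility into unused permissions is to compare each permission's last-use time against a 90-day window. The sketch below assumes you can already export, per identity, a mapping from granted permission to last-use timestamp (or `None` if never used); the function name and data shape are hypothetical.

```python
from datetime import datetime, timedelta


def unused_share(last_used, now, window_days=90):
    """Fraction of granted permissions not used for at least window_days.

    last_used maps each granted permission to the datetime of its last
    use, or None if the permission has never been used.
    """
    if not last_used:
        return 0.0
    cutoff = now - timedelta(days=window_days)
    stale = [p for p, ts in last_used.items() if ts is None or ts <= cutoff]
    return len(stale) / len(last_used)


# Illustrative data for a single non-human identity.
grants = {
    "s3:GetObject": datetime(2022, 12, 20),
    "s3:DeleteObject": None,
    "iam:PassRole": datetime(2022, 6, 1),
    "ec2:StartInstances": datetime(2022, 9, 1),
}
share = unused_share(grants, now=datetime(2023, 1, 1))
```

In this made-up example, three of the four grants fall outside the window, so `share` is 0.75.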
We should assume that excessive permissions increase the likelihood of a security incident or breach. We should always reduce permissions as much as possible, regardless of identity type, and understand the risk we assume when our security is deficient. The OWASP Kubernetes Top 10 can help by ranking the most likely risks you are going to face.
We took a closer look at all of Sysdig’s customer accounts with administrator permissions and calculated relative risk scores. The risk scores represent the percentage of a customer’s cloud accounts with poor security hygiene. We set the parameters as accounts with administrator access, no multi-factor authentication (MFA) enabled, and account inactivity of 90+ days. These are conveniences attackers look for: they provide easier account access with elevated permissions and a reduced chance of detection, since normal activity on these accounts may not be captured in defensive detection alerts.
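To make the scoring idea concrete, here is a minimal sketch of a percentage-based risk score. It assumes an account counts as risky only when all three parameters hold at once (administrator access, no MFA, 90+ days inactive); how the criteria were actually combined in the report is not stated, and the `Account` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Account:
    name: str
    is_admin: bool
    mfa_enabled: bool
    days_inactive: int


def is_risky(acct, inactivity_days=90):
    """Assumption: all three hygiene criteria must hold together."""
    return (
        acct.is_admin
        and not acct.mfa_enabled
        and acct.days_inactive >= inactivity_days
    )


def risk_score(accounts):
    """Percentage of accounts that meet the risky-account criteria."""
    if not accounts:
        return 0.0
    return 100.0 * sum(is_risky(a) for a in accounts) / len(accounts)


accounts = [
    Account("root-legacy", is_admin=True, mfa_enabled=False, days_inactive=200),
    Account("ops-admin", is_admin=True, mfa_enabled=True, days_inactive=120),
    Account("dev-1", is_admin=False, mfa_enabled=False, days_inactive=400),
    Account("sre-admin", is_admin=True, mfa_enabled=False, days_inactive=3),
]
score = risk_score(accounts)
```

Only `root-legacy` satisfies all three conditions here, so `score` is 25.0 for this illustrative set.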
Best strategies to mitigate unnecessary risks
Our research shows that, although there is awareness of IAM processes, tooling, and zero trust approaches, appropriate access control in the cloud still lags behind the fast pace of cloud adoption.
Let’s look at the best strategies for tackling the challenges mentioned above.
Find an appropriate balance for permissions exceptions
As a company, you invest resources and time in creating the most accurate policies, groups, and roles to attach to your users. You keep a full inventory and monitor the application of these policies to avoid human error but, as we mentioned before, the problem starts when a user wants or needs more permissions to finish their tasks. Should you allow any exceptions at all? This is an important question to ask yourself. If you want to ensure that configuration errors are avoided, allowing no exceptions is your best solution. Unfortunately, in the real world, availability is the first priority and security is secondary. Such a strict design choice need not be applied uniformly, either: you may employ stricter access control only in mission-critical or business-critical environments, or in those that are more exposed, such as cloud tenants.
This opens up the discussion on how to manage every exception in your environment: you should have them all inventoried, no matter the scale; follow the lifecycle of your permissions on your users, focusing on the most sensitive administrators or policies; and finally, accept the risk every time you accept a change.
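An exception inventory can start as a simple structured record per exception. The sketch below is one possible shape, with hypothetical field names: each entry records who holds the exception, why, who accepted the risk, and when it should be reviewed or removed.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PermissionException:
    identity: str          # user or non-human account holding the exception
    permission: str        # the extra permission that was granted
    justification: str     # why the exception was needed
    risk_accepted_by: str  # who signed off on the risk
    expires: date          # when the exception must be reviewed or removed


def expired(inventory, today):
    """Return the exceptions whose lifetime has already ended."""
    return [e for e in inventory if e.expires < today]


inventory = [
    PermissionException("alice", "db:admin", "migration support",
                        "security-lead", date(2023, 1, 15)),
    PermissionException("ci-bot", "registry:push", "release pipeline",
                        "platform-lead", date(2023, 6, 30)),
]
overdue = expired(inventory, today=date(2023, 2, 1))
```

Running a check like `expired` on a schedule is one way to follow the lifecycle of each exception rather than letting it become permanent.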
Promote collaboration between IAM and IT teams
So, how do you manage this identity and access management at scale?
It is not just a matter of assigning a person or a team dedicated exclusively to the governance of permissions (managing and provisioning resources such as Policies, Groups, Roles, etc.). This will help, but unfortunately it is not a perfect solution. It does not solve the scalability problem if everything depends on a single point or if your organization does not have the resources to maintain a dedicated team.
A possible solution, then, relies on collaboration and ownership across teams to maintain granularity in the assignment of permissions to accounts and non-human users. Each team (engineers, business teams, platform ops, etc.) knows its area of work and the minimum resources it needs. IT teams, or IAM teams if they exist, should follow the principle of least privilege using the controls that cloud providers offer.
Another solution is self-service: keep strict policies in place and let users obtain permissions on a temporary basis to minimize risk. Otherwise, people will over-permission and access controls may degrade. Automated solutions like this could solve the big problem of manual requests and approvals, which add complexity to identity and access management.
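The self-service idea can be sketched as a request function that enforces a strict policy and returns a time-boxed grant. The allow-list, the maximum duration, and the function names below are assumptions made for illustration; a real implementation would sit on top of your cloud provider's own temporary-credential mechanisms.

```python
from datetime import datetime, timedelta

# Hypothetical policy: what may be self-served, and for how long.
SELF_SERVICE_ALLOWED = {"logs:read", "db:read"}
MAX_DURATION = timedelta(hours=8)


def request_grant(identity, permission, duration, now):
    """Return a temporary grant, or raise if the policy forbids it."""
    if permission not in SELF_SERVICE_ALLOWED:
        raise PermissionError(f"{permission} requires manual approval")
    if duration > MAX_DURATION:
        raise ValueError("requested duration exceeds policy maximum")
    return {
        "identity": identity,
        "permission": permission,
        "expires_at": now + duration,
    }


def is_active(grant, now):
    """A grant stops being valid once its expiry time is reached."""
    return now < grant["expires_at"]
```

Because every grant carries its own expiry, nothing needs to be revoked by hand, which is the property that keeps temporary access from quietly becoming permanent.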
Individuals working in tandem with other organizational teams will be better equipped to satisfy compliance with cloud security best practices, such as CIS Benchmarks or well-architected frameworks.
Reduce time waste and lean on runtime detection
Last but not least, you need to maintain awareness of any changes made to permissions and permissions managers through the use of detections. Otherwise, you may be missing an adversary’s initial access and allowing compromise risk to increase.
Traditionally, organizations try to dump all log data from applications and services (e.g., in data lakes) and use tools such as SIEMs to detect threats, investing resources in minimizing mean time to detection (MTTD). We consider it necessary to speed up the detection of any changes in your permissions manager and cloud tenants, which calls for runtime security capabilities that restrict inappropriate changes on the fly or revert to the default configuration based on the usual behavior of your users. Additionally, this reduces the infrastructure cost of keeping threat detection systems working efficiently.
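At its simplest, detecting permission changes means comparing the current state against a known-good baseline. The following sketch assumes both are available as mappings from identity to a set of permission strings; the function name and data shape are hypothetical, and a real runtime tool would work from audit events rather than periodic snapshots.

```python
def permission_drift(baseline, current):
    """Return permissions added or removed per identity versus baseline."""
    drift = {}
    for identity in baseline.keys() | current.keys():
        before = baseline.get(identity, set())
        after = current.get(identity, set())
        added, removed = after - before, before - after
        if added or removed:
            drift[identity] = {"added": added, "removed": removed}
    return drift


baseline = {
    "ci-bot": {"s3:GetObject"},
    "alice": {"ec2:DescribeInstances"},
}
current = {
    "ci-bot": {"s3:GetObject", "iam:PassRole"},
    "alice": {"ec2:DescribeInstances"},
}
drift = permission_drift(baseline, current)
```

Here `drift` flags only `ci-bot`, which gained `iam:PassRole`; an alert or an automatic revert to the baseline could then be triggered from that result.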
IAM and IT teams should also remove unused test accounts wherever possible to prevent initial access opportunities and reduce the attack surface. While this can be tedious to determine manually, policies can be defined automatically based on the behavior of your users and an analysis of the permissions they typically use. This information can be used to generate baselines, which are ideally codified (for example, as policy-as-code) and then enforced. Such an “in-use” permission policy works as a filter, automatically generates recommendations, and can make this process more efficient.
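A behavior-based recommendation can be reduced to a set comparison between what an identity was granted and what it was observed using. The sketch below is a simplified illustration under that assumption; the function name and output format are hypothetical and do not represent a specific product's policy format.

```python
def recommend_policy(granted, used):
    """Split granted permissions into those to keep and those to remove.

    Only permissions that were both granted and observed in use are kept;
    everything else granted is recommended for removal.
    """
    return {
        "keep": sorted(granted & used),
        "remove": sorted(granted - used),
    }


granted = {"s3:GetObject", "s3:PutObject", "s3:DeleteObject", "iam:PassRole"}
used = {"s3:GetObject", "s3:PutObject"}
recommendation = recommend_policy(granted, used)
```

The `keep` list can serve as the codified baseline, while the `remove` list is the recommendation a reviewer would confirm before anything is actually revoked.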
These recommendations follow established security best practices and should be part of your overall cybersecurity program. Adhering to them will benefit your risk posture.
Final thoughts
We’ve scratched the surface of why it is so difficult to manage identities in the cloud and how organizations commonly misconfigure access controls and over-permission identities. You can use the guidance and suggestions provided here to improve your permissions management practices.
Identity and access management is an incredibly dense topic that underpins most if not all security program approaches. Clearly, IAM gets more complicated in cloud environments. Practitioners know they should follow least privilege principles and pursue zero trust initiatives, but the answer to “how?” is nuanced. We believe that effective IAM in the cloud is a huge challenge for any organization as highlighted by the statistics in Sysdig’s usage report, and optimal strategy isn’t one-size-fits-all.
[1] Gartner, Best Practices for Optimizing IGA Access Certification, Gautham Mudra, 4 April 2022. Gartner is a registered trademark and service mark of Gartner, Inc. and/or its affiliates in the U.S. and internationally and is used herein with permission. All rights reserved.