Why the Capital One Breach Will Not Deter the Unstoppable Growth of Public Cloud

Sandy Bird

July 31, 2019

The Capital One loss of credit card applications for 106 million people is a sobering reminder that complexity can overwhelm even the most sophisticated cloud teams. Yes, vulnerabilities must be patched, and misconfigurations do happen. However, when a vulnerability or misconfiguration in public cloud is exploited, the impact can have a nasty blast radius. The flexibility and dynamism of cloud open up a plethora of connections between users, roles, serverless functions, containers, VMs, critical data resources, and other PaaS resources. In today’s world, excess privilege and access fail at scale.

Here at Sonrai Security, we firmly believe that public cloud will be more secure than enterprise data centers. But in public cloud, defense in depth is not achieved with multiple layers of network firewalls and similar controls. Instead, it is achieved through:

  • Identity at the center, with least-privilege identity controls
  • Multiple controls on data access (authentication + network ACLs + conditions; see the sketch after this list)
  • Robust continuous monitoring to ensure ‘Zero-Trust’ is not compromised
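
For example, the “multiple controls on data access” point can be expressed as layered S3 controls: block public access on the bucket, then add a bucket policy condition that denies reads arriving from anywhere other than an approved VPC endpoint. The sketch below uses boto3 with hypothetical bucket and VPC endpoint names; treat it as an illustration of layering, not a complete data-access policy.

```python
import json
import boto3

# Hypothetical names for illustration only: the bucket and VPC endpoint ID
# are placeholders, not details from the Capital One incident.
s3 = boto3.client("s3")
BUCKET = "example-spii-bucket"

# Layer 1: block any public ACL or public bucket policy outright.
s3.put_public_access_block(
    Bucket=BUCKET,
    PublicAccessBlockConfiguration={
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    },
)

# Layer 2: deny reads that do not arrive through an approved VPC endpoint,
# regardless of what some IAM role elsewhere happens to allow.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyReadsOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                f"arn:aws:s3:::{BUCKET}",
                f"arn:aws:s3:::{BUCKET}/*",
            ],
            "Condition": {
                "StringNotEquals": {"aws:sourceVpce": "vpce-0123456789abcdef0"}
            },
        }
    ],
}
s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))
```

In practice you would carve out break-glass and administrative paths before applying a deny like this. The point is that the data store enforces its own conditions on top of whatever the caller’s identity policy grants.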

Here is how you can achieve this:

Minimize privilege and access
Roles and policies with ‘read-data’ access (not to be confused with ‘list’ or ‘describe’) across data stores like S3 might not seem over-privileged, but on specific buckets (or any data store that contains SPII, sensitive personally identifiable information) ‘read-data’ access is more sensitive than the ability to write to the data store. The Capital One case shows this clearly: the WAF-Role was configured with enough permission to enumerate and read the data in 700 buckets, which shows that resource restrictions were not configured on the role. On the data store side, a critical control on any bucket (especially those holding SPII) is to require ACLs and conditions that further restrict access.
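
On the identity side, the fix for an over-broad role is a policy whose Resource element names only the buckets the workload actually needs. Here is a minimal boto3 sketch, with a hypothetical role and bucket name (not the actual Capital One configuration):

```python
import json
import boto3

iam = boto3.client("iam")

# Hypothetical role and bucket names. The point is the Resource element,
# which limits read-data actions to one named bucket instead of granting
# s3:Get*/s3:List* across every bucket in the account.
scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOnlyOnOneBucket",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::app-uploads-bucket",
                "arn:aws:s3:::app-uploads-bucket/*",
            ],
        }
    ],
}

iam.put_role_policy(
    RoleName="waf-proxy-role",
    PolicyName="scoped-s3-read",
    PolicyDocument=json.dumps(scoped_policy),
)
```

Pairing a scoped role like this with the bucket-side controls above means an attacker who compromises the role still cannot enumerate or read every bucket in the account.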

Minimize access paths
With cloud, it is not unusual for users, compute, containers, and serverless functions to have access to critical data along many different permission paths:

  • A user could be a member of a group
  • A serverless function may be able to assume a role
  • Through role assumption, a user might gain access to another account

And the list goes on. The Capital One issue seems to have leveraged a direct path (external IP -> configured role -> APIs), but given time, an attacker will explore every possible path to escalate privilege. Organizations must model these paths to understand the actual blast radius of any resource, identity, or role. Security architecture teams must be relentlessly hard on DevOps teams that create too many identities workloads can ‘assume.’
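
Modeling these paths is essentially a graph problem: identities and resources are nodes, and group membership, role assumption, and policy grants are edges. Here is a toy sketch of that idea in Python, with entirely hypothetical identities and a single data resource:

```python
from collections import defaultdict

# Toy model of an account: an edge means "can reach" (group membership,
# sts:AssumeRole, or a policy granting data access). All names are made up.
edges = defaultdict(list)
edges["user:alice"] += ["group:developers"]
edges["group:developers"] += ["role:ci-deploy"]
edges["lambda:report-gen"] += ["role:ci-deploy"]
edges["role:ci-deploy"] += ["role:cross-account-admin"]      # role chaining
edges["role:cross-account-admin"] += ["s3:customer-applications"]

def access_paths(start, target, path=None):
    """Enumerate every path from an identity to a data resource."""
    path = (path or []) + [start]
    if start == target:
        yield path
        return
    for nxt in edges[start]:
        if nxt not in path:          # avoid cycles
            yield from access_paths(nxt, target, path)

for identity in ("user:alice", "lambda:report-gen"):
    for p in access_paths(identity, "s3:customer-applications"):
        print(" -> ".join(p))
```

Real tooling has to build this graph from IAM, STS, and resource policies, but even the toy version makes the point: the number of reachable paths, not the number of roles, defines the blast radius.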

Baseline trust
You have tens of thousands of compute instances, thousands of data stores, hundreds of cloud accounts, and countless agile dev teams. To use public cloud the right way, you must verify ‘least privilege’ by observing actual activity to uncover unused trust relationships that could allow escalation. Understanding trust relationships requires more than looking at roles in isolation. Baselining platform configuration is important but insufficient; baselining trust relationships illuminates your blast radius and helps you reduce it.
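
One readily available signal in AWS is IAM “access advisor” data, which reports when a principal last authenticated to each service; permissions that have never been exercised are candidates for removal. A minimal boto3 sketch, assuming a hypothetical role ARN:

```python
import time
import boto3

iam = boto3.client("iam")

# Hypothetical role ARN. Access advisor data is one signal for spotting
# granted-but-never-used trust on a role.
role_arn = "arn:aws:iam::123456789012:role/waf-proxy-role"

job = iam.generate_service_last_accessed_details(Arn=role_arn)
while True:
    details = iam.get_service_last_accessed_details(JobId=job["JobId"])
    if details["JobStatus"] != "IN_PROGRESS":
        break
    time.sleep(2)

for svc in details.get("ServicesLastAccessed", []):
    if "LastAuthenticated" not in svc:
        # The role is allowed to use this service but never has:
        # a candidate for removal.
        print(f"never used: {svc['ServiceNamespace']}")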

Continuously monitor (beyond S3)
S3 and other object stores continue to be a focus, but they are just the tip of the iceberg. If an EC2 instance can access a DynamoDB table but has never done so before, you need to know about it. In their defense, Capital One had extensive logs and audit data, but we have seen many organizations where object-level access auditing isn’t even enabled on buckets. Monitor all data access and identity activity, with alarms for unusual behavior. DynamoDB, ElastiCache, HashiCorp Vault, and many other platform services also hold critical data and must be continuously monitored.
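
As a starting point, object-level access on sensitive buckets can be captured by adding an S3 data-event selector to a CloudTrail trail; management events alone will not record individual GetObject calls. A minimal boto3 sketch, with a hypothetical trail and bucket name:

```python
import boto3

cloudtrail = boto3.client("cloudtrail")

# Hypothetical trail and bucket names. This adds an S3 data-event selector so
# object-level reads (e.g. GetObject) on the bucket land in the audit trail.
cloudtrail.put_event_selectors(
    TrailName="org-audit-trail",
    EventSelectors=[
        {
            "ReadWriteType": "All",
            "IncludeManagementEvents": True,
            "DataResources": [
                {
                    "Type": "AWS::S3::Object",
                    # Trailing slash = all objects in this bucket.
                    "Values": ["arn:aws:s3:::example-spii-bucket/"],
                }
            ],
        }
    ],
)
```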

Flat compliance and configuration checks are not good enough
Polling cloud APIs for configuration without understanding access patterns does not give the full view needed to reduce permissions to a least-privilege model. Flat compliance checks don’t provide enough information to understand how a workload relates to other resources, or how an identity with permission to use a key can then access data in another location. It is nearly impossible to know who or what can access a piece of data without modeling organization service control policies, decoding and normalizing all of the possible cloud permissions, and understanding resource statements, conditions, permissions boundaries, group memberships, ACLs, and much more. Doing a few API calls to regex-match strings in JSON might get you past an auditor, but understanding how an attacker is going to exploit a workload requires a much deeper knowledge of cloud platforms.
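
One small step beyond string matching is to ask the platform to evaluate effective access. For example, the IAM policy simulator resolves a principal’s identity-based policies rather than the raw JSON text. A hedged boto3 sketch with hypothetical ARNs:

```python
import boto3

iam = boto3.client("iam")

# Hypothetical ARNs. Instead of regex-matching policy JSON, ask IAM to
# evaluate whether this principal's identity-based policies actually allow
# the action against the resource.
response = iam.simulate_principal_policy(
    PolicySourceArn="arn:aws:iam::123456789012:role/waf-proxy-role",
    ActionNames=["s3:GetObject"],
    ResourceArns=["arn:aws:s3:::example-spii-bucket/*"],
)

for result in response["EvaluationResults"]:
    # EvalDecision is "allowed", "implicitDeny", or "explicitDeny".
    print(result["EvalActionName"], result["EvalDecision"])
```

Even this only covers part of the picture: resource-based policies, for example, must be supplied separately. Answering “who can access this data” for real requires modeling the whole permission chain, not spot-checking strings.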

So what does this all mean (the good news)?  

With public cloud security, aspirations can be grand. The task is not just to find mechanisms to ‘secure the cloud’ but to reimagine security so that the result is far superior to a traditional data center and enterprise network. This is definitely possible with public cloud. Set up appropriately, public cloud is remarkably well-instrumented. By coupling this visibility with access and privilege modeling and continuous monitoring, you can achieve a ‘zero-trust’ security model that was not possible before.