AWS IAM Roles — Why AdministratorAccess Cost $47K
Lambda with AdministratorAccess: full account takeover in 4 minutes.
20+ years shipping production infrastructure and CI/CD at scale. Everything here is grounded in real deployments.
- IAM controls who can do what in your AWS account — Users (humans), Roles (machines), Groups (management containers)
- Every Role has two separate policies: Trust Policy (who can assume it) and Permission Policy (what it can do) — both must be correct or you get AccessDenied
- AWS IAM is default-deny — if a permission is not explicitly granted, it is blocked. Explicit Deny always wins over Allow, regardless of what any other policy says.
- Roles issue short-lived STS tokens (15min–12hr) that auto-rotate — leaked credentials expire in hours, not forever
- Use IAM Access Analyzer to generate least-privilege policies from actual CloudTrail data — deploy broad in staging, tighten in production
- Biggest mistake: using AdministratorAccess on Lambda or EC2 roles 'just to make it work' — a single compromised function gets full account access in seconds
Imagine your AWS account is a giant office building. IAM is the security desk at the front door — it decides who gets a key card, which floors they can visit, and whether they can open the filing cabinets once they get there. A contractor (an EC2 instance) might get a temporary badge that expires at 5pm, while a full-time employee (a developer) gets access to their own floor but cannot wander into the CEO's office. The key insight most people miss: there are two separate questions the security desk asks. First, are you allowed through the front door at all? That is the Trust Policy. Second, once you are inside, which rooms can you enter? That is the Permission Policy. Both gates must open or you are not getting in.
Most significant AWS breaches trace back to the same root cause: IAM was misconfigured or ignored. Exposed S3 buckets, compromised Lambda functions, leaked credentials on GitHub: all are IAM failures at their core. IAM is not a niche security topic you deal with after everything else is working. It is the authorization layer every other AWS service relies on, and getting it wrong is how companies make the news.
IAM introduced fine-grained, programmable permissions to replace the blunt root-account-or-nothing model. You can now specify that 'this Lambda function may read from exactly one S3 bucket and nothing else.' That specificity is the difference between a contained incident and a company-ending breach. The controls exist. The question is whether your team uses them.
The three concepts to master: Users (human identities with permanent credentials), Roles (machine identities with temporary STS tokens that auto-expire), and Policies (JSON documents defining what is allowed or denied). Getting the relationship between these three right — and specifically understanding that Trust Policies and Permission Policies are completely separate documents with completely different jobs — is what separates a secure architecture from a ticking time bomb.
In 2026, with AWS IAM Identity Center having replaced the old AWS SSO console and OIDC-based authentication now the standard for CI/CD pipelines, the era of long-lived Access Keys for automation should be over. The patterns in this guide reflect where the security bar actually sits today, not where it was in 2019.
Why AdministratorAccess Cost $47K
AWS IAM (Identity and Access Management) is the service that controls who can do what in your AWS account. At its core, IAM defines identities (users, groups, roles) and attaches policies that grant or deny actions on resources. The fundamental mechanic: every API call is evaluated against all applicable policies. If any policy explicitly denies the action, the call fails; otherwise the call succeeds only if a policy allows it and no guardrail such as an SCP or permissions boundary excludes it. This is the least-privilege enforcement point, and it is evaluated in real time for every request.
In practice, IAM roles are the most critical construct. A role is an identity you assume temporarily, not a long-lived user. Roles have no static credentials. Instead, AWS STS (Security Token Service) issues temporary security tokens for them, valid for up to 12 hours. This removes long-lived access keys from the workload, so a leaked credential stops working when the session expires. Roles are assumed by trusted entities: EC2 instances, Lambda functions, other AWS services, or even users from another account. The trust policy defines who can assume the role; the permission policy defines what they can do once assumed.
Use roles for any workload that needs AWS access — EC2 instances, Lambda functions, ECS tasks, or cross-account access. Never embed long-term access keys in code or configuration files. The real-world impact: a single over-permissive role (like AdministratorAccess) attached to an EC2 instance can lead to a $47K bill in hours if that instance is compromised and used to spin up expensive resources. Roles are the mechanism to enforce least privilege and contain blast radius.
Users, Groups, and Roles — Picking the Right Identity Tool
The most common IAM confusion comes from mixing up three distinct concepts that look superficially similar but serve completely different architectural purposes.
IAM User represents a human or legacy script requiring long-term credentials — a password for the console and an Access Key and Secret Key pair for programmatic access. The modern rule is stark: machines should almost never be IAM Users. A User's Access Key does not expire. If it leaks, it works until someone manually revokes it.
IAM Group is a management container for Users. Instead of attaching policies to ten developers individually, you attach them once to a BackendEngineers group. Groups have no credentials of their own and cannot be assumed by services — they are purely an administrative convenience, not a security boundary.
IAM Role is the most important tool in your toolkit. A Role has no long-term credentials whatsoever. Instead, it is assumed temporarily by a principal — an EC2 instance, a Lambda function, a CI/CD pipeline, or a cross-account service. When something assumes a Role, AWS STS issues a set of temporary credentials (access key, secret key, and session token) that expire automatically between 15 minutes and 12 hours. That auto-expiry is your primary defence against credential leaks: a leaked temporary credential becomes useless within hours, not years.
In 2026 the modern standard is unambiguous: AWS IAM Identity Center (formerly SSO) for all human access, Roles for all machine and service identities, and OIDC-based federation for CI/CD pipelines. Permanent Access Keys on developer machines should be a red flag in any security review, not a routine practice.
IAM Policies Deep-Dive — Trust Policies vs Permission Policies
IAM policies are JSON documents that define permissions, but there are two completely different types of policies in play on every Role, and confusing them is the source of most AccessDenied errors and most of the time spent debugging them.
The Permission Policy answers the question: what actions can this identity perform? It is attached to a User or Role and defines the allowed or denied API calls, the resources those calls can target, and the conditions under which the permission applies. This is what most people think of when they hear 'IAM policy.'
The Trust Policy answers a completely different question: who is allowed to assume this role? It lives exclusively in the AssumeRolePolicyDocument on the Role resource itself and is evaluated before the permission policy is even consulted. If your Trust Policy does not list lambda.amazonaws.com as a trusted principal, no Lambda function can assume that role — it does not matter how many permissions the role has. The role is simply unavailable to that service.
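To make the split concrete, here is a minimal sketch of the two documents for a Lambda role, written as Python dictionaries. The bucket name example-reports-bucket and the helper trusted_services are illustrative assumptions, not part of any AWS SDK.

```python
import json

# Trust policy: answers "who may assume this role?"
# It is stored in the role's AssumeRolePolicyDocument.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permission policy: answers "what may the role do once assumed?"
# Note that it has a Resource but no Principal.
permission_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadOneBucket",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-reports-bucket/*",  # hypothetical bucket
        }
    ],
}


def trusted_services(policy):
    """Return the service principals that a trust policy allows to assume the role."""
    services = set()
    for stmt in policy["Statement"]:
        principal = stmt.get("Principal", {})
        if stmt["Effect"] != "Allow" or not isinstance(principal, dict):
            continue
        svc = principal.get("Service", [])
        services.update([svc] if isinstance(svc, str) else svc)
    return services


print(json.dumps(trust_policy, indent=2))
print(trusted_services(trust_policy))
```

If lambda.amazonaws.com were missing from the set this helper returns, no amount of permissions in the second document would let a Lambda function use the role.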
The evaluation order when an API call hits AWS: first, can this principal assume the role at all? (Trust Policy). Second, is there an explicit Deny anywhere in the policy stack? (any SCP, Permission Boundary, or policy with Deny wins immediately). Third, is there at least one Allow for this specific action on this specific resource? (Permission Policy, Resource Policy). If the answer to the second question is yes, the request is denied regardless of any Allows. If the answer to the first or third question is no, the request is also denied. Default deny means that every request starts at denied and must earn its way to allowed through explicit grants.
Understanding this evaluation chain is what makes you fast at debugging access issues. Most engineers start at the permission policy and work outward. The right approach is to start at the Trust Policy and work forward through the chain.
- Gate 1: Trust Policy — can this principal assume the role at all? If no, the request never reaches the other gates.
- Gate 2: SCPs — does the AWS Organization allow this action in this account? An SCP Deny is absolute and cannot be overridden.
- Gate 3: Permission Boundary — does this action fall within the role's maximum ceiling? A Boundary Deny overrides any Allow in the Permission Policy.
- Gate 4: Permission Policy — does the role's identity-based policy explicitly allow this action on this resource?
- Gate 5: Resource Policy — does the target resource (S3 bucket, KMS key, SQS queue) allow access from this role? Required for cross-account access.
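The gates above can be approximated in a few lines of Python. This is a deliberately simplified sketch for same-account requests: it models only SCPs, an optional permission boundary, and the identity policy, and it ignores the trust policy gate, resource policies, session policies, and Condition blocks. Wildcard matching uses fnmatchcase as a stand-in for IAM's own pattern matching, and every name here is an assumption made for illustration.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM lets Action/Resource be a string or a list; normalise to a list."""
    return [value] if isinstance(value, str) else list(value)


def _matches(stmt, action, resource):
    """True when the statement's Action and Resource patterns cover the request."""
    # Action names are compared case-insensitively; resource ARNs are not.
    action_ok = any(fnmatchcase(action.lower(), p.lower()) for p in _as_list(stmt["Action"]))
    resource_ok = any(fnmatchcase(resource, p) for p in _as_list(stmt["Resource"]))
    return action_ok and resource_ok


def _effects(statements, action, resource):
    """Collect the effects of every statement that matches the request."""
    return {s["Effect"] for s in statements if _matches(s, action, resource)}


def is_allowed(action, resource, identity, scps=None, boundary=None):
    """Explicit Deny in any layer wins; otherwise every layer present must Allow."""
    layers = [identity] + [layer for layer in (scps, boundary) if layer is not None]
    if any("Deny" in _effects(layer, action, resource) for layer in layers):
        return False
    # Default deny: a layer with no matching Allow blocks the request.
    return all("Allow" in _effects(layer, action, resource) for layer in layers)


identity = [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}]
boundary = [{"Effect": "Allow", "Action": "s3:GetObject",
             "Resource": "arn:aws:s3:::example-bucket/*"}]
scps = [
    {"Effect": "Allow", "Action": "*", "Resource": "*"},
    {"Effect": "Deny", "Action": "s3:DeleteObject", "Resource": "*"},
]
key = "arn:aws:s3:::example-bucket/report.csv"

print(is_allowed("s3:GetObject", key, identity, scps, boundary))   # True
print(is_allowed("s3:PutObject", key, identity, scps, boundary))   # False: outside the boundary
print(is_allowed("s3:DeleteObject", key, identity, scps))          # False: explicit SCP Deny
```

The second call shows why a broad identity policy does not help when the boundary is narrower: the boundary is a ceiling, so both layers have to allow the action.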
Least Privilege in Practice — Building a Real-World IAM Strategy
Least privilege is the security practice of granting the absolute minimum permissions required for a task to complete — nothing more. In fast-moving teams, AdministratorAccess is tempting because it eliminates permission debugging entirely. It also means that any bug in your code, any compromised dependency, or any leaked credential is immediately a full-account incident.
The practical path to least privilege is not guessing permissions upfront. It is observing what your application actually calls and generating a policy from that data. IAM Access Analyzer's policy generation feature does exactly this. You run your application in a staging environment under a deliberately broad policy, CloudTrail captures the API calls the application makes, and Access Analyzer generates a tight JSON policy containing only those calls. The process takes about 10 minutes and produces a policy grounded in real usage data rather than someone's best guess.
For human access, the standard in 2026 is short-lived sessions via AWS IAM Identity Center. Developers authenticate through their corporate identity provider (Okta, Google Workspace, Entra ID), receive temporary credentials valid for 1 to 12 hours, and use those to make API calls. A stolen laptop means credentials that expire before the attacker can do meaningful damage. There are no permanent Access Keys in ~/.aws/credentials to exfiltrate.
For CI/CD pipelines, the modern standard is OIDC federation. GitHub Actions, GitLab CI, and most modern CI platforms can authenticate as themselves using an OIDC token and assume an IAM role that trusts their identity. No stored credentials, no rotation, no credential scanning needed for a secret that does not exist.
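As a rough illustration of the OIDC pattern, the sketch below builds a trust policy that lets a single GitHub repository and branch assume a role. The account ID, repository name, and function name are placeholders chosen for this example. It also assumes the GitHub OIDC identity provider has already been registered in the account.

```python
import json

GITHUB_OIDC_HOST = "token.actions.githubusercontent.com"


def github_oidc_trust_policy(account_id, repo, branch="main"):
    """Build a trust policy that lets one GitHub repo/branch assume the role via OIDC."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The federated principal is the OIDC provider registered in IAM.
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{GITHUB_OIDC_HOST}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Audience and subject claims carried in the workflow's token.
                        f"{GITHUB_OIDC_HOST}:aud": "sts.amazonaws.com",
                        f"{GITHUB_OIDC_HOST}:sub": f"repo:{repo}:ref:refs/heads/{branch}",
                    }
                },
            }
        ],
    }


# Placeholder account ID and repository, for illustration only.
policy = github_oidc_trust_policy("123456789012", "example-org/example-repo")
print(json.dumps(policy, indent=2))
```

Pinning the sub claim to one repository and branch is the important design choice here: without it, any workflow that can obtain a token from the same provider could try to assume the role.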
The three-layer defence for least privilege at scale: Access Analyzer generates tight policies per role. SCPs at the AWS Organizations level set hard limits that no account-level policy can override — denying iam:CreateUser in non-security accounts, restricting compute to approved regions, preventing S3 public access changes. AWS Config rules provide continuous monitoring and alert within minutes when a role receives broader permissions than its baseline, catching the exact mistake from the production incident above.
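A guardrail SCP of the kind described above might look like the following sketch. The statement IDs, the choice of ec2 and lambda as the "compute" services, and the region list are assumptions made for illustration. A production SCP usually also needs exemptions for global services and break-glass roles.

```python
import json


def guardrail_scp(approved_regions):
    """Build an SCP that blocks IAM user creation and compute outside approved regions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyIamUserCreation",
                "Effect": "Deny",
                "Action": "iam:CreateUser",
                "Resource": "*",
            },
            {
                "Sid": "DenyComputeOutsideApprovedRegions",
                "Effect": "Deny",
                "Action": ["ec2:*", "lambda:*"],
                "Resource": "*",
                # The Deny applies when the requested region matches none of the listed values.
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": sorted(approved_regions)}
                },
            },
        ],
    }


scp = guardrail_scp({"us-east-1", "eu-west-1"})
print(json.dumps(scp, indent=2))
```

Because every statement is a Deny, no Allow in any account-level policy can undo it, which is what makes SCPs suitable as a hard outer limit.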
IAM Policy JSON Structure — Effect, Action, Resource, and Condition
Every IAM policy is a JSON document with a specific structure. Understanding this structure down to the field level is essential for writing, auditing, and troubleshooting policies. The core of any policy is the Statement array, where each statement is an independent permission rule.
The table below breaks down each top-level field and the sub-fields inside a statement.
| Field | Required? | Description | Example Value |
|---|---|---|---|
| Version | Yes | Policy language version. Use "2012-10-17", the current version. The older "2008-10-17" lacks features such as policy variables. | "2012-10-17" |
| Statement | Yes | Array of one or more individual permission statements. Each statement is evaluated independently; if one allows and another denies, the deny wins. | [ { ... } ] |
| Sid | No | Optional statement identifier. Useful for auditing and debugging: you can include a human-readable name for each rule. | "AllowS3ReadAccess" |
| Effect | Yes | Either "Allow" or "Deny". Deny always overrides Allow, regardless of the order of statements. | "Deny" |
| Action | Yes | One or more AWS API actions, written with the full service prefix, such as s3:GetObject. A wildcard such as ec2:* matches every action in that service. | [ "s3:GetObject", "s3:PutObject" ] or "ec2:*" |
| Resource | Yes (not used in trust policies) | One or more ARNs that the action applies to. Use "*" sparingly: each wildcard is an opportunity for privilege escalation. | "arn:aws:s3:::my-bucket/*" or "*" |
| Condition | No | Optional block that specifies when the policy applies. Conditions use operators like StringEquals, IpAddress, ArnLike with keys like aws:SourceIp or aws:RequestedRegion. | { "StringEquals": { "aws:RequestedRegion": "us-east-1" } } |
| Principal | Only in Trust/Resource Policies | Defines who the policy applies to. In identity-based policies, this field is not allowed; using Principal in a permission policy causes an error. | { "Service": "lambda.amazonaws.com" } or { "AWS": "arn:aws:iam::123456789012:root" } |
| NotAction / NotResource | No | Inverse matches: the statement applies to every action or resource except those listed. Use with care: NotAction combined with Allow often grants far more than intended. | "NotAction": "iam:*" paired with "Effect": "Deny" denies everything except IAM actions |
Key rule for beginners: Every statement in an identity-based policy must have Effect, Action, and Resource. The most common mistake is forgetting the Resource field. Leaving it out causes a policy validation error when you save or attach the policy. Use the IAM Policy Simulator before putting a new policy into production to catch permission mistakes early.
Production insight: When auditing existing policies, look for statements that combine "Resource": "*" with "Effect": "Allow" and a broad action like s3:* or ec2:*. Each such statement is a potential privilege escalation vector. IAM Access Analyzer's policy validation and unused-access findings can help you turn these patterns into a concrete backlog of items to tighten.
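The audit described above is easy to prototype. The sketch below defines a small policy that uses Sid, Effect, Action, Resource, and Condition, then flags Allow statements that pair a wildcard resource with a wildcard action. The policy contents and the function name broad_allow_statements are made up for this example; this is not an AWS tool.

```python
def _as_list(value):
    """Action and Resource may each be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def broad_allow_statements(policy):
    """Return the Sid (or index) of Allow statements with Resource "*" and a wildcard action."""
    findings = []
    for index, stmt in enumerate(policy["Statement"]):
        actions = _as_list(stmt.get("Action", []))
        resources = _as_list(stmt.get("Resource", []))
        wildcard_action = any(a == "*" or a.endswith(":*") for a in actions)
        if stmt["Effect"] == "Allow" and "*" in resources and wildcard_action:
            findings.append(stmt.get("Sid", f"statement[{index}]"))
    return findings


example_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ReadReportsInOneRegion",
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-bucket/*",
            "Condition": {"StringEquals": {"aws:RequestedRegion": "us-east-1"}},
        },
        {
            "Sid": "TooBroad",
            "Effect": "Allow",
            "Action": "s3:*",
            "Resource": "*",
        },
    ],
}

print(broad_allow_statements(example_policy))  # ['TooBroad']
```

A check like this only catches the most obvious pattern. It is a starting point for a review, not a replacement for validating policies with AWS's own tooling.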
The Principal field is the most common source of confusion for engineers new to cross-account access. In identity-based policies (attached to a User or Role), Principal is forbidden: AWS already knows who you are from the credentials, and the policy only defines what you can do. In resource-based policies (S3 bucket policy, KMS key policy, SQS queue policy) and trust policies, Principal is mandatory: it specifies who gets the permission. Putting Principal in the wrong kind of policy is a syntax error that prevents deployment.
Wildcard resources ("Resource": "*") are necessary for actions that do not support resource-level permissions (such as cloudwatch:PutMetricData, ec2:DescribeInstances, and s3:ListAllMyBuckets) but should be audited rigorously. Every wildcard resource that can be replaced with a specific ARN should be. The IAM Policy Simulator can help you test whether a specific ARN would work instead of "*".
Root Account Hardening — MFA and Access Key Removal Checklist
The AWS root account is the most powerful identity in any AWS account. It has unlimited access to every service and cannot be restricted by any IAM policy or permission boundary. The one exception is an SCP, which can limit the root user of a member account in AWS Organizations but never the root user of the management account. For that exact reason, root account credentials must never be used for daily operations, and the account must be hardened so that even if credentials leak, the blast radius is limited.
The following checklist covers every mandatory step to secure the root user. AWS recommends completing these steps within the first 24 hours of creating a new account.
☐ Enable MFA on the root account
  - Go to the IAM console, select "Security credentials" for the root user, and activate a virtual MFA device (Google Authenticator, Authy) or a hardware TOTP token. Do not rely on SMS, which is vulnerable to SIM-swap attacks.
  - Store the recovery code (QR code or secret key) in a secure offline location accessible to at least two trusted team members. The most common cause of lockout is losing the MFA device and having no backup.
☐ Remove or disable root access keys
  - Root access keys are rarely needed and are a massive security risk because no IAM policy can restrict them. Run aws iam get-account-summary and check the AccountAccessKeysPresent field. If it is 1, sign in as the root user and delete the keys from the "Security credentials" page. Root keys are not attached to an IAM user, so there is no --user-name value for aws iam delete-access-key that targets them.
  - If you must keep a root access key for a legacy use case (such as a long-running CloudFormation bootstrap), create a strict rotation policy with a calendar reminder and set up an AWS Config rule (iam-root-access-key-check) to alert if the key exists at all.
☐ Set up an IAM user or role for administrative tasks
  - Create a dedicated IAM user for a few break-glass administrators, or better, use IAM Identity Center with role assumption. Never use the root user for anything except the initial account setup.
  - Attach the AdministratorAccess policy to this admin identity, then enforce MFA on that user as well.
☐ Configure an email alias for root account recovery
  - The root user's contact email address is used for password reset and billing notifications. Ensure it goes to a monitored distribution list (e.g., aws-admin@yourcompany.com) rather than an individual's inbox.
  - Keep the phone number for the root account current. It is used as a second factor for some support cases.
☐ Enable CloudTrail on the root account
  - CloudTrail logs all root user API calls. Create a trail that applies to all regions and delivers logs to a central S3 bucket in a security account. Set up an SNS notification for any root-level action via CloudTrail events with userIdentity.type = Root.
☐ Set up AWS Config rules to monitor root account activity
  - Rule: iam-root-access-key-check alerts if any root access key exists.
  - Rule: root-account-mfa-enabled alerts if root MFA is not enabled.
  - Rule: cloudtrail-security-trail-enabled ensures CloudTrail is logging.
☐ Lock down root user API access with an SCP (for Organizations)
  - If your account is part of AWS Organizations, create an SCP that explicitly denies all actions from the root user except GetAccountSummary, ChangePassword, and List* read actions. This prevents the root user from making changes even if credentials are compromised.
  - Note: SCPs only apply to member accounts in AWS Organizations. They have no effect on a standalone account or on the management account.
☐ Test the hardening
  - Log out, then attempt to sign in as root. Verify MFA works. Attempt an API call with root credentials (if any remain) and confirm it is blocked by SCP or policy.
  - Test that your administrative IAM user can perform all required management tasks without ever touching the root credentials.
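Two of the checklist items can be verified mechanically from the output of aws iam get-account-summary. The sketch below works on the SummaryMap portion of that output as a plain dictionary; the sample values are invented, and root_account_findings is a hypothetical helper name.

```python
def root_account_findings(summary_map):
    """Inspect a get-account-summary SummaryMap and list root hardening gaps."""
    findings = []
    # AccountAccessKeysPresent is 1 when the root user has access keys.
    if summary_map.get("AccountAccessKeysPresent", 0) != 0:
        findings.append("root access keys exist")
    # AccountMFAEnabled is 1 when the root user has an MFA device.
    if summary_map.get("AccountMFAEnabled", 0) != 1:
        findings.append("root MFA is not enabled")
    return findings


# Invented sample values representing an unhardened account.
sample_summary = {"AccountAccessKeysPresent": 1, "AccountMFAEnabled": 0}
for finding in root_account_findings(sample_summary):
    print("FAIL:", finding)
```

With boto3 installed and credentials configured, the same dictionary should be available as the SummaryMap key of the iam client's get_account_summary() response, which makes this easy to run on a schedule.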
Production insight: The most common root account incident is not a sophisticated attack — it is an engineer needing to change a support case or update billing and realising they have lost the MFA device. The fix for that is a multi-day support case with AWS. Having a backup MFA recovery code in a password manager accessible to two people prevents this. The second most common incident is a compromised AWS Partner Network (APN) account where the root access key was shared for convenience.
Cross-Account Access — STS AssumeRole Visual Walkthrough
Cross-account access is the ability for a resource or user in one AWS account to access resources in another account. This is essential for multi-account strategies, third-party integrations, and centralised logging or deployment pipelines. The mechanism is always the same: the requesting account assumes an IAM Role in the target account using the AWS Security Token Service (STS) via the sts:AssumeRole API call.
Understanding this flow visually is much easier than reading the policy documents in isolation. The diagram below maps the complete handshake between Account A (the caller) and Account B (the target resource owner).
The flow involves three kinds of policy document on the Account B side:
1. Resource-based policy on the target resource (e.g., S3 bucket policy, KMS key policy, SQS queue policy): allows the assumed role's ARN to access the resource. It is required when the resource lives in a different account from the role, or when the resource policy itself restricts access, as KMS key policies do.
2. Trust policy on the target IAM Role: allows the calling principal (from Account A) to assume the role.
3. Permission policy attached to the target role: defines what actions the role can perform after assumption.
The requesting entity (user/role in Account A) also needs sts:AssumeRole permission on its identity policy targeting the role ARN in Account B.
Step-by-step process:
1. The caller (a Lambda function in Account A that has its own IAM role with sts:AssumeRole permission) sends an AssumeRole API call to Account B's role ARN.
2. AWS evaluates the trust policy on the target role in Account B. If the caller's principal ARN is allowed (and optional conditions like MFA or ExternalId are satisfied), STS returns temporary credentials (AccessKeyId, SecretAccessKey, SessionToken) that are valid for a configurable duration (default 1 hour, max 12 hours). When one role assumes another, as in this example, the chained session is capped at 1 hour.
3. The caller uses those temporary credentials to make API calls to the target resource (e.g., write to an S3 bucket) in Account B.
4. If the bucket lives in a different account from the assumed role, or its bucket policy restricts access, the bucket policy must also allow the assumed role's ARN to perform the action.
The trust policy, the role's permission policy, and any required resource policy must all be in place. Missing any one of them results in AccessDenied. This is the most common debugging pitfall for cross-account setups.
Best practices:
- Always use the ExternalId condition in the trust policy when granting access to a third party. This prevents the confused deputy problem, where another customer of the same third party could trick the service into assuming your role.
- When the role is assumed by an AWS service principal, add aws:SourceArn or aws:SourceAccount condition keys. They tie the assumption to a specific resource or account of yours, which closes the same confused deputy gap for services.
- Set the role's MaxSessionDuration to the minimum required. For a nightly batch job, 1 hour is enough. For a long-running ETL, maybe 4 hours. Never use the maximum 12 hours unless there is a concrete need.
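The two IAM documents at either end of the handshake can be sketched as follows. The account IDs, role names, and ExternalId value are placeholders, and the two builder functions are hypothetical helpers written for this example.

```python
import json


def third_party_trust_policy(caller_role_arn, external_id):
    """Trust policy for the role in Account B: only this caller, only with this ExternalId."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": caller_role_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def caller_identity_policy(target_role_arn):
    """Identity policy for the caller in Account A: may assume exactly one role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                "Resource": target_role_arn,
            }
        ],
    }


# Placeholder ARNs: Account A is 111111111111, Account B is 222222222222.
trust = third_party_trust_policy(
    "arn:aws:iam::111111111111:role/AppServerRole", "example-external-id"
)
caller = caller_identity_policy("arn:aws:iam::222222222222:role/CrossAccountWriter")

print(json.dumps(trust, indent=2))
print(json.dumps(caller, indent=2))
```

Both sides name each other explicitly: the trust policy names the caller's role, and the caller's policy names the target role. If either ARN is wrong, the AssumeRole call fails before any resource policy is consulted.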
If the target data is encrypted with a customer-managed KMS key, the key policy must also grant kms:Decrypt and kms:GenerateDataKey to the assumed role ARN. Without that, you get an AccessDenied from KMS even when S3 and IAM policies are perfect. Always check KMS key policies when debugging cross-account S3 or DynamoDB access with server-side encryption.
In the trust policy, the Principal field should use the full ARN of the caller (including the role name), not just the account ID. Using "AWS": "arn:aws:iam::111111111111:role/AppServerRole" is precise. Using "AWS": "111111111111" (the account ID) would allow any principal in that account, including IAM users, which may be too broad. Be explicit.
IAM Use Cases — Where the Abstraction Pays Off
IAM isn't a checkbox you tick. It's a runtime enforcement layer. Let me show you where it actually saves your bacon.
User and Group Management: Stop attaching policies to individual users. Create a DevTeam group with read-only S3 access. Add developers. Done. When Sarah leaves, remove her from the group. Permissions revoke instantly. No orphan policies.
Enhanced Security (Least Privilege): Your intern needs to restart EC2 instances for patching. Give them ec2:StartInstances and ec2:StopInstances — nothing else. No terminate, no modify security groups. When they SSH in and try to delete an instance? Denied. That's the point.
Access Management for Resources: Project manager needs CloudWatch dashboards but no EC2 write access. Create a policy with cloudwatch:GetDashboard and cloudwatch:ListDashboards. Attach to their user. They read metrics, they don't touch infrastructure.
MFA for Extra Security: Your root user has MFA. Your IAM users managing production databases should too. It's a time-based one-time password. Password leaks? Without the OTP, they're locked out. Simple. Effective.
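The intern scenario and the MFA requirement can be combined into one small policy. In the sketch below, the account ID in the instance ARN and the statement IDs are placeholders, and allowed_actions is a hypothetical helper for inspecting the result.

```python
intern_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "StartStopOnly",
            "Effect": "Allow",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "arn:aws:ec2:*:123456789012:instance/*",  # placeholder account ID
        },
        {
            # Explicit Deny when the session was not authenticated with MFA.
            "Sid": "DenyStartStopWithoutMfa",
            "Effect": "Deny",
            "Action": ["ec2:StartInstances", "ec2:StopInstances"],
            "Resource": "*",
            "Condition": {"BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}},
        },
    ],
}


def allowed_actions(policy):
    """List every action named in an Allow statement of the policy."""
    actions = []
    for stmt in policy["Statement"]:
        if stmt["Effect"] == "Allow":
            value = stmt["Action"]
            actions.extend([value] if isinstance(value, str) else value)
    return actions


print(allowed_actions(intern_policy))
```

Because ec2:TerminateInstances never appears in an Allow statement, the default deny blocks it without any extra rule.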
How IAM Actually Evaluates a Request — The Flow You Can't Skip
Every API call to AWS hits IAM first. Here's the exact evaluation order so you understand why some requests fail silently.
- Authentication: IAM checks the access key and secret key. Or the role's temporary credentials from STS. If the keys are bad, you get an auth error immediately.
- Authorization Decision: IAM evaluates all policies that apply to the caller — identity-based policies, resource-based policies, permissions boundaries, and Organizations SCPs. The default is implicit deny. If no policy explicitly allows the action, it's denied.
- Contextual Evaluation: Condition blocks are checked here. That aws:SourceIp condition? It runs now. If the request comes from outside the IP range, the Allow never applies.
- Final Decision: The call, whether allowed or denied, is recorded in CloudTrail. Use CloudTrail to debug every time a Lambda role can't write to S3, keeping in mind that S3 object-level calls are data events and only appear if the trail has data events enabled.
Real example: Your EC2 instance needs to upload files to S3. You create a role with s3:GetObject and s3:PutObject on arn:aws:s3:::your-prod-bucket/*. Attach the role to the EC2 instance. When the application calls PutObject, IAM sees the role, evaluates the policy, finds the allow, checks that no deny applies, logs it, and lets it through. If you left off the /* on the resource ARN? Denied. The object path didn't match.
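The ARN mismatch in this example can be reproduced with a few lines of Python. The sketch uses fnmatchcase as an approximation of how IAM matches wildcards in Resource patterns; the object key uploads/report.csv is invented for illustration.

```python
from fnmatch import fnmatchcase


def resource_matches(pattern, arn):
    """Approximate IAM Resource matching: '*' in the pattern matches any run of characters."""
    return fnmatchcase(arn, pattern)


object_arn = "arn:aws:s3:::your-prod-bucket/uploads/report.csv"  # invented object key
bucket_arn = "arn:aws:s3:::your-prod-bucket"

# Object-level actions such as s3:PutObject target object ARNs.
print(resource_matches("arn:aws:s3:::your-prod-bucket/*", object_arn))  # True
print(resource_matches("arn:aws:s3:::your-prod-bucket", object_arn))    # False

# Bucket-level actions such as s3:ListBucket target the bucket ARN itself.
print(resource_matches("arn:aws:s3:::your-prod-bucket/*", bucket_arn))  # False
```

The last line shows the mirror-image mistake: a policy that only lists the /* pattern will not cover bucket-level actions such as s3:ListBucket, which is why many S3 policies list both ARNs.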
… Deny statement, but it's rarely needed if you scope your Allow correctly.
Compromised Lambda with AdministratorAccess — full account takeover in 4 minutes
- AdministratorAccess on a workload role is a single point of total account compromise — a code vulnerability in the function is now a full account vulnerability
- Use IAM Access Analyzer to generate least-privilege policies from actual CloudTrail calls — it takes 10 minutes and eliminates a category of risk entirely
- A compromised Lambda inherits its role's permissions exactly — the security of the function's code and the security of its IAM role are the same problem
- AWS Config rules and SCPs provide automated guardrails that catch what code review misses — deploy them before you need them, not after
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=<api-action> --max-results 5 --query 'Events[*].CloudTrailEvent' --output text | jq .
aws iam simulate-principal-policy --policy-source-arn <role-arn> --action-names <failing-action> --resource-arns <resource-arn>
Key takeaways
Common mistakes to avoid
- Using AdministratorAccess on a Lambda or EC2 execution role to avoid debugging permission errors
- Forgetting the InstanceProfile wrapper when attaching an IAM Role to EC2 in CloudFormation
- Writing a Trust Policy with Principal set to asterisk with no conditions
- Storing Access Keys in ~/.aws/credentials for automation instead of using role assumption or OIDC
- Debugging AccessDenied without checking SCPs at the AWS Organizations level
Interview Questions on This Topic
What is the difference between a Trust Policy and a Permission Policy on an IAM Role? What happens if the Trust Policy is missing the service that needs to use the role?
Frequently Asked Questions