Understanding IAM
Identity is the most granular unit of security. The users, services, and systems that interact with infrastructure, applications, APIs, and endpoints must all be identified, authenticated, and authorized in order to perform their functions. The AWS platform operates under a rigid identity-centric model. Bridging that model with your own organization's identity implementation can be daunting.
Identity practitioners can (and do) argue about the minutiae and nuances of the terminology used within IAM. However, for our purposes, we can afford to use a broad definition of IAM in AWS:
For something purported to be a simple definition, that sure is a mouthful. However, if we break the statement down into its constituent components and consider a typical use case, we can see how many technical disciplines you may already be familiar with relate to IAM:
In layman's terms, we have these digital accounts that can be used to access computer systems. These accounts either directly or indirectly map to a person. This means that the account is either a digital representation of that person or the person owns and controls those accounts. That person can demonstrate proof of control of those accounts and is accountable for actions taken with those accounts. And those accounts have a life cycle, meaning under certain conditions they are created, under other conditions they may change, and at some point, they may eventually cease to be.
This is called identity management. Identity management is responsible for the following:
- Keeping accounts up to date
- Keeping downstream consumers of those accounts synchronized with the authoritative sources that define the account
- Provisioning and deprovisioning accounts entirely from various data stores
In short, it's a collection of processes responsible for managing account life cycle events in accordance with business, legal, or technical controls. These controls trigger life cycle events for accounts, such as account creation, modification, and disablement. What those life cycle events are will vary depending upon the event, type of account, business, and requirements of the system using those identities.
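One way to picture these life cycle events is as a small state machine. The following is only a minimal sketch: the state names, event names, and the transition table are assumptions made for illustration, and a real organization's controls would define their own.

```python
from enum import Enum


class AccountState(Enum):
    ACTIVE = "active"
    DISABLED = "disabled"
    DELETED = "deleted"


# Hypothetical transition table: (current state, event) -> next state.
# None stands for an account that does not exist yet.
TRANSITIONS = {
    (None, "create"): AccountState.ACTIVE,
    (AccountState.ACTIVE, "modify"): AccountState.ACTIVE,
    (AccountState.ACTIVE, "disable"): AccountState.DISABLED,
    (AccountState.DISABLED, "enable"): AccountState.ACTIVE,
    (AccountState.DISABLED, "delete"): AccountState.DELETED,
}


def apply_event(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not allowed from state {state!r}")
```

In this sketch, a business or legal control is whatever decides to call `apply_event`; the table only records which events are permitted from which state.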
Now, let's look at the rest of the definition:
Those accounts, having been created, can be used to execute specific activities. What they can do is determined by rules and policies. In order to do anything, the account must first provide proof that whoever or whatever is using it to perform an activity is actually allowed to do so. That proof comes through a shared secret that validates the authenticity of the actor behind the account. This second part of our IAM definition addresses something called access management. Access management addresses the authentication of the account (proving you are who you say you are) and the authorization of that account (proving that you are allowed to do what you are trying to do with that account) to access resources or to perform certain tasks in accordance with established policies.
IAM applied to real-world use cases
To understand this better, and to provide a flimsy pretext to introduce some additional concepts that are not so easily derived from that definition of IAM, let's imagine what happens when someone joins a new company. To help visualize all the actors, systems, and life cycle events in play, take a look at the diagram in Figure 1.1.
In this example, Bob has applied for a sales role at a large identity services organization called Redbeard Identity, which has a reasonably mature internal IAM program, application portfolio, and cloud platform capabilities. Bob's identity experience actually began long before he got to the point where the hiring manager was prepared to make an offer, because in order to apply for the position, he had to create a profile inside of Redbeard Identity's candidate management system.
Important note
The Redbeard Identity organization will be the organization referenced for several use cases and scenarios throughout this book. Whereas real organizations typically have a fixed enterprise architecture, we will adjust the architecture, capabilities, services, user accounts, and other characteristics of the Redbeard Identity organization from chapter to chapter in whatever ways we need to best demonstrate the material of that chapter. Please don't be confused if our example organization's characteristics are not entirely consistent throughout the book.
This marks the first identity life cycle event in Bob's onboarding journey: user account creation. Bob, as a user of the candidate management system, is providing self-issued, unverified information about himself such as his name, contact information, and details about his work history. As there is neither external proof nor an outside source of control validating the information he enters into this system, his candidate account is considered a low-assurance record. As long as Bob remains merely a candidate for the sales role, that low level of assurance is sufficient for the purpose that the candidate record system account serves:
Bob knows his craft well, is an impressive salesman, and aces his interviews. After the details are agreed upon, the hiring manager sends Bob the offer letter confirming the details of his role, along with instructions for accepting the offer. Bob accepts by signing into the candidate portal and accepting the job offer. Now that Bob is more than just a candidate, the authenticity of the details that Bob provided when populating his candidate account must be verified. To ensure that he is who he says he is, the HR representative will start a process called identity verification. This process is defined by the US Department of Commerce's National Institute of Standards and Technology as a process "to ensure the applicant is who they claim to be to a stated level of certitude" (NIST Special Publication 800-63A, Digital Identity Guidelines, Section 4, NIST).
The HR representative asks Bob to provide some identifying documents to facilitate his onboarding and help corroborate the information that he already entered as part of his candidate profile, such as a copy of his passport, a state-issued identification card, and his tax information. For the sake of argument, let's just say Bob hands the HR representative these documents in person to ensure that Bob himself has been compared against these artifacts. Thus, he sidesteps any concerns about him stealing valid credentials from someone else to use in his efforts to secure employment. The HR representative will finally validate these artifacts against their authoritative sources to ensure their authenticity, proving that Bob really is who he says he is. With the confidence that Bob is Bob and that the information Bob entered into the candidate management system is accurate, the HR system creates Bob's employee record and sets it to become active on Bob's start date.
As we said earlier, this organization has a reasonably mature IAM program. As part of a nightly process, the IAM system checks the HR system for any discrepancies in the data between the records stored there and its own corresponding identity records that it maintains in order to keep them in sync. When a change is made to an existing HR record that has a corresponding identity record, such as in the case of an employee changing departments, the department attribute on that employee's identity record also gets updated with the new department value. This is an example of attribute and metadata synchronization being used to ensure the consistency of identity data across data stores. In this case, the HR system is acting as an authoritative source for the IAM system, meaning that the records, attribute values, and other information from that system will overwrite any changes made directly against the records in downstream systems.
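The nightly synchronization described above can be sketched in a few lines. This is an illustration under stated assumptions, not Redbeard Identity's actual process: the function name, the dictionary-based record layout, and the sample `emplid` value in the usage note are all hypothetical.

```python
def sync_from_authoritative(hr_records, identity_records, synced_attrs):
    """Overwrite identity attributes with HR values and return the changes made.

    Both record collections are assumed to be dicts keyed by employee ID.
    """
    changes = []
    for emplid, hr in hr_records.items():
        identity = identity_records.get(emplid)
        if identity is None:
            # No corresponding identity record yet; provisioning handles that case.
            continue
        for attr in synced_attrs:
            if attr in hr and identity.get(attr) != hr[attr]:
                changes.append((emplid, attr, identity.get(attr), hr[attr]))
                # The authoritative source wins over the downstream value.
                identity[attr] = hr[attr]
    return changes
```

For example, if the HR record for the made-up ID `E1001` lists the department as Marketing while the identity record still says Sales, one pass rewrites the identity record and reports a single change; a second pass reports none.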
This organization uses business logic that tells the IAM system to create new identity records for new joiners one week before the start date listed on the new joiner's HR record. Once Bob's start date is less than a week away, that logic triggers the IAM system to provision, or create, his identity record. This will be the authoritative account record for all downstream systems, which in turn look to the IAM system as their own respective authoritative source. The IAM system will create Bob's identity record based upon an established pattern of attributes and characteristics, or a schema. It contains certain attribute types and values based upon the kind of account that Bob's identity record is. In Figure 1.1, we see a sample of an (admittedly spartan) schema for Bob's identity record. Let's pretend that we can actually take a look at the identity schema for Bob's record within the IAM system using Table 1.1:
This shows us the attribute names, their current values, and the authoritative sources for each of the attributes in this schema. You'll notice that the HR system provides the bulk of the authoritative data for the attributes. The exception is "mail," which is currently null (without a value) and uses Azure Active Directory (AD) as its authoritative source.
You aren't constrained to a single authoritative source for your identity schema. In fact, you can have nearly infinite combinations of conditional clauses, secondary sources, and compound sources when defining your schema and the authoritative sources used to populate the schema's attributes. Beyond that complexity, you can also have several distinct schemas depending on the type of identity you are defining. We've only been examining Bob's identity journey as he gets onboarded at Redbeard Identity, and he is an employee as denoted by the emptype attribute. Contractors will likely have distinct schemas, as will bot process automation accounts, service accounts, business-to-business accounts, and customer accounts. But to keep things simple, we will stick with Bob the employee.
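To make the idea of per-attribute authoritative sources concrete, here is a minimal sketch of a schema expressed as data. It is not a reproduction of the book's table: the attribute list, the source labels `HR` and `AzureAD`, and the sample values in the usage note are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeSpec:
    name: str
    source: str  # the system treated as authoritative for this attribute


# Illustrative employee schema; a contractor or service account
# would get its own list of AttributeSpec entries.
EMPLOYEE_SCHEMA = [
    AttributeSpec("emplid", "HR"),
    AttributeSpec("emptype", "HR"),
    AttributeSpec("department", "HR"),
    AttributeSpec("costcenter", "HR"),
    AttributeSpec("mail", "AzureAD"),
]


def build_identity(schema, source_data):
    """Populate each attribute from its authoritative source, or None if absent."""
    return {
        spec.name: source_data.get(spec.source, {}).get(spec.name)
        for spec in schema
    }
```

Calling `build_identity` with only HR data leaves `mail` as `None`, which mirrors the null mail value on a freshly provisioned record. Once the `AzureAD` source supplies a mail value, rebuilding the record fills it in.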
Bob works in sales, but it is doubtful that Redbeard Identity is a pure-sales organization given that they have enough technical wherewithal to run their own IAM infrastructure. Even if they were that operationally lean, there are regulations that demand evidence that some workers with certain job responsibilities cannot perform other, complementary responsibilities in order to reduce the risk of malfeasance. The go-to example for this is the protection control between accounts receivable and accounts payable in financial services organizations in order to prevent someone entitled to issue invoices from also approving their payment.
Separation of duties requires more than one person in order to complete a business task. Organizations implement separation of duties by applying technical controls that restrict or enable what a person can do based on business and regulatory requirements. Those rules, restrictions, and permissions are called policies, and a collection of policies that grants somebody the full range of access that they are entitled to depending upon their responsibilities is called a role. Aligning policies to roles, and roles to users through attributes or business logic, is one part of access management. Providing evidence that those controls function as designed and comply with business and regulatory requirements is identity governance and audit. Identity governance and audit, access management, roles, and policies all work to ensure that Bob will only be able to access the systems and resources that are appropriate for him to access, or in other words, that he is authorized to access.
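A separation-of-duties control can be sketched as a check for entitlement combinations that one person must not hold. The entitlement names and the single conflicting pair below are hypothetical, modeled on the accounts receivable and accounts payable example; real rule sets come from business and regulatory requirements.

```python
# Hypothetical pairs of entitlements that one person must not hold together.
CONFLICTING_PAIRS = [
    frozenset({"issue_invoice", "approve_payment"}),
]


def sod_violations(entitlements):
    """Return every conflicting pair fully contained in a user's entitlements."""
    held = set(entitlements)
    return [pair for pair in CONFLICTING_PAIRS if pair <= held]
```

An empty result means the entitlement set passes this particular check; a governance process could run the same function over every user to produce audit evidence.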
Granting or denying access to an entire system in this "all or nothing" way is an example of coarse-grained authorization. Here, access is determined on a seemingly binary "yes/no" level based on the role that Bob was assigned when his account was provisioned in the system. In Bob's case, he received the Sales role because, as we've said more than a few times now, he works in sales. However, there was no attribute labeled "role" that indicated which role he would be assigned. And there doesn't need to be. The logic that determines which entitlements get applied to an identity upon creation can vary wildly. In this scenario, Redbeard Identity's IAM system assigns roles based on the combination of the "costcenter" and "department" attributes. There could also be application-level roles and policies that provide fine-grained authorization to certain application-specific functions.
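Deriving a role from the combination of costcenter and department can be as simple as a lookup table. The sketch below assumes made-up cost center codes and a second, invented role purely to show that the table can hold more than one rule; only the Sales role comes from the scenario.

```python
# Hypothetical rule table: (costcenter, department) -> role name.
ROLE_RULES = {
    ("CC-100", "Sales"): "Sales",
    ("CC-200", "Finance"): "AccountsPayable",
}


def assign_role(identity):
    """Return the role for an identity record, or None if no rule matches."""
    key = (identity.get("costcenter"), identity.get("department"))
    return ROLE_RULES.get(key)
```

Returning `None` for an unmatched combination is a deliberate choice in this sketch: an identity that matches no rule gets no role, rather than a default one.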
Now that Bob's identity has been provisioned and the IAM system has determined what role aligns to that identity, the next step is for the IAM system to begin provisioning Bob's accounts in the various downstream systems that he is entitled to access. Users with the Sales role get certain birthright entitlements, which are accounts and access that everyone gets just by being active employees within Redbeard Identity with that basic Sales role. Figure 1.2 shows the provisioning process from the IAM system into these account stores in greater detail, with information about the schema for each of the accounts that Bob will be getting:
The IAM system provisions the following:
- Bob's Azure AD account
- An LDAP account in the company's directory
- A user account in Redbeard Identity's customer relationship management system where Bob will be spending most of his workdays
- An account in the cloud directory used by Redbeard Identity's cloud-hosted applications
Each one of these account stores is an example of an identity store. This is the place where an application or system can store its own instance of Bob's account with all the application-specific attributes added on. Just as the HR system was the authoritative source for the IAM system, the IAM system is the authoritative source for these accounts and for many of the attributes within these identity stores. Now that Bob has an Azure Active Directory account, he can get a mailbox and email address. If you recall from Table 1.1, Bob's main identity record did not have a value under the mail attribute when it was first provisioned. It is only now that the IAM system will detect Bob's email address when checking Azure AD for any new account updates. Upon detecting the discrepancy between the null value for the mail attribute in the identity record it has for Bob and the email attribute it reads on Bob's Azure AD account, the IAM system updates Bob's IAM record with the new information obtained from that authoritative source. But the updates don't stop there! Look at Figure 1.3:
Remember that the IAM record itself is an authoritative source for several of the attributes on the downstream accounts that were provisioned as part of Bob's Sales role. In this instance, Bob's LDAP, CRM, and Cloud Directory accounts will each get Bob's new email address value written to the attribute that each has mapped to correspond to the IAM record's mail attribute value. Now that Bob has all of his accounts provisioned and synchronized with their authoritative sources, Bob is poised to be productive on his first day on the job.
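The downstream write can be sketched as a fan-out driven by per-store attribute mappings. Everything named here is an assumption for illustration: the store keys, the target attribute names such as `email_address` and `userEmail`, and the example address used in the usage note.

```python
# Hypothetical mapping from the IAM record's attribute names to each store's own name.
ATTRIBUTE_MAPS = {
    "ldap": {"mail": "mail"},
    "crm": {"mail": "email_address"},
    "cloud_directory": {"mail": "userEmail"},
}


def propagate(iam_record, attr, stores):
    """Write iam_record[attr] into every store that maps it; return the stores updated."""
    updated = []
    for store_name, account in stores.items():
        target = ATTRIBUTE_MAPS.get(store_name, {}).get(attr)
        if target is not None and account.get(target) != iam_record[attr]:
            account[target] = iam_record[attr]
            updated.append(store_name)
    return updated
```

With an IAM record whose mail is `bob@example.com` and three empty downstream accounts, one call updates all three stores, each under its own attribute name; repeating the call updates nothing because the values already match.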
Of course, Bob can only be productive if he knows how to identify himself as the owner of the account in each of these systems. This takes us to the last life cycle event depicted in Figure 1.1, which is the issuance of Bob's credentials. Credentials are the evidence used to attest that the person accessing a resource is who they say they are.
When talking about user accounts, credentials most often take the form of a unique identifier (such as a username) and a shared secret (such as a password). The secret is shared between the person attempting to access a resource and the system that is trying to validate that person's identity. Bob's username plus his account password are his credentials to access these Redbeard Identity systems. Let's take a look at how that credential was created and delivered to Bob, as well as how the downstream applications can also verify Bob's identity despite not necessarily needing to maintain a set of their own for Bob to use.
Within Redbeard Identity, the mechanics of creating Bob's credentials are fairly straightforward. As part of the initial account creation process, the IAM system generates a random password to use as the password value on Bob's account. As you can see in Figure 1.3, though the password was generated by the IAM system, the IAM system is not acting as the authoritative source for the password, nor is it even storing the password attribute in its main identity record for Bob. Looking more closely at the schemas on those downstream accounts, we see that the only system that stores Bob's password value (or some form of hash of this value) is the LDAP directory.
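As a rough illustration of generating a random initial password and storing only a derived value, the sketch below uses Python's standard `secrets` and `hashlib` modules. The choice of PBKDF2-HMAC-SHA256, the iteration count, and the function names are assumptions; directory servers apply their own password storage schemes, and the scenario does not say which one Redbeard Identity uses.

```python
import hashlib
import hmac
import secrets


def generate_initial_password(nbytes=16):
    """Return a random, URL-safe initial password."""
    return secrets.token_urlsafe(nbytes)


def hash_password(password, salt=None, iterations=600_000):
    """Return (salt, iterations, digest) derived with PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, digest):
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)
```

Only the salt, iteration count, and digest would be written to the account; the generated password itself is handed off for delivery and never stored.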
In addition to being the only place where that value is stored, there is another unique attribute on that LDAP account called changepwonlogon, which is currently set as true. When the changepwonlogon value is set to true, it will force the person who entered the username and password to enter a new value for the password. When changepwonlogon is false, the person who correctly enters the account's username and password will simply be permitted to access the system or resource they were attempting to access when challenged for their credentials.
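The behavior of the changepwonlogon flag can be sketched as a small logon handler. The account layout, the function name, and the three return strings are hypothetical, and the password is compared in plain text only to keep the sketch short; a real directory would verify a stored hash instead.

```python
def handle_logon(account, username, password, new_password=None):
    """Return 'denied', 'password_change_required', or 'granted'."""
    # Plain-text comparison is a simplification for this sketch only.
    if username != account["username"] or password != account["password"]:
        return "denied"
    if account["changepwonlogon"]:
        if not new_password or new_password == password:
            return "password_change_required"
        account["password"] = new_password
        # The flag flips so later logons are not challenged again.
        account["changepwonlogon"] = False
    return "granted"
```

A first logon with the initial password and no replacement yields `password_change_required`; supplying a different new password updates the account, clears the flag, and grants access.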
Providing the credentials is how a user can authenticate themselves, or how they prove that they are who they say they are. As Bob can't receive that initial password directly from Redbeard Identity's systems since he does not have access to Redbeard Identity's network yet, the IAM system instead issues the first password for Bob's account to Bob's hiring manager.
So why isn't the password written into all of the other identity stores where Bob has an account? In the specific situation we are examining using Bob's onboarding into the Redbeard Identity organization, they are maintaining a single authoritative identity store for all of their application authentication. This means that Bob will use a single, centrally managed username and password to access the applications and systems he needs to use to perform his job. This is as opposed to a system where he would be required to memorize a unique username and password stored and managed by each individual application. This is single sign-on (SSO).
Applications maintain application-specific user records for each user that they use for their own purposes (such as authorization). However, the application delegates authentication to a central identity store using a directory services protocol such as LDAPS or Kerberos, or in the case of many modern web apps, a federated web-based protocol such as SAML or OpenID Connect. Using SSO reduces the number of credentials and the locations where those credentials are stored. This reduces the attack surface that a malicious actor can try to exploit to steal a credential. Using SSO also helps keep Redbeard Identity workers happy since they only have one password to manage.
Bob's first day at Redbeard Identity arrives. He shows up at the office for new hire orientation, receives his laptop, and his hiring manager shares the initial password for his account with him so he can sign into his account. After his credentials are validated, the changepwonlogon attribute triggers the life cycle event responsible for ensuring that the initial password gets changed. Bob enters a new password.
Once that is accepted and written to his LDAP account, the changepwonlogon value flips to false, and Bob becomes the sole owner of his account, which is essential for non-repudiation. From now on, any actions logged under his emplid can be tied solely to him since he is the only one who can access resources and applications by authenticating using those credentials. And with that, Bob's identity onboarding experience is complete:
Now that Bob has his account, he needs to sign into the applications he will use to perform the majority of his job duties. As we mentioned earlier, the Redbeard Identity organization maintains its users' passwords exclusively in its LDAP directory. Though Bob has accounts in the user stores of other systems and applications, those applications have delegated their user authentication to that LDAP directory. Applications can perform lookups and password validations directly against the LDAP directory using LDAPS, but that model has constraints that limit its usefulness as a modern authentication pattern.
Modern applications should rather use identity federation for user authentication, which is a model where the application looks to an external identity authority to receive trusted identity information. The CRM application that Bob will be spending most of his time in uses identity federation to authenticate its users. The process for the CRM app receiving an authentication token for Bob's identity from the identity provider is shown in Figure 1.5:
Let's break down the steps:
- From a browser, Bob goes to the CRM application.
- Since Bob doesn't have a session cookie, the CRM application redirects the browser to the Identity Provider that it uses for user authentication.
- The Identity Provider redirects Bob's browser to a logon form to collect Bob's username and password.
- Bob's username and password are posted to the Identity Provider.
- The Identity Provider performs the password validation on Bob's account against the authoritative source it uses for authentication – in this case, the LDAP directory where the Redbeard Identity organization stores its user credentials.
- The LDAP directory responds that the credentials are valid and may optionally send along some additional attributes that the CRM application may need to reference at authentication time.
- The Identity Provider creates a signed authentication token using its private signing certificate and posts that to the CRM application. The CRM application is assured that Bob has been authenticated by the external Identity Provider due to the unique cryptographic signature on the authentication token.
- The CRM application looks at the subject of the authentication token, which in this instance is the emplid from the LDAP directory, and matches it to its local account under that same username value. The CRM application examines its local user record for Bob for his jobcode value to determine what application role he can assume. Job code 66061 corresponds to a sales representative role. The CRM app establishes an application session for Bob under that authorization context, and Bob is now logged in.
Important note
It is important to remember that the example we just walked through was meant to highlight IAM concepts, not necessarily IAM architecture or engineering best practices. Organizations' IAM and security maturity can vary greatly as they balance the risk equation of facilitating their core business against the monetary and opportunity costs of identifying and remediating potential security threats.
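Returning to the federation steps listed earlier, the last two can be sketched in code: the identity provider signs a token, and the CRM application verifies it and maps the subject to a local role. This sketch deliberately swaps in a shared-key HMAC signature from the standard library as a stand-in for the identity provider's private signing certificate, so it is not SAML or OpenID Connect. The token layout, function names, and sample `emplid` are assumptions, and real tokens also carry expiry, audience, and replay protections that are omitted here.

```python
import base64
import hashlib
import hmac
import json


def issue_token(subject, key, attributes=None):
    """Identity provider side: serialize the claims and append an HMAC signature."""
    claims = {"sub": subject, "attrs": attributes or {}}
    payload = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode("utf-8"))
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    return payload + b"." + signature


def verify_token(token, key):
    """Application side: return the claims if the signature is valid, else None."""
    payload, sep, signature = token.rpartition(b".")
    if not sep:
        return None
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


# Hypothetical CRM-side mapping from jobcode to application role.
JOBCODE_ROLES = {"66061": "sales_representative"}


def crm_login(token, key, local_users):
    """Match the token subject to a local user and derive the application role."""
    claims = verify_token(token, key)
    if claims is None:
        return None
    user = local_users.get(claims["sub"])
    if user is None:
        return None
    return {"username": claims["sub"], "role": JOBCODE_ROLES.get(user["jobcode"])}
```

In this sketch, the CRM application never sees Bob's password. It trusts the signature on the token for authentication and consults only its own local record, keyed by the token's subject, for authorization.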
The Redbeard Identity scenario has provided us with an example of IAM principles in action and shows how the various components of the IAM system combine to form a platform that facilitates business outcomes and secures organizational resources. Now that we have an idea of what IAM is, let's begin our exploration of it within AWS.