Understanding the Role of Identity in API Security

Whenever digital identity is mentioned, authentication is usually the first thing that comes to mind: the process of verifying a user’s identity. And nowadays, authentication is no simple topic. It encompasses many areas, including maintaining strong security with multiple factors, correctly verifying that the digital identity matches the physical identity, and creating flows that keep the authentication process secure in all situations.

However, there is more to identity than just authentication. Identity is usually established for a reason. For example, it is often used to subsequently call APIs, which use this identity information to perform authorization. Services must ensure that they return data only to a party authorized to receive that specific information. Thus, understanding how APIs handle identity information is essential to implementing robust security solutions.

Zero-Trust APIs

Zero trust is not only a popular buzzword but a vital security trend that organizations should follow. Zero trust is summarized by the simple phrase, “never trust, always verify.” This means that an API should, by default, reject all requests and allow only those that meet criteria set by security policies. This is a fundamental change in the approach to securing API endpoints.

Let’s illustrate zero trust with an example. It’s common for APIs to have a policy that looks roughly like this: 

If the request calls an admin endpoint, ensure the credentials contain admin rights. Otherwise, allow all requests containing credentials (reject only anonymous access).

If zero-trust principles were to be implemented instead, the same policy would look similar to this: 

If the request calls an admin endpoint, ensure the credentials contain admin rights. Otherwise, check that the credentials contain rights for a user to read data. Deny access to any other requests. 

Switching to such a paradigm requires the API to always have enough information about the subject’s identity to make informed authorization decisions.
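The zero-trust version of the policy can be sketched in a few lines of Python. This is only an illustration: the scope names (admin, data:read) and the path prefixes (/admin/, /data/) are assumptions made for the example and do not come from any particular API.

```python
from typing import Iterable

# Hypothetical scope names, chosen for illustration only.
ADMIN_SCOPE = "admin"
READ_SCOPE = "data:read"


def is_allowed(path: str, method: str, scopes: Iterable[str]) -> bool:
    """Deny by default; allow only requests that match an explicit rule."""
    granted = set(scopes)

    # Rule 1: admin endpoints require admin rights.
    if path.startswith("/admin/"):
        return ADMIN_SCOPE in granted

    # Rule 2: reading data requires an explicit read right.
    if method == "GET" and path.startswith("/data/"):
        return READ_SCOPE in granted

    # No rule matched: reject, even if the caller presented credentials.
    return False
```

The important part is the last line. In the first policy, the fallback was "allow anyone with credentials"; here the fallback is a rejection, so every permitted request must be covered by a rule that inspects the caller’s rights.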

Identity and Access Tokens

When designing API security, it’s important to keep a clear view of how the API works with the identity contained in an access token. Architects and developers should remember that an access token (unlike a session) does not answer the question of “who the caller is.” An access token contains the subject claim, which identifies the user (or, in some cases, an application or another API) that consented to the authorization server issuing the token to the requester. But when an API receives the token, it is often incorrectly assumed that the request represents the user’s action.

For example, say an API receives a request for a transaction list for the user John, so it assumes that John wants to view his transactions. In other words, it assumes that the token asserts the caller’s identity. In fact, the information that the token carries is that “the caller, who happens to possess this token, wants to get a list of John’s transactions.” (This holds at least for the predominant bearer tokens. It’s a bit different with proof-of-possession tokens, which are described below.) The API should not assume that receiving an access token with a given subject means the user is actively using the API (has an active session). It might be a backend process authorized to work with the user’s data. Or, in a worst-case scenario, it might be an attacker who got hold of the token.

Architects and developers must understand this distinction and design security solutions accordingly. For some APIs, this might not matter, but for others, the distinction between the identity of the token’s subject and the caller might be crucial.

Bearer Tokens vs. Proof-of-Possession Tokens

The vast majority of applications and APIs use bearer tokens. As the name suggests, these tokens can be used by anyone who possesses them. Even if such a token contains a client_id claim, this only indicates the client (an application) to which the token was initially issued. Usually, APIs won’t require client authentication, in which case the client’s identity is not confirmed in any way.

Security can be enhanced by using proof-of-possession (PoP) tokens. When PoP tokens are issued to a client, they are bound to it. In subsequent requests to the API, the client must present proof that it holds the key or credential the token is bound to, and the API can verify whether the client is actually entitled to use the given access token. This protects organizations in situations where an access token is stolen: an attacker cannot use the stolen token to call an API unless they also have the client’s credentials.

Even though using PoP requires more resources and adds complexity to an application, it’s invaluable in highly sensitive setups. There are different ways of implementing PoP tokens, for example, through mutual TLS or a standard called Demonstrating Proof of Possession (DPoP).
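To make the binding more concrete, here is a minimal Python sketch of the check an API could perform for a DPoP-bound token. With DPoP, the access token carries a cnf claim whose jkt member holds the SHA-256 thumbprint (RFC 7638) of the client’s public key, and the API compares it with the thumbprint of the key that signed the proof sent with the request. The function names are hypothetical, and the sketch assumes the token and the proof have already been validated (signatures, expiry, HTTP method and URL in the proof, replay protection), which a real deployment must not skip.

```python
import base64
import hashlib
import hmac
import json


def jwk_thumbprint(jwk: dict) -> str:
    """SHA-256 JWK thumbprint (RFC 7638) for EC and RSA public keys."""
    # Only the required members are hashed, in lexicographic order.
    # Other key types raise KeyError in this simplified sketch.
    required = {"EC": ("crv", "kty", "x", "y"), "RSA": ("e", "kty", "n")}[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in required},
        separators=(",", ":"),
        sort_keys=True,
    )
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def token_is_bound_to_key(claims: dict, proof_jwk: dict) -> bool:
    """Return True only if the token's cnf.jkt matches the proof key."""
    cnf = claims.get("cnf")
    expected = cnf.get("jkt") if isinstance(cnf, dict) else None
    if not isinstance(expected, str):
        # A plain bearer token has no binding, so it fails this check.
        return False
    return hmac.compare_digest(expected, jwk_thumbprint(proof_jwk))
```

A stolen token presented together with a different key fails the comparison, which is exactly the property described above: holding the token alone is no longer enough.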

Identity and the Subject

Ideally, the subject included in the access token presented to an API represents a unique identity. But it doesn’t necessarily have to convey any meaning to the API: the subject’s value can be an opaque string, such as when pairwise pseudonymous identifiers (PPIDs) are used by the authorization server. Still, it’s valuable for API developers to know that a physical identity (a user) will always be represented by the same subject. Unfortunately, this is not always the case. Organizations often create a new digital identity (account) for the same user whenever a different authentication factor is used.

For example, suppose a user first registers an account with a username and password. When they return to the website, they log in using their Google account. The next time the user visits, they log in with a different Google email. This scenario should result in one user account with three login methods linked: a password and two Google accounts. However, it will usually result in three separate user accounts. As identity in an organization is mapped to business data, like the user’s purchases or health history, it’s crucial that APIs can correctly return the complete data to the end user, regardless of the authentication method used. Companies should make sure that the authentication flows are modeled in a way that helps prevent duplicate accounts. Of course, there might be situations where the user deliberately wants to create multiple accounts, which is not an issue. The problem is when this happens unintentionally.
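The login scenario above can be modeled with a small sketch in which every login method is linked to one stable subject. The AccountStore class and its method names are hypothetical and exist only for this illustration. It also assumes that, before link is called, the authentication flow has verified that the user controls both the existing account and the new credential.

```python
import uuid
from typing import Dict, Optional, Tuple

# (login method, identifier used with that method)
Credential = Tuple[str, str]


class AccountStore:
    """Maps many login credentials to one stable subject per user."""

    def __init__(self) -> None:
        self._by_credential: Dict[Credential, str] = {}

    def register(self, credential: Credential) -> str:
        """Create a new account for a credential that is not yet known."""
        if credential in self._by_credential:
            raise ValueError("credential is already registered")
        subject = str(uuid.uuid4())
        self._by_credential[credential] = subject
        return subject

    def link(self, subject: str, credential: Credential) -> None:
        """Attach another credential to an existing account."""
        if subject not in self._by_credential.values():
            raise KeyError("unknown subject")
        owner = self._by_credential.get(credential)
        if owner is not None and owner != subject:
            # Never silently move a credential between users.
            raise ValueError("credential belongs to another account")
        self._by_credential[credential] = subject

    def subject_for(self, credential: Credential) -> Optional[str]:
        """Return the subject to place in tokens issued after this login."""
        return self._by_credential.get(credential)


store = AccountStore()
john = store.register(("password", "john"))
store.link(john, ("google", "john@example.com"))
store.link(john, ("google", "john.work@example.com"))
```

Whichever of the three methods the user picks, subject_for returns the same value, so the API sees one subject and can return the complete data.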

The issue of account proliferation does not pose an immediate security risk. However, when such a situation arises, organizations must implement procedures that allow users to deduplicate accounts. Such procedures must be designed and implemented carefully, as flaws in them can lead to intentional or unintentional abuse, where one user obtains access to another user’s data. Preventing the situation in the first place will positively impact the security of the users’ data.

Trust in an Identity

The authorization server drives the authentication flow to assert the user’s identity and issue tokens with claims about the authenticated user. Authentication itself can be handled using various factors, and those factors carry different levels of trust in the asserted identity.

For example, when a user logs in using a social provider, there is less certainty about the identity than when the user logs in using a password confirmed with a hardware key fob. It’s much simpler for an attacker to gain access to the user’s social account than to their physical key fob. What is more, some factors are only meant to strengthen the security of the authentication. For example, using a time-based one-time password (TOTP) factor, like the Google Authenticator app, makes it more difficult for anyone to impersonate a user. Yet, utilizing this factor does not provide more certainty about whether the physical user is who they claim to be. Using factors like eID or a bank ID together with biometrics provides stronger certainty about the authenticity of the digital identity.

The level of trust in identity should be reflected in the access token’s claims. This allows the API to make authorization decisions based on the level of trust. Some endpoints might require tokens obtained with strong authentication methods, and others might focus on the authenticity of the digital identity. It should be remembered that, as noted previously, access tokens are distinct from sessions. Even if the access token was obtained using strong authentication methods, it doesn’t mean that the given user is subsequently making requests with that token.
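One way to carry this information is the standard acr (Authentication Context Class Reference) claim. The sketch below assumes that the authorization server places acr in the access token and that the API keeps its own table describing what each value means. The acr values and the numeric levels are hypothetical. They are split into two dimensions to mirror the distinction made above between the strength of the authentication and the certainty about the physical identity.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Trust:
    auth_strength: int       # how hard the login is to impersonate
    identity_assurance: int  # how sure we are who the physical user is


# Hypothetical acr values and levels, for illustration only.
ACR_LEVELS = {
    "urn:example:acr:social": Trust(auth_strength=1, identity_assurance=1),
    "urn:example:acr:password-totp": Trust(auth_strength=2, identity_assurance=1),
    "urn:example:acr:eid-biometric": Trust(auth_strength=2, identity_assurance=3),
}


def trust_satisfied(
    claims: Mapping[str, object],
    min_auth_strength: int = 0,
    min_identity_assurance: int = 0,
) -> bool:
    """Check the token's acr claim against an endpoint's requirements."""
    acr = claims.get("acr")
    trust = ACR_LEVELS.get(acr) if isinstance(acr, str) else None
    if trust is None:
        # Missing or unknown acr: deny rather than guess.
        return False
    return (
        trust.auth_strength >= min_auth_strength
        and trust.identity_assurance >= min_identity_assurance
    )
```

With this table, a token obtained with a password and TOTP passes an endpoint that only asks for strong authentication, but fails one that requires high certainty about the physical identity.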

Conclusion

Understanding the role of identity in API access is crucial to implementing a secure, zero-trust approach in your organization. Architects and API developers should consider how identity will be used to grant access to endpoints and tailor the security measures accordingly. Strong authentication methods alone are not enough to ensure that your APIs are well protected. What matters is how the identity information from the authentication process is used to authorize access to the API. Once that is understood, it’s easier to recognize essential security measures and create solutions far less vulnerable to attacks.