Best Practices for Managing User Identifiers


1 Introduction
2 Defining user identifiers
3 Different types of identifiers
4 Scope and uniqueness
5 When identifiers are assigned
6 Machine-readable versus human-readable identifiers
7 Desirable attributes of identifiers
8 Addressing challenges in identifier management
9 Common and recommended algorithms for assigning login IDs
  9.1 Login IDs for internal users
  9.2 Login IDs for external users
  9.3 Assigning new E-mail addresses to internal users
10 Example business processes
  10.1 Employee / contractor onboarding
  10.2 Customer onboarding (Internet-facing)
  10.3 Renaming an existing employee login ID


1 Introduction

This document presents best practices for assigning unique identifiers to the users of computer systems in medium to large organizations, and for managing those identifiers over time. It begins with definitions and background information, then proceeds to explain scope, uniqueness, business processes, challenges and best practices.

2 Defining user identifiers

What is a user identifier, or ID for short?

Technical definition:

Multi-user computer systems often need to identify users, so that access to applications and data can be controlled, logged and attributed to people. Computers refer to people using unique numbers or strings of characters. These numbers or character strings are user identifiers.

User-centric definition:

Users have a variety of identifiers, which uniquely identify them in some context. Examples in the IT environment include operating system login IDs, e-mail addresses and employee numbers. Examples from day-to-day life include driver’s license numbers, credit card numbers and passport numbers.

3 Different types of identifiers

In the context of a medium to large organization, users often have at least the following identifiers:

1. An employee number.

2. At least one network login ID.

3. Possibly additional login IDs to a variety of applications.

4. At least one e-mail address.

This document offers guidance to organizations regarding the management of these corporate user IDs.

4 Scope and uniqueness

An ID must uniquely identify a person within a defined scope.

For example, since no two users can have the same login ID on an application, the application can be thought of as an identification domain, within which each user has a unique ID.


Scope                         Examples
Single system or application  Active Directory domain, RAC/F security database.
Single organization           Employee number, standardized cross-application login ID.
Sub-national                  Driver’s license, voter number.
National                      Passport number, federal tax number.
Global                        Fully qualified e-mail address.

In general, the scope over which an ID is unique can be expanded by appending the context where it was defined. This can be illustrated with some additional examples:

Original scope       Example         Append             New scope     Example
Single system        JSMITH          Application name   Organization  JSMITH@App01
Single organization  JSMITH          Organization name  Global        [email protected]
State/province       DL 1341135-013  Jurisdiction       National      DL 1341135-013@NewYork
National             QC0318876       Country code       Global        QC0318876 from Canada
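
The idea of widening scope by appending context can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name qualify, the use of @ as the default separator, and the second application name App02 are hypothetical and chosen for the example.

```python
def qualify(identifier: str, context: str, separator: str = "@") -> str:
    """Widen the scope of an ID by appending the context it was defined in."""
    if not identifier or not context:
        raise ValueError("both identifier and context are required")
    return f"{identifier}{separator}{context}"


# Two applications may each have a local user called JSMITH ...
app01_user = qualify("JSMITH", "App01")
app02_user = qualify("JSMITH", "App02")  # App02 is a hypothetical second application

# ... but the qualified forms no longer collide at the organization level.
assert app01_user == "JSMITH@App01"
assert app01_user != app02_user
```

The same ID string is ambiguous across applications, while the qualified form is unique in the larger scope, provided the appended context is itself unique there.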

5 When identifiers are assigned

When discussing how identifiers are assigned, it is helpful to consider when they are assigned. Here are some examples:

1. At birth – as happens in some jurisdictions for government IDs, social insurance numbers, etc.

2. When joining an organization – enrolling as a student, starting a new job, etc.

3. When being granted a new login ID to a system or application.

Identifiers are sometimes changed as well – for example following name changes, which in turn often follow marriage or divorce.

6 Machine-readable versus human-readable identifiers

People find it easier to remember and enter memorable strings of characters. On the other hand, computers are able to assign numeric identifiers which are guaranteed to be unique in some scope. This leads to two broad categories of identifiers:

1. User-friendly identifiers, which people can easily remember and enter.

2. Computer-friendly identifiers, such as globally unique IDs (GUIDs), which are strings of 32 hexadecimal digits.

Computer-friendly identifiers often have the benefits of being unique in a larger scope and of never changing during the lifecycle of a user. In contrast, user-friendly identifiers are less unique (unique only in a smaller scope) and more volatile, but are easier for people to manage.
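
As a brief illustration of a computer-friendly identifier, the sketch below generates a random GUID with Python's standard uuid module. The helper name new_guid is an assumption made for this example, and the use of a version 4 (random) GUID is one choice among several.

```python
import uuid


def new_guid() -> str:
    """Return a random (version 4) GUID as 32 lowercase hexadecimal digits."""
    return uuid.uuid4().hex


guid = new_guid()
print(guid)             # 32 hex digits, different on every run
print(uuid.UUID(guid))  # the same value in the familiar 8-4-4-4-12 layout
```

Such a value is easy for software to generate and compare, but impractical for a person to remember or type, which is why it is usually paired with a user-friendly ID.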

7 Desirable attributes of identifiers

Following is a list of desirable characteristics of user IDs. When designing an algorithm to assign IDs to users or business processes for managing user IDs, it is helpful to consider each of these and to develop a process which satisfies as many of them as possible.

• Identify a person, not a position:

Identifiers should refer to people, not to positions. People often move from one position to another and changing their identifier when this happens is a nuisance and creates inconsistencies in audit logs.

• User friendly:

Identifiers should be reasonably easy to remember and short enough to enter quickly. Long and hard-to-remember IDs should be avoided unless they are only used by machines.

• Easily recognizable:

It is helpful for users to be able to recognize that a string of characters is a user ID on casual inspection. In other words, user IDs should be constructed in an easily recognizable format. This is helpful both for users, when reading text that contains IDs, and for automated processes, which can scan log files, scripts, network traffic or other data sets for user IDs.

• Reusable:

It makes sense to assign the smallest possible number of identifiers to a user and to reuse existing identifiers where possible. This is more user friendly, less troublesome to manage and easier to audit. In short, use an existing identifier if possible, rather than creating a new one. Standardize identifiers across as many systems as possible.

• Compatible:

Identifiers are often used on a variety of systems. For example, a user might type the same identifier to sign into Windows / Active Directory, into a mainframe using RAC/F and into an ERP running SAP. Each of these systems will have different constraints on the allowable length and characters that can comprise an identifier. In order to support reuse (previous objective), it makes sense to assign identifiers that are compatible with the largest possible number of systems.

• Maximum scope:

Different systems may have different, overlapping user populations. It makes sense to assign identifiers which are unique over the largest possible scope, so that they can be reused by the largest possible number of systems.

• Unchanging:

Identifiers assigned to a user should be designed so that they never have to be changed. Changing identifiers is an administrative burden and leads to inconsistencies in audit logs.

Changes in user IDs can create significant operational problems. For example, the ID may appear on multiple systems, making it costly to change. Changing the ID would create a discontinuity in audit logs, perhaps violating security policy. The ID may be embedded in programs or scripts, which would stop working after the change. The ID may be known to other users, who would have to be informed of the change.

• Never reused:

Identifiers should never be reused. For example, when a user leaves an organization, that (old) user’s identifier should never be assigned again, to another (new) user. Doing so can have undesirable and unexpected consequences, such as the new user acquiring security access rights from the old user’s profile. This means that a repository of every identifier that has ever been assigned must be maintained, rather than just a repository of currently-in-use identifiers.

• Not offensive:

People have an amazing ability to read meaning into meaningless strings of characters. This leads to situations, ranging from humorous to offensive, in which identifiers assigned to users (often by automatic processes) can be read, literally or with “poetic license,” as having colorful or offensive meanings.

This problem suggests that a human review process is often needed when new identifiers are assigned, so that they can be vetted and perhaps replaced if they are found to be offensive.

• Cross-language:

Many organizations span countries, languages and cultures. In this context, a question of cultural, rather than just technical, compatibility arises. For example, would a unilingual English speaker be able to read, remember or type an identifier for a co-worker if that identifier is in Kanji (Japanese)?

Since identifiers may have to be accessible by multiple users, it is important to consider the ability of users fluent in different languages to read and enter them.

• Accessible only within an appropriate scope:

In some cases, an organization may consider identifiers to be confidential. This is true in the legal sense with some identifiers, such as social security numbers. Confidentiality of identifiers may also be considered a secondary line of defense against security attacks such as automated password guessing.

Since users often have to know, remember and enter their own identifiers, confidentiality means limiting the visibility of identifiers to just authorized users and not disclosing information about whether an identifier is valid to unauthorized or unauthenticated users.

8 Addressing challenges in identifier management

Some challenges arise in most organizations in the course of assigning new or managing existing identifiers. These are described below:

• Collisions:

If the algorithm used to assign unique IDs to users is based on users’ names then users with identical or even similar names may be assigned the same identifier. This obviously needs to be rectified.


For example, an organization may employ 10 people with the (common among English speakers) name Michael Smith. If IDs are assigned using the algorithm “last name plus first initial” then they would all be assigned the ID “smithm.” Assigning the same ID to multiple users would defeat the purpose of IDs – unique identification – so the algorithm must be adjusted to eliminate these collisions. This may be done by appending one or two digits to the IDs above, for example.

• Name changes:

Where IDs are assigned using an algorithm based on the user’s name, in the event that the user’s name changes (for example, due to marriage or divorce) the user may wish to change the ID to match the new name.

Changes to user IDs are undesirable, as described in Section 7.

• Short names:

Where IDs are based on user names, the algorithm used to calculate IDs may produce unsatisfactory results for users with short names. For example, two common Chinese surnames are written (in English) as Wu and Li. An organization with many Chinese users and IDs based on surname might have many collisions and require two or more extra characters appended to IDs, to make them unique. These unique suffixes are hard to remember and tend to lead to confusion, such as e-mails intended for one user being sent to another.

• Changes in user role or status:

Where IDs are based on a user’s role (e.g., which department he works in) or status (e.g., employee vs. contractor), changes in the user’s role or status would trigger a change to the user’s ID. For example, a contractor who is subsequently hired as an employee would be assigned a new ID. Changes to user IDs are undesirable, as described in Section 7.

• Multiple character sets:

As described in Section 7, users fluent in one language, or whose computer is configured for text input in one language, may be unable to read, remember or enter an ID in another language, especially when the two languages use different character sets.
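
The collision problem described above, and the suffix-based remedy, can be sketched as follows. This is a minimal illustration rather than a recommended algorithm: the function names, the lowercase normalization and the two-digit suffix width are assumptions made for the example.

```python
from collections import Counter


def naive_id(first_name: str, last_name: str) -> str:
    """'Last name plus first initial', as in the Michael Smith example."""
    return (last_name + first_name[0]).lower()


def assign_ids(people: list[tuple[str, str]]) -> list[str]:
    """Assign IDs in order, appending a two-digit suffix on collision."""
    seen: Counter = Counter()
    assigned = []
    for first, last in people:
        base = naive_id(first, last)
        count = seen[base]
        seen[base] += 1
        # The first holder keeps the bare ID; later ones get 01, 02, ...
        assigned.append(base if count == 0 else f"{base}{count:02d}")
    return assigned


print(assign_ids([("Michael", "Smith")] * 3))  # ['smithm', 'smithm01', 'smithm02']
print(assign_ids([("Wei", "Li"), ("Wen", "Li")]))  # ['liw', 'liw01']
```

The second call shows the short-name problem: the IDs differ only in an arbitrary suffix, which is exactly the kind of near-duplicate that leads to misdirected e-mail.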

9 Common and recommended algorithms for assigning login IDs

9.1 Login IDs for internal users

The following process and algorithm can be used to satisfy each of the requirements set forth in Section 7:


Requirement          Strategy
Identify a person    Assign IDs to people, not roles.
User friendly        IDs should be 7 characters, total.
Easily recognizable  Formulate IDs as “Unnnnnn” where n represents a digit. There are
                     1,000,000 possible IDs of this form.
Reusable             Use the same ID on every system and application.
Compatible           IDs starting with a letter and containing only one letter and 6
                     digits work on almost every conceivable system and application.
Maximum scope        Assign an ID to every user in the organization and use these IDs
                     to sign users into applications. If possible, use the same ID as
                     an employee number as well.
Unchanging           Since the IDs are numeric, changes in user names should not
                     trigger a request for a new ID. Since they do not represent user
                     role or status, changes in these attributes also do not trigger
                     a request for a different ID.
Never reused         Create a database of every ID ever assigned. Only append to it
                     and never reuse IDs.
Not offensive        Numbers are not generally offensive, though some numbers are
                     considered “bad luck” in some cultures. Give users an
                     opportunity to request a new ID (but not to specify what it will
                     be) when they are first assigned an ID.
Cross-language       Roman letters (U) and digits are legible across cultures and
                     languages.
Limited disclosure   Do not publish lists of IDs or the correlation between user
                     names and IDs.
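
Because the format is a single letter followed by six digits, it is straightforward to generate, validate and scan for. The sketch below illustrates this under stated assumptions: the function names are hypothetical, and the sample log line is invented for the example.

```python
import re

# One capital U followed by exactly six ASCII digits: 000000 through 999999.
ID_PATTERN = re.compile(r"U[0-9]{6}")


def format_id(sequence_number: int) -> str:
    """Turn a sequence number into an ID of the form Unnnnnn."""
    if not 0 <= sequence_number <= 999_999:
        raise ValueError("sequence number does not fit in 6 digits")
    return f"U{sequence_number:06d}"


def is_valid_id(candidate: str) -> bool:
    """True only if the whole string has the Unnnnnn form."""
    return ID_PATTERN.fullmatch(candidate) is not None


def find_ids(text: str) -> list[str]:
    """Scan free text (a log line, a script) for strings that look like IDs."""
    return re.findall(r"\bU[0-9]{6}\b", text)


print(format_id(42))  # U000042
print(find_ids("login U000042 failed; U123456 ok"))  # ['U000042', 'U123456']
```

The fixed, recognizable shape is what makes the scan in find_ids reliable, which supports the "easily recognizable" requirement for automated processes.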


Another reasonable process is as follows:

Identify a person    Assign IDs to people, not roles.
User friendly        IDs should be 7 characters, total.
Easily recognizable  Formulate IDs as the user’s surname, in English, with up to 3
                     characters followed by a 4 digit number assigned sequentially
                     for each prefix. Example: the fourth “Mike Smith” could be
                     assigned “SMI0003.”
Reusable             Use the same ID on every system and application.
Compatible           IDs always start with a letter, only have letters and digits and
                     contain no more than 7 characters. Almost every conceivable
                     system and application supports this.
Maximum scope        Assign an ID to every user in the organization and use these IDs
                     to sign users into applications. If possible, use the same ID as
                     an employee number as well.
Unchanging           Since IDs do not represent user role or status, changes in these
                     attributes do not trigger a request for a different ID. Changes
                     in a user’s name may cause users to request a new ID, but in
                     most cases only a short subset of the name is used, so users are
                     likely to tolerate continuing use of their old ID.
Never reused         Create a database of every ID ever assigned. Only append to it
                     and never reuse IDs.
Not offensive        Short strings of letters are not usually offensive and neither
                     are numbers. Give users an opportunity to request a new ID,
                     indicating the string they did not like, when they are first
                     assigned an ID.
Cross-language       Roman letters and digits are legible across cultures and
                     languages.
Limited disclosure   Do not publish lists of IDs or the correlation between user
                     names and IDs.
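
The prefix-plus-counter scheme can be sketched as below. This is a minimal in-memory illustration, assuming hypothetical names (PrefixIdAllocator, allocate, history) and assuming that non-letters such as apostrophes are simply dropped from the surname; a production system would keep the counters and the history in a durable database.

```python
import re
from collections import defaultdict


class PrefixIdAllocator:
    """Allocate IDs made of a surname prefix plus a per-prefix counter."""

    def __init__(self) -> None:
        self._next = defaultdict(int)  # prefix -> next sequence number
        self.history = []              # append-only: every ID ever assigned

    def allocate(self, surname: str) -> str:
        # Keep only Roman letters, then take up to the first three.
        letters = re.sub(r"[^A-Za-z]", "", surname).upper()
        if not letters:
            raise ValueError("surname has no Roman letters to build a prefix from")
        prefix = letters[:3]
        number = self._next[prefix]
        if number > 9999:
            raise RuntimeError(f"prefix {prefix} is exhausted")
        self._next[prefix] += 1
        new_id = f"{prefix}{number:04d}"
        self.history.append(new_id)
        return new_id


allocator = PrefixIdAllocator()
ids = [allocator.allocate("Smith") for _ in range(4)]
print(ids[-1])                  # SMI0003, the fourth Smith
print(allocator.allocate("Li")) # LI0000, shorter because the surname is short
```

Counters start at 0000, which is why the fourth user with a given prefix receives the suffix 0003, matching the example in the table.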


9.2 Login IDs for external users

External users who sign into an organization’s Internet-facing applications generally do so only infrequently. Since Internet users generally already have an e-mail address and since e-mail addresses are guaranteed to be globally unique, it makes sense to identify external users with their fully qualified e-mail address.

This has many advantages:

Identify a person    Use fully qualified e-mail addresses.
User friendly        Users already know their own e-mail addresses.
Easily recognizable  E-mail addresses are easily recognized by people and programs.
Reusable             Users already use their e-mail address elsewhere, so by
                     definition assigning this as an ID is reusing it.
Compatible           E-mail addresses are not compatible with all applications. They
                     can be quite long (over 100 characters) and may contain symbols
                     not supported by some applications (@, _, -, .). These
                     limitations are not usually problematic with Internet-facing
                     applications, but they can present difficulties for “back
                     office” systems, such as mainframes.
Maximum scope        E-mail addresses can be used as IDs on every Internet-facing
                     application.
Unchanging           Users do periodically change their e-mail address, so this
                     requirement is, unfortunately, violated.
Never reused         Few if any e-mail systems assign the same ID, consecutively, to
                     different users. This reduces the problem of ID reuse to a
                     vanishingly small size.
Not offensive        Users presumably already address this problem when provisioning
                     their e-mail account, so this problem is transferred to another
                     organization.
Cross-language       SMTP e-mail addresses are, by definition, cross-cultural and
                     global.
Limited disclosure   E-mail addresses are widely known, so this requirement cannot be
                     met.


9.3 Assigning new E-mail addresses to internal users

Identify a person    Assign a new and unique e-mail address to every new e-mail user.
User friendly        Assign firstName.lastName@organizationDomain and insert
                     .uniqueID before the @ if required, where the uniqueID is two
                     letters (aa, ab, ac, etc.).
Easily recognizable  E-mail addresses are easily recognized by people and programs.
Reusable             Users can use their e-mail address to sign into a variety of
                     web-based applications. Since many legacy applications do not
                     support long IDs or IDs containing punctuation marks, e-mail
                     addresses cannot be reused everywhere, nor should they, because
                     they are long and so take longer to type than other, typically
                     internal IDs.
Compatible           E-mail addresses have a standard format, compatible with all
                     mail systems. Compatibility with other applications is not
                     predictable.
Maximum scope        E-mail addresses can be used as IDs on many 3rd party
                     Internet-facing applications.
Unchanging           Unfortunately, users will generally demand changes to their
                     e-mail address when their name changes. This is unavoidable with
                     this format.
Never reused         Create a repository of all current and previously assigned
                     e-mail addresses. Even in the case where a user with a given
                     name leaves and later a different person with the same name
                     joins, use the unique field.
Not offensive        Users are not generally offended by their own names.
Cross-language       SMTP e-mail addresses are, by definition, cross-cultural and
                     global.
Limited disclosure   E-mail addresses are widely known, so this requirement cannot be
                     met.
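
The address format above, including the two-letter unique field, can be sketched as follows. The function names, the lowercase normalization and the placeholder domain example.com are assumptions made for this illustration; names containing spaces or punctuation would need additional handling that is not shown.

```python
from itertools import product
from string import ascii_lowercase


def candidate_addresses(first_name: str, last_name: str, domain: str):
    """Yield first.last@domain, then first.last.aa@domain, .ab, .ac, ..."""
    local = f"{first_name}.{last_name}".lower()
    yield f"{local}@{domain}"
    for a, b in product(ascii_lowercase, repeat=2):
        yield f"{local}.{a}{b}@{domain}"


def assign_address(first_name: str, last_name: str, domain: str, ever_assigned: set) -> str:
    """Pick the first candidate that has never been assigned, and record it."""
    for address in candidate_addresses(first_name, last_name, domain):
        if address not in ever_assigned:
            ever_assigned.add(address)  # the repository only ever grows
            return address
    raise RuntimeError("all 677 candidate addresses are taken")


repository = set()
print(assign_address("John", "Smith", "example.com", repository))  # john.smith@example.com
print(assign_address("John", "Smith", "example.com", repository))  # john.smith.aa@example.com
```

Because the repository holds every address ever assigned rather than only the active ones, a later John Smith receives the next unique field even after the first one has left.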

10 Example business processes

Following are some typical examples that illustrate how the naming algorithms described in Section 9 above are used.

10.1 Employee / contractor onboarding

1. For employees: HR creates a new employee record.

2. For contractors: a manager submits a new-contractor request.

3. In either case, the request includes the user’s full name.

4. Once the request is approved:

   (a) A new login ID is assigned.
   (b) Using the algorithm in Subsection 9.1:
       i. A database is referenced to find the highest-numbered, already-assigned ID.
       ii. The next number is used.
       iii. Database locking is used to ensure that two users, provisioned at nearly the same instant, do not get the same ID.
       iv. The ID might be U0012311.
   (c) A new e-mail address is assigned.
   (d) Using the algorithm in Subsection 9.3:
       i. “John Smith” might become “[email protected]”
       ii. As with the previous example, a database lookup is required to check for duplicates.
       iii. If a duplicate is found, the e-mail address might become “[email protected]”
       iv. The new ID must be stored in the database, correlated to U0012311.
       v. Also as before, record locking semantics must be used to avoid a case where two same-named users are assigned the same address if they are provisioned nearly simultaneously.
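
The "find the highest number, take the next one, under a lock" steps can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not a reference implementation: the table and column names are hypothetical, SQLite stands in for whatever database the organization actually uses, and the ID follows the one-letter, six-digit format from Subsection 9.1.

```python
import sqlite3


def open_registry(path: str = ":memory:") -> sqlite3.Connection:
    # isolation_level=None lets the code issue BEGIN/COMMIT explicitly.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS assigned_ids ("
        " seq INTEGER PRIMARY KEY,"
        " login_id TEXT NOT NULL UNIQUE,"
        " full_name TEXT NOT NULL)"
    )
    return conn


def assign_login_id(conn: sqlite3.Connection, full_name: str) -> str:
    # BEGIN IMMEDIATE takes the write lock up front, so two provisioning
    # requests cannot both read the same highest number.
    conn.execute("BEGIN IMMEDIATE")
    try:
        (highest,) = conn.execute(
            "SELECT COALESCE(MAX(seq), 0) FROM assigned_ids"
        ).fetchone()
        seq = highest + 1
        if seq > 999_999:
            raise RuntimeError("ID space exhausted")
        login_id = f"U{seq:06d}"
        conn.execute(
            "INSERT INTO assigned_ids (seq, login_id, full_name) VALUES (?, ?, ?)",
            (seq, login_id, full_name),
        )
        conn.execute("COMMIT")
        return login_id
    except Exception:
        conn.execute("ROLLBACK")
        raise


conn = open_registry()
print(assign_login_id(conn, "John Smith"))  # U000001
print(assign_login_id(conn, "John Smith"))  # U000002, a different person with the same name
```

Rows are only ever inserted, never deleted, so the table doubles as the append-only record of every ID ever assigned.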

10.2 Customer onboarding (Internet-facing)

1. A new customer fills in an access request form.

2. The form should include a CAPTCHA to ensure that it is filled in by a person, rather than a (possibly malicious) script.

3. The user should be required to enter his existing e-mail address.

4. Form input should validate that the e-mail address is well formed.

5. Account activation may involve e-mail validation:

   (a) An activation URL is sent to the user’s e-mail address.
   (b) The URL includes a pseudo-random string.
   (c) The user has to click through to the URL to activate the account.
   (d) Activation strings and un-activated accounts should be scrubbed periodically, for example when they are over 24 hours old.

6. This method ensures that all users have a globally-unique, already-remembered ID.

7. Password reset can be accomplished by sending an activation string to the user, just like account activation.
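
The activation-string mechanism in step 5 can be sketched as follows. This is a minimal in-memory illustration with hypothetical names (PendingActivations, issue, activate, scrub). It uses Python's secrets module, which draws from a cryptographically strong source rather than an ordinary pseudo-random generator, and it assumes the 24-hour lifetime given as an example above.

```python
import secrets
from datetime import datetime, timedelta, timezone

ACTIVATION_LIFETIME = timedelta(hours=24)


class PendingActivations:
    """Track activation strings that have been e-mailed but not yet used."""

    def __init__(self) -> None:
        self._pending = {}  # token -> (email, issued_at)

    def issue(self, email: str, now: datetime = None) -> str:
        now = now or datetime.now(timezone.utc)
        token = secrets.token_urlsafe(32)  # unguessable, safe to embed in a URL
        self._pending[token] = (email, now)
        return token

    def activate(self, token: str, now: datetime = None):
        """Return the e-mail address if the token is known and fresh, else None."""
        now = now or datetime.now(timezone.utc)
        entry = self._pending.pop(token, None)  # single use: removed on first try
        if entry is None:
            return None
        email, issued_at = entry
        if now - issued_at > ACTIVATION_LIFETIME:
            return None
        return email

    def scrub(self, now: datetime = None) -> int:
        """Delete expired tokens and return how many were removed."""
        now = now or datetime.now(timezone.utc)
        expired = [
            token
            for token, (_, issued_at) in self._pending.items()
            if now - issued_at > ACTIVATION_LIFETIME
        ]
        for token in expired:
            del self._pending[token]
        return len(expired)
```

The same issue and activate pair could serve the password reset flow in step 7, since both rely on proving control of the e-mail address.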

10.3 Renaming an existing employee login ID

1. Users may ask for a new ID in the event that their old ID was based on their name, which has since changed.

2. Organizational changes – mergers, acquisitions, etc. – may trigger renames to align naming standards.

3. In general, so long as a user has the same ID on all systems, it is safer to leave that ID alone and provision any new accounts for the same user with the pre-existing ID. Renaming an ID is risky, since scripts or programs may explicitly refer to the old ID.

4. Where renaming a user is deemed essential, be careful to consider:

   (a) Scripts or programs that refer to the old ID.
   (b) Uniqueness of the new ID (should not be used by any other user on any system).
   (c) Compatibility of the new ID with all systems, not just those which the user will access immediately.

5. Before renaming a user, notify him of the change, both so that he can sign in after it happens and so that he can report problems that may have been caused by the change quickly.
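
The uniqueness and compatibility checks in step 4 lend themselves to automation before a rename is approved. The sketch below is illustrative only: the SystemRules structure, the two system names and their length and character limits are hypothetical and do not describe any real product.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemRules:
    name: str
    max_length: int
    allowed: str             # regular expression the whole ID must match
    existing_ids: frozenset  # IDs already in use on this system


def rename_problems(new_id: str, systems: list) -> list:
    """Return a list of reasons why new_id is unsuitable; empty means no objection."""
    problems = []
    for system in systems:
        if len(new_id) > system.max_length:
            problems.append(f"{system.name}: longer than {system.max_length} characters")
        if re.fullmatch(system.allowed, new_id) is None:
            problems.append(f"{system.name}: contains characters the system does not allow")
        # Compare case-insensitively, since some systems fold case.
        if new_id.lower() in {existing.lower() for existing in system.existing_ids}:
            problems.append(f"{system.name}: already in use")
    return problems


# Hypothetical systems and limits, for illustration only.
systems = [
    SystemRules("directory", 20, r"[A-Za-z0-9._-]+", frozenset({"jsmith"})),
    SystemRules("legacy", 8, r"[A-Z0-9]+", frozenset()),
]
print(rename_problems("JDOE", systems))              # []
print(rename_problems("john.smith-jones", systems))  # two problems, both on "legacy"
```

Checking every system, including those the user does not yet access, reflects the point in step 4(c) that the new ID may need to be provisioned elsewhere later.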