Character set / Key space - Hakin9_StarterKit_04

A password character-set refers to a closed set of characters which the password might contain.

Common sets include:

• Digits – the password contains only digits [0-9]

• Lower case letters – [a-z]

• Alphanumeric case sensitive – [a-zA-Z0-9]

• All standard keyboard characters.

A password’s key space determines the number of possibilities each character has. For instance, the Digits character-set gives us ten different possibilities to choose from for each character.

Table 1. Keyspace

Length

Length is simply the number of characters a password uses |p|.

Combining the two properties, we can determine the number of permutations there are for a given keyspace and length: the size of the keyspace raised to the power of the length, |keyspace|^|p|.

For example, suppose we’re told that a password is 8 characters long and uses the lower case keyspace. Each character can then be any one of the 26 possibilities [a-z], and since there are 8 characters, the number of permutations for such a password is

26^8 = 208827064576

Honestly, I find it hard to wrap my mind around such a number but as you’ll see 26^8 is considered small.
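To get a feel for how quickly these numbers grow, here is a minimal Python sketch that raises the size of each character set to the power of the password length. The names CHARSETS and permutations are mine, and treating "all standard keyboard characters" as the 95 printable ASCII characters is an assumption, not something the tables here confirm.

```python
import string

# Character sets from the list above. "keyboard" is an assumption: the 95
# printable ASCII characters (letters, digits, punctuation and space).
CHARSETS = {
    "digits": string.digits,
    "lowercase": string.ascii_lowercase,
    "alphanumeric": string.ascii_letters + string.digits,
    "keyboard": string.ascii_letters + string.digits + string.punctuation + " ",
}


def permutations(charset: str, length: int) -> int:
    """Number of distinct passwords of exactly `length` characters."""
    return len(charset) ** length


# Print the count for 8-character passwords in every character set.
for name, charset in CHARSETS.items():
    count = permutations(charset, 8)
    print(f"{name:>12} ({len(charset):>2} chars), length 8: {count:,}")
```

The lowercase row reproduces the 26^8 figure from the example; the other rows show how much each extra group of characters adds.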

Table 2. permutations

Passwords are at the center of many authentication mechanisms. Login forms ask us to provide credentials, which usually consist of an email / username and a password, in order to get access to a system, be it our Email or Facebook account.

Let us take a closer look at how a typical login process works:

Before logging in, one must first sign up by providing the system with at least a username and a password.

Figure 1. Registration form

Upon registration, the system will typically store your credentials in a database: a new record will be created containing your username and your password in either clear text or hashed form.

Because databases are subject to theft, it is considered bad security practice to store passwords as clear text. A common approach uses a one-way hash function to transform a given password into a hashed value; the system then stores this value instead of the original clear text password.

Figure 2. Naive snippet demonstrating both the signup and login processes; note the conversion to, and use of, the hash value

Using hashed values has its benefits: for instance, no one, including the authentication mechanism itself, can easily determine the user’s original password just by looking at its hashed value.

A login process might look as follows: given a username and a password, look up the user’s record in the database using the provided username. If a record is found, hash the given password and compare this value against the one in the fetched record’s password field. If the stored hash matches the calculated hash, the user is authenticated.
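The signup and login steps described above can be sketched roughly as follows. This is my own naive illustration, not the original snippet: the in-memory dictionary users stands in for the database, and the function names md5_hex, signup and login are hypothetical.

```python
import hashlib

# Hypothetical stand-in for the database: username -> hex MD5 of the password.
users = {}


def md5_hex(password: str) -> str:
    """Return the MD5 digest of the password as a hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def signup(username: str, password: str) -> bool:
    """Create a record holding the hashed password, never the clear text."""
    if username in users:
        return False  # username already taken
    users[username] = md5_hex(password)
    return True


def login(username: str, password: str) -> bool:
    """Hash the supplied password and compare it with the stored hash."""
    stored_hash = users.get(username)
    if stored_hash is None:
        return False  # no such user
    return md5_hex(password) == stored_hash


signup("roi_lipman", "secret")
print(login("roi_lipman", "secret"))  # correct password
print(login("roi_lipman", "guess"))   # wrong password
```

Note that the clear text password is only ever held in memory long enough to be hashed; what ends up in users is the digest.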

In this article I’m going to focus on the MD5 hash function, although there are others, for instance SHA-1. The former generates a 16-byte hash value, while the latter generates a 20-byte hash value.

Example: MD5(Hakin9) = 9a6d4d8263e13790bdbd81610487f1f2

For a secure hash function to perform well, it must maintain the following properties:

• One directional – There’s no (known) practical way to reverse the process: suppose that Hash(v) = h; given only h, there’s no known efficient function G such that G(h) = v.

• Fixed length – It doesn’t matter how long the input is; the output will always have the same size. For example, a 1000-byte input and a 10-byte input will both result in a 16-byte output when using MD5 as the hash function.

• Collision resistance – finding a pair X ≠ Y such that Hash(X) = Hash(Y) should require about 2^(N/2) hash computations (where N is the size of the output in bits).

• Preimage resistance – for a given hash value H ϵ {0,1}^N, finding an input value X ϵ {0,1}* such that Hash(X) = H is expected to require about 2^N tries.

• Second preimage resistance – for a given X ϵ {0,1}*, finding Y ϵ {0,1}* with Y ≠ X such that Hash(X) = Hash(Y) is expected to require about 2^N tries.
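The fixed length property from the list above is easy to check for yourself. The sketch below uses Python's hashlib; the helper name digest_length is mine.

```python
import hashlib


def digest_length(algorithm: str, data: bytes) -> int:
    """Length in bytes of the digest `algorithm` produces for `data`."""
    return len(hashlib.new(algorithm, data).digest())


short_input = b"x" * 10    # 10-byte input
long_input = b"x" * 1000   # 1000-byte input

# The digest size depends only on the algorithm, never on the input size.
for algorithm in ("md5", "sha1"):
    print(algorithm,
          digest_length(algorithm, short_input),
          digest_length(algorithm, long_input))
```

Both inputs give 16 bytes for MD5 and 20 bytes for SHA-1.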

Collisions are inevitable: we allow an infinite set of inputs but constrain ourselves to a finite set of outputs, so the hash function acts as a reduction function with a fixed output length.
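To see this in practice without waiting forever, we can shrink the output. The sketch below is a toy of my own: it keeps only the first 2 bytes (16 bits) of each MD5 digest, so there are just 65,536 possible outputs, and then hashes the strings "0", "1", "2", ... until two of them land on the same truncated value. The names truncated_md5 and find_collision are hypothetical, and truncating MD5 this way is purely for illustration.

```python
import hashlib


def truncated_md5(data: bytes, n_bytes: int = 2) -> bytes:
    """First `n_bytes` bytes of the MD5 digest (a deliberately tiny hash)."""
    return hashlib.md5(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 2):
    """Hash "0", "1", "2", ... until two inputs share a truncated digest.

    Returns (first_input, second_input, number_of_hashes_computed).
    """
    seen = {}  # truncated digest -> input that produced it
    counter = 0
    while True:
        candidate = str(counter).encode("ascii")
        tag = truncated_md5(candidate, n_bytes)
        if tag in seen:
            return seen[tag], candidate, counter + 1
        seen[tag] = candidate
        counter += 1


first, second, tries = find_collision()
print(first, second, tries)
```

With only 2^16 possible outputs, a collision must appear within 2^16 + 1 inputs, and the birthday effect usually produces one after a few hundred tries, in the region of 2^(16/2) = 256.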

Suppose that we’ve followed the best practice of storing hashed passwords instead of clear text passwords in our database, and for some reason our database has been compromised. An attacker trying to use the stored credentials from the compromised database to gain access to the system would fail. Let’s examine why:

Suppose that an attacker tries to log in using the following credentials:

username: roi_lipman

password: 5ebe2294ecd0e0f08eab7690d2a6ee69 (Hash value of the password “secret”)

Indeed, a record for the username roi_lipman would be found, but the provided password would get hashed again (a second time), resulting in 7022cd14c42ff272619d6beacdc9ffde, which is not the hash value of the original password “secret”.

For the attacker to gain access, he or she would have to figure out the original input that produced the stored hash “5ebe2294ecd0e0f08eab7690d2a6ee69”. As I mentioned earlier, there’s no straightforward way of finding this out.
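The failed attempt can be reproduced in a few lines. This is a minimal sketch under the same naive scheme as before (plain MD5, no salt); the names md5_hex, stored_hash and check_password are mine.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the MD5 digest of the text as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


# What the stolen database record holds for roi_lipman.
stored_hash = md5_hex("secret")


def check_password(submitted: str, stored: str) -> bool:
    """The login check: hash whatever was submitted, then compare."""
    return md5_hex(submitted) == stored


# The legitimate user submits the clear text password.
print(check_password("secret", stored_hash))

# The attacker submits the stolen hash itself; it gets hashed a second time
# and no longer matches the stored value.
print(check_password(stored_hash, stored_hash))
```

The first check succeeds and the second fails, which is exactly why a stolen hash cannot simply be replayed through the login form.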

Unfortunately, not all hope is lost as our attacker still has several password cracking methods at his disposal:

• Brute force – In a brute force attack, one tries every possible value within a desired key space and character length. For example, our attacker can calculate hash values for every string of length 9 within the alphanumeric key space, comparing each against the hash being looked up. In case of a match, the attacker can use the finding to log in as a legitimate user.

• Dictionary attack – Similar to a brute force attack, but instead of trying every possible string within a desired length and keyspace, the attacker limits himself to strings within a much smaller set, for example the English dictionary, hashing each word and comparing the calculated hash against the hash being looked up.

• Rainbow tables – Using a precomputed data structure to speed up the lookup process.
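The first two methods above can be sketched in Python as follows. This is a simplified illustration, not an optimized cracker: the function names are mine, the brute force variant tries every length from 1 up to max_length rather than a single fixed length, and the tiny word list stands in for a real dictionary.

```python
import hashlib
import itertools
import string


def md5_hex(text: str) -> str:
    """Return the MD5 digest of the text as a hex string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def brute_force(target_hash: str, charset: str, max_length: int):
    """Try every string over `charset` of length 1..max_length."""
    for length in range(1, max_length + 1):
        for combo in itertools.product(charset, repeat=length):
            candidate = "".join(combo)
            if md5_hex(candidate) == target_hash:
                return candidate
    return None  # not found within this key space and length


def dictionary_attack(target_hash: str, words):
    """Try only the candidates in `words`, a much smaller set."""
    for word in words:
        if md5_hex(word) == target_hash:
            return word
    return None


# Brute force a 3-character lower case password (at most 18,278 hashes).
print(brute_force(md5_hex("cab"), string.ascii_lowercase, 3))

# Dictionary attack with a tiny, made-up word list.
print(dictionary_attack(md5_hex("secret"), ["password", "letmein", "secret"]))
```

Both functions return None when the password lies outside the set of candidates tried, which is the whole trade-off: brute force is exhaustive but slow, while a dictionary is fast but only as good as its word list.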

As our attacker has no preliminary information about the original password, he decides to use brute force, hoping to crack the hash. Let’s estimate the time required for an attacker to find the hash’s originating value:

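A modern CPU can compute around 150,000,000 MD5 hashes every second (this number will vary depending on the CPU model). Using such a CPU to crack a password of length 9 from the mixed alphanumeric key space means trying up to 62^9 ≈ 1.35 × 10^16 candidates, which at this rate takes up to roughly 2.9 years.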

Figure 3. Time to crack using a CPU. X axis: password length; Y axis: time in years
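The numbers behind such a chart can be estimated with a short calculation. The sketch below assumes the rate of 150,000,000 hashes per second quoted above, the 62-character mixed alphanumeric key space, and the worst case in which every candidate of the given length has to be tried; the function name worst_case_years is mine.

```python
HASHES_PER_SECOND = 150_000_000        # CPU rate quoted in the text
KEYSPACE = 62                          # [a-zA-Z0-9]
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def worst_case_years(length: int,
                     keyspace: int = KEYSPACE,
                     rate: int = HASHES_PER_SECOND) -> float:
    """Years needed to hash every password of exactly `length` characters."""
    return keyspace ** length / rate / SECONDS_PER_YEAR


# Each extra character multiplies the time by the key space size (62).
for length in range(6, 11):
    print(f"length {length:>2}: {worst_case_years(length):.4f} years")
```

On average the attacker finds the password after searching about half of the candidates, so the expected time is roughly half the worst case.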

To speed up the process, an attacker could move the hash crunching from the CPU over to the GPU (Graphics Processing Unit).

In document Hakin9_StarterKit_04_2013 (Page 116-120)