
Version 4.1 - 2.0 MD5 and SHA

I grouped MD5 and SHA together because both are hash functions. As the Wikipedia page on hash functions explains, a hash function takes data of arbitrary length and produces output of a fixed length. MD5 and SHA are called "cryptographic" hash functions because they are designed to be one-way: if you have only the hash, you can't feasibly reconstruct the message. A hash is not encryption like 3DES (there is no key and nothing to decrypt), but it does provide (limited) security.

Keep in mind what a hash is and what it does. If you have a 3 KB text file and you take the hash of that file, the result is a much smaller, fixed-size value: 128 bits for MD5 and 160 bits for SHA-1. If even one character of that file is changed, the new hash will not match the original, so a hash can be used to verify a message's integrity. Hashes are also used for authentication. When you log into a system, your password can be hashed and the hash sent to the server, which compares it against the value it expects. Sending the hash means that you're not sending the actual password over the network, and someone who captures the hash cannot simply reverse it to re-create the password (although, as discussed below, weak passwords can still be guessed). The other thing about a hash is that it is generally cheaper to compute than actual encryption.
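To make the integrity idea concrete, here is a minimal sketch using Python's standard hashlib module. The sample text and the names original, altered and digests are made up for illustration; the point is only that the digest size stays fixed and that a one-character change produces a different hash.

```python
import hashlib

# Roughly 3 KB of sample text (44 bytes x 70 lines); the content is arbitrary.
original = b"The quick brown fox jumps over the lazy dog\n" * 70
# Change a single character in the first line ("quick" -> "quack").
altered = original.replace(b"quick", b"quack", 1)


def digests(data: bytes) -> dict:
    """Return the MD5 and SHA-1 hex digests of the given bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
    }


before = digests(original)
after = digests(altered)

for name in ("md5", "sha1"):
    bits = len(before[name]) * 4  # each hex digit encodes 4 bits
    print(f"{name}: {bits}-bit digest")
    print(f"  original: {before[name]}")
    print(f"  altered:  {after[name]}")
    print(f"  match:    {before[name] == after[name]}")
```

The input is about 3 KB either way, but the MD5 digest is always 128 bits and the SHA-1 digest always 160 bits, and neither digest of the altered copy matches the original.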

Let's start with the specifics of each. MD5 is defined in RFC 1321. The RFC lays out the steps used to calculate the hash, but a fuller treatment of the function can be found on Wikipedia.

MD5 takes a message of arbitrary length, pads it to a multiple of 512 bits, and breaks it into 512-bit blocks. Each block in turn is used to update a 128-bit internal state, held as four 32-bit words (A, B, C, D). Once the last block has been processed, that state is the 128-bit hash, usually written as a sequence of 32 hexadecimal digits.
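The padding step can be sketched with a little arithmetic. This is my reading of the padding rule in RFC 1321 (one 0x80 byte, then zero bytes, then an 8-byte length field), not code taken from the RFC; md5_padded_length is a made-up helper name.

```python
import hashlib


def md5_padded_length(message_len: int) -> int:
    """Byte length of a message after MD5 padding.

    Padding appends one 0x80 byte, then enough zero bytes to reach
    56 mod 64, then an 8-byte field holding the original bit length.
    """
    zero_bytes = (55 - message_len) % 64
    return message_len + 1 + zero_bytes + 8


# A 512-bit block is 64 bytes, so the padded length is always a multiple of 64.
for n in (0, 55, 56, 64, 3072):
    padded = md5_padded_length(n)
    print(f"{n:5d} bytes -> {padded:5d} padded bytes = {padded // 64} block(s)")

# Whatever the input size, the digest prints as 32 hex digits (128 bits).
digest = hashlib.md5(b"any message at all").hexdigest()
print(len(digest), "hex digits")
```

Note that a 56-byte message already needs a second block: the mandatory 0x80 byte and the 8-byte length field no longer fit in the first one.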

There are two important places you will see MD5. The first is routing protocol authentication. The second is the MD5 hash as a file integrity check. For example, OSPF authentication can be none, clear text, or MD5. Compared to the other two choices, MD5 is better. However, it's not bulletproof and should not be the only form of protection for the control plane.

"The security of the MD5 hash function is severely compromised." [Wikipedia] There are two basic ideas that illustrate the weakness of the MD5 hash. The first is that it is feasible to construct two different documents (Document1 and DocumentZ) that have the same hash value x. This is called a collision. In practice, the attacker typically crafts both documents, for example by adding hidden data to each until the hashes match. The second relates to MD5 password hashes. If you know a value is the hash of a password and you know how the function calculates the hash, you can brute force it: hash every candidate password until you find a matching hash. Many of these hashes can be computed in advance and stored in tables (rainbow tables are one form of this), so a captured hash only has to be looked up. For common passwords, that finds the password very quickly.
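Here is a toy sketch of the precomputed-table idea, assuming unsalted MD5 password hashes. The candidate list, the lookup table and the crack helper are all hypothetical, and this is a plain lookup table rather than a true rainbow table (which trades storage for computation using hash chains).

```python
import hashlib
from typing import Optional

# Hypothetical list of common passwords an attacker might try.
candidate_passwords = ["123456", "password", "letmein", "cisco123"]

# Computed once, in advance: map each MD5 hash back to its password.
lookup = {hashlib.md5(p.encode()).hexdigest(): p for p in candidate_passwords}


def crack(captured_hash: str) -> Optional[str]:
    """Return the password for a captured hash, or None if it is not in the table."""
    return lookup.get(captured_hash)


# Pretend this hash was captured off the wire or from a password file.
captured = hashlib.md5(b"letmein").hexdigest()
print(crack(captured))

# A password that is not in the table is simply not found.
print(crack(hashlib.md5(b"a much longer, unusual passphrase").hexdigest()))
```

Once the table is built, each lookup is a single dictionary access, which is why a weak password falls almost instantly.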

So with all the issues of MD5, let's turn to the other hash function - SHA. SHA-1 is slightly better than MD5, but is also vulnerable to attacks. If you have insomnia or really want to bake your noodle, there is a Wikipedia page on hash function security which summarizes the strengths and weaknesses of the various hash functions. Fascinating.


There are a number of different variants of SHA. SHA-1 processes input in 512-bit blocks, like MD5, but produces 160-bit output. The idea is that adding another 32 bits to the state (compared to MD5) makes it more difficult to produce a collision. However, as computers get more powerful, it becomes increasingly feasible to find a collision by brute force. Wikipedia also has entries on SHA-2 and SHA-3.

SHA-2 is actually a "family" of six functions - SHA-224, SHA-256, SHA-384, SHA-512, SHA-512/224, SHA-512/256. I don't think there will be any specific questions about the differences between these family members. Just remember that the number in the name gives the size of the hash output in bits. One thing that may be important is that SHA-2 was designed by the NSA and is FIPS-approved. It also crops up in the AMP material, since the SHA-256 hash is used to identify malware.
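A quick way to check that the number in the name is the output size is to ask Python's hashlib. This sketch covers only the four SHA-2 members that hashlib always provides; the two truncated SHA-512 variants depend on the underlying OpenSSL build, so I left them out. The message and the sizes dictionary are just illustrative names.

```python
import hashlib

message = b"sample input"  # arbitrary; the digest size does not depend on it

# Map each SHA-2 function name to its digest size in bits.
sizes = {}
for name in ("sha224", "sha256", "sha384", "sha512"):
    h = hashlib.new(name, message)
    sizes[name] = h.digest_size * 8  # digest_size is reported in bytes
    print(f"{name}: {sizes[name]} bits, {len(h.hexdigest())} hex digits")
```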

I haven't seen SHA-3 in any of the recommended study materials. It is relatively new, but just know that it exists (because anything is fair game) and that it uses a really big internal state (1600 bits, arranged as a 5 x 5 array of 64-bit words).
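The 1600-bit figure is just the array arithmetic, which a few lines of Python can confirm. The rate/capacity split below is the standard sponge parameter choice for the four fixed-length SHA-3 functions (capacity is twice the output size); it goes a bit deeper than the study materials, and sponge_parameters is a made-up helper name.

```python
import hashlib

# 5 x 5 array of 64-bit words.
STATE_BITS = 5 * 5 * 64


def sponge_parameters(output_bits: int) -> tuple:
    """Return (rate, capacity) in bits for a fixed-length SHA-3 function."""
    capacity = 2 * output_bits      # capacity is twice the digest size
    rate = STATE_BITS - capacity    # the rest of the state absorbs input
    return rate, capacity


print("state:", STATE_BITS, "bits")
for bits in (224, 256, 384, 512):
    rate, capacity = sponge_parameters(bits)
    digest_bits = hashlib.new(f"sha3_{bits}", b"").digest_size * 8
    print(f"sha3_{bits}: digest {digest_bits} bits, rate {rate}, capacity {capacity}")
```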

Note that SHA-2 is considered cryptographically sound. (MD5 and SHA-1 are not sound.) And remember that IPv6 CGAs use SHA-1 hashing to bind the RSA public key to the generated address. I really don't think they are going to go any deeper than that on this topic.
