# MD5CRK

In cryptography, MD5CRK was a distributed effort (similar to distributed.net) launched by Jean-Luc Cooke and his company, CertainKey Cryptosystems, to demonstrate that the MD5 message digest algorithm is insecure by finding a collision: two different messages that produce the same MD5 hash. The project went live on March 1, 2004, and ended in August 2004 after a collision for MD5 was discovered using analytical methods.

Figure: Pollard's rho collision search for a single path.

A technique called Pollard's rho algorithm (a cycle detection algorithm) was used to try to find a collision for MD5. The algorithm can be described by analogy with a random walk. Because any function with a finite number of possible outputs must eventually cycle when placed in a feedback loop, one can use a relatively small amount of memory to store only those outputs that have a particular structure, and use them as "markers" to detect when a marker has been passed before. These markers are called distinguished points; the point where two inputs produce the same output is called a collision point. MD5CRK considered any point whose first 32 bits were zeroes to be a distinguished point.
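The distinguished-point idea can be illustrated on a toy scale. The following is a minimal sketch, not MD5CRK's actual client code: it assumes a 24-bit truncation of MD5 (so that it finishes in well under a second), a 6-bit distinguished-point criterion instead of the project's 32 bits, and sequential start points. The names `f`, `walk`, `locate` and `find_collision` and the constants are hypothetical choices made for this example.

```python
import hashlib

N_BYTES = 3                       # assumption: search a 24-bit truncation of MD5
DP_BITS = 6                       # assumption: distinguished = top 6 bits are zero
MAX_TRAIL = 20 * (1 << DP_BITS)   # give up on trails that never reach a marker


def f(x: bytes) -> bytes:
    """One step of the walk: truncated MD5 fed back into itself."""
    return hashlib.md5(x).digest()[:N_BYTES]


def is_distinguished(x: bytes) -> bool:
    return int.from_bytes(x, "big") >> (8 * N_BYTES - DP_BITS) == 0


def walk(start: bytes):
    """Iterate f from start until a distinguished point; return (point, steps)."""
    x = start
    for steps in range(1, MAX_TRAIL + 1):
        x = f(x)
        if is_distinguished(x):
            return x, steps
    return None, MAX_TRAIL


def locate(start_a, len_a, start_b, len_b):
    """Two trails end at the same marker: find the inputs where they merge."""
    a, b = start_a, start_b
    # Advance the longer trail so both are equally far from the shared marker.
    while len_a > len_b:
        a = f(a)
        len_a -= 1
    while len_b > len_a:
        b = f(b)
        len_b -= 1
    if a == b:
        return None               # one trail began on the other: no collision
    while f(a) != f(b):
        a, b = f(a), f(b)
    return a, b                   # a != b but f(a) == f(b)


def find_collision():
    seen = {}                     # marker -> (start point, trail length)
    seed = 0
    while True:
        start = seed.to_bytes(N_BYTES, "big")
        seed += 1
        point, steps = walk(start)
        if point is None:
            continue
        if point in seen:
            result = locate(start, steps, *seen[point])
            if result is not None:
                return result
        else:
            seen[point] = (start, steps)


if __name__ == "__main__":
    a, b = find_collision()
    print(a.hex(), b.hex(), f(a).hex())
```

Only the markers and their trail metadata are stored, which is what keeps memory use small; the full trails are recomputed once, in `locate`, when two trails arrive at the same marker.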

## Complexity

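The expected time to find a collision is not [itex]2^{N}[/itex], where [itex]N[/itex] is the number of bits in the digest output; because of the birthday paradox it is far smaller. The probability that [itex]K[/itex] collected function outputs are all distinct is [itex]\frac{(2^N)!}{(2^N - K)! \times (2^N)^K}[/itex], so a collision becomes likely once [itex]K[/itex] is on the order of [itex]2^{N/2}[/itex].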

For this project, the probability of success after [itex]K[/itex] MD5 computations can be approximated by [itex]1 - e^{\frac{K \times (1-K)}{2^{N+1}}}[/itex].
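As a rough numerical check, the exact birthday probability (one minus the probability that all K outputs are distinct) can be compared with the exponential approximation 1 - exp(K(1 - K) / 2^(N+1)) for a small digest size. The sketch below assumes outputs that are uniformly distributed over 2^N values; the function names are hypothetical.

```python
import math


def exact_collision_probability(n_bits: int, k: int) -> float:
    """1 - P(all k outputs distinct), for uniform n_bits-bit outputs."""
    space = 2 ** n_bits
    p_distinct = 1.0
    for i in range(k):
        p_distinct *= (space - i) / space
    return 1.0 - p_distinct


def approx_collision_probability(n_bits: int, k: int) -> float:
    """Exponential approximation 1 - exp(k * (1 - k) / 2**(n_bits + 1))."""
    return -math.expm1(k * (1 - k) / 2 ** (n_bits + 1))


if __name__ == "__main__":
    for k in (100, 300, 600):
        print(k,
              exact_collision_probability(16, k),
              approx_collision_probability(16, k))
```

For a 16-bit output the two values stay close to each other, and both pass one half at roughly 300 outputs, which is a little more than 2^8.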

The number of computations required for a 50% chance of producing a collision in the 128-bit MD5 message digest function is thus approximately [itex]1.17741 \times 2^{N/2} = 1.17741 \times 2^{64}[/itex].
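The constant 1.17741 can be recovered from the approximate success probability by asking when it reaches one half, using the simplification that K(K - 1) is close to K squared for large K:

```latex
\begin{align*}
1 - e^{-K(K-1)/2^{N+1}} &= \tfrac{1}{2} \\
\frac{K(K-1)}{2^{N+1}} &= \ln 2 \\
K &\approx \sqrt{2 \ln 2} \cdot 2^{N/2} \approx 1.17741 \times 2^{N/2}
\end{align*}
```

Strictly speaking this is the number of computations at which a collision becomes more likely than not. The mean number of computations until the first collision is slightly higher, about [itex]\sqrt{\pi/2} \times 2^{N/2} \approx 1.2533 \times 2^{N/2}[/itex].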

To put this in perspective, Virginia Tech's System X (http://www.tcf.vt.edu/), with a peak performance of 12.25 teraflops, would take approximately [itex](2.17 \times 10^{19}) / (12.25 \times 10^{12}) \approx 1{,}770{,}000[/itex] seconds, or about three weeks, under the simplifying assumption of one MD5 computation per floating-point operation. Equivalently, about 6,000 commodity processors running at 2 gigaflops each would need approximately the same amount of time.
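The arithmetic behind this estimate can be reproduced in a few lines. The sketch assumes, as the estimate does, that one MD5 computation costs one floating-point operation, which is a deliberate simplification; the variable names are hypothetical.

```python
# Work needed for a 50% chance of a collision on a 128-bit digest.
computations = 1.17741 * 2 ** 64

# Assumption: one MD5 computation per floating-point operation.
system_x_flops = 12.25e12      # quoted peak performance of System X
commodity_flops = 2e9          # quoted rate of one commodity processor

seconds = computations / system_x_flops
days = seconds / 86_400
machines = system_x_flops / commodity_flops

if __name__ == "__main__":
    print(f"{computations:.3e} computations")
    print(f"{seconds:,.0f} seconds, or {days:.1f} days")
    print(f"{machines:,.0f} commodity machines for the same rate")
```

Matching System X's quoted peak rate takes 6,125 machines at 2 gigaflops, which the text rounds to 6,000.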

## References

• Paul C. van Oorschot, Michael J. Wiener: Parallel Collision Search with Application to Hash Functions and Discrete Logarithms. ACM Conference on Computer and Communications Security 1994, pp. 210–218. Online version (http://www.scs.carleton.ca/~paulv/papers/acmccs94.pdf) (PDF format).
