Instruction-level parallelism
Instruction-level parallelism (ILP) is a measure of how many of the operations in a computer program can be performed simultaneously. Consider the following program:
1. e = a + b
2. f = c + d
3. g = e * f
Operation 3 depends on the results of operations 1 and 2, so it cannot be calculated until both of them are completed. However, operations 1 and 2 do not depend on any other operation, so they can be calculated simultaneously. If we assume that each operation can be completed in one unit of time, then these three operations can be completed in a total of two units of time, giving an ILP of 3/2.
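The calculation above can be sketched in code: if each operation takes one unit of time, the minimum execution time equals the length of the longest dependency chain (the critical path), and ILP is the ratio of total operations to that length. This is a minimal illustration, not part of the original article; the dependency graph below encodes the three-operation example.

```python
# Sketch: ILP = (number of operations) / (critical path length),
# assuming each operation completes in one unit of time.

def critical_path_length(deps):
    """Length of the longest dependency chain in the graph."""
    memo = {}
    def depth(op):
        if op not in memo:
            memo[op] = 1 + max((depth(d) for d in deps[op]), default=0)
        return memo[op]
    return max(depth(op) for op in deps)

# Each operation maps to the operations whose results it needs.
deps = {
    "e = a + b": [],
    "f = c + d": [],
    "g = e * f": ["e = a + b", "f = c + d"],
}

ilp = len(deps) / critical_path_length(deps)
print(ilp)  # 3 operations / 2 time units = 1.5
```

Operations 1 and 2 have no predecessors, so the critical path runs only through operation 3, giving a path length of two time units.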
A goal of compiler and processor designers is to identify and take advantage of as much ILP as possible.
Micro-architectural techniques that are used to exploit ILP include:
- Instruction pipelining, in which the execution of multiple instructions is partially overlapped
- Register renaming, a technique used to avoid unnecessary serialization of program operations imposed by the reuse of registers by those operations
- Speculative execution, which reduces pipeline stalls due to control dependencies
- Branch prediction, which is used to keep the pipeline full
- Superscalar execution, in which multiple execution units are used to execute multiple instructions in parallel
- Out-of-order execution, which reduces pipeline stalls due to operand dependencies
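To make one of these techniques concrete, the following sketch (an illustration written for this article, using simplified three-address instructions, not a real processor's algorithm) shows register renaming: each write is given a fresh physical register, so reuse of an architectural register no longer creates false (write-after-read or write-after-write) dependencies.

```python
# Sketch of register renaming. Instructions are (destination, [sources])
# pairs; true data dependencies are preserved by reading sources through
# the current architectural-to-physical mapping, while each write gets a
# fresh physical register, eliminating false dependencies.

def rename(instructions):
    mapping = {}   # architectural register -> current physical register
    counter = 0
    renamed = []
    for dest, srcs in instructions:
        # Sources read the most recent physical copy (true dependency).
        new_srcs = [mapping.get(s, s) for s in srcs]
        # Every write allocates a fresh physical register.
        phys = f"p{counter}"
        counter += 1
        mapping[dest] = phys
        renamed.append((phys, new_srcs))
    return renamed

# r1 is reused for an independent computation: a false dependency.
prog = [
    ("r1", ["a", "b"]),   # r1 = a + b
    ("r2", ["r1", "c"]),  # r2 = r1 + c
    ("r1", ["d", "e"]),   # r1 = d + e  (reuses r1)
]
print(rename(prog))
# The third instruction now writes a different physical register, so it
# can execute in parallel with the first two.
```

After renaming, only the genuine data dependency (the second instruction reading the first's result) remains, which is what allows out-of-order hardware to issue the third instruction early.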
Due to the complexity of scaling the last two techniques, the industry has re-examined instruction sets that explicitly encode multiple operations per instruction. These instruction set types include:
- VLIW and the closely related Explicitly Parallel Instruction Computing concepts
As of 2004, the computer industry has hit a roadblock in getting further performance gains from ILP. Instead, the industry is heading towards exploiting higher levels of parallelism available through techniques such as multiprocessing and multithreading.