NetBurst
Simply put, NetBurst is the name Intel gave to the microarchitecture introduced with the Willamette core, which is used in the Pentium 4 (P4) line of CPUs.
The NetBurst architecture includes features such as "Hyper Pipelined Technology" and "Rapid Execution Engine", both of which made their first appearance in this microarchitecture.
Hyper Pipelined Technology
This is the name that Intel chose for the 20-stage pipeline of the Willamette architecture. That is a significant increase over the PIII, which had only 10 stages in its pipeline. Although this is an impressive number, a longer pipeline has inherent disadvantages, mainly a reduced IPC (instructions per cycle). The loss is meant to be offset by the fact that a higher number of pipeline stages allows the CPU to run at higher clock speeds, which in principle makes up for the performance lost to the reduced IPC. Another drawback of having so many stages is the cost of a wrong branch prediction. When the branch predictor makes a mistake, the work already in the pipeline has to be discarded and the pipeline refilled from the correct path, and the longer the pipeline, the more stages have to be thrown away, which is obviously a huge penalty to pay. Keeping this in mind, Intel came up with the second feature in the NetBurst architecture, which is known as the "Rapid Execution Engine".
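The trade-off described above can be put into rough numbers. The following is only a back-of-the-envelope sketch in Python, not a model of any real CPU: it assumes that a misprediction costs one cycle per pipeline stage, and the branch fraction, misprediction rate and base IPC are made-up illustrative values. The function names are hypothetical.

```python
def effective_ipc(base_ipc, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """IPC once branch-misprediction stalls are added to the ideal cycle count."""
    base_cpi = 1.0 / base_ipc
    # Average stall cycles per instruction caused by mispredicted branches.
    stall_cpi = branch_fraction * mispredict_rate * flush_penalty_cycles
    return 1.0 / (base_cpi + stall_cpi)


def instructions_per_second(ipc, clock_hz):
    """Throughput is simply IPC multiplied by clock frequency."""
    return ipc * clock_hz


def break_even_clock_ratio(short_pipeline_ipc, long_pipeline_ipc):
    """How much faster the deeper pipeline must be clocked to match the shorter one."""
    return short_pipeline_ipc / long_pipeline_ipc


# Illustrative assumptions: 20% of instructions are branches, 5% of those are
# mispredicted, and the flush penalty equals the pipeline depth (10 vs 20 stages).
short_ipc = effective_ipc(1.0, 0.2, 0.05, 10)
long_ipc = effective_ipc(1.0, 0.2, 0.05, 20)
ratio = break_even_clock_ratio(short_ipc, long_ipc)

print(f"10-stage IPC: {short_ipc:.3f}")
print(f"20-stage IPC: {long_ipc:.3f}")
print(f"Clock ratio needed to break even: {ratio:.3f}")
```

Under these assumptions the deeper pipeline only wins if its clock speed advantage is larger than the break-even ratio, which is exactly the bet that a longer pipeline makes.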
Rapid Execution Engine
With this technology, the ALUs in the core of the CPU operate at twice the core clock frequency. This means that in a 1.5GHz CPU, the ALUs effectively operate at an impressive 3GHz. The aim is to make up for the low IPC; it also considerably enhances the integer performance of the CPU.
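The arithmetic behind the 1.5GHz example is simple enough to write down. This is a minimal sketch in Python; the constant and function names are hypothetical, and the number of double-speed ALUs is left as a parameter rather than stated as fact.

```python
# From the text: the ALUs run at twice the core clock frequency.
ALU_CLOCK_MULTIPLIER = 2


def alu_clock_hz(core_clock_hz):
    """Effective ALU frequency for a given core clock."""
    return core_clock_hz * ALU_CLOCK_MULTIPLIER


def peak_simple_ops_per_second(core_clock_hz, num_fast_alus):
    """Upper bound on simple integer operations per second, assuming every
    double-speed ALU completes one operation per ALU cycle (an idealisation)."""
    return alu_clock_hz(core_clock_hz) * num_fast_alus


core_clock = 1_500_000_000  # the 1.5GHz CPU from the example above
print(alu_clock_hz(core_clock))
```

The peak figure is an upper bound only; how much of it is reached depends on how well the rest of the pipeline keeps the ALUs fed.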
Execution Trace Cache
In place of a conventional L1 instruction cache, Intel has incorporated what it calls an Execution Trace Cache. This cache stores decoded micro-operations, so that when an instruction is executed again, the CPU does not have to fetch and decode it a second time; it can read the decoded micro-ops directly from the trace cache, thereby saving a considerable amount of time. Moreover, the micro-ops are cached along their predicted path of execution, which means that when the CPU fetches them from the cache, they are already present in the correct order of execution.
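The idea of caching decoded micro-ops in predicted execution order can be illustrated with a toy model. The Python sketch below is purely illustrative and does not reflect how the hardware is actually organised: the `ToyTraceCache` class, the `toy_decode` function, the instruction strings and the choice to key traces by their start address alone are all assumptions made for the example.

```python
def toy_decode(instruction):
    """Hypothetical decoder: an instruction written as 'a;b' yields micro-ops ['a', 'b']."""
    return instruction.split(";")


class ToyTraceCache:
    def __init__(self, program):
        self.program = program   # address -> instruction text
        self.traces = {}         # trace start address -> list of micro-ops
        self.decode_count = 0    # number of instructions sent through the decoder

    def fetch(self, predicted_path):
        """Return the micro-ops for a predicted path, given as a list of addresses."""
        start = predicted_path[0]
        if start in self.traces:
            # Hit: the micro-ops are already decoded and stored in execution order.
            return self.traces[start]
        # Miss: decode each instruction along the predicted path and store the result.
        micro_ops = []
        for address in predicted_path:
            micro_ops.extend(toy_decode(self.program[address]))
            self.decode_count += 1
        self.traces[start] = micro_ops
        return micro_ops


# The branch at address 4 is predicted to jump to address 32, so the trace
# continues with the instructions at the branch target, not at address 8.
program = {0: "load", 4: "cmp;jmp", 8: "sub", 32: "add", 36: "store"}
cache = ToyTraceCache(program)
path = [0, 4, 32, 36]

first = cache.fetch(path)    # miss: four instructions are decoded
second = cache.fetch(path)   # hit: nothing is decoded again
print(first)
print(cache.decode_count)
```

Note how the second fetch returns the micro-ops for the whole predicted path, including those after the taken branch, without touching the decoder at all.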
Despite all these enhancements, however, the NetBurst architecture has not proved to be very successful in terms of performance. With this architecture, Intel was looking to reach clock speeds of 10GHz, but as clock speeds rose, Intel faced increasing problems keeping power dissipation within acceptable limits. As a result, newer Intel roadmaps clearly indicate that NetBurst will be abandoned in favor of a newer microarchitecture (based on the Pentium M) to help the company achieve its goals.