CDC 6600

From Academic Kids

The CDC 6600 was a mainframe computer from Control Data Corporation, first manufactured in 1965. It is generally considered to be the first successful supercomputer, outperforming the fastest machines of the era by about three times. It remained the world's fastest computer from 1964 to 1969, when it relinquished that status to its successor, the CDC 7600.


History and impact

CDC's first products were based on the machines designed at ERA, which Seymour Cray had been asked to update after moving to CDC. After an experimental machine known as the Little Character, the company delivered the CDC 1604, one of the first commercial transistor-based computers and one of the fastest machines on the market. Management was delighted, and made plans for a new series of machines that were more tailored to business use; they would include instructions for character handling and record keeping, for instance. Cray was not interested in such a project, and set himself the goal of producing a new machine that would be 50 times faster than the 1604. When asked to complete a detailed report on his plans at one and five years into the future, he wrote back that his five-year goal was "to produce the largest computer in the world", and his one-year plan "to be one-fifth of the way".

Cray took his core team to new offices near the original CDC headquarters, where they started to experiment with higher quality versions of the "cheap" transistors he had used in the 1604. After much experimentation they found that there was simply no way the germanium-based transistors could be run much faster than they were in the 1604. In fact the "business machine" that management had originally wanted, now taking shape as the CDC 3600, pushed them about as far as they could go. Cray then decided the solution was to work with the then-new silicon-based transistors from Fairchild Semiconductor, which were just coming onto the market and offered dramatically improved switching performance.

During this period CDC grew from a startup to a large company. Cray became increasingly frustrated with what he saw as ridiculous management requirements. Things became considerably more tense in 1962 when the new 3600 started to near production quality, and appeared to be exactly what management wanted, when they wanted it. Cray eventually told CDC's CEO, William Norris, that something had to change, or he would leave the company. Norris felt he was too important to lose, and gave Cray the green light to set up a new lab wherever he wanted. After a short search, Cray decided to return to his home town of Chippewa Falls, WI, where he purchased a block of land and started up a new lab. Although this process introduced a fairly lengthy delay in the design of his new machine, once the team was in the new lab things started to progress quickly. By this time the new transistors were becoming quite reliable, and modules built with them tended to work properly on the first try. Working with Jim Thornton, who was the system architect and the 'hidden genius' behind the 6600, Cray soon saw the machine take form.

About 50 CDC 6600s were sold over the machine's lifetime. Most of these went to various nuclear bomb-related labs, although some found their way into university computing labs as well. Cray immediately turned his attention to its replacement, this time setting a goal of 10 times the performance of the 6600, delivered as the CDC 7600. The later CDC Cyber 70 and 170 computers were much like the CDC 6600.


Typical machines of the era used a single complex CPU to drive the entire system. A typical program would first load data into memory (often using prewritten library code), process it, and then write it back out. This required the CPU to be fairly complex in order to handle the complete set of instructions it needed to run, and that complexity made it much more difficult to make it run fast.

Cray took another approach. At the time CPUs generally ran slower than the main memory they were attached to; for instance, a processor might take 15 cycles to multiply two numbers, while each memory access took only one. This meant there was a significant time during which the main memory was idle. It was this idle time that the 6600 exploited. Instead of trying to make the CPU handle all the tasks, the 6600's CPU handled arithmetic and logic only, allowing it to be tuned far beyond what a more complex machine could achieve. All input/output tasks were handed off to another processor to complete, and the two could work in parallel, each using the memory while the other was busy. In the best case, this effectively doubled the speed of the system overall.
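The "doubling" can be illustrated with a very rough timing model. The sketch below is only an illustration, not a description of the real machine: it assumes the computation and I/O phases of a job can be overlapped perfectly, and the function names and time units are hypothetical.

```python
def serial_time(compute, io):
    """One processor does everything: compute and I/O run back to back."""
    return compute + io


def overlapped_time(compute, io):
    """Separate processors overlap the two phases; the longer one dominates."""
    return max(compute, io)


def speedup(compute, io):
    """Ratio of serial to overlapped time under the perfect-overlap assumption."""
    return serial_time(compute, io) / overlapped_time(compute, io)


print(speedup(10, 10))  # balanced workload: 2.0
print(speedup(10, 2))   # compute-bound workload: 1.2
```

Under these assumptions the factor of two is an upper bound, reached only when computation and I/O take the same amount of time.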

Of course this would also make the machine dramatically more expensive if it contained two CPUs, each dedicated to a specific purpose. Key to the 6600's design was to make these I/O processors, known as Peripheral Processors or PPs, much simpler. This also implied they would be much slower, so to make sure they could keep feeding data into the main processor fast enough, the 6600 included ten of them. The PPs were based on the simple 12-bit CDC 160A, which ran much slower than the CPU, gathering up data and "squirting" it into main memory at high speed via dedicated hardware. Even though they were slow, the inclusion of ten machines meant that they could keep the memory filled with new data by operating in parallel. For any given cycle one of the PPs would be in control, telling the main CPU which instructions to run, processing some data, and then handing off control to the next PP in round-robin fashion.

The basis for the 6600 is what we would today refer to as a RISC system, one in which the processor is tuned to execute comparatively simple instructions. The philosophy of many other machines was to use complicated instructions: for example, a single instruction that would fetch an operand from memory and add it to a value in a register. In the 6600, loading the value from memory would require one instruction, and adding it would require a second. While this is slower in theory because of the additional memory accesses, memory at the time was fast relative to most processors, so this was not a concern.

The much simpler CPU also simplified timing, leading to higher throughput and a faster clock with a cycle time of 100 ns (10 MHz). This simplification also forced programmers to be very aware of their memory accesses, and therefore to code deliberately to reduce them as much as possible. Since memory was much slower than the 6600's CPU, forcing programmers to code defensively generally resulted in faster programs.

The Central Processor (CP)

The Central Processor, or CP, has eight general purpose 60-bit registers X0 through X7, eight 18-bit address registers A0 through A7, and eight 18-bit scratchpad registers B0 through B7 (typically used for array indexing). Additional registers used for bookkeeping (such as the scoreboard register) are not accessible to the programmer. Other registers (such as RA and FL) can be loaded only by the operating system. In keeping with the RISC "load/store" philosophy, arithmetic instructions never operate on core memory directly, and there are no dedicated instructions for reading or writing it either. All memory accesses are performed by loading an address into the A registers: loading A1 through A5 with an address causes the data word at that location to be read into the corresponding X register (X1 through X5), while loading an address into A6 or A7 causes register X6 or X7 to be written out to memory at that address. A separate hardware load/store unit handles the actual data movement independently of the instruction stream, allowing other operations to complete while memory is being accessed. In modern designs this sort of operation is normally supported directly by load/store instructions, which are given an explicit memory location to read or write, instead of the address registers used in the 6600.
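The coupling between the A and X registers can be sketched as a toy model. This is only an illustration of the rule described above, not a cycle-accurate or complete model; the class and method names are hypothetical, and treating A0 as having no memory side effect is an assumption of the sketch, since the text above only covers A1 through A7.

```python
class CentralProcessor:
    """Toy model of the A/X register coupling (not cycle-accurate)."""

    MASK60 = (1 << 60) - 1   # X registers hold 60-bit words
    MASK18 = (1 << 18) - 1   # A registers hold 18-bit addresses

    def __init__(self, memory_words=1024):
        self.X = [0] * 8
        self.A = [0] * 8
        self.memory = [0] * memory_words

    def set_a(self, i, address):
        """Load address register Ai and apply the memory side effect."""
        address &= self.MASK18
        self.A[i] = address
        if 1 <= i <= 5:
            # A1..A5: read memory into the matching X register
            self.X[i] = self.memory[address]
        elif i in (6, 7):
            # A6, A7: write the matching X register out to memory
            self.memory[address] = self.X[i] & self.MASK60
        # Assumption: A0 has no memory side effect in this model


cp = CentralProcessor()
cp.memory[100] = 42
cp.set_a(1, 100)        # acts as a load: X1 now holds memory[100]
cp.X[6] = cp.X[1] + 1   # ordinary register arithmetic
cp.set_a(6, 200)        # acts as a store: memory[200] now holds X6
print(cp.X[1], cp.memory[200])  # 42 43
```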

The CP included several parallel functional units, allowing multiple instructions to be worked on at the same time. Today this is known as a superscalar design, while at the time it was simply "unique". The system read and decoded instructions from memory as fast as possible, generally faster than they could be completed, and fed them off to the units for processing. The units included two floating point multipliers, a divider, an adder and a "long" adder, two incrementors, a shifter, a boolean logic unit and a branch unit. A stack of eight instruction words was kept in memory at all times, and since the 15-bit instructions were packed four to a word, this allowed the system to pick any one of the 32 instructions inside to run depending on which units were free. The system used a 10 megahertz clock, but used a four-phase signal to match the four-wide instructions, so the system effectively operated at 40 MHz. A floating point multiply took about three cycles, while a divide took about ten, and the overall performance considering memory delays and other issues was about 1 MFLOPS. Using the best available compilers, late in the machine's history, FORTRAN programs could expect to maintain about 0.5 MFLOPS.

Memory organization

User programs are restricted to a contiguous portion of core memory. The portion of memory the program has access to is controlled by the RA (Relative Address) and FL (Field Length) registers. When a user program tries to read or write a word in central memory at address a, the processor first checks that a is between 0 and FL-1. If this check passes, the processor accesses the word in central memory at address RA+a. This process is known as logical address translation; each user program sees core memory as a contiguous block of FL words starting at address 0, while in fact the program may be anywhere in physical memory. Using this technique, each user program can be moved around in core memory by the operating system, as long as the RA register reflects its position in memory. A user program trying to access memory outside the allowed range triggers an error and is terminated by the operating system. When this happens, a core dump is written to a file, giving the developer a way to find out what happened. However, in contrast to virtual memory systems, the entirety of a process's addressable space must be in core memory. Support for virtual memory came much later, with the CDC Cyber 180 models.
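The bounds check and relocation described above amount to only a few lines. The following is a minimal sketch of that rule, with a Python list standing in for central memory; the function and exception names are hypothetical, and the real hardware performs the check on every reference rather than through a function call.

```python
class AddressRangeError(Exception):
    """Raised when a program references an address outside 0..FL-1."""


def translate(a, ra, fl):
    """Map program-relative address a to a physical central-memory address."""
    if not 0 <= a < fl:
        raise AddressRangeError(f"address {a} outside field length {fl}")
    return ra + a


memory = list(range(1000, 1064))      # 64 words of pretend central memory
ra, fl = 16, 8                        # program occupies physical words 16..23

print(memory[translate(0, ra, fl)])   # first word of the program: 1016
print(memory[translate(7, ra, fl)])   # last word of the program: 1023

ra = 40                               # the operating system relocates the program
print(translate(0, ra, fl))           # same relative address, new location: 40

try:
    translate(8, ra, fl)              # one word past the end of the field
except AddressRangeError as err:
    print("terminated:", err)
```

Because the program only ever uses relative addresses, relocating it requires nothing more than copying its words and changing RA.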

Peripheral Processors (PPs)

To handle the 'housekeeping' tasks which other designs put in the CPU, Cray included ten other processors, based partly on his earlier computer, the CDC 160A. These machines, called Peripheral Processors, or PPs, were full computers in their own right, but were tuned to performing I/O tasks and running the operating system. One of the PPs was in overall control of the machine, including control of the program running on the main CPU, while the others would be dedicated to various I/O tasks. When the program needed to perform some sort of I/O, it instead loaded a small program into one of these other machines and let it do the work. The PP would then inform the CPU with an interrupt when the task was complete.

Each PP included its own memory (up to 4096 12-bit words), used both for I/O buffering and for program storage, but the execution units were shared by the ten PPs in a configuration called the barrel and slot. This meant that the execution units (the "slot") would execute one instruction cycle from the first PP, then one instruction cycle from the second PP, and so on in round-robin fashion. This was done both to reduce costs and because access to CP memory required 10 PP clock cycles: when a PP accesses CP memory, the data is available the next time the PP receives its slot time.
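The round-robin sharing of the slot can be sketched as follows. This is a simplified illustration only: each PP's private state is reduced to a cycle counter, and the class and function names are hypothetical.

```python
class PeripheralProcessor:
    """One PP's private state, reduced here to an instruction-cycle counter."""

    def __init__(self, number):
        self.number = number
        self.cycles_executed = 0


def run_barrel(pps, total_slots):
    """Give the shared execution unit to each PP in turn, one cycle at a time."""
    order = []
    for slot in range(total_slots):
        pp = pps[slot % len(pps)]     # round-robin selection
        pp.cycles_executed += 1       # the "slot" executes one cycle for this PP
        order.append(pp.number)
    return order


pps = [PeripheralProcessor(n) for n in range(10)]
order = run_barrel(pps, 25)
print(order[:12])                           # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
print([pp.cycles_executed for pp in pps])   # [3, 3, 3, 3, 3, 2, 2, 2, 2, 2]
```

Each PP gets the slot again exactly ten cycles after its previous turn, which is why a CP memory access started in one turn has completed by the next.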

Wordlengths, characters

The central processor had 60-bit words, whilst the peripheral processors had 12-bit words. CDC used the term "byte" to refer to the 12-bit entities used by the peripheral processors; characters were 6-bit, and central processor instructions were either 15 bits, or 30 bits with an 18-bit address field, the latter allowing for a directly addressable memory space of 256K words (converted to modern terms, with 8-bit bytes, this is about 1.88 megabytes). Central processor instructions had to start on a word boundary when they were the target of a jump or subroutine return jump instruction, so no-operations were sometimes required to fill out the last 15, 30 or 45 bits of a word.
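The figures in this paragraph can be checked with a little arithmetic. The sketch below is illustrative only; it counts a megabyte as 2^20 bytes and assumes each no-operation occupies a single 15-bit slot, which is not stated above. The function name is hypothetical.

```python
WORD_BITS = 60       # central processor word
PARCEL_BITS = 15     # shortest instruction
ADDRESS_BITS = 18    # address field of a 30-bit instruction

words = 2 ** ADDRESS_BITS
bytes_8bit = words * WORD_BITS // 8
print(words)                    # 262144 words, i.e. 256K
print(bytes_8bit / 2 ** 20)     # 1.875, the "1.88 megabytes" quoted above


def noops_before_target(bits_used):
    """Number of 15-bit no-ops needed to reach the next word boundary.

    bits_used is how many bits of the current word are already filled.
    Assumes every no-op is exactly one 15-bit parcel.
    """
    remainder = bits_used % WORD_BITS
    if remainder == 0:
        return 0
    return (WORD_BITS - remainder) // PARCEL_BITS


for used in (15, 30, 45, 60):
    print(used, noops_before_target(used))   # 3, 2, 1 and 0 no-ops respectively
```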

Up to 10 of the 6-bit characters could be stored in a word. Six bits permitted a character set of 64 characters, which is enough for all upper case letters, digits, and some punctuation, and certainly enough to write FORTRAN or print financial or scientific reports. There were actually two character sets in use, a 64-character set and a 63-character set. The 64-character set had the disadvantage that two consecutive ':' (colon) characters might be interpreted as the end of a line if they fell at the end of a 10-character word.

With no byte addressing instructions at all, code had to be written to pack and shift characters into words. The very large words, and comparatively small amount of memory, meant that programmers would frequently economise on memory by packing data into words at the bit level.
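The kind of shift-and-mask code this required can be sketched as follows. It is a minimal illustration, not a reproduction of any CDC routine: the 60-bit word is represented as a Python integer, the character codes are arbitrary 6-bit values rather than an actual CDC character set, and the layout (first character in the most significant bits, zero fill on the right) is an assumption of the sketch.

```python
CHAR_BITS = 6
CHARS_PER_WORD = 10
CHAR_MASK = (1 << CHAR_BITS) - 1


def pack(codes):
    """Pack up to ten 6-bit codes into one 60-bit word, first code leftmost."""
    if len(codes) > CHARS_PER_WORD:
        raise ValueError("at most 10 characters fit in a 60-bit word")
    word = 0
    for code in codes:
        if not 0 <= code <= CHAR_MASK:
            raise ValueError(f"{code} is not a 6-bit code")
        word = (word << CHAR_BITS) | code
    # Left-justify short strings, filling the rest of the word with zeros
    return word << (CHAR_BITS * (CHARS_PER_WORD - len(codes)))


def unpack(word):
    """Split a 60-bit word back into its ten 6-bit codes."""
    return [
        (word >> (CHAR_BITS * (CHARS_PER_WORD - 1 - i))) & CHAR_MASK
        for i in range(CHARS_PER_WORD)
    ]


word = pack([1, 2, 3])
print(format(word, "020o"))   # 01020300000000000000 (two octal digits per character)
print(unpack(word))           # [1, 2, 3, 0, 0, 0, 0, 0, 0, 0]
```

Octal is convenient here because each 6-bit character corresponds to exactly two octal digits.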

Physical design

The machine was built in an X-shaped cabinet with the most important circuitry in the center in order to shorten cabling lengths. Less important circuits, like the PPs, were placed at the ends of the arms. Each arm consisted of several "stacked" chassis, each carrying parts of a functional unit or memory. They could be opened on hinges, like opening a book, in order to service the modules inside. The logic modules of the machine were built from 2.5x2.5 inch cards known as "cordwood", a reference to their tight packing within the machine. Modules within any one functional unit were connected together using twisted-pair wiring, while the units were connected to each other using coaxial cable. Memory was mounted on larger 2.5x6 inch modules. Heat was carried away by a built-in air conditioning unit.

