Intel Microprocessor History

Intel 8008 microprocessor

http://en.wikipedia.org/wiki/Intel_8008

(1972, 10 microns, 500 kHz, 3,500 transistors, 18-pin DIP)

Background

The first microprocessor, the Intel 4004, was completed in 1971. It was a 4-bit microprocessor designed to implement calculators. Computer Terminal Corporation (CTC) asked Intel to use the same technology to implement an 8-bit microprocessor that CTC had designed for a programmable terminal. CTC never used the resulting 8008 microprocessor, but Intel secured the right to market the chip to other customers. It is ironic that the Intel franchise, its microprocessor family, is based on a design by CTC that was never intended to be used as a general purpose computer.

The 8008 Registers

   code   name       contents        Special function   
                +---------------+        
    000    A    |b b b b b b b b|    the accumulator       
                +---------------+        
    001    B    |b b b b b b b b|       
                +---------------+       
    010    C    |b b b b b b b b|     
                +---------------+       
    011    D    |b b b b b b b b|      
                +---------------+       
    100    E    |b b b b b b b b|    
                +---------------+        
    101    H    |b b b b b b b b|    contains 6 high order bits of a memory address
                +---------------+       
    110    L    |b b b b b b b b|    contains 8 low order bits of a memory address
                +---------------+

Eight additional 14-bit processor registers were used to implement the instruction pointer and a seven-level stack for calling functions. Although the 8008 architecture used 16-bit addresses (so that 64K bytes could be addressed), the 8008 only implemented 14 of the 16 bits, limiting memory to 16K bytes.

Machine language instructions

The Intel 8008 implemented 48 instructions using 1, 2, and 3 byte instructions with the following instruction formats:

   length      first byte         second byte        third byte 
            +---------------+  
   1 byte   |    op code    |  
            +---------------+  
            +---------------+ +---------------+ 
   2 bytes  |    op code    | |     data      |                      Immediate  
            +---------------+ +---------------+   
            +---------------+ +---------------+ +---------------+ 
   3 bytes  |    op code    | |low 8 addr bits| |x|x|high 6 bits|    Jump or Call
            +---------------+ +---------------+ +---------------+
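
The 3-byte jump or call format above can be sketched in a few lines of Python. This is only an illustration of how the two operand bytes combine into a 14-bit address; the function names are hypothetical and are not taken from any Intel tool or manual.

```python
# Sketch of the 3-byte jump/call format: op code, low 8 address bits,
# then a byte whose low 6 bits are the high address bits.
# The function names are illustrative assumptions.

ADDRESS_BITS = 14


def decode_address(low_byte, high_byte):
    """Combine the second and third instruction bytes into a 14-bit address."""
    for byte in (low_byte, high_byte):
        if not 0 <= byte <= 0xFF:
            raise ValueError("each operand must fit in one byte")
    # The top two bits of the third byte (the x|x field) are ignored.
    return ((high_byte & 0x3F) << 8) | low_byte


def encode_address(address):
    """Split a 14-bit address into (low byte, high byte) operand bytes."""
    if not 0 <= address < 2 ** ADDRESS_BITS:
        raise ValueError("address does not fit in 14 bits")
    return address & 0xFF, (address >> 8) & 0x3F


if __name__ == "__main__":
    low, high = encode_address(0x1234)
    print(hex(low), hex(high))             # the two operand bytes
    print(hex(decode_address(low, high)))  # back to the original address
```

Because the two x bits are ignored, any value in those positions decodes to the same address.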

Programming

Descriptions of the 48 instructions can be found at http://www.classiccmp.org/8008/8008UM.pdf. To access memory, the 8-bit H (High) and L (Low) registers are combined to form a 16-bit address. Three instructions are required to transfer a byte between a processor register and memory. For example, the following three instructions (occupying 5 bytes in memory) will load the byte at address 1234 (hex) into the A (for Accumulator) register.

      LLI    34       Load Register L immediate
      LHI    12       Load Register H immediate
      LAM             Load Accumulator (A) from memory (using registers L and H)
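
As a rough model of what those three instructions do, the Python sketch below keeps the registers in a dictionary and memory in a bytearray. The helper names and the sample value stored at address 1234 (hex) are assumptions made for the example; the 14-bit mask reflects the address limit described earlier.

```python
MEMORY_SIZE = 2 ** 14  # the 8008 implemented 14 address bits (16K bytes)

# Instruction lengths in bytes, following the instruction formats above.
INSTRUCTION_BYTES = {"LLI": 2, "LHI": 2, "LAM": 1}


def hl_address(h, l):
    """Form the memory address from H (6 high bits used) and L (8 low bits)."""
    return ((h & 0x3F) << 8) | (l & 0xFF)


def load_accumulator(memory, address):
    """Model the three-instruction sequence that loads memory[address] into A."""
    regs = {"A": 0, "H": 0, "L": 0}
    regs["L"] = address & 0xFF          # LLI: load L with the low byte
    regs["H"] = (address >> 8) & 0xFF   # LHI: load H with the high byte
    regs["A"] = memory[hl_address(regs["H"], regs["L"])]  # LAM
    return regs


if __name__ == "__main__":
    memory = bytearray(MEMORY_SIZE)
    memory[0x1234] = 0x5A               # sample data, chosen arbitrarily
    print(load_accumulator(memory, 0x1234))
    print(sum(INSTRUCTION_BYTES.values()), "bytes of code")
```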

 

Intel 8080

http://en.wikipedia.org/wiki/Intel_8080

(1974, 6 microns, 2MHz, 6,000 transistors, 40 pin DIP)

The Intel 8080 was an enhanced version of the 8008 with added instructions for manipulating 16-bit quantities. The seven 8-bit registers (A, B, C, D, E, H, and L) were renumbered as shown below, which made it easier to include instructions that treat certain pairs of 8-bit registers as 16-bit registers (BC, DE, HL).

               +---------------+      
     000    B  |b b b b b b b b|
               +---------------+      
     001    C  |b b b b b b b b|
               +---------------+      
     010    D  |b b b b b b b b|
               +---------------+      
     011    E  |b b b b b b b b|
               +---------------+      
     100    H  |b b b b b b b b|    contains 8 high order bits of a memory address
               +---------------+      
     101    L  |b b b b b b b b|    contains 8 low order bits of a memory address
               +---------------+      
     111    A  |b b b b b b b b|    the accumulator    
               +---------------+      

The processor also contained a 16-bit instruction pointer (IP) and stack pointer (SP). Instructions were added to load, increment, and decrement the BC, DE, HL, and SP registers. In total, the 8080 added more than 40 new machine language instructions (111 in the 8080 versus 67 in the 8008, although the results can vary depending on how you define an “instruction”).

                 +---------------+---------------+
      00    BC   |b b b b b b b b|b b b b b b b b|
                 +---------------+---------------+
      01    DE   |b b b b b b b b|b b b b b b b b|
                 +---------------+---------------+
      10    HL   |b b b b b b b b|b b b b b b b b|
                 +---------------+---------------+
      11    SP   |b b b b b b b b b b b b b b b b|
                 +---------------+---------------+
   

                 +-------------------------------+
            IP   |b b b b b b b b b b b b b b b b|
                 +-------------------------------+
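
Treating two 8-bit registers as one 16-bit register pair can be sketched as follows. This is a minimal Python model of the increment and decrement operations mentioned above, not a description of the 8080's internal logic; the function names are hypothetical.

```python
def pair_value(high, low):
    """Combine two 8-bit registers (e.g. B and C) into one 16-bit value."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def split_pair(value):
    """Split a 16-bit value back into (high, low) 8-bit registers."""
    return (value >> 8) & 0xFF, value & 0xFF


def increment_pair(high, low):
    """Add 1 to the pair, wrapping around at 16 bits."""
    return split_pair((pair_value(high, low) + 1) & 0xFFFF)


def decrement_pair(high, low):
    """Subtract 1 from the pair, wrapping around at 16 bits."""
    return split_pair((pair_value(high, low) - 1) & 0xFFFF)


if __name__ == "__main__":
    b, c = 0x12, 0xFF
    b, c = increment_pair(b, c)   # the carry out of C propagates into B
    print(hex(b), hex(c))
```

The point of the pair instructions is that the carry from the low register into the high register happens in a single instruction rather than in a sequence of 8-bit operations.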

The January 1975 issue of Popular Electronics contained an ad by MITS corporation for a microcomputer kit called the Altair 8800, based on the Intel 8080. This can be considered the beginning of the microcomputer revolution. When Bill Gates and his high school friend, Paul Allen, saw the advertisement, Bill dropped out of Harvard and moved to Albuquerque, New Mexico, where MITS was located, and founded Microsoft to develop software for the Altair 8800 and the Intel 8080. Their first product was a Basic interpreter for the 8080.

 

Intel 8086 and 8088 Processors

http://en.wikipedia.org/wiki/Intel_8088

(P1, 1978, 3 microns, 5 MHz, 29,000 transistors, 40-pin DIP)

The P1 (Processor design 1) is the first member of Intel's x86 family of processors and Intel’s first 16-bit processor. The 8088 was a less expensive version of the 8086 that used an 8-bit (rather than a 16-bit) bus to access memory. As the successors to the 8080 8-bit processor, the 8086 and 8088 were designed to be compatible with the 8080 at the assembly language level. This was accomplished through the use of eight 16-bit registers, four of which could be treated as a pair of 8-bit registers. In the diagram below, the 8086 and 8088 names are on the right and the analogous 8080 names are on the left. In referencing the accumulator on the 8086 or 8088, the programmer could refer to AH (A register, High 8 bits), AL (A register, Low 8 bits) or AX (A register, all 16 bits).

              +---------------+---------------+     
            A |       AH      |       AL      | %ax (Accumulator)
              +---------------+---------------+
           HL |       BH      |       BL      | %bx (Base)
              +---------------+---------------+
           BC |       CH      |       CL      | %cx (Count)
              +---------------+---------------+
           DE |       DH      |       DL      | %dx (Data)
              +---------------+---------------+
              +-------------------------------+      
           SP |b b b b b b b b b b b b b b b b| %sp (stack pointer)
              +-------------------------------+
              |b b b b b b b b b b b b b b b b| %bp (base pointer)
              +-------------------------------+
              |b b b b b b b b b b b b b b b b| %si (source index)
              +-------------------------------+
              |b b b b b b b b b b b b b b b b| %di (destination index)
              +-------------------------------+
              +-------------------------------+
           IP |b b b b b b b b b b b b b b b b| ip (instruction pointer)
              +-------------------------------+

With a 16-bit address, only 2**16 bytes (64K bytes) of memory can be directly addressed. The 8086 and 8088 used four segment registers to access four 64K blocks of memory that could be located anywhere in a 2**20 byte (one megabyte) memory. The four segment registers are CS (code segment), DS (data segment), SS (stack segment), and ES (extra segment).
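
The notes above do not spell out how a segment register and a 16-bit offset combine. The sketch below assumes the standard 8086 rule, in which the segment value is shifted left by 4 bits and added to the offset to give a 20-bit address; the function name is hypothetical.

```python
ADDRESS_MASK = 2 ** 20 - 1   # the 8086 has 20 address lines (one megabyte)


def physical_address(segment, offset):
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address.

    Assumes the usual 8086 rule: address = segment * 16 + offset,
    wrapping around at one megabyte.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    return ((segment << 4) + offset) & ADDRESS_MASK


if __name__ == "__main__":
    # A segment selects where a 64K block starts; the offset picks a byte in it.
    print(hex(physical_address(0x1000, 0x0000)))   # first byte of the block
    print(hex(physical_address(0x1000, 0xFFFF)))   # last byte of the block
```

Since the offset is 16 bits wide, each segment register gives access to exactly one 64K block, and changing the segment value moves that block in 16-byte steps.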

Intel 286 Processor

http://en.wikipedia.org/wiki/Intel_80286

(P2, 1982, 1.5 microns, 6 MHz, 134,000 transistors, 68-pin PLCC)

The P2 (Processor 2) is the second member of Intel's x86 family of processors. It used a Memory Management Unit (MMU) that allowed the operating system to provide programs with a larger 2**24 byte (16 megabyte) address space. The use of four privilege levels allowed an operating system to protect itself (as well as application programs) from misbehaving application software. The 286 was the first processor in the x86 family to provide the kind of features that are needed by any general purpose multitasking system.

 

Intel 386 Processor

http://en.wikipedia.org/wiki/Intel_80386

(P3, 1985, 1.5 micron, 16 MHz, 275,000 transistors, 132-pin PGA)

The 386 is the third member (P3) of Intel's x86 family of processors and the first processor to implement what is now called IA-32 (Intel Architecture 32-bits). Recall that Intel created the 16-bit 8086 processor by “stretching” the registers of the 8080 from 8 to 16 bits. In a similar manner, Intel created the 32-bit 386 processor by stretching the 8086 registers from 16 to 32 bits. The major processor registers included:

         <-------------------- 32-bit %e?x registers ------------------->
                                         <---- 16-bit %?x registers --->
                                         <- 8-bit reg -> <- 8-bit reg ->

  32-bit                                                                  16-bit
   Regs                                                                    regs
         +-------------------------------+---------------+---------------+
 %eax    |                               |      %ah      |      %al      |   %ax
         +-------------------------------+---------------+---------------+
 %ebx    |                               |      %bh      |      %bl      |   %bx
         +-------------------------------+---------------+---------------+
 %ecx    |                               |      %ch      |      %cl      |   %cx
         +-------------------------------+---------------+---------------+
 %edx    |                               |      %dh      |      %dl      |   %dx
         +-------------------------------+---------------+---------------+

         +---------------------------------------------------------------+
 %esp    |                               |                               |   %sp
         +---------------------------------------------------------------+
 %ebp    |                               |                               |   %bp
         +---------------------------------------------------------------+
 %esi    |                               |                               |   %si
         +---------------------------------------------------------------+
 %edi    |                               |                               |   %di
         +---------------------------------------------------------------+      
   
         +---------------------------------------------------------------+      
   eip   |                               |                               |    ip
         +---------------------------------------------------------------+      

An assembly language program can refer to %al, %ah, %ax, and %eax (the Extended ax register) individually; each name selects a different portion of the same register.
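
The relationship between these four names can be modeled in Python. The class below is a sketch, not assembly code: it stores one 32-bit value and exposes ax, ah, and al as views onto parts of it. The class name is hypothetical.

```python
class ExtendedRegister:
    """Model of one 32-bit register with 16-bit and 8-bit sub-registers."""

    def __init__(self, value=0):
        self.eax = value & 0xFFFFFFFF

    @property
    def ax(self):
        return self.eax & 0xFFFF            # low 16 bits

    @ax.setter
    def ax(self, value):
        self.eax = (self.eax & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def ah(self):
        return (self.eax >> 8) & 0xFF       # bits 15..8

    @ah.setter
    def ah(self, value):
        self.eax = (self.eax & 0xFFFF00FF) | ((value & 0xFF) << 8)

    @property
    def al(self):
        return self.eax & 0xFF              # bits 7..0

    @al.setter
    def al(self, value):
        self.eax = (self.eax & 0xFFFFFF00) | (value & 0xFF)


if __name__ == "__main__":
    reg = ExtendedRegister(0x12345678)
    print(hex(reg.ax), hex(reg.ah), hex(reg.al))
    reg.al = 0xFF                           # only the low byte changes
    print(hex(reg.eax))
```

Writing one of the smaller names changes only its own bits; the rest of the 32-bit register keeps its previous contents.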

Because the 386 was a 32-bit machine using 32-bit addresses, up to 4 Gigabytes of memory could be directly addressed. For backward compatibility, the architecture still included the segment registers, but by setting all of the segments to start at address 0x00000000, the segmentation hardware seems to "disappear" creating what is called the "flat" memory model. Later processors have added many features (in addition to much higher speeds), but the 386 instruction set (IA-32) is the basis for all of the later processors. In this course, we are really just studying the Intel 386 architecture, a design that is more than 25 years old.
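
The memory limits quoted in these notes all follow from the number of address bits. The short Python sketch below recomputes them; the function names are illustrative, and K, M, and G are taken to mean 2**10, 2**20, and 2**30.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 2 ** address_bits


def describe(address_bits):
    """Format the size using binary K, M, and G units."""
    size = addressable_bytes(address_bits)
    for unit, factor in (("G", 2 ** 30), ("M", 2 ** 20), ("K", 2 ** 10)):
        if size >= factor:
            return f"{size // factor}{unit} bytes"
    return f"{size} bytes"


if __name__ == "__main__":
    # Address widths mentioned in the text for each processor.
    for name, bits in (("8008", 14), ("8080", 16), ("8086", 20),
                       ("286", 24), ("386", 32)):
        print(name, bits, "bits:", describe(bits))
```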

 

Intel 486 Processor

http://en.wikipedia.org/wiki/Intel_80486

(P4, 1989, .8 microns, 25 MHz, 1.2M transistors, 168-238 pin PGA)

The fourth design (P4) of the x86 processor family. Integrated the floating point unit on the processor chip (as did later models of the Intel 386). Basically just a faster Intel 386.

 

Pentium Processor

http://en.wikipedia.org/wiki/P5_(microarchitecture)

(P5, 1993, .8 microns, 60 MHz, 3.1M transistors, 273 pin socket 4)

The fifth design (P5) of the x86 processor family. Basically a faster 486. As the Pentium evolved, later models included 8-16 KB of on-chip level 1 cache and 1-4 MB of level 2 cache. The Pentium MMX Processor added MMX (MultiMedia eXtensions) instructions which could manipulate (add, subtract, multiply, etc.) 64-bit quantities that represented eight 8-bit numbers, four 16-bit numbers or two 32-bit integers. Instructions that can operate on multiple items of data at the same time (e.g. adding four pairs of 16-bit numbers at the same time) are referred to as SIMD (single instruction, multiple data) instructions.
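
To make the SIMD idea concrete, the Python sketch below treats a 64-bit integer as four independent 16-bit lanes and adds two such values lane by lane. It models only the wrap-around form of a packed add; the function names are hypothetical and the loop stands in for what the hardware does in one instruction.

```python
LANE_BITS = 16
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack_words(words):
    """Pack four 16-bit numbers into one 64-bit quantity (lane 0 is lowest)."""
    if len(words) != LANES:
        raise ValueError("expected exactly four 16-bit values")
    value = 0
    for i, word in enumerate(words):
        value |= (word & LANE_MASK) << (i * LANE_BITS)
    return value


def unpack_words(value):
    """Split a 64-bit quantity back into its four 16-bit lanes."""
    return [(value >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def packed_add_words(a, b):
    """Add corresponding lanes; a carry never crosses into the next lane."""
    sums = [(x + y) & LANE_MASK
            for x, y in zip(unpack_words(a), unpack_words(b))]
    return pack_words(sums)


if __name__ == "__main__":
    a = pack_words([0xFFFF, 2, 3, 4])
    b = pack_words([1, 20, 30, 40])
    print(unpack_words(packed_add_words(a, b)))
```

The difference from an ordinary 64-bit add is that an overflow in one lane wraps within that lane instead of carrying into its neighbor.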

 

Pentium Pro

http://en.wikipedia.org/wiki/Pentium_Pro

(P6, 1995, .35 microns, 150 MHz, 6.5M transistors, 387 pin socket 8)

The sixth design (P6) of the x86 processor family. Added conditional moves to the instruction set. The Pentium Pro included 256-512 KB of level 2 cache on a separate chip inside the package containing the processor chip.

 

Pentium II

(P6, 1997, .35 microns, 233 MHz, 6.5M transistors, 242-contact slot 1)

Added the MMX instructions to the Pentium Pro P6 architecture along with an integrated L2 cache. Packaged on a separate "daughter card" that plugged into a slot on the motherboard.

 

Pentium III

http://en.wikipedia.org/wiki/Pentium_III

(P6, 1999, .25 microns, 450 MHz, 9.5M transistors, Socket 370)

Added the SSE (Streaming SIMD Extensions) instruction set to augment the MMX instructions (which used the floating point registers but could only process integer operands). Included 70 new instructions and eight new 128-bit registers (%xmm0 through %xmm7), each of which could hold four 32-bit floating point numbers. SSE2 (introduced with the Pentium 4) added integer support, providing SIMD instructions for data types from 8-bit integer to 64-bit floating point, making the MMX instructions somewhat redundant. The Pentium III added a unique (and controversial) identification number to each chip called PSN (Processor Serial Number) that could be read by software through the CPUID instruction.
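
The layout of an SSE register, four 32-bit floating point numbers in 128 bits, can be illustrated with Python's struct module. This is a sketch of the data layout and of a lane-by-lane add, with hypothetical function names; it does not execute any SSE instruction.

```python
import struct

REGISTER_FORMAT = "<4f"   # four little-endian 32-bit floats = 16 bytes


def pack_floats(values):
    """Store four floats in a 16-byte (128-bit) register image."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    return struct.pack(REGISTER_FORMAT, *values)


def unpack_floats(register):
    """Read the four 32-bit floats back out of a register image."""
    return list(struct.unpack(REGISTER_FORMAT, register))


def packed_add_floats(a, b):
    """Add corresponding lanes, rounding each sum back to 32 bits."""
    sums = [x + y for x, y in zip(unpack_floats(a), unpack_floats(b))]
    return pack_floats(sums)


if __name__ == "__main__":
    xmm0 = pack_floats([1.0, 2.0, 3.0, 4.0])
    xmm1 = pack_floats([0.5, 0.5, 0.5, 0.5])
    print(len(xmm0) * 8, "bits")
    print(unpack_floats(packed_add_floats(xmm0, xmm1)))
```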

 

Pentium 4

http://en.wikipedia.org/wiki/Pentium_4

(P7, 2000, .18 micron, 1.4 GHz, 42M transistors, socket 423, 478)

Intel referred to the seventh design of the Intel x86 processor as the NetBurst microarchitecture. It was designed to outperform the AMD Athlon processors by using a deep pipeline (20 to 31 stages) with very high clock frequencies. (In contrast, the various P6 designs used 10 to 15 stage pipelines). The Pentium 4 never lived up to expectations (the Athlons were faster) but by 2006, it reached 3.8 Gigahertz with 169 million transistors using a 90nm process.

The Pentium 4 had a long lifetime and various features were added over the years. In 2003, hyper threading was added. To perform a task switch (e.g. temporarily stop running my program and start running yours), the processor must save hundreds of bytes held in processor registers. Hyper threading duplicated many of the processor registers to allow the task switch to be made in just a few clock cycles. When one task "stalls" because of a cache miss or a mis-predicted branch instruction, a processor with hyper threading can quickly switch to another task. To allow the operating system to take advantage of this fast switch, a single core processor with hyper threading appears to be a dual core processor, but only the registers, not the entire processor, have been duplicated.

The Pentium 4's 32-bit addresses allow up to 4 gigabytes of memory to be directly accessed. It became clear that servers were going to require more memory than this and, in 2001, Intel introduced the 64-bit Itanium processors that used a completely new instruction set (IA-64). AMD took a different approach to 64-bit architecture. In 2003, they introduced the Opteron processor with the AMD-64 architecture which "stretched" the Pentium's registers from 32 to 64 bits (and increased the number of general registers from 8 to 16) while maintaining compatibility with the IA-32 (Pentium) architecture. In 2005, Intel followed AMD by adding the AMD-64 features to the Pentium 4, calling the resulting architecture Intel 64.

  64-bit                                      32-bit       8-bit   8-bit    16-bit
   regs                                        regs         high     low      regs

   bits    63                          32 31           16 15    8 7      0
         +-------------------------------+---------------+---------------+
   %rax  |                               |      %eax     |  %ah  |  %al  |   %ax
         +-------------------------------+---------------+---------------+
   %rbx  |                               |      %ebx     |  %bh  |  %bl  |   %bx
         +-------------------------------+---------------+---------------+
   %rcx  |                               |      %ecx     |  %ch  |  %cl  |   %cx
         +-------------------------------+---------------+---------------+
   %rdx  |                               |      %edx     |  %dh  |  %dl  |   %dx
         +-------------------------------+---------------+---------------+
   %rsp  |                               |      %esp     |       | %spl  |   %sp
         +-------------------------------+---------------+---------------+
   %rbp  |                               |      %ebp     |       | %bpl  |   %bp
         +-------------------------------+---------------+---------------+
   %rsi  |                               |      %esi     |       | %sil  |   %si
         +-------------------------------+---------------+---------------+
   %rdi  |                               |      %edi     |       | %dil  |   %di
         +-------------------------------+---------------+---------------+
   %r8   |                               |      %r8d     |       | %r8b  |   %r8w
         +-------------------------------+---------------+---------------+
   %r9   |                               |      %r9d     |       | %r9b  |   %r9w
         +-------------------------------+---------------+---------------+
   %r10  |                               |     %r10d     |       | %r10b |   %r10w
         +-------------------------------+---------------+---------------+
   %r11  |                               |     %r11d     |       | %r11b |   %r11w
         +-------------------------------+---------------+---------------+
   %r12  |                               |     %r12d     |       | %r12b |   %r12w
         +-------------------------------+---------------+---------------+
   %r13  |                               |     %r13d     |       | %r13b |   %r13w
         +-------------------------------+---------------+---------------+
   %r14  |                               |     %r14d     |       | %r14b |   %r14w
         +-------------------------------+---------------+---------------+
   %r15  |                               |     %r15d     |       | %r15b |   %r15w
         +-------------------------------+---------------+---------------+
   
         +---------------------------------------------------------------+      
    rip  |                               |      eip      |      ip       |   
         +---------------------------------------------------------------+      
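
The bit positions in the diagram above translate directly into shifts and masks. The Python sketch below extracts the %eax, %ax, %ah, and %al portions from a 64-bit %rax value; the function name and the sample value are assumptions made for the example.

```python
def sub_registers(rax):
    """Return the named portions of a 64-bit register value.

    Bit ranges follow the diagram: eax is bits 31..0, ax is bits 15..0,
    ah is bits 15..8, and al is bits 7..0.
    """
    rax &= 0xFFFFFFFFFFFFFFFF
    return {
        "rax": rax,
        "eax": rax & 0xFFFFFFFF,
        "ax": rax & 0xFFFF,
        "ah": (rax >> 8) & 0xFF,
        "al": rax & 0xFF,
    }


if __name__ == "__main__":
    for name, value in sub_registers(0x1122334455667788).items():
        print(f"%{name:<4} {value:#x}")
```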

The irony that began with the Intel 8008 (which was designed by CTC) continues. Intel’s current processors use a 64-bit architecture designed by AMD.

 

Newer Intel Microarchitecture Designs

http://en.wikipedia.org/wiki/File:IntelProcessorRoadmap-3.svg

Beginning with the NetBurst microarchitecture described above, Intel used names rather than numbers to identify its successive microprocessor microarchitectures. The designs that followed NetBurst are closer to the design of the P6 (Pentium Pro) than to the NetBurst-based Pentium 4. They use a shallower pipeline (e.g. 14 stages) and a slower clock speed, yet outperform the Pentium 4 with its longer pipeline and faster clock.

The first Pentium 4’s were based on chips containing 42 million transistors and later Pentium 4’s had over 100 million transistors. As transistor counts increased toward the billion transistor chip, Intel developed a succession of more powerful microarchitectures. The newer chips feature:

In the future, we will see more of the logic on the motherboard (e.g. memory management logic) integrated into the processor chip. See http://en.wikipedia.org/wiki/File:IntelProcessorRoadmap-3.svg. The newer microarchitectures include:

                             Evolution of Intel x86 Architecture

 Processor    design   year      process        speed      transistors    pins

 8008                  1972    10 microns      500 kHz           3,500    18
 8080                  1974     6 microns        2 MHz           6,000    40
 8086          P1      1978     3 microns        5 MHz          29,000    40
 286           P2      1982   1.5 microns        6 MHz         134,000    68
 386           P3      1985   1.5 microns       16 MHz         275,000    132
 486           P4      1989    .8 microns       25 MHz       1,200,000    168-238
 Pentium       P5      1993    .8 microns       60 MHz       3,100,000    273
 Pentium Pro   P6      1995   .35 microns      150 MHz       6,500,000    387
 Pentium II    P6      1997   .35 microns      233 MHz       6,500,000    242
 Pentium III   P6      1999   .25 microns      450 MHz       9,500,000    370
 Pentium 4             2000   .18 microns      1.4 GHz      42,000,000    423-478
 Core 2 Duo            2006  .065 microns     2.66 GHz     291,000,000    775
 Core i3-5-7           2008  .045 microns      2.8 GHz     731,000,000    1366
 Sandy Bridge          2011  .032 microns      3.8 GHz     915,000,000    1155 (2011)
 Ivy Bridge            20??  .022 microns                                 1155 (2011)
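
One way to read the transistor column is to ask how often the count doubled. The Python sketch below estimates an average doubling time from any two rows of the table; the function name is hypothetical, and the result is only as good as the rounded figures it is fed.

```python
import math


def doubling_time_years(year0, count0, year1, count1):
    """Average number of years per doubling between two table rows."""
    doublings = math.log2(count1 / count0)
    return (year1 - year0) / doublings


if __name__ == "__main__":
    # 8008 (1972, 3,500 transistors) to Sandy Bridge (2011, 915,000,000).
    years = doubling_time_years(1972, 3_500, 2011, 915_000_000)
    print(f"about {years:.1f} years per doubling")
```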