## Chapter 06: Instruction Pipelining and Parallel Processing

## Example of the Pipelined CISC and RISC Processors

## Objective

• To understand pipelines and parallel pipelines in CISC and RISC processors through examples

#### **Most modern processors**

- Improve performance through pipelining and parallel processing of instructions; Sun SPARC, Pentium, and PowerPC processors are examples of superscalar designs

## **Pentium and PowerPC pipelines in parallel**

- Performance improves when the processor lets independent instructions execute simultaneously (instruction-level parallelism)
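What "independent" means here can be sketched with a small check. This is a minimal illustration, not any processor's actual issue logic: the `Instr` record, the register names, and the `independent` helper are all hypothetical, and only read-after-write and write-after-write conflicts are tested.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: one destination, some sources."""
    text: str
    dest: str
    srcs: tuple = ()


def independent(first: Instr, second: Instr) -> bool:
    """True if `second` may run alongside `first` (program order: first, second).

    Simplified assumption: only RAW and WAW conflicts are checked.
    """
    raw = first.dest in second.srcs   # second reads what first writes
    waw = first.dest == second.dest   # both write the same register
    return not (raw or waw)


a = Instr("add r1, r2, r3", dest="r1", srcs=("r2", "r3"))
b = Instr("sub r4, r5, r6", dest="r4", srcs=("r5", "r6"))
c = Instr("mul r7, r1, r4", dest="r7", srcs=("r1", "r4"))

print(independent(a, b))  # True: no shared registers
print(independent(a, c))  # False: c reads r1, which a writes
```

Pairs that pass the check are candidates for simultaneous execution; pairs that fail must run one after the other.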

#### Pentiums

- Are superscalar processors
- Use a simple 1-bit technique for dynamic branch prediction
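A 1-bit dynamic branch predictor can be sketched as follows: it remembers one bit per branch, the last outcome, and predicts that the branch will do the same again. This is a minimal sketch under assumptions; the class name, the dictionary used as the history table, the initial not-taken guess, and the branch address are all hypothetical and do not describe the Pentium's actual circuitry.

```python
class OneBitPredictor:
    """Toy 1-bit predictor: one remembered outcome per branch address."""

    def __init__(self):
        self.last_taken = {}  # branch address -> last outcome (True = taken)

    def predict(self, addr):
        # Assumption: a branch never seen before is predicted not-taken.
        return self.last_taken.get(addr, False)

    def update(self, addr, taken):
        self.last_taken[addr] = taken


def count_mispredictions(predictor, addr, outcomes):
    """Feed a sequence of actual outcomes and count wrong predictions."""
    misses = 0
    for taken in outcomes:
        if predictor.predict(addr) != taken:
            misses += 1
        predictor.update(addr, taken)
    return misses


# A loop branch: taken 4 times, then falls through; the loop runs twice.
outcomes = [True] * 4 + [False] + [True] * 4 + [False]
misses = count_mispredictions(OneBitPredictor(), 0x400, outcomes)
print(misses)  # 4
```

The single bit mispredicts at both the entry and the exit of each loop run, which is the known weakness of the 1-bit scheme.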

## **Two pipelines of the Pentiums**

- Named U and V
- V pipeline: for simple instructions
- U pipeline: for any instruction
- 64-bit wide external data bus
- A conditional jump could be the second of the two instructions running in parallel
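The U/V rule above can be sketched as a toy issue loop. It follows only the rule stated here (U accepts any instruction, V accepts only simple ones); the `is_simple` flags, the instruction names, and the `issue_pairs` helper are hypothetical, and data dependencies and every other pairing condition of real hardware are ignored.

```python
def issue_pairs(program):
    """Assign instructions to the U and V pipelines, cycle by cycle.

    program: list of (name, is_simple) tuples in program order.
    Returns a list of (u_instr, v_instr_or_None), one entry per issue cycle.
    """
    slots = []
    i = 0
    while i < len(program):
        u = program[i][0]          # U takes the next instruction, whatever it is
        v = None
        # V takes the following instruction only if it is simple.
        if i + 1 < len(program) and program[i + 1][1]:
            v = program[i + 1][0]
            i += 1
        slots.append((u, v))
        i += 1
    return slots


program = [
    ("mov", True),
    ("div", False),   # assumed complex: cannot go to V
    ("add", True),
    ("cmp", True),
    ("jnz", True),    # conditional jump issued as the second of a pair
]

for cycle, (u, v) in enumerate(issue_pairs(program), start=1):
    print(cycle, u, v)
```

With this input, `mov` issues alone because `div` cannot enter V, and the conditional jump `jnz` ends up as the V-pipeline partner of `cmp`.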



Copyright © The McGraw-Hill Companies Inc. Indian Special Edition 2009

## **Two instructions proceed through the parallel pipelines**

- Two instructions proceed through the parallel pipelines at one stage per cycle until they reach the register (result) write-back (WB) stage
- At WB, execution of instruction $I_n$ is complete in pipeline 1 and execution of $I_{n+1}$ is complete in pipeline 2
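The cycle in which each instruction reaches WB can be estimated with a short calculation. This is an idealized sketch that assumes no stalls; the five stage names are an assumption (only WB is named in the text), and `wb_cycle` and `total_cycles` are hypothetical helpers.

```python
STAGES = ["PF", "D1", "D2", "EX", "WB"]  # assumed stage names; WB is the last


def wb_cycle(index, width=2, stages=len(STAGES)):
    """Cycle (1-based) in which instruction `index` (0-based) completes WB.

    `width` instructions enter the pipelines per cycle and then advance
    one stage per cycle, with no stalls (idealized assumption).
    """
    issue_cycle = index // width + 1
    return issue_cycle + stages - 1


def total_cycles(n, width=2, stages=len(STAGES)):
    """Cycles needed to complete n instructions."""
    return wb_cycle(n - 1, width, stages) if n else 0


for i in range(4):
    print(f"I{i}: pipeline {i % 2 + 1}, WB in cycle {wb_cycle(i)}")

# Ten instructions: one pipeline versus two parallel pipelines.
print(total_cycles(10, width=1), total_cycles(10, width=2))  # 14 9
```

Each pair of instructions completes WB in the same cycle, so in the ideal case the second pipeline nearly halves the cycles spent issuing instructions.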

- Earlier Pentium versions executed two integer instructions or two floating-point instructions in parallel
- Later Pentium versions had three independent units: two for integer operations and one for floating-point operations
- The Pentium Pro version had an additional pipeline and supported two integer operations and two floating-point operations in four pipelines
- **Pentium II** versions had additional MMX instructions

#### **Pentium MMX**

(a) 128-bit MMX registers, each for two packed 64-bit floating-point operations, and pipelines have 20 stages

(b) 128-bit MMX registers, each for two packed 64-bit floating-point operations, and pipelines have 10 stages

(c) Two 64-bit MMX registers, each for one packed 64-bit floating-point operation, and pipelines have 5 stages

(d) Four 32-bit MMX registers, each for two packed 64-bit floating-point operations, and there are four sets of 5-stage pipelines

#### **Pentium III versions**

- Introduced single-instruction, multiple-data (SIMD) instructions, an extension for executing streaming floating-point operations in the pipelines
- Pipelines had 10 stages

#### **Pentium IV versions**

- Introduced in the twenty-first century with clock frequencies above 1 GHz
- 128-bit XMM (extended multimedia) registers, each holding two packed 64-bit floating-point values
- The 128-bit registers also handle long integers
- Pipelines have 20 stages
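How two 64-bit floating-point values share one 128-bit register can be sketched with Python's `struct` module. This only models the data layout and a lane-by-lane add in software; the helper names `pack_xmm`, `unpack_xmm`, and `packed_add` are hypothetical, and the little-endian layout is an assumption.

```python
import struct


def pack_xmm(lo: float, hi: float) -> bytes:
    """Pack two 64-bit doubles into 16 bytes (128 bits), low lane first."""
    return struct.pack("<2d", lo, hi)


def unpack_xmm(reg: bytes):
    """Return the two 64-bit lanes as a (lo, hi) tuple."""
    return struct.unpack("<2d", reg)


def packed_add(a: bytes, b: bytes) -> bytes:
    """Add two packed registers lane by lane: one operation, two results."""
    a0, a1 = unpack_xmm(a)
    b0, b1 = unpack_xmm(b)
    return pack_xmm(a0 + b0, a1 + b1)


x = pack_xmm(1.5, 2.0)
y = pack_xmm(0.25, 4.0)
result = packed_add(x, y)
print(len(result) * 8, unpack_xmm(result))  # 128 (1.75, 6.0)
```

In hardware, a single packed instruction performs both lane additions at once; here the two additions are written out explicitly.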



## **ARM (Advanced RISC Machines)**

- ARM (Advanced RISC Machines) developed a wide range of superscalar variants to suit application-specific needs

- It has 16 GPRs plus a separate CPSR and SPSR (Current Program Status Register and Saved Program Status Register)

## **ARM RISC design**

- ARM7TDMI had a three-stage pipeline

## **ARM7 three-stage pipeline**
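The overlap of instructions in a three-stage pipeline can be sketched as a text timing diagram. The stage names Fetch, Decode, and Execute are the usual ones for ARM7; the `timing_rows` helper is hypothetical, and the sketch assumes one stage per cycle with no stalls.

```python
STAGES = ("Fetch", "Decode", "Execute")


def timing_rows(n_instructions):
    """Build one row per instruction; each column is one clock cycle.

    F, D, E mark the stage the instruction occupies; '-' means it is
    not in the pipeline during that cycle.
    """
    total = n_instructions + len(STAGES) - 1
    rows = []
    for i in range(n_instructions):
        row = ["-"] * total
        for s, name in enumerate(STAGES):
            row[i + s] = name[0]   # instruction i enters one cycle after i-1
        rows.append("".join(row))
    return rows


for i, row in enumerate(timing_rows(4), start=1):
    print(f"I{i}: {row}")
```

Reading down any column from the third cycle onward shows three different instructions in the Fetch, Decode, and Execute stages at the same time.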



## **Pipelines in StrongARM and ARM 9**

#### **Five-Stage Pipeline (One Clock Cycle per Stage)**



- StrongARM has a 5-stage pipeline and high-speed multiplier circuitry
- ARM9 has a 5-stage pipeline

## **Pipelines in ARM 10**

#### **Six-Stage Pipeline (One Clock Cycle per Stage)**

- ARM10 has a 6-stage pipeline

### Summary

#### We learnt

- Superscalar processors: Pentium, PowerPC
- ARM pipelines

End of Lesson 14 on Example of the Pipelined CISC and RISC Processors