### Chapter 07: Instruction-Level Parallelism - VLIW, Vector, Array and Multithreaded Processors ...

Lesson 01:

**Instruction-Level Parallelism in Superscalar Processors and Parallel Computer Systems**

#### **Objective**

- To learn instruction-level parallelism in superscalar processors
- To learn the various forms of parallel computer systems

### Multiple Execution Units in a Superscalar Processor

#### **Superscalar Processor**

- Multiple execution units to execute instructions
- Each execution unit reads its operands from and writes its results to a single, centralized register file


- When an operation writes its result back to the register file, that result becomes visible to all of the execution units on the next cycle, allowing operations to execute on different units from the operations that generate their inputs

Figure: Instruction issue logic and four execution units in a superscalar processor

# Instruction-Level Parallelism (ILP) in Superscalar Processors

- Superscalar processors have complex bypassing hardware that forwards the result of each instruction to all of the execution units, reducing the delay between dependent instructions
- The instruction issue logic handles the instructions that make up a program and issues them to the execution units in parallel

### Instruction Issue Logic in Superscalar Processors

- The instruction issue logic allows control-flow changes, such as branches, to occur simultaneously across all of the units, which makes it much easier to write and compile programs for instruction-level parallel superscalar processors

#### Superscalar Processor Hardware

- Extracts instruction-level parallelism from sequential programs
- During each cycle, the instruction issue logic of a superscalar processor examines the instructions in the sequential program to determine which instructions may be issued on that cycle
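The per-cycle check made by the issue logic can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real processor: the `Instr` record, the `issue_group` and `schedule` helpers, the in-order issue policy, and the one-cycle latency for every instruction are all assumptions made for the example.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str      # register written by the instruction
    srcs: tuple    # registers read by the instruction


def issue_group(window, num_units):
    """Pick the leading instructions of `window` that can issue in one cycle.

    Assumed policy: in-order issue. Scanning stops at the first instruction
    that reads or writes a register written by an instruction already chosen
    for this cycle, or when all execution units are taken.
    """
    chosen, written = [], set()
    for instr in window:
        if len(chosen) == num_units:
            break
        if instr.dest in written or any(s in written for s in instr.srcs):
            break
        chosen.append(instr)
        written.add(instr.dest)
    return chosen


def schedule(program, num_units):
    """Return the list of issue groups, one per cycle (num_units >= 1)."""
    if num_units < 1:
        raise ValueError("need at least one execution unit")
    cycles, remaining = [], list(program)
    while remaining:
        group = issue_group(remaining, num_units)
        cycles.append(group)
        remaining = remaining[len(group):]
    return cycles


program = [
    Instr("r1", ("r2", "r3")),   # r1 = r2 + r3
    Instr("r4", ("r5", "r6")),   # independent of the first instruction
    Instr("r7", ("r1", "r4")),   # needs both earlier results
    Instr("r8", ("r9", "r9")),   # independent, but behind the dependent one
]

for cycle, group in enumerate(schedule(program, num_units=4), start=1):
    print(cycle, [i.dest for i in group])
```

With four units the first two instructions issue together, and the third waits one cycle because its inputs only become visible on the next cycle. A real issue stage also has to account for instruction latencies, unit types, and branches, which this sketch leaves out.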

## Strengths and Weaknesses of Instruction-Level Parallelism


- Superscalar processors can achieve significant speedups on a wide variety of programs by executing instructions in parallel, but their maximum performance improvement is limited by instruction dependencies
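One way to see the dependency limit is to measure the longest chain of dependent instructions in a program. The sketch below is illustrative only: the helper names `critical_path_length` and `max_speedup` are hypothetical, every instruction is assumed to take one cycle, the number of execution units is assumed unlimited, and only true (read-after-write) dependences are counted.

```python
def critical_path_length(program):
    """Length, in instructions, of the longest chain of dependent instructions.

    `program` is a list of (dest, srcs) pairs in program order.
    """
    depth_of_reg = {}   # register -> depth of the instruction that last wrote it
    longest = 0
    for dest, srcs in program:
        depth = 1 + max((depth_of_reg.get(s, 0) for s in srcs), default=0)
        depth_of_reg[dest] = depth
        longest = max(longest, depth)
    return longest


def max_speedup(program):
    """Upper bound on speedup over one-at-a-time execution (non-empty program)."""
    return len(program) / critical_path_length(program)


# Every instruction needs the result of the one before it.
chain = [("r1", ("r0",)), ("r2", ("r1",)), ("r3", ("r2",)), ("r4", ("r3",))]

# No instruction needs the result of another.
independent = [("r1", ("a",)), ("r2", ("b",)), ("r3", ("c",)), ("r4", ("d",))]

print(max_speedup(chain))        # 1.0
print(max_speedup(independent))  # 4.0
```

Under these assumptions, the fully dependent chain gains nothing from extra execution units, while the independent instructions could all run in the same cycle.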

# Effect of more execution units added to a processor

- The incremental performance improvement that results from adding each execution unit decreases
- Going from one execution unit to two gives substantial reductions in execution time


- However, as the number of execution units is increased to four, eight, or more, the additional execution units spend most of their time idle, particularly if the program has not been compiled to take advantage of them
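The diminishing return from extra execution units can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of a real machine: the workload is a hypothetical 15-instruction reduction tree built by `reduction_tree`, every instruction takes one cycle, results become visible on the next cycle, and `cycles_needed` greedily issues up to `num_units` ready instructions per cycle while tracking only true dependences.

```python
def reduction_tree(leaves):
    """Build a program that produces `leaves` values and sums them pairwise."""
    program = [(f"v{i}", ()) for i in range(leaves)]
    level = [f"v{i}" for i in range(leaves)]
    n = leaves
    while len(level) > 1:
        next_level = []
        for a, b in zip(level[0::2], level[1::2]):
            dest = f"v{n}"
            n += 1
            program.append((dest, (a, b)))
            next_level.append(dest)
        level = next_level
    return program


def cycles_needed(program, num_units):
    """Cycles to run `program` when up to `num_units` ready instructions issue per cycle."""
    last_writer, producers = {}, []
    for i, (dest, srcs) in enumerate(program):
        producers.append({last_writer[s] for s in srcs if s in last_writer})
        last_writer[dest] = i

    finished, pending, cycles = set(), list(range(len(program))), 0
    while pending:
        # An instruction is ready once all of its producers finished earlier.
        ready = [i for i in pending if producers[i] <= finished][:num_units]
        finished.update(ready)
        pending = [i for i in pending if i not in finished]
        cycles += 1
    return cycles


program = reduction_tree(8)   # 8 leaf values + 4 + 2 + 1 additions = 15 instructions
for units in (1, 2, 4, 8):
    print(units, cycles_needed(program, units))
```

For this workload the cycle counts are 15, 8, 5, and 4: the second unit nearly halves the execution time, while going from four units to eight saves only one cycle because the dependency chain through the tree is four instructions long.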

#### Forms of Parallel Computer Systems

# Variation in the Size of the Tasks Executed in Parallel Computer Systems

- Instruction-level parallel processors: running instructions in parallel
- Vector processors: performing operations on vector elements in parallel
- Multithreaded processors: running multiple threads, with the instructions of one thread scheduled in the pipelines during each time slot allotted by the OS


- Simultaneous multithreading (SMT), or hyper-threading, in a multithreaded processor: running multiple threads in parallel in a single processor
- Multicore processors: running multiple programs in parallel on two or more cores on the same chip
- Multiprocessors: running programs in parallel on multiple interconnected computers

### Summary

#### We Learnt

- Parallel pipelines with parallel execution units
- Improvement in performance
- Strengths and weaknesses of instruction-level parallelism
- Instruction-level, thread-level, multiple-thread-level, and program-level parallel computer systems

### End of Lesson 01 on Superscalar Processors and Parallel Computer Systems