CIS 351 | Lab 10: Pipeline | Winter 2021
The purpose of this lab is to become more familiar with the concept of pipelining, as found in modern CPU architectures. Specifically, the lab involves the use of a graphical pipeline simulator. This tool will be used to further your understanding of how a pipeline works, how it is implemented, and how it deals with the problems introduced by using this high-performance technique.
The simulator is DLXview. Its purpose is to graphically illustrate the operation of the pipeline implemented in the DLX architecture. DLX is the name Patterson and Hennessy gave to the hypothetical machine and instruction set we have been studying. (Although I don't think they use the DLX name in the current edition of the textbook.)
Type dlxview at the command prompt.
Click the configure button. A pop-up window will present a set of scheduling modes.
Select basic pipeline. A pop-up window will present a set of configuration parameters appropriate for the pipeline mode you have selected.
Accept the default configuration by clicking OK.
Click OK.
Download exampleDLX-1.s into your current directory.
Click the load button. A file selection window will pop up.
Select exampleDLX-1.s by clicking on it.
Click load.
Click done.
Click the step forward button to begin execution. A pop-up window will request a starting address.
Click the default button.
Use the step forward button or the next cycle button to progress through the assembly program. Stepping advances execution forward to the next instruction. Cycling advances the simulation by one clock cycle. These are usually, but not always, the same thing (i.e., most instructions take a single cycle). The step back and previous cycle buttons will also be very helpful in observing and understanding pipeline operation.
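The relationship between clock cycles and an instruction's progress can be sketched in a few lines of Python. This is a minimal illustration, not a model of DLXview itself: it assumes the classic five stage names (IF, ID, EX, MEM, WB), one instruction fetched per cycle, and no stalls. The names `stage_at`, `timing_diagram`, and `first_cycle` are hypothetical. `first_cycle` is the label of the cycle in which the first instruction is fetched, since "the xth cycle" and "the cycle labeled x" differ if the labels start at 0.

```python
# Assumed stage names for a classic five-stage pipeline; the simulator's
# own labels may differ.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def stage_at(instr_index, cycle, first_cycle=1):
    """Return the stage holding instruction `instr_index` (0-based program
    order) during `cycle`, or None if it is not in the pipeline.

    Assumes one instruction is fetched per cycle and nothing ever stalls.
    """
    offset = cycle - first_cycle - instr_index
    if 0 <= offset < len(STAGES):
        return STAGES[offset]
    return None


def timing_diagram(n_instrs, first_cycle=1):
    """Build one text row per instruction, one column per cycle."""
    last_cycle = first_cycle + (n_instrs - 1) + (len(STAGES) - 1)
    rows = []
    for i in range(n_instrs):
        cells = [
            (stage_at(i, c, first_cycle) or ".").ljust(4)
            for c in range(first_cycle, last_cycle + 1)
        ]
        rows.append(f"i{i}: " + "".join(cells).rstrip())
    return rows
```

Calling `print("\n".join(timing_diagram(3)))` prints a staircase similar in shape to the simulator's timing diagram, which you can compare against what you observe while cycling.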
and r6, r7, r8 in during cycle number 5 (i.e., the cycle labeled 5)?
and r12, r13, r14? (Make sure your answer clearly states "the xth cycle" or "the cycle labeled x".)
and r12, r13, r14? (Make sure your answer clearly states "the xth cycle" or "the cycle labeled x".)
block diagram shows the entire pipeline as labeled stages
integer datapath presents a detailed device-level view of the pipeline
The assembly program loaded earlier (exampleDLX-1.s) is a very simple program with no data dependencies, branches, or other potential problems. The code is well-documented and should be easily understood. Take some time now to execute the program, familiarizing yourself with the different views and the operation of the simulator.
Pay particular attention to the register file. Notice how it indicates which registers are being read from and written to. Also pay close attention to the muxes. They help indicate which of the many values generated in the previous stage are actually being used in the current stage. (In other words, watching the lines through the muxes shows you where the inputs to the ALU are coming from.) Stepping backward and forward will help you understand how an individual instruction progresses through the pipeline.
Note: we are only concerned here with the operation and control of the pipeline. In other words, we don't care what data values are actually in the registers or what the results of the operations are. It is possible, however, to initialize registers to a desired value using a *.i file. See the User's manual for additional information.
Note: it is possible at any time to change the operation of the simulator by editing the source code (see the file selection window).
Let's trace through the execution of a single instruction. The first instruction in the example program loads a word from memory into a register. Recall that DLX data addressing uses the indexed method. The word to be read from memory is located at the address contained in the instruction itself (i.e. the constant value 0), offset (or indexed) by the value contained in register R2. The word read from memory is to be loaded into register R1.
load operation). In addition, any necessary values are read from the register file. In this case, the value of register R2 is read because it is needed for subsequent memory addressing. The simulator indicates the register read by displaying a colored R2 in the box representing the register file.
read values, the value displayed near the bottom represents a write operation). Tracing back the wire representing the register file write operation reveals the origin of the value being written: the data memory.
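The load traced above can be summarized as a short Python sketch. It is only an illustration under stated assumptions: the five stage names are the classic ones, the register and memory contents are made-up values (the lab does not depend on them), and `execute_load` is a hypothetical helper rather than part of the simulator. The point is the order of events: the base register is read, the ALU adds the constant offset, data memory is read, and the register file is written last.

```python
def execute_load(regs, memory, dest, base, offset):
    """Walk one load-word instruction through five assumed pipeline stages.

    regs   -- dict mapping register names to values (register file)
    memory -- dict mapping addresses to words (data memory)
    Returns a list of (stage, description) pairs and updates regs[dest].
    """
    trace = [("IF", "fetch the instruction word")]

    base_value = regs[base]              # ID: register file read
    trace.append(("ID", f"read {base} = {base_value}"))

    address = offset + base_value        # EX: ALU forms the indexed address
    trace.append(("EX", f"address = {offset} + {base_value} = {address}"))

    word = memory[address]               # MEM: data memory read
    trace.append(("MEM", f"memory[{address}] = {word}"))

    regs[dest] = word                    # WB: register file write
    trace.append(("WB", f"write {dest} = {word}"))
    return trace


# Illustrative values only: R2 holds 8 and the word at address 8 is 42.
example_regs = {"R1": 0, "R2": 8}
example_memory = {8: 42}
example_trace = execute_load(example_regs, example_memory, "R1", "R2", 0)
```

Printing `example_trace` lists the five stages in order, which mirrors what you see when stepping the first instruction through the integer datapath view.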
nops at the end of the sample program?
A structural hazard is another term for a resource conflict. Resource conflicts (we will discuss this in more detail in class) refer to contention for a specific functional unit. Take a moment to study the integer data path of the DLX pipeline.
The Instruction Fetch stage may access memory during the same cycle as the Memory Access stage: one stage is reading an instruction while the other is reading a data value. Only one word can be read from a memory unit at a time, so this is an example of a resource conflict.
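To see when such a conflict would arise, the following Python sketch lists the cycles in which Instruction Fetch and Memory Access would both need memory. It assumes, purely for illustration, a single shared memory unit, the five classic stages, and an ideal pipeline in which instruction i (0-based) is fetched in cycle i + 1. The function name `memory_conflicts` is hypothetical. Whether the DLX datapath actually has this conflict is something to determine from the diagram.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # assumed stage order


def memory_conflicts(uses_data_memory):
    """Find cycles where IF and MEM would compete for one shared memory.

    uses_data_memory -- list of booleans, one per instruction in program
                        order, True if the instruction reads or writes data.
    Returns a list of (cycle, mem_instr_index, fetched_instr_index).
    Assumes instruction i is fetched in cycle i + 1 and nothing stalls.
    """
    conflicts = []
    n = len(uses_data_memory)
    for i, touches_memory in enumerate(uses_data_memory):
        if not touches_memory:
            continue
        mem_cycle = (i + 1) + STAGES.index("MEM")  # cycle when i is in MEM
        fetched = mem_cycle - 1                    # instruction in IF then
        if fetched < n:                            # something is being fetched
            conflicts.append((mem_cycle, i, fetched))
    return conflicts
```

For example, `memory_conflicts([True, False, False, False])` reports a conflict in cycle 4, when instruction 0 is in MEM and instruction 3 is being fetched.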
load or an ALU operation). Program exampleDLX-2.s contains such a dependency. Download the program and study the code. Load and run the program and study its execution.
sub r3, r4, r5 is routed directly to the main ALU.
add instruction is writing register r3 during the same cycle that the second add instruction is reading r3.
r3 will be read if the DLX processor used the register file provided for Project 4? If the wrong value would be read, describe how you would properly coordinate the reading and writing of the registers.
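When studying a program for data dependencies, it can help to list them mechanically. The Python sketch below finds read-after-write pairs: for each source register, it looks back for the most recent earlier instruction that writes it. The instruction encoding (opcode, destination, sources), the function name `raw_dependencies`, and the sample program are all hypothetical; the sample is not the contents of exampleDLX-2.s. The default `window` of 3 is an assumption based on a five-stage pipeline, where a producer three instructions ahead is writing back in the same cycle the consumer reads its registers.

```python
def raw_dependencies(program, window=3):
    """List read-after-write dependencies in a straight-line program.

    program -- list of (opcode, dest, sources) tuples; dest may be None.
    window  -- how many earlier instructions to inspect (assumed 3 here).
    Returns (producer_index, consumer_index, register) triples.
    """
    deps = []
    for j, (_, _, sources) in enumerate(program):
        for src in sources:
            # Search backward for the most recent writer of this register.
            for i in range(j - 1, max(j - 1 - window, -1), -1):
                if program[i][1] == src:
                    deps.append((i, j, src))
                    break
    return deps


# Hypothetical four-instruction program for illustration only.
sample = [
    ("sub", "r3", ("r4", "r5")),
    ("add", "r6", ("r3", "r7")),
    ("nop", None, ()),
    ("add", "r8", ("r3", "r9")),
]
sample_deps = raw_dependencies(sample)
```

Each triple tells you which pair of instructions to watch in the datapath view: the smaller the distance between producer and consumer, the earlier the value must be supplied.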
Not all data dependencies can be solved by forwarding. Program exampleDLX-4.s contains an example of a type of hazard that forwarding cannot eliminate. Download the program and study the code. Load and run the program and study its execution.
Examine all of the instructions in the example program and identify their dependencies. Notice that the add instruction immediately follows a load instruction. This is an example of a dependency immune to forwarding. Execute the program using cycles instead of stepping. Refer to the timing diagram and notice the presence of stalls in the pipeline. A stall is a delay cycle, or bubble, inserted into the pipeline. When this type of data dependency (called a load data hazard) is detected, hardware known as a load interlock will delay subsequent instructions to resolve the timing problem. Study the operation of this program until you understand both the problem and the solution.
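The detection rule a load interlock applies can be sketched as follows. This is a simplified Python illustration, not the DLX hardware: it assumes a one-cycle bubble, checks only the instruction immediately after a load, and uses `lw` as an assumed load mnemonic. The instruction encoding and the name `count_load_stalls` are hypothetical.

```python
def count_load_stalls(program, load_opcode="lw"):
    """Insert a one-cycle bubble after each load whose result is used by
    the very next instruction.

    program -- list of (opcode, dest, sources) tuples in program order.
    Returns (number_of_stalls, issue_order), where issue_order is the list
    of opcodes with "bubble" entries marking the inserted delay cycles.
    """
    stalls = 0
    issue_order = []
    prev = None
    for instr in program:
        opcode, _, sources = instr
        # Load data hazard: previous instruction is a load and this one
        # reads the register that load will write.
        if prev is not None and prev[0] == load_opcode and prev[1] in sources:
            issue_order.append("bubble")
            stalls += 1
        issue_order.append(opcode)
        prev = instr
    return stalls, issue_order
```

For a hypothetical sequence of a load into r1 followed directly by an add that reads r1, the function reports one stall. Placing an independent instruction between the two removes it, which is worth comparing with what the timing diagram shows.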
add operation be delayed one cycle? Your answer should consider timing issues and functional units. Be sure to explain why forwarding cannot solve the problem.
The final type of hazard involves control transfer -- branches and jumps. Program exampleDLX-5.s implements control transfer in the form of a loop. Download, study, and execute the code until you understand its operation. The comments in the code should help you understand what the loop is doing.
The DLX hardware resolves branches in the Instruction Decode stage of the pipeline. Instead of a beq instruction that uses subtraction in the ALU to compare two registers, the DLX hardware has bez and bnz instructions that branch if the specified register is equal to or not equal to 0, respectively. These new branch instructions allow the hardware to determine by the end of the second cycle whether to take the branch.
Notice that the instruction immediately following the branch instruction is a nop. This nop fills the branch delay slot and is typically inserted by the compiler.
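The effect of a delay slot on execution order can be illustrated with a tiny interpreter. The Python sketch below is an assumption-laden toy, not the DLX instruction set: it supports only a hypothetical `addi`, the `bez` and `bnz` branches named above (with the target given as an instruction index), and `nop`. It also assumes a delay slot never contains another branch. The key behavior is that the instruction after a branch always executes, and a taken branch redirects the program counter only after that.

```python
def run(program, regs, max_steps=100):
    """Execute a toy program with a one-instruction branch delay slot.

    program -- list of tuples:
               ("addi", dest, src, imm), ("bnz", reg, target),
               ("bez", reg, target), or ("nop",)
    regs    -- dict of register values, updated in place.
    Returns the list of instruction indices in the order executed.
    """
    pc = 0
    executed = []
    pending_target = None  # taken-branch target, applied after the delay slot
    for _ in range(max_steps):
        if pc >= len(program):
            break
        instr = program[pc]
        executed.append(pc)
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction was in the delay slot; now the branch lands.
            next_pc = pending_target
            pending_target = None
        op = instr[0]
        if op == "addi":
            _, dest, src, imm = instr
            regs[dest] = regs[src] + imm
        elif op == "bnz" and regs[instr[1]] != 0:
            pending_target = instr[2]
        elif op == "bez" and regs[instr[1]] == 0:
            pending_target = instr[2]
        pc = next_pc
    return executed


# Hypothetical countdown loop: decrement r1, loop while it is nonzero.
loop = [("addi", "r1", "r1", -1), ("bnz", "r1", 0), ("nop",)]
loop_regs = {"r1": 2}
loop_order = run(loop, loop_regs)
```

In `loop_order`, index 2 (the nop) appears after every execution of the branch, including the iteration in which the branch is taken.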
nop?
nop.
add r6, r4, r5 in the "write-back" phase?)
Updated Saturday, 27 March 2021, 8:01 PM