CIS 451 Week 3
Addressing Modes
- Addressing modes
- How does the machine differentiate between
add r1 <= r2 + r3
and
add r1 <= r2 + 6?
- Addressing modes: How the machine interprets the parameters.
- What are some different addressing modes?
- Register Direct
- Register Indirect
- Memory Direct
- Memory Indirect
- Offset / Displacement
- Immediate
- Implicit
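A minimal sketch of how the modes listed above pick an operand, written as a toy Python model. The register names, memory contents, and the operand() helper are all made up for illustration; they do not come from the Stallings handout or any real ISA encoding.

```python
# Toy machine state; every value here is invented for illustration.
regs = {"r2": 100, "r3": 7}
mem = {100: 42, 42: 9, 108: 5}

def operand(mode, value):
    """Return the operand selected by an addressing mode (hypothetical encoding)."""
    if mode == "immediate":          # the operand is the constant itself
        return value
    if mode == "register_direct":    # the operand is the register's contents
        return regs[value]
    if mode == "register_indirect":  # the register holds the operand's address
        return mem[regs[value]]
    if mode == "memory_direct":      # the instruction holds the operand's address
        return mem[value]
    if mode == "memory_indirect":    # memory holds the address of the operand
        return mem[mem[value]]
    if mode == "displacement":       # address = base register + constant offset
        base, offset = value
        return mem[regs[base] + offset]
    raise ValueError(f"unknown mode: {mode}")

# Implicit addressing is not modeled: the operand is fixed by the opcode
# itself, so there is no parameter to interpret.
print(operand("register_direct", "r3"))    # second source of add r1 <= r2 + r3
print(operand("immediate", 6))             # second source of add r1 <= r2 + 6
print(operand("displacement", ("r2", 8)))  # same shape as lw $t0, 8($a0)
```

The point of the sketch is that the same parameter bits (a small number) mean different things depending on the mode, which is what the machine has to be told.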
- Pass out handout from Stallings
- One difference between ISAs (and a decision to make when designing CPUs) is which, and how many, addressing modes are supported.
- x86 (IA-32) supports memory addresses in arithmetic; MIPS does not. Why?
- Display MIPS single-cycle diagram.
- Load/Store vs. Register/Memory
- CISC vs. RISC
- Not always black and white. RISC processors can have a few complex addressing modes
- Offset addressing in MIPS
- load word looks like this
lw $t0, 8($a0)
- Do we need all three parameters?
- What is the benefit of the third parameter? (Or, alternatively, the cost of removing it?)
- How can we quantify these benefits?
- What is the latency of the single-cycle CPU?
- How does removing the offset affect p (in general)?
- How does removing the offset affect n?
- Look at diagram of single-cycle CPU from Harris and Harris and discuss specific performance values.
- How does removing the offset affect p (specifically; give numbers)?
- Notice that whether the third parameter improves performance depends on the program/workload!
- Suppose only 85% of memory instructions require a helper.
- Compare performance when 20% and 35% of instructions are lw or sw
725 * (1 + (.35*.85)) = 941
725 * (1 + (.20*.85)) = 848
- This is why you usually can’t declare one CPU to be definitively better than another.
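The comparison above can be checked with a short sketch. It reads 925 and 725 as the cycle times with and without offset addressing (the figures used in the notes) and assumes each helper costs exactly one extra instruction; the function name is made up.

```python
P_WITH_OFFSET = 925   # cycle time with offset addressing (from the notes)
P_NO_OFFSET = 725     # cycle time without offset addressing (from the notes)
HELPER_RATE = 0.85    # fraction of lw/sw that need a helper instruction

def time_per_original_instruction(mem_fraction):
    """Average time per original instruction on the no-offset CPU.

    Assumes every helper is one additional instruction of one cycle.
    """
    return P_NO_OFFSET * (1 + mem_fraction * HELPER_RATE)

for mem_fraction in (0.20, 0.35):
    t = time_per_original_instruction(mem_fraction)
    verdict = "faster" if t < P_WITH_OFFSET else "slower"
    print(f"{mem_fraction:.0%} lw/sw: {t:.1f} vs {P_WITH_OFFSET} -> no-offset CPU is {verdict}")
```

With 20% memory instructions the no-offset design wins; with 35% it loses, so the answer depends on the workload.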
- Instruction Mix
- What instruction mix would make the “no offset” MIPS CPU better?
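n' = n*(1 + l)
where l is the fraction of instructions that are lw (if every lw needs a helper).
- want n'p' < np
- With x the fraction of instructions that are lw or sw, and 85% of those needing a helper:
n' = n(1 + .85x)
725(1 + .85x)n < 925n
1 + .85x < 1.276
.85x < .276
x < .32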
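The break-even point can also be solved directly rather than by trial. This is a sketch under the same assumptions as the derivation above (cycle times 725 and 925, one helper instruction for 85% of memory instructions); the function name is made up.

```python
def break_even_mem_fraction(p_with_offset, p_no_offset, helper_rate):
    """Largest fraction x of lw/sw for which the no-offset CPU still wins.

    Solves p_no_offset * (1 + helper_rate * x) < p_with_offset for x.
    """
    return (p_with_offset / p_no_offset - 1) / helper_rate

x_max = break_even_mem_fraction(925, 725, 0.85)
print(f"no-offset CPU is better when x < {x_max:.4f}")
```

Carrying full precision gives a limit a little under one third, so a program with 20% memory instructions favors the no-offset design and one with 35% does not.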
- Should we change the number of registers?
- What would be the consequence of doubling the number of registers from 32 to 64 in MIPS?
- Fewer loads and stores
- Slightly longer cycle time (muxes in RF have a little more work to do)
- Messed-up ISA. Why? (Ignore this for now.)
- Suppose (1) cycle time goes up by 25 (from 925 to 950) and (2) 35% of instructions are loads and stores. What % of loads and stores must be removed?
- Want 950(.65n + .35xn) < 925n
- True when x < .9248. This is the max we can keep.
- Must remove at least 7.5%
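A quick check of the register-doubling arithmetic, assuming the cycle time rises from 925 to 950 (925 + 25) and that the 65% of instructions that are not loads or stores are unaffected. The function name is made up.

```python
def max_mem_fraction_kept(old_cycle, new_cycle, mem_fraction):
    """Largest fraction x of loads/stores that may remain.

    Solves new_cycle * ((1 - mem_fraction) + mem_fraction * x) < old_cycle for x.
    """
    return (old_cycle / new_cycle - (1 - mem_fraction)) / mem_fraction

keep = max_mem_fraction_kept(old_cycle=925, new_cycle=950, mem_fraction=0.35)
print(f"keep at most {keep:.4f} of loads/stores; remove at least {1 - keep:.1%}")
```

So the extra registers pay off only if they eliminate roughly one in thirteen loads and stores.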
- Back to addi.
- How many bits do we want?
- What are our choices if we want a fixed-width instruction set?
- “Make the Common Case Fast”
- Talk about how design decisions are interconnected
MultiCycle CPU
- Notice that the Single Cycle CPU wastes a lot of time.
- Anything that’s not a load or store sits idle 20% of the time.
- What if we could “tighten” CPU?
- Start on HH Chapter 7, slide 40
- Microinstructions
- Microcode
- Facilitates CISC instructions.
- Avoids duplicate hardware
- Now CPI becomes important
- Weighted average
- MHz rating no longer primary descriptor of performance
- Performance (Can we make the numbers work?)
- Notice that other parts still sit around when not in use
- Performance
- What is CPI?
- Calculate CPI for instruction mix. (4.12)
- Timing of MultiCycle CPU (325)
- Is not faster than single cycle. (At least, not as presented in textbook.)
- What if clock time is optimal? (925/5 = 185)
185*4.12 = 762; speedup of 1.21
- What kind of CPI works with timing of 325?
- What time is needed for CPI of 4.12?
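The multi-cycle numbers above can be reproduced with a short sketch. The clock figures (925, 325) and the CPI of 4.12 come from the notes; the instruction mix and per-instruction cycle counts below are an assumption, chosen because they give a weighted average of 4.12, and may not be exactly the mix used in class.

```python
SINGLE_CYCLE_TIME = 925   # single-cycle clock period (from the notes)
MULTI_CYCLE_CLOCK = 325   # multi-cycle clock period (from the notes)

# ASSUMED mix: instruction class -> (fraction of instructions, cycles taken).
MIX = {
    "lw":     (0.25, 5),
    "sw":     (0.10, 4),
    "r_type": (0.52, 4),
    "beq":    (0.11, 3),
    "j":      (0.02, 3),
}

def average_cpi(mix):
    """CPI as the weighted average of cycles over the instruction mix."""
    return sum(fraction * cycles for fraction, cycles in mix.values())

cpi = average_cpi(MIX)
time_as_presented = cpi * MULTI_CYCLE_CLOCK           # average time per instruction
ideal_clock = SINGLE_CYCLE_TIME / 5                   # perfectly balanced 5 steps
time_ideal = cpi * ideal_clock
break_even_cpi = SINGLE_CYCLE_TIME / MULTI_CYCLE_CLOCK
break_even_clock = SINGLE_CYCLE_TIME / cpi

print(f"CPI = {cpi:.2f}")
print(f"time at clock 325 = {time_as_presented:.0f} (single cycle: {SINGLE_CYCLE_TIME})")
print(f"time at clock {ideal_clock:.0f} = {time_ideal:.0f}, speedup {SINGLE_CYCLE_TIME / time_ideal:.2f}")
print(f"CPI needed at clock 325: below {break_even_cpi:.2f}")
print(f"clock needed at CPI {cpi:.2f}: below {break_even_clock:.1f}")
```

Under these assumptions the multi-cycle design as presented is slower than single cycle; it would need either a CPI under about 2.85 or a clock under about 224.5 to break even.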
- Add a 'one-instruction' addi to multi cycle?
- How many instructions might this reduce?
- A matter of guessing. Say
- 2% of "regular" arithmetic
- All la instructions. (Say 2% of lw/sw)
- Is about 2% overall.
- Still not enough
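A rough check of "still not enough", assuming the new addi only removes 2% of instructions and leaves the CPI (4.12) and the clock (325) unchanged. That is a simplification, since the instruction mix would shift slightly; the variable names are made up.

```python
SINGLE_CYCLE_TIME = 925    # from the notes
MULTI_CYCLE_CLOCK = 325    # from the notes
CPI = 4.12                 # from the notes
INSTRUCTIONS_REMOVED = 0.02

# Average time per ORIGINAL instruction after dropping 2% of them.
time_after = (1 - INSTRUCTIONS_REMOVED) * CPI * MULTI_CYCLE_CLOCK

# Fraction of instructions that would have to disappear to match single cycle.
needed_reduction = 1 - SINGLE_CYCLE_TIME / (CPI * MULTI_CYCLE_CLOCK)

print(f"time after removing 2%: {time_after:.0f} (single cycle: {SINGLE_CYCLE_TIME})")
print(f"reduction needed to break even: {needed_reduction:.1%}")
```

A 2% cut barely moves the total; under these assumptions the instruction count would have to fall by roughly 31% to break even.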
- What else could we add?
- Intel-style R/M instructions