Midterm Examination
Total 100 points
1. [20 points] MIPS
(1)There are 3 basic formats in MIPS. Fill in the following blanks.
(3pts.)
R-format:
Opcode Rs Rt Rd Shamt Funct
____bits ____bits ____bits ____bits ____bits ____bits
I-format:
Opcode Rs Rt Address/Immediate
____bits ____bits ____bits ____bits
J-format:
Opcode Address
____bits ____bits
(2)Please describe five addressing modes of MIPS and then show one instruction ofeach. (6 pts.)
(3)The following is a MIPS function. It expects one non-negative integer
argument and returns one integer result.
function:
addi $t0, $0, 1
addi $t1, $0, 1
loop:
blt $a0, 2, exit
add $t2, $t0, $t1
add $t0, $t1, $0
add $t1, $t2, $0
add $a0, $a0, -1
j loop
exit:
add $v0, $t1, $0
jr $ra
a. Translate the function intoC. Do not use "goto". (4 pts.)
b. What will this functionreturn if it is called with an argument
($a0) of 4?
What foes this function compute?(4 pts.)
(4)For the pseudoinstruction "ble $t4, $t3, Loop" (branch less thanequal.)
Please produce a minimal sequence of MIPS instructions (beq, bne or slt)
to accomplish the same thing. You may need to use $t5 if necessary.
(3 pts.)
2. [8 points] IEEE754 (in the singleprecision)
(1)Write the IEEE 754 representation in the single precision for 14/3.
(4 pts.)
(2)What decimal number is represented by this word in IEEE 754? (4 pts.)
11000001111100000000000000000000
3. [14 points] ALU
(1)We will use the following 1-bit ALU that supports "add","complement
subtract", "and", "or" and "slt" toconstruct a 16-bit ALU. Overflow
detection has to be applied too. Describe how to use this 1-bit ALU
and how to modify MSB and LSB. (8 pts.)
(圖:1-bit ALU)
(2)Carry Look Ahead Adder
The figure shows a 2-level CLA adder. (ALUi is a 4-bit CLA adder.)
(a) please determine the value of CarryOut15 (C4) of adding a and b:
a: 0 0 0 1 1 1 1 0 0 0 1 1 0 0 1 1
b: 1 1 1 1 1 1 0 1 1 1 1 0 1 0 1 1
Please show your answer by determining (pt, gi) and (Pi, Gi) step by
step. (8 pts.)
4. [28 points] Datapath
Youare given four sheets of datapath figures of the single-cycle
implementation. Please highlight the active parts in the datapath for
(1)"addi $t1, $t2, 2" on Figure 1 (4 pts.)
(2)"lw $t1, 2($t2)" on Figure 2 (4 pts.)
(3)"beq $t1, $t2, Loop" on Figure 3 (4 pts.)
(4)"slt $t1, $t2, $t3" on Figure 4 (4 pts.)
(發四張 Single-Cycle Datapath 的圖)
5. [8 points] Control
How will you plan to design the main control of "the single cycle
implementation" of MIPS? Given the following table of a simplifiedMIPS,
please show your design.
Control Signal name R-format lw sw beq
Op5 0 1 1 0
Op4 0 0 0 0
Inputs Op3 0 0 1 0
Op2 0 0 0 1
Op1 0 1 1 0
Op0 0 1 1 0
RegDst 1 0 X X
ALUSrc 0 1 1 0
MemtoReg 0 1 X X
RegWrite 1 1 0 0
Outputs MemRead 0 1 0 0
MemWrite 0 0 1 0
Branch 0 0 0 1
ALUOp1 1 0 0 0
ALUOp0 0 0 0 1
6. [12 points] Multiplier/Divisor
(1) Please describe which multiplier in the following two datapath
(left vs. right) is better and why. Then use the better multiplier
to calculate 6x7 (4 bits for each number and 8 bits for the result)
step by step. (6 pts.)
(左圖: First Version) (右圖: Refined Version)
(2) Please use the Booth algorithm to calculate -5*7. (4 pts.)
(3) Please show the process of calculating 5(0101)/2(0020) step bystep with the following restoringdivision. (4 pts.)
(圖:Datapath of Refined Version of Division)
7. [20 points] Performance:
(1) Amdahl'Law
1. Given a computational task inwhich at most 20% can be parallelized,
what is the maximal speedup you can obtain according to Amdahl's law?
(3 pts.)
2. Suppose you want to achieve aspeedup of 80 with a multiprocessor that
has 100 processors. If you have a sequential program, X, what fraction
of X can still be sequentially executed in order to achieve this goal?
(3 pts.)
(2) A processor has a clock rate of 500MHz, and the followingmeasurements
have been made by using a test program:
Instruction class CPI Frequency
A 2 35%
B 3 25%
C 4 20%
D 5 5%
E 6 15%
1. What is the MIPS of the processor? (3 pts.)
2. What is the CPI of theprocessor? (3 pts.)
The compiler of the processor is then improved. The instruction
improvements from this enhanced compiler have been estimated as follows:
Instruction class Percentageof instructions executed vs. old compiler
A 90%
B 80%
C 85%
D 100%
E 80%
3. What is the new CPI of theprocessor after the test program
is processed by the enhanced compiler? (4 pts.)
4. How many times faster can beachieved by using the enhanced compiler?
(4 pts.)
MidtermExamination
Total 100 points
1. [10 points] MIPS
(1) Thereare 3 basic formats in MIPS. Fill in the following blanks. (3pts.)
R-format:
Opcode | Rs | Rt | Rd | Shamt | Funct |
_____bits | _____bits | _____bits | _____bits | _____bits | _____bits |
I-format:
Opcode | Rs | Rt | Address/Immediate |
_____bits | _____bits | _____bits | _____bits |
J-format:
Opcode | Address |
_____bits | _____bits |
(2) Forthe pesudoinstruction “ble $t5, $t3, Loop” (branch less than equal.)
Please produce a minimal sequence of MIPS instructions (beq, bne or slt) toaccomplish the same thing. You may need to use $t4 if necessary. (4 pts.)
(3) Followingthe MIPS procedure calling convention, which of the following should be storedby the caller explicitly before a procedure call so that the callee can usethem as it wishes?(3 pts.)
a. Savedregisters: $s0-$s7
b. Temporaryregister: $t0-$t9
c. Returnaddress: $ra
d. Returnvalue: $v0-$v1
2. [12 points] True or False: (Awrong answer costs one more pt. off)
(1) Thereare more addressing modes in CISC than in RISC
(2) Vaxis RISC.
(3) Thegeometric mean in used to calculate the average SPECratio.
(4) Upto 232 words are addressable in MIPS.
(5) Theproblem of MIPS is that it is program dependent and can even be affected by thecompiler.
(6) Ifwe write a 32-bit (4-byte) data word, 0x12345678, to the address 0x1000 in abig-endian system, it should be
0x1000 0x0001 0x1002 0x1003
3. [10 points] IEEE754 (inthe single precision)
(1) Writethe IEEE 754 representation in the single precision for 13/6. (5 pts.)
(2) Whatdecimal number is represented by this word in IEEE 754? (5 pts.)
01000001010100000000000000000000
4. [14 points] ALU
(1) Pleaseuse the following 1-bit ALU that supports “add”, “complement subtract”, “and”, “or”and “slt” ALU to form a 3-bit ALU. (6 pts.)
(2) CarryLook Ahead Adder
The right figure shows a 2-level CLA add. (ALU is a 4-bit CLA adder.)
(a) Pleasedetermine the value of CarryOut15 (C4) of adding a and b:
a: 0101 0110 0011 1111
b: 1111 1101 1110 1011
(b) Pleaseshow your answer by determining (pi, gi) and (Pi, Gi) step by step. (6pts.)
(3) CarrySave Adder
In what situation will the carry save adder be used and perform well? (2 pts.)
5. [16 points] Datapath
You are given four sheets ofdatapath figures of the single-cycle implementation. Please highlight the active partsin the datapath for 123
(1) “addi$t1, $t2, 5” on Figure 1 (4 pts.)
(2) “lw$t1, 3($t2)” on Figure 2 (4 pts.)
(3) “beq$t1, $t2, Loop” on Figure 3 (4 pts.)
(4) “slt$t1, $t2, $t3” on Figure 4 (4pts.)
6. [8 points] Control
Howwill you plan to design the main control of “the single cycle” implementation”of MIPS? Given the following table of a simplified MIPS, please show yourdesign.
7. [15 points] Multiplier/Divisor
(1) Pleasecalculate 5x7 step by the following multiplier (4 bits for each number and 8bits for the result). Please do not switch multiplicand and multiplier. (5pts.)
(2) Pleaseuse the Booth algorithm to calculate -5x5 (4 bits for each number and 8 bitsfor the result). Please show the steps and follow a similar structure of thefigure above. Again, please do not switch the multiplier and multiplicand. (5pts.)
(3) Pleaseshow the process of calculating 5(0101)/3(0011) steps by step by using thefollowing restoring division. (5 pts.)
8. [15 points] Performance:
(1) Whichchange is more effective on a certain machine: speeding up 10 times thefloating point square root operation only, which takes up 20% of executiontime, or speeding up 2 times all other floating point operations, which take up50% of total execution time? Assume that the cost of accomplishment either changeis the same, and the two changes are mutually exclusive. (5 pts.)
(2) Assumewe have two computer named “Orange” and “Tomato”. “Orange” computer runs a 1.2GHz clock while “Tomato” computer runs a 1.4 GHz clock. They use the same ISA,which has three classed of instructions, A, B, and C with CPI equal to 1, 2 and3, respectively. Assume computer #1 is designed for (or sold/bundled with) “Orange”computers and compiler#2 is designed for “Tomato” computers. If we use the Cprogramming language to implement a computer program, which is then compiled tomachine instructions by the two compilers. We obtain the following data:
Machine instruction executed (in billions) |
Code from compile#1 on “Orange” | Code from compiler#2 on “Tomato” |
A | B | C | A | B | C |
5 | 2 | 2 | 9 | 1 | 1 |
Pleasecalculate (a) MIPS and (b) execution time to determine which machine is faster.(10 pts.)
Computer
Organization Midterm Examination
total:100 points
1.
(10%) True or false(一題兩分,答錯扣三分,False要寫理由)
(1) Microprogramming is usually used in the control design of CISC
machines.
(2)For a little-endian machine, if we write a 32-bit(4-byte) data
word, 0x87655678, to the address
0x1000, the byte at 0x1000 is 01111000.
(3)The stack in MIPS32 grows from lower address to higher address.
(4)In the history of computers' development, the size of memory and
number of registers in a computer grow
according to the Moore's law.
(5)In MIPS32, by using J-format, the program can jump to
anywhere(any 32-bit address)in the
memory
2.
(12%)MIPS
(1)Please describe five addressing modes of MIPS and then show
one instruction of each(6%)
(2)For the pseudoinstruction "ble $t4, $t3, Loop"(branch
less than equal), please use "beq/bne/slt"to accomplish it. Try as
few instructions as possible. Please use $t5 if necessary(6%).
3.
(12%)IEEE754 (in the single
precision)
(1)Write the
IEEE 754 representation in the single precision for 8/3(4%)
(2)What decimal
number is represented by this word in IEEE 754?(4%) 11000001010100000000000000000000
(3)What are the
largest Denormalized number and the smallest normalized number?(4%)
4.
(28%)Datapath
(1)You are given
4 sheets of datapath figures of single-cycle implementations. Please highlight the active part in the
datapath for
(a)addi $t1, $t2, 2 on
figure1
(b)lw $t1,2($t2) on
figure2 老師發四張datapath的圖,描圖
(c)beq $t1, $t2, Loop on figure3
each 4 pts.
(d)slt $t1,$t2,$t3 on
figure4
(2)What are the
CPI values of a/b/c/d?(4%)
(3)Roughly
describe the control design methodology of MainControl and ALUControl in the datapath figure(8%)
5.
(16%)ALU
(1)We will use the
following 1-bit ALU that supports "add","or",and
"slt"to construct a 16-bit
ALU that supports these operations.Overflow detection has to be applied too.
Describe how to use this 1-bit ALU and how to modify MSB and LSB (8%)
圖片
(2)Carry look
ahead adder. The figure shows a 2-level CLA adder.(ALUi is a 4-bit CLA adder.)
Please determine the value of
CarryOut15(C4) of adding a and b:
a: 0001 1010 0011 0011
b: 1110 1101 1110 1011
Please show your answer by determining(pi,gi)and(Pi,Gi)step by step(8%)
6.
(12%)Multiplier/Divisor
(1)Please
describe which multiplier in the following two datapath(左vs右)is
better and why.Then use the better multiplier to calculate 4x5(4 bits
for each number and 8 bits for the result)step by step(6%)
(就是First multiplication
和refined version那兩個圖拿來比較)
(2)Please show
the processor of calculating 5/2 step by step with the following
7.
(10%)Performance
Two different
compilers, A and B, are being tested for an 1GHz machine with three different
classes of instructions:
(I) branch/Jump
(II) arithmetic
and
(III) memory-access
instructions
, which require
one, two and three cycles per instructions, respectively. Both compilers are
used to produce code for a large piece of software. The compiler A's code uses
60 million branch/jump instructions, 20million arithmetic instructions and 0
million memory-access instructions. The compiler B's code uses 100 million
branch/jump instructions, 10 million arithmetic instructions and 10 million
memory-access instructions.
Which sequence will
be faster according to MIPS? Which sequence will be faster according to
execution time?(You have to calculate MIPS and execution time.)
文章定位: