Computer Architecture
Fall 2020/2021
Question 1:
Consider the following fragment of code written in three versions:
This program fragment adds the contents of the memory word at address 940
(0003) to the contents of the memory word at address 941 (0002) and stores the
result (0005) in the first location (940), the second location (941), or a new
memory address, according to the three versions above. The PC of the CPU
contains 300 as the starting address of the memory instructions.
(b) Complete, with drawings and explanations, the other four steps needed to
implement the above partial program for each of the three cases.
Ans:
Version 1
Version 2
Version 3
(c) In your opinion, which version is the best programming style? Why?
Ans:
Versions 1 and 2 are the better programming style, because they reuse memory
that is already allocated instead of allocating a new location.
Question 2:
Consider a program that consists of four fragments 1, 2, 3 and 4. The processor
could be subjected to multiple interrupts during the execution of this program.
(Note: not covered in the 2021/2022 midterm.)
(a) What is the purpose of having an interrupt mechanism during program
execution?
Ans:
(b) Draw and explain how the above program is going to be executed in the case
of a short-wait interrupt.
Ans:
(c) Draw and explain how the above program is going to be executed in the case
of a long-wait interrupt.
Ans:
(d) Which approach, (b) or (c), is better for processor performance? Why?
Ans:
Fall 2019/2020
Question 1
Consider the following fragment code:
Int A,B;
A=4;
B=5;
A=A+B;
This program fragment adds the contents of the memory word at address 950
(0004) to the contents of the memory word at address 951 (0005) and stores the
result (0009) in the first location (950). The PC of the CPU contains 300 as the
starting address of the memory instructions.
Assume that a partial list of CPU opcode of a hypothetical machine is as follows:
0001 Load AC from memory
0010 Store AC to memory
0101 Add AC from memory
(a) Describe and draw how the PC, AC and IR interact with the memory
locations over six steps (step 1 to step 6) of alternating fetch and execute.
Ans:
Step 1 (Fetch)
PC starts at 300.
IR is loaded with the instruction stored at address 300.
AC is empty.
Step 2 (Execute)
PC is incremented by 1, so the new PC is 301.
IR still holds the same instruction as in step 1.
AC is loaded with the contents of address 950 (0004).
Step 3 (Fetch)
PC still holds 301.
IR is loaded with the instruction stored at address 301.
AC still holds the same value as in step 2.
Step 4 (Execute)
PC is incremented and becomes 302.
IR still holds the same instruction as in step 3.
AC holds the result of adding the contents of 950 and 951, which is 0009.
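The six steps can be traced with a small simulation. This is only a sketch under stated assumptions: instructions are modelled as (opcode, address) pairs rather than encoded words, the program is assumed to be Load 950, Add 951, Store 950 placed at addresses 300 to 302, and the names `run`, `memory` and `trace` are hypothetical helpers, not part of the exam. Fetch rows show the PC before it is incremented and execute rows show it afterwards, matching the steps above.

```python
# Opcodes taken from the question's partial opcode list.
LOAD, STORE, ADD = 0b0001, 0b0010, 0b0101

def run(memory, pc, count):
    """Run `count` instructions; return one (step, phase, PC, IR, AC) row per step."""
    ac = None  # AC is empty before the first load
    rows = []
    for i in range(count):
        # Fetch: copy the instruction addressed by PC into IR.
        ir = memory[pc]
        rows.append((2 * i + 1, "fetch", pc, ir, ac))
        # PC is incremented once the fetch completes.
        pc += 1
        # Execute: decode IR and act on AC or memory.
        opcode, address = ir
        if opcode == LOAD:
            ac = memory[address]
        elif opcode == ADD:
            ac = ac + memory[address]
        elif opcode == STORE:
            memory[address] = ac
        else:
            raise ValueError(f"unknown opcode {opcode:04b}")
        rows.append((2 * i + 2, "execute", pc, ir, ac))
    return rows

# Assumed layout: instructions at 300-302, A at 950, B at 951.
memory = {
    300: (LOAD, 950),
    301: (ADD, 951),
    302: (STORE, 950),
    950: 4,
    951: 5,
}

trace = run(memory, pc=300, count=3)
for step, phase, pc, ir, ac in trace:
    print(f"step {step} ({phase}): PC={pc} IR=({ir[0]:04b}, {ir[1]}) AC={ac}")
print("memory[950] =", memory[950])
```

Changing the last instruction to `(STORE, 951)` or `(STORE, 952)` gives the B = A + B and C = A + B variants asked about in parts (b) and (c).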
(b) Repeat your solution in (a) with a slight change in the program fragment:
Int A, B;
A=5;
B=4;
B=A+B;
Ans:
The solution is the same as above, except that the values of A (950) and B (951)
are swapped and the final result is stored in B (address 951).
(c) Assume that the programmer changed the code such that C = A + B. How do
you think the implementation is going to be affected?
Ans:
The implementation is affected by allocating another memory location for C
(address 952) and storing the final result in it in the last execute step.
Question 2
Intel Research Center (IRC) tried to increase the clock speed and logic density by
solving two serious problems: power density (watts/cm²), with its corresponding
heat dissipation due to the increase in logic density and clock speed on the chip,
and the RC delay problem.
Consider the following four experiments, conducted in two groups by the R&D
group at Intel to solve the RC delay problem:
Group 1
               A        L       ρ             R     C       τ
Experiment #1  0.2 m²   0.5 m   0.008 ohm.m   ???   50 μF   ???
Group 2
               A        L       ρ             R     C       τ
Experiment #3  0.2 m²   0.5 m   0.008 ohm.m   ???   50 μF   ???
$R_2 = \dfrac{\rho_2 \times L_2}{A_2} = \dfrac{0.008 \times 0.3}{0.4} = 6 \times 10^{-3}\ \text{ohm}$
$R_4 = \dfrac{\rho_4 \times L_4}{A_4} = \dfrac{0.03 \times 0.5}{0.2} = 0.075\ \text{ohm}$
Question 3
Case #   No. of processors   f (parallel portion)   1-f (serial portion)
1        6                   0.35                   0.65
2        60                  0.55                   0.45
3        600                 0.98                   0.02
(b) Compute the speedup based on the given table of running a program on 6
processors; then increase the parallel portion f and re-execute it on 60 and 600
processors.
Ans:
Speedup: $\dfrac{1}{(1-f) + \dfrac{f}{N}}$, Bound: $\dfrac{1}{1-f}$
Case #   No. of processors (N)   f (parallel portion)   1-f (serial portion)   Speedup   Bound
1        6                       0.35                   0.65                   1.411     1.538
2        60                      0.55                   0.45                   2.177     2.22
3        600                     0.98                   0.02                   46.22     50
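The speedup and bound columns follow Amdahl's law, and a short script is a convenient way to check them. The sketch below assumes the standard form, speedup = 1 / ((1 - f) + f / N) and bound = 1 / (1 - f); the function names are hypothetical.

```python
def amdahl_speedup(f, n):
    """Speedup on n processors when a fraction f of the work is parallel."""
    return 1.0 / ((1.0 - f) + f / n)

def speedup_bound(f):
    """Upper bound on speedup as the number of processors grows without limit."""
    return 1.0 / (1.0 - f)

# (number of processors, parallel fraction) for the three cases in the table.
cases = [(6, 0.35), (60, 0.55), (600, 0.98)]

for number, (n, f) in enumerate(cases, start=1):
    print(f"case {number}: N={n} f={f} "
          f"speedup={amdahl_speedup(f, n):.3f} bound={speedup_bound(f):.3f}")
```

For case 3 the term f / N alone is about 1.633 x 10^-3; it still has to be added to (1 - f) = 0.02 and inverted, which gives a speedup of roughly 46.2, just under the bound of 50.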
Fall 2018/2019
Question 1:
Consider the following fragment of code:
Int A, B;
A=3;
B=2;
A=A+B;
This program fragment adds the contents of the memory word at address 940
(0003) to the contents of the memory word at address 941 (0002) and stores the
result (0005) in the first location (940). The PC of the CPU contains 300 as the
starting address of the memory instructions.
Assume CPU opcode of a hypothetical machine is as follows:
0001 Load AC from memory
0010 store AC to memory
0101 Add to AC from memory
a) Describe how the PC, AC and the IR interacted with the memory locations as
shown in step 1 and step 2 as fetch and execute steps.
Ans:
Step 1
PC starts at 300.
IR is loaded with the instruction stored at address 300.
AC is empty.
Step 2
PC is incremented by 1, so the new PC is 301.
IR still holds the same instruction as in step 1.
AC is loaded with the contents of address 940 (0003).
b) Complete, with drawings and explanations, the other four steps needed to
implement the above partial program.
Ans:
Question 2:
Intel Research Center (IRC) tried to increase the clock speed and logic density by
solving two serious problems: power density (watts/cm²), with its corresponding
heat dissipation due to the increase in logic density and clock speed on the chip,
and the RC delay problem.
Consider the following data related to three experiments conducted by the R&D
group at Intel to solve the second problem:
               A        L       ρ             R     C         τ
Experiment #1  0.2 m²   0.5 m   0.008 ohm.m   ???   50 µF     ???
Experiment #2  0.4 m²   0.3 m   0.01 ohm.m    ???   C2 = C1   ???
Experiment #3  1.6 m²   0.1 m   0.03 ohm.m    ???   C3 = C1   ???
$R_3 = \dfrac{\rho_3 \times L_3}{A_3} = \dfrac{0.03 \times 0.1}{1.6} = 1.875 \times 10^{-3}\ \text{ohm}$
c) Based on your computations in (a) and (b), which experiment (1, 2 or 3)
would you follow to achieve higher performance? Why?
Ans:
To achieve higher performance, it is best to use experiment 3, because it has
the lowest delay of the three experiments.
d) Based on your computations in (a) and (b), which experiment (1, 2 or 3)
would you follow to achieve lower cost? Why?
Ans:
To achieve lower cost, it is best to use experiment 1, because it has a smaller
area than experiments 2 and 3. The cost might also increase depending on the
chosen wiring material.
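The comparison in (c) and (d) can be checked numerically. The sketch below assumes R = ρL/A and a delay proportional to the product RC; since all three experiments share the same capacitance, the ranking does not depend on the constant of proportionality. The helper name `wire_resistance` and the dictionary layout are hypothetical.

```python
def wire_resistance(rho, length, area):
    """Resistance of a wire: R = rho * L / A (ohm)."""
    return rho * length / area

C1 = 50e-6  # 50 µF, shared by all three experiments (C2 = C3 = C1)

# experiment number -> (area in m^2, length in m, rho in ohm.m), from the table
experiments = {
    1: (0.2, 0.5, 0.008),
    2: (0.4, 0.3, 0.01),
    3: (1.6, 0.1, 0.03),
}

results = {}
for number, (area, length, rho) in experiments.items():
    r = wire_resistance(rho, length, area)
    results[number] = (r, r * C1)  # (resistance, RC product)
    print(f"experiment {number}: R = {r:.6f} ohm, RC = {r * C1:.3e} s")

# Lowest RC product means lowest delay; smallest area is used as the cost proxy.
fastest = min(results, key=lambda k: results[k][1])
smallest_area = min(experiments, key=lambda k: experiments[k][0])
print("lowest delay: experiment", fastest)
print("smallest area: experiment", smallest_area)
```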
Fall 2017/2018
Question 1:
Consider the following fragment of code:
Int A, B;
A=2;
B=3;
A=A+B;
This program fragment adds the contents of the memory word at address 200
(0002) to the contents of the memory word at address 201 (0003) and stores the
result (0005) in the first location (200). The PC of the CPU contains 500 as the
starting address of the memory instructions.
Assume CPU opcode of a hypothetical machine is as follows:
0001 Load AC from memory
0010 store AC to memory
0101 Add to AC from memory
a) Name and describe the functionality of each of the basic registers in the CPU
and the memory components which could interact to implement the above partial
program.
Ans:
Basic registers in the CPU
1. Instruction register (IR): contains the 8-bit opcode of the instruction
being executed.
2. Program counter (PC): contains the address of the next instruction pair
to be fetched from memory.
3. Accumulator (AC) and multiplier quotient (MQ): temporarily hold
operands and results of ALU operations.
Memory module
• Consists of a set of locations, defined by sequentially numbered
addresses.
• Each location can contain a binary number that could be interpreted
as either an instruction or data.
b) Draw and explain how this partial program is going to be processed; including
three fetch and three execute cycles.
Ans:
Question 2:
Intel Research Center (IRC) tried to solve the problem of RC delay, the
speed at which electrons can flow on a chip between transistors. This speed
is limited by the resistance and capacitance of the metal wires connecting
them.
Consider the following data related to two experiments conducted by the
R&D group at Intel:
               R          L         ρ           A         C         τ
Experiment #1  6.66 ohm   0.5 m     ???         0.06 m²   50 µF     ???
Experiment #2  ???        L2 = L1   ρ2 = 2 ρ1   A2 = A1   C2 = C1   ???
$R_2 = \dfrac{\rho_2 \times L_2}{A_2} = \dfrac{1.5984 \times 0.5}{0.06} = 13.32\ \text{ohm}$
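The value ρ2 = 1.5984 ohm.m does not appear in the table. One way to recover it, assuming R = ρL/A as in the other questions, is to solve experiment #1 for ρ1 and then apply ρ2 = 2 ρ1:

```latex
\begin{align*}
\rho_1 &= \frac{R_1 \times A_1}{L_1} = \frac{6.66 \times 0.06}{0.5} = 0.7992\ \text{ohm.m} \\
\rho_2 &= 2\,\rho_1 = 1.5984\ \text{ohm.m} \\
R_2 &= \frac{\rho_2 \times L_2}{A_2} = \frac{1.5984 \times 0.5}{0.06} = 13.32\ \text{ohm} = 2\,R_1
\end{align*}
```

Because L, A and C are unchanged between the two experiments, doubling ρ doubles R, and with it the RC product.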
Question 3:
Consider a program that consists of four fragments 1, 2, 3 and 4. The processor
could be subjected to multiple interrupts during the execution of this program.
(Note: not covered in the 2021/2022 midterm.)
a) What is the purpose of having an interrupt mechanism during program
execution?
Ans:
b) Draw and explain how the above program is going to be executed in the case
of a short-wait interrupt.
Ans:
c) Draw and explain how the above program is going to be executed in the case
of a long-wait interrupt.
Ans:
d) Which approach, (b) or (c), is better for processor performance? Why?
Ans:
Fall 2016/2017
Question 1:
Intel Research Center (IRC) tried to solve the problem of RC delay, the
speed at which electrons can flow on a chip between transistors. This speed
is limited by the resistance and capacitance of the metal wires connecting
them.
a) Assume that the specific resistance (ρ1) is 0.8 ohm.m, the length of the
wire (L1) is 0.5 m, the cross-section area of the wire (A1) is 0.06 m², and
the wiring capacitance (C1) is 20 µF.
Compute the delay τ1.
Ans:
Given:
• ρ1 = 0.8 ohm.m
• L1 = 0.5 m
• A1 = 0.06 m²
• C1 = 20 × 10⁻⁶ F
• R1 = ???
• τ1 = ???
Sol:
$\tau_1 = 0.6\,R_1 C_1$
$R_1 = \dfrac{\rho_1 \times L_1}{A_1} = \dfrac{0.8 \times 0.5}{0.06} = 6.66\ \text{ohm}$
b) Suppose that Intel's researchers tested another wiring material with (ρ2 = ρ1)
with half the wiring length (L2 = 0.5 L1) and half cross section area
(A2 = 0.5 A1). Compute 𝜏2.
Ans:
Given:
• ρ2 = ρ1 = 0.8 ohm.m
• L2 = 0.5 × L1 = 0.5 × 0.5 = 0.25 m
• A2 = 0.5 × A1 = 0.5 × 0.06 = 0.03 m²
• C2 = 2 × C1 = 2 × 20 × 10⁻⁶ = 4 × 10⁻⁵ F
• R2 = ???
• τ2 = ???
c) Given that C2 = 2 C1, compute τ2.
Ans:
$\tau_2 = 0.6\,R_2 C_2$
$R_2 = \dfrac{\rho_2 \times L_2}{A_2} = \dfrac{0.8 \times 0.25}{0.03} = 6.66\ \text{ohm}$
d) Based on your computations in (a), (b) and (c), compare the achieved
performance and the cost of production in the previous two experiments.
Ans:
In case 1 the delay is lower, so the performance is higher; the cost may also
increase depending on the chosen wiring material.
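The two delays can be evaluated directly from the given values. This sketch takes the delay formula τ = 0.6 RC exactly as written in the solution above and R = ρL/A; the function name `delay` and its parameter `k` are hypothetical.

```python
def delay(rho, length, area, capacitance, k=0.6):
    """RC delay tau = k * R * C with R = rho * L / A; k = 0.6 as used in these notes."""
    resistance = rho * length / area
    return k * resistance * capacitance

# Case 1: rho1 = 0.8 ohm.m, L1 = 0.5 m, A1 = 0.06 m^2, C1 = 20 µF
tau1 = delay(0.8, 0.5, 0.06, 20e-6)
# Case 2: same rho, half the length, half the area, C2 = 2 * C1
tau2 = delay(0.8, 0.25, 0.03, 40e-6)

print(f"tau1 = {tau1:.2e} s")
print(f"tau2 = {tau2:.2e} s")
print("case 1 has the lower delay" if tau1 < tau2 else "case 2 has the lower delay")
```

Halving both the length and the area leaves R unchanged, so the doubled capacitance is the only reason τ2 comes out at twice τ1.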
Question 2:
Assume the programmer wrote the following fragment of code:
Int A, B, C;
A=4;
B=3;
C=A+B;
i.e. this program fragment adds the contents of the memory word at address 650
(0004) to the contents of the memory word at address 651 (0003) and stores the
result in the first location (650). The PC of the CPU contains 400 as the starting
address of the memory instructions.
Assume that a partial list of CPU opcode of a hypothetical machine is as follows:
(Note: the I/O opcodes in this list are not needed for the solution.)
0001 Load AC from memory
0010 store AC to memory
0101 Add to AC from memory
0011 load AC from I/O
0111 Store AC to I/O
1000 Subtract AC from memory
a) What are the basic registers in the CPU and the memory components which
could interact to implement this partial program?
Ans:
Basic registers in the CPU
1. Instruction register (IR): contains the 8-bit opcode of the instruction
being executed.
2. Program counter (PC): contains the address of the next instruction pair to
be fetched from memory.
3. Accumulator (AC) and multiplier quotient (MQ): temporarily hold
operands and results of ALU operations.
Memory module
• Consists of a set of locations, defined by sequentially numbered addresses.
• Each location can contain a binary number that could be interpreted as either
an instruction or data.
b) Draw and explain how this partial program is going to be processed, including
three fetch and three execute cycles.
Ans:
Question 3
Nested interrupt processing, as a method to improve processing efficiency,
follows either a priority policy or a FIFO policy. Consider the case of having
interrupts from the printer interrupt service routine (P2), the communication
interrupt service routine (P5) and finally the disk interrupt service routine
(P4), as shown in fig. 1.
(c) Explain the third case of having interrupts with long I/O waits and clarify its
difference from the second case of having interrupts with short I/O waits.
Ans: