High Performance Computer Architecture 

CS60002  (4-0-0)

 

Spring Semester 2012-13

Class Schedule: Monday (11:30-12:30), Tuesday (9:30-11:30), Thursday (7:30-8:30)  

VENUE: Room 119, Ground Floor, CSE Department (from March 12 onwards)

 

 

 

Attendance/Absence Record

 

Course Contents:

Introduction: Review of basic computer architecture, quantitative techniques in computer design, measuring and reporting performance. CISC and RISC processors.

Pipelining: Basic concepts, instruction and arithmetic pipelines, data hazards, control hazards, and structural hazards, techniques for handling hazards. Exception handling. Pipeline optimization techniques. Compiler techniques for improving performance.

Hierarchical Memory Technology: Inclusion, coherence and locality properties. Cache memory organizations. Techniques for reducing cache misses. Virtual memory organization, mapping and management techniques, memory replacement policies.

Instruction-level parallelism: Basic concepts, techniques for increasing ILP. Superscalar, super-pipelined and VLIW processor architectures. Array and vector processors.

Multiprocessor Architecture: Taxonomy of parallel architectures. Centralized shared-memory architecture, synchronization, memory consistency, interconnection networks. Distributed shared-memory architecture.

Non-von Neumann Architectures: Dataflow computers, reduction computer architectures, systolic architectures.

References:

  1. J.L. Hennessy and D.A. Patterson, “Computer Architecture: A Quantitative Approach”, Third/Fourth/Fifth Edition, Morgan Kaufmann Publishers, 2006/2011.
  2. D.A. Patterson and J.L. Hennessy, “Computer Organization and Design: The Hardware/Software Interface”, Fourth Edition, Morgan Kaufmann Publishers, 2008.
  3. J.P. Shen and M.H. Lipasti, “Modern Processor Design”, Tata McGraw-Hill Publishing Company Ltd, 2005.
  4. M.J. Flynn, “Computer Architecture: Pipelined and Parallel Processor Design”, Narosa Publishing House, 1995.
  5. Kai Hwang, “Advanced Computer Architecture: Parallelism, Scalability, Programmability”, McGraw-Hill, 1993.

Course Coverage:

·         Quantitative principles of computer design, Amdahl’s law, CPU performance equation [1,2]

·         Measuring processor performance [1,2]

·         Basic principles of pipelining: reservation table, latency analysis, optimizing throughput [5]

·         MIPS64 instruction set architecture, instruction encoding, non-pipelined implementation of MIPS64 integer instruction set [1,2]

·         Pipelined implementation of MIPS64 integer instruction set, microoperations, handling of hazards, interrupt handling [1,2]

·         Dynamic branch prediction techniques [1,2]

·         Loop unrolling, superscalar and VLIW processors [1,2]
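
As a quick numeric illustration of the first coverage item above, here is a minimal Python sketch of Amdahl’s law and the CPU performance equation. The function names and the sample numbers are illustrative assumptions and are not taken from the course material.

```python
def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when only part of the execution time is improved.

    fraction_enhanced: fraction of the original execution time that can
                       use the enhancement (between 0 and 1).
    speedup_enhanced:  speedup of that fraction alone (greater than 0).
    """
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must lie in [0, 1]")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


def cpu_time(instruction_count, cpi, clock_rate_hz):
    """CPU time in seconds = instruction count x CPI x clock cycle time."""
    if clock_rate_hz <= 0:
        raise ValueError("clock_rate_hz must be positive")
    return instruction_count * cpi / clock_rate_hz


if __name__ == "__main__":
    # Hypothetical numbers: 40% of the run time is made 10 times faster.
    print(amdahl_speedup(0.4, 10.0))
    # Hypothetical numbers: 10^9 instructions, CPI of 1.5, 2 GHz clock.
    print(cpu_time(1_000_000_000, 1.5, 2e9))
```

Even an unbounded speedup of the enhanced fraction cannot push the overall speedup past 1 / (1 - fraction_enhanced), which is why the unenhanced part dominates quickly.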

 

Assignment (to be done in groups of 3):

Design and implement a subset of the MIPS64 processor in Verilog. The following features have to be implemented.

·         Only integer operations.

·         Only 64-bit operand load and store.

·         The following instructions are to be handled:

LD, SD, DADD, DSUB, DAND, DOR, DXOR, NOP

DADDI, DSUBI

DSLL, DSRL, DSRA, SLT, SLTI

BEQZ, BNEZ, J, JR

TRAP (to terminate a program)

·         Implement the register bank as a separate module, as an array of 64-bit registers.

·         Implement memory as a separate module:

Inside the module, define it as an array of bytes.

Separate instruction and data memory (no need to implement cache).

·         A 5-stage pipelined version of MIPS has to be implemented:

With data forwarding

Delayed branch for control hazard handling
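
The data forwarding requirement above can be prototyped in software before it is written in Verilog. The following Python sketch shows the usual textbook forwarding decision for one ALU operand in a 5-stage pipeline. All names here (`PipeRegs`, `forward_select`, the field names, and the `FROM_*` constants) are hypothetical and chosen only for illustration; they do not describe a required design.

```python
from collections import namedtuple

# Hypothetical view of the pipeline-register fields a forwarding unit reads.
PipeRegs = namedtuple(
    "PipeRegs",
    "ex_mem_reg_write ex_mem_rd mem_wb_reg_write mem_wb_rd",
)

# Possible sources for an ALU operand of the instruction in the EX stage.
FROM_REGFILE, FROM_EX_MEM, FROM_MEM_WB = 0, 1, 2


def forward_select(src_reg, regs):
    """Choose where the EX stage should take one source operand from."""
    # R0 is hardwired to zero in MIPS, so it is never forwarded.
    if src_reg == 0:
        return FROM_REGFILE
    # The instruction one stage ahead holds the most recent value.
    if regs.ex_mem_reg_write and regs.ex_mem_rd == src_reg:
        return FROM_EX_MEM
    # Otherwise fall back to the instruction two stages ahead.
    if regs.mem_wb_reg_write and regs.mem_wb_rd == src_reg:
        return FROM_MEM_WB
    # No pending write to this register: read the register bank.
    return FROM_REGFILE


if __name__ == "__main__":
    # Both older instructions write R5; the newer result (EX/MEM) wins.
    print(forward_select(5, PipeRegs(True, 5, True, 5)))
```

The EX/MEM check comes first so that the newest pending result takes priority. This sketch does not cover the load-use case: an LD followed immediately by a dependent instruction still needs a one-cycle stall, because the loaded value is not available until the end of the memory stage.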

Deliverables:

·         Verilog code for the MIPS processor.

·         Test benches for example codes, and simulation results.

·         Detailed design documentation (approx. 20 pages).

ADDITIONAL FEATURES IMPLEMENTED WILL FETCH BONUS MARKS.

EVALUATION:   March 25-26 (Interim);  Before end-sem (Final)

 

 

Important Instructions:

·         Attendance in the classes is mandatory. Students with poor attendance (< 60%) will be deregistered from the course from February 14, 2013 onwards. Current attendance status of all the students will be posted on the web site on a regular basis.

·         The break-up of marks will be as follows:

o   30: Mid-Semester Examinations

o   40: End-Semester Examinations

o   15: Class Tests + Attendance

o   15: Assignments (on processor design using Verilog)