联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-23:00
  • 微信:codinghelp

您当前位置:首页 >> C/C++编程C/C++编程

日期:2020-03-11 10:31

CIS341, Spring 2020

Assembly Lab: Programming a Y86 Processor

Assigned: Feb. 18st, Due: Mar. 13th, 8:59PM

1 Introduction

In this lab, you will learn about the Y86 instruction set architecture and how an assembly program executes

on a sequential Y86 processor. You are allowed to make any semantics-preserving transformation to the

program, and optionally make enhancements to the sequential processor. When you have completed the lab,

you will have a keen appreciation for the interactions between code and hardware that affect the performance

of your programs.

You will be hand-coding Y86 assembly codes with the goal of either making the code size small, or making

the code run fast. You may optionally extend the Y86 instruction set architecture for the SEQ processor to

speed up your assembly program.

2 Logistics

You will work on this lab alone. Any clarifications and revisions to the assignment will be posted on

blackboard.

3 Handout Instructions

Start by copying asmlab-handout.tar to your home directory on lcs-vc-cis341.syr.edu machine.

Then untar the tarball file to unpack the assignment in the directory.

unix> cp /home/cis341-proj/asmlab-handout.tar .

unix> tar -xvf asmlab-handout.tar

unix> cd asmlab-handout

In this directory, you will find subdirectories misc, ptest, seq, src, and y86-code, and two files

Makefile and README. You may build all the available tools at this working directory.

1

unix> make clean; make

Read the README files in all directories to familiarize yourself with the assignment.

4 Y86 Programming

You will be working in directory src in this part. Your task is to write and simulate the following two Y86

programs: selectsort.ys and quicksort.ys. The required behavior of these programs is defined

by C programs in the src directory. You can test your programs by first assembling them with the program

YAS and then running them with the instruction set simulator YIS.

In all of your Y86-64 functions, you should follow the x86-64 conventions for passing function arguments,

using registers, and using the stack. This includes saving and restoring any callee-save registers that you

use. The callee-save registers are %rbx, %rbp, %r12, %r13, and %r14. %rsp is a special case where it

should be restored to the value before the call via call/ret and push/pop mechanism.

selectsort.ys: selection sort algorithm

Write a Y86-64 program selectsort.ys that performs a selection sort on an array of data. Your program

should consist of some code that sets up the stack structure, invokes a function, and then halts. In this case,

the function should be Y86 code for a function (selectsort) that is functionally equivalent to the C

selectsort function in Figure 1. The directory includes a skeleton file for you to implement everything

except for the selection sort algorithm itself.

You may choose to implement selectsort without calling swap: in such case, delete swap.

quicksort.ys: quick sort algorithm

Write a Y86-64 program quicksort.ys that performs a quick sort on an array of data. Your program

should consist of some code that sets up the stack structure, invokes a function, and then halts. In this

case, the function should be Y86 code for a function (quicksort) that is functionally equivalent to the C

quicksort function in Figure 2. The directory includes a skeleton file for you to implement everything

except for the quick sort algorithm itself.

quicksort must recursively call itself, and must call partition. However, you may choose to implement

quicksort without calling swap: in such case, delete swap.

Getting started

You should first examine the C code to understand what the algorithm is doing line by line. Then try

compiling to x86 assembly with various optmization levels to understand how the compiler is generating

the assembly code.

unix> gcc -Og -S <C file>.c # optimize debugging experience

2

1 #include <stdint.h> // includes definition for int64_t

2

3 /* swap - Swaps the data pointed by the two pointer arguments */

4 void swap(int64_t *srcA, int64_t *srcB) {

5 int64_t valA, valB;

6 valA = *srcA;

7 valB = *srcB;

8 *srcA = valB;

9 *srcB = valA;

10 }

11

12 /* selectsort - Sorts an array of data using selection sort algorithm */

13 void selectsort(int64_t *arr, int64_t size) {

14 int64_t i, j, min_idx;

15 for (i=0; i<size; i+=1) {

16 min_idx = i;

17 for (j=i+1; j<size; j+=1) {

18 if (arr[j]<arr[min_idx])

19 min_idx = j;

20 }

21 if (min_idx!=i) {

22 swap(&arr[i], &arr[min_idx]);

23 }

24 }

25 }

Figure 1: C versions of selection sort. See src/selectsort.c

3

1 #include <stdint.h> // includes definition for int64_t

2

3 /* swap - Swaps the data pointed by the two pointer arguments */

4 void swap(int64_t *srcA, int64_t *srcB) {

5 int64_t valA, valB;

6 valA = *srcA;

7 valB = *srcB;

8 *srcA = valB;

9 *srcB = valA;

10 }

11

12 /* partition - Partitions the array into two:

13 with one side smaller than a pivot value, and the other, larger.

14 This function returns the index of the pivot. */

15 int64_t partition(int64_t *arr, int64_t lo, int64_t hi) {

16 int64_t i, j, p;

17 p = arr[hi];

18 i = lo;

19 for (j=lo; j<hi; j+=1) {

20 if (arr[j] < p) {

21 swap(&arr[i], &arr[j]);

22 i += 1;

23 }

24 }

25 swap(&arr[i], &arr[hi]);

16 return i;

17 }

18

19 /* quicksort - Sorts an array of data using quick sort algorithm */

20 void quicksort(int64_t *arr, int64_t lo, int64_t hi) {

21 int64_t p;

22 if (lo<hi) {

23 p = partition(arr, lo, hi);

24 quicksort(arr, lo, p-1);

25 quicksort(arr, p+1, hi);

26 }

27 }

Figure 2: C versions of quick sort. See src/quicksort.c

4

Once you understand how x86 code is generated, try transliterating to Y86 code. You may also want to try

other compiler options and observe how different the assembly code can become.

unix> gcc -Os -S <C file>.c # optimize for size

unix> gcc -O0 -S <C file>.c # optimization level 0 (default)

unix> gcc -O1 -S <C file>.c # optimization level 1

unix> gcc -O2 -S <C file>.c # optimization level 2

unix> gcc -O3 -S <C file>.c # optimization level 3

5 Optional Architectural Enhancement

You will be working in directory seq in this part.

You may optionally extend the SEQ processor to support the iaddq instruction, described in Homework

problems 4.51 and 4.52. To add this instruction, you will modify the file seq-full.hcl, which implements

the version of SEQ described in the CS:APP3e textbook. In addition, it contains declarations of some

constants that you will need for your solution.

Building and Testing Your Solution

Once you have finished modifying the seq-full.hcl file, then you will need to build a new instance of

the SEQ simulator (ssim) based on this HCL file, and then test it:

• Building a new simulator. You can use make to build a new SEQ simulator:

unix> make VERSION=full

This builds a version of ssim that uses the control logic you specified in seq-full.hcl. If you

want to revert back to the standard textbook version (without your modification in seq-full):

unix> make VERSION=std

• Testing your solution on a simple Y86-64 program. For your initial testing, we recommend running

simple programs such as asumi.yo (testing iaddq) in TTY mode, comparing the results against

the ISA simulation:

unix> ./ssim -t ../y86-code/asumi.yo

If the ISA test fails, then you should debug your implementation by single-stepping the simulator in

GUI mode:

unix> ./ssim -g ../y86-code/asumi.yo

• Retesting your solution using the benchmark programs. Once your simulator is able to correctly

execute small programs, then you can automatically test it on the Y86-64 benchmark programs in

../y86-code:

5

unix> (cd ../y86-code; make testssim)

This will run ssim on the benchmark programs and check for correctness by comparing the resulting

processor state with the state from a high-level ISA simulation. Note that none of these programs test

the added instructions. You are simply making sure that your solution did not inject errors for the

original instructions. See file ../y86-code/README file for more details.

• Performing regression tests. Once you can execute the benchmark programs correctly, then you

should run the extensive set of regression tests in ../ptest. To test everything except iaddq

and leave:

unix> (cd ../ptest; make SIM=../seq/ssim)

To test your implementation of iaddq:

unix> (cd ../ptest; make SIM=../seq/ssim TFLAGS=-i)

For more information on the SEQ simulator refer to the handout CS:APP3e Guide to Y86-64 Processor

Simulators (simguide.pdf).

6 Evaluation

This lab is worth 200 points: 100 points for selectsort.ys and 100 points for quicksort.ys. They

will be evaluated on correctness as well as code size or performance. If no seq-full.hcl is submitted,

we will evaluate using the unmodified seq-full.hcl. If seq-full.hcl is submitted, we will use that

SEQ implementation for the evaluation.

6.1 Correctness

Correctness accounts for 50 points each for selectsort.ys and quicksort.ys. The simulation of

the programs must end with the HLT (halt) status.

Both algorithms perform in-place updates. After the sort, the array must contain elements in non-decreasing

order. An array size of 32 will be used with different initialized values.

In addition, the Y86 assembly code must follow the correct register saving convention. By the end of the

program, all callee-saved registers must be restored to the values to before executing the call to the sort. The

callee-save registers are %rbx, %rbp, %r12, %r13, and %r14. %rsp is a special case where it should be

restored to the value before the call via call/ret and push/pop mechanism.

6.2 Code size for selectsort.ys

The code size for selectsort.ys accounts for 50 points. You will only receive these points if your

implementation of selectsort is correct.

6

The assembled program (.yo) provides the program size. Part of the program is allocated for data (the

array to sort) and stack (for function calls). These would not count as part of the code size. For example,

the provided skeleton files selectsort.ys and quicksort.ys would take x52 or 82B and x5d or

93B, respectively.

Your code size score depends on this assembled code size:

Scorecodesize =





0 , size > 300B

1/2 · (300 − c), 200B ≤ size ≤ 300B

50 , size < 200B

6.3 Performance for quicksort.ys

The dynamic instruction count (number of instruction executed) for quicksort.ys accounts for 50

points. You will only receive these points if your implementation of quicksort is correct.

The dynamic instructoun count may vary depending on the initialized value of the array and the size of the

array to sort. We will evaluate using 32 elements, and compute the average of multiple runs to determine

the score:

Scoreperf =

0 , inst.count > 6000

1/40 · (6000 − c), 4000 ≤ inst.count ≤ 6000

50 , inst.count < 4000

7 Handin Instructions

You will be handing in two assembly files, and one optional HCL file.

• selectsort.ys: assembly program that performs a select sort algorithm

• quicksort.ys: assembly program that performs a quick sort algorithm

• seq-full.hcl: optional processor implementation

In order to turn-in your work, place your files in the turnin directory, inside asmlab-handout.

unix> cd ˜/asmlab-handout

unix> cp src/selectsort.ys turnin/

unix> cp src/quicksort.ys turnin/

unix> cp seq/seq-full.hcl turnin/ # optional

We will copy out your completed files 5 days after the deadline. The modified time of your files will be

used to determine your completion time. Thus, if you over-write your files by performing another copy after

the deadline, it will count as a late turn-in, even if you have turn-in before on time. Please be aware of this

consequence: the act of copying the files into the turnin directory is a conscious decision to update the

modified time. Each late day reduces the total grade by 2% (20% of the assignment per late day). A late

day is defined as rounding up the late time to the nearest day (24-hours).

7

8 Hints

• If you running in GUI mode on a Unix server, make sure that you have initialized the DISPLAY

environment variable:

unix> setenv DISPLAY myhost.edu:0

• With some X servers, the “Program Code” window begins life as a closed icon when you run ssim

in GUI mode. Simply click on the icon to expand the window.

• With some Microsoft Windows-based X servers, the “Memory Contents” window will not automatically

resize itself. You’ll need to resize the window by hand.

• The ssim simulator terminates with a segmentation fault if you ask them to execute a file that is not

a valid Y86-64 object file.

8


版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp