CIS341, Spring 2020
Assembly Lab: Programming a Y86 Processor
Assigned: Feb. 18st, Due: Mar. 13th, 8:59PM
1 Introduction
In this lab, you will learn about the Y86 instruction set architecture and how an assembly program executes
on a sequential Y86 processor. You are allowed to make any semantics-preserving transformation to the
program, and optionally make enhancements to the sequential processor. When you have completed the lab,
you will have a keen appreciation for the interactions between code and hardware that affect the performance
of your programs.
You will be hand-coding Y86 assembly codes with the goal of either making the code size small, or making
the code run fast. You may optionally extend the Y86 instruction set architecture for the SEQ processor to
speed up your assembly program.
2 Logistics
You will work on this lab alone. Any clarifications and revisions to the assignment will be posted on
blackboard.
3 Handout Instructions
Start by copying asmlab-handout.tar to your home directory on lcs-vc-cis341.syr.edu machine.
Then untar the tarball file to unpack the assignment in the directory.
unix> cp /home/cis341-proj/asmlab-handout.tar .
unix> tar -xvf asmlab-handout.tar
unix> cd asmlab-handout
In this directory, you will find subdirectories misc, ptest, seq, src, and y86-code, and two files
Makefile and README. You may build all the available tools at this working directory.
1
unix> make clean; make
Read the README files in all directories to familiarize yourself with the assignment.
4 Y86 Programming
You will be working in directory src in this part. Your task is to write and simulate the following two Y86
programs: selectsort.ys and quicksort.ys. The required behavior of these programs is defined
by C programs in the src directory. You can test your programs by first assembling them with the program
YAS and then running them with the instruction set simulator YIS.
In all of your Y86-64 functions, you should follow the x86-64 conventions for passing function arguments,
using registers, and using the stack. This includes saving and restoring any callee-save registers that you
use. The callee-save registers are %rbx, %rbp, %r12, %r13, and %r14. %rsp is a special case where it
should be restored to the value before the call via call/ret and push/pop mechanism.
selectsort.ys: selection sort algorithm
Write a Y86-64 program selectsort.ys that performs a selection sort on an array of data. Your program
should consist of some code that sets up the stack structure, invokes a function, and then halts. In this case,
the function should be Y86 code for a function (selectsort) that is functionally equivalent to the C
selectsort function in Figure 1. The directory includes a skeleton file for you to implement everything
except for the selection sort algorithm itself.
You may choose to implement selectsort without calling swap: in such case, delete swap.
quicksort.ys: quick sort algorithm
Write a Y86-64 program quicksort.ys that performs a quick sort on an array of data. Your program
should consist of some code that sets up the stack structure, invokes a function, and then halts. In this
case, the function should be Y86 code for a function (quicksort) that is functionally equivalent to the C
quicksort function in Figure 2. The directory includes a skeleton file for you to implement everything
except for the quick sort algorithm itself.
quicksort must recursively call itself, and must call partition. However, you may choose to implement
quicksort without calling swap: in such case, delete swap.
Getting started
You should first examine the C code to understand what the algorithm is doing line by line. Then try
compiling to x86 assembly with various optmization levels to understand how the compiler is generating
the assembly code.
unix> gcc -Og -S <C file>.c # optimize debugging experience
2
1 #include <stdint.h> // includes definition for int64_t
2
3 /* swap - Swaps the data pointed by the two pointer arguments */
4 void swap(int64_t *srcA, int64_t *srcB) {
5 int64_t valA, valB;
6 valA = *srcA;
7 valB = *srcB;
8 *srcA = valB;
9 *srcB = valA;
10 }
11
12 /* selectsort - Sorts an array of data using selection sort algorithm */
13 void selectsort(int64_t *arr, int64_t size) {
14 int64_t i, j, min_idx;
15 for (i=0; i<size; i+=1) {
16 min_idx = i;
17 for (j=i+1; j<size; j+=1) {
18 if (arr[j]<arr[min_idx])
19 min_idx = j;
20 }
21 if (min_idx!=i) {
22 swap(&arr[i], &arr[min_idx]);
23 }
24 }
25 }
Figure 1: C versions of selection sort. See src/selectsort.c
3
1 #include <stdint.h> // includes definition for int64_t
2
3 /* swap - Swaps the data pointed by the two pointer arguments */
4 void swap(int64_t *srcA, int64_t *srcB) {
5 int64_t valA, valB;
6 valA = *srcA;
7 valB = *srcB;
8 *srcA = valB;
9 *srcB = valA;
10 }
11
12 /* partition - Partitions the array into two:
13 with one side smaller than a pivot value, and the other, larger.
14 This function returns the index of the pivot. */
15 int64_t partition(int64_t *arr, int64_t lo, int64_t hi) {
16 int64_t i, j, p;
17 p = arr[hi];
18 i = lo;
19 for (j=lo; j<hi; j+=1) {
20 if (arr[j] < p) {
21 swap(&arr[i], &arr[j]);
22 i += 1;
23 }
24 }
25 swap(&arr[i], &arr[hi]);
16 return i;
17 }
18
19 /* quicksort - Sorts an array of data using quick sort algorithm */
20 void quicksort(int64_t *arr, int64_t lo, int64_t hi) {
21 int64_t p;
22 if (lo<hi) {
23 p = partition(arr, lo, hi);
24 quicksort(arr, lo, p-1);
25 quicksort(arr, p+1, hi);
26 }
27 }
Figure 2: C versions of quick sort. See src/quicksort.c
4
Once you understand how x86 code is generated, try transliterating to Y86 code. You may also want to try
other compiler options and observe how different the assembly code can become.
unix> gcc -Os -S <C file>.c # optimize for size
unix> gcc -O0 -S <C file>.c # optimization level 0 (default)
unix> gcc -O1 -S <C file>.c # optimization level 1
unix> gcc -O2 -S <C file>.c # optimization level 2
unix> gcc -O3 -S <C file>.c # optimization level 3
5 Optional Architectural Enhancement
You will be working in directory seq in this part.
You may optionally extend the SEQ processor to support the iaddq instruction, described in Homework
problems 4.51 and 4.52. To add this instruction, you will modify the file seq-full.hcl, which implements
the version of SEQ described in the CS:APP3e textbook. In addition, it contains declarations of some
constants that you will need for your solution.
Building and Testing Your Solution
Once you have finished modifying the seq-full.hcl file, then you will need to build a new instance of
the SEQ simulator (ssim) based on this HCL file, and then test it:
• Building a new simulator. You can use make to build a new SEQ simulator:
unix> make VERSION=full
This builds a version of ssim that uses the control logic you specified in seq-full.hcl. If you
want to revert back to the standard textbook version (without your modification in seq-full):
unix> make VERSION=std
• Testing your solution on a simple Y86-64 program. For your initial testing, we recommend running
simple programs such as asumi.yo (testing iaddq) in TTY mode, comparing the results against
the ISA simulation:
unix> ./ssim -t ../y86-code/asumi.yo
If the ISA test fails, then you should debug your implementation by single-stepping the simulator in
GUI mode:
unix> ./ssim -g ../y86-code/asumi.yo
• Retesting your solution using the benchmark programs. Once your simulator is able to correctly
execute small programs, then you can automatically test it on the Y86-64 benchmark programs in
../y86-code:
5
unix> (cd ../y86-code; make testssim)
This will run ssim on the benchmark programs and check for correctness by comparing the resulting
processor state with the state from a high-level ISA simulation. Note that none of these programs test
the added instructions. You are simply making sure that your solution did not inject errors for the
original instructions. See file ../y86-code/README file for more details.
• Performing regression tests. Once you can execute the benchmark programs correctly, then you
should run the extensive set of regression tests in ../ptest. To test everything except iaddq
and leave:
unix> (cd ../ptest; make SIM=../seq/ssim)
To test your implementation of iaddq:
unix> (cd ../ptest; make SIM=../seq/ssim TFLAGS=-i)
For more information on the SEQ simulator refer to the handout CS:APP3e Guide to Y86-64 Processor
Simulators (simguide.pdf).
6 Evaluation
This lab is worth 200 points: 100 points for selectsort.ys and 100 points for quicksort.ys. They
will be evaluated on correctness as well as code size or performance. If no seq-full.hcl is submitted,
we will evaluate using the unmodified seq-full.hcl. If seq-full.hcl is submitted, we will use that
SEQ implementation for the evaluation.
6.1 Correctness
Correctness accounts for 50 points each for selectsort.ys and quicksort.ys. The simulation of
the programs must end with the HLT (halt) status.
Both algorithms perform in-place updates. After the sort, the array must contain elements in non-decreasing
order. An array size of 32 will be used with different initialized values.
In addition, the Y86 assembly code must follow the correct register saving convention. By the end of the
program, all callee-saved registers must be restored to the values to before executing the call to the sort. The
callee-save registers are %rbx, %rbp, %r12, %r13, and %r14. %rsp is a special case where it should be
restored to the value before the call via call/ret and push/pop mechanism.
6.2 Code size for selectsort.ys
The code size for selectsort.ys accounts for 50 points. You will only receive these points if your
implementation of selectsort is correct.
6
The assembled program (.yo) provides the program size. Part of the program is allocated for data (the
array to sort) and stack (for function calls). These would not count as part of the code size. For example,
the provided skeleton files selectsort.ys and quicksort.ys would take x52 or 82B and x5d or
93B, respectively.
Your code size score depends on this assembled code size:
Scorecodesize =
0 , size > 300B
1/2 · (300 − c), 200B ≤ size ≤ 300B
50 , size < 200B
6.3 Performance for quicksort.ys
The dynamic instruction count (number of instruction executed) for quicksort.ys accounts for 50
points. You will only receive these points if your implementation of quicksort is correct.
The dynamic instructoun count may vary depending on the initialized value of the array and the size of the
array to sort. We will evaluate using 32 elements, and compute the average of multiple runs to determine
the score:
Scoreperf =
0 , inst.count > 6000
1/40 · (6000 − c), 4000 ≤ inst.count ≤ 6000
50 , inst.count < 4000
7 Handin Instructions
You will be handing in two assembly files, and one optional HCL file.
• selectsort.ys: assembly program that performs a select sort algorithm
• quicksort.ys: assembly program that performs a quick sort algorithm
• seq-full.hcl: optional processor implementation
In order to turn-in your work, place your files in the turnin directory, inside asmlab-handout.
unix> cd ˜/asmlab-handout
unix> cp src/selectsort.ys turnin/
unix> cp src/quicksort.ys turnin/
unix> cp seq/seq-full.hcl turnin/ # optional
We will copy out your completed files 5 days after the deadline. The modified time of your files will be
used to determine your completion time. Thus, if you over-write your files by performing another copy after
the deadline, it will count as a late turn-in, even if you have turn-in before on time. Please be aware of this
consequence: the act of copying the files into the turnin directory is a conscious decision to update the
modified time. Each late day reduces the total grade by 2% (20% of the assignment per late day). A late
day is defined as rounding up the late time to the nearest day (24-hours).
7
8 Hints
• If you running in GUI mode on a Unix server, make sure that you have initialized the DISPLAY
environment variable:
unix> setenv DISPLAY myhost.edu:0
• With some X servers, the “Program Code” window begins life as a closed icon when you run ssim
in GUI mode. Simply click on the icon to expand the window.
• With some Microsoft Windows-based X servers, the “Memory Contents” window will not automatically
resize itself. You’ll need to resize the window by hand.
• The ssim simulator terminates with a segmentation fault if you ask them to execute a file that is not
a valid Y86-64 object file.
8
版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。