

Date: 2020-04-29 11:25

CS 214: Systems Programming, Spring 2020

Assignment vLast: Where’s the file?

Warning: This is a complex assignment that will take time to code and test. Be sure to define modules and give

it its due time. Make sure to read the assignment carefully!

Note: Due to the recent suspension of in-person contact you may complete this assignment

either with a partner or solo (which I do not recommend). You must sign up/register your

group in either case.

0. Abstract

Git is a popular current iteration of a series of version control systems. Most large, complex software projects

are coded using version control. Version control can be very helpful when working on a single set of source code

that multiple people are contributing to by making sure that everyone is working on the same version of the

code. If people are working on code in physically separate locations, it is entirely possible that two different

people have edited the same original code in two ways that are incompatible with each other. A versioning

control system would not allow two different versions of the same file to exist in its central repository, enforcing

that any changes made to a file are seen by everyone before they can submit additional changes to the repository.

1. Introduction

You will write a version control system for this assignment. You first need a primer for the vocabulary:

project: some collection of code that is being maintained

also, the directory under which a collection of code resides

repository: the union of all the canonical copies of all projects being managed as well as

metadata and backups/historical data

also, the directory under which all managed project directories are located

.Manifest: a metadata file listing all paths to all files in a project, their state (version), a

hash of all the contents of each file and a project version (which increments any

time there is any change to any part of the project)

history: a list of all updates/changes made to a given project since creation

(is maintained at the repository, but can be requested by clients)

roll back: change the current version of a project in the repository to be a previous version

commit/push: upload changes you made to a project locally to the repository, updating its current


check out: download the current version of a project in the repository

update: download from the repository only the files in a project that are newer than your local


A version control system consists of some remote server that holds the repository and manages changes to it, and

any number of user clients that can fetch projects from the repository and push changes to projects they have

checked out. The clients have a local copy of the project they can make changes to, while the server holds the

canonical or definitive version of the project.

The overriding mandate of a version control system is to make sure no changes are made to a project in the

repository unless the user making the changes to that project has its current version. This can ease difficulties

when working remotely on a shared code base with other people. Rather than, for instance, emailing around

copies of code as it is changed, a versioning system will maintain a single, canonical version. A version control

system enforces synchronization by requiring that you have the current version of a project before you can

submit a change to it. This means it is not possible for two people to check out the current version of a project,

make different changes to the same file, and then both submit those changes. The first one to submit their

changes will alter the current version of the repository, so the next person who tries to submit changes will need

to update to the current version. On updating, they would see the file they have been editing has been altered and

would have to integrate their changes with the current version of the file before submitting.

A version control system can also protect a team from development mishaps. If you accidentally delete a file,

you can just update from the repository and get it restored. If you have an odd bug and just want to start fresh,

you can delete your whole project directory and check out a fresh copy. The version control system also saves all

versions of a project as changes are committed to it. Every new version of a project is saved separately in the

repository so it is possible to fetch old versions. Given the scope of this assignment, your version control system

will have limited functionality and will only be able to roll back (most version control systems can fork a

repository to have different development paths, and can roll forward undoing previous roll backs). This would

allow a group to start coding, realize that the current design or direction is flawed and roll back to a previous

version of the repository. The next check out done would then restore an older version of the project that the

version control system had saved.

The server will then necessarily need to support multiple client connections, perhaps simultaneous ones, be able

to automatically scan recursively through a project directory and compare files to determine similarity. The

server will also need to keep multiple versions of a project so that it can roll back the current version to a

previous and send those old files to a client. The client will need to parse commands from the user, scan through

a project directory to build a local manifest and know how to communicate with the server to either commit

changes made by the user to the local copy up to the repository, or fetch from the server files that it has that are

newer versions of files in the user's local copy of the project.

2. Implementation

You will need to write two programs: a “Where's The File” server and client. The server will maintain a

repository of projects. A client can check out a project from the server's repository and then commit changes to it

and fetch updates from it.

This functionality is mostly provided by the .Manifest file. Every project has a .Manifest file in its root directory.

The .Manifest file contains:

the current version of the project

for each file in the project:

that file's path/name

that file's current version

a stored hash of that file

The project's version is incremented any time any change is pushed to the repository. The files that hold the

changes have their specific versions incremented as well (if they were not removed). The Manifest file is

discussed more in 4.4, below.

The client and server programs can be invoked in any order. Client processes that cannot find the server should

repeatedly try to connect every 3 seconds until killed or exited with a SIGINT (Ctrl+C).
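
The retry behavior described above could be sketched as follows. This is only one possible shape, not a required design: the names try_connect and connect_with_retry, and the max_attempts parameter (added so the loop can be bounded while testing), are assumptions of this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try to connect to host:port once. Returns a connected fd, or -1. */
static int try_connect(const char *host, const char *port)
{
    struct addrinfo hints, *res, *p;
    int fd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* TCP */

    if (getaddrinfo(host, port, &hints, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next) {
        fd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (fd < 0)
            continue;
        if (connect(fd, p->ai_addr, p->ai_addrlen) == 0)
            break;                    /* connected */
        close(fd);
        fd = -1;
    }
    freeaddrinfo(res);
    return fd;
}

/*
 * Retry every delay_sec seconds. max_attempts <= 0 means retry forever
 * (until the user kills the client with Ctrl+C).
 */
int connect_with_retry(const char *host, const char *port,
                       unsigned delay_sec, int max_attempts)
{
    int attempt = 0;
    for (;;) {
        int fd = try_connect(host, port);
        if (fd >= 0) {
            printf("Client: connected to server\n");
            return fd;
        }
        attempt++;
        if (max_attempts > 0 && attempt >= max_attempts)
            return -1;
        sleep(delay_sec);
    }
}
```

A client following the text above would call something like connect_with_retry(host, port, 3, 0). The default action for SIGINT already terminates the process, so no handler is needed for the retry loop itself.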

Minimally, your code should produce the following messages:

- Client announces completion of connection to server.

- Server announces acceptance of connection from client.

- Client disconnects (or is disconnected) from the server.

- Server disconnects from a client.

- Client displays error messages.

- Client displays informational messages about the status of an operation

(another operation is required, aborted and reason)

- Client displays successful command completion messages.

2.0 WTF client:

The WTF client program will take as command-line arguments a WTF command and a number of arguments.

Most commands require a project name first, some need only the project name. The first command the client is

given should always be a configure command where the hostname or IP address of the machine on which the

server is located as well as the port number where it is listening are command-line arguments. After a configure

is done, based on the command it is given, the client may send different things to the WTF server. The client

program can take only one command per invocation. The client's job is to maintain a Manifest; a list of all files

currently considered to be part of the project and to verify that list with the server when asked. Most commands

that result in files being sent or received to or from the server have two halves; a command of preparation to get

ready for an operation and then a command of execution to do the operation:

update - get the server's .Manifest and compare all entries in it with the client's .Manifest and see

what changes need to be made to the client's files to bring them up to the same version as the

server, and write out a .Update file recording all those changes that need to be made.

upgrade - make all the changes listed in the .Update to the client side

commit - get the server's .Manifest and compare all entries in it with the client's .Manifest and find out

which files the client has that are newer versions than the ones on the server, or the server does

not have, and write out a .Commit recording all the changes that need to be made.

push - make all the changes listed in the .Commit to the server side

All other commands are fairly direct; they create or destroy a project, add or remove a file to or from a project,

fetch the current version of the entire project from the server or change the current version of the project, or get

metadata about the project.

2.1 WTF server:

The WTF server program first needs to be multithreaded, as it needs to serve potentially any number of clients

at once. It should spawn a new client service thread whenever it gets a new connection request. It should not do

any client communication in the same execution context that listens for new connections.
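
A thread-per-connection listener along these lines might look like the sketch below. The function names (make_listener, serve_client, accept_loop) are made up for illustration, and detaching each worker is a simplification: a server that must join all threads at shutdown would keep the thread IDs instead.

```c
#define _POSIX_C_SOURCE 200809L
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Open a TCP listening socket on the given port. Returns fd or -1. */
int make_listener(uint16_t port)
{
    struct sockaddr_in addr;
    int yes = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Per-client worker: all client communication happens here. */
static void *serve_client(void *arg)
{
    int client_fd = *(int *)arg;
    free(arg);
    /* ... read one command, act on it, write a response ... */
    close(client_fd);
    printf("Server: disconnected from client\n");
    return NULL;
}

/* Listener loop: accept, hand off to a new thread, go back to accept. */
void accept_loop(int listen_fd)
{
    for (;;) {
        pthread_t tid;
        int *fdp;
        int client_fd = accept(listen_fd, NULL, NULL);
        if (client_fd < 0)
            break;                   /* interrupted, listener closed, or error */
        printf("Server: accepted connection from client\n");

        fdp = malloc(sizeof *fdp);   /* heap copy so each thread owns its fd */
        if (fdp == NULL) {
            close(client_fd);
            continue;
        }
        *fdp = client_fd;
        if (pthread_create(&tid, NULL, serve_client, fdp) != 0) {
            close(client_fd);
            free(fdp);
            continue;
        }
        pthread_detach(tid);         /* simplification: workers are not joined */
    }
}
```

The listening thread never reads from or writes to a client socket. It only accepts and hands the descriptor to a worker.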

Since there will be multiple threads trying to access the files in the repository at the same time, you should have

a mutex per project to control access to it. Be sure to lock the mutex whenever reading or writing information or

files from or to a project. You do not want to, for instance, send a .Manifest to a client while you're adding a file

to it so that the .Manifest sent would be out of date the moment it is sent. Be careful not to deadlock your server's
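
One way to keep a mutex per project is a small registry keyed by project name, sketched below. The struct and function names are hypothetical. A single list mutex guards the registry and is always released before the caller locks a project, so in this sketch no thread ever holds both locks at once.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* One node per project that has been touched since the server started. */
struct project_lock {
    char *name;
    pthread_mutex_t mutex;
    struct project_lock *next;
};

static struct project_lock *lock_list = NULL;
static pthread_mutex_t list_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Find (or create) the mutex for a project. Returns NULL if out of memory. */
pthread_mutex_t *project_mutex(const char *project)
{
    struct project_lock *p;

    pthread_mutex_lock(&list_mutex);          /* protects the list itself */
    for (p = lock_list; p != NULL; p = p->next)
        if (strcmp(p->name, project) == 0)
            break;

    if (p == NULL) {
        p = malloc(sizeof *p);
        if (p != NULL) {
            p->name = malloc(strlen(project) + 1);
            if (p->name == NULL) {
                free(p);
                p = NULL;
            } else {
                strcpy(p->name, project);
                pthread_mutex_init(&p->mutex, NULL);
                p->next = lock_list;
                lock_list = p;
            }
        }
    }
    pthread_mutex_unlock(&list_mutex);        /* released before caller locks */
    return p ? &p->mutex : NULL;
}
```

A worker thread would then wrap every read or write of a project in pthread_mutex_lock(project_mutex(name)) and the matching pthread_mutex_unlock().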


When being started the server takes a single command line argument, a port number to listen on;

./WTFserver 9123

The server can be quit with a SIGINT (Ctrl+C) in the foreground of its process. You should however make sure

that you catch the exit signal (atexit()) and nicely shut down all threads, close all sockets and file descriptors and

free() all memory before allowing the process to terminate.
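
A possible arrangement for clean shutdown is sketched here. Functions registered with atexit() run only when the process calls exit() or returns from main(), not when a signal kills it, so this sketch pairs a SIGINT handler that sets a flag with an atexit() cleanup routine. All names (shutdown_requested, on_sigint, cleanup, install_shutdown_hooks) are assumptions of the sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Set by the handler; the main loop polls it and then calls exit(). */
volatile sig_atomic_t shutdown_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;
    shutdown_requested = 1;   /* only async-signal-safe work in a handler */
}

/* Runs when the process calls exit() or returns from main(). */
static void cleanup(void)
{
    /* join worker threads, close sockets and files, free memory ... */
    printf("Server: shut down cleanly\n");
}

/* Install both hooks. Returns 0 on success, -1 on failure. */
int install_shutdown_hooks(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    /* no SA_RESTART: a blocked accept() returns -1 with errno == EINTR */
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    if (atexit(cleanup) != 0)
        return -1;
    return 0;
}
```

With this in place, the accept loop can check shutdown_requested whenever accept() fails, then call exit(0) so that cleanup() runs.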

3. WTF Client Commands

The client process will send commands to the server, and the server will send responses back to the client. The

server will send back error, confirmation messages, and/or files for each command. All messages sent to the

server should result in a response to the client. The client program will take one command at a time and can only

perform one command per execution/invocation.

3.0 ./WTF configure <IP/hostname> <port>

The configure command will save the IP address (or hostname) and port of the server for use by later

commands. This command will not attempt a connection to the server, but instead saves the IP and port number

so that they are not needed as parameters for all other commands. The IP (or hostname) and port should be

written out to a ./.configure file. All commands that need to communicate with the server should first try to get

the address information and port from the ./.configure file and must fail if configure wasn’t run before they were

called. All other commands must also fail if a connection to the server cannot be established.
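
Saving and reloading the configuration could be as small as the sketch below. The on-disk layout (host and port on one line separated by a space) and the function names are assumptions. The assignment only requires that both values end up in ./.configure.

```c
#include <stdio.h>

/* Write "<host> <port>\n" to path. Returns 0 on success, -1 on failure. */
int save_config(const char *path, const char *host, const char *port)
{
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;
    if (fprintf(f, "%s %s\n", host, port) < 0) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}

/*
 * Read host and port back. host must hold at least 256 bytes and port at
 * least 16. Returns 0 on success, -1 if the file is missing or malformed,
 * which is how other commands can detect that configure was never run.
 */
int load_config(const char *path, char *host, char *port)
{
    FILE *f = fopen(path, "r");
    int fields;
    if (f == NULL)
        return -1;
    fields = fscanf(f, "%255s %15s", host, port);
    fclose(f);
    return fields == 2 ? 0 : -1;
}
```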

Note: if you can write out to an environment variable that persists between Processes, feel free to do so, but all

recent feedback has been that security upgrades to the iLabs seem to have obviated this option.

3.1 ./WTF checkout <project name>

The checkout command will fail if the project name doesn’t exist on the server, the client can't communicate

with the server, if the project name already exists on the client side or if configure was not run on the client side.

If it does run it will request the entire project from the server, which will send over the current version of the

project .Manifest as well as all the files that are listed in it. The client will be responsible for receiving the

project, creating any subdirectories under the project and putting all files in to place as well as saving the


3.2 ./WTF update <project name>

The update command will fail if the project name doesn’t exist on the server or if the client can not contact the

server. The update command is rather complex since it is where lots of things are compared in order to maintain

proper versioning. If update doesn't work correctly, almost nothing else will.

Update's purpose is to fetch the server's .Manifest for the specified project, compare every entry in it to the

client's .Manifest and see if there are any changes on the server side for the client. If there are, it adds a line to

a .Update file to reflect the change and outputs some information to STDOUT to let the user know what needs to

change/will be changed. This is done for every difference discovered. If there is an update but the user changed

the file that needs to be updated, update should write instead to a .Conflict file and delete any .Update file (if

there is one). If the server has no changes for the client, update can stop and does not have to do a line-by-line

analysis of the .Manifest files, and should blank the .Update file and delete any .Conflict file (if there is one),

since there are no server updates.

There is one full success case, three partial success cases and one failure case:

Full success case: (client won't have to download anything)

Update code: (server has no updates for client and client may or may not have updates for the server)

criteria: the server and client .Manifests are the same version ... can stop immediately!

action: Write a blank .Update file because everything is awesome!

(*dum*dum*dum*dum*dum*... everything is great when you're part of a team! ...)

Delete .Conflict if it exists

Output 'Up To Date' to STDOUT

Partial success cases: (client will have to download some things)

Modify code: (server has modifications for the client)

criteria: the server and client .Manifest are different versions, and the client's .Manifest:

- has files whose version and stored hash are different than the server's,

and the live hash of those files match the hash in the client's .Manifest

action: Append 'M <file/path> <server's hash>' to .Update (create it if you need to)

Output 'M <file/path>' to STDOUT

Add code: (server has files that were added to the project)

criteria: the server and client .Manifest are different versions, and the client's .Manifest:

- does not have a file(s) that appear in the server's

action: Append 'A <file/path> <server's hash>' to .Update (create it if you need to)

Output 'A <file/path>' to STDOUT

Delete code: (server has removed files from the project)

criteria: the server and client .Manifest are different versions, and the client's .Manifest:

- does have a file(s) that does not appear in the server's

action: Append 'D <file/path> <server's hash>' to .Update (create it if you need to)

Output 'D <file/path>' to STDOUT

Failure case: (need to download some things, but can't because the user has made changes to the same files)

Conflict code: (server has updated data for the client, but the user has changed that file locally!)

criteria: the server and client .Manifest are different versions, and the client's .Manifest:

- has a file whose stored hash is different than BOTH the server's .Manifest and

a live hash of the file

action: Append 'C <file/path> <live hash>' to .Conflict (create it if you need to)

Output 'C <file/path>' to STDOUT

note: Don't stop if you hit a conflict! Keep going and find all updates and conflicts.

After scanning all of the server's .Manifest, be sure to output to STDOUT

that conflicts were found and must be resolved before the project can be updated.
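
The per-file decision described by the cases above can be written as one small function, sketched below for the situation where the two .Manifest versions already differ. The struct layout and the name classify_update are illustrative only. The sketch keys on the stored hashes alone and assumes a changed hash always comes with a changed file version.

```c
#include <stddef.h>
#include <string.h>

/* One .Manifest entry; field names are illustrative. */
struct entry {
    int version;
    const char *path;
    const char *hash;
};

/*
 * Decide the update code for one path when the Manifest versions differ.
 * client/server are the entries for that path, or NULL if that side's
 * .Manifest has no such entry. live_hash is the hash of the client's file
 * as it is on disk right now (may be NULL when client is NULL).
 * Returns 'M', 'A', 'D', 'C', or 0 when nothing needs to change.
 */
char classify_update(const struct entry *client, const struct entry *server,
                     const char *live_hash)
{
    if (client == NULL && server == NULL)
        return 0;
    if (client == NULL)
        return 'A';                       /* server added the file */
    if (server == NULL)
        return 'D';                       /* server removed the file */
    if (strcmp(client->hash, server->hash) == 0)
        return 0;                         /* same contents on both sides */
    if (strcmp(live_hash, client->hash) != 0)
        return 'C';                       /* user edited it locally too */
    return 'M';                           /* clean local copy, server is newer */
}
```

The caller would walk every path that appears in either .Manifest, append 'M', 'A' and 'D' results to .Update and 'C' results to .Conflict, and keep going after a conflict as the note above requires.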

3.3 ./WTF upgrade <project name>

The upgrade command will fail if the project name doesn’t exist on the server, if the server can not be

contacted, if there is no .Update on the client side or if .Conflict exists. The client will apply the changes listed

in the .Update to the client's local copy of the project. It will delete the entry from the client's .Manifest for all

files tagged with a “D”, fetch from the server and then overwrite or write all files on the client side that are

tagged with an “M” or “A”, respectively. When it is done processing all updates listed in it, the client should

delete the .Update file. Note that the client does not make any changes to files in the project directory that are

not listed in the .Update. If the .Update is empty, the client need only inform the user that the project is up to

date and delete the empty .Update file. If no .Update file exists, the client should tell the user to first do an

update. If .Conflict exists, the client should tell the user to first resolve all conflicts and update.

3.4 ./WTF commit <project name>

The commit command will fail if the project name doesn’t exist on the server, if the server can not be contacted,

if the client can not fetch the server's .Manifest file for the project, if the client has a .Update file that isn't empty

(no .Update is fine) or has a .Conflict file. After fetching the server's .Manifest, the client should first

check to make sure that the .Manifest versions match. If they do not match, the client can stop immediately and

ask the user to update its local project first. If the versions match, the client should run through its own .Manifest

and compute a live hash for each file listed in it. Every file whose live hash is different than the stored hash saved

in the client's local .Manifest should have an entry written out to a .Commit with its file version number

incremented. The commit should be successful if the only differences between the server's .Manifest and the

client's are:

Modify code: (client has a file with changed data)

criteria: the server and client .Manifest have the same file, and:

- the hash stored in both the server and client .Manifest is the same

- the client's live hash of the file is different than its stored hash

action: Append 'M <file/path> <server's hash>' to .Commit (create it if you need to)

with the file version incremented

Output 'M <file/path>' to STDOUT

Add code: (client has a file that was added to the project)

criteria: the server .Manifest does not have the file, the client .Manifest does

action: Append 'A <file/path> <server's hash>' to .Commit (create it if you need to)

with the file version incremented

Output 'A <file/path>' to STDOUT

Delete code: (client has removed a file from the project)

criteria: the server .Manifest does have the file, but the client .Manifest does not

action: Append 'D <file/path> <server's hash>' to .Commit (create it if you need to)

with the file version incremented

Output 'D <file/path>' to STDOUT

If all the differences are only the above cases, the client should send its .Commit to the server (and the server

should save it as an active commit) and report success. If however there are any files in the server's .Manifest

that have a different hash than the client's whose version numbers are not lower than the client's, then the commit

fails with a message that the client must synch with the repository before committing changes. If the client's

commit fails, it should delete its own .Commit file.
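
The commit rules above can be expressed per file in the same style. This is a sketch under assumptions: the struct, the name classify_commit and the 'X' return value (meaning "must sync first") are not part of the assignment. A file whose hashes differ while the server's version is lower is treated here like any other locally changed file, which the text does not spell out.

```c
#include <stddef.h>
#include <string.h>

/* One .Manifest entry; field names are illustrative. */
struct entry {
    int version;
    const char *path;
    const char *hash;
};

/*
 * client/server are the entries for one path, or NULL if that side's
 * .Manifest lacks it. live_hash is the hash of the client's file on disk.
 * Returns 'M', 'A' or 'D' for a change that can be committed, 0 when the
 * file is unchanged, and 'X' when the client must sync with the repository
 * before committing.
 */
char classify_commit(const struct entry *client, const struct entry *server,
                     const char *live_hash)
{
    if (client == NULL && server == NULL)
        return 0;
    if (server == NULL)
        return 'A';                       /* client added the file */
    if (client == NULL)
        return 'D';                       /* client removed the file */
    if (strcmp(client->hash, server->hash) != 0 &&
        server->version >= client->version)
        return 'X';                       /* server changed it: commit fails */
    if (strcmp(live_hash, client->hash) != 0)
        return 'M';                       /* edited locally since last sync */
    return 0;
}
```

A single 'X' for any path means the whole commit fails and the .Commit file should be deleted.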

3.5 ./WTF push <project name>

The push command will fail if the project name doesn’t exist on the server, if the client can not communicate

with the server or if the client has no .Commit file. The client should send its .Commit and all files listed in it to

the server. The server should first lock the repository so no other command can be run on it. While the repository

is locked, the server should check to see if it has a stored .Commit for the client and that it is the same as the

.Commit the client just sent. If this is the case, the server should expire all other .Commits pending for any other

clients, duplicate the project directory, write all the files the client sent to the newly-copied directory (or remove

files, as indicated in the .Commit), update the new project directory's .Manifest by replacing corresponding

entries for all files uploaded (and removing entries for all files removed) with the information in the .Commit the

client sent, and increasing the project's version. The server should then unlock the repository and send a success

message to the client. If there is a failure at any point in this process, the server should delete any new files or

directories created, unlock the repository and send a failure message to the client. The client should erase its

.Commit on either response from the server.

3.6 ./WTF create <project name>

The create command will fail if the project name already exists on the server or the client can not communicate

with the server. Otherwise, the server will create a project folder with the given name, initialize a .Manifest for it

and send it to the client. The client will set up a local version of the project folder in its current directory and

should place the .Manifest the server sent in it.

3.7 ./WTF destroy <project name>

The destroy command will fail if the project name doesn’t exist on the server or the client can not communicate

with it. On receiving a destroy command the server should lock the repository, expire any pending commits,

delete all files and subdirectories under the project and send back a success message.

3.8 ./WTF add <project name> <filename>

The add command will fail if the project does not exist on the client. The client will add an entry for the file

to its own .Manifest with a new version number and hashcode.

(It is not required, but it may speed things up/make things easier for you if you add a code in the .Manifest to

signify that this file was added locally and the server hasn't seen it yet)
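
The hashcode stored by add, and the live hashes compared by update and commit, all come from hashing a file's bytes. The sketch below uses 64-bit FNV-1a purely as a dependency-free stand-in. It is not a cryptographic hash, and a real submission would more likely call a SHA routine from a library available on the iLabs. The name hash_file is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hash a file's bytes with 64-bit FNV-1a and write 16 lowercase hex digits
 * plus a terminating NUL into out (so out needs at least 17 bytes).
 * Returns 0 on success, -1 if the file cannot be read.
 */
int hash_file(const char *path, char out[17])
{
    unsigned char buf[4096];
    uint64_t h = 0xcbf29ce484222325ULL;        /* FNV-1a offset basis */
    size_t n, i;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0) {
        for (i = 0; i < n; i++) {
            h ^= buf[i];
            h *= 0x100000001b3ULL;             /* FNV-1a prime */
        }
    }
    if (ferror(f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    snprintf(out, 17, "%016llx", (unsigned long long)h);
    return 0;
}
```

Storing the digest as text keeps the .Manifest human-readable and lets two hashes be compared with strcmp().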

3.9 ./WTF remove <project name> <filename>

The remove command will fail if the project does not exist on the client. The client will remove the entry for the

given file from its own .Manifest.

(It is not required, but it may speed things up/make things easier for you if you add a code in the .Manifest to

signify that this file was removed locally and the server hasn't seen it yet)

3.10 ./WTF currentversion <project name>

The currentversion command will request from the server the current state of a project. This

command does not require that the client has a copy of the project locally. The client should output a list of all

files under the project name, along with their version number (i.e., number of updates).

3.11 ./WTF history <project name>

The history command will fail if the project doesn’t exist on the server or the client can not communicate with

it. This command does not require that the client has a copy of the project locally. The server will send over a

file containing the history of all operations performed on all successful pushes since the project's creation. The

output should be similar to the update output, but with a version number and newline separating each push's log

of changes.

3.12 ./WTF rollback <project name> <version>

The rollback command will fail if the project name doesn’t exist on the server, the client can't communicate

with it, or the version number given is invalid. This command does not require that the client has a copy of the

project locally. The server will revert its current version of the project back to the version number requested by

the client by deleting all more recent versions saved on the server side.

4. Methodology

There are any number of ways to code the operations above. There are however some conventions that are

common in such situations that can alleviate pressures or problems you haven't come up against yet. In this

section we describe some of the common practices you might want to be aware of.

4.0 Simple network protocols

Once you open a socket and connect to the WTF server, you will get a file descriptor. You can treat that file

descriptor like any other file descriptor, you can read() and write() from and to it (but you can't f-command it ...

see, there is a method to the madness). This is how you communicate with the server. One

thing you must decide before you start writing your sockets is your protocol. Remember, read() and write() just

deal in bytes. If you need to download 3 files from the WTF server, for example, you need to know how long

each is first. You can't just start reading bytes and somehow know when the first file stopped, since you are just

reading bytes. You need to use the same file descriptor to send both commands and data, so you need a rigid set

of rules (a protocol) to allow you to separate commands and metadata from actual data since you can't tell them

apart just by inspecting the bytes themselves.

A common practice is to, whenever sending data, first send the length of the data and then the data itself. This

way you get the number of bytes, and then just read in that many bytes. Presume you define your protocol the

following way:

<text command><delimiter>

- if the command is the text 'sendfile' next could be

<number of files><delimiter>

- then, <number of files> number of the following

<filename length><delimiter><filename><size of file in bytes><delimiter>

- then, the bytes for each file, in sequence

So, if you wanted to send two files, 'thing.txt' and 'stuf.txt', and your delimiter is ':', the message would be:

sendfile:2:9:thing.txt21:8:stuf.txt9:0f009fflll1l100JIAlz0&89*1H9s0

The message above broken out and interpreted based on the rules above with the delimiter “:”

sendfile: I am getting files

2: I am getting 2 files

9: first file's name is the next 9 bytes (chars)

thing.txt .. this is the first file's name

21: .. this is the first file's length in bytes

8: second file's name is the next 8 bytes (chars)

stuf.txt .. this is the second file's name

9: .. this is the second file's length in bytes

(I've read in information for all (2) files, now I should be getting the files' bytes)

0f009fflll1l100JIAlz0 .. these next 21 bytes make up the first file

&89*1H9s0 .. these next 9 bytes make up the second file

You should design your own protocol after looking carefully over all the WTF commands. Be sure to

remember that you will need to use the same file descriptor to send commands, metadata and data, so you will

need to tell them all apart. As stated above, often the most direct way to do that is to first note the number of

bytes something will be, and then to read that many bytes. The reason there is a delimiter after each length is that

you do not know how long (how many chars) a given number will be.
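
Building a message under the example protocol above could look like the following sketch, which assembles the whole message in a caller-supplied buffer. The struct outfile and the function encode_sendfile are names invented here, and a real client might stream large files rather than buffer them.

```c
#include <stdio.h>
#include <string.h>

struct outfile {
    const char *name;          /* file name as sent on the wire */
    const unsigned char *data; /* file contents */
    size_t size;               /* number of bytes in data */
};

/*
 * Build "sendfile:<n>:" followed by one header per file, then the raw bytes
 * of every file in order. Returns the message length, or 0 if buf is too
 * small.
 */
size_t encode_sendfile(char *buf, size_t cap,
                       const struct outfile *files, int nfiles)
{
    size_t used;
    int i, n;

    n = snprintf(buf, cap, "sendfile:%d:", nfiles);
    if (n < 0 || (size_t)n >= cap)
        return 0;
    used = (size_t)n;

    for (i = 0; i < nfiles; i++) {        /* headers: <namelen>:<name><size>: */
        n = snprintf(buf + used, cap - used, "%zu:%s%zu:",
                     strlen(files[i].name), files[i].name, files[i].size);
        if (n < 0 || (size_t)n >= cap - used)
            return 0;
        used += (size_t)n;
    }
    for (i = 0; i < nfiles; i++) {        /* bodies: counted bytes, no delimiter */
        if (files[i].size > cap - used)
            return 0;
        memcpy(buf + used, files[i].data, files[i].size);
        used += files[i].size;
    }
    return used;
}
```

The file bodies are copied with memcpy() and a byte count rather than as strings, since file data may contain the delimiter or NUL bytes.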

A rule of thumb is to use a delimiter for anything that is read in as fairly short text data and number of bytes for

anything that is not text data or text data that could be quite long. You don't want to use a delimiter on data

because your delimiter might appear as data, which makes things difficult. You often don't want to use a

delimiter on something long because you don't want to do a thousand compares looking for your delimiter if you

could just count bytes.
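
On the receiving side, the two reading styles in this rule of thumb map onto two helpers, sketched below with hypothetical names. read_number scans a short delimited field one byte at a time. read_exact pulls a counted block and loops, because read() on a socket may return fewer bytes than requested.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Read exactly n bytes, looping over short reads. Returns 0, or -1 on error/EOF. */
int read_exact(int fd, void *buf, size_t n)
{
    char *p = buf;
    while (n > 0) {
        ssize_t got = read(fd, p, n);
        if (got < 0 && errno == EINTR)
            continue;             /* interrupted by a signal: retry */
        if (got <= 0)
            return -1;            /* error, or peer closed early */
        p += got;
        n -= (size_t)got;
    }
    return 0;
}

/*
 * Read decimal digits one byte at a time until delim is seen.
 * Stores the value in *out. Returns 0, or -1 on a malformed field.
 */
int read_number(int fd, char delim, long long *out)
{
    long long value = 0;
    int digits = 0;
    char c;

    for (;;) {
        if (read_exact(fd, &c, 1) != 0)
            return -1;
        if (c == delim)
            break;
        if (c < '0' || c > '9' || digits >= 18)
            return -1;            /* not a digit, or too long to fit */
        value = value * 10 + (c - '0');
        digits++;
    }
    if (digits == 0)
        return -1;
    *out = value;
    return 0;
}
```

Reading a file name is then read_number() for its length followed by read_exact() for that many bytes.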

4.2 Network Request/Response

Another common rule of thumb for network protocols is to make sure you always get a response for every

message. The WTF client should always expect a response from the WTF server. In other words, make sure that

you write the WTF server so that it always sends a response to the WTF client for every message. Even if the

WTF server has no data or information to send in response, a simple 'OK' lets the WTF client know that the

WTF server has seen the message. This way, you can tell the difference between a command that failed on the

server vs a message that got lost.
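
A minimal reply helper in this spirit is sketched below. The wire format chosen here (a length, a ':' delimiter, then the text) is an assumption that follows the length-prefix convention of section 4.0. The function names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write all n bytes, looping over short writes. Returns 0 or -1. */
int write_all(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    while (n > 0) {
        ssize_t put = write(fd, p, n);
        if (put < 0 && errno == EINTR)
            continue;             /* interrupted by a signal: retry */
        if (put <= 0)
            return -1;
        p += put;
        n -= (size_t)put;
    }
    return 0;
}

/* Send "<length>:<text>", for example "2:OK". Returns 0 or -1. */
int send_reply(int fd, const char *text)
{
    char header[32];
    int n = snprintf(header, sizeof header, "%zu:", strlen(text));
    if (n < 0 || (size_t)n >= sizeof header)
        return -1;
    if (write_all(fd, header, (size_t)n) != 0)
        return -1;
    return write_all(fd, text, strlen(text));
}
```

If every server code path ends in send_reply(), whether with 'OK' or an error string, a client that reads nothing back can conclude the message never arrived.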

4.4 File Manifests

Your WTF client will need to determine if the user changed any files in its local copy of the project so that it can

tell if it should upload those files to the server or not on a commit/push and if the client's files have been changed

and are inconsistent with the server's files on an update/upgrade.

The rule is that the client is allowed to send changes to the server if, for all files the client changed, there were no

changes made to those files on the server yet. The most direct way to do this would be download every file in a

project from the WTF server and compare them, byte by byte, with the local copies the client has - which is

terrible. Instead of doing that, your WTF client will compute a Manifest, ask the WTF server for its Manifest and

compare them to determine which files, if any, have changed.

A Manifest is an index consisting of a version number for the Manifest itself, all files in a project (by full path),

their current version number and a digest of that file. A digest is some short way of representing the contents of a

file. You do not need the actual contents of the file; you just want some code that can be compared to see if a

given file changed. A popular way to implement a digest is to use some type of hash, like MD5 or SHA. Feel free

to use any hash library on the iLabs that suits your purposes. The hash of the contents of the file will change if

you change any single byte in the file and hashes can be much shorter than the entire file. The WTF server and

client should maintain a .Manifest file for each project. A Manifest can be as direct as a version number at the

top, a newline, and an entry per line consisting of:

<file version number><<projectname>/path/filename><hashcode>

A file's version is incremented whenever it is changed. A Manifest's version number is incremented after a

successful push. This means that if a project is changed in any way, the Manifest version increases, however one

push may contain multiple changes. A user might erase one character in a file then commit and push that change,

incrementing that file's version number and the Manifest's version number. A user might also remove three files,

add two and change seven other files, and then commit and push all those changes all at once, which would also

only increment the Manifest's version number by one, because they were changes that came from one push.
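
Reading and writing one entry of a Manifest in the one-entry-per-line layout described above might look like this sketch. It assumes the three fields are separated by single spaces and that paths contain no whitespace, neither of which the assignment specifies. The struct and function names are illustrative.

```c
#include <stdio.h>

struct manifest_entry {
    int version;       /* file version number */
    char path[256];    /* <projectname>/path/filename */
    char hash[65];     /* digest as text */
};

/* Parse "<version> <path> <hash>". Returns 0 on success, -1 if malformed. */
int parse_entry(const char *line, struct manifest_entry *e)
{
    int fields = sscanf(line, "%d %255s %64s", &e->version, e->path, e->hash);
    return (fields == 3 && e->version >= 0) ? 0 : -1;
}

/* Format an entry back into buf (newline-terminated). Returns 0 or -1. */
int format_entry(const struct manifest_entry *e, char *buf, size_t cap)
{
    int n = snprintf(buf, cap, "%d %s %s\n", e->version, e->path, e->hash);
    return (n < 0 || (size_t)n >= cap) ? -1 : 0;
}
```

After a successful push, the server would bump the version field of each changed entry and the Manifest's own version number on the first line once.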

4.3 Implementation Strategy

Be sure to map out everything first. There are definitely places where you can save coding effort and time by

reusing the functionality of some commands in other places. Most of the commands are quite simple. You could

write add and remove in a minute or two. Create and destroy are also quite simple. It can help a lot to write

currentversion and history early, since they will let you parse through and read metadata to see how things

change as you run other commands. For update/upgrade and commit/push it can help to first write out a

flowchart of the interactions, which can help you figure out how to code that functionality and what

information and data you need to pass back and forth and what messages you will need in your networking

protocol to get it.

As for networking, first get a single-threaded server that you can send a message to and read a response. Then

send a message that is a file name and get the server to open that file and send it to you, and have your client side

write it out. Then you can verify if it works or not. Then put your file reading and sending code in to a

function, make the function take and return a void * and make it a thread. Implement some simple error handling (file

does not exist, no read permissions, etc), then modify your client to open a file with a list of file names that it

requests from your server and have your server send them all back. Once you have that add functionality so that

the client sends a list of files to the server, then the client sends those files to the server and the server saves them

on its side. You then have pretty much all the networking functionality you need for the project. Note, you can

dovetail this part and the add/remove with a simplified commit/push. Add adds a filename to a list, remove

removes it and commit sends the file that holds the list of filenames and then push sends the actual files.

5. Results

Turn in a tarred gzipped file named Asst3.tgz containing a directory named Asst3 with the following

files in it:

• A readme.pdf file documenting your design paying particular attention to the thread

synchronization requirements of your application.

• A file called testcases.txt that contains a thorough set of test cases for your code, including

inputs and expected outputs.

• All source code including both implementation (.c) and interface (.h) files.

(and no executables or object files)

• A test plan documented in testplan.txt and code to run the test cases you specify

• A Makefile for producing your object and executable files with at least the following targets:

o all - should build a client executable named “WTF” and server executable named


o clean - should remove all executables and object files buildable by the Makefile

o test - should build all files needed to run your test cases, and an executable named

“WTFtest” to run them.

Your grade will be based on:

• Correctness (how well your code works, including avoidance of deadlocks).

• Efficiency (avoidance of recomputation of the same results).

• Good design (how well written your design document and code are, including modularity and


(you may work either individually or with a partner on this project, however you must register/sign

up, otherwise you will receive no grade for the project)

6. Future Work

There are a few common optimizations you can apply for extra credit:


Compress old versions of the project at the repository: +10pts

Compress all files to be sent over the network in to one file: +20pts

Do the above using zlib or tar library calls (rather than system()) +10pts

