Date: 2024-04-18 10:09

Project 2: Parsing

The goal of this project is to give you experience in writing a top-down recursive descent parser and to get introduced to the basics of symbol tables for nested scopes.

We begin by introducing the grammar of our language. Then we will discuss the semantics of our language, which involves lexical scoping rules and name resolution. Finally, we will go over a few examples and formalize the expected output.

NOTE: This project is significantly more involved than the first project. You should start on it immediately.

1. Lexical Specification

Here is the list of tokens that your lexical analyzer needs to support:

    PUBLIC = "public"
    PRIVATE = "private"
    EQUAL = "="
    COLON = ":"
    COMMA = ","
    SEMICOLON = ";"
    LBRACE = "{"
    RBRACE = "}"
    ID = letter (letter + digit)*

Comments and Space

In addition to these tokens, our input programs might have comments that should be ignored by the lexical analyzer. A comment starts with // and continues until a newline character is encountered. The regular expression for comments is: // (any)* \n in which any is defined to be any character except \n. Also, like in the first project, your lexical analyzer should skip space between tokens.
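As a rough illustration of the comment rule, the sketch below skips whitespace and // comments over an in-memory string. The function name skip_space_and_comments and the string-plus-index interface are assumptions made for this example only; the lexer from the first project reads its input differently, so treat this as a sketch of the logic rather than code to drop in.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the provided lexer): returns the index of
// the first character at or after `pos` that is neither whitespace nor part
// of a // comment. Returns input.size() if only space and comments remain.
std::size_t skip_space_and_comments(const std::string& input, std::size_t pos)
{
    while (pos < input.size()) {
        if (std::isspace(static_cast<unsigned char>(input[pos]))) {
            ++pos;  // plain whitespace between tokens
        } else if (input[pos] == '/' && pos + 1 < input.size()
                   && input[pos + 1] == '/') {
            // Comment: consume everything up to (not including) the newline.
            // The newline itself is whitespace and is skipped on the next pass.
            while (pos < input.size() && input[pos] != '\n') {
                ++pos;
            }
        } else {
            break;  // start of a real token (or of an invalid character)
        }
    }
    return pos;
}
```

A lone / that is not followed by a second / is left in place, so the tokenizer that runs next can report it as an ERROR token.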

2. Grammar

Here is the grammar for our input language:

    program → global_vars scope
    global_vars → ε
    global_vars → var_list SEMICOLON
    var_list → ID
    var_list → ID COMMA var_list
    scope → ID LBRACE public_vars private_vars stmt_list RBRACE
    public_vars → ε
    public_vars → PUBLIC COLON var_list SEMICOLON
    private_vars → ε
    private_vars → PRIVATE COLON var_list SEMICOLON
    stmt_list → stmt
    stmt_list → stmt stmt_list
    stmt → ID EQUAL ID SEMICOLON
    stmt → scope

Here is an example input program with comments:

    a, b, c; // These are global variables
    test {
    public:
        a, b, hello; // These are public variables of scope test
    private:
        x, y; // These are private variables of scope test
        a = b; // the body of test starts with this line
        hello = c;
        y = r;
        nested { // this is a nested scope
        public:
            b; // which does not have private variables
            a = b;
            x = hello;
            c = y;
            // we can also have lines that only contain comments like this
        }
    }

Note that our grammar does not recognize comments, so our parser would not know anything about comments, but our lexical analyzer would deal with comments. This is similar to the handling of spaces by the lexer: the lexer skips the spaces. In a similar fashion, your lexer should skip comments.

We highlight some of the syntactical elements of the language:

  • Global variables are optional
  • The scopes have optional public and private variables
  • Every scope has a body which is a list of statements
  • A statement can be either a simple assignment or another scope (a nested scope)
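One consequence of this grammar is worth spelling out: both global_vars and scope begin with ID, and so do both alternatives of stmt, so one token of lookahead is not enough to choose a production. Looking at the second token settles it (COMMA or SEMICOLON means a var_list, LBRACE means a scope). The sketch below is a minimal recognizer built around that idea, with one function per non-terminal. It is an illustration under stated assumptions, not the required design: the class name Recognizer, the helpers peek and expect, the pre-tokenized vector input, and the SyntaxError exception (used here instead of printing and exiting, so that the sketch can be tested) are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

typedef enum { END_OF_FILE = 0,
    PUBLIC, PRIVATE,
    EQUAL, COLON, COMMA, SEMICOLON,
    LBRACE, RBRACE, ID, ERROR
} TokenType;

struct SyntaxError {};  // assumption: thrown instead of exiting the program

class Recognizer {
public:
    explicit Recognizer(std::vector<TokenType> tokens)
        : toks(std::move(tokens)), pos(0) {}

    // Returns true if the whole token sequence follows the grammar.
    bool accepts() {
        try { parse_program(); return true; }
        catch (const SyntaxError&) { return false; }
    }

private:
    std::vector<TokenType> toks;
    std::size_t pos;

    // Token `ahead` positions past the current one; END_OF_FILE past the end.
    TokenType peek(std::size_t ahead = 0) const {
        return pos + ahead < toks.size() ? toks[pos + ahead] : END_OF_FILE;
    }
    // Consume the current token if it has type t, otherwise fail.
    void expect(TokenType t) {
        if (peek() != t) throw SyntaxError();
        ++pos;
    }

    void parse_program() {
        // global_vars is present only if the ID is followed by , or ;
        if (peek() == ID && (peek(1) == COMMA || peek(1) == SEMICOLON)) {
            parse_var_list();
            expect(SEMICOLON);
        }
        parse_scope();
        expect(END_OF_FILE);
    }
    void parse_var_list() {
        expect(ID);
        if (peek() == COMMA) { expect(COMMA); parse_var_list(); }
    }
    void parse_scope() {
        expect(ID);
        expect(LBRACE);
        if (peek() == PUBLIC) {
            expect(PUBLIC); expect(COLON); parse_var_list(); expect(SEMICOLON);
        }
        if (peek() == PRIVATE) {
            expect(PRIVATE); expect(COLON); parse_var_list(); expect(SEMICOLON);
        }
        parse_stmt_list();
        expect(RBRACE);
    }
    void parse_stmt_list() {
        parse_stmt();                         // at least one statement
        if (peek() == ID) parse_stmt_list();  // every stmt starts with ID
    }
    void parse_stmt() {
        if (peek() == ID && peek(1) == LBRACE) { parse_scope(); return; }
        expect(ID); expect(EQUAL); expect(ID); expect(SEMICOLON);
    }
};
```

Note that the grammar requires at least one statement in every scope body and fixes the order of the public and private sections, so this recognizer rejects an empty body such as main { } as well as a private section placed before a public one.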

3. Scoping and Resolving References

Here are the scoping rules for our language:

  • The public variables of a scope are accessible to its nested scopes
  • The private variables of a scope are not accessible to its nested scopes
  • Lexical scoping rules are used to resolve name references
  • Global variables are accessible to all scopes

Every reference to a variable is resolved to a specific declaration by specifying the variable's defining scope. We will use the following notation to specify declarations:

  • If variable a is declared in the global variables list, we use ::a to refer to it
  • If variable a is declared in scope b, we use b.a to refer to it

If a reference to name a cannot be resolved, we denote that by ?.a
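The rules and notation above can be sketched as a single lookup function. This is one possible reading, not a prescribed design: the Scope struct, the resolve function, and the idea of passing the enclosing scopes as a vector (outermost first) are assumptions made for illustration. The sketch searches from the innermost scope outward, treats private variables as visible only in the scope that declares them, and falls back to the globals and then to the ?. form.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical record for one scope's declarations.
struct Scope {
    std::string name;
    std::set<std::string> public_vars;
    std::set<std::string> private_vars;
};

// Resolves a reference to `var`. `enclosing` lists the scopes around the
// reference, outermost first; the last entry is the scope that directly
// contains the assignment.
std::string resolve(const std::string& var,
                    const std::vector<Scope>& enclosing,
                    const std::set<std::string>& globals)
{
    for (std::size_t i = enclosing.size(); i-- > 0; ) {
        const Scope& s = enclosing[i];
        const bool innermost = (i + 1 == enclosing.size());
        // Public variables are visible from nested scopes; private
        // variables only from the declaring scope itself.
        if (s.public_vars.count(var) > 0 ||
            (innermost && s.private_vars.count(var) > 0)) {
            return s.name + "." + var;
        }
    }
    if (globals.count(var) > 0) {
        return "::" + var;   // declared in the global variables list
    }
    return "?." + var;       // unresolved reference
}
```

For instance, with globals a, b, c, a scope test declaring public a, b, hello and private x, y, and a scope nested inside it declaring public b, a reference to x from inside nested resolves to ?.x, while the same reference from inside test resolves to test.x.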

Here is the example program from the previous section, with all name references resolved (look at the comments):

    a, b, c;
    test {
    public:
        a, b, hello;
    private:
        x, y;
        a = b; // test.a = test.b
        hello = c; // test.hello = ::c
        y = r; // test.y = ?.r
        nested {
        public:
            b;
            a = b; // test.a = nested.b
            x = hello; // ?.x = test.hello
            c = y; // ::c = ?.y
        }
    }

4. Examples

The simplest possible program would be:

    main {
        a = a; // ?.a = ?.a
    }

Let's add a global variable:

    a;
    main {
        a = a; // ::a = ::a
    }

Now, let's add a public variable a:

    a;
    main {
    public:
        a;
        a = a; // main.a = main.a
    }

Or a private a:

    a;
    main {
    private:
        a;
        a = a; // main.a = main.a
    }

Now, let's see a simple example with nested scopes:

    a, b;
    main {
        nested {
            a = b; // ::a = ::b
        }
    }

If we add a private variable in main:

    a, b;
    main {
    private:
        a;
        nested {
            a = b; // ::a = ::b
        }
    }

And a public b:

    a, b;
    main {
    public:
        b;
    private:
        a;
        nested {
            a = b; // ::a = main.b
        }
    }

You can find more examples by looking at the test cases and their expected outputs.

5. Expected Output

There are two cases:

  • In case the input does not follow the grammar, the expected output is:

        Syntax Error

    NOTE: no extra information is needed here! Also, notice that we need the exact message and it is case-sensitive.

  • In case the input follows the grammar:

    For every assignment statement in the input program, in order of appearance in the program, output the following information:

      • The resolved left-hand side of the assignment
      • The resolved right-hand side of the assignment

    in the following format:

        resolved_lhs = resolved_rhs

    NOTE: You can assume that scopes have unique names and variable names in a single scope (public and private) are not repeated.

For example, given the following input program:

    a, b, c;
    test {
    public:
        a, b, hello;
    private:
        x, y;
        a = b;
        hello = c;
        y = r;
        nested {
        public:
            b;
            a = b;
            x = hello;
            c = y;
        }
    }

The expected output is:

    test.a = test.b
    test.hello = ::c
    test.y = ?.r
    test.a = nested.b
    ?.x = test.hello
    ::c = ?.y

6. Implementation

Start by modifying the lexical analyzer from the previous project to make it recognize the tokens required for parsing this grammar. It should also be able to handle comments (skip them like spaces).

NOTE: make sure you remove the tokens that are not used in this grammar from your lexer, otherwise you might not be able to pass all test cases. Your TokenType type declaration should look like this:

    typedef enum { END_OF_FILE = 0,
        PUBLIC, PRIVATE,
        EQUAL, COLON, COMMA, SEMICOLON,
        LBRACE, RBRACE, ID, ERROR
    } TokenType;
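With the token set trimmed down like this, the only reserved words left are public and private. The sketch below shows one way a lexer might classify a scanned word; the helper name classify_word is hypothetical, and the sketch assumes that keywords are matched case-sensitively and that "letter" and "digit" mean ASCII letters and digits, neither of which the handout states explicitly.

```cpp
#include <cassert>
#include <cctype>
#include <string>

typedef enum { END_OF_FILE = 0,
    PUBLIC, PRIVATE,
    EQUAL, COLON, COMMA, SEMICOLON,
    LBRACE, RBRACE, ID, ERROR
} TokenType;

// Hypothetical helper: classifies a scanned word as a keyword, an ID, or an
// error, following ID = letter (letter + digit)*.
TokenType classify_word(const std::string& lexeme)
{
    // An ID (and therefore a keyword) must start with a letter.
    if (lexeme.empty() ||
        !std::isalpha(static_cast<unsigned char>(lexeme[0]))) {
        return ERROR;
    }
    // Every remaining character must be a letter or a digit.
    for (char c : lexeme) {
        if (!std::isalnum(static_cast<unsigned char>(c))) {
            return ERROR;
        }
    }
    // Keywords take priority over ID (assumed to be case-sensitive).
    if (lexeme == "public")  return PUBLIC;
    if (lexeme == "private") return PRIVATE;
    return ID;
}
```

Checking keywords after the shape test keeps the two concerns separate: a word is first confirmed to be a well-formed identifier, and only then compared against the reserved words.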

Next, write a parser for the given grammar. You need one function for each non-terminal of the grammar to handle parsing of that non-terminal. I suggest you use the following signature for these functions:

    void parse_X()

where X would be replaced by the target non-terminal. The lexical analyzer object needs to be accessible to these functions so that they can use the lexer to get and unget tokens. These functions can be member functions of a class, and the lexer object can be a member variable of that class.

You also need a syntax_error function that prints the proper message and terminates the program:

    void syntax_error()
    {
        cout << "Syntax Error\n";
        exit(1);
    }

Test your parser thoroughly. Make sure it can detect any syntactical errors.

Next, write a symbol table that stores information about scopes and variables. You also need to store assignments in a list to be accessed after parsing is finished. You need to think about how to organize all this information in a way that is useful for producing the required output.

Write a function that resolves the left-hand side and right-hand side of all assignments and produces the required output. Call this function in your main() function after successfully parsing the input.

NOTE: you might need more time to finish the last step compared to previous steps.

7. Requirements

Here are the requirements of this project:

You should submit all your project files (source code [.cc] and headers [.h]) on Gradescope. Do not zip them.

You should use C/C++; no other programming languages are allowed.

Besides the provided test cases, you need to design test cases on your own to rigorously test your implementation.

You should test your code on Ubuntu Linux 19.04 or greater with gcc 7.5.0 or higher.

You cannot use library methods for parsing or regular expression (regex) matching in projects. You will be implementing them yourself. If you have doubts about using a library method, please check it with the instructor or TA beforehand.

You can write helper methods or have extra files, but they should have been written by you.

8. Evaluation

The submissions are evaluated based on the automated test cases on Gradescope. Gradescope test cases are hidden from students. Your grade will be proportional to the number of test cases passing. You have to thoroughly test your program to ensure it passes all the possible test cases. It is not guaranteed that your code will pass the Gradescope test cases if it passes the published test cases. As a result, in addition to the provided test cases, you must design your own test cases to rigorously evaluate your implementation. If your code does not compile on the submission website, you will not receive any points. On Gradescope, when you get the results back, ignore the "Test err" case; it is not counted toward the grade.

The parsing test cases contain cases that are syntactically correct and cases that have syntax errors. If a syntax test case has no syntax error, your program passes the test case if the output is not Syntax Error. If a syntax test case has a syntax error, your program passes the test case if the output is Syntax Error.

Note that if your program prints the syntax error message independently of the input, for example:

    int main()
    {
        cout << "Syntax Error\n";
        return 0;
    }

it will pass some of the test cases, but you will not receive any points.

You can access Gradescope through the left sidebar in Canvas. You have already been enrolled in the Gradescope class, and using the left sidebar in Canvas you will automatically get into the Gradescope course.

