Assignment 2: Syntax Analysis

I. Introduction

In this assignment, you are required to implement a parser manually for MC programs. The parser performs syntax analysis: it receives the sequence of tokens produced by the scanner, which should have been implemented in Assignment 1, and verifies whether the token sequence is grammatically correct.

To complete the assignment, the following tasks are to be fulfilled:
- Construct a context-free grammar for the MC language.
- Implement a parser according to the constructed grammar.

You can employ either a top-down or a bottom-up parsing technique for your parser. You must use the scanner provided by the course staff to perform lexical analysis for your parser. Thus, the token set used must be the same as that previously specified in Assignment 1. You should refer carefully to the lecture notes and textbooks, as well as the MC language specification, to find the grammar rules that precisely reflect the MC program structures.

II. Operational Instructions

The programming language used to implement the parser must be Java. You should install Java JDK 5.0, which includes a Java compiler and a Java virtual machine, as done in Assignment 1.

To implement the parser, first download the provided file ass2.zip, which will be uploaded after the due date of Assignment 1, and decompress it in a directory, referred to here as $ROOT$, as done in Assignment 1. In your $ROOT$ directory, you will have an MC directory whose structure is as follows (names ending in / are folders, the remaining are files):

    MC/
    |_ lexicalanalysis/
    |   |_ Scanner.java, SourcePosition.java, Token.java, ErrorReport.java
    |_ syntaxanalysis/
    |   |_ grammar/
    |   |   |_ grammar.txt
    |   |_ test/
    |   |   |_ test.txt
    |   |   |_ solution.txt
    |   |_ Parser.java
    |_ MCCompiler.java

The files in directory lexicalanalysis are the scanner, provided for your convenience. You must not modify these files.

File grammar.txt specifies your constructed grammar. You can use either the BNF or the EBNF formalism to specify your productions.
If you transform the grammar for top-down parsing, please put both the original and the transformed grammars in the same file, as separate sections.

File MCCompiler.java defines class MCCompiler. You must not modify this file.

File Parser.java defines class Parser. This class performs all the necessary tasks of a parser. Its public method parse is invoked from the main method of class MCCompiler and reports whether the input file is grammatically correct.

Files test.txt and solution.txt are supplied for your convenience. You can try the provided files by typing the following commands and comparing the output to the content of solution.txt:

    $ROOT$> javac MC\*.java
    $ROOT$> java MC.MCCompiler MC\syntaxanalysis\test.txt

If necessary, you can create your own new Java source files, but they must be in the same package, MC.syntaxanalysis. Only Parser.java, grammar.txt and your own new files are required to be submitted.

Output Format

If the input file is grammatically correct, the parser outputs nothing; otherwise an error message is reported (please refer to the Syntax Error section for details of the error message).

Parser Testing

Although some test files are provided for this assignment, you are advised to design additional test cases to make sure your parser works as desired. The mechanism to test your parser is similar to that in Assignment 1.

Syntax Error

Your parser must be capable of detecting syntax errors in the input file as soon as they occur. When a syntax error is detected, a corresponding error message displays the position of the error and the token whose occurrence causes the error. The conventional error message has the following format, where [tab] stands for a single tab character:

    [Syntax error: Unexpected token:][" "]<token>[tab][Lexeme:]<lexeme>[tab][charStart=]<number>[" "][charFinish=]<number>[" "][line=]<number>

For example, with the input "x = a+ ;" at line 3, the following message should be displayed:

    Syntax error: Unexpected token: Token.SEMICOLON[tab]Lexeme:;[tab]charStart=8 charFinish=8 line=3

Note that there is no newline at the
end of the error message. The parser terminates immediately when a syntax error is found. No further error recovery action is taken.

Submission and Late Penalties

Instructions for submitting your assignment will be available on the course's site around week [...]. Basically, the submission mechanism is similar to that of Assignment 1.

The deadline for this assignment is 12:00 noon, Wed, Oct 25th, 2006. This assignment is worth 25% of the assignment mark. You are strongly advised to start as soon as possible and not to wait until the last minute.

- If you are 1 day late (12:00 noon, Oct 26th, 2006), the maximum mark for you is [...].
- If you are 2 days late (12:00 noon, Oct 27th, 2006), the maximum mark for you is [...].
- If you are 3 days late (12:00 noon, Oct 28th, 2006), the maximum mark for you is [...].

After Oct 28th, 2006, you do not need to submit your assignment anymore. No excuse is accepted after this point in time.

Plagiarism

You must do the assignment by yourself. If it is discovered that your assignment is a copy of your friend's work, both of you will receive a zero mark for this subject (not only for the assignment). NO EXCUSE AND NO EXCEPTION!

TUTORIAL

1. Consider the following grammar:

       S → (L) | a
       L → L,S | S

   a) What are the terminals, nonterminals and start symbol?
   b) Find parse trees for the following sentences:
        (a,a)
        (a,(a,a))
        (a,((a,a),(a,a)))
   c) Construct a leftmost derivation for each sentence given in (b).
   d) Construct a rightmost derivation for each sentence given in (b).
   e) What is the language generated by this grammar?

2. Consider the following grammar:

       S → aSbS | bSaS | ε

   a) Find a rightmost derivation for abab.
   b) Construct all possible parse trees for abab.
   c) Is this grammar ambiguous? Why?
   d) What is the language generated by this grammar?
3. Write a grammar that generates all boolean expressions. Construct the corresponding parse tree for:

       not (true or false)

4. a) Eliminate the left recursion from the grammar in Exercise 1.
   b) Compute the First, Follow and Select sets for the transformed grammar.
   c) Construct a recursive predictive parser for the transformed grammar.
   d) Show the behavior of the parser for the sentences given in Exercise 1b.

5. Eliminate left recursion from the grammar constructed in Exercise 3 and left-factor it, if necessary.
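
As a rough illustration of what Exercise 4(c) asks for, the sketch below shows one possible recursive predictive parser for the list grammar of Exercise 1. It assumes the left recursion in L → L,S | S has been removed by the usual transformation, giving L → S L' and L' → , S L' | ε. The class name ListParser, the helper accepts and the error text are illustrative assumptions only. The sketch reads raw characters rather than tokens from the provided scanner, and it does not produce the error message format required by the assignment.

```java
// Hypothetical sketch (not part of the provided MC sources): a recursive
// predictive parser for the transformed list grammar
//   S  -> ( L ) | a
//   L  -> S L'
//   L' -> , S L' | ε
// Each nonterminal becomes one method; one character of lookahead decides
// which production to apply.
final class ListParser {
    private static final char END = '\0'; // assumed end-of-input marker

    private final String input;
    private int pos = 0;

    ListParser(String input) {
        this.input = input;
    }

    /** Returns the lookahead character, or END when the input is exhausted. */
    private char peek() {
        return pos < input.length() ? input.charAt(pos) : END;
    }

    /** Consumes the expected character, or stops at the first syntax error. */
    private void match(char expected) {
        if (peek() != expected) {
            throw new IllegalStateException(
                "Unexpected character at position " + pos);
        }
        pos++;
    }

    // S -> ( L ) | a
    private void parseS() {
        if (peek() == '(') {
            match('(');
            parseL();
            match(')');
        } else {
            match('a');
        }
    }

    // L -> S L'
    private void parseL() {
        parseS();
        parseLPrime();
    }

    // L' -> , S L' | ε
    private void parseLPrime() {
        if (peek() == ',') {
            match(',');
            parseS();
            parseLPrime();
        }
        // Otherwise the ε-production is taken; the caller checks for ')'.
    }

    /** Returns true if the whole input is a sentence of the grammar. */
    static boolean accepts(String input) {
        ListParser parser = new ListParser(input);
        try {
            parser.parseS();
        } catch (IllegalStateException e) {
            return false; // no error recovery: the first error ends parsing
        }
        return parser.pos == input.length(); // reject trailing characters
    }
}
```

For instance, ListParser.accepts("(a,(a,a))") returns true, while ListParser.accepts("(a,)") returns false because the parser meets ')' where an S is expected. A parser for MC would follow the same pattern, but with one method per nonterminal of your own grammar and with the lookahead taken from the provided scanner's tokens.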