Chapter 1 Intro to Compiling:
A compiler take the "source" language and translate to another language "target" language. It is important for the compiler to report to the user any errors in the source program.
There are two parts of compilation: analysis and synthesis.
Analysis: breaks down the source program into a tree structure, called syntax tree.
Synthesis: constructs the desired target program from the intermediate representation.
source program -> compiler -> assembler -> loader/link-editor
Lexical Analysis: linear analysis or scanning, how computer reads the source program, reads the characters in source program and group them into a stream of tokens, it identifies keywords, expressions and statements.
Syntax Analysis: parsing, grouping the tokens of the source program into grammatical phrases to synthesize output.
Semantic Analysis: Checks for semantic errors then gathers type information for the subsequent code-generation phase.
lexical analyzer -> syntax analyzer -> semantic analyzer -> intermediate code generator -> code optimizer -> code generator -> target program
during each of the phases above the program may encounter errors, compiler may try to deal with the errors and keep going while other errors most stops the program.
Last step of the compiler generates machine code or assembly code. memory are allocated for the program.
Preprocessors: produce input to compilers, macro processing, file inclusion, ration preprocessors, language extension.
Assemblers: assembly code is mnemonic version of machine code, which names are used instead of binary codes for operation, names are given memory addresses. Some assembly takes two passes at the input. First identify and denote storage location, second translate operation into bits.
Chapter 2 A simple one-pass compiler:
Syntax: what the language looks like
Semantics: what the language mean
character stream -> lexical analyzer -> token stream -> syntax directed translator -> intermediate representation
syntax definition:
if (expression) statement else statement
keyword(terminals) token(nonterminals)
some operands has a higher precedence 9+5*2
Depth-First Traversals: visit an unvisited child of a node whenever it can, tries to visit nodes as far away from the root as quickly as it can.
9-5+2
95-2+
9 and 5 are the bottom of the tree as leaves, one level up is - then 2 then lastly +
Tree is constructed by precedence level, operators at the same precedence level are evaluated left to right. Where the left most operators has less precedence than right operators.
Evaluation is then done on the stack.
expression -> expression + term = infinite loop
no changes to the input takes between recursive calls
The current token being scanned in the input is frequently referred as the lookahead symbol.
count = count + increment;
id = id + id;
input -> lexical analyzer -> parser
A data structure called a symbol table is generally used to store information about various source language constructs.
FrontEnd constructs the intermediate representation of the source program.
BackEnd generates the target program.
Stack manipulation:
push v: push v onto the stack
rvalue l: push contents of data location l
lvalue l: push address of data location l
pop: throw away value of top of the stack
:= the r-value on top is placed in the l-value below it and both are popped
copy: push a copy of the top value on the stack
a+b
rvalue a
rvalue b
+
Control Flow:
label l: target of jumps to l, no other effect
goto l: next instruction is taken from statement with label l
gofalse l: pop the top value; jump if it is zero
gotrue l: pop the top value; jump if it nonzero
halt: stop execution
Lexical Analysis: Reads the the code strip out white space, identify symbols and tokens
Parser: Check for syntax error, constructs functions.
to be continued... page 72
A compiler take the "source" language and translate to another language "target" language. It is important for the compiler to report to the user any errors in the source program.
There are two parts of compilation: analysis and synthesis.
Analysis: breaks down the source program into a tree structure, called syntax tree.
Synthesis: constructs the desired target program from the intermediate representation.
source program -> compiler -> assembler -> loader/link-editor
Lexical Analysis: linear analysis or scanning, how computer reads the source program, reads the characters in source program and group them into a stream of tokens, it identifies keywords, expressions and statements.
Syntax Analysis: parsing, grouping the tokens of the source program into grammatical phrases to synthesize output.
Semantic Analysis: Checks for semantic errors then gathers type information for the subsequent code-generation phase.
lexical analyzer -> syntax analyzer -> semantic analyzer -> intermediate code generator -> code optimizer -> code generator -> target program
during each of the phases above the program may encounter errors, compiler may try to deal with the errors and keep going while other errors most stops the program.
Last step of the compiler generates machine code or assembly code. memory are allocated for the program.
Preprocessors: produce input to compilers, macro processing, file inclusion, ration preprocessors, language extension.
Assemblers: assembly code is mnemonic version of machine code, which names are used instead of binary codes for operation, names are given memory addresses. Some assembly takes two passes at the input. First identify and denote storage location, second translate operation into bits.
Chapter 2 A simple one-pass compiler:
Syntax: what the language looks like
Semantics: what the language mean
character stream -> lexical analyzer -> token stream -> syntax directed translator -> intermediate representation
syntax definition:
if (expression) statement else statement
keyword(terminals) token(nonterminals)
some operands has a higher precedence 9+5*2
Depth-First Traversals: visit an unvisited child of a node whenever it can, tries to visit nodes as far away from the root as quickly as it can.
9-5+2
95-2+
9 and 5 are the bottom of the tree as leaves, one level up is - then 2 then lastly +
Tree is constructed by precedence level, operators at the same precedence level are evaluated left to right. Where the left most operators has less precedence than right operators.
Evaluation is then done on the stack.
expression -> expression + term = infinite loop
no changes to the input takes between recursive calls
The current token being scanned in the input is frequently referred as the lookahead symbol.
count = count + increment;
id = id + id;
input -> lexical analyzer -> parser
A data structure called a symbol table is generally used to store information about various source language constructs.
FrontEnd constructs the intermediate representation of the source program.
BackEnd generates the target program.
Stack manipulation:
push v: push v onto the stack
rvalue l: push contents of data location l
lvalue l: push address of data location l
pop: throw away value of top of the stack
:= the r-value on top is placed in the l-value below it and both are popped
copy: push a copy of the top value on the stack
a+b
rvalue a
rvalue b
+
Control Flow:
label l: target of jumps to l, no other effect
goto l: next instruction is taken from statement with label l
gofalse l: pop the top value; jump if it is zero
gotrue l: pop the top value; jump if it nonzero
halt: stop execution
Lexical Analysis: Reads the the code strip out white space, identify symbols and tokens
Parser: Check for syntax error, constructs functions.
to be continued... page 72
Comments
Post a Comment