# Shared Flashcard Set

## Details

Comp2010 - Flashcard Set 1 - Lexical Analysis
Comp2010 - Flashcard Set 1 - Lexical Analysis
18
Computer Science
05/27/2013

Term
 What does lexical analysis consist of?
Definition
 Splits the source into words or tokents Computes semantic values Convert numeric strings into numebrs Convert names into identifiers Discard white space (spaces & comments)
Term
 What is a token?
Definition
 A token is a sequence of characters that is treated as a unit in the grammar of the programming language. A specific semantic value can have different lexemes: Sematic value: REAL(42.0) Lexemes: 4.2e1, 42.0, 42.
Term
 Why a separate lexical analysis makes syntax analysis simpler and easier?
Definition
 if spaces and comments are allowed everywhere, we can rewrite "if n==/*base*/ 0 then return 0" to "IF ID(n) ROP(==) NUM(0) THEN RETURN NUM(0)" Tokens group characters and encapsulate variability Syntax rules become simpler Scanner uses more efficient mechanism than parser Finite-state machine vs. push-down automaton
Term
 What would be a theory based approach to lexical analysis?
Definition
 Regular expressions to describe tokens Conversion  Non-deterministic finite automaton Subset construction Deterministic finite automaton Minimization Linear-time algorithm in C, Java to recognize tokns
Term
 Define alphabet, string and language
Definition
 Alphabet is a finite set of symbols - string is a finite sequence of symbols in the alphabet - language is a possibly infinite set of string
Term
 What is a regular expression M?
Definition
 concisely describes a regular language L(M) Finite representation of infinite sets Built form operators: a ε | • * ( ) +
Term
 Define a finite automaton
Definition
 Consists of: A set of input symbols (alphabet) ∑ a finite set of states S a finite set of labeled edges E proper subset of (∑ U ε) x S x S  Notation : s1 -a> s2 A set of final states F proper subset of S
Term
 What is the difference between a deterministic and non-deterministic finite automata?
Definition
 Difference is in the edges: A non deterministic finite automaton (NFA) May have edges with label ε (ε-edges) May have multiple edges with the same label (starting at the same state) A deterministic finte automaton (DFA) Has no ε-edges Has at most one edge with label a for each state s and symbol a
Term
 How can regular expressions be converted into finite automata?
Definition
 McNaughton/Yamada/Tompson-algorithm: Recurses over structure fo regular expression Defines one pattern for each opearator X | Y would be state s1 and s2 where x and y would both have an edge from s1 to s2 XY would be states s1, s2, s3, where s1 -x> s2 and s2 -y> s3 X* would be s1 -x> s1 uses ε-edges to glue together sub-automata each automaton has a single accepting state has a start state with no incoming edge
Term
 What is the formal definition of acceptance of a string by a DFA?
Definition
 A string a1, a2, ... an is accepted by a DFA if there is a sequence s1, s2, ... sn+1 of staets with s1 is the initial state sn+1 is a final state s1 -a1> s2 ... sn -an> Sn+1 DFA accepts ε iff s1 is in F
Term
 What is the formal definition of acceptance of a string by a DFA?
Definition
 A string a1, a2, ... an is accepted by a DFA if there is a sequence s1, s2, ... sn+1 and t1 ... tn states with s1 is the initial state sn+1 is a final state t1 -a1> s2 ... tn -an> Sn+1 ti is a ε-closure(si) iff there exists a sequence of s2, ..., s-1 states with s -ε> s2, ... sn-1 -ε> t DFA accepts ε iff s1 is in F
Term
 How is a DFA runned over a string to check acceptace?
Definition
 Algorithm curr = s1; for i = 1 to n curr:= E[curr, w[i]];  return curr in F; w = string to be checked E = edges represented as state array indexed by state x symbol curr  = current state
Term
 What do we need to define to convert an NFA into DFA?
Definition
 DFA state ==  set(NFA states) DFA edge == NFA edge + ε-edges   We let: s, t be NFA states T be a set of NFA states (= single DFA state) 'a' be an input symbol We define move(T, a) = {t | Ê s in T and S -a> t} ε-closure(T) = union of all s in T of ε-closure(s)
Term
 What is the algorithm for a subset construction?
Definition
 Let D = set of DFA states Let DTran[T, a] be the DFA transition table  D = ε-closure(s1); while (There exists unmarked state T in D) { mark T; for each (a in Σ) { V = ε-closure(Move(T,a)); if (V not in D) D = D U {V}; DTran[T, a] = V; } }
Term
 What is ε-closure of a state s?
Definition
 It is all the states you can reach from state s by following all the ε paths
Term
 Intuitively explain the conversion from NFA to DFA
Definition
 http://www.youtube.com/watch?v=taClnxU-nao We write a table where the columns are the transition inputs followed by ε*, and the rows are the states - initially we just have the starting state  We take the starting state and see all the possible transitions with the inputs followed by ε* and we write the results in the respective cell all the states we can reach with that input ε* We then add all the states we obtained as a state (if they contain a final state, they are a final state, so we circle them We then draw all the states we obtained, and we join them with their corresponding inputs
Term
 How to minimize a DFA?
Definition
 Add "dead" state so that transitions are defined for all symbols in all states Partition states into final and non-final Partition each group so that two states s, t are in the same (new) subgroup iff for all a in ∑, s -a> S' and t -a> t' then s' and t' are in the same (old) group Repeat 3 until no further change Merge equivalent states Remove dead states
Term
 Explain intuitively how to minimize a DFA
Definition
 http://www.youtube.com/watch?v=TvMEX2htBYw From the DFA we make two sets, the first set will hav all the final sets, and the second set will have the rest of the sets We take one set and for each element of that set we try all the inputs and observe which are the states it reaches.  If the states it reaches are the same states that we have in the set we don't have to do anything Otherwise, if it reaches states outside its own set we classify it as a new set. If elements reach the same sets, they would belong to the same set We repeat S3 until we see no changes
Supporting users have an ad free experience!