Shared Flashcard Set

Details

Comp2010 - Flashcard Set 1 - Lexical Analysis
Comp2010 - Flashcard Set 1 - Lexical Analysis
18
Computer Science
Undergraduate 2
05/27/2013

Additional Computer Science Flashcards

 


 

Cards

Term
What does lexical analysis consist of?
Definition
  • Splits the source into words or tokents
  • Computes semantic values
    • Convert numeric strings into numebrs
    • Convert names into identifiers
  • Discard white space (spaces & comments)
Term
What is a token?
Definition

A token is a sequence of characters that is treated as a unit in the grammar of the programming language.

A specific semantic value can have different lexemes:

Sematic value: REAL(42.0)

Lexemes: 4.2e1, 42.0, 42.

Term
Why a separate lexical analysis makes syntax analysis simpler and easier?
Definition
  • if spaces and comments are allowed everywhere, we can rewrite "if n==/*base*/ 0 then return 0" to "IF ID(n) ROP(==) NUM(0) THEN RETURN NUM(0)"
  • Tokens group characters and encapsulate variability
    • Syntax rules become simpler
  • Scanner uses more efficient mechanism than parser
    • Finite-state machine vs. push-down automaton
Term
What would be a theory based approach to lexical analysis?
Definition
  1. Regular expressions to describe tokens
    • Conversion 
  2. Non-deterministic finite automaton
    • Subset construction
  3. Deterministic finite automaton
    • Minimization
  4. Linear-time algorithm in C, Java to recognize tokns
Term
Define alphabet, string and language
Definition
Alphabet is a finite set of symbols - string is a finite sequence of symbols in the alphabet - language is a possibly infinite set of string
Term
What is a regular expression M?
Definition
  • concisely describes a regular language L(M)
    • Finite representation of infinite sets
    • Built form operators: a ε | • * ( ) + 
Term
Define a finite automaton
Definition

Consists of:

  • A set of input symbols (alphabet) ∑
  • a finite set of states S
  • a finite set of labeled edges E proper subset of (∑ U ε) x S x S 
    • Notation : s1 -a> s2
  • A set of final states F proper subset of S
Term
What is the difference between a deterministic and non-deterministic finite automata?
Definition

Difference is in the edges:

  • A non deterministic finite automaton (NFA)
    • May have edges with label ε (ε-edges)
    • May have multiple edges with the same label (starting at the same state)
  • A deterministic finte automaton (DFA)
    • Has no ε-edges
    • Has at most one edge with label a for each state s and symbol a
Term
How can regular expressions be converted into finite automata?
Definition

McNaughton/Yamada/Tompson-algorithm:

  • Recurses over structure fo regular expression
    • Defines one pattern for each opearator
      • X | Y would be state s1 and s2 where x and y would both have an edge from s1 to s2
      • XY would be states s1, s2, s3, where s1 -x> s2 and s2 -y> s3
      • X* would be s1 -x> s1
    • uses ε-edges to glue together sub-automata
    • each automaton
      • has a single accepting state
      • has a start state with no incoming edge
Term
What is the formal definition of acceptance of a string by a DFA?
Definition
  • A string a1, a2, ... an is accepted by a DFA if there is a sequence s1, s2, ... sn+1 of staets with
    • s1 is the initial state
    • sn+1 is a final state
    • s1 -a1> s2 ... sn -an> Sn+1
  • DFA accepts ε iff s1 is in F

 

Term

What is the formal definition of acceptance of a string by a DFA?

 

Definition
  • A string a1, a2, ... an is accepted by a DFA if there is a sequence s1, s2, ... sn+1 and t1 ... tn states with
    • s1 is the initial state
    • sn+1 is a final state
    • t1 -a1> s2 ... tn -an> Sn+1
    • ti is a ε-closure(si) iff there exists a sequence of s2, ..., s-1 states with
      • s -ε> s2, ... sn-1 -ε> t
  • DFA accepts ε iff s1 is in F
Term
How is a DFA runned over a string to check acceptace?
Definition

Algorithm

  • curr = s1;
  • for i = 1 to n
    • curr:= E[curr, w[i]]; 
  • return curr in F;

w = string to be checked

E = edges represented as state array indexed by state x symbol

curr  = current state

Term
What do we need to define to convert an NFA into DFA?
Definition

DFA state ==  set(NFA states)

DFA edge == NFA edge + ε-edges

 

  • We let:
    • s, t be NFA states
    • T be a set of NFA states (= single DFA state)
    • 'a' be an input symbol
  • We define
    • move(T, a) = {t | Ê s in T and S -a> t}
    • ε-closure(T) = union of all s in T of ε-closure(s)
Term
What is the algorithm for a subset construction?
Definition

Let D = set of DFA states

Let DTran[T, a] be the DFA transition table 

  • D = ε-closure(s1);
  • while (There exists unmarked state T in D) {
    • mark T;
    • for each (a in Σ) {
      • V = ε-closure(Move(T,a));
      • if (V not in D) D = D U {V};
      • DTran[T, a] = V;
    • }
  • }
Term
What is ε-closure of a state s?
Definition

It is all the states you can reach from state s by following all the ε paths

Term
Intuitively explain the conversion from NFA to DFA
Definition

http://www.youtube.com/watch?v=taClnxU-nao

  1. We write a table where the columns are the transition inputs followed by ε*, and the rows are the states - initially we just have the starting state 
  2. We take the starting state and see all the possible transitions with the inputs followed by ε* and we write the results in the respective cell all the states we can reach with that input ε*
  3. We then add all the states we obtained as a state (if they contain a final state, they are a final state, so we circle them
  4. We then draw all the states we obtained, and we join them with their corresponding inputs
Term
How to minimize a DFA?
Definition
  1. Add "dead" state so that transitions are defined for all symbols in all states
  2. Partition states into final and non-final
  3. Partition each group so that two states s, t are in the same (new) subgroup iff for all a in ∑, s -a> S' and t -a> t' then s' and t' are in the same (old) group
  4. Repeat 3 until no further change
  5. Merge equivalent states
  6. Remove dead states
Term
Explain intuitively how to minimize a DFA
Definition

http://www.youtube.com/watch?v=TvMEX2htBYw

  1. From the DFA we make two sets, the first set will hav all the final sets, and the second set will have the rest of the sets
  2. We take one set and for each element of that set we try all the inputs and observe which are the states it reaches. 
    • If the states it reaches are the same states that we have in the set we don't have to do anything
    • Otherwise, if it reaches states outside its own set we classify it as a new set.
      • If elements reach the same sets, they would belong to the same set
  3. We repeat S3 until we see no changes
Supporting users have an ad free experience!