Regular Language Closure Properties
Since DFAs are equivalent to NFAs, we can talk only about "acceptance by an automaton," using whichever type, NFA or DFA, suits us. The main result we want to prove is the following set of closure properties.
Theorem: Languages accepted by automata are closed under union, concatenation, Kleene*, complementation, and intersection.
Special form for finite automata
This normalization concept is discussed in the textbook on page 81. Normalization simplifies many of the necessary constructions. Every finite automaton is equivalent to one with these properties:
- there is only one final state
- the final state and start state are distinct (and so the start state is non-final)
- there are no transitions into the start state
- there are no transitions out of the final state
The argument is simple:
- If necessary, create a new state s′, add the transition (s′,ε,s) to Δ, and make s′ the new start state.
- If necessary, create a new state f′. Add (f,ε,f′) to Δ, for each f ∈ F. Set F = {f′}.
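The two steps above can be sketched in Python. This is only an illustration under stated assumptions: transitions are a set of (p, x, q) triples with x == "" standing for ε, the function name `to_special_form` and the fresh state names "s'" and "f'" are made up here, and the sketch always adds both new states instead of checking whether they are necessary (which is harmless but may add redundant states).

```python
def to_special_form(states, delta, start, finals):
    """Return (states, delta, start, final) for an equivalent special-form NFA.

    delta is a set of (p, x, q) triples, with x == "" standing for ε.
    """
    new_start, new_final = "s'", "f'"          # assumed fresh state names
    assert new_start not in states and new_final not in states
    new_delta = set(delta)
    new_delta.add((new_start, "", start))       # (s', ε, s)
    for f in finals:
        new_delta.add((f, "", new_final))       # (f, ε, f') for each f in F
    return set(states) | {new_start, new_final}, new_delta, new_start, new_final


# DFA for "even number of a's": q0 is both the start state and the final state.
K, D, s, f = to_special_form(
    {"q0", "q1"},
    {("q0", "a", "q1"), ("q1", "a", "q0"), ("q0", "b", "q0"), ("q1", "b", "q1")},
    "q0",
    {"q0"},
)
no_edges_into_start = all(q != s for (_, _, q) in D)
no_edges_out_of_final = all(p != f for (p, _, _) in D)
```

After the call, the result has a single final state distinct from the start state, with no edges into the start and none out of the final state.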
The closure proofs for union, concatenation, and Kleene-star use NFA(s) in special form and construct a derived NFA also in special form. The proofs of union and concatenation assume languages L1 and L2 with corresponding special form NFAs M1 = (K1,Σ,Δ1,s1,F1) and M2 = (K2,Σ,Δ2,s2,F2), where special form means F1 = {f1} and F2 = {f2}. Assume that K1 and K2 have no states in common. Both proofs start like this:
- create the state set K = K1 ∪ K2 (states of both)
- create the transition relation Δ = Δ1 ∪ Δ2 ⊆ K×(Σ∪{ε})×K (the transitions of both)
Union:
The idea is to merge the start states into a single start state and do the same for the final states. Merging means that we add new start and final states s and f to K, representing the start state set {s1,s2} and the final state set {f1,f2}, respectively.
The reason this works is that neither start state has incoming edges and neither final state has outgoing edges. Any string which is accepted must leave the start state, follow a path that stays exclusively inside one machine or the other, and then end in the final state without re-entering it. The idea is conceptually simple, but here are the details:
- for each transition (s1,x,q) ∈ Δ1 (x = ε, or x = σ ∈ Σ): add (s,x,q) and remove (s1,x,q)
- for each transition (s2,x,q) ∈ Δ2: add (s,x,q) and remove (s2,x,q)
- remove s1 and s2 from K
- make s the start state
- for each transition (q,x,f1) ∈ Δ1: add (q,x,f) and remove (q,x,f1)
- for each transition (q,x,f2) ∈ Δ2: add (q,x,f) and remove (q,x,f2)
- remove f1 and f2 from K
- make f a final state (the only one)
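The union steps can be sketched as a renaming of states, which has the same effect as adding and removing the individual transitions. The following is a minimal illustration, assuming NFAs are given as tuples (states, delta, start, final) in special form, with delta a set of (p, x, q) triples and "" for ε. The helper names `accepts` and `union_nfa` and the state names "s" and "f" are hypothetical.

```python
def accepts(delta, start, final, w):
    """Simulate an NFA given as (p, x, q) triples; x == "" stands for ε."""
    def closure(states):
        todo, seen = list(states), set(states)
        while todo:
            p = todo.pop()
            for (a, x, b) in delta:
                if a == p and x == "" and b not in seen:
                    seen.add(b)
                    todo.append(b)
        return seen
    current = closure({start})
    for ch in w:
        current = closure({b for (a, x, b) in delta if a in current and x == ch})
    return final in current


def union_nfa(m1, m2, s="s", f="f"):
    """Merge start states into s and final states into f (names assumed fresh)."""
    (k1, d1, s1, f1), (k2, d2, s2, f2) = m1, m2
    assert not (k1 & k2), "state sets must be disjoint"
    assert s not in k1 | k2 and f not in k1 | k2
    merged = {s1: s, s2: s, f1: f, f2: f}
    rename = lambda q: merged.get(q, q)
    states = {rename(q) for q in k1 | k2}
    delta = {(rename(p), x, rename(q)) for (p, x, q) in d1 | d2}
    return states, delta, s, f


# Special-form NFAs for the one-string languages {ab} and {aa}.
m_ab = ({1, 2, 3}, {(1, "a", 2), (2, "b", 3)}, 1, 3)
m_aa = ({4, 5, 6}, {(4, "a", 5), (5, "a", 6)}, 4, 6)
K, D, s, f = union_nfa(m_ab, m_aa)
```

The merged machine keeps only the interior states 2 and 5 plus the new s and f, and it accepts exactly ab and aa.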
Concatenation:
The idea is to create a non-final merged state {f1,s2} and remove both f1 and s2. An accepting string must start at s1, pass through the merged state having read some w1 ∈ L1, and continue to f2 reading some w2 ∈ L2, thereby creating the accepting string w1·w2 ∈ L1·L2.
Add a new state j (for join) to K which represents the merge of states {f1,s2}. Then:
- for each transition (q,x,f1) ∈ Δ1: add (q,x,j) and remove (q,x,f1)
- for each transition (s2,x,q) ∈ Δ2: add (j,x,q) and remove (s2,x,q)
- remove f1 and s2 from K
- keep s1 as the start state and f2 as the only final state
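A minimal sketch of the concatenation construction, under the same assumptions as before: NFAs are tuples (states, delta, start, final) in special form, delta is a set of (p, x, q) triples, and "" stands for ε. The names `accepts`, `concat_nfa`, and the join state "j" are illustrative, not taken from the text's source.

```python
def accepts(delta, start, final, w):
    """Simulate an NFA given as (p, x, q) triples; x == "" stands for ε."""
    def closure(states):
        todo, seen = list(states), set(states)
        while todo:
            p = todo.pop()
            for (a, x, b) in delta:
                if a == p and x == "" and b not in seen:
                    seen.add(b)
                    todo.append(b)
        return seen
    current = closure({start})
    for ch in w:
        current = closure({b for (a, x, b) in delta if a in current and x == ch})
    return final in current


def concat_nfa(m1, m2, j="j"):
    """Merge f1 and s2 into the join state j (name assumed fresh)."""
    (k1, d1, s1, f1), (k2, d2, s2, f2) = m1, m2
    assert not (k1 & k2), "state sets must be disjoint"
    assert j not in k1 | k2
    rename = lambda q: j if q in (f1, s2) else q
    states = {rename(q) for q in k1 | k2}
    delta = {(rename(p), x, rename(q)) for (p, x, q) in d1 | d2}
    return states, delta, s1, f2      # start stays s1, final stays f2


m_ab = ({1, 2, 3}, {(1, "a", 2), (2, "b", 3)}, 1, 3)
m_aa = ({4, 5, 6}, {(4, "a", 5), (5, "a", 6)}, 4, 6)
K, D, s, f = concat_nfa(m_ab, m_aa)
```

Here the result accepts abaa and nothing else, since the only path from state 1 to state 6 passes through j.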
Kleene-star:
Create an automaton in special form for L with start state s and final state f. Create a new start state s′ and a new final state f′. Add the following transitions to Δ:
- (s′,ε,s): connect to the old start state
- (f,ε,f′): connect from the old final state
- (f,ε,s): loop from the old final state to the old start state to allow repetitions
- (s′,ε,f′): ensure that ε is accepted
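The four added transitions can be written out directly. This is a sketch under the same representation assumptions as elsewhere in these notes' examples (tuples (states, delta, start, final), triples (p, x, q), "" for ε); `accepts`, `star_nfa`, and the names "s'" and "f'" are hypothetical.

```python
def accepts(delta, start, final, w):
    """Simulate an NFA given as (p, x, q) triples; x == "" stands for ε."""
    def closure(states):
        todo, seen = list(states), set(states)
        while todo:
            p = todo.pop()
            for (a, x, b) in delta:
                if a == p and x == "" and b not in seen:
                    seen.add(b)
                    todo.append(b)
        return seen
    current = closure({start})
    for ch in w:
        current = closure({b for (a, x, b) in delta if a in current and x == ch})
    return final in current


def star_nfa(m, new_start="s'", new_final="f'"):
    """Build a special-form NFA for L* from a special-form NFA for L."""
    k, d, s, f = m
    assert new_start not in k and new_final not in k
    delta = set(d) | {
        (new_start, "", s),          # connect to the old start state
        (f, "", new_final),          # connect from the old final state
        (f, "", s),                  # loop back to allow repetitions
        (new_start, "", new_final),  # make sure ε is accepted
    }
    return set(k) | {new_start, new_final}, delta, new_start, new_final


m_ab = ({1, 2, 3}, {(1, "a", 2), (2, "b", 3)}, 1, 3)
K, D, s, f = star_nfa(m_ab)
```

Note that the result is again in special form: nothing enters s′ and nothing leaves f′, which is why the extra pair of states is needed.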
Complementation:
Take a DFA for L (convert an NFA to a DFA using the Rabin-Scott construction). Then interchange final and non-final states, namely, set F = K - F. Note that all transitions must be present (i.e., it must be a true DFA) for this construction to work properly.
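The caveat about missing transitions can be made concrete with a small sketch. It assumes a DFA is given as a dict mapping (state, symbol) to the next state; the helper names `complete`, `complement`, and `run`, and the state name "dead", are made up for illustration. The sketch always adds a dead state before swapping final and non-final states; if the DFA was already complete, the extra state is simply unreachable.

```python
def complete(states, trans, alphabet, dead="dead"):
    """Add a dead state so that every (state, symbol) pair has a transition."""
    states, trans = set(states) | {dead}, dict(trans)
    for q in states:
        for c in alphabet:
            trans.setdefault((q, c), dead)
    return states, trans


def complement(states, trans, start, finals, alphabet):
    """Complete the DFA, then interchange final and non-final states."""
    states, trans = complete(states, trans, alphabet)
    return states, trans, start, states - set(finals)   # F = K - F


def run(trans, start, finals, w):
    """Run a complete DFA on the string w."""
    q = start
    for c in w:
        q = trans[(q, c)]
    return q in finals


# Partial DFA for "no substring bbb": state i means i trailing b's so far.
partial = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 2, (2, "a"): 0}
K, T, s, F = complement({0, 1, 2}, partial, 0, {0, 1, 2}, "ab")
```

Without the completion step, every state of the partial DFA is final, so swapping would leave no final states at all and the "complement" would wrongly accept nothing.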
Intersection:
We illustrated the intersection construction before using the Cartesian product of the state sets. However, closure under intersection also follows because you can derive intersection from union and complementation (DeMorgan's Laws):
L1 ∩ L2 = Σ* - ( (Σ* - L1) ∪ (Σ* - L2) )
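Both routes can be compared on a small example. The sketch below assumes two complete DFAs given as tuples (states, trans, start, finals) with trans a dict from (state, symbol) to state; `product_dfa` and `run` are illustrative names. It builds the Cartesian product and then checks that the DeMorgan formula, applied to the product's final states, picks out the same set.

```python
from itertools import product


def product_dfa(d1, d2, alphabet):
    """Cartesian-product DFA accepting the intersection of two complete DFAs."""
    (k1, t1, s1, f1), (k2, t2, s2, f2) = d1, d2
    states = set(product(k1, k2))
    trans = {((p, q), c): (t1[(p, c)], t2[(q, c)])
             for (p, q) in states for c in alphabet}
    finals = {(p, q) for (p, q) in states if p in f1 and q in f2}
    return states, trans, (s1, s2), finals


def run(trans, start, finals, w):
    q = start
    for c in w:
        q = trans[(q, c)]
    return q in finals


# d1: even number of a's.  d2: string ends with b.
d1 = ({"e", "o"},
      {("e", "a"): "o", ("o", "a"): "e", ("e", "b"): "e", ("o", "b"): "o"},
      "e", {"e"})
d2 = ({0, 1},
      {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1},
      0, {1})
K, T, s, F = product_dfa(d1, d2, "ab")

# DeMorgan on the product: complement of (complement of L1 ∪ complement of L2).
union_of_complements = {(p, q) for (p, q) in K if p not in d1[3] or q not in d2[3]}
de_morgan_finals = K - union_of_complements

words = ["".join(t) for n in range(6) for t in product("ab", repeat=n)]
agrees = all(run(T, s, F, w) == (w.count("a") % 2 == 0 and w.endswith("b"))
             for w in words)
```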
Equivalence of Regular Languages and Automata
This is Theorem 2.3.2 in the textbook. The result is credited to Kleene in 1956 and is called Kleene's Theorem.
I. Every regular language is accepted by an automaton.
The set of regular languages is the closure (smallest possible set of subsets of Σ*) containing ∅ and {σ}, for each σ ∈ Σ, under the operations union, concatenation, and Kleene*. It is evident that ∅ and each {σ} are accepted by automata, and so the class of regular languages must be contained in the class of languages accepted by automata. Here are depictions of the "base" NFAs in special form as defined above:
Technically, there is no need to make a "base-case" automaton for {ε} because {ε} = ∅*.
Comparison with textbook
Our construction of an NFA from regular expressions differs from the textbook's version based on our requirement that the NFA must be of the special form. The special form requirement makes union and concatenation simpler at the cost of making Kleene-star more complex. For example, for ab∪aa our construction generates this:
The textbook construction would give this:
Example 2.3.1
Try this for size: (ab∪aab)*. Our version is this:
The textbook construction would give this:
II. The language accepted by an automaton is regular.
Assume we have an automaton M in special form. Assume the states are the integers 0..n+1, for n ≥ 0. The start state is s=0 and the final state is f=n+1. The idea is to eliminate all the intermediate states 1..n in order, effectively adding regular expression transitions to make up for the eliminated states.
Define, for 0 ≤ i < n+1, 0 < j ≤ n+1, and 0 ≤ k ≤ n,
R(i,j,k) = { w ∈ Σ* : (i,w) ⊢* (j,ε) along a path with intermediate states ≤ k }
In particular, when k=0:
(A) R(i,j,0) = strings representing any edges from i to j with no intermediate states
Thus, R(i,j,0) = the union of symbols on transitions from i to j.
Continuing,
R(i,j,1) = strings representing any edges from i to j with only 1 as an intermediate state
Think of it like this:
= whatever strings we already have, plus any string from i to 1, omitting 1,
followed by zero or more repetitions of strings from 1 to 1, omitting 1,
followed by a string from 1 to j, omitting 1
Formally written, this is:
R(i,j,1) = R(i,j,0) ∪ R(i,1,0) · R(1,1,0)* · R(1,j,0)
This equation may seem problematic if either i or j is 1, but it is correct; e.g.
R(1,1,1) = R(1,1,0) ∪ R(1,1,0) · R(1,1,0)* · R(1,1,0)
Here R(1,1,1) is "from 1 to 1", the first term R(1,1,0) is a "single loop transition", and the product term R(1,1,0) · R(1,1,0)* · R(1,1,0) gives "2 or more loop transitions".
In general, for k > 0,
(B) R(i,j,k) = R(i,j,k-1) ∪ R(i,k,k-1) · R(k,k,k-1)* · R(k,j,k-1)
When k = n, we have:
R(i,j,n) = strings from i to j where you can pass through any intermediate state
In particular,
R(0,n+1,n) = the language accepted by the automaton
The claim is that, for all 0 ≤ i < n+1, 0 < j ≤ n+1, and 0 ≤ k ≤ n, R(i,j,k) is regular.
The base case is expressed as (A) above. The induction step is expressed by (B) above.
Thus we've proved our result. ■
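Equations (A) and (B) translate into a small dynamic program. The sketch below is one possible rendering, not the textbook's: regular expressions are built as strings in Python `re` syntax (| for ∪), None stands for ∅, "" stands for ε, alphabet symbols are assumed to be plain letters, and the helper names `union`, `concat`, `star`, and `kleene_regex` are made up. It computes R over all pairs of states for simplicity and returns R(0,n+1,n).

```python
import re
from itertools import product


def union(r, t):
    if r is None or t is None:
        return t if r is None else r
    return r if r == t else f"(?:{r}|{t})"


def concat(r, t):
    return None if r is None or t is None else r + t


def star(r):
    return "" if r in (None, "") else f"(?:{r})*"   # ∅* = ε* = {ε}


def kleene_regex(n, delta):
    """States are 0..n+1; delta is a set of (i, x, j) with x == "" for ε."""
    size = n + 2
    R = {(i, j): None for i in range(size) for j in range(size)}
    for (i, x, j) in sorted(delta):
        R[(i, j)] = union(R[(i, j)], x)                      # equation (A)
    for k in range(1, n + 1):                                # equation (B)
        prev, R = R, {}
        for i in range(size):
            for j in range(size):
                through_k = concat(concat(prev[(i, k)], star(prev[(k, k)])),
                                   prev[(k, j)])
                R[(i, j)] = union(prev[(i, j)], through_k)
    return R[(0, n + 1)]


# Special-form NFA for "even number of a's": 0 = s, 1 = q0, 2 = q1, 3 = f.
delta = {(0, "", 1), (1, "b", 1), (1, "a", 2), (2, "a", 1), (2, "b", 2), (1, "", 3)}
regex = kleene_regex(2, delta)

words = ["".join(t) for m in range(7) for t in product("ab", repeat=m)]
matches_even_a = all(bool(re.fullmatch(regex, w)) == (w.count("a") % 2 == 0)
                     for w in words)
```

The expression produced this way is correct but far from minimal, which is one reason the examples that follow eliminate states by hand in a chosen order.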
NFA to Regular Expression Examples
In the construction of the regular expression of an automaton, when a state is eliminated with n incoming (non-loop) edges and m outgoing (non-loop) edges, we will have to add n*m new regular expression edges and consolidate these with older ones.
The internal states (non-start and non-final) can be eliminated in any order; however, in practice it is best to eliminate states which are "less complex" first. The complexity of a state can be judged by these factors:
- the complexity of the loop expression, if there is one
- the value of n = # of incoming (non-loop) edges
- the value of m = # of outgoing (non-loop) edges
It may be impossible to completely quantify the complexity based on these three factors, but you can use them to weigh the choices of order of state elimination, in order of increasing complexity. As the construction proceeds, these complexity measures may change and you may need to revise the order of elimination of the remaining states. When you get down to the final state, you can usually read off the answer directly from the graph.
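A single elimination step, together with the n and m counts used to judge complexity, might be sketched as follows. The representation is an assumption: edges are a dict {(p, q): regex} labelled in Python `re` syntax, with "" for ε, and the function names `complexity` and `eliminate` are illustrative. Every label is wrapped in a non-capturing group, so the output is verbose but safe to concatenate.

```python
import re
from itertools import product


def complexity(edges, q):
    """Return (n, m): incoming and outgoing non-loop edge counts of state q."""
    n = sum(1 for (p, t) in edges if t == q and p != q)
    m = sum(1 for (p, t) in edges if p == q and t != q)
    return n, m


def eliminate(edges, q):
    """Remove state q, adding one new edge per (incoming, outgoing) pair."""
    loop = edges.get((q, q))
    loop_re = f"(?:{loop})*" if loop is not None else ""
    incoming = {p: r for (p, t), r in edges.items() if t == q and p != q}
    outgoing = {t: r for (p, t), r in edges.items() if p == q and t != q}
    result = {e: r for e, r in edges.items() if q not in e}
    for p, r_in in incoming.items():            # n incoming edges
        for t, r_out in outgoing.items():       # m outgoing edges: n*m new edges
            new = f"(?:{r_in}){loop_re}(?:{r_out})"
            old = result.get((p, t))            # consolidate with an older edge
            result[(p, t)] = new if old is None else f"(?:{old}|{new})"
    return result


# Special-form NFA for "even number of a's".
edges = {("s", "q0"): "", ("q0", "q0"): "b", ("q0", "q1"): "a",
         ("q1", "q1"): "b", ("q1", "q0"): "a", ("q0", "f"): ""}
counts = {q: complexity(edges, q) for q in ("q0", "q1")}

step1 = eliminate(edges, "q1")      # q1 is less complex, so it goes first
step2 = eliminate(step1, "q0")
regex = step2[("s", "f")]

words = ["".join(t) for k in range(7) for t in product("ab", repeat=k)]
matches_even_a = all(bool(re.fullmatch(regex, w)) == (w.count("a") % 2 == 0)
                     for w in words)
```

After both eliminations only the edge from s to f remains, and its label is the regular expression for the language.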
Example 1
Consider the DFA for the language over {a,b} of strings with an even number of a's:
First put this into special form NFA:
State q0 has 2 incoming and 2 outgoing edges whereas state q1 has only one of each. So eliminate q1 first, getting:
The final regular expression can be read as: (b∪ab*a)*
Example 2
Consider the DFA for the language over {a,b} of strings with an odd number of a's:
First put this into special form NFA:
State q0 has 2 incoming edges and 1 outgoing edge whereas state q1 has 1 incoming and 2 outgoing, so they're basically equal. We choose q0 to eliminate first, getting:
The final regular expression can be read as: b*a(b∪ab*a)*
Example 3
Consider the language over {a,b} of strings which have bb as a substring. Using a DFA that we have already constructed, put this into a special form NFA:
State q2 is obviously the simplest, so eliminate it, getting:
Of the two remaining, q1 is the simplest. Eliminate it, getting:
The final regular expression can be read as: (a∪ba)*bb(a∪b)*.
Of course, (a∪b)*bb(a∪b)* is the more obvious regular expression, but the one derived here from the DFA expresses the first occurrence of bb in the string.
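The regular expressions read off in Examples 1 to 3 can be sanity-checked by brute force. The sketch below uses Python's `re` module, writing ∪ as |, and compares each expression against a direct description of the language on all strings over {a,b} up to length 8. The helper name `same_language` is made up, and a finite check like this is evidence, not a proof.

```python
import re
from itertools import product

# All strings over {a, b} of length 0..8.
words = ["".join(t) for n in range(9) for t in product("ab", repeat=n)]


def same_language(regex, predicate):
    """Compare a regex with a predicate on every test word."""
    return all(bool(re.fullmatch(regex, w)) == predicate(w) for w in words)


ex1 = same_language(r"(b|ab*a)*", lambda w: w.count("a") % 2 == 0)
ex2 = same_language(r"b*a(b|ab*a)*", lambda w: w.count("a") % 2 == 1)
ex3 = same_language(r"(a|ba)*bb(a|b)*", lambda w: "bb" in w)
ex3_obvious = same_language(r"(a|b)*bb(a|b)*", lambda w: "bb" in w)
```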
Example 4
Consider this example that we have visited previously:
L = { w : w does not contain the substring bbb }
Here are two forms of the automaton. The first, you might say, is a partial DFA in that it is deterministic, but the dead state has been omitted. The second is the special form version which is the starting point of the algorithm.
States are numbered in order of what I consider increasing complexity according to the special form. Compare the factors:
state | # incoming edges | # outgoing edges |
---|---|---|
1 | 1 | 2 |
2 | 1 | 3 |
3 | 3 | 2 |
Here are the steps. To help minimize complexity, we will use the convenience quantifiers ?, +, {m,n} when it suits, namely
(r)? = r ∪ {ε}
(r)+ = r·r*
(r){m,n} = r^m ∪ r^(m+1) ∪ ... ∪ r^n
Eliminate 1:
Use the original special form NFA above. The path [(2,b,1), (1,ε,f)] gives (2,b,f). Combining with (2,ε,f) gives (2,b?,f). The path [(2,b,1), (1,a,3)] gives (2,ba,3). Combining with (2,a,3) gives (2,b?a,3).
Eliminate 2:
Use the graph after eliminating 1. After comparing 2 and 3, we still think 2 is less complex. The path [(3,b,2), (2,b?,f)] gives (3,bb?,f). Combining with (3,ε,f) gives (3,b{0,2},f). The path [(3,b,2), (2,b?a,3)] gives (3,bb?a,3). Combining with (3,a,3) gives (3,b{0,2}a,3).
The final regular expression can be read as: (b{0,2}a)*b{0,2}.
The answer we get is in a surprisingly concise form which we may not have been able to derive easily without this algorithmic construction.
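As a quick check of the result, Python's `re` module happens to support the same {m,n} quantifier, so the expression can be tested as written. The sketch also spells out the quantifier by its definition, b{0,2} = ε ∪ b ∪ bb, and compares both forms against the language description on all strings up to length 8. This is a finite sanity check under those assumptions, not a proof.

```python
import re
from itertools import product

# All strings over {a, b} of length 0..8.
words = ["".join(t) for n in range(9) for t in product("ab", repeat=n)]

with_quantifiers = r"(b{0,2}a)*b{0,2}"
expanded = r"((|b|bb)a)*(|b|bb)"     # b{0,2} written out as ε ∪ b ∪ bb

ok_quantifiers = all(bool(re.fullmatch(with_quantifiers, w)) == ("bbb" not in w)
                     for w in words)
ok_expanded = all(bool(re.fullmatch(expanded, w)) == ("bbb" not in w)
                  for w in words)
```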
Source: https://www.cs.wcupa.edu/rkline/fcs/re-dfa-equiv.html