Loops from A->A A->B->A: 0(1(00)*1)*0 A->D->A: 1(0(11)*0)*1 A->B->C->D->A: 0(1(00)*1)*1(00)*01 A->D->C->B->A: 1(0(11)*0)*0(11)*10 A -> D Loop around from A to A, go to state D, loop around D. ((0(1(00)*1)*0)|(1(0(11)*0)*1)|(0(1(00)*1)*1(00)*01)|(1(0(11)*0)*0(11)*10))*1(0(11)*0)* Loop around from A to A, go to state B, loop around BCD, go to state C, loop around CD, go to state D ((0(1(00)*1)*0)|(1(0(11)*0)*1)|(0(1(00)*1)*1(00)*01)|(1(0(11)*0)*0(11)*10))*0(1(00)*1)*1(00)*0 ((0(1(00)*1)*0)|(1(0(11)*0)*1)|(0(1(00)*1)*1(00)*01)|(1(0(11)*0)*0(11)*10))*1(0(11)*0)* | ((0(1(00)*1)*0)|(1(0(11)*0)*1)|(0(1(00)*1)*1(00)*01)|(1(0(11)*0)*0(11)*10))*0(1(00)*1)*1(00)*0 ((0(1(00)*1)*0)|(1(0(11)*0)*1)|(0(1(00)*1)*1(00)*01)|(1(0(11)*0)*0(11)*10))* ( (1(0(11)*0)*)|(0(1(00)*1)*1(00)*0)) Takes us through all possible loops from A to A. Then we imagine that we never visit A again. PUMPING LEMMA: The pumping lemma is the primary tool for proving that certain languages are NOT regular. For any regular language, there is an integer p, such that ANY string in the language can be expressed as xyz such that: length(y) >= 1 length(xy) <= p AND x y^n z is in the language for all n >= 0 [xy*z] are all strings in the language. For example, The language {a,b} in which every a is immediately followed by a b: (b | ab)* If you imagine any string of length >=2 in the language you can always "pump the string". Take any string of length>=2 in the language. And look at the first two characters. aa: Couldn't possibly be. ab: x = empty, y = ab , z = everything after that. xy*z is always in the language (ab)*whatever (ab)whatever, (ab)(ab)whatever, (ab)(ab)(ab)whatever, whatever ba: Notice that if the two characters are ba, then that third character has to be a b. x = empty y = ba, z = everything after that. xy*z is always in the language: (ba)*whatever: (ba)(ba)(ba)(ba)(ba)whatever bb: x = empty y = bb z = everything after that: (bb)*whatever This means that p is the number we're looking for. Any string larger than p can be pumped. Consider the language of {a^n b^n}, in other words, the set of strings of all a's followed by all b's with the number of a's and b's being equal. empty, ab, aabb, aaabbb, aaaabbbb, aaaaabbbbb Is this regular? Can I find a regular expression for this language? Can I build a finite automaton for this language? The answer is no way, never. This is because it fails the pumping lemma. The pumping lemma says there must be an integer p, for which the pumping rules apply. Let's assume there is such a p. For ANY string less than p, we can break it up into xyz with len(y) >=1 len(xy) <= p xy*z are ALL in the language. Let's look at the string of p a's followed by p b's. There is a way to break it up into xyz, with len(xy) <= p. In this case, though, since there are p a's followed by p b's, xy must be a string of all a's. Since len(y) >= 1, y is also all a's, and there's at least one a in it. So, by the pumping lemma all the xy*z strings must also be part of the language. But since y is all a's, if apply the star more than once, I now have more a's than b's, generating strings that are not in the language. This is a contradiction because the pumping lemma says that all the strings must also be part of the language. So, we have a contradiction, and our assumption is false. a^n b^n does not satisfy the pumping lemma, so the language is not regular. The set of strings representing balanced parentheses. () ()() (()) ((())()). Is this language regular? No. Assume the language is regular. Then the Pumping Lemma must hold. So, there is a p, such that EVERY string in the language whose length is >= p must satisfy the pumping lemma. Whatever p is, let's look at the string of p left parentheses, followed by p right parentheses. If I break up this string into xyz, with len(xy) <= p, my xy is going to be entirely left parentheses. Since len(y) is >=1, it is also all left parentheses, at least one left parenthesis. So, xy*z is going to have more left parentheses than right parentheses if the y is pumped more than once. That will introduce more left parentheses into the string without changing the number of right parentheses, so we generate strings not in the language. Contradiction, since the Pumping Lemma says we can always generate strings of the language this way. So our assumption is false, and the language of balanced parentheses is not regular. There are very basic and simple languages that cannot be recognized by a finite automaton. Hence, finite automata are not a good model for what we think of as practical computing. This also means that certain relatively basic languages, such as an equal number of a's and b's or balanced parentheses might take an arbitrarily large amount of memory, if the input is large enough. Every regular expression can be converted to a finite automaton. Let's say that there are n states in the automaton. So, any string whose length is greater than n must visit some state more than once in the process. So, there is a cycle in this automaton. Let's say state D is visited twice while the automaton is parsing the string. We can divide the input string into three parts x y and z. x gets us from start to D y is the part from D to D and z is the part from D to the final state. xyz is accepted. y takes us on a loop from D to D. So x ends at state D, z begins at state D. I don't even need y at all. xz will also be an accepted string. For that matter, xyyyyyyyyz will also be an accepted string. So, xy*z are ALL going to be accepted by that automaton.