Problem with DFSA State machines and Regex

Posted on January 1, 2013
Tags: algorithms

1 DFSA vs NDFSA

2 state machines are bad at the fine-grained level

stateDiagram-v2 s1 --> s2 : +1 s2 --> s1 : +2

In the above example, we dont keep track of what the total sum actually is.
The sum is the internal state(global context) and it is impossible to track.

3 Regex

regex are just state machines

Regular languages – the languages that regular expressions can recognise – are a proper subset of context-free languages, and the written form of s-expressions is context-free but not regular. So no regular expression can recognise the written form of an s-expression.

To see this consider a very tiny subset of s-expressions:

n = () | ( n)

So n consists of the set {(), (()), ((())), …}, where the number of left parens and right parens in each string are equal. Such a language can’t be recognised by a regular expression because you need to count parens.

not really but we could make an infinitely long state machine

stateDiagram-v2 s1 --> s2 : { s2 --> s3e : } s2 --> s3 : { s3 --> s4e : } s4e --> s4ee : } s3 --> s4 : { s4 --> s5e : } s5e --> s5ee : } s5ee --> s5eee: } s4 --> .. : ...

4 Chomsky Grammar and Regex

Chomsky hierarchy:

/-------------------------------------------\
|                                           |
|     Recursively enumerable languages      | Type 0
|                                           |
|   /-----------------------------------\   |
|   |                                   |   |
|   |    Context-sensitive languages    |   | Type 1
|   |                                   |   |
|   |   /---------------------------\   |   |
|   |   |                           |   |   |
|   |   |  Context-free languages   |   |   | Type 2
|   |   |                           |   |   |
|   |   |   /-------------------\   |   |   |
|   |   |   | Regular languages |   |   |   | Type 3
|   |   |   \-------------------/   |   |   |
|   |   \---------------------------/   |   |
|   \-----------------------------------/   |
\-------------------------------------------/

Example of what a grammar looks like:
A -> x B C where x is a terminal variable
Sentence -> Noun Verb

5 How is regex type 3?

Level 3 grammar, express the natural numbers

N -> 0
N -> 1
N -> 2
N -> 3
N -> 4
N -> 5
N -> 6
N -> 7
N -> 8
N -> 9
N -> 0N
N -> 1N
N -> 2N
N -> 3N
N -> 4N
N -> 5N
N -> 6N
N -> 7N
N -> 8N
N -> 9N

Regex is a simplification and makes the above [0-9]+