Transcript slides

Equivalence of Extended
Symbolic Finite Transducers
Presented By:
Loris D’Antoni
Joint work with:
Margus Veanes
Outline
1. Symbolic Automata and Transducers
2. Extended Symbolic Automata and Transducers
– Some negative results
– Some positive results
3. A friendlier restriction with decidable
equivalence
Motivations
Automata and Transducers are great!!
Used in many applications (NLP, XML, program
analysis, regex matching…)
Can only handle finite alphabets
Do not scale when the alphabet is very big
(UTF16 has 2^16 elements)
Symbolic Finite Automata (SFA)
[POPL12]
λx. x mod 2=1
λx. x mod 2=0
λx. x mod 2=0
p
Initial state Final states
q
λx. x mod 2=1
Set of states
Symbolic transition function:
labeled with a predicate
Symbolic Finite Automata (SFA)
[POPL12]
Execution
Example
λx. x mod 2=1
λx. x mod 2 =0
λx. x mod 2=0
p
q
λx. x mod 2=1
Input: 1 2 5 3
Run: p --1--> p --2--> q --5--> p --3--> p
p is final → accept the input
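The execution above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from [POPL12]: the class name `SFA`, the helper names, and the exact transition layout (inferred from the run shown on the slide) are assumptions.

```python
from typing import Callable, Dict, List, Set, Tuple

Predicate = Callable[[int], bool]


class SFA:
    """Symbolic finite automaton over integers; transitions carry predicates."""

    def __init__(self, initial: str, finals: Set[str],
                 transitions: Dict[str, List[Tuple[Predicate, str]]]):
        self.initial = initial
        self.finals = finals
        self.transitions = transitions

    def run(self, word: List[int]) -> List[str]:
        """Return the visited states (assumes a deterministic, total SFA)."""
        state = self.initial
        visited = [state]
        for symbol in word:
            # Follow the first transition whose guard holds for this symbol.
            state = next(target for guard, target in self.transitions[state]
                         if guard(symbol))
            visited.append(state)
        return visited

    def accepts(self, word: List[int]) -> bool:
        return self.run(word)[-1] in self.finals


def is_odd(x: int) -> bool:
    return x % 2 == 1


def is_even(x: int) -> bool:
    return x % 2 == 0


# Transition layout is an assumption reconstructed from the example run.
parity_sfa = SFA(
    initial="p",
    finals={"p"},
    transitions={
        "p": [(is_odd, "p"), (is_even, "q")],
        "q": [(is_odd, "p"), (is_even, "q")],
    },
)

print(parity_sfa.run([1, 2, 5, 3]))      # ['p', 'p', 'q', 'p', 'p']
print(parity_sfa.accepts([1, 2, 5, 3]))  # True
```

The graph stays finite (two states, four transitions) no matter how large the integer alphabet is, because each label is a predicate rather than a concrete symbol.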
Symbolic Finite Transducers (SFT)
[POPL12]
p
λx.x mod 2 = 0 / [λx.x+1, λx.x+2]
Input guard = predicate
(here int → bool)
q
Output = sequence of
functions from input theory
to output theory
(here int → int)
Symbolic Finite Transducers (SFT)
[POPL12]
x mod 2 =1/[x-1]
x mod 2 =0/[]
x mod 2 =0/[x, x]
p
q
x mod 2 =1/[x-1]
Input tape:  1 2 5 3
States:      p p q p p
Output tape: 0 2 2 4 2
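A small Python sketch of this SFT run follows. The function `run_sft` and the rule table are hypothetical names, and the assignment of the four rules to states p and q is an assumption chosen to reproduce the tapes shown on the slide.

```python
from typing import Callable, Dict, List, Tuple

Guard = Callable[[int], bool]
OutputFn = Callable[[int], int]
Rule = Tuple[Guard, List[OutputFn], str]  # (guard, output functions, target)


def run_sft(rules: Dict[str, List[Rule]], initial: str,
            word: List[int]) -> List[int]:
    """Apply a deterministic SFT to `word` and return the output tape."""
    state = initial
    output: List[int] = []
    for symbol in word:
        for guard, outputs, target in rules[state]:
            if guard(symbol):
                # Each output symbol is a function of the input symbol.
                output.extend(f(symbol) for f in outputs)
                state = target
                break
        else:
            raise ValueError(f"no rule for {symbol} in state {state}")
    return output


def odd(x: int) -> bool:
    return x % 2 == 1


def even(x: int) -> bool:
    return x % 2 == 0


# Rule placement is an assumption consistent with the example run.
rules: Dict[str, List[Rule]] = {
    "p": [(odd, [lambda x: x - 1], "p"),
          (even, [lambda x: x, lambda x: x], "q")],
    "q": [(odd, [lambda x: x - 1], "p"),
          (even, [], "q")],
}

print(run_sft(rules, "p", [1, 2, 5, 3]))  # [0, 2, 2, 4, 2]
```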
Closure and Decidability Properties
All closure properties and decidability results
from classical automata theory still hold
• Alphabet theory is required to be
– A Boolean algebra (closed under Boolean
operations)
– Decidable (we can check for satisfiability)
• Example: SFA intersection
q1 --[x>5]--> q2
p1 --[x<10]--> p2
(q1,p1) --[x>5 ∧ x<10]--> (q2,p2)
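The product step can be sketched as follows. To keep the example self-contained, guards are modelled as closed integer intervals, which is only a stand-in: intervals support conjunction and a satisfiability check, but they are not a full Boolean algebra. All names here are hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple

Interval = Tuple[float, float]          # closed interval [lo, hi]
Transition = Tuple[str, Interval, str]  # (source, guard, target)
INF = float("inf")


def conj(a: Interval, b: Interval) -> Optional[Interval]:
    """Conjunction of two interval guards; None if unsatisfiable."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def intersect(t1: List[Transition], t2: List[Transition]):
    """Pair up transitions and keep those whose joint guard is satisfiable."""
    result = []
    for (s1, g1, d1), (s2, g2, d2) in product(t1, t2):
        guard = conj(g1, g2)
        if guard is not None:  # satisfiability check prunes dead transitions
            result.append(((s1, s2), guard, (d1, d2)))
    return result


a = [("q1", (6, INF), "q2")]    # x > 5 over the integers
b = [("p1", (-INF, 9), "p2")]   # x < 10 over the integers

print(intersect(a, b))  # [(('q1', 'p1'), (6, 9), ('q2', 'p2'))]
```

The satisfiability check is what the decidability requirement on the alphabet theory buys: without it, the product could contain transitions that no symbol can ever take.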
Applications
• Analysis of .NET regular expressions (use the
theory of bit-vectors for input alphabet)
• Automatic password generation
• Analysis of string sanitizers (BEK)
A limitation of Symbolic Transducers
BASE64 encoder
Text content    M         a         n
Bytes           77        97        110
Bit pattern     01001101  01100001  01101110
Index           19    22    5     46
Base64 encoded  T     W     F     u
3 bytes → 4 Base64 characters
Reading one input at a time will cause a
blowup in the number of states!
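The table above can be reproduced with a short Python sketch: concatenate the three bytes into a 24-bit pattern and cut it into four 6-bit indices. The helper name `encode_block` is hypothetical; the alphabet is the standard Base64 one.

```python
import base64
import string

B64_ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
                + string.digits + "+/")


def encode_block(b1: int, b2: int, b3: int):
    """Encode one 3-byte block, returning the bit pattern, indices and text."""
    bits = f"{b1:08b}{b2:08b}{b3:08b}"                   # 24-bit pattern
    indices = [int(bits[i:i + 6], 2) for i in range(0, 24, 6)]
    text = "".join(B64_ALPHABET[i] for i in indices)
    return bits, indices, text


bits, indices, text = encode_block(*b"Man")
print(bits)     # 010011010110000101101110
print(indices)  # [19, 22, 5, 46]
print(text)     # TWFu
```

A transducer that reads one byte at a time has to remember the leftover bits of each byte in its state, which is where the state blowup comes from.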
Outline
1. Symbolic Automata and Transducers
2. Extended Symbolic Automata and Transducers
– Some negative results
– Some positive results
3. A friendlier restriction with decidable
equivalence
Extended Symbolic Finite Automaton
x1>0 ∧ (x2<x3)
x1
x2
x3
1
7
8
3
p
p
…
p
3
Reads sequences of 3 consecutive
symbols [x1,x2,x3]
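As a rough sketch, a single-state ESFA with lookahead 3 can be simulated by consuming the input in blocks of three and evaluating the guard on each block. The function name `esfa_accepts` is hypothetical, and treating p as both initial and final is an assumption.

```python
from typing import Callable, Sequence


def esfa_accepts(word: Sequence[int],
                 guard: Callable[[int, int, int], bool],
                 lookahead: int = 3) -> bool:
    """Single-state ESFA: the loop on p consumes `lookahead` symbols per step."""
    if len(word) % lookahead != 0:
        return False  # a trailing partial block cannot be consumed
    return all(guard(*word[i:i + lookahead])
               for i in range(0, len(word), lookahead))


def guard(x1: int, x2: int, x3: int) -> bool:
    # Guard from the slide: x1>0 ∧ (x2<x3). Note that it relates x2 and x3.
    return x1 > 0 and x2 < x3


print(esfa_accepts([1, 7, 8], guard))           # True
print(esfa_accepts([1, 7, 8, 3, 0, 5], guard))  # True
print(esfa_accepts([1, 8, 7], guard))           # False
```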
Extended Symbolic Finite Transducers
x1≤FF ∧ x2≤FF ∧ x3≤FF /
[x1>>2, ((x1&3)<<4)|(x2>>4), ((x2&0xF)<<2)|(x3>>6), x3&0x3F]
M
p
3
a
n
p
Each output symbol
can be a function of
all the 3 symbols
…
p
T
W
F
u
…
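The ESFT rule above can be tried directly in Python. The guard and the four output expressions are copied from the slide; the function names `base64_step` and `encode`, and the restriction to inputs whose length is a multiple of 3 (no padding), are assumptions of this sketch.

```python
import base64
import string
from typing import List, Optional

ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
            + string.digits + "+/")


def base64_step(x1: int, x2: int, x3: int) -> Optional[List[int]]:
    """One ESFT transition: read 3 symbols, emit 4 indices (None if guard fails)."""
    if not (x1 <= 0xFF and x2 <= 0xFF and x3 <= 0xFF):
        return None
    return [x1 >> 2,
            ((x1 & 3) << 4) | (x2 >> 4),
            ((x2 & 0xF) << 2) | (x3 >> 6),
            x3 & 0x3F]


def encode(data: bytes) -> str:
    """Run the single-state ESFT over `data` (length must be a multiple of 3)."""
    if len(data) % 3 != 0:
        raise ValueError("this sketch does not handle padding")
    out = []
    for i in range(0, len(data), 3):
        indices = base64_step(*data[i:i + 3])
        out.extend(ALPHABET[j] for j in indices)
    return "".join(out)


print(base64_step(77, 97, 110))  # [19, 22, 5, 46]
print(encode(b"Man"))            # TWFu
```

One state and one rule suffice because each output symbol may depend on all three input symbols at once.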
A common misconception
All the results in classical automata
theory trivially extend to the symbolic
setting…
A common misconception
While for the previous models (SFAs, SFTs) most
results extend to the symbolic setting…
In the finite case, extended models do not add
expressiveness
In the finite-alphabet setting, reading multiple input
symbols at a time does not matter
Multi-symbol rule:      0 --ab/[cde]--> 1
Single-symbol rules:    0 --a/[]--> 1 --b/[cde]--> 2
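The splitting used in the finite-alphabet case can be sketched as below. The function `split_rule` and the naming scheme for intermediate states are hypothetical, and fresh names are only unique per rule here; a full construction would need globally fresh states.

```python
from itertools import count
from typing import List, Tuple

Rule = Tuple[str, str, str, str]  # (source, input symbol, output, target)


def split_rule(source: str, inputs: str, output: str,
               target: str) -> List[Rule]:
    """Split a rule reading several concrete symbols into one-symbol rules."""
    fresh = count(1)
    rules: List[Rule] = []
    state = source
    for i, symbol in enumerate(inputs):
        last = i == len(inputs) - 1
        nxt = target if last else f"{source}_{next(fresh)}"
        # Emit nothing until the whole input block has been matched.
        rules.append((state, symbol, output if last else "", nxt))
        state = nxt
    return rules


print(split_rule("0", "ab", "cde", "2"))
# [('0', 'a', '', '0_1'), ('0_1', 'b', 'cde', '2')]
```

This works because the intermediate state can remember which concrete symbols were read. With predicates that relate several symbols over an infinite alphabet, a finite state cannot store that information.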
ESFAs are more expressive than SFAs
This is not true for the symbolic case
x1>x2
0
1
?
Emptiness of ESFAs Intersection:
UNDECIDABLE
• Given two ESFAs A and B, is there an input accepted
by both A and B?
• The problem is undecidable:
– Given a two counter machine M we construct two
ESFAs A and B such that A ∩ B is empty iff M does
not halt on any input
Proof that Emptiness of ESFA
Intersection is undecidable (1/2)
Machine M
1. Inc(a)
2. Dec(a)
3. Inc(b)
4. if(a=0) goto 3 else goto 5
5. Dec(b)
6. Halt
Encode M’s run as the following sequence:
a   0 1 0 0 0 0 …
b   0 0 0 1 1 2 …
PC  1 2 3 4 3 4 …
1. Inc(a)
Proof that Emptiness of ESFA
Intersection is undecidable (2/2)
Machine M
1. Inc(a)
2. Dec(a)
3. Inc(b)
4. if(a=0) goto 3 else goto 5
5. Dec(b)
6. Halt
The intersection is empty if the two-counter machine doesn’t halt
a   0 1 0 0 0 0 …
b   0 0 0 1 1 2 …
PC  1 2 3 4 3 4 …
We are only checking
half of the
configurations
x1.pc=1 ∧ x2.pc=2 ∧ x2.a=x1.a+1 ∧ x1.b=x2.b ∨
… ∨
x1.pc=4 ∧ x2.pc=3 ∧ x1.a=x2.a ∧ x1.a=0 ∧ x1.b=x2.b ∨
x1.pc=4 ∧ x2.pc=5 ∧ x1.a=x2.a ∧ ¬(x1.a=0) ∧ x1.b=x2.b ∨
…
x1.pc=6
2
0
1
x1.pc=6
1
1
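The idea behind the reduction can be illustrated with a small Python sketch. Each ESFA reads configurations two at a time, so it only checks every other step of the run; A checks steps starting at even positions and B at odd positions, and only a sequence accepted by both is a genuine run. The names `Config`, `step`, `accepted_by_A` and `accepted_by_B` are hypothetical, treating Dec on a zero counter as a no-op is an assumption, and the checks on the initial and halting configurations are omitted.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    a: int
    b: int
    pc: int


def step(c: Config) -> Config:
    """One step of machine M from the slide (Dec on 0 is a no-op: assumption)."""
    if c.pc == 1:
        return Config(c.a + 1, c.b, 2)
    if c.pc == 2:
        return Config(max(c.a - 1, 0), c.b, 3)
    if c.pc == 3:
        return Config(c.a, c.b + 1, 4)
    if c.pc == 4:
        return Config(c.a, c.b, 3 if c.a == 0 else 5)
    if c.pc == 5:
        return Config(c.a, max(c.b - 1, 0), 6)
    raise ValueError("machine has halted")


def trace(n: int) -> List[Config]:
    """First n configurations of M, stopping early if it halts."""
    run = [Config(0, 0, 1)]
    while len(run) < n and run[-1].pc != 6:
        run.append(step(run[-1]))
    return run


def valid_pair(x1: Config, x2: Config) -> bool:
    """Binary guard: x2 is the successor configuration of x1."""
    return x1.pc != 6 and step(x1) == x2


def accepted_by_A(run: List[Config]) -> bool:
    return all(valid_pair(run[i], run[i + 1])
               for i in range(0, len(run) - 1, 2))


def accepted_by_B(run: List[Config]) -> bool:
    return all(valid_pair(run[i], run[i + 1])
               for i in range(1, len(run) - 1, 2))


run = trace(6)
# A forged sequence that is locally consistent only at the steps A inspects.
forged = run[:2] + [Config(5, 0, 3), step(Config(5, 0, 3))] + run[4:]

print(accepted_by_A(run), accepted_by_B(run))        # True True
print(accepted_by_A(forged), accepted_by_B(forged))  # True False
```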
Other Negative Results
Universality of ESFA is undecidable
ESFA equivalence is undecidable
ESFAs are not closed under intersection
ESFAs are not closed under complement
Nondeterministic ESFAs are strictly more
expressive than deterministic ESFAs
ESFT equivalence is undecidable
ESFTs are not closed under composition
Symbolic automata are not so trivial after all
Some Positive Results
Emptiness (reachability) is decidable for both
ESFAs and ESFTs
Nondeterministic ESFAs are closed under union
• Not quite satisfactory, and very limited…
– Can we do better?
Outline
1. Symbolic Automata and Transducers
2. Extended Symbolic Automata and Transducers
– Some negative results
– Some positive results
3. A friendlier restriction with decidable
equivalence
A Simpler Model:
Cartesian ESFAs and ESFTs
Most negative results rely on binary predicates
in guards
p
x1=x2+1
q
We can restrict the model to avoid this issue:
Cartesian ESFAs and Cartesian ESFTs only allow guards
to be conjunctions of unary predicates
p
x1>5 ; x2=1 / [x1+x2, x2, x1]
q
It can be decided if an ESFT (ESFA) is Cartesian
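To make the Cartesian condition concrete: a guard is Cartesian when its set of satisfying tuples equals the product of its per-position projections. The sketch below checks this by brute force over a small finite sample domain, which is purely an illustrative assumption; an actual decision procedure would query a solver for the alphabet theory instead of enumerating. The name `is_cartesian` is hypothetical.

```python
from itertools import product
from typing import Callable, Iterable


def is_cartesian(guard: Callable[..., bool], arity: int,
                 domain: Iterable[int]) -> bool:
    """Check on a finite sample whether `guard` is a product of unary sets."""
    values = list(domain)
    sat = {t for t in product(values, repeat=arity) if guard(*t)}
    # Project the satisfying tuples onto each position separately.
    projections = [{t[i] for t in sat} for i in range(arity)]
    return sat == set(product(*projections))


sample = range(0, 10)
print(is_cartesian(lambda x1, x2: x1 > 5 and x2 == 1, 2, sample))  # True
print(is_cartesian(lambda x1, x2: x1 == x2 + 1, 2, sample))        # False
```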
Cartesian ESFA = SFA
Cartesian ESFAs are equivalent to SFAs (but more
succinct)
Cartesian ESFA:   0 --[x1>5 ∧ x2=1]--> 1
Equivalent SFA:   0 --[x>5]--> 1 --[x=1]--> 2
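The translation can be sketched as unfolding one Cartesian transition with k unary guards into a chain of k ordinary SFA transitions. The helper names `unfold` and `follows_chain`, and the naming of the intermediate state, are assumptions of this sketch.

```python
from typing import Callable, List, Sequence, Tuple

Predicate = Callable[[int], bool]
Transition = Tuple[str, Predicate, str]  # (source, unary guard, target)


def unfold(source: str, guards: Sequence[Predicate],
           target: str) -> List[Transition]:
    """Turn one Cartesian ESFA transition into a chain of SFA transitions."""
    chain: List[Transition] = []
    state = source
    for i, guard in enumerate(guards):
        last = i == len(guards) - 1
        nxt = target if last else f"{source}.{i + 1}"  # hypothetical fresh name
        chain.append((state, guard, nxt))
        state = nxt
    return chain


def follows_chain(chain: List[Transition], word: Sequence[int]) -> bool:
    """True if `word` drives the chain from its first to its last state."""
    return (len(word) == len(chain)
            and all(guard(x) for (_, guard, _), x in zip(chain, word)))


chain = unfold("0", [lambda x: x > 5, lambda x: x == 1], "2")
print([(src, dst) for src, _, dst in chain])  # [('0', '0.1'), ('0.1', '2')]
print(follows_chain(chain, [7, 1]))           # True
print(follows_chain(chain, [7, 2]))           # False
```

No information about x1 has to be carried over to the step that reads x2, which is exactly why the unary restriction makes the unfolding possible.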
Cartesian ESFTs > SFTs
Cartesian ESFTs are strictly more expressive than SFTs!!
0
x1>5 ∧ x2=1 /
[x1+x2, x2, x1]
1
?
Equivalence of Cartesian ESFTs
• Given two Cartesian ESFTs A and B,
A is equivalent to B if
– A and B have the same domain
• The domain of a Cartesian ESFT is a Cartesian ESFA (just
drop outputs)
• Cartesian ESFAs are equivalent to SFAs
• Equivalence of SFAs is decidable [POPL12]
– For every input in the intersection of the domains,
A and B produce the same output (one-equality)
• ….
One-Equality of Cartesian ESFTs
x1<5, x2>2 / [x1+x2]
q0
x1<10, x2>0, x3=1 / [x1, x2, x3]
q1
2
p0
3
p1
Align inputs
x1<5 ∧ x1<10 / [x1+x2], [x1,x2,x3]
q0
p0
?? ∧ x3=1 / ??, [ ]
x2>2 ∧ x2>0 / [ ], [ ]
qt1
pt1
q1
pt1
?
p1
Align outputs
?? ∧ x3=1 / ??, [x3]
x1<5 ∧ x1<10 / [x1+x2], [x1, x2]
q0
p0
q1
pt1
?
p1
Result Summary
A theoretical analysis of ESFAs and ESFTs
A new model: Cartesian ESFAs and ESFTs (can
model BASE64)
Clear line for decidability of equivalence:
ESFTs vs Cartesian ESFTs
This and other algorithms at
http://rise4fun.com/Bex/ (still in Beta)
Applications
• Analysis of string encoders:
• Proved correctness of BASE64, UTF8, etc.
• Succinct representation of regex pattern
matching
• Fast code generation
Future Work
• Analysis of composition of ESFTs
– Partially discussed in [VMCAI13]
• Use ESFAs to compute range of symbolic
transducers
– The range of an SFT is not an SFA, but maybe it is an ESFA?
– Use range for synthesizing program inversion
Thank you
Loris D’Antoni
Questions?
Symbolic Finite Automaton (SFA)
[POPL12]
• Classical acceptor modulo a rich alphabet
– Alphabet is an effective Boolean Algebra
• Core Idea: represent labels with predicates
– Separation of concerns: finite graph / algebra of labels
Concrete transitions:
p
a
b
q
Symbolic transition:
p
… z
bitvector predicate: λx. 0x61 ≤ x ≤ 0x7A
q
Symbolic Finite Transducers Example
• Utf8 encoder
– Input: valid utf16 encoded string
– Output: equivalent utf8 encoded string
For example utf8encode(“\uFF28\uFF29”) = “\xEF\xBC\xA8\xEF\xBC\xA9”
Equiv. classical
transducer has
2^16 transitions
5 states &
11 transitions
Dagstuhl Seminar 13021
Complete Rutf8
One-Equality of Cartesian ESFTs
1. We incrementally build a product ESFT using a depth-first search
x1<5, x2>2 / [x1+1, x2]
q0
x1<10, x2>0, x3=1 / [x1, x2, x3]
q1
2
p0
2
p1
Build early product
x1<5 ∧ x1<10 / [x1+1, x2], [x1,x2,x3]
q0
p0
Found
inequivalence
qt1
pt1
q1
pt1
Try aligning
x1<5 ∧ x1<10 / [x1+1], [x1]
q0
p0
?? ∧ x3=1 / _, _
x2>2 ∧ x2>0 / _,_
?? ∧ x3=1 / ??, [x3]
x2>2 ∧ x2>0 / [x2], [x2]
qt1
pt1
Continue with
every possible
state
?
p1
q1
pt1
?
p1
One-Equality of Cartesian ESFTs
Case with predicates that can’t be completely shifted
x1<5, x2>2 / [x1+x2]
q0
q1
2
x1<5 ∧ x1<10 / [x1+x2], [x1]
q0
p0
x1<10, x2>0, x3=1 / [x1, x2, x3]
p0
2
x2>2 ∧ x2>0 / [ ], [x2]
qt1
pt1
?? ∧ x3=1 / ??, [x3]
q1
pt1
?
p1
?? ∧ x3=1 / ??, [x3]
x1<5 ∧ x1<10 / [x1+x2], [x1, x2]
q0
p0
p1
q1
pt1
?
p1
One-Equality of Cartesian ESFTs
Case with predicates that can’t be shifted at all
x1<5, x2>2 / [x1+x2]
q0
q1
2
x1<5 ∧ x1<10 / [x1+x2], [x1]
q0
p0
x1<10, x2>0, x3=1 / [x1, x2+x3]
p0
2
?? ∧ x3=1 / ??, []
x2>2 ∧ x2>0 / [ ], [x2+x3]
qt1
pt1
p1
q1
pt1
?
p1
Alignment not possible!
Easy to generate witness for inequivalence in this case