Program Synthesis

From Verification to Synthesis
Sumit Gulwani
Microsoft Research, Redmond
August 2013
Marktoberdorf Summer School Lectures: Part 1
Synthesis
Goal: Synthesize a computational concept in
some underlying language from user intent
using some search technique.
State of the art: We can synthesize
programs of size 10-20.
1
Dimensions in Synthesis
• Language (Application)
– Programs
• Straight-line programs
– Automata
– Queries
• User Intent (Ambiguity)
– Logic, Natural Language
– Examples, Demonstrations/Traces
– Program
• Search Technique (Algorithm)
– SAT/SMT solvers (Formal Methods)
– A*-style goal-directed search (AI)
– Version space algebras (Machine Learning)
PPDP 2010: “Dimensions in Program Synthesis”, Gulwani.
2
Compilers vs. Synthesizers
Dimension | Compilers | Synthesizers
Concept Language | Executable Program | Variety of concepts: Program, Automata, Query, Sequence
User Intent | Structured language | Variety/mixed form of constraints: logic, examples, traces
Search Technique | Syntax-directed translation (No new algorithmic insights) | Uses some kind of search (Discovers new algorithmic insights)
3
• From verification to synthesis
– Bitvector algorithms (PLDI 2011, ICSE 2012)
– General loopy programs (POPL 2010)
– SIMD algorithms (PPoPP 2013)
– Program inverses (PLDI 2011)
– Graph algorithms (OOPSLA 2010)
• End-user Programming (Examples & Natural Language)
– Syntactic string transformations: Flash Fill (POPL 2011)
– Semantic string transformations (VLDB 2012)
– Table layout transformations (PLDI 2011)
– Smartphone scripts (MobiSys 2013)
• Computer-aided Education
– Problem Synthesis (AAAI 2012, CHI 2013)
– Solution Synthesis (PLDI 2011, IJCAI 2013)
– Feedback Synthesis (PLDI 2013, IJCAI 2013)
– Content Authoring (CHI 2012)
4
From Verification to Synthesis
Application
Generating Synthesis Constraint
Solving Synthesis Constraint
Bitvector
Loopy Alg.
SIMD
Location variables
CEGIS + SMT
Template-based
SMT
Relational verification CEGIS +
Reachability value graph
Template-based +
SMT
symbolic execution
Inverses
Graph Alg.
Reference:
Path-based Inductive Synthesis for Program Inversion,
PLDI 2011, Srivastava, Gulwani, Chaudhuri, Foster
5
Program Inversion: Example
In-place run-length encoding:
A = [1,1,1,0,0,2,2,2,2] → Encoder → A = [1,0,2], N = [3,2,4] → Decoder → A' = [1,1,1,0,0,2,2,2,2]

Encoder:
IN(A,n); Assume (n >= 0)
i, m := 0, 0; // parallel assignment
while (i<n)
    r := 1;
    while (i+1<n && A[i]=A[i+1])
        r, i := r+1, i+1;
    A[m], N[m], m, i := A[i], r, m+1, i+1;
OUT(A,N,m);

Decoder:
IN(A,N,m)
i', m' := 0, 0;
while (m' < m)
    r' := N[m'];
    while (r'>0)
        r', i', A'[i'] := r'-1, i'+1, A[m'];
    m' := m'+1;
OUT(A',i'); assert(A'=A; i'=n);
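The encoder and decoder above can be transliterated almost line by line. The following is a minimal Python sketch, not code from the lecture: the function names `rle_encode` and `rle_decode` are illustrative, and the encoder works on a copy of its input so that the round trip can be compared against the original array.

```python
def rle_encode(a):
    """In-place run-length encoder, mirroring the slide's pseudocode."""
    a = list(a)               # copy, so the caller keeps the original array
    n = len(a)
    counts = [0] * n          # N in the slide; only the first m entries are used
    i = m = 0
    while i < n:
        r = 1
        while i + 1 < n and a[i] == a[i + 1]:
            r, i = r + 1, i + 1
        # Parallel assignment: the right-hand side is evaluated before any write.
        a[m], counts[m], m, i = a[i], r, m + 1, i + 1
    return a[:m], counts[:m], m


def rle_decode(values, counts, m):
    """Decoder (the inverse program): expands each value counts[k] times."""
    out = []                  # A' in the slide
    i = k = 0                 # i' and m' in the slide
    while k < m:
        r = counts[k]
        while r > 0:
            out.append(values[k])   # A'[i'] := A[m']
            r, i = r - 1, i + 1
        k += 1
    return out, i


original = [1, 1, 1, 0, 0, 2, 2, 2, 2]
values, counts, m = rle_encode(original)
decoded, length = rle_decode(values, counts, m)
print(values, counts, decoded == original, length == len(original))
```

The final line checks the inversion property on one input only: decoding the encoder's output should give back the original array and its length.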
7
Program Inversion as Synthesis Problem
In-place run-length encoding:
A = [1,1,1,0,0,2,2,2,2] → Encoder → A = [1,0,2], N = [3,2,4] → Decoder → A' = [1,1,1,0,0,2,2,2,2]

E = { 0, 1, m'±1, r'±1, i'±1, A[i'], A[m'], N[m'] }
P = { m'<m, r'>0, A'[i']=A'[i'+1] }

Encoder:
IN(A,n); Assume (n >= 0)
i, m := 0, 0; // parallel assignment
while (i<n)
    r := 1;
    while (i+1<n && A[i]=A[i+1])
        r, i := r+1, i+1;
    A[m], N[m], m, i := A[i], r, m+1, i+1;
OUT(A,N,m);

Decoder template:
IN(A,N,m)
i', m' := e1, e2; // ei ∈ E
while (p1) // pi ∈ P
    r' := e3;
    while (p2)
        r', i', A'[e4] := e5, e6, e7;
    m' := e8;
OUT(A',i'); assert(A'=A; i'=n);
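To make the template idea concrete, here is a small Python sketch that fills some of the decoder's holes by brute-force enumeration and checks each candidate on a few input/output examples. This is a stand-in for illustration only: the lecture's technique uses symbolic execution and an SMT solver, not enumeration over concrete tests. To keep the search small, the sketch searches only over p2, e3, e5 and e7 and fixes the other holes by assumption (e1 = e2 = 0, p1 = m'<m, e4 = i', e6 = i'+1, e8 = m'+1). All Python names (`run_decoder`, `consistent`, `EXAMPLES`, the step bound `fuel`) are hypothetical.

```python
from itertools import product


def at(xs, k):
    """Bounds-checked read, so a negative index fails instead of wrapping."""
    if not 0 <= k < len(xs):
        raise IndexError(k)
    return xs[k]


# Candidate expressions E from the slide, as functions of the decoder state s.
E = {
    "0": lambda s: 0,
    "1": lambda s: 1,
    "m'+1": lambda s: s["m"] + 1,
    "m'-1": lambda s: s["m"] - 1,
    "r'+1": lambda s: s["r"] + 1,
    "r'-1": lambda s: s["r"] - 1,
    "i'+1": lambda s: s["i"] + 1,
    "i'-1": lambda s: s["i"] - 1,
    "A[i']": lambda s: at(s["A"], s["i"]),
    "A[m']": lambda s: at(s["A"], s["m"]),
    "N[m']": lambda s: at(s["N"], s["m"]),
}

# Candidate predicates P from the slide.
P = {
    "m'<m": lambda s: s["m"] < s["M"],
    "r'>0": lambda s: s["r"] > 0,
    "A'[i']=A'[i'+1]": lambda s: s["out"][s["i"]] == s["out"][s["i"] + 1],
}


def run_decoder(p2, e3, e5, e7, A, N, m, fuel=100):
    """Run the decoder template with the given hole choices (others fixed)."""
    s = {"A": A, "N": N, "M": m, "out": {}, "i": 0, "m": 0, "r": None}
    while s["m"] < m:                      # p1 fixed to m'<m (assumption)
        s["r"] = E[e3](s)
        while P[p2](s):
            fuel -= 1
            if fuel < 0:                   # reject non-terminating candidates
                raise RuntimeError("step bound exceeded")
            # Parallel assignment: evaluate right-hand sides before writing.
            r_new, value = E[e5](s), E[e7](s)
            s["out"][s["i"]] = value       # e4 fixed to i' (assumption)
            s["r"], s["i"] = r_new, s["i"] + 1   # e6 fixed to i'+1
        s["m"] += 1                        # e8 fixed to m'+1
    return [s["out"][k] for k in range(s["i"])], s["i"]


# (original array, encoded values A, run lengths N)
EXAMPLES = [
    ([1, 1, 1, 0, 0, 2, 2, 2, 2], [1, 0, 2], [3, 2, 4]),
    ([5, 5, 7], [5, 7], [2, 1]),
    ([4], [4], [1]),
]


def consistent(candidate):
    """True if the candidate decoder inverts the encoder on every example."""
    for original, A, N in EXAMPLES:
        try:
            decoded, length = run_decoder(*candidate, A, N, len(A))
        except (IndexError, KeyError, TypeError, RuntimeError):
            return False
        if decoded != original or length != len(original):
            return False
    return True


solutions = [c for c in product(P, E, E, E) if consistent(c)]
print(solutions)
```

Testing on examples only shows that a candidate is consistent with those inputs; it does not establish that the candidate is a correct inverse for all inputs.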
8
Synthesis Technique
• The inductive invariants required to establish correctness
are too sophisticated.
• We use symbolic execution to generate verification
conditions for correctness on certain paths in the
original and the inverted program.
• This generates constraints of the form
$\exists\, e_i, p_j \;\; \forall V \; \bigl( \phi_1(V, e_i, p_j) \wedge \cdots \wedge \phi_k(V, e_i, p_j) \bigr)$
9
Related Work: Program Sketching
Reference:
Program Synthesis by Sketching, PhD thesis, 2008,
Armando Solar-Lezama (Advisor: Ras Bodik, UC Berkeley)
• Key Ideas:
– Write an arbitrary program with holes, where each hole
takes values from a finite domain.
– Use CEGIS to generate SAT constraints on holes.
• Cons: Not as efficient as domain-specific synthesizers.
– (On the bitvector benchmark, it times out on 9 of 25 tasks, and on
the remaining tasks it is 20x slower on average.)
• Pros:
– A very powerful formalism that can be used to model a
variety of synthesis problems.
– Sees synthesis as an interactive process.
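The two key ideas, holes over a finite domain and a CEGIS loop, can be illustrated with a toy example. The Python sketch below is an assumption-laden stand-in, not the Sketch system: the specification (multiply by 5 on 8-bit words), the two-hole program `(x << h1) + (x << h2)`, and all function names are made up for illustration, and plain enumeration replaces the SAT solver in both the candidate-proposal and verification steps.

```python
from itertools import product

MASK = 0xFF                      # model 8-bit machine words
HOLE_DOMAIN = range(4)           # each hole takes a value in {0, 1, 2, 3}


def spec(x):
    """Reference behaviour the completed sketch must match."""
    return (5 * x) & MASK


def sketch(x, h1, h2):
    """Program with two holes: (x << h1) + (x << h2)."""
    return ((x << h1) + (x << h2)) & MASK


def propose(counterexamples):
    """Stand-in for the SAT query: first hole assignment that agrees with
    the spec on every counterexample collected so far."""
    for h1, h2 in product(HOLE_DOMAIN, repeat=2):
        if all(sketch(x, h1, h2) == spec(x) for x in counterexamples):
            return h1, h2
    return None


def verify(h1, h2):
    """Stand-in for the verifier: exhaustive check over all 8-bit inputs.
    Returns a counterexample input, or None if the candidate is correct."""
    for x in range(MASK + 1):
        if sketch(x, h1, h2) != spec(x):
            return x
    return None


def cegis():
    """Counterexample-guided inductive synthesis loop."""
    counterexamples = []
    while True:
        candidate = propose(counterexamples)
        if candidate is None:
            return None, counterexamples       # no completion exists
        bad_input = verify(*candidate)
        if bad_input is None:
            return candidate, counterexamples  # verified on all inputs
        counterexamples.append(bad_input)      # refine and try again


holes, counterexamples = cegis()
print(holes, counterexamples)
```

Each counterexample rules out at least the candidate that produced it, so with a finite hole domain the loop terminates.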
10