Basically, the Why tool takes annotated programs written in a very simple imperative programming language of its own, produces verification conditions and sends them to existing provers (proof assistants such as Coq, PVS, etc. or automatic provers such as Simplify, CVC Lite, etc.), as illustrated below:
annotated programs |
↓ |
Why |
↓ |
verification conditions = first-order formulae |
↓ |
Why |
↓ |
external provers |
Simplify speaking, Why may be seen as an implementation of Dijkstra’s weakeast preconditions. Its key advantages are the following:
Figure 1.1: Supported provers
Here is a small example of Why input code:
logic min: int, int -> int parameter r: int ref let f (n:int) = r := min !r n { r <= r@ }
This code declares a function symbol min and gives its arity. Whatever the status of this function is on the prover side (primitive, user-defined, axiomatized, etc.), it simply needs to be declared in order to be used in the following of the code. The next line declares a parameter, that is a value that is not defined but simply assumed to exist i.e. to belong to the environment. Here the parameter has name r and is an integer reference (Why’s concrete syntax is very close to Ocaml’s syntax). The third line defines a function f taking a integer n as argument (the type has to be given since there is no type inference in Why) and assigning to r the value of min !r n. The function f has no precondition and a postcondition expressing that the final value of r is smaller than its initial value. The current value of a reference x is directly denoted by x within annotations (not !x) and within postconditions x@ is the notation for the value of x in the prestate (i.e. at the precondition point).
Let us assume the three lines code above to be in file test.why. Then we can produce the proof obligations for this program, to be verified with Coq, using the following command line:
why --coq test.why
A Coq file test_why.v is produced which contains the statement of a single proof obligation, which looks like
Lemma f_po_1 : forall (n: Z), forall (r: Z), forall (r0: Z), forall (HW_1: r0 = (min r n)), r0 <= r. Proof. (* FILL PROOF HERE *) Save.
The proof itself has to be filled in by the user. If the Why input code is modified and Why run again, only the statement of the proof obligation will be updated and the remaining of the file (including the proof) will be left unmodified. Assuming that min is adequately defined in Coq, the proof above is trivial.
Trying an automatic decision procedure instead of Coq is as easy as running Why with a different command line option. For instance, to use Simplify [], we type in
why --simplify test.why
A Simplify input file test_why.sx is produced. But Simplify is not able to discharge the proof obligation, since the meaning of min is unknown for Simplify:
Simplify test_why.sx ... 1: Invalid
The user can edit the header of test_why.sx to insert an axiom for min. Alternatively, this axiom can be inserted directly in the Why input code:
logic min: int, int -> int axiom min_ax: forall x,y:int. min(x,y) <= x parameter r: int ref let f (n:int) = {} r := min !r n { r <= r@ }
This way this axiom will be replicated in any prover selected by the user. When using Coq, it is even possible to prove this axiom, though it is not mandatory. With the addition of this axiom, Simplify is now able to discharge the proof obligation:
why --simplify test.why Simplify test_why.sx 1: Valid.
Dijkstra’s Dutch flag is a classical algorithm which sorts an array where elements can have only three different values. Assuming that these values are the three colors blue, white and red, the algorithm restores the Dutch (or French :-) national flag within the array. This algorithm can be coded with a few lines of C, as follows:
typedef enum { BLUE, WHITE, RED } color; void swap(int t[], int i, int j) { color c = t[i]; t[i] = t[j]; t[j] = c;} void flag(int t[], int n) { int b = 0, i = 0, r = n; while (i < r) { switch (t[i]) { case BLUE: swap(t, b++, i++); break; case WHITE: i++; break; case RED: swap(t, --r, i); break; } } }
We are going to show how to verify this algorithm—the algorithm, not the C code—using Why. First we introduce an abstract type color for the colors together with three values blue, white and red:
type color logic blue : color logic white : color logic red : color
Such a new type is necessarily an immutable datatype. The only mutable values in Why are references (and they only contain immutable values).
Then we introduce another type color_array for purely applicative arrays containing colors:
type color_array logic acc : color_array, int -> color logic upd : color_array, int, color -> color_array
To get the usual theory of applicative arrays, we can add the necessary axioms:
axiom acc_upd_eq : forall t:color_array. forall i:int. forall c:color. acc(upd(t,i,c),i) = c axiom acc_upd_neq : forall t:color_array. forall i:int. forall j:int. forall c:color. j<>i -> acc(upd(t,i,c),j) = acc(t,j)
The program arrays will be references containing values of type color_array. In order to constraint accesses and updates to be performed within arrays bounds, we add a notion of array length and two “programs” get and set with adequate preconditions:
logic length : color_array -> int axiom length_upd : forall t:color_array. forall i:int. forall c:color. length(upd(t,i,v)) = length(t) parameter get : t:color_array ref -> i:int -> { 0<=i<length(t) } color reads t { result=acc(t,i) } parameter set : t:color_array ref -> i:int -> c:color -> { 0<=i<length(t) } unit writes t { t=upd(t@,i,c) }
These two programs need not being defined (they are only here to insert assertions automatically), so we declare them as parameters. As illustrated above, a function type has named arguments, a pre- and a postcondition, a result type and an explicit declaration of its side-effects: the reads and writes keywords respectively declare the set of references possibly accessed and modified.
We are now in position to define the swap function:
let swap (t:color_array ref) (i:int) (j:int) = { 0 <= i < length(t) and 0 <= j < length(t) } let c = get t i in set t i (get t j); set t j c { t = upd(upd(t@,i,acc(t@,j)), j, acc(t@,i)) }
The precondition for swap states that the two indices i and j must lie within the array bounds and the postcondition is simply a rephrasing of the code on the model level i.e. on purely applicative arrays. Verifying the swap function is immediate.
Next we need to give the main function a specification. First, we need to express that the array only contains one of the three values blue, white and red. Indeed, nothing prevents the type color to be inhabited with other values (there is no notion of inductive type in Why logic, since it is intended to be a common fragment of many tools, including many with no primitive notion of inductive types). So we define the following predicate is_color:
predicate is_color(c:color) = c=blue or c=white or c=red
Note that this predicate is given a definition in Why.
Second, we need to express the main function postcondition that is, for the final contents of the array, the property of being “sorted” but also the property of being a permutation of the initial contents of the array (a property usually neglected but clearly as important as the former). For this purpose, we introduce a predicate monochrome expressing that a set of successive elements is monochrome:
predicate monochrome(t:color_array, i:int, j:int, c:color) = forall k:int. i<=k<j -> acc(t,k)=c
For the permutation property, we simply declare a predicate permutation:
logic permutation : color_array, color_array, int, int -> prop
The intended meaning of permutation(t1,t2,l,r) is “the (multi-)sets of the elements of the sub-arrays t1[l..r] and t2[l..r] are the same”.
To be able to write down the code, we still need to translate the switch statement into successive tests, and for this purpose we need to be able to decide equality of the type color. We can declare this ability with the following parameter:
parameter eq_color : c1:color -> c2:color -> {} bool { if result then c1=c2 else c1<>c2 }
Note that the meaning of = within annotations has nothing to do with a boolean function deciding equality that we could use in our programs.
We can now write the Why code for the main function:
let dutch_flag (t:color_array ref) (n:int) = { length(t) = n and forall k:int. 0 <= k < n -> is_color(acc(t,k)) } let b = ref 0 in let i = ref 0 in let r = ref n in while !i < !r do if (eq_color (get t !i) blue) then begin swap t !b !i; b := !b + 1; i := !i + 1 end else if (eq_color (get t !i) white) then i := !i + 1 else begin r := !r - 1; swap t !r !i end done { (exists b:int. exists r:int. monochrome(t,0,b,blue) and monochrome(t,b,r,white) and monochrome(t,r,n,red)) and permutation(t,t@,0,n-1) }
As given above, the code cannot be proved correct, since a loop invariant is missing, and so is a termination argument. The loop invariant must maintain the current situation, which can be depicted as
0 | b | i | r | n |
BLUE | WHITE | …to do… | RED |
But the loop invariant must also maintain less obvious properties such as the invariance of the array length (which is obvious since we only performs upd operations over the array, but we need not to loose this property) and the permutation w.r.t. the initial array. The termination is trivially ensured since r-i decreases at each loop step and is bound by 0. Finally, the loop is annotated as follows:
... init: while !i < !r do { invariant 0 <= b <= i and i <= r <= n and monochrome(t,0,b,blue) and monochrome(t,b,i,white) and monochrome(t,r,n,red) and length(t) = n and (forall k:int. 0 <= k < n -> is_color(acc(t,k))) and permutation(t,t@init,0,n-1) variant r - i } ...
A label init has been introduced right before the loop in order to relate the current value of the array inside the loop (denoted t) to its value right before the loop (denoted as t@init) in the loop invariant.
To be able to perform the proof, we need to “define” the permutation predicate. The simplest way to do is to axiomatize it as an equivalence relation containing the transpositions within [l..r]:
axiom permut_refl : forall t: color_array. forall l,r:int. permutation(t,t,l,r) axiom permut_swap : forall t:color_array. forall l,r,i,j:int. l <= i <= r -> l <= j <= r -> permutation(t, upd(upd(t,i,acc(t,j)), j, acc(t,i)), l, r) axiom permut_sym : forall t1,t2:color_array. forall l,r:int. permutation(t1,t2,l,r) -> permutation(t2,t1,l,r) axiom permut_trans : forall t1,t2,t3: color_array. forall l,r:int. permutation(t1,t2,l,r) -> permutation(t2,t3,l,r) -> permutation(t1,t3,l,r)
We have no way to state that this is the smallest such relation (this is not a first-order property) but this is not needed to show the correctness of the program. Note: the loop invariant part related to is_color seems redundant since we should be able to deduce it from the permutation property. But it would also require induction, and thus it is simpler to add this property to the loop invariant.
We can now proceed to the verification of the program with Simplify. We first produce the verification conditions in Simplify syntax:
why --simplify flag.why
Then we run Simplify on these obligations, using the tool dp (distributed with Why; see 1.5.4) that displays the results of the automatic decision procedures in a nice way:
dp flag_why.sx (. = valid * = invalid # = timeout ! = failure) flag_ax_why.sx: .......... (10/0/0) total : 10 valid : 10 (100%) invalid : 0 ( 0%) timeout/failure : 0 ( 0%)
As we see, the 10 verification conditions are discharged by Simplify. The source code of this example is part of the Why distribution, in examples/misc/flag_ax.mlw.
We describe here two variations around the Dutch flag example.
Inductive color : Set := blue : color | white : color | red : color.Then we can declare these external elements in the Why source file as follows:
external type color external logic blue, white, red : colorThe external modifier indicates that nothing must be produced in the prover file for these declarations. Then we do not the is_color property anymore, since we are able to prove (once on the Coq side) that there is no other value in the color type than blue, white and red.
Similarly, we could define the permutation predicate as a Coq inductive predicate, and declare it in the Why source file as follows:
external logic permutation : color_array, color_array, int, int -> prop
This section quickly describes additional features that were not present in the example above.
Assertions to be verified can be placed at any place inside code using the assert construct. Example:
begin x := 2 * !x; assert { even(x) }; x := !x - 1 end
An assertion can also be placed right after some program expression using the postcondition construct:
let x = (fib 4) { even(result) } in x+x
Such assertions (before an expression with assert and after with the brackets notation) can be used to introduce proof cuts: first, they result in proof obligations at the corresponding program point and then they are available in the context of the following verification conditions. They are transparent i.e. they do not modify the information already available in the other verification conditions.
There is also an opaque variant of the postcondition construct, which is written with double brackets:
let x = (fib 4) {{ even(result) }} in x+x
In that case, we still need to prove the property even(fib(4)) but, in the remaining of the program, we do not know that x=fib(4) anymore. Only the property even(x) is available.
Recursive functions are introduced with the let rec construct. They are annotated with a result type (there is no type inference) and a variant to ensure termination. For instance, the famous Mac Carthy’s 91 function can be written as follows:
let rec f91 (n:int) : int { variant max(0,101-n) } = if n <= 100 then (f91 (f91 (n + 11))) else n - 10 { (n <= 100 and result = 91) or (n >= 101 and result = n - 10) }
New exceptions are declared with the exception declaration and may have arguments:
exception E exception F of int
Exceptions are raised with the construct raise:
raise E raise (F 1)
and caught with the construct try … with … end:
try f 0 with E -> 1 end try f 0 with F x -> x end
It is not possible to catch an exception which is not possibly raised by the expression following the try.
Postconditions are extended to specify the behavior in case of uncaught exceptions:
raise E { false | E => true } raise (F 1) { false | F => result = 1 }
In a postcondition regarding an exception, result refers to the value carried by the exception (if any). In postconditions, it is not possible to mention exceptions which are not possibly raised by the program.
In a function type, the exceptions possibly raised by the function must be given as part of its side-effects, using the raises keyword. For instance
parameter f : x:int -> {} int raises E { P(x,result) | E => Q(x) }
declares a function f taking an integer x as argument, returning an integer and possibly raising the exception E. In the latter case, Q(x) holds.
The absurd construct can be used to denote an unreachable point in the code. Example:
{ 0 <= x <= 1 } if x = 0 then ... else if x = 1 then ... else absurd { ... }
At such program points, the user will be asked to prove that the context is indeed absurd, which corresponds to the proof obligation false.
A non-deterministic program expression is a place-holder for a program expression that is not given but only specified. Such a construct is a program specification (a type, an effect and possibly some pre/postconditions) written between square brackets. For instance
[ {} int { even(result) } ]
stands for any program expression of type int that has the property of being even. Thus we can prove correct the following code:
(1 + [ {} int { even(result) } ]) { odd(result) }
This construct can be used to proceed by successive refinements or to write incomplete programs where only the meaningful parts are coded. With respect to verification, a non-deterministic program expression behaves like a call to a function parameter.
The graphical user interface gwhy is a tool to help the user in the process of calling the various automatic provers on the goals (in order to analyze the failures and to determine what is wrong in the programs, the specifications, or sometimes the provers). gwhy is similar to Why and is passed the same options and files. A screenshot is given Figure 1.2. On the left side appears the list of all goals (either from goal commands in the Why files or resulting from verification conditions generation). In front of each goal, the results for each automatic prover is displayed. Provers can be launched by clicking on their names at the top of their columns. A green bullet means a proved goal; a red bullet means an unproved goal (that may be either a “don’t know” or an “invalid” answer from the prover); the scissors means a timeout from the prover (the timeout limit can be set in the bottom bar); finally, a “tools” icon means an unexpected failure in the prover call.
The top right window displays the currently selected goal. In this window, a right click on some identifier displays its location in the corresponding Why input file in the bottom right window.
The graphical user interface WhyWeb is a tool to help the user in the process of calling the various automatic provers on the goals (in order to analyze the failures and to determine what is wrong in the programs, the specifications, or sometimes the provers). Before use this tools we must use why with the -project option for generate the wpr file.
We launch the web server with the program whyweb.opt then we open a browser on the page http://localhost:2372 with 2372 the port number. A screenshort is given Figure 1.3. On the left side appears the list of all goals (either from goal commands in the Why files or resulting from verification conditions generation).In front of each goal, the results for each automatic prover is displayed. Provers can be launched by clicking on their names at the top of their columns. We can see the goal by unfolding the goals. On the right side appears the program.
A small tool is provided to call the decision procedures (Simplify, Ergo, Yices, haRVey, Zenon and CVC Lite) on files containing several goals, with a timeout for each goal, and to display s short summary of the results. Here is an example:
% dp -timeout 5 swap0_why.sx peano_why.sx (. = valid * = invalid # = timeout/failure) swap0_why.sx: ....... (7/0/0) peano_why.sx: ............*...*..*.... (21/3/0) total : 31 valid : 28 ( 90%) invalid : 3 ( 10%) timeout/failure : 0 ( 0%)
options:
A tool is provided to obfuscate some Why files. It is invoked as follows:
It produces a Why file equivalent to the set of Why files given as arguments, where all identifiers are renamed and all comments removed. The output is printed on the standard output unless redirected with option -o. The identifiers are renamed as x1, x2, ... unless a different name prefix is set with option -prefix.
A tool is provided to display rough statistics on the goals contained in Why files. It is invoked as follows:
It collects the symbols appearing in the conclusion of goal declarations, sums the number of goals with the same set of symbols and displays this multi-set starting with the most numerous elements. Here is a sample output:
8 acc,shift 2 valid 1 valid_range
In order to run your Why code, true Caml code can be produced with option --ocaml:
why --ocaml foo.mlw
When this option is selected, there is no generation of proof obligations (Coq or PVS files are not produced or updated). The Caml code is written on standard output, unless redirected to some file with option --output (see Section 2.1).
Programs’ annotations can be inserted as comments in the Caml code, with option --ocaml-annot. The default behavior is no annotation.
When the input file contains parameter declarations, the Caml code generated is a functor, with these parameters as arguments. For instance, the following input file
parameter x : int ref let f (y:int) = x := !x + y
is translated into the following piece of Caml code:
module type Parameters = sig val x : int ref end module Make(P : Parameters) = struct open P let f = fun y (*:int*) -> x := !x + y end
external parameter declarations are supposed to be realized by some external Caml code. However, it is possible to make them arguments of the functor too, with command line option --ocaml-ext.
A tool to convert Why input files to HTML is provided. Its usage is immediate:
why2html [-t title] files...
Each Why input file given on the command line, say foo.mlw, is translated into a HTML file foo.mlw.html. A title for the HTML page can be specified using command line option -t; otherwise the original file name is used.
A tool to convert Simplify files to Why syntax is provided. Its usage is immediate:
simplify2why files...
Each Simplify input file given on the command line, say foo.sx, is translated into a Why file foo.sx.why. Since Simplify is untyped, all data in the Simplify input file are mapped to integers in the resulting Why file.
To compile Why, you need Objective Caml to be installed, in version at least 3.08. It can be fetched from http://caml.inria.fr. Then
The directory for binaries defaults to /usr/local/bin; you can specify another directory with the --bindir option. Similarly, you can change the directory for the library with --libdir and for the man pages --mandir.
Many examples are delivered with Why in the subdirectory examples/ of the distribution. They are also available on the Why web site, at http://why.lri.fr/examples/.
C and Java programs can be verified using other tools in combination with Why. C programs can be verified using Caduceus (see caduceus.lri.fr/) and Java programs using Krakatoa (see krakatoa.lri.fr).