Previous Up Next

Chapter 1  Quick tutorial

1.1  Principles

Basically, the Why tool takes annotated programs written in a very simple imperative programming language of its own, produces verification conditions and sends them to existing provers (proof assistants such as Coq, PVS, etc. or automatic provers such as Simplify, CVC Lite, etc.), as illustrated below:

annotated programs
verification conditions = first-order formulae
external provers

Simplify speaking, Why may be seen as an implementation of Dijkstra’s weakeast preconditions. Its key advantages are the following:

 ProverCommand-line option
Proof assistantsCoq--coq
 HOL 4--hol4
 HOL Light--hol-light
Automatic theorem proversErgo--why
 CVC Lite--cvcl
 SMTlib (Yices, CVC 3)--smtlib
Figure 1.1: Supported provers

1.2  A trivial example

Here is a small example of Why input code:

  logic min: int, int -> int
  parameter r: int ref
  let f (n:int) = r := min !r n { r <= r@ }

This code declares a function symbol min and gives its arity. Whatever the status of this function is on the prover side (primitive, user-defined, axiomatized, etc.), it simply needs to be declared in order to be used in the following of the code. The next line declares a parameter, that is a value that is not defined but simply assumed to exist i.e. to belong to the environment. Here the parameter has name r and is an integer reference (Why’s concrete syntax is very close to Ocaml’s syntax). The third line defines a function f taking a integer n as argument (the type has to be given since there is no type inference in Why) and assigning to r the value of min !r n. The function f has no precondition and a postcondition expressing that the final value of r is smaller than its initial value. The current value of a reference x is directly denoted by x within annotations (not !x) and within postconditions x@ is the notation for the value of x in the prestate (i.e. at the precondition point).

Let us assume the three lines code above to be in file test.why. Then we can produce the proof obligations for this program, to be verified with Coq, using the following command line:

  why --coq test.why

A Coq file test_why.v is produced which contains the statement of a single proof obligation, which looks like

  Lemma f_po_1 : 
    forall (n: Z),
    forall (r: Z),
    forall (r0: Z),
    forall (HW_1: r0 = (min r n)),
    r0 <= r.

The proof itself has to be filled in by the user. If the Why input code is modified and Why run again, only the statement of the proof obligation will be updated and the remaining of the file (including the proof) will be left unmodified. Assuming that min is adequately defined in Coq, the proof above is trivial.

Trying an automatic decision procedure instead of Coq is as easy as running Why with a different command line option. For instance, to use Simplify [], we type in

  why --simplify test.why

A Simplify input file is produced. But Simplify is not able to discharge the proof obligation, since the meaning of min is unknown for Simplify:

  1: Invalid

The user can edit the header of to insert an axiom for min. Alternatively, this axiom can be inserted directly in the Why input code:

  logic min: int, int -> int
  axiom min_ax: forall x,y:int. min(x,y) <= x
  parameter r: int ref
  let f (n:int) = {} r := min !r n { r <= r@ }

This way this axiom will be replicated in any prover selected by the user. When using Coq, it is even possible to prove this axiom, though it is not mandatory. With the addition of this axiom, Simplify is now able to discharge the proof obligation:

  why --simplify test.why
  1: Valid.

1.3  A less trivial example: Dijkstra’s Dutch flag

Dijkstra’s Dutch flag is a classical algorithm which sorts an array where elements can have only three different values. Assuming that these values are the three colors blue, white and red, the algorithm restores the Dutch (or French :-) national flag within the array. This algorithm can be coded with a few lines of C, as follows:

typedef enum { BLUE, WHITE, RED } color;

void swap(int t[], int i, int j) { color c = t[i]; t[i] = t[j]; t[j] = c;}

void flag(int t[], int n) {
  int b = 0, i = 0, r = n;
  while (i < r) {
    switch (t[i]) {
    case BLUE: swap(t, b++, i++); break;        
    case WHITE: i++; break;
    case RED: swap(t, --r, i); break;

We are going to show how to verify this algorithm—the algorithm, not the C code—using Why. First we introduce an abstract type color for the colors together with three values blue, white and red:

  type color

  logic blue : color
  logic white : color
  logic red : color

Such a new type is necessarily an immutable datatype. The only mutable values in Why are references (and they only contain immutable values).

Then we introduce another type color_array for purely applicative arrays containing colors:

  type color_array

  logic acc : color_array, int -> color
  logic upd : color_array, int, color -> color_array

To get the usual theory of applicative arrays, we can add the necessary axioms:

  axiom acc_upd_eq : 
    forall t:color_array. forall i:int. forall c:color.
      acc(upd(t,i,c),i) = c
  axiom acc_upd_neq : 
    forall t:color_array. forall i:int. forall j:int. forall c:color.
      j<>i -> acc(upd(t,i,c),j) = acc(t,j)

The program arrays will be references containing values of type color_array. In order to constraint accesses and updates to be performed within arrays bounds, we add a notion of array length and two “programs” get and set with adequate preconditions:

  logic length : color_array -> int

  axiom length_upd : 
    forall t:color_array. forall i:int. forall c:color.  
    length(upd(t,i,v)) = length(t)

  parameter get : 
    t:color_array ref -> i:int -> 
      { 0<=i<length(t) } color reads t { result=acc(t,i) }

  parameter set : 
    t:color_array ref -> i:int -> c:color -> 
      { 0<=i<length(t) } unit writes t { t=upd(t@,i,c) }

These two programs need not being defined (they are only here to insert assertions automatically), so we declare them as parameters. As illustrated above, a function type has named arguments, a pre- and a postcondition, a result type and an explicit declaration of its side-effects: the reads and writes keywords respectively declare the set of references possibly accessed and modified.

We are now in position to define the swap function:

  let swap (t:color_array ref) (i:int) (j:int) =
    { 0 <= i < length(t) and 0 <= j < length(t) }
    let c = get t i in
    set t i (get t j);
    set t j c
    { t = upd(upd(t@,i,acc(t@,j)), j, acc(t@,i)) }

The precondition for swap states that the two indices i and j must lie within the array bounds and the postcondition is simply a rephrasing of the code on the model level i.e. on purely applicative arrays. Verifying the swap function is immediate.

Next we need to give the main function a specification. First, we need to express that the array only contains one of the three values blue, white and red. Indeed, nothing prevents the type color to be inhabited with other values (there is no notion of inductive type in Why logic, since it is intended to be a common fragment of many tools, including many with no primitive notion of inductive types). So we define the following predicate is_color:

  predicate is_color(c:color) = c=blue or c=white or c=red

Note that this predicate is given a definition in Why.

Second, we need to express the main function postcondition that is, for the final contents of the array, the property of being “sorted” but also the property of being a permutation of the initial contents of the array (a property usually neglected but clearly as important as the former). For this purpose, we introduce a predicate monochrome expressing that a set of successive elements is monochrome:

  predicate monochrome(t:color_array, i:int, j:int, c:color) =
    forall k:int. i<=k<j -> acc(t,k)=c

For the permutation property, we simply declare a predicate permutation:

  logic permutation : color_array, color_array, int, int -> prop

The intended meaning of permutation(t1,t2,l,r) is “the (multi-)sets of the elements of the sub-arrays t1[l..r] and t2[l..r] are the same”.

To be able to write down the code, we still need to translate the switch statement into successive tests, and for this purpose we need to be able to decide equality of the type color. We can declare this ability with the following parameter:

  parameter eq_color  : 
    c1:color -> c2:color -> {} bool { if result then c1=c2 else c1<>c2 }

Note that the meaning of = within annotations has nothing to do with a boolean function deciding equality that we could use in our programs.

We can now write the Why code for the main function:

let dutch_flag (t:color_array ref) (n:int) = 
  { length(t) = n and forall k:int. 0 <= k < n -> is_color(acc(t,k)) }
  let b = ref 0 in
  let i = ref 0 in
  let r = ref n in
  while !i < !r do
     if (eq_color (get t !i) blue) then begin
       swap t !b !i;
       b := !b + 1;
       i := !i + 1
     end else if (eq_color (get t !i) white) then
       i := !i + 1
     else begin
       r := !r - 1;
       swap t !r !i
  { (exists b:int. exists r:int. 
       monochrome(t,0,b,blue) and 
       monochrome(t,b,r,white) and 
    and permutation(t,t@,0,n-1) }

As given above, the code cannot be proved correct, since a loop invariant is missing, and so is a termination argument. The loop invariant must maintain the current situation, which can be depicted as

  BLUE      WHITE    …to do…   RED   

But the loop invariant must also maintain less obvious properties such as the invariance of the array length (which is obvious since we only performs upd operations over the array, but we need not to loose this property) and the permutation w.r.t. the initial array. The termination is trivially ensured since r-i decreases at each loop step and is bound by 0. Finally, the loop is annotated as follows:

  while !i < !r do
     { invariant 0 <= b <= i and i <= r <= n and
                 monochrome(t,0,b,blue) and 
                 monochrome(t,b,i,white) and 
                 monochrome(t,r,n,red) and
                 length(t) = n and
                 (forall k:int. 0 <= k < n -> is_color(acc(t,k))) and
       variant r - i }

A label init has been introduced right before the loop in order to relate the current value of the array inside the loop (denoted t) to its value right before the loop (denoted as t@init) in the loop invariant.

To be able to perform the proof, we need to “define” the permutation predicate. The simplest way to do is to axiomatize it as an equivalence relation containing the transpositions within [l..r]:

  axiom permut_refl : forall t: color_array. forall l,r:int.

  axiom permut_swap : forall t:color_array. forall l,r,i,j:int.
    l <= i <= r -> l <= j <= r ->
    permutation(t, upd(upd(t,i,acc(t,j)), j, acc(t,i)), l, r)

  axiom permut_sym : forall t1,t2:color_array. forall l,r:int.
    permutation(t1,t2,l,r) -> permutation(t2,t1,l,r)

  axiom permut_trans : forall t1,t2,t3: color_array. forall l,r:int.
    permutation(t1,t2,l,r) -> permutation(t2,t3,l,r) -> 

We have no way to state that this is the smallest such relation (this is not a first-order property) but this is not needed to show the correctness of the program. Note: the loop invariant part related to is_color seems redundant since we should be able to deduce it from the permutation property. But it would also require induction, and thus it is simpler to add this property to the loop invariant.

We can now proceed to the verification of the program with Simplify. We first produce the verification conditions in Simplify syntax:

  why --simplify flag.why

Then we run Simplify on these obligations, using the tool dp (distributed with Why; see 1.5.4) that displays the results of the automatic decision procedures in a nice way:


(. = valid * = invalid # = timeout ! = failure) .......... (10/0/0)
total           :  10
valid           :  10 (100%)
invalid         :   0 (  0%)
timeout/failure :   0 (  0%)

As we see, the 10 verification conditions are discharged by Simplify. The source code of this example is part of the Why distribution, in examples/misc/flag_ax.mlw.


We describe here two variations around the Dutch flag example.

1.4  Other features

This section quickly describes additional features that were not present in the example above.

1.4.1  Assertions

Assertions to be verified can be placed at any place inside code using the assert construct. Example:

    x := 2 * !x;
    assert { even(x) };
    x := !x - 1

An assertion can also be placed right after some program expression using the postcondition construct:

  let x = (fib 4) { even(result) } in x+x

Such assertions (before an expression with assert and after with the brackets notation) can be used to introduce proof cuts: first, they result in proof obligations at the corresponding program point and then they are available in the context of the following verification conditions. They are transparent i.e. they do not modify the information already available in the other verification conditions.

There is also an opaque variant of the postcondition construct, which is written with double brackets:

  let x = (fib 4) {{ even(result) }} in x+x

In that case, we still need to prove the property even(fib(4)) but, in the remaining of the program, we do not know that x=fib(4) anymore. Only the property even(x) is available.

1.4.2  Recursive functions

Recursive functions are introduced with the let rec construct. They are annotated with a result type (there is no type inference) and a variant to ensure termination. For instance, the famous Mac Carthy’s 91 function can be written as follows:

  let rec f91 (n:int) : int { variant max(0,101-n) } =
    if n <= 100 then
      (f91 (f91 (n + 11)))
      n - 10
    { (n <= 100 and result = 91) or (n >= 101 and result = n - 10) }

1.4.3  Exceptions

New exceptions are declared with the exception declaration and may have arguments:

  exception E
  exception F of int

Exceptions are raised with the construct raise:

     raise E
     raise (F 1)

and caught with the construct try … with … end:

     try f 0 with E -> 1 end 
     try f 0 with F x -> x end

It is not possible to catch an exception which is not possibly raised by the expression following the try.

Postconditions are extended to specify the behavior in case of uncaught exceptions:

     raise E { false | E => true }
     raise (F 1) { false | F => result = 1 }

In a postcondition regarding an exception, result refers to the value carried by the exception (if any). In postconditions, it is not possible to mention exceptions which are not possibly raised by the program.

In a function type, the exceptions possibly raised by the function must be given as part of its side-effects, using the raises keyword. For instance

  parameter f : x:int -> {} int raises E { P(x,result) | E => Q(x) }

declares a function f taking an integer x as argument, returning an integer and possibly raising the exception E. In the latter case, Q(x) holds.

1.4.4  Unreachable code

The absurd construct can be used to denote an unreachable point in the code. Example:

     { 0 <= x <= 1 }
     if x = 0 then 
     else if x = 1 then 
     { ... }

At such program points, the user will be asked to prove that the context is indeed absurd, which corresponds to the proof obligation false.

1.4.5  Non-deterministic programs

A non-deterministic program expression is a place-holder for a program expression that is not given but only specified. Such a construct is a program specification (a type, an effect and possibly some pre/postconditions) written between square brackets. For instance

  [ {} int { even(result) } ]

stands for any program expression of type int that has the property of being even. Thus we can prove correct the following code:

  (1 + [ {} int { even(result) } ]) { odd(result) }

This construct can be used to proceed by successive refinements or to write incomplete programs where only the meaningful parts are coded. With respect to verification, a non-deterministic program expression behaves like a call to a function parameter.

1.5  Tools

1.5.1  Graphical user interface (gwhy)

Figure 1.2: The gwhy graphical interface

The graphical user interface gwhy is a tool to help the user in the process of calling the various automatic provers on the goals (in order to analyze the failures and to determine what is wrong in the programs, the specifications, or sometimes the provers). gwhy is similar to Why and is passed the same options and files. A screenshot is given Figure 1.2. On the left side appears the list of all goals (either from goal commands in the Why files or resulting from verification conditions generation). In front of each goal, the results for each automatic prover is displayed. Provers can be launched by clicking on their names at the top of their columns. A green bullet means a proved goal; a red bullet means an unproved goal (that may be either a “don’t know” or an “invalid” answer from the prover); the scissors means a timeout from the prover (the timeout limit can be set in the bottom bar); finally, a “tools” icon means an unexpected failure in the prover call.

The top right window displays the currently selected goal. In this window, a right click on some identifier displays its location in the corresponding Why input file in the bottom right window.

1.5.2  Graphical user interface (WhyWeb)

The graphical user interface WhyWeb is a tool to help the user in the process of calling the various automatic provers on the goals (in order to analyze the failures and to determine what is wrong in the programs, the specifications, or sometimes the provers). Before use this tools we must use why with the -project option for generate the wpr file.

We launch the web server with the program whyweb.opt then we open a browser on the page http://localhost:2372 with 2372 the port number. A screenshort is given Figure 1.3. On the left side appears the list of all goals (either from goal commands in the Why files or resulting from verification conditions generation).In front of each goal, the results for each automatic prover is displayed. Provers can be launched by clicking on their names at the top of their columns. We can see the goal by unfolding the goals. On the right side appears the program.

Example 1   We search to proof the file in the java bench.
  1. We launch the commande : ”krakataoa” to obtain a file : Gcd.jc.
  2. We launch the commande : ”jessie Gcd.jc”
  3. We use the makefile with the commande : ”make -f Gcd.makefile project” toobtain the POs and the file : Gcd.wpr.
  4. We launch the server web with the commande : ”whyweb.opt”
  5. We open a browser web and in url we enter : ”http://localhost:2372/?bench/java/good/why/Gcd.wpr”
  6. We open the first lemma : ”mul_comm” by clicking on the ”+” of this lemma.
  7. We open the lemma goal by clicking on the ”+” of the goal.
  8. We launch ergo theorem prover on this goal by clicking on the ”Invalid” word in the ergo column.
  9. We open the first method : ”Method gcd” by clicking on the ”+” of this method.
  10. We open the first behavior : ”Safety” by clicking on the ”+” of this behavior.
  11. We open the first PO : ”initialization of loop invariant” by clicking on the ”+” of this PO.
  12. we launch the ergo theorem prover on this PO by clicking on the ”Invalid” word in the ergo column
After all this action we obtain the Figure 1.3. We can see the first lemma are proved by ergo but not the first PO.

Figure 1.3: The whyweb interface

1.5.3  Remaining Issues

1.5.4  Calling decision procedures (dp)

A small tool is provided to call the decision procedures (Simplify, Ergo, Yices, haRVey, Zenon and CVC Lite) on files containing several goals, with a timeout for each goal, and to display s short summary of the results. Here is an example:

% dp -timeout 5
(. = valid * = invalid # = timeout/failure) ....... (7/0/0) ............*...*..*.... (21/3/0)
total           :  31
valid           :  28 ( 90%)
invalid         :   3 ( 10%)
timeout/failure :   0 (  0%)


run in batch mode: no output is printed at all, but answer is given as the exit status. Notice that only the first goal is considered, so it is intended to be used only with single-goal files (see option -? of why)
  1. 0: normal exit, prover said goal is valid
  2. 1: dp error (syntax error, etc)
  3. 2: prover answered ’invalid’
  4. 3: prover answered ’unknown’
  5. 4: prover was interrupted because of timeout
  6. 5: prover failure: answer was not understood by dp
-timeout <n>
sets the time limit to n seconds. Default is 10

1.5.5  Why obfuscator (why-obfuscator)

A tool is provided to obfuscate some Why files. It is invoked as follows:

why-obfuscator [-o file] [-prefix p] files...

It produces a Why file equivalent to the set of Why files given as arguments, where all identifiers are renamed and all comments removed. The output is printed on the standard output unless redirected with option -o. The identifiers are renamed as x1, x2, ... unless a different name prefix is set with option -prefix.

1.5.6  Why statistics (why-stat)

A tool is provided to display rough statistics on the goals contained in Why files. It is invoked as follows:

why-stat files...

It collects the symbols appearing in the conclusion of goal declarations, sums the number of goals with the same set of symbols and displays this multi-set starting with the most numerous elements. Here is a sample output:

       8 acc,shift
       2 valid
       1 valid_range

1.5.7  Caml code generation (why --ocaml)

In order to run your Why code, true Caml code can be produced with option --ocaml:

     why --ocaml foo.mlw

When this option is selected, there is no generation of proof obligations (Coq or PVS files are not produced or updated). The Caml code is written on standard output, unless redirected to some file with option --output (see Section 2.1).

Programs’ annotations can be inserted as comments in the Caml code, with option --ocaml-annot. The default behavior is no annotation.

When the input file contains parameter declarations, the Caml code generated is a functor, with these parameters as arguments. For instance, the following input file

     parameter x : int ref
     let f (y:int) = x := !x + y

is translated into the following piece of Caml code:

     module type Parameters = sig
       val x : int ref

     module Make(P : Parameters) = struct
       open P
       let f = fun y (*:int*) -> x := !x + y

external parameter declarations are supposed to be realized by some external Caml code. However, it is possible to make them arguments of the functor too, with command line option --ocaml-ext.

1.5.8  Why to HTML converter (why2html)

A tool to convert Why input files to HTML is provided. Its usage is immediate:

     why2html [-t title] files...

Each Why input file given on the command line, say foo.mlw, is translated into a HTML file foo.mlw.html. A title for the HTML page can be specified using command line option -t; otherwise the original file name is used.

1.5.9  Simplify to Why converter (simplify2why)

A tool to convert Simplify files to Why syntax is provided. Its usage is immediate:

     simplify2why files...

Each Simplify input file given on the command line, say, is translated into a Why file Since Simplify is untyped, all data in the Simplify input file are mapped to integers in the resulting Why file.

1.6  Installation

To compile Why, you need Objective Caml to be installed, in version at least 3.08. It can be fetched from Then

  1. Configure with ./configure

    The directory for binaries defaults to /usr/local/bin; you can specify another directory with the --bindir option. Similarly, you can change the directory for the library with --libdir and for the man pages --mandir.

  2. Compile with make.
  3. Install with make install.

1.7  Examples

Many examples are delivered with Why in the subdirectory examples/ of the distribution. They are also available on the Why web site, at

1.8  Verifying C and Java programs

C and Java programs can be verified using other tools in combination with Why. C programs can be verified using Caduceus (see and Java programs using Krakatoa (see

Previous Up Next