2 \page cprover-manual CProver User Manual
4 \author Daniel Kroening
6 This tutorial is intended for users of CProver (CBMC, SatAbs, and
11 \section man_introduction Introduction
15 Numerous tools to hunt down functional design flaws in silicon have been
16 available for many years, mainly due to the enormous cost of hardware
17 bugs. The use of such tools is wide-spread. In contrast, the market for
18 tools that address the need for quality software is still in its
21 Research in software quality has an enormous breadth. We focus the
22 presentation using two criteria:
24 1. We believe that any form of quality requires a specific *guarantee*,
25 in theory and practice.
26 2. The sheer size of software designs requires techniques that are
29 In practice, quality guarantees usually do not refer to "total
30 correctness" of a design, as ensuring the absence of all bugs is too
31 expensive for most applications. In contrast, a guarantee of the absence
32 of specific flaws is achievable, and is a good metric of quality.
34 We document two programs that try to achieve formal guarantees of the
35 absence of specific problems: CBMC and SATABS. The algorithms
36 implemented by CBMC and SATABS are complementary, and often, one tool is
37 able to solve a problem that the other cannot solve.
39 Both CBMC and SATABS are verification tools for ANSI-C/C++ programs.
40 They verify array bounds (buffer overflows), pointer safety, exceptions
41 and user-specified assertions. Both tools model integer arithmetic
42 accurately, and are able to reason about machine-level artifacts such as
43 integer overflow. CBMC and SATABS are therefore able to detect a class
44 of bugs that has so far gone unnoticed by many other verification tools.
45 This manual also covers some variants of CBMC, which includes HW-CBMC
47 \ref man_hard-soft-introduction "hardware/software co-verification".
49 ## Bounded Model Checking with CBMC
51 CBMC implements a technique called *Bounded Model Checking* (BMC). In
52 BMC, the transition relation for a complex state machine and its
53 specification are jointly unwound to obtain a Boolean formula, which is
54 then checked for satisfiability by using an efficient SAT procedure. If
55 the formula is satisfiable, a counterexample is extracted from the
56 output of the SAT procedure. If the formula is not satisfiable, the
57 program can be unwound more to determine if a longer counterexample
60 In many engineering domains, real-time guarantees are a strict
61 requirement. An example is software embedded in automotive controllers.
62 As a consequence, the loop constructs in these types of programs often
63 have a strict bound on the number of iterations. CBMC is able to
64 formally verify such bounds by means of *unwinding assertions*. Once
65 this bound is established, CBMC is able to prove the absence of errors.
67 A more detailed description of how to apply CBMC to verify programs is
68 \ref man_cbmc-tutorial "here".
70 ## Automatic Program Verification with SATABS
72 In many cases, lightweight properties such as array bounds do not rely
73 on the entire program. A large fraction of the program is *irrelevant*
74 to the property. SATABS exploits this observation and computes an
75 *abstraction* of the program in order to handle large amounts of code.
77 In order to use SATABS it is not necessary to understand the abstraction
78 refinement process. For the interested reader, a high-level introduction
79 to abstraction refinement is provided
80 \ref man_satabs-overview "here".
82 \ref man_satabs-tutorials "tutorials on how to use SATABS".
84 Just as CBMC, SATABS attempts to build counterexamples that refute the
85 property. If such a counterexample is found, it is presented to the
86 engineer to facilitate localization and repair of the program.
88 ### Example: Buffer Overflows
90 In order to give a brief overview of the capabilities of CBMC and SATABS
91 we start with a small example. The issue of *[buffer
92 overflows](http://en.wikipedia.org/wiki/Buffer_overflow)* has obtained
93 wide public attention. A buffer is a contiguously-allocated chunk of
94 memory, represented by an array or a pointer in C. Programs written in C
95 do not provide automatic bounds checking on the buffer, which means a
96 program can – accidentally or maliciously – write past a buffer. The
97 following example is a perfectly valid C program (in the sense that a
98 compiler compiles it without any errors):
108 However, the write access to an address outside the allocated memory
109 region can lead to unexpected behavior. In particular, such bugs can be
110 exploited to overwrite the return address of a function, thus enabling
111 the execution of arbitrary user-induced code. CBMC and SATABS are able
112 to detect this problem and reports that the "upper bound property" of
113 the buffer is violated. CBMC and SATABS are capable of checking these
114 lower and upper bounds, even for arrays with dynamic size. A detailed
115 discussion of the properties that CBMC and SATABS can check
116 automatically is \ref man_instrumentation-properties "here".
118 ## Hardware/Software Co-Verification
120 Software programs often interact with hardware in a non-trivial manner,
121 and many properties of the overall design only arise from the interplay
122 of both components. CBMC and SATABS therefore support *Co-Verification*,
123 i.e., are able to reason about a C/C++ program together with a circuit
124 description given in Verilog.
126 These co-verification capabilities can also be applied to perform
127 refinement proofs. Software programs are often used as high-level
128 descriptions of circuitry. While both describe the same functionality,
129 the hardware implementation usually contains more detail. It is highly
130 desirable to establish some form for equivalence between the two
131 descriptions. Hardware/Software co-verification and equivalence checking
132 with CBMC and SATABS are described
133 \ref man_hard-soft-introduction "here".
136 \section man_installation Installation
138 \subsection man_install-cbmc Installing CBMC
142 CBMC is available for Windows, i86 Linux, and MacOS X. CBMC requires a
143 code pre-processing environment comprising of a suitable preprocessor
144 and an a set of header files.
146 1. **Linux:** the preprocessor and the header files typically come with
147 a package called *gcc*, which must be installed prior to the
148 installation of CBMC.
150 2. **Windows:** The Windows version of CBMC requires the preprocessor
151 `cl.exe`, which is part of Microsoft Visual Studio. We recommend the
152 free [Visual Studio Community
153 2013](http://www.visualstudio.com/en-us/products/visual-studio-community-vs).
155 3. **MacOS:** Install the [XCode Command Line
156 Utilities](http://developer.apple.com/technologies/xcode.html) prior
157 to installing CBMC. Just installing XCode alone is not enough.
159 Important note for Windows users: Visual Studio's `cl.exe` relies on a
160 complex set of environment variables to identify the target architecture
161 and the directories that contain the header files. You must run CBMC
162 from within the *Visual Studio Command Prompt*.
164 Note that the distribution files for the [Eclipse
165 plugin](installation-plugin.shtml) include the CBMC executable.
166 Therefore, if you intend to run CBMC exclusively within Eclipse, you can
167 skip the installation of the CBMC executable. However, you still have to
168 install the compiler environment as described above.
170 ### Installing the CBMC Binaries
172 1. Download CBMC for your operating system. The binaries are available
173 from http://www.cprover.org/cbmc/.
174 2. Unzip/untar the archive into a directory of your choice. We
175 recommend to add this directory to your `PATH` environment variable.
177 You are now ready to \ref man_cbmc-tutorial "use CBMC"!
179 ### Building CBMC from Source
181 Alternatively, the CBMC source code is available [via
182 SVN](http://www.cprover.org/svn/cbmc/). To compile the source code,
184 instructions](http://www.cprover.org/svn/cbmc/trunk/COMPILING).
186 \subsection man_install-satabs Installing SATABS
190 SATABS is available for Windows, i86 Linux, and MacOS X. SATABS requires
191 a code pre-processing environment comprising of a suitable preprocessor
192 and an a set of header files.
194 1. **Linux:** the preprocessor and the header files typically come with
195 a package called *gcc*, which must be installed prior to the
196 installation of SATABS.
197 2. **Windows:** The Windows version of SATABS requires the preprocessor
198 `cl.exe`, which is part of Visual Studio (including the free [Visual
200 Express](http://msdn.microsoft.com/vstudio/express/visualc/)).
201 3. **MacOS:** Install
202 [XCode](http://developer.apple.com/technologies/xcode.html) prior to
205 Important note for Windows users: Visual Studio's `cl.exe` relies on a
206 complex set of environment variables to identify the target architecture
207 and the directories that contain the header files. You must run SATABS
208 from within the *Visual Studio Command Prompt*.
210 Note that the distribution files for the [Eclipse
211 plugin](installation-plugin.shtml) include the command-line tools.
212 Therefore, if you intend to run SATABS exclusively within Eclipse, you
213 can skip the installation of the command-line tools. However, you still
214 have to install the compiler environment as described above.
216 ### Choosing and Installing a Model Checker
218 You need to install a Model Checker in order to be able to run SATABS.
219 You can choose between following alternatives:
221 - **Cadence SMV**. Available from http://www.kenmcmil.com/smv.html.
222 Cadence SMV is a commercial model checker. The free version that is
223 available on the homepage above must not be used for commercial
224 purposes (read the license agreement thoroughly before you download
225 the tool). The documentation for SMV can be found in the directory
226 where you unzip/untar SMV under ./smv/doc/smv/. Read the
227 installation instructions carefully. The Linux/MacOS versions
228 require setting environment variables. You must add add the
229 directory containing the `smv` binary (located in ./smv/bin/,
230 relative to the path where you unpacked it) to your `PATH`
231 environment variable. SATABS uses Cadence SMV by default.
233 - **NuSMV**. Available from http://nusmv.irst.itc.it/. NuSMV is the
234 open source alternative to Cadence SMV. Installation instructions
235 and documentation can be found on the NuSMV homepage. The directory
236 containing the NuSMV binary should be added to your `PATH`
237 environment variable. Use the option
241 to instruct SATABS to use NuSMV.
243 - **BOPPO**. Available from http://www.cprover.org/boppo/. BOPPO is
244 a model checker that uses SAT-solving algorithms. BOPPO relies on a
245 built-in SAT solver and Quantor, a solver for quantified boolean
246 formulas that is currently bundled with BOPPO, but also available
247 separately from <http://fmv.jku.at/quantor/>. We recommend to add
248 the directories containing both tools to your `PATH` environment
249 variable. Use the option
253 when you call SATABS and want it to use BOPPO instead of SMV.
255 - **BOOM**. Available from http://www.cprover.org/boom/. Boom has a
256 number of unique features, including the verification of programs
257 with unbounded thread creation.
259 ### Installing SATABS
261 1. Download SATABS for your operating system. The binaries are
262 available from <http://www.cprover.org/satabs/>.
263 2. Unzip/untar the archive into a directory of your choice. We
264 recommend to add this directory to your `PATH` environment variable.
266 Now you can execute SATABS. Try running SATABS on the small examples
267 presented in the \ref man_satabs-tutorials "SATABS tutorial". If you use
268 the Cadence SMV model checker, the only command line arguments you have
269 to specify are the names of the files that contain your program.
272 \subsection man_install-eclipse Installing the Eclipse Plugin
276 We provide a graphical user interface to CBMC and SATABS, which is
277 realized as a plugin to the Eclipse framework. Eclipse is available at
278 http://www.eclipse.org. We do not provide installation instructions for
279 Eclipse (basically, you only have to download the current version and
280 extract the files to your hard-disk) and assume that you have already
281 installed the current version.
283 CBMC and SATABS have their own requirements. As an example, both CBMC and SATABS require a suitable preprocessor and a set of header files. As
284 first step, you should therefore follow the installation instructions
285 for \ref man_install-cbmc "CBMC" and \ref man_install-satabs "SATABS".
287 Important note for Windows users: Visual Studio's `cl.exe` relies on a
288 complex set of environment variables to identify the target architecture
289 and the directories that contain the header files. You must run Eclipse
290 from within the *Visual Studio Command Prompt*.
292 ### Installing the Eclipse Plugin
294 The installation instructions for the Eclipse Plugin, including the link
295 to the download site, are available
296 [here](http://www.cprover.org/eclipse-plugin/). This includes a small
297 tutorial on how to use the Eclipse plugin.
300 \section man_cbmc CBMC: Bounded Model Checking for C, C++ and Java
302 \subsection man_cbmc-tutorial A Short Tutorial
306 We assume you have already installed CBMC and the necessary support
307 files on your system. If not so, please follow the instructions
308 \ref man_install-cbmc "here".
310 Like a compiler, CBMC takes the names of .c files as command line
311 arguments. CBMC then translates the program and merges the function
312 definitions from the various .c files, just like a linker. But instead
313 of producing a binary for execution, CBMC performs symbolic simulation
316 As an example, consider the following simple program, named file1.c:
318 int puts(const char *s) { }
320 int main(int argc, char **argv) {
324 Of course, this program is faulty, as the `argv` array might have fewer
325 than three elements, and then the array access `argv[2]` is out of
326 bounds. Now, run CBMC as follows:
328 cbmc file1.c --show-properties --bounds-check --pointer-check
330 The two options `--bounds-check` and `--pointer-check` instruct CBMC to
331 look for errors related to pointers and array bounds. CBMC will print
332 the list of properties it checks. Note that it lists, among others, a
333 property labelled with "object bounds in argv" together with the location
334 of the faulty array access. As you can see, CBMC largely determines the
335 property it needs to check itself. This is realized by means of a
336 preliminary static analysis, which relies on computing a fixed point on
338 domains](http://en.wikipedia.org/wiki/Abstract_interpretation). More
339 detail on automatically generated properties is provided
340 \ref man_instrumentation-properties "here".
342 Note that these automatically generated properties need not necessarily
343 correspond to bugs – these are just *potential* flaws, as abstract
344 interpretation might be imprecise. Whether these properties hold or
345 correspond to actual bugs needs to be determined by further analysis.
347 CBMC performs this analysis using *symbolic simulation*, which
348 corresponds to a translation of the program into a formula. The formula
349 is then combined with the property. Let's look at the formula that is
350 generated by CBMC's symbolic simulation:
352 cbmc file1.c --show-vcc --bounds-check --pointer-check
354 With this option, CBMC performs the symbolic simulation and prints the
355 verification conditions on the screen. A verification condition needs to
356 be proven to be valid by a [decision
357 procedure](http://en.wikipedia.org/wiki/Decision_problem) in order to
358 assert that the corresponding property holds. Let's run the decision
361 cbmc file1.c --bounds-check --pointer-check
363 CBMC transforms the equation you have seen before into CNF and passes it
364 to a SAT solver (more background on this step is in the book on
365 [Decision Procedures](http://www.decision-procedures.org/)). It then
366 determines which of the properties that it has generated for the program
367 hold and which do not. Using the SAT solver, CBMC detects that the
368 property for the object bounds of `argv` does not hold, and will thus
369 print a line as follows:
371 [main.pointer_dereference.6] dereference failure: object bounds in argv[(signed long int)2]: FAILURE
373 ### Counterexample Traces
375 Let us have a closer look at this property and why it fails. To aid the
376 understanding of the problem, CBMC can generate a *counterexample trace*
377 for failed properties. To obtain this trace, run
379 cbmc file1.c --bounds-check --trace
381 CBMC then prints a counterexample trace, i.e., a program trace that
382 begins with `main` and ends in a state which violates the property. In
383 our example, the program trace ends in the faulty array access. It also
384 gives the values the input variables must have for the bug to occur. In
385 this example, `argc` must be one to trigger the out-of-bounds array
386 access. If you add a branch to the example that requires that `argc>=3`,
387 the bug is fixed and CBMC will report that the proofs of all properties
388 have been successful.
390 ### Verifying Modules
392 In the example above, we used a program that starts with a `main`
393 function. However, CBMC is aimed at embedded software, and these kinds
394 of programs usually have different entry points. Furthermore, CBMC is
395 also useful for verifying program modules. Consider the following
396 example, called file2.c:
407 In order to set the entry point to the `sum` function, run
409 cbmc file2.c --function sum --bounds-check
411 It is often necessary to build a suitable *harness* for the function in
412 order to set up the environment appropriately.
416 When running the previous example, you will have noted that CBMC unwinds
417 the `for` loop in the program. As CBMC performs Bounded Model Checking,
418 all loops have to have a finite upper run-time bound in order to
419 guarantee that all bugs are found. CBMC can optionally check that enough
420 unwinding is performed. As an example, consider the program binsearch.c:
422 int binsearch(int x) {
424 signed low=0, high=16;
427 signed middle=low+((high-low)>>1);
440 If you run CBMC on this function, you will notice that the unwinding
441 does not stop on its own. The built-in simplifier is not able to
442 determine a run time bound for this loop. The unwinding bound has to be
443 given as a command line argument:
445 cbmc binsearch.c --function binsearch --unwind 6 --bounds-check --unwinding-assertions
447 CBMC verifies that verifies the array accesses are within the bounds;
448 note that this actually depends on the result of the right shift. In
449 addition, as CBMC is given the option `--unwinding-assertions`, it also
450 checks that enough unwinding is done, i.e., it proves a run-time bound.
451 For any lower unwinding bound, there are traces that require more loop
452 iterations. Thus, CBMC will report that the unwinding assertion has
453 failed. As usual, a counterexample trace that documents this can be
454 obtained with the option `--property`.
458 CBMC can also be used for programs with unbounded loops. In this case,
459 CBMC is used for bug hunting only; CBMC does not attempt to find all
460 bugs. The following program (lock-example.c) is an example of a program
461 with a user-specified property:
481 unsigned got_lock = 0;
487 /* critical section */
498 The `while` loop in the `main` function has no (useful) run-time bound.
499 Thus, a bound has to be set on the amount of unwinding that CBMC
500 performs. There are two ways to do so:
502 1. The `--unwind` command-line parameter can to be used to limit the
503 number of times loops are unwound.
504 2. The `--depth` command-line parameter can be used to limit the number
505 of program steps to be processed.
507 Given the option `--unwinding-assertions`, CBMC checks whether the
508 arugment to `--unwind` is large enough to cover all program paths. If
509 the argument is too small, CBMC will detect that not enough unwinding is
510 done reports that an unwinding assertion has failed.
512 Reconsider the example. For a loop unwinding bound of one, no bug is
513 found. But already for a bound of two, CBMC detects a trace that
514 violates an assertion. Without unwinding assertions, or when using the
515 `--depth` command line switch, CBMC does not prove the program correct,
516 but it can be helpful to find program bugs. The various command line
517 options that CBMC offers for loop unwinding are described in the section
518 on \ref man_cbmc-loops "understanding loop unwinding".
520 ### A Note About Compilers and the ANSI-C Library
522 Most C programs make use of functions provided by a library; instances
523 are functions from the standard ANSI-C library such as `malloc` or
524 `printf`. The verification of programs that use such functions has two
527 1. Appropriate header files have to be provided. These header files
528 contain *declarations* of the functions that are to be used.
529 2. Appropriate *definitions* have to be provided.
531 Most C compilers come with header files for the ANSI-C library
532 functions. We briefly discuss how to obtain/install these library files.
536 Linux systems that are able to compile software are usually equipped
537 with the appropriate header files. Consult the documentation of your
538 distribution on how to install the compiler and the header files. First
539 try to compile some significant program before attempting to verify it.
543 On Microsoft Windows, CBMC is pre-configured to use the compiler that is
544 part of Microsoft's Visual Studio. Microsoft's [Visual Studio
545 Community](http://www.visualstudio.com/en-us/products/visual-studio-community-vs)
546 is fully featured and available for download for free from the Microsoft
547 webpage. Visual Studio installs the usual set of header files together
548 with the compiler. However, the Visual Studio compiler requires a large
549 set of environment variables to function correctly. It is therefore
550 required to run CBMC from the *Visual Studio Command Prompt*, which can
551 be found in the menu *Visual Studio Tools*.
553 Note that in both cases, only header files are available. CBMC only
554 comes with a small set of definitions, which includes functions such as
555 `malloc`. Detailed information about the built-in definitions is
556 \ref man_instrumentation-libraries "here".
558 ### Command Line Interface
560 This section describes the command line interface of CBMC. Like a C
561 compiler, CBMC takes the names of the .c source files as arguments.
562 Additional options allow to customize the behavior of CBMC. Use
563 `cbmc --help` to get a full list of the available options.
565 Structured output can be obtained from CBMC using the option `--xml-ui`.
566 Any output from CBMC (e.g., counterexamples) will then use an XML
571 - \ref man_cbmc-loops "Understanding Loop Unwinding"
572 - [Hardware Verification using ANSI-C Programs as a
573 Reference](http://www-2.cs.cmu.edu/~svc/papers/view-publications-ck03.html)
574 - [Behavioral Consistency of C and Verilog Programs Using Bounded
575 Model Checking](http://www-2.cs.cmu.edu/~svc/papers/view-publications-cky03.html)
576 - [A Tool for Checking ANSI-C
577 Programs](http://www-2.cs.cmu.edu/~svc/papers/view-publications-ckl2004.html)
579 We also have a [list of interesting applications of
580 CBMC](http://www.cprover.org/cbmc/applications/).
583 \subsection man_cbmc-loops Understanding Loop Unwinding
585 ### Iteration-based Unwinding
587 The basic idea of CBMC is to model the computation of the programs up to
588 a particular depth. Technically, this is achieved by a process that
589 essentially amounts to *unwinding loops*. This concept is best
590 illustrated with a generic example:
592 int main(int argc, char **argv) {
598 A BMC instance that will find bugs with up to five iterations of the
599 loop would contain five copies of the loop body, and essentially
600 corresponds to checking the following loop-free program:
602 int main(int argc, char **argv) {
620 Note the use of the `if` statement to prevent the execution of the loop
621 body in the case that the loop ends before five iterations are executed.
622 The construction above is meant to produce a program that is trace
623 equivalent with the original programs for those traces that contain up
624 to five iterations of the loop.
626 In many cases, CBMC is able to automatically determine an upper bound on
627 the number of loop iterations. This may even work when the number of
628 loop unwindings is not constant. Consider the following example:
633 for(int i=0; i<100; i++) {
639 The loop in the program above has an obvious upper bound on the number
640 of iterations, but note that the loop may abort prematurely depending on
641 the value that is returned by `f()`. CBMC is nevertheless able to
642 automatically unwind the loop to completion.
644 This automatic detection of the unwinding bound may fail if the number
645 of loop iterations is highly data-dependent. Furthermore, the number of
646 iterations that are executed by any given loop may be too large or may
647 simply be unbounded. For this case, CBMC offers the command-line option
648 `--unwind B`, where `B` denotes a number that corresponds to the maximal
649 number of loop unwindings CBMC performs on any loop.
651 Note that the number of unwindings is measured by counting the number of
652 *backjumps*. In the example above, note that the condition `i<100` is in
653 fact evaluated 101 times before the loop terminates. Thus, the loop
654 requires a limit of 101, and not 100.
656 ### Setting Separate Unwinding Limits
658 The setting given with `--unwind` is used globally, that is, for all
659 loops in the program. In order to set individual limits for the loops,
664 to obtain a list of all loops in the program. Then identify the loops
665 you need to set a separate bound for, and note their loop ID. Then use
667 --unwindset L:B,L:B,...
669 where `L` denotes a loop ID and `B` denotes the bound for that loop.
671 As an example, consider a program with two loops in the function main:
673 --unwindset c::main.0:10,c::main.1:20
675 This sets a bound of 10 for the first loop, and a bound of 20 for the
678 What if the number of unwindings specified is too small? In this case,
679 bugs that require paths that are deeper may be missed. In order to
680 address this problem, CBMC can optionally insert checks that the given
681 unwinding bound is actually sufficiently large. These checks are called
682 *unwinding assertions*, and are enabled with the option
683 `--unwinding-assertions`. Continuing the generic example above, this
684 unwinding assertion for a bound of five corresponds to checking the
685 following loop-free program:
687 int main(int argc, char **argv) {
706 The unwinding assertions can be verified just like any other generated
707 assertion. If all of them are proven to hold, the given loop bounds are
708 sufficient for the program. This establishes a [high-level worst-case
709 execution time](http://en.wikipedia.org/wiki/Worst-case_execution_time)
712 In some cases, it is desirable to cut off very deep loops in favor of
713 code that follows the loop. As an example, consider the following
717 for(int i=0; i<10000; i++) {
723 In the example above, small values of `--unwind` will prevent that the
724 assertion is reached. If the code in the loop is considered irrelevant
725 to the later assertion, use the option
729 This option will allow paths that execute loops only partially, enabling
730 a counterexample for the assertion above even for small unwinding
731 bounds. The disadvantage of using this option is that the resulting path
732 may be spurious, i.e., may not exist in the original program.
734 ### Depth-based Unwinding
736 The loop-based unwinding bound is not always appropriate. In particular,
737 it is often difficult to control the size of the generated formula when
738 using the `--unwind` option. The option
742 specifies an unwinding bound in terms of the number of instructions that
743 are executed on a given path, irrespectively of the number of loop
744 iterations. Note that CBMC uses the number of instructions in the
745 control-flow graph as the criterion, not the number of instructions in
748 \subsection man_cbmc-cover COVER: Test Suite Generation with CBMC
751 ### A Small Tutorial with A Case Study
753 We assume that CBMC is installed on your system. If not so, follow
754 \ref man_install-cbmc "these instructions".
756 CBMC can be used to automatically generate test cases that satisfy a
757 certain [code coverage](https://en.wikipedia.org/wiki/Code_coverage)
758 criteria. Common coverage criteria include branch coverage, condition
759 coverage and [Modified Condition/Decision Coverage
760 (MC/DC)](https://en.wikipedia.org/wiki/Modified_condition/decision_coverage).
761 Among others, MC/DC is required by several avionics software development
762 guidelines to ensure adequate testing of safety critical software.
763 Briefly, in order to satisfy MC/DC, for every conditional statement
764 containing boolean decisions, each Boolean variable should be evaluated
765 one time to "true" and one time to "false", in a way that affects the
766 outcome of the decision.
768 In the following, we are going to demonstrate how to apply the test
769 suite generation functionality in CBMC, by means of a case study. The
770 following program is an excerpt from a real-time embedded benchmark
771 [PapaBench](https://www.irit.fr/recherches/ARCHI/MARCH/rubrique.php3?id_rubrique=97),
772 and implements part of a fly-by-wire autopilot for an Unmanned Aerial
773 Vehicle (UAV). It is adjusted mildly for our purposes.
775 The aim of function `climb_pid_run` is to control the vertical climb of
776 the UAV. Details on the theory behind this operation are documented in
777 the [wiki](https://wiki.paparazziuav.org/wiki/Theory_of_Operation) for
778 the Paparazzi UAV project. The behaviour of this simple controller,
779 supposing that the desired speed is 0.5 meters per second, is plotted in
782 \image html pid.png "The pid controller"
785 02: #define MAX_CLIMB_SUM_ERR 10
786 03: #define MAX_CLIMB 1
789 06: #define MAX_PPRZ (CLOCK*600)
791 08: #define CLIMB_LEVEL_GAZ 0.31
792 09: #define CLIMB_GAZ_OF_CLIMB 0.75
793 10: #define CLIMB_PITCH_OF_VZ_PGAIN 0.05
794 11: #define CLIMB_PGAIN -0.03
795 12: #define CLIMB_IGAIN 0.1
797 14: const float pitch_of_vz_pgain=CLIMB_PITCH_OF_VZ_PGAIN;
798 15: const float climb_pgain=CLIMB_PGAIN;
799 16: const float climb_igain=CLIMB_IGAIN;
800 17: const float nav_pitch=0;
802 19: /** PID function INPUTS */
803 20: // The user input: target speed in vertical direction
804 21: float desired_climb;
805 22: // Vertical speed of the UAV detected by GPS sensor
806 23: float estimator_z_dot;
808 25: /** PID function OUTPUTS */
809 26: float desired_gaz;
810 27: float desired_pitch;
812 29: /** The state variable: accumulated error in the control */
813 30: float climb_sum_err=0;
815 32: /** Computes desired_gaz and desired_pitch */
816 33: void climb_pid_run()
819 36: float err=estimator_z_dot-desired_climb;
821 38: float fgaz=climb_pgain*(err+climb_igain*climb_sum_err)+CLIMB_LEVEL_GAZ+CLIMB_GAZ_OF_CLIMB*desired_climb;
823 40: float pprz=fgaz*MAX_PPRZ;
824 41: desired_gaz=((pprz>=0 && pprz<=MAX_PPRZ) ? pprz : (pprz>MAX_PPRZ ? MAX_PPRZ : 0));
826 43: /** pitch offset for climb */
827 44: float pitch_of_vz=(desired_climb>0) ? desired_climb*pitch_of_vz_pgain : 0;
828 45: desired_pitch=nav_pitch+pitch_of_vz;
830 47: climb_sum_err=err+climb_sum_err;
831 48: if (climb_sum_err>MAX_CLIMB_SUM_ERR) climb_sum_err=MAX_CLIMB_SUM_ERR;
832 49: if (climb_sum_err<-MAX_CLIMB_SUM_ERR) climb_sum_err=-MAX_CLIMB_SUM_ERR;
841 58: /** Non-deterministic input values */
842 59: desired_climb=nondet_float();
843 60: estimator_z_dot=nondet_float();
845 62: /** Range of input values */
846 63: __CPROVER_assume(desired_climb>=-MAX_CLIMB && desired_climb<=MAX_CLIMB);
847 64: __CPROVER_assume(estimator_z_dot>=-MAX_CLIMB && estimator_z_dot<=MAX_CLIMB);
849 66: __CPROVER_input("desired_climb", desired_climb);
850 67: __CPROVER_input("estimator_z_dot", estimator_z_dot);
854 71: __CPROVER_output("desired_gaz", desired_gaz);
855 72: __CPROVER_output("desired_pitch", desired_pitch);
862 In order to test the PID controller, we construct a main control loop,
863 which repeatedly invokes the function `climb_pid_run` (line 69). This
864 PID function has two input variables: the desired speed `desired_climb`
865 and the estimated speed `estimated_z_dot`. In the beginning of each loop
866 iteration, values of the inputs are assigned non-deterministically.
867 Subsequently, the `__CPROVER_assume` statement in lines 63 and 64
868 guarantees that both values are bounded within a valid range. The
869 `__CPROVER_input` and `__CPROVER_output` will help clarify the inputs
870 and outputs of interest for generating test suites.
872 To demonstrate the automatic test suite generation in CBMC, we call the
873 following command and we are going to explain the command line options
876 cbmc pid.c --cover mcdc --unwind 6 --xml-ui
878 The option `--cover mcdc` specifies the code coverage criterion. There
879 are four conditional statements in the PID function: in line 41, line
880 44, line 48 and line 49. To satisfy MC/DC, the test suite has to meet
881 multiple requirements. For instance, each conditional statement needs to
882 evaluate to *true* and *false*. Consider the condition
883 `"pprz>=0 && pprz<=MAX_PPRZ"` in line 41. CBMC instruments three
884 coverage goals to control the respective evaluated results of
885 `"pprz>=0"` and `"pprz<=MAX_PPRZ"`. We list them in below and they
886 satisfy the MC/DC rules. Note that `MAX_PPRZ` is defined as 16 \* 600 in
887 line 06 of the program.
889 !(pprz >= (float)0) && pprz <= (float)(16 * 600) id="climb_pid_run.coverage.1"
890 pprz >= (float)0 && !(pprz <= (float)(16 * 600)) id="climb_pid_run.coverage.2"
891 pprz >= (float)0 && pprz <= (float)(16 * 600) id="climb_pid_run.coverage.3"
893 The "id" of each coverage goal is automatically assigned by CBMC. For
894 every coverage goal, a test suite (if there exists) that satisfies such
895 a goal is printed out in XML format, as the parameter `--xml-ui` is
896 given. Multiple coverage goals can share a test suite, when the
897 corresponding execution of the program satisfies all these goals at the
900 In the end, the following test suites are automatically generated for
901 testing the PID controller. A test suite consists of a sequence of input
902 parameters that are passed to the PID function `climb_pid_run` at each
903 loop iteration. For example, Test 1 covers the MC/DC goal with
904 id="climb\_pid\_run.coverage.1". The complete output from CBMC is in
905 [pid\_test\_suites.xml](pid_test_suites.xml), where every test suite and
906 the coverage goals it is for are clearly described.
910 (iteration 1) desired_climb=-1.000000f, estimator_z_dot=1.000000f
913 (iteration 1) desired_climb=-1.000000f, estimator_z_dot=1.000000f
914 (iteration 2) desired_climb=1.000000f, estimator_z_dot=-1.000000f
917 (iteration 1) desired_climb=0.000000f, estimator_z_dot=-1.000000f
918 (iteration 2) desired_climb=1.000000f, estimator_z_dot=-1.000000f
921 (iteration 1) desired_climb=1.000000f, estimator_z_dot=-1.000000f
922 (iteration 2) desired_climb=1.000000f, estimator_z_dot=-1.000000f
923 (iteration 3) desired_climb=1.000000f, estimator_z_dot=-1.000000f
924 (iteration 4) desired_climb=1.000000f, estimator_z_dot=-1.000000f
925 (iteration 5) desired_climb=0.000000f, estimator_z_dot=-1.000000f
926 (iteration 6) desired_climb=1.000000f, estimator_z_dot=-1.000000f
929 (iteration 1) desired_climb=-1.000000f, estimator_z_dot=1.000000f
930 (iteration 2) desired_climb=-1.000000f, estimator_z_dot=1.000000f
931 (iteration 3) desired_climb=-1.000000f, estimator_z_dot=1.000000f
932 (iteration 4) desired_climb=-1.000000f, estimator_z_dot=1.000000f
933 (iteration 5) desired_climb=-1.000000f, estimator_z_dot=1.000000f
934 (iteration 6) desired_climb=-1.000000f, estimator_z_dot=1.000000f
936 The option `--unwind 6` unwinds the loop inside the main function body
937 six times. In order to achieve the complete coverage on all the
938 instrumented goals in the PID function `climb_pid_run`, the loop must be
939 unwound sufficient enough times. For example, `climb_pid_run` needs to
940 be called at least six times for evaluating the condition
941 `climb_sum_err>MAX_CLIMB_SUM_ERR` in line 48 to *true*. This corresponds
942 to the Test 5. An introduction to the use of loop unwinding can be found
943 in [Understanding Loop Unwinding](cbmc-loops.shtml).
945 In this small tutorial, we present the automatic test suite generation
946 functionality of CBMC, by applying the MC/DC code coverage criterion to
947 a PID controller case study. In addition to `--cover mcdc`, other
948 coverage criteria like `branch`, `decision`, `path` etc. are also
949 available when calling CBMC.
951 ### Coverage Criteria
953 The table below summarizes the coverage criteria that CBMC supports.
955 Criterion |Definition
956 ----------|----------
957 assertion |For every assertion, generate a test that reaches it
958 location |For every location, generate a test that reaches it
959 branch |Generate a test for every branch outcome
960 decision |Generate a test for both outcomes of every Boolean expression that is not an operand of a propositional connective
961 condition |Generate a test for both outcomes of every Boolean expression
962 mcdc |Modified Condition/Decision Coverage (MC/DC)
963 path |Bounded path coverage
964 cover |Generate a test for every `__CPROVER_cover` statement
967 \section man_satabs SATABS---Predicate Abstraction with SAT
969 \subsection man_satabs-overview Overview
971 This section describes SATABS from the point of view of the user. To
972 learn about the technology implemented in SATABS, read
973 \ref man_satabs-background "this".
975 We assume you have already installed SATABS and the necessary support
976 files on your system. If not so, please follow
977 \ref man_install-satabs "these instructions".
979 While users of SATABS almost never have to be concerned about the
980 underlying refinement abstraction algorithms, understanding the classes
981 of properties that can be verified is crucial. Predicate abstraction is
982 most effective when applied to *control-flow dominated properties*. As
983 an example, reconsider the following program (lock-example-fixed.c):
1003 unsigned got_lock = 0;
1009 /* critical section */
1021 The two assertions in the program model that the functions `lock()` and
1022 `unlock()` are called in the right order. Note that the value of `times`
1023 is chosen non-deterministically and is not bounded. The program has no
1024 run-time bound, and thus, unwinding the code with CBMC will never
1027 ### Working with Claims
1029 The two assertions will give rise to two *properties*. Each property is
1030 associated to a specific line of code, i.e., a property is violated when
1031 some condition can become false at the corresponding program location.
1032 SATABS will generate a list of all properties for the programs as
1035 satabs lock-example-fixed.c --show-properties
1037 SATABS will list two properties; each property corresponds to one of the
1038 two assertions. We can use SATABS to verify both properties as follows:
1040 satabs lock-example-fixed.c
1042 SATABS will conclude the verification successfully, that is, both
1043 assertions hold for execution traces of any length.
1045 By default, SATABS attempts to verify all properties at once. A single
1046 property can be verified (or refuted) by using the `--property id`
1047 option of SATABS, where `id` denotes the identifier of the property in
1048 the list obtained by calling SATABS with the `--show-properties` flag.
1049 Whenever a property is violated, SATABS reports a feasible path that
1050 leads to a state in which the condition that corresponds to the violated
1051 property evaluates to false.
1053 \subsection man_satabs-libraries Programs that use Libraries
1055 SATABS cannot check programs that use functions that are only available
1056 in binary (compiled) form (this restriction is not imposed by the
1057 verification algorithms that are used by SATABS – they also work on
1058 assembly code). The reason is simply that so far no assembly language
1059 frontend is available for SATABS. At the moment, (library) functions for
1060 which no C source code is available have to be replaced by stubs. The
1061 usage of stubs and harnesses (as known from unit testing) also allows to
1062 check more complicated properties (like, for example, whether function
1063 `fopen` is always called before `fclose`). This technique is explained
1064 in detail in the \ref man_satabs-tutorials "SATABS tutorials".
1066 \subsection man_satabs-unit-test Unit Testing with SATABS
1068 The example presented \ref man_satabs-tutorial-driver "here" is
1069 obviously a toy example and can hardly be used to convince your project
1070 manager to use static verification in your next project. Even though we
1071 recommend to use formal verification and specification already in the
1072 early phases of your project, the sad truth is that in most projects
1073 verification (of any kind) is still pushed to the very end of the
1074 development cycle. Therefore, this section is dedicated to the
1075 verification of legacy code. However, the techniques presented here can
1076 also be used for *unit testing*.
1078 Unit testing is used in most software development projects, and static
1079 verification with SATABS can be very well combined with this technique.
1080 Unit testing relies on a number test cases that yield the desired code
1081 coverage. Such test cases are implemented by a software testing engineer
1082 in terms of a test harness (aka test driver) and a set of function
1083 stubs. Typically, a slight modification to the test harness allows it to
1084 be used with SATABS. Replacing the explicit input values with
1085 non-deterministic inputs (as explained in
1086 \ref man_satabs-tutorial-aeon "here" and
1087 \ref man_satabs-tutorial-driver "here")) guarantees that SATABS will try
1088 to achieve **full** path and state coverage (due to the fact that
1089 predicate abstraction implicitly detects equivalence classes). However,
1090 it is not guaranteed that SATABS terminates in all cases. Keep in mind
1091 that you must not make any assumptions about the validity of the
1092 properties if SATABS did not run to completion!
1096 - [Model Checking Concurrent Linux Device
1097 Drivers](http://www.kroening.com/publications/view-publications-wbwk2007.html)
1098 - [SATABS: SAT-based Predicate Abstraction for
1099 ANSI-C](http://www-2.cs.cmu.edu/~svc/papers/view-publications-cksy2005.html)
1100 - [Predicate Abstraction of ANSI-C Programs using
1101 SAT](http://www-2.cs.cmu.edu/~svc/papers/view-publications-cksy2004.html)
1103 \subsection man_satabs-background Background
1105 ### Sound Abstractions
1107 This section provides background information on how SATABS operates.
1108 Even for very trivial C programs it is impossible to exhaustively
1109 examine their state space (which is potentially unbounded). However, not
1110 all details in a C program necessarily contribute to a bug, so it may be
1111 sufficient to only examine those parts of the program that are somehow
1114 In practice, many static verification tools (such as `lint`) try to
1115 achieve this goal by applying heuristics. This approach comes at a cost:
1116 bugs might be overlooked because the heuristics do not cover all
1117 relevant aspects of the program. Therefore, the conclusion that a
1118 program is correct whenever such a static verification tool is unable to
1119 find an error is invalid.
1121 \image html cegar-1.png "CEGAR Loop"
1123 A more sophisticated approach that has been very successful recently is
1124 to generate a *sound* abstraction of the original program. In this
1125 context, *soundness* refers to the fact that the abstract program
1126 contains (at least) all relevant behaviors (i.e., bugs) that are present
1127 in the original program. In the Figure above, the first component strips
1128 details from the original program. The number of possible behaviors
1129 increases as the number of details in the abstract program decreases.
1130 Intuitively, the reason is that whenever the model checking tool lacks
1131 the information that is necessary to make an accurate decision on
1132 whether a branch of an control flow statement can be taken or not, both
1133 branches have to be considered.
1135 In the resulting *abstract program*, a set of concrete states is
1136 subsumed by means of a single abstract state. Consider the following
1141 The concrete states *x*~1~ and *x*~2~ are mapped to an abstract state
1142 *X*, and similarly *Y* subsumes *y*~1~ and *y*~2~. However, all
1143 transitions that are possible in the concrete program are also possible
1144 in the abstract model. The abstract transition *X* → *Y* summarizes the
1145 concrete transitions *x*~1~ → *y*~1~ and *x*~1~ → *x*~1~, and *Y* → *X*
1146 corresponds to *x*~1~ → *x*~2~. The behavior *X* → *Y* → *X* is feasible
1147 in the original program, because it maps to *x*~1~ → *x*~1~ → *x*~2~.
1148 However, *Y* → *X* → *Y* is feasible only in the abstract model.
1150 ### Spurious Counterexamples
1152 The consequence is that the model checker (component number two in the
1153 figure above) possibly reports a *spurious* counterexample. We call a
1154 counterexample spurious whenever it is feasible in the current abstract
1155 model but not in the original program. However, whenever the model
1156 checker is unable to find an execution trace that violates the given
1157 property, we can conclude that there is no such trace in the original
1160 The feasibility of counterexamples is checked by *symbolic simulation*
1161 (performed by component three in the figure above). If the
1162 counterexample is indeed feasible, SATABS found a bug in the original
1163 program and reports it to the user.
1165 ### Automatic Refinement
1167 On the other hand, infeasible counterexamples (that originate from
1168 abstract behaviors that result from the omission of details and are not
1169 present in the original program) are never reported to the user.
1170 Instead, the information is used in order to refine the abstraction such
1171 that the spurious counterexample is not part of the refined model
1172 anymore. For instance, the reason for the infeasibility of *Y* → *X* →
1173 *Y* is that neither *y*~1~ nor *x*~1~ can be reached from *x*~2~.
1174 Therefore, the abstraction can be refined by partitioning *X*.
1176 The refinement steps can be illustrated as follows:
1178 
1180 The first step (1) is to generate a very coarse abstraction with a very
1181 small state space. This abstraction is then successively refined (2, 3,
1182 ...) until either a feasible counterexample is found or the abstract
1183 program is detailed enough to show that there is no path that leads to a
1184 violation of the given property. The problem is that this point is not
1185 necessarily reached for every input program, i.e., it is possible that
1186 the the abstraction refinement loop never terminates. Therefore, SATABS
1187 allows to specify an upper bound for the number of iterations.
1189 When this upper bound is reached and no counterexample was found, this
1190 does not necessarily mean that there is none. In this case, you cannot
1191 make any conclusions at all with respect to the correctness of the input
1194 \subsection man_satabs-tutorials Tutorials
1196 We provide an introduction to model checking "real" C programs with
1197 SATABS using two small examples.
1199 \subsubsection man_satabs-tutorial-driver Reference Counting in Linux Device Drivers
1201 Microsoft's [SLAM](http://research.microsoft.com/SLAM) toolkit has been
1202 successfully used to find bugs in Windows device drivers. SLAM
1203 automatically verifies device driver whether a device driver adheres to
1204 a set of specifications. SLAM provides a test harness for device drivers
1205 that calls the device driver dispatch routines in a non-deterministic
1206 order. Therefore, the Model Checker examines all combinations of calls.
1207 Motivated by the success this approach, we provide a toy example based
1208 on Linux device drivers. For a more complete approach to the
1209 verification of Linux device drivers, consider
1210 [DDVerify](http://www.cprover.org/ddverify/).
1212 Dynamically loadable modules enable the Linux Kernel to load device
1213 drivers on demand and to release them when they are not needed anymore.
1214 When a device driver is registered, the kernel provides a major number
1215 that is used to uniquely identify the device driver. The corresponding
1216 device can be accessed through special files in the filesystem; by
1217 convention, they are located in the `/dev` directory. If a process
1218 accesses a device file the kernel calls the corresponding `open`, `read`
1219 and `write` functions of the device driver. Since a driver must not be
1220 released by the kernel as long as it is used by at least one process,
1221 the device driver must maintain a usage counter (in more recent Linux
1222 kernels, this is done automatically, however, drivers that must maintain
1223 backward compatibility have to adjust this counter).
1225 We provide a skeleton of such a driver. Download the files
1226 assets/spec.c, assets/driver.c, assets/driver.h, assets/kdev_t.h, and
1229 The driver contains following functions:
1231 1. `register_chrdev`: (in assets/spec.c) Registers a character device.
1232 In our implementation, the function sets the variable `usecount` to
1233 zero and returns a major number for this device (a constant, if the
1234 user provides 0 as argument for the major number, and the value
1235 specified by the user otherwise).
1239 int register_chrdev (unsigned int major, const char* name)
1243 return MAJOR_NUMBER;
1247 2. `unregister_chrdev`: (in assets/spec.c) Unregisters a character
1248 device. This function asserts that the device is not used by any
1249 process anymore (we use the macro `MOD_IN_USE` to check this).
1251 int unregister_chrdev (unsigned int major, const char* name)
1261 3. `dummy_open`: (in assets/driver.c) This function increases the
1262 `usecount`. If the device is locked by some other process
1263 `dummy_open` returns -1. Otherwise it locks the device for the
1266 4. `dummy_read`: (in assets/driver.c) This function "simulates" a read
1267 access to the device. In fact it does nothing, since we are
1268 currently not interested in the potential buffer overflow that may
1269 result from a call to this function. Note the usage of the function
1270 `nondet_int`: This is an internal SATABS-function that
1271 nondeterministically returns an arbitrary integer value. The
1272 function `__CPROVER_assume` tells SATABS to ignore all traces that
1273 do not adhere to the given assumption. Therefore, whenever the lock
1274 is held, `dummy_read` will return a value between 0 and `max`. If
1275 the lock is not held, then `dummy_read` returns -1.
1277 5. `dummy_release`: (in assets/driver.c) If the lock is held, then
1278 `dummy_release` decreases the `usecount`, releases the lock, and
1279 returns 0. Otherwise, the function returns -1.
1281 We now want to check if any *valid* sequence of calls of the dispatch
1282 functions (in driver.c) can lead to the violation of the assertion (in
1283 assets/spec.c). Obviously, a call to `dummy_open` that is immediately
1284 followed by a call to `unregister_chrdev` violates the assertion.
1286 The function `main` in spec.c gives an example of how these functions
1287 are called. First, a character device "`dummy`" is registered. The major
1288 number is stored in the `inode` structure of the device. The values for
1289 the file structure are assigned non-deterministically. We rule out
1290 invalid sequences of calls by ensuring that no device is unregistered
1291 while it is still locked. We use the following model checking harness
1292 for calling the dispatching functions:
1294 random = nondet_uchar ();
1295 __CPROVER_assume (0 <= random && random <= 3);
1300 rval = dummy_open (&inode, &my_file);
1305 __CPROVER_assume (lock_held);
1306 count = dummy_read (&my_file, buffer, BUF_SIZE);
1309 dummy_release (&inode, &my_file);
1316 The variable `random` is assigned non-deterministically. Subsequently,
1317 the value of `random` is restricted to be 0 &le `random` ≤ 3 by a call
1318 to `__CPROVER_assume`. Whenever the value of `random` is not in this
1319 interval, the corresponding execution trace is simply discarded by
1320 SATABS. Depending on the value of `random`, the harness calls either
1321 `dummy_open`, `dummy_read` or `dummy_close`. Therefore, if there is a
1322 sequence of calls to these three functions that leads to a violation of
1323 the assertion in `unregister_chrdev`, then SATABS will eventually
1326 If we ask SATABS to show us the properties it verifies with
1328 satabs driver.c spec.c --show-properties
1330 for our example, we obtain
1332 1. Claim unregister\_chrdev.1:\
1333 file spec.c line 18 function unregister\_chrdev\
1334 MOD\_IN\_USE in unregister\_chrdev\
1337 2. Claim dummy\_open.1:\
1338 file driver.c line 15 function dummy\_open\
1340 (unsigned int)inode->i\_rdev >> 8 == (unsigned
1343 It seems obvious that the property dummy\_open.1 can never be violated.
1344 SATABS confirms this assumption: We call
1346 satabs driver.c spec.c --property dummy_open.1
1348 and SATABS reports `VERIFICATION SUCCESSFUL` after a few iterations.
1350 If we try to verify property unregister\_chrdev.1, SATABS reports that
1351 the property in line 18 in file spec.c is violated (i.e., the assertion
1352 does not hold, therefore the `VERIFICATION FAILED`). Furthermore, SATABS
1353 provides a detailed description of the problem in the form of a
1354 counterexample (i.e., an execution trace that violates the property). On
1355 this trace, `dummy_open` is called **twice**, leading to a `usecount` of 2.
1356 The second call of course fails with `rval=-1`, but the counter is
1357 increased nevertheless:
1359 int dummy_open (struct inode *inode, struct file *filp)
1361 __CPROVER_assert(MAJOR (inode->i_rdev) == dummy_major,
1369 return 0; /* success */
1372 Then, `dummy_release` is called to release the lock on the device.
1373 Finally, the loop is left and the call to `unregister_chrdev` results in
1374 a violation of the assertion (since `usecount` is still 1, even though
1377 \subsubsection man_satabs-tutorial-aeon Buffer Overflow in a Mail Transfer Agent
1379 We explain how to model check Aeon version 0.2a, a small mail transfer
1380 agent written by Piotr Benetkiewicz. The description advertises Aeon as
1381 a "*good choice for **hardened** or minimalistic boxes*". The sources
1383 [here](http://www.cprover.org/satabs/examples/loop_detection/aeon-0.2a.tar.gz).
1385 Our first naive attempt to verify Aeon using
1389 produces a positive result, but also warns us that the property holds
1390 trivially. It also reveals that a large number library functions are
1391 missing: SATABS is unable to find the source code for library functions
1392 like `send`, `write` and `close`.
1394 Now, do you have to provide a body for all missing library functions?
1395 There is no easy answer to this question, but a viable answer would be
1396 "most likely not". It is necessary to understand how SATABS handles
1397 functions without bodies: It simply assumes that such a function returns
1398 an arbitrary value, but that no other locations than the one on the left
1399 hand side of the assignment are changed. Obviously, there are cases in
1400 which this assumption is unsound, since the function potentially
1401 modifies all memory locations that it can somehow address.
1403 We now use static analysis to generate array bounds checks for Aeon:
1405 satabs *.c --pointer-check --bounds-check --show-properties
1407 SATABS will show about 300 properties in various functions (read
1408 \ref man_instrumentation-properties "this" for more information on the
1409 property instrumentation). Now consider the first few lines of the
1410 `main` function of Aeon:
1412 int main(int argc, char **argv)
1414 char settings[MAX_SETTINGS][MAX_LEN];
1416 numSet = getConfig(settings);
1418 logEntry("Missing config file!");
1423 and the function `getConfig` in `lib_aeon.c`:
1425 int getConfig(char settings[MAX_SETTINGS][MAX_LEN])
1428 FILE *fp; /* .rc file handler */
1429 int numSet = 0; /* number of settings */
1431 strcpy(home, getenv("HOME")); /* get home path */
1432 strcat(home, "/.aeonrc"); /* full path to rc file */
1433 fp = fopen(home, "r");
1434 if (fp == NULL) return -1; /* no cfg - ERROR */
1435 while (fgets(settings[numSet], MAX_LEN-1, fp)
1436 && (numSet < MAX_SETTINGS)) numSet++;
1442 The function `getConfig` makes calls to `strcpy`, `strcat`, `getenv`,
1443 `fopen`, `fgets`, and `fclose`. It is very easy to provide an
1444 implementation for the functions from the string library (string.h), and
1445 SATABS comes with meaningful definitions for most of them. The
1446 definition of `getenv` is not so straight-forward. The man-page of
1447 `getenv` (which we obtain by entering `man 3 getenv` in a Unix or cygwin
1448 command prompt) tells us:
1450 `` `getenv' `` searches the list of environment variable names
1451 and values (using the global pointer `char **environ`) for a
1452 variable whose name matches the string at `NAME`. If a variable name
1453 matches, `getenv` returns a pointer to the associated value.
1455 SATABS has no information whatsoever about the content of `environ`.
1456 Even if SATABS could access the environment variables on your
1457 computer, a successful verification of `Aeon` would then only guarantee
1458 that the properties for this program hold on your computer with a
1459 specific set of environment variables. We have to assume that
1460 `environ` contains environment variables that have an arbitrary
1461 content of arbitrary length. The content of environment variables is
1462 not only arbitrary but could be malefic, since it can be modified by the
1463 user. The approximation of the behavior of `getenv` that is shipped with
1464 SATABS completely ignores the content of the string.
1466 Now let us have another look at the properties that SATABS generates for
1467 the models of the the string library and for `getenv`. Most of these
1468 properties require that we verify that the upper and lower bounds of
1469 buffers or arrays are not violated. Let us look at one of the properties
1470 that SATABS generates for the code in function `getConfig`:
1472 Claim getConfig.3: file lib_aeon.c line 19 function getConfig dereference failure: NULL plus offset pointer !(SAME-OBJECT(src, NULL))`
1474 The model of the function `strcpy` dereferences the pointer returned by
1475 `getenv`, which may return a NULL pointer. This possibility is detected
1476 by the static analysis, and thus a corresponding property is generated.
1477 Let us check this specific property:
1479 satabs *.c --pointer-check --bounds-check --property getConfig.3
1481 SATABS immediately returns a counterexample path that demonstrates how
1482 `getenv` returns a NULL, which is subsequently dereferenced. We have
1483 identified the first bug in this program: it requires that the
1484 environment variable HOME is set, and crashes otherwise.
1486 Let us examine one more property in the same function:
1488 Claim getConfig.7: file lib_aeon.c line 19 function getConfig dereference failure: array `home' upper bound !(POINTER_OFFSET(dst) + (int)i >= 512) || !(SAME-OBJECT(dst, &home[0]))
1490 This property asserts that the upper bound of the array `home` is not
1491 violated. The variable `home` looks familiar: We encountered it in the
1492 function `getConfig` given above. The function `getenv` in combination
1493 with functions `strcpy`, `strcat` or `sprintf` is indeed often the
1494 source for buffer overflows. Therefore, we try to use SATABS to check
1495 the upper bound of the array `home`:
1497 satabs *.c --pointer-check --bounds-check --property getConfig.7
1499 SATABS runs for quite a while and will eventually give up, telling us
1500 that its upper bound for abstraction refinement iterations has been
1501 exceeded. This is not exactly the result we were hoping for, and we
1502 could now increase the bound for iterations with help of the
1503 `--iterations` command line switch of SATABS.
1505 Before we do this, let us investigate why SATABS has failed to provide a
1506 useful result. The function `strcpy` contains a loop that counts from 1
1507 to the length of the input string. Predicate abstraction, the mechanism
1508 SATABS is based on, is unable to detect such loops and will therefore
1509 unroll the loop body as often as necessary. The array `home` has
1510 `MAX_LEN` elements, and `MAX_LEN` is defined to be 512 in `aeon.h`.
1511 Therefore, SATABS would have to run through at least 512 iterations,
1512 only to verify (or reject) one of the more than 300 properties! Does
1513 this fact defeat the purpose of static verification?
1515 We can make the job easier: after reducing the value of `MAX_LEN` in
1516 `aeon.h` to a small value, say to 10, SATABS provides a counterexample
1517 trace that demonstrates how the buffer overflow be reproduced. If you
1518 use the Eclipse plugin (as described \ref man_install-eclipse "here"),
1519 you can step through this counterexample. The trace contains the string
1520 that is returned by `getenv`.
1523 \section man_modelling Modelling
1525 \subsection man_modelling-nondet Nondeterminism
1529 Programs typically read inputs from an environment. These inputs can
1530 take the form of data read from a file, keyboard or network socket, or
1531 arguments passed on the command line. It is usually desirable to analyze
1532 the program for any choice of these inputs. In Model Checking, inputs
1533 are therefore modeled by means of *nondeterminism*, which means that the
1534 value of the input is not specified. The program may follow any
1535 computation that results from any choice of inputs.
1537 ### Sources of Nondeterminism
1539 The CPROVER tools support the following sources of nondeterminism:
1541 - functions that read inputs from the environments;
1542 - the thread schedule in concurrent programs;
1543 - initial values of local-scoped variables and memory allocated with
1545 - initial values of variables that are `extern` in all compilation
1547 - explicit functions for generating nondeterminism.
1549 The CPROVER tools are shipped with a number of stubs for the most
1550 commonly used library functions. When executing a statement such as
1551 `getchar()`, a nondeterministic value is chosen instead of reading a
1552 character from the keyboard.
1554 When desired, nondeterminism can be introduced explicitly into the
1555 program by means of functions that begin with the prefix `nondet_`. As
1556 an example, the following function returns a nondeterministically chosen
1559 unsigned short int nondet_ushortint();
1561 Note that the body of the function is not defined. The name of the
1562 function itself is irrelevant (save for the prefix), but must be unique.
1563 Also note that a nondeterministic choice is not to be confused with a
1564 probabilistic (or random) choice.
1566 ### Uninterpreted Functions
1568 It may be necessary to check parts of a program independently.
1569 Nondeterminism can be used to over-approximate the behaviour of part of
1570 the system which is not being checked. Rather than calling a complex or
1571 unrelated function, a nondeterministic stub is used. However, separate
1572 calls to the function can return different results, even for the same
1573 inputs. If the function output only depends on its inputs then this can
1574 introduce spurious errors. To avoid this problem, functions whose names
1575 begin with the prefix `__CPROVER_uninterpreted_` are treated as
1576 uninterpreted functions. Their value is non-deterministic but different
1577 invocations will return the same value if their inputs are the same.
1578 Note that uninterpreted functions are not supported by all back-end
1581 \subsection man_modelling-assumptions Modeling with Assertions and Assumptions
1585 [Assertions](http://en.wikipedia.org/wiki/Assertion_%28computing%29) are
1586 statements within the program that attempt to capture the programmer's
1587 intent. The ANSI-C standard defines a header file
1588 [assert.h](http://en.wikipedia.org/wiki/Assert.h), which offers a macro
1589 `assert(cond)`. When executing a statement such as
1593 the execution is aborted with an error message if the condition
1594 evaluates to *false*, i.e., if `p` is NULL in the example above. The
1595 CPROVER tools can check the validity of the programmer-annotated
1596 assertions statically. Specifically, the CPROVER tools will check that
1597 the assertions hold for *any* nondeterministic choice that the program
1598 can make. The static assertion checks can be disabled using the
1599 `--no-assertions` command line option.
1601 In addition, there is a CPROVER-specific way to specify assertions,
1602 using the built-in function `__CPROVER_assert`:
1604 __CPROVER_assert(p!=NULL, "p is not NULL");
1606 The (mandatory) string that is passed as the second argument provides an
1607 informal description of the assertion. It is shown in the list of
1608 properties together with the condition.
1610 The assertion language of the CPROVER tools is identical to the language
1611 used for expressions. Note that \ref man_modelling-nondet
1613 can be exploited in order to check a range of choices. As an example,
1614 the following code fragment asserts that **all** elements of the array
1625 The nondeterministic choice will guess the element of the array that is
1626 nonzero. The code fragment above is therefore equivalent to
1632 for(i=0; i<100; i++)
1635 Future CPROVER releases will support explicit quantifiers with a syntax
1636 that resembles Spec\#:
1638 __CPROVER_forall { *type identifier* ; *expression* }
1639 __CPROVER_exists { *type identifier* ; *expression* }
1643 Assumptions are used to restrict nondeterministic choices made by the
1644 program. As an example, suppose we wish to model a nondeterministic
1645 choice that returns a number from 1 to 100. There is no integer type
1646 with this range. We therefore use \_\_CPROVER\_assume to restrict the
1647 range of a nondeterministically chosen integer:
1649 unsigned int nondet_uint();
1651 unsigned int one_to_hundred()
1653 unsigned int result=nondet_uint();
1654 __CPROVER_assume(result>=1 && result<=100);
1658 The function above returns the desired integer from 1 to 100. You must
1659 ensure that the condition given as an assumption is actually satisfiable
1660 by some nondeterministic choice, or otherwise the model checking step
1661 will pass vacuously.
1663 Also note that assumptions are never retroactive: They only affect
1664 assertions (or other properties) that follow them in program order. This
1665 is best illustrated with an example. In the following fragment, the
1666 assumption has no effect on the assertion, which means that the
1667 assertion will fail:
1671 __CPROVER_assume(x==100);
1673 Assumptions do restrict the search space, but only for assertions that
1674 follow. As an example, the following program will pass:
1679 __CPROVER_assume(x>=1 && x<=100000);
1683 __CPROVER_assert(x<0, "x is negative");
1686 Beware that nondeterminism cannot be used to obtain the effect of
1687 universal quantification in assumptions. As an example,
1694 __CPROVER_assume(x>=0 && x<10 && y>=0 && y<10);
1696 __CPROVER_assume(a[x]>=0);
1701 fails, as there is a choice of x and y which results in a counterexample
1702 (any choice in which x and y are different).
1704 \subsection man_modelling-pointers Pointer Model
1708 C programs (and sometimes C++ programs as well) make intensive use of
1709 pointers in order to decouple program code from specific data. A pointer
1710 variable does not store data such as numbers or letters, but instead
1711 points to a location in memory that hold the relevant data. This section
1712 describes the way the CPROVER tools model pointers.
1714 ### Objects and Offsets
1716 The CPROVER tools represent pointers as a pair. The first member of the
1717 pair is the *object* the pointer points to, and the second is the offset
1720 In C, objects are simply continuous fragments of memory (this definition
1721 of "object" is not to be confused with the use of the term in
1722 object-oriented programming). Variables of any type are guaranteed to be
1723 stored as one object, irrespectively of their type. As an example, all
1724 members of a struct or array belong to the same object. CPROVER simply
1725 assigns a number to each active object. The object number of a pointer
1726 `p` can be extracted using the expression `__CPROVER_POINTER_OBJECT(p)`.
1727 As a consequence, pointers to different objects are always different,
1730 The offset (the second member of the pair that forms a pointer) is
1731 relative to the beginning of the object; it uses byte granularity. As an
1732 example, the code fragment
1737 p=(char *)(array+1);
1740 will result in a pointer with offset 5. The offset of a pointer `p` can
1741 be extracted using the expression `__CPROVER_POINTER_OFFSET(p)`.
1743 ### Dereferencing Pointers
1745 The CPROVER tools require that pointers that are dereferenced point to a
1746 valid object. Assertions that check this requirement can be generated
1747 using the option --pointer-check and, if desired, --bounds-check. These
1748 options will ensure that NULL pointers are not dereferenced, and that
1749 dynamically allocated objects have not yet been deallocated.
1751 Furthermore, the CPROVER tools check that dynamically allocated memory
1752 is not deallocated twice. The goto-instrument tool is also able to add
1753 checks for memory leaks, i.e., it detects dynamically allocated objects
1754 that are not deallocated once the program terminates.
1756 The CPROVER tools support pointer typecasts. Most casts are supported,
1757 with the following exceptions:
1759 1. One notable exception is that pointers can only be accessed using a
1760 pointer type. The conversion of a pointer into an integer-type using
1761 a pointer typecast is not supported.
1763 2. Casts from integers to pointers yield a pointer that is either NULL
1764 (if the integer is zero) or that point into a special array for
1765 modeling [memory-mapped
1766 I/O](http://en.wikipedia.org/wiki/Memory-mapped_I/O). Such pointers
1767 are assumed not to overlap with any other objects. This is, of
1768 course, only sound if a corresponding range check is instrumented.
1770 3. Accesses to arrays via pointers that have the array subtype need to
1773 ### Pointers in Open Programs
1775 It is frequently desired to validate an open program, i.e., a fragment
1776 of a program. Some variables are left undefined. In case an undefined
1777 pointer is dereferenced, CBMC assumes that the pointer points to a
1778 separate object of appropriate type with unbounded size. The object is
1779 assumed not to alias with any other object. This assumption may
1780 obviously be wrong in specific extensions of the program.
1782 \subsection man_modelling-floating-point Floating Point
1784 The CPROVER tools support bit-accurate reasoning about IEEE-754
1785 floating-point and fixed-point arithmetic. The C standard contains a
1786 number of areas of implementation-defined behaviour with regard to
1787 floating-point arithmetic:
1789 - CPROVER supports C99 Appendix F, and thus, the `__STD_IEC_559__`
1790 macro is defined. This means that the C `float` data type maps to
1791 IEEE 754 `binary32` and `double` maps to `binary64` and operations
1792 on them are as specified in IEEE 754.
1794 - `long double` can be configured to be `binary64`, `binary128`
1795 (quad precision) or a 96 bit type with 15 exponent bits and 80
1796 significant bits. The last is an approximation of Intel's x87
1797 extended precision double data type. As the C standard allows a
1798 implementations a fairly wide set of options for `long double`, it
1799 is best avoided for both portable code and bit-precise analysis. The
1800 default is to match the build architecture as closely as possible.
1802 - In CPROVER, floating-point expressions are evaluated at the 'natural
1803 precision' (the greatest of the arguments) and not at a
1804 higher precision. This corresponds to `FLT_EVAL_METHOD` set to `0`.
1805 Note that this is a different policy to some platforms (see below).
1807 - Expression contraction (for example, converting `x * y + c` to
1808 `fma(x,y,c)`) is not performed. In effect, the `FP_CONTRACT` pragma
1811 - Constant expressions are evaluated at \`run' time wherever possible
1812 and so will respect changes in the rounding mode. In effect, the
1813 `FENV_ACCESS` pragma is always off. Note that floating point
1814 constants are treated as doubles (unless they are followed by `f`
1815 when they are float) as specified in the C standard. `goto-cc`
1816 supports `-fsingle-precision-constant`, which allows
1817 the (non-standard) treatment of constants as floats.
1819 - Casts from int to float and float to float make use of the current
1820 rounding mode. Note that the standard requires that casts from float
1821 to int use round-to-zero (i.e. truncation).
1823 ### x86 and Other Platform-specific Issues
1825 Not all platforms have the same implementation-defined behaviour as
1826 CPROVER. This can cause mismatches between the verification environment
1827 and the execution environment. If this occurs, check the compiler manual
1828 for the choices listed above. There are two common cases that can cause
1829 these problems: 32-bit x86 code and the use of unsafe optimisations.
1831 Many compilers that target 32-bit x86 platforms employ a different
1832 evaluation method. The extended precision format of the x87 unit is used
1833 for all computations regardless of their native precision. Most of the
1834 time, this results in more accurate results and avoids edge cases.
1835 However, it can result in some obscure and difficult to debug behaviour.
1836 Checking if the `FLT_EVAL_METHOD` macro is non-zero (for these platforms
1837 it will typically be 2), should warn of these problems. Changing the
1838 compiler flags to use the SSE registers will resolve many of them, give
1839 a more standards-compliant platform and will likely perform better. Thus
1840 it is *highly* recommended. Use `-msse2 -mfpmath=sse` to enable this
1841 option for GCC. Visual C++ does not have an option to force the
1842 exclusive use of SSE instructions, but `/arch:SSE2` will pick SSE
1843 instructions "when it \[the compiler\] determines that it is faster to
1844 use the SSE/SSE2 instructions" and is thus better than `/arch:IA32`,
1845 which exclusively uses the x87 unit.
1847 The other common cause of discrepancy between CPROVER results and the
1848 actual platform are the use of unsafe optimisations. Some higher
1849 optimisation levels enable transformations that are unsound with respect
1850 to the IEEE-754 standard. Consult the compiler manual and disable any
1851 optimisations that are described as unsafe (for example, the GCC options
1852 `-ffast-math`). The options `-ffp-contract=off` (which replaces
1853 `-mno-fused-madd`), `-frounding-math` and `-fsignaling-nans` are needed
1854 for GCC to be strictly compliant with IEEE-754.
1858 CPROVER supports the four rounding modes given by IEEE-754 1985; round
1859 to nearest (ties to even), round up, round down and round towards zero.
1860 By default, round to nearest is used. However, command line options
1861 (`--round-to-zero`, etc.) can be used to over-ride this. If more control
1862 is needed, CPROVER has models of `fesetround` (for POSIX systems) and
1863 `_controlfp` (for Windows), which can be used to change the rounding
1864 mode during program execution. Furthermore, the inline assembly commands
1865 fstcw/fnstcw/fldcw (on x86) can be used.
1867 The rounding mode is stored in the (thread local) variable
1868 `__CPROVER_rounding_mode`, but users are strongly advised not to use
1873 CPROVER implements some of `math.h`, including `fabs`, `fpclassify` and
1874 `signbit`. It has very limited support for elementary functions. Care
1875 must be taken when verifying properties that are dependent on these
1876 functions as the accuracy of implementations can vary considerably. The
1877 C compilers can (and many do) say that the accuracy of these functions
1880 ### Fixed-point Arithmetic
1882 CPROVER also has support for fixed-point types. The `--fixedbv` flag
1883 switches `float`, `double` and `long double` to fixed-point types. The
1884 length of these types is platform specific. The upper half of each type
1885 is the integer component and the lower half is the fractional part.
1888 \section man_hard-soft Hardware and Software Equivalence and Co-Verification
1890 \subsection man_hard-soft-introduction Introduction
1892 A common hardware design approach employed by many companies is to first
1893 write a quick prototype that behaves like the planned circuit in a
1894 language like ANSI-C. This program is then used for extensive testing
1895 and debugging, in particular of any embedded software that will later on
1896 be shipped with the circuit. An example is the hardware of a cell phone
1897 and its software. After testing and debugging of the program, the actual
1898 hardware design is written using hardware description languages like
1899 [VHDL](http://en.wikipedia.org/wiki/VHDL) or
1900 [Verilog](http://en.wikipedia.org/wiki/Verilog).
1902 Thus, there are two implementations of the same design: one written in
1903 ANSI-C, which is written for simulation, and one written in register
1904 transfer level HDL, which is the actual product. The ANSI-C
1905 implementation is usually thoroughly tested and debugged.
1907 Due to market constraints, companies aim to sell the chip as soon as
1908 possible, i.e., shortly after the HDL implementation is designed. There
1909 is usually little time for additional debugging and testing of the HDL
1910 implementation. Thus, an automated, or nearly automated way of
1911 establishing the consistency of the HDL implementation is highly
1914 This motivates the verification problem: we want to verify the
1915 consistency of the HDL implementation, i.e., the product, [using the
1916 ANSI-C implementation as a
1917 reference](http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=936243&userType=inst).
1918 Establishing the consistency does not require a formal
1919 specification. However, formal methods to verify either the hardware or
1920 software design are still desirable.
1924 There have been several attempts in the past to tackle the problem.
1925 [Semeria et al.](http://portal.acm.org/citation.cfm?id=513951) describe
1926 a tool for verifying the combinational equivalence of RTL-C and an HDL.
1927 They translate the C code into HDL and use standard equivalence checkers
1928 to establish the equivalence. The C code has to be very close to a
1929 hardware description (RTL level), which implies that the source and
1930 target have to be implemented in a very similar way. There are also
1931 variants of C specifically for this purpose. The [SystemC
1932 standard](http://en.wikipedia.org/wiki/SystemC) defines a subset of C++
1933 that can be used for synthesis. Further variants of ANSI-C for
1934 specifying hardware are SpecC and Handel C, among others.
1936 The concept of verifying the equivalence of a software implementation
1937 and a synchronous transition system was introduced by [Pnueli, Siegel,
1938 and Shtrichman](http://www.springerlink.com/content/ah5lpxaagrjp8ax2/).
1939 The C program is required to be in a very specific form, since a
1940 mechanical translation is assumed.
1942 In 2000, [Currie, Hu, and
1943 Rajan](http://doi.acm.org/10.1145/337292.337339) transform DSP assembly
1944 language into an equation for the Stanford Validity Checker. The
1945 symbolic execution of programs for comparison with RTL is now common
1948 The previous work focuses on a small subset of ANSI-C that is
1949 particularly close to register transfer language. Thus, the designer is
1950 often required to rewrite the C program manually in order to comply
1951 with these constraints. We extend the methodology to handle the full set
1952 of ANSI-C language features. This is a challenge in the presence of
1953 complex, dynamic data structures and pointers that may dynamically point
1954 to multiple objects. Furthermore, our methodology allows arbitrary loop
1957 ### Further Material
1959 We provide a small \ref man_hard-soft-tutorial "tutorial" and a
1961 \ref man_hard-soft-inputs "how to synchronize inputs between the C model and the Verilog model".
1962 There is also a collection of
1963 [benchmark problems](http://www.cprover.org/hardware/sequential-equivalence/)
1968 - [Hardware Verification using ANSI-C Programs as a
1969 Reference](http://www-2.cs.cmu.edu/~svc/papers/view-publications-ck03.html)
1970 - [Behavioral Consistency of C and Verilog Programs Using Bounded
1971 Model Checking](http://www-2.cs.cmu.edu/~svc/papers/view-publications-cky03.html)
1972 - [A Tool for Checking ANSI-C
1973 Programs](http://www-2.cs.cmu.edu/~svc/papers/view-publications-ckl2004.html)
1974 - [Checking Consistency of C and Verilog using Predicate Abstraction
1975 and Induction](http://www.kroening.com/publications/view-publications-kc2004.html)
1978 \subsection man_hard-soft-tutorial Tutorial
1980 ### Verilog vs. ANSI-C
1982 We assume that CBMC is installed on your system. If not so, follow
1983 \ref man_install-cbmc "these instructions".
1985 The following Verilog module implements a 4-bit counter
1988 module top(input clk);
1994 always @(posedge clk)
1999 HW-CBMC can take Verilog modules as the one above as additional input.
2000 Similar as in co-simulation, the data in the Verilog modules is
2001 available to the C program by means of global variables. For the example
2002 above, the following C fragment shows the definition of the variable
2003 that holds the value of the `counter` register:
2006 unsigned int counter;
2009 extern struct module_top top;
2011 Using this definition, the value of the `counter` register in the
2012 Verilog fragment above can be accessed as `top.counter`. Please note
2013 that the name of the variable **must** match the name of the top module.
2014 The C program only has a view of one state of the Verilog model. The
2015 Verilog model makes a transition once the function `next_timeframe()` is
2018 As CBMC performs Bounded Model Checking, the number of timeframes
2019 available for analysis must be bounded (SATABS). As it is desirable to
2020 change the bound to adjust it to the available computing capacity, the
2021 bound is given on the command line and not as part of the C program.
2022 This makes it easy to use only one C program for arbitrary bounds. The
2023 actual bound is available in the C program using the following
2026 extern const unsigned int bound;
2028 Also note that the fragment above declares a constant variable of struct
2029 type. Thus, the C program can only read the trace values and is not able
2030 to modify them. We will later on describe how to drive inputs of the
2031 Verilog module from within the C program.
2033 As described in previous chapters, assertions can be used to verify
2034 properties of the Verilog trace. As an example, the following program
2035 checks two values of the trace of the counter module (counter.c):
2037 void next_timeframe();
2040 unsigned int counter;
2043 extern struct module_top top;
2048 assert(top.counter==2);
2050 assert(top.counter==3);
2053 The following CBMC command line checks these assertions with a bound of
2056 hw-cbmc counter.c counter.v --module top --bound 20
2058 Note that a specific version of CBMC is used, called `hw-cbmc`. The
2059 module name given must match the name of the module in the Verilog file.
2060 Multiple Verilog files can be given on the command line.
2062 The `--bound` parameter is not to be confused with the `--unwind`
2063 parameter. While the `--unwind` parameter specifies the maximum
2064 unwinding depth for loops within the C program, the `--bound` parameter
2065 specifies the number of times the transition relation of the Verilog
2066 module is to be unwound.
2070 For the given example, the verification is successful. If the first
2071 assertion is changed to
2073 assert(top.counter==10);
2075 and the bound on the command line is changed to 6, CBMC will produce a
2076 counterexample. CBMC produces two traces: One for the C program, which
2077 matches the traces described earlier, and a separate trace for the
2078 Verilog module. The values of the registers in the Verilog module are
2079 also shown in the C trace as part of the initial state.
2082 ----------------------------------------------------
2083 bound=6 (00000000000000000000000000000110)
2084 counter={ 0, 1, 2, 3, 4, 5, 6 }
2086 Failed assertion: assertion line 6 function main
2088 Transition system state 0
2089 ----------------------------------------------------
2092 Transition system state 1
2093 ----------------------------------------------------
2096 Transition system state 2
2097 ----------------------------------------------------
2100 Transition system state 3
2101 ----------------------------------------------------
2104 Transition system state 4
2105 ----------------------------------------------------
2108 Transition system state 5
2109 ----------------------------------------------------
2112 Transition system state 6
2113 ----------------------------------------------------
2118 The following program is using the bound variable to check the counter
2119 value in all cycles:
2121 void next_timeframe();
2122 extern const unsigned int bound;
2125 unsigned int counter;
2128 extern struct module_top top;
2133 for(cycle=0; cycle<bound; cycle++) {
2134 assert(top.counter==(cycle & 15));
2139 CBMC performs bounds checking, and restricts the number of times that
2140 `next_timeframe()` can be called. SATABS does not require a bound, and
2141 thus, `next_timeframe()` can be called arbitrarily many times.
2144 \subsection man_hard-soft-mapping Mapping Variables
2146 ### Mapping Variables within the Module Hierarchy
2148 Verilog modules are hierarchical. The `extern` declarations shown above
2149 only allow reading the values of signals and registers that are in the
2150 top module. In order to read values from sub-modules, CBMC uses
2153 As an example, consider the following Verilog file
2156 module counter(input clk, input [7:0] increment);
2162 always @(posedge clk)
2163 counter=counter+increment;
2167 module top(input clk);
2174 The file has two modules: a top module and a counter module. The counter
2175 module is instantiated twice within the top module. A reference to the
2176 register `counter` within the C program would be ambiguous, as the two
2177 module instances have separate instances of the register. CBMC and
2178 SATABS use the following data structures for this example:
2180 void next_timeframe();
2181 extern const unsigned int bound;
2184 unsigned char increment;
2185 unsigned char counter;
2189 struct module_counter c1, c2;
2192 extern struct module_top top;
2198 assert(top.c1.counter==3);
2199 assert(top.c2.counter==6);
2202 The `main` function reads both counter values for cycle 3. A deeper
2203 hierarchy (modules in modules) is realized by using additional structure
2204 members. Writing these data structures for large Verilog designs is
2205 error prone, and thus, HW-CBMC can generate them automatically. The
2206 declarations above are generated using the command line
2208 hw-cbmc --gen-interface --module top hierarchy.v
2210 ### Mapping Verilog Vectors to Arrays or Scalars
2212 In Verilog, a definition such as
2216 can be used for arithmetic (as in `x+10`) and as array of Booleans (as
2217 in `x[2]`). ANSI-C does not allow both, so when mapping variables from
2218 Verilog to C, the user has to choose one option for each such variable.
2219 As an example, the C declaration
2223 will allow using `x` in arithmetic expressions, while the C declaration
2225 __CPROVER_bool x[32];
2227 will allow accessing the individual bits of `x` using the syntax
2228 `x[bit]`. The `--gen-interface` option of HW-CBMC will generate the
2229 first variant if the vector has the same size as one of the standard
2230 integer types, and will use the `__CPROVER_bitvector[]` type if not
2231 so. This choice can be changed by adjusting the declaration accordingly.
2232 Note that both SpecC and SystemC offer bit-extraction operators, which
2233 means that it unnecessary to use the declaration as array in order to
2234 access individual bits of a vector.
2236 \subsection man_hard-soft-inputs Synchronizing Inputs
2238 ### Driving Primary Inputs
2240 The examples in the \ref man_hard-soft-tutorial "tutorial" are trivial
2241 in the sense that the model has only one possible trace. The initial
2242 state is deterministic, and there is only one possible transition, so
2243 the verification problem can be solved by testing a single run. In
2244 contrast, consider the following Verilog module:
2246 module top(input clk, input i);
2252 always @(posedge clk)
2258 The module above has an input named `i`. The top-level inputs of the
2259 Verilog design have to be generated by the C program. This is done by
2260 assigning the desired values to the corresponding struct member, and
2261 then calling the `set_inputs()` function before calling
2262 `next_timeframe()`. Consider the following example:
2264 void next_timeframe();
2266 extern const unsigned int bound;
2269 unsigned int counter;
2273 extern struct module_top top;
2276 assert(top.counter==0);
2279 set_inputs(); next_timeframe();
2280 assert(top.counter==1);
2283 set_inputs(); next_timeframe();
2284 assert(top.counter==2);
2287 set_inputs(); next_timeframe();
2288 assert(top.counter==2);
2291 As an example, consider a Verilog module that has a signal `reset` as an
2292 input, which is active-low. The following C fragment drives this input
2293 to be active in the first cycle, and not active in any subsequent cycle:
2296 set_inputs(); next_timeframe();
2298 for(i=1; i<bound; i++) {
2300 set_inputs(); next_timeframe();
2303 Note that the value of the input must be set *before* calling
2304 `next_timeframe()`. The effect of the input values on values derived in
2305 a combinatorial way is immediately visible. The effect on clocked values
2306 becomes visible in the next time frame.
2308 ### Using Nondeterminism
2310 The examples above use particular, constant values to drive the primary
2311 inputs. In order to check the behavior of the Verilog model for more
2312 than one specific input, use \ref man_modelling-nondet "nondeterminism".
2315 \section man_instrumentation Build Systems, Libraries, and Instrumentation
2317 \subsection man_instrumentation-libraries Build Systems and Libraries
2321 Similar to unit testing, the model checking approach requires the user
2322 to clearly define what parts of the program should be tested and what
2323 the behaviour of these parts is. This requirement has following reasons:
2325 - Despite recent advances, the size of the programs that model
2326 checkers can cope with is still restricted.
2328 - Typically, you want to verify *your* program and not the libraries
2329 or the operating that it uses (the correctness of these libraries
2330 and the OS us usually addressed separately).
2332 - CBMC and SATABS cannot verify binary libraries.
2334 - CBMC and SATABS does not provide a model for the hardware (e.g.,
2335 hard disk, input/output devices) the tested program runs on. Since
2336 CBMC and SATABS are supposed to examine the behavior of the tested
2337 program for **all** possible inputs and outputs, it is reasonable to
2338 model input and output by means of non-deterministic choice.
2342 Existing software projects usually do not come in a single source file
2343 that may simply be passed to a model checker, but is a collection of
2344 files held together by a build system. The extraction of models from
2345 such a build system using goto-cc is described
2346 \ref man_instrumentation-goto-cc "here".
2348 \subsection man_instrumentation-goto-cc Integration into Build Systems with goto-cc
2350 Existing software projects usually do not come in a single source file
2351 that may simply be passed to a model checker. They rather come in a
2352 multitude of source files in different directories and refer to external
2353 libraries and system-wide options. A *build system* then collects the
2354 configuration options from the system and compiles the software
2355 according to build rules.
2357 The most prevalent build tool on Unix (-based) systems surely is the
2358 `make` utility. This tool uses build rules given in a *Makefile* that
2359 comes with the software sources. Running software verification tools on
2360 projects like these is greatly simplified by a compiler that first
2361 collects all the necessary models into a single model file.
2362 [goto-cc](http://www.cprover.org/goto-cc/) is such a model file
2363 extractor, which can seamlessly replace `gcc` and `cl.exe` in Makefiles.
2364 The normal build system for the project may be used to build the
2365 software, but the outcome will be a model file with suitable detail for
2366 verification, as opposed to a flat executable program. Note that goto-cc
2367 comes in different variants depending on the compilation environment.
2368 These variants are described [here](goto-cc-variants.shtml).
2370 ### Example: Building wu-ftpd
2372 This example assumes a Unix-like machine.
2374 1. Download the sources of wu-ftpd from
2375 [here](ftp://ftp.wu-ftpd.org/pub/wu-ftpd/wu-ftpd-current.tar.gz).
2377 2. Unpack the sources by running ` tar xfz wu-ftpd-current.tar.gz`
2379 3. Change to the source directory, by entering, e.g.,
2382 4. Configure the project for verification by running
2384 `./configure YACC=byacc CC=goto-cc --host=none-none-none`
2386 5. Build the project by running `make`. This creates multiple model
2387 files in the `src` directory. Among them is a model for the main
2390 6. Run a model-checker, e.g., CBMC, on the model file:
2394 CBMC automatically recognizes that the file is a goto binary.
2398 More elaborate build or configuration scripts often make use of features
2399 of the compiler or the system library to detect configuration options
2400 automatically, e.g., in a `configure` script. Replacing `gcc` by goto-cc
2401 at this stage may confuse the script, or detect wrong options. For
2402 example, missing library functions do not cause goto-cc to throw an
2403 error (only to issue a warning). Because of this, configuration scripts
2404 sometimes falsely assume the availability of a system function or
2407 In the case of this or similar problems, it is more advisable to
2408 configure the project using the normal routine, and replacing the
2409 compiler setting manually in the generated Makefiles, e.g., by replacing
2410 lines like `CC=gcc` by `CC=goto-cc`.
2412 A helpful command that accomplishes this task successfully for many
2413 projects is the following:
2415 for i in `find . -name Makefile`; do sed -e 's/^\(\s*CC[ \t]*=\)\(.*$\)/\1goto-cc/g' -i $i done
2417 Here are additional examples on how to use goto-cc:
2419 - \ref man_goto-cc-linux "Linux Kernel"
2420 - \ref man_goto-cc-apache "Apache HTTPD"
2422 A description of how to integrate goto-cc into Microsoft's Visual Studio
2423 is \ref man_instrumentation-vs "here".
2425 ### Linking Libraries
2427 Some software projects come with their own libraries; also, the goal may
2428 be to analyze a library by itself. For this purpose it is possible to
2429 use goto-cc to link multiple model files into a library of model files.
2430 An object file can then be linked against this model library. For this
2431 purpose, goto-cc also features a linker mode.
2433 To enable this linker mode, create a link to the goto-cc binary by the
2434 name of goto-ld (Linux and Mac) or copy the goto-cc binary to
2435 goto-link.exe (Windows). The goto-ld tool can now be used as a seamless
2436 replacement for the `ld` tool present on most Unix (-based) systems and
2437 for the `link` tool on Windows.
2439 The default linker may need to be replaced by `goto-ld` or
2440 `goto-link.exe` in the build script, which can be achieved in much the
2441 same way as replacing the compiler.
2444 \subsection man_instrumentation-vs Integration into Visual Studio 2008 to 2012
2446 Visual Studio version 2008 onwards comes with a new XML-based build
2448 [MSBuild](http://msdn.microsoft.com/en-us/library/ms171452.aspx).
2449 The MSBuild system is also activated when triggering a build from the
2450 Visual Studio GUI. The project files created by the Visual Studio GUI
2451 are used as input by the MSBuild tool.
2453 The MSBuild system can be used to generate goto-binaries from your
2454 Visual Studio project as follows:
2456 1. Install the `goto-cl.exe` and `goto-link.exe` binaries in some
2457 directory that is contained in the PATH environment variable.
2459 2. Add a configuration for the goto-cc build for your project in the
2460 configuration manager, named "goto-cc".
2462 3. Open the Visual Studio Command Prompt (in the Tools menu).
2464 4. Locate the directory that contains the project. Change into this
2465 directory using "CD".
2469 msbuild /p:CLToolExe=goto-cl.exe /p:LinkToolExe=goto-link.exe /p:Flavor=goto-cc /p:Platform=x86
2471 The platform can be adjusted as required; the "Flavor" given should
2472 match the configuration that was created earlier.
2474 Note that the recent versions of goto-cc also support file names with
2475 non-ASCII (Unicode) characters on Windows platforms.
2477 \subsection man_instrumentation-goto-cc-variants Variants of goto-cc
2479 The goto-cc utility comes in several variants, summarised in the
2482 Executable | Environment | Preprocessor
2483 --------------|-------------------------------------------------------------------------|-------
2484 goto-cc | [gcc](http://gcc.gnu.org/) (control-flow graph only) | gcc -E
2485 goto-gcc | [gcc](http://gcc.gnu.org/) ("hybrid" executable) | gcc -E
2486 goto-armcc | [ARM RVDS](http://arm.com/products/tools/software-tools/rvds/index.php) | armcc -E
2487 goto-cl | [Visual Studio](http://www.microsoft.com/visualstudio/) | cl /E
2488 goto-cw | [Freescale CodeWarrior](http://www.freescale.com/webapp/sps/site/homepage.jsp?code=CW_HOME) | mwcceppc
2490 The primary difference between the variants is the preprocessor called.
2491 Furthermore, the language recognized varies slightly. The variants can
2492 be obtained by simply renaming the goto-cc executable. On Linux/MacOS,
2493 the variants can be obtained by creating a symbolic link.
2495 The "hybrid" executables contain both the control-flow graph for
2496 verification purposes and the usual, executable machine code.
2498 \subsection man_instrumentation-architecture Architectural Settings
2500 The behavior of a C/C++ program depends on a number of parameters that
2501 are specific to the architecture the program was compiled for. The three
2502 most important architectural parameters are:
2504 - The width of the various scalar types; e.g., compare the value of
2505 `sizeof(long int)` on various machines.
2506 - The width of pointers; e.g., compare the value of `sizeof(int *)` on
2508 - The [endianness](http://en.wikipedia.org/wiki/Endianness) of
2511 In general, the CPROVER tools attempt to adopt the settings of the
2512 particular architecture the tool itself was compiled for. For example,
2513 when running a 64 bit binary of CBMC on Linux, the program will be
2514 processed assuming that `sizeof(long int)==8`.
2516 As a consequence of these architectural parameters, you may observe
2517 different verification results for an identical program when running
2518 CBMC on different machines. In order to get consistent results, or when
2519 aiming at validating a program written for a different platform, the
2520 following command-line arguments can be passed to the CPROVER tools:
2522 - The word-width can be set with `--16`, `--32`, `--64`.
2523 - The endianness can be defined with `--little-endian` and
2526 When using a goto binary, CBMC and the other tools read the
2527 configuration from the binary, i.e., the setting when running goto-cc is
2528 the one that matters; the option given to the model checker is ignored
2531 In order to see the effect of the options `--16`, `--32` and `--64`,
2532 pass the following program to CBMC:
2538 printf("sizeof(long int): %d\n", (int)sizeof(long int));
2539 printf("sizeof(int *): %d\n", (int)sizeof(int *));
2543 The counterexample trace contains the strings printed by the `printf`
2546 The effects of endianness are more subtle. Try the following program
2547 with `--big-endian` and `--little-endian`:
2555 printf("Bytes of i: %d, %d, %d, %d\n", p[0], p[1], p[2], p[3]);
2560 \subsection man_instrumentation-properties Property Instrumentation
2564 We have mentioned *properties* several times so far, but we never
2565 explained *what* kind of properties CBMC and SATABS can verify. We cover
2566 this topic in more detail in this section.
2568 Both CBMC and SATABS use
2569 [assertions](http://en.wikipedia.org/wiki/Assertion_(computing)) to
2570 specify program properties. Assertions are properties of the state of
2571 the program when the program reaches a particular program location.
2572 Assertions are often written by the programmer by means of the `assert`
2575 In addition to the assertions written by the programmer, assertions for
2576 specific properties can also be generated automatically by CBMC and
2577 SATABS, often relieving the programmer from writing "obvious"
2580 CBMC and SATABS come with an assertion generator called
2581 `goto-instrument`, which performs a conservative [static
2582 analysis](http://en.wikipedia.org/wiki/Static_code_analysis) to
2583 determine program locations that potentially contain a bug. Due to the
2584 imprecision of the static analysis, it is important to emphasize that
2585 these generated assertions are only *potential* bugs, and that the Model
2586 Checker first needs to confirm that they are indeed genuine bugs.
2588 The assertion generator can generate assertions for the verification of
2589 the following properties:
2591 - **Buffer overflows.** For each array access, check whether the upper
2592 and lower bounds are violated.
2593 - **Pointer safety.** Search for `NULL`-pointer dereferences or
2594 dereferences of other invalid pointers.
2596 - **Division by zero.** Check whether there is a division by zero in
2599 - **Not-a-Number.** Check whether floating-point computation may
2600 result in [NaNs](http://en.wikipedia.org/wiki/NaN).
2602 - **Unitialized local.** Check whether the program uses an
2603 uninitialized local variable.
2605 - **Data race.** Check whether a concurrent program accesses a shared
2606 variable at the same time in two threads.
2608 We refrain from explaining the properties above in detail. Most of them
2609 relate to behaviors that are left undefined by the respective language
2610 semantics. For a discussion on why these behaviors are usually very
2611 undesirable, read [this](http://blog.regehr.org/archives/213) blog post
2614 All the properties described above are *reachability* properties. They
2615 are always of the form
2617 "*Is there a path through the program such that property ... is
2620 The counterexamples to such properties are always program paths. Users
2621 of the Eclipse plugin can step through these counterexamples in a way
2622 that is similar to debugging programs. The installation of this plugin
2623 is explained \ref man_install-eclipse "here".
2625 ### Using goto-instrument
2627 The goto-instrument static analyzer operates on goto-binaries, which is
2628 a binary representation of control-flow graphs. The goto-binary is
2629 extracted from program source code using goto-cc, which is explained
2630 \ref man_instrumentation-goto-cc "here". Given a goto-program,
2631 goto-instrument operates as follows:
2633 1. A goto-binary is read in.
2634 2. The specified static analyses are performed.
2635 3. Any potential bugs found are transformed into corresponding
2636 assertions, and are added into the program.
2637 4. A new goto-binary (with assertions) is written to disc.
2639 As an example, we begin with small C program we call `expr.c` (taken
2640 from [here](http://www.spinroot.com/uno/)):
2651 The program contains an obvious NULL-pointer dereference. We first
2652 compile the example program with goto-cc and then instrument the
2653 resulting goto-binary with pointer checks.
2655 goto-cc expr.c -o in.gb goto-instrument in.gb out.gb --pointer-check
2657 We can now get a list of the assertions that have been generated as
2660 goto-instrument out.gb --show-properties
2662 Using either CBMC or SATABS on `out.gb`, we can obtain a counterexample
2663 trace for the NULL-pointer dereference:
2667 The goto-instrument program supports the following checks:
2670 -----------------------------|----------------------------------------------
2671 `--no-assertions` | ignore user assertions
2672 `--bounds-check` | add array bounds checks
2673 `--div-by-zero-check` | add division by zero checks
2674 `--pointer-check` | add pointer checks
2675 `--signed-overflow-check` | add arithmetic over- and underflow checks
2676 `--unsigned-overflow-check` | add arithmetic over- and underflow checks
2677 `--undefined-shift-check` | add range checks for shift distances
2678 `--nan-check` | add floating-point NaN checks
2679 `--uninitialized-check` | add checks for uninitialized locals (experimental)
2680 `--error-label label` | check that given label is unreachable
2682 \subsection man_instrumentation-api The CPROVER API Reference
2684 The following sections summarize the functions available to programs
2685 that are passed to the CPROVER tools.
2689 #### \_\_CPROVER\_assume, \_\_CPROVER\_assert, assert
2691 void __CPROVER_assume(_Bool assumption);
2692 void __CPROVER_assert(_Bool assertion, const char *description);
2693 void assert(_Bool assertion);
2695 The function **\_\_CPROVER\_assume** adds an expression as a constraint
2696 to the program. If the expression evaluates to false, the execution
2697 aborts without failure. More detail on the use of assumptions is in the
2698 section on [Assumptions and Assertions](modeling-assertions.shtml).
2700 #### \_\_CPROVER\_same\_object, \_\_CPROVER\_POINTER\_OBJECT, \_\_CPROVER\_POINTER\_OFFSET, \_\_CPROVER\_DYNAMIC\_OBJECT
2702 _Bool __CPROVER_same_object(const void *, const void *);
2703 unsigned __CPROVER_POINTER_OBJECT(const void *p);
2704 signed __CPROVER_POINTER_OFFSET(const void *p);
2705 _Bool __CPROVER_DYNAMIC_OBJECT(const void *p);
2707 The function **\_\_CPROVER\_same\_object** returns true if the two
2708 pointers given as arguments point to the same object. The function
2709 **\_\_CPROVER\_POINTER\_OFFSET** returns the offset of the given pointer
2710 relative to the base address of the object. The function
2711 **\_\_CPROVER\_DYNAMIC\_OBJECT** returns true if the pointer passed as
2712 arguments points to a dynamically allocated object.
2714 #### \_\_CPROVER\_is\_zero\_string, \_\_CPROVER\_zero\_string\_length, \_\_CPROVER\_buffer\_size
2716 _Bool __CPROVER_is_zero_string(const void *);
2717 __CPROVER_size_t __CPROVER_zero_string_length(const void *);
2718 __CPROVER_size_t __CPROVER_buffer_size(const void *);
2720 #### \_\_CPROVER\_initialize
2722 void __CPROVER_initialize(void);
2724 The function **\_\_CPROVER\_initialize** computes the initial state of
2725 the program. It is called prior to calling the main procedure of the
2728 #### \_\_CPROVER\_input, \_\_CPROVER\_output
2730 void __CPROVER_input(const char *id, ...);
2731 void __CPROVER_output(const char *id, ...);
2733 The functions **\_\_CPROVER\_input** and **\_\_CPROVER\_output** are
2734 used to report an input or output value. Note that they do not generate
2735 input or output values. The first argument is a string constant to
2736 distinguish multiple inputs and outputs (inputs are typically generated
2737 using nondeterminism, as described [here](modeling-nondet.shtml)). The
2738 string constant is followed by an arbitrary number of values of
2741 #### \_\_CPROVER\_cover
2743 void __CPROVER_cover(_Bool condition);
2745 This statement defines a custom coverage criterion, for usage with the
2746 [test suite generation feature](cover.shtml).
2748 #### \_\_CPROVER\_isnan, \_\_CPROVER\_isfinite, \_\_CPROVER\_isinf, \_\_CPROVER\_isnormal, \_\_CPROVER\_sign
2750 _Bool __CPROVER_isnan(double f);
2751 _Bool __CPROVER_isfinite(double f);
2752 _Bool __CPROVER_isinf(double f);
2753 _Bool __CPROVER_isnormal(double f);
2754 _Bool __CPROVER_sign(double f);
2756 The function **\_\_CPROVER\_isnan** returns true if the double-precision
2757 floating-point number passed as argument is a
2758 [NaN](http://en.wikipedia.org/wiki/NaN).
2760 The function **\_\_CPROVER\_isfinite** returns true if the
2761 double-precision floating-point number passed as argument is a finite
2764 This function **\_\_CPROVER\_isinf** returns true if the
2765 double-precision floating-point number passed as argument is plus or
2768 The function **\_\_CPROVER\_isnormal** returns true if the
2769 double-precision floating-point number passed as argument is a normal
2772 This function **\_\_CPROVER\_sign** returns true if the double-precision
2773 floating-point number passed as argument is negative.
2775 #### \_\_CPROVER\_abs, \_\_CPROVER\_labs, \_\_CPROVER\_fabs, \_\_CPROVER\_fabsl, \_\_CPROVER\_fabsf
2777 int __CPROVER_abs(int x);
2778 long int __CPROVER_labs(long int x);
2779 double __CPROVER_fabs(double x);
2780 long double __CPROVER_fabsl(long double x);
2781 float __CPROVER_fabsf(float x);
2783 These functions return the absolute value of the given argument.
2785 #### \_\_CPROVER\_array\_equal, \_\_CPROVER\_array\_copy, \_\_CPROVER\_array\_set
2787 _Bool __CPROVER_array_equal(const void array1[], const void array2[]);
2788 void __CPROVER_array_copy(const void dest[], const void src[]);
2789 void __CPROVER_array_set(const void dest[], value);
2791 The function **\_\_CPROVER\_array\_equal** returns true if the values
2792 stored in the given arrays are equal. The function
2793 **\_\_CPROVER\_array\_copy** copies the contents of the array **src** to
2794 the array **dest**. The function **\_\_CPROVER\_array\_set** initializes
2795 the array **dest** with the given value.
2797 #### Uninterpreted Functions
2799 Uninterpreted functions are documented \ref man_modelling-nondet "here".
2801 ### Predefined Types and Symbols
2803 #### \_\_CPROVER\_bitvector
2805 __CPROVER_bitvector [ expression ]
2807 This type is only available in the C frontend. It is used to specify a
2808 bit vector with arbitrary but fixed size. The usual integer type
2809 modifiers **signed** and **unsigned** can be applied. The usual
2810 arithmetic promotions will be applied to operands of this type.
2812 #### \_\_CPROVER\_floatbv
2814 __CPROVER_floatbv [ expression ] [ expression ]
2816 This type is only available in the C frontend. It is used to specify an
2817 IEEE-754 floating point number with arbitrary but fixed size. The first
2818 parameter is the total size (in bits) of the number, and the second is
2819 the size (in bits) of the mantissa, or significand (not including the
2820 hidden bit, thus for single precision this should be 23).
2822 #### \_\_CPROVER\_fixedbv
2824 __CPROVER_fixedbv [ expression ] [ expression ]
2826 This type is only available in the C frontend. It is used to specify a
2827 fixed-point bit vector with arbitrary but fixed size. The first
2828 parameter is the total size (in bits) of the type, and the second is the
2829 number of bits after the radix point.
2831 #### \_\_CPROVER\_size\_t
2833 The type of sizeof expressions.
2835 #### \_\_CPROVER\_rounding\_mode
2837 extern int __CPROVER_rounding_mode;
2839 This variable contains the IEEE floating-point arithmetic rounding mode.
2841 #### \_\_CPROVER\_constant\_infinity\_uint
2843 This is a constant that models a large unsigned integer.
2845 #### \_\_CPROVER\_integer, \_\_CPROVER\_rational
2847 **\_\_CPROVER\_integer** is an unbounded, signed integer type.
2848 **\_\_CPROVER\_rational** is an unbounded, signed rational number type.
2850 #### \_\_CPROVER\_memory
2852 extern unsigned char __CPROVER_memory[];
2854 This array models the contents of integer-addressed memory.
2856 #### \_\_CPROVER::unsignedbv<N> (C++ only)
2858 This type is the equivalent of **unsigned \_\_CPROVER\_bitvector\[N\]**
2859 in the C++ front-end.
2861 #### \_\_CPROVER::signedbv<N> (C++ only)
2863 This type is the equivalent of **signed \_\_CPROVER\_bitvector\[N\]** in
2866 #### \_\_CPROVER::fixedbv<N> (C++ only)
2868 This type is the equivalent of **\_\_CPROVER\_fixedbv\[N,m\]** in the
2873 Asynchronous threads are created by preceding an instruction with a
2874 label with the prefix **\_\_CPROVER\_ASYNC\_**.
2876 \subsection man_goto-cc-linux goto-cc: Extracting Models from the Linux Kernel
2878 The Linux kernel code consists of more than 11 million lines of
2879 low-level C and is frequently used to evaluate static analysis
2880 techniques. In the following, we show how to extract models from Linux
2883 1. First of all, you will need to make sure you have around 100 GB of
2884 free disc space available.
2886 2. Download the Kernel sources at
2887 <http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.39.tar.bz2>.
2891 `bunzip2 linux-2.6.39.tar.bz2`\
2892 `tar xvf linux-2.6.39.tar`\
2895 4. Now ensure that you can actually compile a kernel by doing
2900 These steps need to succeed before you can try to extract models
2903 5. Now compile [gcc-wrap.c](gcc-wrap.c) and put the resulting binary
2904 into a directory that is in your PATH variable:
2906 `lwp-download http://www.cprover.org/cprover-manual/gcc-wrap.c`\
2907 `gcc gcc-wrap.c -o gcc-wrap`\
2908 `cp gcc-wrap ~/bin/`\
2910 This assumes that the directory `~/bin` exists and is in your
2913 6. Now change the variable CC in the kernel Makefile as follows:
2922 This will re-compile the kernel, but this time retaining the
2923 preprocessed source files.
2925 8. You can now compile the preprocessed source files with goto-cc as
2928 find ./ -name .tmp_*.i > source-file-list
2929 for a in `cat source-file-list` ; do
2930 goto-cc -c $a -o $a.gb
2933 Note that it is important that the word-size of the kernel
2934 configuration matches that of goto-cc. Otherwise, compile-time
2935 assertions will fail, generating the error message "bit field size
2936 is negative". For a kernel configured for a 64-bit word-width, pass
2937 the option --64 to goto-cc.
2939 The resulting `.gb` files can be passed to any of the CPROVER tools.
2941 \subsection man_goto-cc-apache goto-cc: Extracting Models from the Apache HTTPD
2943 The [Apache HTTPD](http://httpd.apache.org/) is still the most
2944 frequently used web server. Together with the relevant libraries, it
2945 consists of around 0.4 million lines of C code. In the following, we
2946 show how to extract models from Apache HTTPD 2.4.2.
2948 1. First of all, we download the sources of Apache HTTPD and two
2949 supporting libraries and uncompress them:
2951 lwp-download http://www.mirrorservice.org/sites/ftp.apache.org/apr/apr-1.4.6.tar.bz2
2952 lwp-download http://www.mirrorservice.org/sites/ftp.apache.org/apr/apr-util-1.4.1.tar.bz2
2953 lwp-download http://mirror.catn.com/pub/apache/httpd/httpd-2.4.2.tar.bz2
2954 bunzip2 < apr-1.4.6.tar.bz2 | tar x
2955 bunzip2 < apr-util-1.4.1.tar.bz2 | tar x
2956 bunzip2 < httpd-2.4.2.tar.bz2 | tar x
2958 2. Now compile [gcc-wrap.c](gcc-wrap.c) and put the resulting binary
2959 into a directory that is in your PATH variable:
2961 lwp-download http://www.cprover.org/cprover-manual/gcc-wrap.c
2962 gcc gcc-wrap.c -o gcc-wrap
2965 This assumes that the directory `~/bin` exists and is in your
2968 3. We now build the sources with gcc:
2970 (cd apr-1.4.6; ./configure; make CC=gcc-wrap)
2971 (cd apr-util-1.4.1; ./configure --with-apr=../apr-1.4.6 ; make CC=gcc-wrap)
2972 (cd httpd-2.4.2; ./configure --with-apr=../apr-1.4.6 --with-apr-util=../apr-util-1.4.1 ; make CC=gcc-wrap)
2974 4. You can now compile the preprocessed source files with goto-cc as
2977 find ./ -name *.i > source-file-list
2978 for a in `cat source-file-list` ; do
2979 goto-cc -c $a -o $a.gb
2982 The resulting `.gb` files can be passed to any of the CPROVER tools.