2 \defgroup analyses analyses
6 This contains the abstract interpretation frameworks and several
7 static analyses that instantiate them.
9 \section analyses-frameworks Frameworks:
11 There are currently three abstract interpretation frameworks provided in this
12 directory. \ref analyses-ait, \ref analyses-flow-insensitive-analysis, and the
13 deprecated and obsolete \ref analyses-static-analysist.
15 \subsection analyses-ait Abstract interpreter framework (ait)
17 This abstract interpretation framework is the focus of current active
18 development, and is where most scalability improvements will happen.
19 It should be used as the basis for any new development work. This framework
20 is provided by \ref ait. This analysis framework is currently location sensitive
21 (meaning there is one abstract domain per code location) and is designed to be
22 run after the function pointer removal and return removal passes. There is
23 ongoing work to make this framework also support context sensitivity.
25 \subsection analyses-static-analysist Old Abstract interpreter framework (static_analysist)
27 The obsolete static analysis framework \ref static_analysist is only used by
28 \ref value_set_analysist. This abstract interpretation framework is deprecated in
29 favour of \ref analyses-ait, and should not be used as the basis for new code.
30 This framework is location sensitive (one domain per code location), but is able
31 to be run before function pointer removal and return removal phases.
33 \subsection analyses-flow-insensitive-analysis Flow-insensitive analysis (flow_insensitive_analysist)
35 Framework for flow-insensitive analyses. Maintains a single global abstract
36 value which instructions are invited to transform in turn. Unsound (terminates
38 (a) there are two or more instructions that incrementally reach a fixed point,
39 for example by walking a chain of pointers and updating a points-to set, but
40 (b) those instructions are separated by instructions that don't update the
41 abstract value (for example, SKIP instructions). Therefore, not recommended for
44 Only current user in-tree is \ref value_set_analysis_fit and its close
45 relatives, \ref value_set_analysis_fivrt and \ref value_set_analysis_fivrnst
47 \section analyses-specific-analyses Specific analyses:
49 \subsection analyses-call-graph Call graph and associated helpers (call_grapht)
51 A [https://en.wikipedia.org/wiki/Call_graph](call graph) for a GOTO model or
52 GOTO functions collection. \ref call_grapht implements a basic call graph, but
53 can also export the graph in \ref grapht format, which permits more advanced
54 graph algorithms to be applied; see \ref call_graph_helpers.h for functions
55 that work with the \ref grapht representation.
57 \subsection analyses-dominator Dominator analysis (cfg_dominators_templatet)
59 A [https://en.wikipedia.org/wiki/Dominator_(graph_theory)](dominator analysis)
60 for a GOTO model or GOTO functions collection. Briefly, if a CFG node is
61 dominated by {A, B, C} then in order to reach this node control must have flowed
62 through all of A, B and C; similarly if it is post-dominated by {D, E, F} then
63 after we pass through this node we will inevitably reach all of D, E and F
64 eventually if the program terminates.
66 This is useful for e.g. checking against introducing a redundant check: if we
67 have a `const int *x` parameter and wish to introduce a null check at node Y,
68 then we can also mark it checked at all the nodes that Y dominates, as at those
69 program points it has necessarily already been checked.
71 This analysis defines `cfg_dominatorst` (dominator analysis) and
72 `cfg_post_dominatorst` (post-dominator analysis). Run these analyses using
73 `operator()(const goto_programt &)`. Alternatively, the template can be
74 instantiated with a different type that can be used with
75 `procedure_local_cfg_baset` -- this is done by `natural_loops_mutablet`
76 using (non-const) `goto_programt`, and by `java_bytecode_convert_methodt` to
77 apply the dominator algorithm to its Java bytecode representation.
79 `cfg_dominators_templatet::output` is a good place to check how to query the
80 dominators it has found.
82 \subsection analyses-constant-propagation Constant propagation (\ref constant_propagator_ait)
84 A simple, unsound constant propagator. Replaces RHS symbol expressions (variable
85 reads) with their values when they appear to have a unique value at a particular
86 program point. Unsound with respect to pointer operations on the left-hand side
89 \subsection analyses-taint Taint analysis (custom_bitvector_analysist)
93 \subsection analyses-dependence-graph Data- and control-dependence analysis (dependence_grapht)
97 \subsection analyses-dirtyt Address-taken lvalue analysis (dirtyt)
101 \subsection analyses-const-cast-removal const_cast removal analysis (does_remove_constt)
105 \subsection analyses-escape Escape analysis (escape_analysist)
107 This is a simple [https://en.wikipedia.org/wiki/Escape_analysis](escape analysis).
108 It is intended to implement `__CPROVER_cleanup` instructions, which say that a given
109 cleanup function should be run when a particular object goes out of scope. Examples of
110 its usage can be seen in `ansi-c/library/stdio.c` and `ansi-c/library/pthread_lib.c`,
111 where it is used to introduce assertions that a `FILE*` going out of scope has been
112 `fclose`'d or similarly that a `pthread_mutex*` has been properly destroyed.
114 For example `f = fopen("/tmp/hello", "wb"); __CPROVER_cleanup(f, check_closed);`
115 should result in a call to `check_closed(f)` or `check_closed()` (depending on
116 whether it has an argument) when `f` and all its aliases go out of scope.
118 The alias analysis is a simple union-find algorithm, and can introduce cleanup calls
119 too soon (i.e. before all aliases are really gone), so this overestimates the possible
120 problems caused by escaping pointers.
122 Note this differs from a typical escape analysis, which would conservatively
123 (over-)estimate the objects that may be reachable when (e.g.) a function exits;
124 this *underestimates* the reachable objects in order to favour false positives
125 over false negatives when testing for missing close operations.
127 \subsection analyses-global-may-alias Global may-alias analysis (global_may_aliast)
129 This is a pointer alias analysis (analysing the memory locations a pointer
130 expression may point to, and finding when two pointer expressions refer to
131 the same storage locations). It is flow-insensitive, meaning that it computes
132 what memory locations pointer expressions may refer to at any time during the program
133 execution. It's called may-alias, because it looks for aliasing that may occur
134 during program execution (compared to analysis for aliasing that must occur). It's
135 an over-approximating analysis.
137 \subsection analyses-rwt Read-write range analysis (goto_rwt)
141 \subsection analyses-invariant-propagation Invariant propagation (invariant_propagationt)
145 \subsection analyses-is-threaded Multithreaded program detection (is_threadedt)
149 \subsection analyses-pointer-classification Pointer classification analysis (is-heap-pointer, might-be-null, etc -- local_bitvector_analysist)
153 \subsection analyses-cfg Control-flow graph (local_cfgt)
157 \subsection analyses-local-may-alias Local may-alias analysis (local_may_aliast)
161 \subsection analyses-safe-dereference Safe dereference analysis (local_safe_pointerst)
165 \subsection analyses-locals Address-taken locals analysis (localst)
169 \subsection analyses-natural-loop Natural loop analysis (natural_loops_templatet)
171 A natural loop is when the nodes and edges of a graph make one self-encapsulating
172 circle (implemented in /ref natural_loops_templatet) with no incoming edges from external nodes.
173 For example A -> B -> C -> D -> A is a natural loop, but if B has an incoming edge from X,
174 then it isn't a natural loop, because X is an external node. Outgoing edges don't affect
175 the natural-ness of a loop.
177 /ref cfg_dominators_templatet provides the dominator analysis used to determine if a nodes
178 children can only be reached through itself and is thus part of a natural loop, and whose specifics
179 is covered in a separate subsection.
181 A basic description for how a natural loop works is here: https://web.cs.wpi.edu/~kal/PLT/PLT8.6.4.html
183 \subsection analyses-reaching-definitions Reaching definitions (reaching_definitions_analysist)
187 \subsection analyses-uncaught-exceptions Uncaught exceptions analysis (uncaught_exceptions_domaint)
191 \subsection analyses-uninitialized-locals Uninitialized locals analysis (uninitialized_analysist)
195 \section analyses-transformations Transformations (arguably in the wrong directory):
197 \subsection analyses-goto-checkt Pointer / overflow / other check insertion (goto_checkt)
201 \subsection analyses-interval-analysis Integer interval analysis -- both an analysis and a transformation
203 \ref interval_analysis interprets instructions of the input \ref goto_modelt
204 over the \ref interval_domaint, evaluates variables for each program location
205 (as intervals) and instruments the program with assumptions representing the
206 over-approximation of variable values.