forked from acsl-language/acsl
-
Notifications
You must be signed in to change notification settings - Fork 0
/
compjml_modern.tex
384 lines (323 loc) · 15 KB
/
compjml_modern.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
\section{Comparison with JML}
\label{sec:comp-jml}
\ifCPP{This section was written in conjunction with the definition of \acsl. Most of the points are also accurate for \NAME. However,
since \lang an object-oriented language with inheritance and exceptions, some
aspects of \NAME, more so than \acsl, are similar (and were inspired by) JML.}
Although \acsl took inspiration from the Java Modeling Language (aka
JML~\cite{jmlhomepage}), \acsl is notably different from JML in
two crucial aspects:
\begin{itemize}
\item \acsl is a BISL for C, a low-level structured language, while JML
is a BISL for Java, an object-oriented inheritance-based high-level
language. Not only are the language features not the same between Java and C, but the
programming styles and idioms are very different, which then entails
different ways of specifying behaviors. In particular, C has no
inheritance or exceptions, and no language support for the simplest
properties on memory (\emph{e.g.}, the size of an allocated memory block).
\item JML also supports runtime assertion checking (RAC) when typing,
static analysis and automatic deductive verification fail. The
example of CCured~\cite{necula02ccured,condit03ccured}, which also adds
strong typing to C by relying on RAC, shows that it is not possible
to do it in a modular way. Indeed, it is necessary to modify the
layout of C data structures for RAC, which is not modular. The
follow-up project Deputy~\cite{condit07deputy} thus reduces the
checking power of annotations in order to preserve modularity. In contrast, we choose not to restrain the power of annotations
(\emph{e.g.}, all first order logic formulas are allowed). To that end, we
rely on manual deductive verification using an interactive theorem
prover (\emph{e.g.}, Coq) when every other technique fails.
\end{itemize}
\noindent
In the remainder of this chapter, we describe these differences in
further detail.
\subsection{Low-level language vs. inheritance-based one}
\subsubsection*{No inherited specifications}
JML has a core notion of specification inheritance, which
enables support for behavioral subtyping, by applying
specifications of parent methods to overriding methods. Inheritance
combined with visibility and modularity account for a number of
complex features in JML (\emph{e.g.}, \verb|spec_public| modifier, data
groups, represents clauses, etc), that are necessary to express the
desired inheritance-related specifications while respecting visibility
and modularity. Since C has no inheritance, these intricacies are
avoided in \acsl.
\subsubsection*{Error handling without exceptions}
\label{sec:errorhandling}
The usual way of signaling errors in Java is through
exceptions. Therefore, JML specifications are tailored to express
exceptional postconditions, depending on the exception raised. Since C
has no exceptions, \acsl does not use exceptional
specifications. Instead, C programmers typically signal errors by
returning special values, as is mandated in various ways by the C standard.
\begin{example}
In \S 7.12.1 of the standard, it is said that functions in <math.h>
signal errors as follows:
``On a domain error, [...] the integer expression errno
acquires the value EDOM.''
\end{example}
\begin{example}
In \S 7.19.5.1 of the standard, it is said that function fclose signals
errors as follows:
``The fclose function returns [...] EOF if any errors were detected.''
\end{example}
\begin{example}
In \S 7.19.6.1 of the standard, it is said that function fprintf
signals errors as follows:
``The fprintf function returns [...] a negative value if an output or
encoding error occurred.''
\end{example}
\begin{example}
In \S 7.20.3 of the standard, it is said that memory management functions
signal errors as follows:
``If the space cannot be allocated, a null pointer is returned.''
\end{example}
As shown by these few examples, there is no unique way to signal
errors in the C standard library, not to mention errors from user-defined
functions. But since errors are signaled by returning special values, it
is sufficient to write an appropriate postcondition:
\begin{listing-nonumber}
/*@ ensures \result == error_value || normal_postcondition; */
\end{listing-nonumber}
% \noindent
% A tool could easily set error conditions aside, by providing an
% appropriate extension of behaviors, \emph{e.g}, using a new keyword
% \verb|failswith|:
% \begin{flushleft}\ttfamily
% /*@ failswith $\mathit{\result == error\_value}$; \\
% ~~@ ensures $\mathit{normal\_postcondition}$; \\
% ~~@*/
% \end{flushleft}
%\input{fwrite-malloc.pp}
\subsubsection*{C contracts are not Java ones}
In Java, the precondition of the following function that nullifies an
array of characters is always true. Even if there was a precondition
on the length of array {\ttfamily a}, it could easily be expressed using
the Java expression {\ttfamily a.length} that gives the dynamic length
of array {\ttfamily a}.
\begin{listing}{1}
public static void Java_nullify(char[] a) {
if (a == null) return;
for (int i = 0; i < a.length; ++i) {
a[i] = 0;
}
}
\end{listing}
On the other hand, the precondition of the same function in C, whose
definition follows, is more involved. First, note that the
C programmer has to add an extra argument for the size of the array,
or rather a lower bound on this array size.
\begin{listing}{1}
void C_nullify(char* a, unsigned int n) {
int i;
if (n == 0) return;
for (i = 0; i < n; ++i) {
a[i] = 0;
}
}
\end{listing}
\noindent
A correct precondition for this function is the following:
\begin{listing-nonumber}
/*@ requires \valid(a + 0..(n-1)); */
\end{listing-nonumber}
where predicate \valid is the one defined in Section~\ref{subsec:memory}.
(note that \lstinline|\valid(a + 0..(-1))| is the same as
\lstinline|\valid(\empty)| and thus is true regardless of the validity of
\lstinline|a| itself).
When \lstinline|n| is 0, \lstinline|a| does
not need to be valid at all, and when \lstinline|n| is strictly
positive, \lstinline|a| must point to an array of size at least
\lstinline|n|. To make it more obvious, the C programmer adopted a
defensive programming style, which returns immediately when \lstinline|n| is
0. We can duplicate this in the specification:
\begin{listing-nonumber}
/*@ requires n == 0 || \valid(a + 0..(n-1)); */
\end{listing-nonumber}
Many memory requirements are only necessary for some paths
through the function, which correspond to some particular
behaviors, selected according to some tests performed along the
corresponding paths. Since C has no memory
primitives, these tests involve other variables that the C programmer
adds to track additional information, such as {\ttfamily n} in our example.
To make it easier, it is possible in \acsl to distinguish between the
\lstinline|assumes| part of a behavior, that specifies the tests that need
to succeed for this behavior to apply, and the \lstinline|requires| part
that specifies the additional assumptions that must be true when a
behavior applies. The specification for our example can then be
translated into:
\begin{listing}{1}
/*@ behavior n_is_null:
@ assumes n == 0;
@ behavior n_is_not_null:
@ assumes n > 0;
@ requires \valid(a + 0..(n-1));
@*/
\end{listing}
This is equivalent to the previous requirement, except here behaviors
can be completed with postconditions that belong to one behavior only.
\subsubsection*{\acsl contracts vs. JML ones}
In JML, the set of stated behaviors is assumed to cover all
permitted uses of the function; any calling context in which none of the requires preconditions are true would be identified as an error.
In \acsl, the set of behaviors for a function do not
necessarily cover all cases of use for this function, as mentioned in
Section~\ref{subsec:behaviors}. This allows for partial
specifications. In the example above, our two behaviors are clearly mutually exclusive,
and, since \lstinline|n| is an \lstinline|unsigned int|,
they cover all the possible cases. We could have specified that as well, by
adding the following lines in the contract (see
Section~\ref{sec:compl-behav}).
\begin{listing}{last}
@ ...
@ disjoint behaviors;
@ complete behaviors;
@*/
\end{listing}
To fully understand the difference between specifications in \acsl and
JML, we detail below the requirements on the pre-state and
the guarantees in the post-state given by behaviors in JML and \acsl.
A JML contract is either \emph{lightweight} or \emph{heavyweight}.
For the purpose of our comparison, it is sufficient to know that a
lightweight contract is syntactic sugar for a single specific
heavyweight contract; a contract can have multiple heavyweight behaviors and these can be nested.
Here is a hypothetical JML contract:
\begin{listing}{1}
/*@ behavior $x_1$:
@ requires $A_1$;
@ requires $R_1$;
@ ensures $E_1$;
@ behavior $x_2$:
@ requires $A_2$;
@ requires $R_2$;
@ ensures $E_2$;
@*/
\end{listing}
It assumes that the pre-state satisfies the condition:
\begin{listing-nonumber}
(($A_1$ && $R_1$) || ($A_2$ && $R_2$))
\end{listing-nonumber}
and guarantees that the following condition holds in post-state:
\begin{listing-nonumber}
(\old($A_1$ && $R_1$) ==> $E_1$) && (\old($A_2$ && $R_2$) ==> $E_2$)
\end{listing-nonumber}
Note particularly that the pre-state is required to satisfy
the precondition of at least one behavior.
Here is now a syntactically similar \acsl specification:
\begin{listing}{1}
/*@ requires $P_1$;
@ requires $P_2$;
@ ensures $Q_1$;
@ ensures $Q_2$;
@ behavior $x_1$:
@ assumes $A_1$;
@ requires $R_1$;
@ ensures $E_1$;
@ behavior $x_2$:
@ assumes $A_2$;
@ requires $R_2$;
@ ensures $E_2$;
@*/
\end{listing}
\noindent
Syntactically, the only difference with the JML specification is the
addition of the \lstinline|assumes| clauses and allowing an
anonymous behavior at the beginning of the contract. Rewriting the anonymous behavior with a name gives
\begin{listing}{1}
/*@
@ behavior $x_0$:
@ assumes \true;
@ requires $P_1$;
@ requires $P_2$;
@ ensures $Q_1$;
@ ensures $Q_2$;
@ behavior $x_1$:
@ assumes $A_1$;
@ requires $R_1$;
@ ensures $E_1$;
@ behavior $x_2$:
@ assumes $A_2$;
@ requires $R_2$;
@ ensures $E_2$;
@*/
\end{listing}
\noindent
Its translation to assume-guarantee is however quite different than JML.
It assumes the pre-state satisfies the condition
\begin{listing-nonumber}
(\true ==> ($P_1$ && $P_2$)) && ($A_1$ ==> $R_1$) && ($A_2$ ==> $R_2$)
\end{listing-nonumber}
Here, it is acceptable that none of the behaviors are active (that is, that none of the \lstinline|assumes| clauses are true, even without the unnamed behavior). In that case there is no post-condition guarantee either.
The contract guarantees that the following condition holds in the post-state:
\begin{listing-nonumber}
(\true ==> ($Q_1$ && $Q_2$)) && (\old($A_1$) ==> $E_1$) && (\old($A_2$) ==> $E_2$)
\end{listing-nonumber}
Thus, \acsl allows distinguishing between the clauses that control
which behavior is active (the \lstinline|assumes| clauses) and the
clauses that are preconditions for a particular behavior (the internal
\lstinline|requires| clauses).
In addition, as mentioned above, there is
by default no requirement in \acsl for the specification to be complete. In JML an incomplete specification may cause a warning in a calling context; partial behavior is specified by an explicitly underspecified postcondition. In \acsl, an incomplete specification specifies partial behavior; a warning for a particular behavior is produced by a \lstinline|requires \false;| clause.
\subsubsection*{Modifies vs. writes semantics}
As described in \S\ref{sec:writesSemantics}, \acsl interprets frame conditions with \emph{modifies} semantics, whereas JML defines frame conditions with \emph{writes} semantics.
\subsection{Deductive verification vs. RAC}
\subsubsection*{Sugar-free behaviors}
As explained in detail in~\cite{raghavan00desugaring}, JML
heavyweight behaviors can be viewed as syntactic sugar that can be translated automatically into more basic
contracts consisting mostly of pre- and postconditions and frame
conditions. This allows complex nesting of behaviors from the user
point of view, while tools only have to deal with basic contracts. In
particular, older tools on JML used this desugaring process, such as
the Common JML tools to do assertion checking, unit testing,
etc. (see~\cite{leavens00jml}) and the tool ESC/Java2 for
automatic deductive verification of JML specifications
(see~\cite{Kiniry-Cok05}).
One issue with such a desugaring approach is the complexity of the
transformations involved, as \emph{e.g.} for desugaring assignable clauses
between multiple \textit{spec-cases} in
JML~\cite{raghavan00desugaring}. Another issue is precisely that
tools only see one global contract, instead of multiple independent
behaviors, that could be analyzed separately in more detail.
Instead, we favor the view that a function implements multiple
behaviors, that can be analyzed separately if a tool feels like
it. Therefore, we do not intend to provide a desugaring process.
Indeed, the current JML tool, OpenJML ~\cite{Cok-2011-OpenJML,Cok-2014-OpenJML}, also does only a partial desugaring, which at minimum is able to give more informative error messages when proof attempts fail.
\subsubsection*{Axiomatized functions in specifications}
JML allows pure Java methods to be called in
specifications~\cite{leavens00preliminary}. This avoids having
to write essentially duplicate logical functions that mimic Java functions. It is also useful when relying on RAC: methods called should be defined
so that the runtime can call them, and they should not have
side-effects in order not to pollute the program they are supposed to
annotate.
JML also permits model (logical) functions to be
used in specifications; if the model function does not have
a body, then RAC cannot be used. But for deductive verification,
the properties of a model function can be specified axiomatically.
\acsl focuses on deductive verification and currently only allows calls to logical functions in
specifications. These functions may be defined, like program functions, but
they may also be only declared (with a suitable declaration of \reads
clause) and their behavior defined through an axiomatization.
This makes for richer specifications that may be useful either in
automatic or in manual deductive verification.
\subsection{Syntactic differences}
The following table summarizes the difference between JML and \acsl keywords, when the intent is the same, although minor differences
might exist.
\begin{center}
\begin{tabular}{|l|l|}
\hline
JML & ACSL \\ \hline
modifiable, assignable & assigns \\
measured\_by & decreases \\
loop\_invariant & loop invariant \\
decreases & loop variant \\
\lstinline|(\forall $\tau$ x ; P ; Q)| &
\lstinline|(\forall $\tau$ x ; P ==> Q)| \\
\lstinline|(\exists $\tau$ x ; P ; Q)| &
\lstinline|(\exists $\tau$ x ; P && Q)| \\
\lstinline|\max $\tau$ x ; a <= x <= b ; f)| &
\lstinline|\max(a,b,\lambda $\tau$ x ; f)| \\
\hline
\end{tabular}
\end{center}
%%% Local Variables:
%%% mode: latex
%%% TeX-PDF-mode: t
%%% TeX-master: "main"
%%% End: