compjml_modern.tex

\section{Comparison with JML}
\label{sec:comp-jml}

\ifCPP{This section was written in conjunction with the definition of \acsl. Most of the points are also accurate for \NAME. However, 
since \lang an object-oriented language with inheritance and exceptions, some
aspects of \NAME, more so than \acsl, are similar (and were inspired by) JML.}

Although \acsl took inspiration from the Java Modeling Language (aka
JML~\cite{jmlhomepage}), \acsl is notably different from JML in
two crucial aspects:

\begin{itemize}
\item \acsl is a BISL for C, a low-level structured language, while JML
  is a BISL for Java, an object-oriented inheritance-based high-level
  language. Not only are the language features not the same between Java and C, but the
  programming styles and idioms are very different, which then entails
  different ways of specifying behaviors. In particular, C has no
  inheritance or exceptions, and no language support for the simplest
  properties on memory (\emph{e.g.}, the size of an allocated memory block).
\item JML also supports runtime assertion checking (RAC) when typing,
  static analysis and automatic deductive verification fail. The
  example of CCured~\cite{necula02ccured,condit03ccured}, which also adds
  strong typing to C by relying on RAC, shows that it is not possible
  to do it in a modular way. Indeed, it is necessary to modify the
  layout of C data structures for RAC, which is not modular. The
  follow-up project Deputy~\cite{condit07deputy} thus reduces the
  checking power of annotations in order to preserve modularity.  In contrast, we choose not to restrain the power of annotations
  (\emph{e.g.}, all first order logic formulas are allowed). To that end, we
  rely on manual deductive verification using an interactive theorem
  prover (\emph{e.g.}, Coq) when every other technique fails.
\end{itemize}

\noindent
In the remainder of this chapter, we describe these differences in
further detail.

\subsection{Low-level language vs. inheritance-based one}

\subsubsection*{No inherited specifications}

JML has a core notion of specification inheritance, which
enables support for behavioral subtyping, by applying
 specifications of parent methods to overriding methods.  Inheritance
combined with visibility and modularity account for a number of
complex features in JML (\emph{e.g.}, \verb|spec_public| modifier, data
groups, represents clauses, etc), that are necessary to express the
desired inheritance-related specifications while respecting visibility
and modularity. Since C has no inheritance, these intricacies are
avoided in \acsl.

\subsubsection*{Error handling without exceptions}
\label{sec:errorhandling}

The usual way of signaling errors in Java is through
exceptions. Therefore, JML specifications are tailored to express
exceptional postconditions, depending on the exception raised. Since C
has no exceptions, \acsl does not use exceptional
specifications. Instead, C programmers typically signal errors by
returning special values, as is mandated in various ways by the C standard.

\begin{example}
In \S 7.12.1 of the standard, it is said that functions in <math.h>
signal errors as follows:
``On a domain error, [...] the integer expression errno
acquires the value EDOM.''
\end{example}

\begin{example}
In \S 7.19.5.1 of the standard, it is said that function fclose signals
errors as follows:
``The fclose function returns [...] EOF if any errors were detected.''
\end{example}

\begin{example}
In \S 7.19.6.1 of the standard, it is said that function fprintf
signals errors as follows:
``The fprintf function returns [...] a negative value if an output or
encoding error occurred.''
\end{example}

\begin{example}
In \S 7.20.3 of the standard, it is said that memory management functions
signal errors as follows:
``If the space cannot be allocated, a null pointer is returned.''
\end{example}

As shown by these few examples, there is no unique way to signal
errors in the C standard library, not to mention errors from user-defined
functions. But since errors are signaled by returning special values, it
is sufficient to write an appropriate postcondition:

\begin{listing-nonumber}
/*@ ensures \result == error_value || normal_postcondition; */
\end{listing-nonumber}

% \noindent
% A tool could easily set error conditions aside, by providing an
% appropriate extension of behaviors, \emph{e.g}, using a new keyword
% \verb|failswith|:

% \begin{flushleft}\ttfamily
% /*@ failswith $\mathit{\result == error\_value}$; \\
% ~~@ ensures $\mathit{normal\_postcondition}$; \\
% ~~@*/
% \end{flushleft}


%\input{fwrite-malloc.pp}


\subsubsection*{C contracts are not Java ones}

In Java, the precondition of the following function that nullifies an
array of characters is always true. Even if there was a precondition
on the length of array {\ttfamily a}, it could easily be expressed using
the Java expression {\ttfamily a.length} that gives the dynamic length
of array {\ttfamily a}.

\begin{listing}{1}
public static void Java_nullify(char[] a) {
  if (a == null) return;
  for (int i = 0; i < a.length; ++i) {
    a[i] = 0;
  }
}
\end{listing}

On the other hand, the precondition of the same function in C, whose
definition follows, is more involved. First, note that the
C programmer has to add an extra argument for the size of the array,
or rather a lower bound on this array size.

\begin{listing}{1}
void C_nullify(char* a, unsigned int n) {
  int i;
  if (n == 0) return;
  for (i = 0; i < n; ++i) {
    a[i] = 0;
  }
}
\end{listing}

\noindent
A correct precondition for this function is the following:

\begin{listing-nonumber}
/*@ requires \valid(a + 0..(n-1)); */
\end{listing-nonumber}

where predicate \valid is the one defined in Section~\ref{subsec:memory}.
(note that \lstinline|\valid(a + 0..(-1))| is the same as
\lstinline|\valid(\empty)| and thus is true regardless of the validity of
\lstinline|a| itself).
When \lstinline|n| is 0, \lstinline|a| does
not need to be valid at all, and when \lstinline|n| is strictly
positive, \lstinline|a| must point to an array of size at least
\lstinline|n|. To make it more obvious, the C programmer adopted a
defensive programming style, which returns immediately when \lstinline|n| is
0. We can duplicate this in the specification:

\begin{listing-nonumber}
/*@ requires n == 0 || \valid(a + 0..(n-1)); */
\end{listing-nonumber}

Many memory requirements are only necessary for some paths
through the function, which correspond to some particular
behaviors, selected according to some tests performed along the
corresponding paths. Since C has no memory
primitives, these tests involve other variables that the C programmer
adds to track additional information, such as {\ttfamily n} in our example.

To make it easier, it is possible in \acsl to distinguish between the
\lstinline|assumes| part of a behavior, that specifies the tests that need
to succeed for this behavior to apply, and the \lstinline|requires| part
that specifies the additional assumptions that must be true when a
behavior applies. The specification for our example can then be
translated into:

\begin{listing}{1}
/*@ behavior n_is_null:
  @   assumes n == 0;
  @ behavior n_is_not_null:
  @   assumes n > 0;
  @   requires \valid(a + 0..(n-1));
  @*/
\end{listing}

This is equivalent to the previous requirement, except here behaviors
can be completed with postconditions that belong to one behavior only.

\subsubsection*{\acsl contracts vs. JML ones}

In JML, the set of stated behaviors is assumed to cover all
permitted uses of the function; any calling context in which none of the requires preconditions are true would be identified as an error.
In \acsl, the set of behaviors for a function do not
necessarily cover all cases of use for this function, as mentioned in
Section~\ref{subsec:behaviors}. This allows for partial
specifications. In the example above, our two behaviors are clearly mutually exclusive,
and, since \lstinline|n| is an \lstinline|unsigned int|, 
they cover all the possible cases. We could have specified that as well, by
adding the following lines in the contract (see
Section~\ref{sec:compl-behav}).
\begin{listing}{last}
  @ ...
  @ disjoint behaviors;
  @ complete behaviors;
  @*/
\end{listing}

To fully understand the difference between specifications in \acsl and
JML, we detail below the requirements on the pre-state and
the guarantees in the post-state given by behaviors in JML and \acsl.

A JML contract is either \emph{lightweight} or \emph{heavyweight}.
For the purpose of our comparison, it is sufficient to know that a
lightweight contract is syntactic sugar for a single specific 
heavyweight contract; a contract can have multiple heavyweight behaviors and these can be nested.
Here is a hypothetical JML contract:
\begin{listing}{1}
/*@ behavior $x_1$:
  @   requires $A_1$;
  @   requires $R_1$;
  @   ensures $E_1$;
  @ behavior $x_2$:
  @   requires $A_2$;
  @   requires $R_2$;
  @   ensures $E_2$;
  @*/
\end{listing}
It assumes that the pre-state satisfies the condition:
\begin{listing-nonumber}
(($A_1$ && $R_1$) || ($A_2$ && $R_2$))
\end{listing-nonumber}
and guarantees that the following condition holds in post-state:
\begin{listing-nonumber}
  (\old($A_1$ && $R_1$) ==> $E_1$) && (\old($A_2$ && $R_2$) ==> $E_2$)
\end{listing-nonumber}
Note particularly that the pre-state is required to satisfy
the precondition of at least one behavior.

Here is now a syntactically similar \acsl specification:

\begin{listing}{1}
/*@ requires $P_1$;
  @ requires $P_2$;
  @ ensures  $Q_1$;
  @ ensures  $Q_2$;
  @ behavior $x_1$:
  @   assumes $A_1$;
  @   requires $R_1$;
  @   ensures $E_1$;
  @ behavior $x_2$:
  @   assumes $A_2$;
  @   requires $R_2$;
  @   ensures $E_2$;
  @*/
\end{listing}

\noindent
Syntactically, the only difference with the JML specification is the
addition of the \lstinline|assumes| clauses and allowing an 
anonymous behavior at the beginning of the contract. Rewriting the anonymous behavior with a name gives

\begin{listing}{1}
	/*@ 
	@ behavior $x_0$:
	@   assumes \true;
	@   requires $P_1$;
	@   requires $P_2$;
	@   ensures  $Q_1$;
	@   ensures  $Q_2$;
	@ behavior $x_1$:
	@   assumes $A_1$;
	@   requires $R_1$;
	@   ensures $E_1$;
	@ behavior $x_2$:
	@   assumes $A_2$;
	@   requires $R_2$;
	@   ensures $E_2$;
	@*/
\end{listing}

\noindent
Its translation to assume-guarantee is however quite different than JML.
It assumes the pre-state satisfies the condition

\begin{listing-nonumber}
  (\true ==> ($P_1$ && $P_2$)) && ($A_1$ ==> $R_1$) && ($A_2$ ==> $R_2$)
\end{listing-nonumber}
Here, it is acceptable that none of the behaviors are active (that is, that none of the \lstinline|assumes| clauses are true, even without the unnamed behavior). In that case there is no post-condition guarantee either.

The contract guarantees that the following condition holds in the post-state:

\begin{listing-nonumber}
(\true ==> ($Q_1$ && $Q_2$)) && (\old($A_1$) ==> $E_1$) && (\old($A_2$) ==> $E_2$)
\end{listing-nonumber}

Thus, \acsl allows distinguishing between the clauses that control
which behavior is active (the \lstinline|assumes| clauses) and the
clauses that are preconditions for a particular behavior (the internal
\lstinline|requires| clauses). 

In addition, as mentioned above, there is
by default no requirement in \acsl for the specification to be complete. In JML an incomplete specification may cause a warning in a calling context; partial behavior is specified by an explicitly underspecified postcondition. In \acsl, an incomplete specification specifies partial behavior; a warning for a particular behavior is produced by a \lstinline|requires \false;| clause.

\subsubsection*{Modifies vs. writes semantics}


As described in \S\ref{sec:writesSemantics}, \acsl interprets frame conditions with \emph{modifies} semantics, whereas JML defines frame conditions with \emph{writes} semantics.

\subsection{Deductive verification vs. RAC}

\subsubsection*{Sugar-free behaviors}

As explained in detail in~\cite{raghavan00desugaring}, JML
heavyweight behaviors can be viewed as syntactic sugar that can be translated automatically into more basic
contracts consisting mostly of pre- and postconditions and frame
conditions.  This allows complex nesting of behaviors from the user
point of view, while tools only have to deal with basic contracts. In
particular, older tools on JML used this desugaring process, such as
the Common JML tools to do assertion checking, unit testing,
etc. (see~\cite{leavens00jml}) and the tool ESC/Java2 for
automatic deductive verification of JML specifications
(see~\cite{Kiniry-Cok05}).

One issue with such a desugaring approach is the complexity of the
transformations involved, as \emph{e.g.} for desugaring assignable clauses
between multiple \textit{spec-cases} in
JML~\cite{raghavan00desugaring}.  Another issue is precisely that
tools only see one global contract, instead of multiple independent
behaviors, that could be analyzed separately in more detail.
Instead, we favor the view that a function implements multiple
behaviors, that can be analyzed separately if a tool feels like
it. Therefore, we do not intend to provide a desugaring process.
Indeed, the current JML tool, OpenJML ~\cite{Cok-2011-OpenJML,Cok-2014-OpenJML}, also does only a partial desugaring, which at minimum is able to give more informative error messages when proof attempts fail.

\subsubsection*{Axiomatized functions in specifications}

JML allows pure Java methods to be called in
specifications~\cite{leavens00preliminary}. This avoids having
to write essentially duplicate logical functions that mimic Java functions. It is also useful when relying on RAC: methods called should be defined
so that the runtime can call them, and they should not have
side-effects in order not to pollute the program they are supposed to
annotate. 
JML also permits model (logical) functions to be
used in specifications; if the model function does not have
a body, then RAC cannot be used. But for deductive verification, 
the properties of a model function can be specified axiomatically.

\acsl focuses on deductive verification and currently only allows calls to logical functions in
specifications. These functions may be defined, like program functions, but
they may also be only declared (with a suitable declaration of \reads
clause) and their behavior defined through an axiomatization.
This makes for richer specifications that may be useful either in
automatic or in manual deductive verification.


\subsection{Syntactic differences}

The following table summarizes the difference between JML and \acsl keywords, when the intent is the same, although minor differences
might exist.
\begin{center}
\begin{tabular}{|l|l|}
\hline
  JML                  & ACSL \\ \hline
  modifiable, assignable           & assigns \\
  measured\_by         & decreases \\
  loop\_invariant      & loop invariant \\
  decreases            & loop variant \\
  \lstinline|(\forall $\tau$ x ; P ; Q)| &
       \lstinline|(\forall $\tau$ x ; P ==> Q)| \\
  \lstinline|(\exists $\tau$ x ; P ; Q)| &
        \lstinline|(\exists $\tau$ x ; P && Q)| \\
  \lstinline|\max $\tau$ x ; a <= x <= b ; f)| &
        \lstinline|\max(a,b,\lambda $\tau$ x ; f)| \\
  \hline
\end{tabular}
\end{center}

%%% Local Variables:
%%% mode: latex
%%% TeX-PDF-mode: t
%%% TeX-master: "main"
%%% End: