Tuesday, July 23, 2013

What Declarative Languages Are

On his blog, Bob Harper asks what, if anything, a declarative language is. He notes that "declarative" is often used to mean "logic or functional programming", and is (justly) skeptical that this is a useful pairing.

However, there's actually a surprisingly simple and useful definition of declarative language: a declarative language is any language with a semantics has some nontrivial existential quantifiers in it. To illustrate this definition, let's begin by looking at some examples of declarative languages:

  • Regular expressions
  • Context-free grammars
  • Database query languages based on relational algebra (eg, SQL)
  • Logic programming languages (eg, Prolog)
  • Constraint-based languages for layout (eg, CSS)

The common factor is that all of these languages have a semantics which relies on some existential quantifiers which it is not immediately obvious how to discharge. For example, the semantics of regular expressions can be explained in terms of string membership, by giving a judgement $w \in r$ meaning that the string $w$ is an inhabitant of the regular expression $r$. $$ \begin{array}{c} \frac{\displaystyle } {\displaystyle \cdot \in \epsilon} \\[1em] \frac{\displaystyle } {\displaystyle c \in c} \\[1em] \frac{\displaystyle w \in r_i \qquad i \in \{1,2\} } {\displaystyle w \in r_1 \vee r_2} \\[1em] \mbox{(no rule for $w \in \bot$)} \\[1em] \frac{\displaystyle \exists w_1, w_2.\; w = w_1 \cdot w_2 \qquad w_1 \in r_1 \qquad w_2 \in r_2 } {\displaystyle w \in r_1 \cdot r_2} \\[1em] \frac{\displaystyle } {\displaystyle \cdot \in r*} \\[1em] \frac{\displaystyle \exists w_1, w_2.\; w = w_1 \cdot w_2 \qquad w_1 \in r \qquad w_2 \in r* } {\displaystyle w \in r*} \end{array} $$

In particular, note the appearance of an existential quantifier in the premises of the sequential composition and Kleene star cases, and note the nondeterministic choice of a branch in the alternation case. So read as a logic program, this semantics is not well-moded.

Of course, to implement a regular expression matcher, you need an operational semantics which is well-moded and functional, which we get by constructing a finite state machine from the regular expression. But now we have a proof obligation to show that the operational semantics agrees with the declarative semantics.

We can make a similar distinction for each of the examples above. Context-free languages also have a declarative membership relation, but are recognized with parsing algorithms such as CYK and Earley parsing. Query languages have a declarative semantics in terms of the relational algebra, where relational composition is hiding an existential, but are implemented in terms of nested loops and indices. Logic programming has a declarative semantics in terms of the model theory of first-order logic, and an implementation in terms of backtracking and unification. Constraint languages have a declarative semantics in terms of simultaneous satisfaction of a collection of equations, with the existentials lurking in the values assigned to the free variables, but are implemented in terms of simplification and propagation through imperative graphs.

This also lets us make the prediction that the least-loved features of any declarative language will be the ones that expose the operational model, and break the declarative semantics. So we can predict that people will dislike (a) backreferences in regular expressions, (b) ordered choice in grammars, (c) row IDs in query languages, (d) cut in Prolog, (e) constraint priorities in constraint languages. And lo and behold, these are indeed the features which programmers are encouraged to avoid as much as possible!

This definition also lets say that functional programming is not a declarative language -- the reduction relation of the lambda calculus is well-moded, in that we do not need to guess any part of a term to figure out how to reduce it. (And if you fix an evaluation order, the reduction relation is even deterministic.) The same is true for imperative languages like ML or Haskell: now we're just additionally threading a store through the reduction relation.