Programming means to program productively.

Let me first quote HARLAN MILLS on productivity:

*Mankind, under the grace of God, hungers for spiritual peace,
esthetic achievements, family security, justice, and liberty,
none directly satisfied by industrial productivity. But productivity
allows the sharing of the plentiful rather than fighting over scarcity;
it provides time for spiritual, esthetic, and family matters. It allows
society to delegate special skills to institutions of religion, justice,
and the preservation of liberty.*

I am passionate about programming. My primary research interest lies in programming language design and implementation. In particular, I have long been studying type-based approaches and applying them to static debugging, that is, detecting potential programming errors before run-time. Recently, I have also been studying template-based programing (TBP), which advocates using function templates in place of functions to achieve a great deal more degree of code sharing.

The initial idea of the Applied Type System framework (for formulating type theory on practical programming) came to me around the end of year 2002. I have since been continually working on the design and implmentation of ATS, a programming language of a functional core that is equipped with support for advanced types such as linear types and dependent types. As of now, I am actively working on ATS/Xanadu, which is an implementation for ATS3, the third edition of ATS. My expectation is for ATS to turn from a programming language for research into one that is of production strength.

How many times have you heard someone saying “I know the answer but I can’t explain it”? A lot, I bet. I can easily tell the image of a cat from that of a dog, but I cannot readily explain how I do it. In programming, it is often easier for someone to write code than to explain what the code actually does. This is very worrisome, especially, if we want to build high-quality software systems.

In order to be precise, we need to be able to state clearly what we want. Let me use the following elementary problem as an example.

Alice and Bob ate 10 apples together. Alice ate 3 apples. What is the number of apples eaten by Bob?

My daughter as a first grader had no difficulty telling me immediately that the answer is 7. I, however, wanted her to give me a more formal specification of the problem. We essentially came up with the following one at the end:

Let A and B stand for the numbers of apples eaten by Alice and Bob, respectively. It is given that A+B=10 and A=3. What is the value of B?

With this specification, we could readily and formally derive that (A+B)-A = 10-3. Addition being both commutative and associative, we derived (A+B)-A = B+(A-A) = B+0 = B. Therefore, B = 7. So we not only found that 7 is an answer but also proved that 7 is the unique answer.

While I might have been a bit overly pedantic about the above
elementary problem, I do hope that you can see my point of being
precise here. I strive to be precise in programming and have a passion
for *effectively* enforcing precision in realistic software
development.

Programming languages are tools and programming language design and implementation should focus on increasing a programmer’s productivity. Ideally, a programming language should be simple and general, and it should permit extensive error checking, facilitate proofs of program properties and possess a correct and efficient implementation. Invariably, there will be conflicts among these goals, which must be resolved with trade-offs being carefully tailored. In order to make significant progress, I firmly believe the necessity to adopt approaches that can scale well.

Types have been invented for many purposes in the course of programming language design and implementation. For instance, types in C are primarily used for determining data layouts at compile-time. In order for types to be used for verification, the underlying type system should be sound (or at least assumed to be sound). For a property on a program to be verified, the property is encoded into the type assigned to the program; and verifying the property is then reduced to verifying the well-typedness of the program, which is often referred to as type-checking.

ATS advocates a programming paradigm in which programs and proofs can be constructed in a syntactically intertwined manner. This paradigm is often referred to as programming with theorem-proving (PwTP), and it plays a central indispensible role in the development of ATS. Let us now see a simple and concrete example that clearly illustrates PwTP as is being supported in ATS.

A function `fib`

can be specified as follows for computing Fibonacci numbers:

`fib(0) = 0`

`fib(1) = 1`

`fib(n+2) = fib(n) + fib(n+1) for n >= 0`

Following is a direct implementation of this specified function in ATS:

```
fun
fib (
n: int
) : int =
if n >= 2 then fib(n-2) + fib(n-1) else n
// end of [fib]
```

Clearly, this is a terribly inefficient implementation of exponential
time-complexity. An implementation of `fib`

in C is given as follows
that is of linear time-complexity:

```
int
fibc(int n)
{
int tmp, f0 = 0, f1 = 1 ;
while (n-- > 0) { tmp = f1 ; f1 = f0 + f1 ; f0 = tmp ; } ; return f0 ;
} // end of [fibc]
```

If translated into ATS, the function `fibc`

can essentially be implemented as follows:

```
fun
fibc (
n: int
) : int = let
//
fun
loop(n: int, f0: int, f1: int): int =
if n > 0 then loop(n-1, f1, f0+f1) else f0
// end of [loop]
//
in
loop(n, 0, 1)
end // end of [fibc]
```

There is obviously a logic gap between the defintion of `fib`

and its
implementation as is embodied in `fibc`

. In ATS, an implementation of
`fib`

can be given that completely bridges this gap. First, the
specification of `fib`

needs to be encoded into ATS, which is fulfilled
by the declaration of the following dataprop:

```
dataprop
FIB(int, int) =
| FIB0(0, 0) of ()
| FIB1(1, 1) of ()
| {n:nat}{r0,r1:int}
FIB2( n+2, r0+r1 ) of (FIB(n, r0), FIB(n+1, r1))
```

This declaration introduces a type `FIB`

for proofs, and such a type is
referred to as a prop in ATS. Intuitively, if a proof can be assgined
the type `FIB(n,r)`

for some integers `n`

and `r`

, then `fib(n)`

equals `r`

.
In other words, `FIB(n,r)`

encodes the relation `fib(n)=r`

. There are three
constructors `FIB0`

, `FIB1`

and `FIB2`

associated with `FIB`

, which are given
the following types corresponding to the three equations in the definition of
`fib`

:

`FIB0 : () -> FIB(0, 0)`

`FIB1 : () -> FIB(1, 1)`

`FIB2 : {n:nat}{r0,r1:int} (FIB(n, r0), FIB(n+1, r1)) -> FIB(n+2, r0+r1)`

Note that {…} is the concrete syntax in ATS for universal
quantification. For instance, `FIB2(FIB0(), FIB1())`

is a term of the
type `FIB(2,1)`

, attesting to `fib(2)=1`

.

A fully verified implementaion of the `fib`

function in ATS can now be
given as follows:

```
fun
fibats
{n:nat}
( n
: int(n))
: [r:int] (FIB (n, r) | int r) = let
//
fun
loop
{i:nat|i <= n}{r0,r1:int}
( pf0: FIB(i, r0)
, pf1: FIB(i+1, r1)
| n_i: int(n-i)
, r0: int r0, r1: int r1
) : [r:int] (FIB(n, r) | int(r)) =
(
if
(n_i > 0)
then
loop{i+1}
(
pf1, FIB2(pf0, pf1) | n_i-1, r1, r0+r1
) (* then *)
else (pf0 | r0)
) (* end of [loop] *)
in
loop{0}(FIB0(*void*), FIB1(*void*) | n, 0, 1)
end // end of [fibats]
```

Note that `fibats`

is given the following declaration:

```
fun fibats : {n:nat} int(n) -> [r:int] (FIB(n,r) | int(r))
```

where the concrete syntax […] is for existential quantification and
the bar symbol (|) is just a separator (like a comma) for separating
proofs from values. For each integer `I`

, `int(I)`

is a singleton type
for the only integer whose value equals `I`

. When `fibats`

is applied to
an integer of value `n`

, it returns a pair consisting of a proof and an
integer value `r`

such that the proof, which is of the type `FIB(n,r)`

,
asserts `fib(n)=r`

. Therefore, `fibats`

is a verified implementation of
`fib`

as is encoded by `FIB`

. Note that the inner function `loop`

directly
corresponds to the while-loop in the body of the function `fibc`

(written in C).

Lastly, it should be emphasized that proofs are completely erased after they pass typechecking. In particular, there is no proof construction at run-time.

Traditionally, types are touted for helping debug programs. As
debugging is an essential part of programming, such help from types
can undoubtedly result in increased programming productivity. This
kind of *indirect* support of types for increased productivity is
already well-known. What is much less well-known is that types can
also *directly* result in increased programming productivity by
facilitating code reuse at compile-time. For instance, the feature of
type classes in Haskell makes direct use of types in choosing type
class instances. Moreover, types play an even more conspicuous role in
template-based programming, a defining feature of ATS that is partly
inspired by type classes.

The notion of *late-binding* is prevalent in programming language
design and implementation. For instance, linking the name of a
function to the actual code implementing the function can be seen as a
form of late-binding (at link-time). More conspicuously, method
dispatching in object-oriented programming (OOP) is also a well-known
form of late-binding (at run-time). As a form of late-binding at
compile-time, one can think of overloading supported by type classes
in Haskell.

Template-Based Programming (TBP) advocates the use of (function)
templates in place of functions, which can be regarded as a form of
late-binding at compile-time. Intuitively, one may think of templates
as functions containing placeholders inside their bodies that can be
replaced with code obtained *contextually* at compile-time. It is
important to emphasize that the mentioned code replacement is
inherently of a recursive nature as code obtained contextually may
itself contain templates.

There are plenty of motivating examples for templates. For the moment, let us take a look at equality types in Standard ML (SML), which can be said to have directly motivated the addition of type classes into Haskell.

Various aspects of templates have already appeared in languages such as LISP (macros), Haskell (type classes), Scala (implicits), etc. One may also see great similarity between resolving a template call (in TBP) at compile-time and dispatching a method call (in OOP) at run-time.

Basically, an equality type in SML is one that supports polymorphic
equality (=), which is given the type `''a * ''a -> bool`

(instead of
`'a * 'a -> bool`

). In other words, a type variable `''a`

can only be
instantiated with an equality type (where a special function is
designated for testing equality on elements of this type).

There is *no* polymorphic equality in ATS3. In order to test whether two
given lists (of the same generic type) are equal, one could try to
implement a function template of the following interface:

```
fun
<a:type>
list_fequal
( xs: list(a)
, ys: list(a)
, eq: (a, a) -> bool): bool
```

This is a workable solution based on the notion of higher-order
function but it suffers from the requirement that each caller of
`list_fequal`

must pass to it *explicitly* a function argument (for
testing equality on elements in the two given lists).

Let us see a template-based solution to implementing equality test on
lists. In ATS3, the name `g_equal`

refers to a template of the
following interface:

```
fun
<a:type>
g_equal(a, a): bool
```

A function `list_equal`

can be defined as follows to test whether two
given lists are equal by calling `g_equal`

to test equality on elements:

```
#extern
fun
<a:type>
list_equal
( xs: list(a)
, ys: list(a)): bool
(* ****** ****** *)
impltmp
<a>(*tmp*)
list_equal
(xs, ys) =
( loop(xs, ys) ) where
{
//
fun
loop
( xs: list(a)
, ys: list(a) ): bool =
(
case+ xs of
| list_nil() =>
(
case+ ys of
| list_nil() => true
| list_cons _ => false
)
| list_cons(x0, xs) =>
(
case+ ys of
| list_nil() => true
| list_cons(y0, ys) =>
let
val ans =
g_equal<a>(x0, y0)
in
if ans then loop(xs, ys) else false
end
)
) (* end of [loop] *)
//
} (* end of [list_equal] *)
```

Note that `g_equal`

is already given a standard implementation on
basic types like int, bool, char, string, etc. If `list_equal`

is
called on two lists of the type `list(int)`

, the compiler can
automatically find based on contextual information an implementation
of `g_equal<int>`

needed for compiling `list_equal<int>`

. Moreover,
one can direct the compiler to locate another implementation of
`g_equal<int>`

if needed. It is in this sense of directing the
compiler to establish a link between a call to `g_equal<int>`

and an implementation of `g_equal<int>`

that TBP can be regarded
as a form of late-binding at compile-time.

Templates in ATS3 are embeddable for they can be implemented in the
body of other templates. As an example, the previous higher-order
template `list_fequal`

can be given an implementation based on
`list_equal`

as follows:

```
impltmp
<a>(*tmp*)
list_fequal
(xs, ys, eq) =
(
list_equal<a>(xs, ys)
) where
{
impltmp g_equal<a>(x, y) = eq(x, y)
}
```

where an implementation of `g_equal<a>`

is given in the body of the
implementation of `list_fequal<a>`

. It will become clear soon that
the embeddability of templates is a feature that can be made of use
to greatly simplify the way in which a program is structured.

With the above simple example, I have demonstrated a bit of TBP where templates are employed to replace higher-order functions. While this is a typical entry point for TBP, there are many more aspects of TBP that are yet to be explored.