OCaml for the Skeptical

The whole point of programming in a functional language is to define functions. We've
already seen OCaml's `fun` lambda expression, and we
have seen how to name values, and since a function is a
value like any other, that means we already know how to define functions:

# let inc = fun n -> n + 1;; val inc : int -> int = <fun> #

We've now named the above lambda that adds `1` to some `n`, `inc`. Fear not:
OCaml has substantial syntactic sugar for function definitions that you will like better
than this. But first, let's consider the important topic of how to *call* or
*apply* functions.

Probably the most popular notation for applying a function to its argument is
`inc(2)`, the value of which would be `3`. And this works in OCaml:

# inc(2);; - : int = 3 #

But I would be deceiving you if I left you with the impression that this is the idiomatic syntax for function applications in OCaml! Since OCaml is a functional language, and applying functions is the most important thing you do as an OCaml programmer, the syntax of function application is designed to be as simple as it can possibly be: it is simple prefix juxtaposition. No parens are needed (other than for the usual purpose of disambiguation and changing precedence). No whitespace is needed (other than for disambiguation)! So the natural, idiomatic way to write the above function application would be:

# inc 2;; - : int = 3 #

Note that in this case we need whitespace between the function name and the argument
simply because `inc2` would be ambiguous (is it the name `inc2`, or the function
named `inc` applied to `2`)? `inc(2)` or `inc (2)` works simply because
`2` is an expression and you can always pointlessly wrap parens around an expression
without changing its value. Don't put needless parens around your function arguments
unless you want to look like a newbie!

You may have noticed I seem to have implied that OCaml only has unary functions
i.e. functions of one argument. Well, it's true! OCaml really *does* only have
functions of one argument, simply because that's all that's necessary in any language!
You may claim you've seen me adding two numbers together with `+`, so clearly `+`
is a function of more than one argument. Well, it's not: that's mere syntactic sugar.
The way to understand this notion of "only unary functions" is to consider *partial
application*.

Pretend that OCaml *does* have functions of more than one argument and pretend that
`+` is one of them. First note that (most) any infix operator in OCaml can also be
used as a prefix operator by surrounding the operator with parens and putting it in a
prefix position, like so: `2 + 3 = (+) 2 3`. (Remember, function application is
indicated by juxtaposition, so we just put spaces between the function name (`(+)`)
and its arguments, and between the arguments too.) This is legal OCaml, which would
seem to *prove* that OCaml has functions of more than one argument. But wait.

Now imagine what it should mean if you were allowed to call `(+)` with fewer than its
supposed two arguments. Can we assign any useful meaning to `(+) 2`? Well, Lisp
assigns the meaning `2` to `(+ 2)` (and `0`, the identity, to `(+)`) but
OCaml has a better idea.

`(+) 2` is interpreted as a function that's waiting for one more argument before it can
be applied. So if you call a function of "two arguments" with only one argument, you
get a function with 2-1 = 1 argument. You can assign this function a name as in `let
plustwo = (+) 2` and then use that name as a function of one argument, as in `plustwo
3 = 5`. Of course you can also use that new function of one argument without naming it:
`((+) 2) 3 = 5` or `List.map ((+) 2) [1;2;3] = [3; 4; 5]`, which is very
convenient.

This is in fact how OCaml functions work: every function of *n* > 1 arguments is
actually a function of one argument that returns a function of *n*-1 arguments as its
result. OCaml's function-application syntax, combined with the left-associativity of
function application, makes this transparent. `(+) 2` is `fun a -> (+) 2 a`,
the function that adds 2 to some number. We can abstract out the `2` to get the
function that adds some number to some other number: `fun b -> fun a -> (+) b
a` and that's `(+)`. We can use this technique to define functions of any number of
arguments: `let f = fun a -> fun b -> fun c -> a + b * c`. So what's
`f 6`? The `6` is the "first" argument of `f`, so the definition above tells
us that `f 6` is `fun b -> fun c -> a + b * c` *with the a replaced
with 6* or

OCaml doesn't work this way just to be perverse; it does it because partial application
is an extremely useful programming technique, once you get used to it. Also, having
only functions of one argument simplifies the type of functions. (In fact most modern
functional languages — including Standard ML, Haskell, and Clean — work this
way; this way of looking at functions is called *Currying* after the logician Haskell
B. Curry.)

Except in specialized circumstances, nobody wants to define a function of three
arguments with three nested lambdas: `let f = fun a -> fun b -> fun c -> a +
b * c`. `fun` actually has a sugared form to make this less verbose: `fun a b
c -> a + b * c`. `let` supports a similar sugaring: `let f a b c = a + b * c`.
So these are all the same function:

# let f1 = fun a -> fun b -> fun c -> a + b * c;; val f1 : int -> int -> int -> int = <fun> # let f2 = fun a b c -> a + b * c;; val f2 : int -> int -> int -> int = <fun> # let f3 a b c = a + b * c;; val f3 : int -> int -> int -> int = <fun> #

This finally explains the type expressions that OCaml uses for functions. Remember that
the type of a function is determined by its *domain* (the type of its argument) and
its *range* (the type of its result). (Having only unary functions simplifies the
notion of function types because we don't need to take the number of arguments into
account.) The syntax for a function type is ` domain -> range`. So
for example the type of the Boolean negation function,

What's the type of `(+)`? As we saw above, `(+)` takes a parameter of type
`int` and returns a function: a function that takes another `int` and returns an
`int`. The type of the latter function — `(+)`'s result — is `int
-> int`, so the type of `(+)` itself is clearly `int -> (int -> int)`.
The type operator `->` is right-associative, so there's no need to parenthesize:
the type of `(+)` is `int -> int -> int`.

Note that while the type operator `->` is *right*-associative, function
application (juxtaposition) is *left*-associative! This minimizes parentheses in both
cases, and there's no confusion because the realm of values and the realm of types are
quite distinct.

Now that we know how and why all functions are unary, I will use the phrase "a function
of *n* arguments" to mean "a function that must be applied *n* times to be fully
applied i.e. to return a non-function value".

Whenever you see a function type, you can know how many arguments it takes by counting
the number of arrows. So a function with this type `int -> int -> int` takes
two arguments (i.e. it's a binary function). The rightmost type is the range of the
function and all the other types describe the domain. So this function takes two
`int`'s and returns an `int`.

There's one more way to define functions with more than one argument: gathering the multiple arguments into a single compound data structure. You could use any compound data structure — a list, an array, a record — but the most natural one is a tuple:

# let f4 (a,b,c) = a + b * c;; val f4 : int * int * int -> int = <fun> # f4 (2,3,4);; - : int = 14 # [f1 2 3 4; f2 2 3 4; f3 2 3 4; f4 (2,3,4)];; - : int list = [14; 14; 14; 14] #

Note that this one isn't really the same function as the first three — its type is
different! You can see that this function is (of course) unary by counting the arrows:
it takes one argument, a tuple of type `int * int * int`. Also, the role of
`(a,b,c)` in the argument position of the definition is actually an OCaml pattern match, which I won't explain here.

This may make your OCaml code look more like a "normal" language, but I recommend you don't do this in general. You can't partially apply a function defined this way, and it's not idiomatic.

`let` has one more trick up it's sleeve. Since definitions must occur before use,
you might wonder how to define mutually recursive functions. Here's a nice definition
of a pair of functions `take`, which returns every other element of its list argument
starting with the first, and `skip`, which also returns every other element but
starts with the second (skipping the first). They can be conveniently defined in terms
of each other:

# let take list = if list = [] then [] else List.hd list ::skip(List.tl list) ;; Unbound value skip # let skip list = if list = [] then [] elsetake(List.tl list) ;; Unbound value take #

Since they're mutually recursive, neither one can be defined first! Some languages,
like Pascal and C, solve this with the notion of a *forward declaration*, but OCaml
uses *simultaneous definitions* via `let rec ... and ...`. You can chain together
any number of definitions with `and`, and all the names defined in the `let` are
simultaneously defined in all the bodies:

# let rec take list = if list = [] then [] else List.hd list :: skip (List.tl list) and skip list = if list = [] then [] else take (List.tl list) ;; val take : 'a list -> 'a list = <fun> val skip : 'a list -> 'a list = <fun> # take [1;2;3;4;5;6;7];; - : int list = [1; 3; 5; 7] # skip [1;2;3;4;5;6;7];; - : int list = [2; 4; 6] #

(I got this pair of functions from Jeffrey D. Ullman: *Elements of ML Programming*,
Upper Saddle River, NJ: Prentice Hall, 1998 — it's a book on Standard ML, but the
translation to OCaml was trivial.)

*Coming soon*.