OCaml for the Skeptical: Defining and Applying Functions

The whole point of programming in a functional language is to define functions. We've already seen OCaml's fun lambda expression, and we have seen how to name values, and since a function is a value like any other, that means we already know how to define functions:

    # let inc = fun n -> n + 1;;
    val inc : int -> int = <fun>
    #

We've now named the above lambda that adds 1 to some n, inc. Fear not: OCaml has substantial syntactic sugar for function definitions that you will like better than this. But first, let's consider the important topic of how to call or apply functions.

Function Application

Probably the most popular notation for applying a function to its argument is inc(2), the value of which would be 3. And this works in OCaml:

    # inc(2);;        
    - : int = 3
    #

But I would be deceiving you if I left you with the impression that this is the idiomatic syntax for function applications in OCaml! Since OCaml is a functional language, and applying functions is the most important thing you do as an OCaml programmer, the syntax of function application is designed to be as simple as it can possibly be: it is simple prefix juxtaposition. No parens are needed (other than for the usual purpose of disambiguation and changing precedence). No whitespace is needed (other than for disambiguation)! So the natural, idiomatic way to write the above function application would be:

    # inc 2;;
    - : int = 3
    #

Note that in this case we need whitespace between the function name and the argument simply because inc2 would be ambiguous (is it the name inc2, or the function named inc applied to 2)? inc(2) or inc (2) works simply because 2 is an expression and you can always pointlessly wrap parens around an expression without changing its value. Don't put needless parens around your function arguments unless you want to look like a newbie!

Only Unary Functions?

You may have noticed I seem to have implied that OCaml only has unary functions i.e. functions of one argument. Well, it's true! OCaml really does only have functions of one argument, simply because that's all that's necessary in any language! You may claim you've seen me adding two numbers together with +, so clearly + is a function of more than one argument. Well, it's not: that's mere syntactic sugar. The way to understand this notion of "only unary functions" is to consider partial application.

Partial Application

Pretend that OCaml does have functions of more than one argument and pretend that + is one of them. First note that (most) any infix operator in OCaml can also be used as a prefix operator by surrounding the operator with parens and putting it in a prefix position, like so: 2 + 3 = (+) 2 3. (Remember, function application is indicated by juxtaposition, so we just put spaces between the function name ((+)) and its arguments, and between the arguments too.) This is legal OCaml, which would seem to prove that OCaml has functions of more than one argument. But wait.

Now imagine what it should mean if you were allowed to call (+) with fewer than its supposed two arguments. Can we assign any useful meaning to (+) 2? Well, Lisp assigns the meaning 2 to (+ 2) (and 0, the identity, to (+)) but OCaml has a better idea.

(+) 2 is interpreted as a function that's waiting for one more argument before it can be applied. So if you call a function of "two arguments" with only one argument, you get a function with 2-1 = 1 argument. You can assign this function a name as in let plustwo = (+) 2 and then use that name as a function of one argument, as in plustwo 3 = 5. Of course you can also use that new function of one argument without naming it: ((+) 2) 3 = 5 or List.map ((+) 2) [1;2;3] = [3; 4; 5], which is very convenient.

This is in fact how OCaml functions work: every function of n > 1 arguments is actually a function of one argument that returns a function of n-1 arguments as its result. OCaml's function-application syntax, combined with the left-associativity of function application, makes this transparent. (+) 2 is fun a -> (+) 2 a, the function that adds 2 to some number. We can abstract out the 2 to get the function that adds some number to some other number: fun b -> fun a -> (+) b a and that's (+). We can use this technique to define functions of any number of arguments: let f = fun a -> fun b -> fun c -> a + b * c. So what's f 6? The 6 is the "first" argument of f, so the definition above tells us that f 6 is fun b -> fun c -> a + b * c with the a replaced with 6 or fun b -> fun c -> 6 + b * c. Clearly, that's a function (because it's a lambda expression). What's f 6 2? Since f 6 is fun b -> fun c -> 6 + b * c, f 6 2 is fun c -> 6 + 2 * c and so on. This is why OCaml only needs functions of one argument.

OCaml doesn't work this way just to be perverse; it does it because partial application is an extremely useful programming technique, once you get used to it. Also, having only functions of one argument simplifies the type of functions. (In fact most modern functional languages — including Standard ML, Haskell, and Clean — work this way; this way of looking at functions is called Currying after the logician Haskell B. Curry.)

Syntactic Sugar for Function Definitions

Except in specialized circumstances, nobody wants to define a function of three arguments with three nested lambdas: let f = fun a -> fun b -> fun c -> a + b * c. fun actually has a sugared form to make this less verbose: fun a b c -> a + b * c. let supports a similar sugaring: let f a b c = a + b * c. So these are all the same function:

    # let f1 = fun a -> fun b -> fun c -> a + b * c;;
    val f1 : int -> int -> int -> int = <fun>
    # let f2 = fun a b c -> a + b * c;;
    val f2 : int -> int -> int -> int = <fun>
    # let f3 a b c = a + b * c;;
    val f3 : int -> int -> int -> int = <fun>
    #

Function Types

This finally explains the type expressions that OCaml uses for functions. Remember that the type of a function is determined by its domain (the type of its argument) and its range (the type of its result). (Having only unary functions simplifies the notion of function types because we don't need to take the number of arguments into account.) The syntax for a function type is domain -> range. So for example the type of the Boolean negation function, not, is bool -> bool, i.e. it's a function from a bool to a bool. not true is false: true is a bool, and false is a bool.

What's the type of (+)? As we saw above, (+) takes a parameter of type int and returns a function: a function that takes another int and returns an int. The type of the latter function — (+)'s result — is int -> int, so the type of (+) itself is clearly int -> (int -> int). The type operator -> is right-associative, so there's no need to parenthesize: the type of (+) is int -> int -> int.

Note that while the type operator -> is right-associative, function application (juxtaposition) is left-associative! This minimizes parentheses in both cases, and there's no confusion because the realm of values and the realm of types are quite distinct.

Now that we know how and why all functions are unary, I will use the phrase "a function of n arguments" to mean "a function that must be applied n times to be fully applied i.e. to return a non-function value".

Whenever you see a function type, you can know how many arguments it takes by counting the number of arrows. So a function with this type int -> int -> int takes two arguments (i.e. it's a binary function). The rightmost type is the range of the function and all the other types describe the domain. So this function takes two int's and returns an int.

There's one more way to define functions with more than one argument: gathering the multiple arguments into a single compound data structure. You could use any compound data structure — a list, an array, a record — but the most natural one is a tuple:

    # let f4 (a,b,c) = a + b * c;;
    val f4 : int * int * int -> int = <fun>
    # f4 (2,3,4);;
    - : int = 14
    # [f1 2 3 4; f2 2 3 4; f3 2 3 4; f4 (2,3,4)];;
    - : int list = [14; 14; 14; 14]
    #

Note that this one isn't really the same function as the first three — its type is different! You can see that this function is (of course) unary by counting the arrows: it takes one argument, a tuple of type int * int * int. Also, the role of (a,b,c) in the argument position of the definition is actually an OCaml pattern match, which I won't explain here.

This may make your OCaml code look more like a "normal" language, but I recommend you don't do this in general. You can't partially apply a function defined this way, and it's not idiomatic.

`let rec ... and ...` — Simultaneous Definitions

let has one more trick up it's sleeve. Since definitions must occur before use, you might wonder how to define mutually recursive functions. Here's a nice definition of a pair of functions take, which returns every other element of its list argument starting with the first, and skip, which also returns every other element but starts with the second (skipping the first). They can be conveniently defined in terms of each other:

    # let take list =
	if list = []
	then []
	else List.hd list :: skip (List.tl list)
      ;;
    Unbound value skip
    # let skip list =
	if list = []
	then []
	else take (List.tl list)
      ;;
    Unbound value take
    #

Since they're mutually recursive, neither one can be defined first! Some languages, like Pascal and C, solve this with the notion of a forward declaration, but OCaml uses simultaneous definitions via let rec ... and .... You can chain together any number of definitions with and, and all the names defined in the let are simultaneously defined in all the bodies:

    # let rec take list =
	  if list = []
	  then []
	  else List.hd list :: skip (List.tl list)
      and skip list =
	  if list = []
	  then []
	  else take (List.tl list)
      ;;
    val take : 'a list -> 'a list = <fun>
    val skip : 'a list -> 'a list = <fun>
    # take [1;2;3;4;5;6;7];;
    - : int list = [1; 3; 5; 7]
    # skip [1;2;3;4;5;6;7];;
    - : int list = [2; 4; 6]
    #

(I got this pair of functions from Jeffrey D. Ullman: Elements of ML Programming, Upper Saddle River, NJ: Prentice Hall, 1998 — it's a book on Standard ML, but the translation to OCaml was trivial.)

Labels: Optional and Default Parameters

Coming soon.

Defining and Applying Functions