OCaml for the Skeptical
U of C Library DLDC Informal OCaml Class

Compiling and Running Programs

OCaml gives you several ways to run your program. You can run it interactively by invoking its component functions in the top-level, but this is normally just how you develop and debug your code.

Running Source Files Under the Top-Level

You can also run source files under the top-level like so:

    $ ocaml foo.ml arg1 arg2

and some people like to do this for short OCaml "scripts", but I don't recommend this for several reasons.

  1. It's inherently slower, since your program is re-parsed and re-compiled to byte-code every time you invoke it (this is how many interpreted languages, like Perl and Tcl, always work).
  2. To deploy your program, you'll need to write a shell-script wrapper to hide the invocation.
  3. It's awkward for programs that are structured in several separate source files.
  4. It's awkward for programs that use libraries that aren't in the OCaml core.

You can actually get around the last two problems by embedding top-level directives like #cd "/path/to/source/files";; and #load "foo.cma";; into your source code, but these aren't part of the OCaml language, so your source files can no longer be compiled. I recommend you avoid this. Besides, OCaml's byte-code compiler specifically solves all these problems and it's really not hard to use (especially with one of the third-party tools I recommend below).
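As a rough illustration, here is a small script of the kind that could be run as ocaml foo.ml arg1 arg2. The file name foo.ml and the function describe_args are made up for this sketch. Because it contains no top-level directives, the same file can also be handed to the compiler unchanged.

```ocaml
(* foo.ml: a hypothetical script that reports the arguments it was given. *)

(* Turn an argument vector into one descriptive line per entry. *)
let describe_args argv =
  Array.to_list
    (Array.mapi (fun i arg -> Printf.sprintf "argv[%d] = %s" i arg) argv)

(* Print whatever arguments this run received. *)
let () = List.iter print_endline (describe_args Sys.argv)
```

The logic lives in an ordinary function, so it can be tried out in the top-level while the last line does the work when the file is run as a script.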

Compiling and Linking with the Byte-Code Compiler

Compiling a single-file executable like cat.ml is simplicity itself:

    $ ocamlc -o cat cat.ml

The -o cat option specifies the name for the resulting executable (cat in this case), just as it does for most C compilers. Leave it out and you'll get the traditional name a.out.
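The contents of cat.ml aren't shown here. As a rough idea of what such a single-file program might look like, here is a minimal sketch that copies each file named on the command line to standard output. The helper names copy_channel and cat_file are my own choices for illustration, not anything the compiler requires.

```ocaml
(* cat.ml: a minimal sketch of a cat-like program (hypothetical contents). *)

(* Copy everything from [ic] to [oc] in fixed-size chunks. *)
let copy_channel ic oc =
  let buf = Bytes.create 4096 in
  let rec loop () =
    let n = input ic buf 0 (Bytes.length buf) in
    if n > 0 then begin
      output oc buf 0 n;
      loop ()
    end
  in
  loop ()

(* Copy one named file to standard output; report failures on stderr. *)
let cat_file name =
  match open_in_bin name with
  | exception Sys_error msg -> prerr_endline ("cat: " ^ msg)
  | ic ->
    copy_channel ic stdout;
    close_in ic

let () =
  (* Sys.argv.(0) is the program name; the rest are treated as file names. *)
  for i = 1 to Array.length Sys.argv - 1 do
    cat_file Sys.argv.(i)
  done
```

Unlike the real cat, this sketch does nothing when given no file names; reading standard input in that case is left out to keep it short.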

Compiling and linking a program made up of several source files is almost as easy. The only issue is doing it in dependency order. Consider this trivial program that prints each command-line argument on a separate line, structured as three one-line source files (or modules):

    Filename   Contents
    a.ml       let _ = B.main ()
    b.ml       let main () = Array.iter C.doit Sys.argv
    c.ml       let doit arg = print_endline arg

File a.ml invokes the main function from the module in b.ml, which invokes the doit function from the module in c.ml. If we try to compile this program with ocamlc *.ml we'll be listing the files in alphabetical order, which is the reverse of the dependency order (module A depends on module B, so b.ml must be compiled before a.ml; likewise module B depends on module C). Let's see what would happen:

    $ ocamlc a.ml b.ml c.ml       
    Error while linking a.cmo: Reference to undefined global `B'

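OCaml processes the files in the order given, so a.ml comes first, before anything is known about module B. If b.ml had never been compiled at all, the compiler would stop even earlier, because it wouldn't know the type of B.main; in the run above it is the link step that fails, because a.cmo refers to B before B has been linked in. We just need to list the files in dependency order instead; it's fine to do it in one invocation of ocamlc: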

    $ ocamlc c.ml b.ml a.ml 
    $ ./a.out hello there
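The same ordering rule holds inside a single file, which may make it easier to remember. As a sketch, here are the three one-line modules written as nested modules in one source file; this single-file layout is my own rearrangement for illustration. Each module has to appear before the code that refers to it.

```ocaml
(* Contents of c.ml: depends on nothing, so it comes first. *)
module C = struct
  let doit arg = print_endline arg
end

(* Contents of b.ml: refers to C, so C must already be defined. *)
module B = struct
  let main () = Array.iter C.doit Sys.argv
end

(* Contents of a.ml: refers to B, so it comes last. *)
let () = B.main ()
```

Moving module B above module C makes the compiler reject the file, since C is not yet known at that point. With separate files, the order has to be given on the command line by hand, as above.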

But the easiest thing is to use one of the many third-party utilities like ocamake (see below) that compute dependencies for you:

    $ ocamake *.ml                                                  
    $ ./a.out hello there
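To show the idea behind "computing dependencies", here is a small sketch of a depth-first topological sort over a hand-written dependency table. It is not how ocamake or any other tool is actually implemented, and real tools work out the table by scanning the sources; the names deps and build_order are invented for this example.

```ocaml
(* Each entry pairs a source file with the files it depends on.
   Written by hand here for the a/b/c example (an assumption of this sketch). *)
let deps = [
  "a.ml", ["b.ml"];
  "b.ml", ["c.ml"];
  "c.ml", [];
]

(* Depth-first topological sort: a file is emitted only after all of its
   dependencies. Raises Failure if the dependencies form a cycle. *)
let build_order deps =
  let rec visit (finished, in_progress) file =
    if List.mem file finished then (finished, in_progress)
    else if List.mem file in_progress then
      failwith ("dependency cycle at " ^ file)
    else begin
      let children = try List.assoc file deps with Not_found -> [] in
      let (finished, _) =
        List.fold_left visit (finished, file :: in_progress) children
      in
      (file :: finished, in_progress)
    end
  in
  let (finished, _) =
    List.fold_left (fun state (file, _) -> visit state file) ([], []) deps
  in
  List.rev finished

(* Print the files in an order suitable for the compiler's command line. *)
let () = print_endline (String.concat " " (build_order deps))
```

For the table above this yields c.ml, then b.ml, then a.ml, which is the order used in the working ocamlc invocation.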

A More Complex Project

Let's demonstrate a more complex project with my ocolumn program from earlier. It isn't a big program, but it is enough to illustrate the steps involved in building a multi-file project that uses a third-party library.

My program is written in two source files, plus an interface file; note that OCaml source code files use the extension .ml:

  ocolumn.ml: This is the main program.
  utils.ml: This is a separate module of utility functions that I thought would be generally useful in other programs.
  utils.mli: This is the signature for the Utils module; it gives the types of all the functions, types and values that comprise the module. I let OCaml generate this for me automatically; if you wanted to hide some of this information you would write this yourself; see Modules for more information.

Several steps are needed to compile a program like this (but see below for automated ways!).

First, we need a signature for the Utils module. A signature specifies the interface to a module in terms of types; every module implemented as a separate source file has one. You can write and maintain a module's signature yourself, with comments and so on, as you go along, but we're going to be lazy and let OCaml generate one for us. The signature of a module goes in a file with the extension .mli (module interface), and the compiler's -i option prints the inferred signature on standard output:

    $ ocamlc -i utils.ml > utils.mli
    $ ls -l utils.mli
    -rw-rw-r--  1 keith  keith  1899 May 17 17:31 utils.mli
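To give a feel for what a signature contains, here is a small sketch that writes a module and its signature together in one file. The module name Utils_sketch and its functions are invented for illustration and are not the real contents of utils.ml. The part between sig and end plays the role of an .mli file, and the part between struct and end plays the role of the matching .ml file.

```ocaml
module Utils_sketch : sig
  (* Split [line] on [sep] and trim whitespace around each field. *)
  val fields : char -> string -> string list
end = struct
  (* Not listed in the signature, so it is invisible outside the module. *)
  let trim_all = List.map String.trim

  let fields sep line = trim_all (String.split_on_char sep line)
end

(* Only the names listed in the signature can be used from here. *)
let () = List.iter print_endline (Utils_sketch.fields ',' "a, b ,c")
```

A generated signature would list every definition, trim_all included; writing the signature by hand is how you leave a helper like that out.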

Signatures themselves need to be compiled; .mli files compile to .cmi (compiled module interface) files. The -c option is like a C compiler's -c, meaning compile only (don't try to link):

    $ ocamlc -c utils.mli
    $ ls -l utils.cmi
    -rw-rw-r--  1 keith  keith  4679 May 17 17:32 utils.cmi

Now we can compile our main program to byte-code; a compiled byte-code module gets the extension .cmo (compiled module object-code, I guess) and isn't executable until it's linked. Since the program uses the Utils module, it won't compile if the compiler can't find utils.cmi; that's why we compiled that interface first. But ocolumn.ml also uses a third-party module called Pcre, and the compiler likewise needs to find the .cmi file for that module. Pcre is installed on my system, but in a non-standard place, so I need to tell the compiler where to look for it with the -I option:

    $ ocamlc -c -I /usr/local/lib/ocaml/site-lib/pcre ocolumn.ml
    $ ls -l ocolumn.cmo
    -rw-rw-r--  1 keith  keith  4785 May 17 17:32 ocolumn.cmo

It's time to actually compile the source code of the Utils module into byte-code:

    $ ocamlc -c utils.ml
    $ ls -l utils.cmo
    -rw-rw-r--  1 keith  keith  7819 May 17 17:32 utils.cmo

Now we're ready to link all this compiled byte-code together into an executable program. The Pcre module is already compiled on my system, in a more flexible archive format that uses the extension .cma (we again have to point the compiler to where it lives):

    $ ocamlc -I /usr/local/lib/ocaml/site-lib/pcre -o ocolumn utils.cmo pcre.cma ocolumn.cmo
    $ ls -l ocolumn
    -rwxrwxr-x  1 keith  keith  103920 May 17 17:33 ocolumn*

Notice how we list the byte-code files in dependency order: utils and pcre have no dependency relationship, so they can come in either order, but both must come before ocolumn, which depends on them.

Finally we have the compiled executable, ocolumn:

    $ ./ocolumn -help
    Usage: ocolumn [-i STR] [-m] [-o STR] [-r REGEXP] [-s STR] [-t] [--] file ...
      -i STR: input field separator (default: '%s')
      -m : merge adjacent empty fields (like awk; default: false)
      -o STR: output field separator (default: '  ')
      -r REGEXP: field separator as regular expression
      -s STR: field separator (sets -i and -o)
      -t : noop for compatibility with BSD column
      -- : stop interpreting options
      -help   Display this list of options
      --help  Display this list of options

Easier Ways to Compile

Now, I will be first to admit that that was a lot of work! Fortunately there are many third-party development tools to make it easy to compile an OCaml program.

One example is Markus Mottl's popular OCamlMakefile. If you copy this single file into your project directory, then you can easily make a Makefile for your particular project; my ocolumn Makefile is just this:

    SOURCES = utils.ml ocolumn.ml
    PACKS   = pcre
    RESULT  = ocolumn

    -include OCamlMakefile

and with this I can compile and link with just the command gmake.

Another option is to install Nicolas Cannasse's ocamake application; then ocolumn can be compiled with one command:

    $ ocamake -o ocolumn -I /usr/local/lib/ocaml/site-lib/pcre *.ml pcre.cma

ocamake will also generate a nice, short easy-to-understand Makefile for your project if you like.

I highly recommend that you use either OCamlMakefile or ocamake for your projects, rather than working with the compiler directly.

Compiling and Linking with the Native-Code Compiler

All the steps to generate a native-code executable (instead of byte-code) are the same; we simply need to use the native-code compiler, ocamlopt, instead of the byte-code compiler ocamlc, and some extensions are different:

    File Type                      Byte-Code Extension   Native-Code Extension
    source code                    .ml                   .ml
    module interface (signature)   .mli                  .mli
    compiled module interface      .cmi                  .cmi
    compiled object code           .cmo                  .cmx
    compiled module archive        .cma                  .cmxa

Generating and compiling signatures is exactly the same (you can use ocamlopt for both, but you don't need to). Compiling an object file is done the same way, but you need to be sure to use ocamlopt to get native-code, and the output file will be named with the .cmx extension (you will also get a .o file, which you don't usually need to think about):

    $ ocamlopt -c utils.ml
    $ ls -l utils.o utils.cmx
    -rw-rw-r--  1 keith  keith   1723 May 17 19:02 utils.cmx
    -rw-rw-r--  1 keith  keith  18436 May 17 19:02 utils.o

Linking is done the same way, but again, you need to use ocamlopt and use the .cmx and .cmxa extensions to refer to compiled modules and libraries:

    $ ocamlopt -c -I /usr/local/lib/ocaml/site-lib/pcre ocolumn.ml  
    $ ocamlopt -I /usr/local/lib/ocaml/site-lib/pcre -o ocolumn utils.cmx pcre.cmxa ocolumn.cmx  
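The renaming between the byte-code and native-code link lines is mechanical, as this small sketch shows. The function names native_name and native_link_line are invented for the example, and it covers only the two extensions that differ in the table above.

```ocaml
(* Map a byte-code file name to its native-code counterpart:
   .cmo becomes .cmx and .cma becomes .cmxa; other names are unchanged. *)
let native_name file =
  if Filename.check_suffix file ".cmo" then
    Filename.chop_suffix file ".cmo" ^ ".cmx"
  else if Filename.check_suffix file ".cma" then
    Filename.chop_suffix file ".cma" ^ ".cmxa"
  else
    file

(* Rewrite a whole list of link arguments, keeping their order. *)
let native_link_line files = String.concat " " (List.map native_name files)

let () =
  print_endline (native_link_line ["utils.cmo"; "pcre.cma"; "ocolumn.cmo"])
```

The order of the files is left alone, since the dependency-order rule for linking is the same for both compilers.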

But again, with OCamlMakefile, you can compile to native-code (without changing the project Makefile) by saying gmake nc (nc for native-code) and with ocamake, just give the -opt option.