Programming-in-the-large requires a number of features of a programming language; in
OCaml, many of these are achieved through the module system.
- Namespaces. In large programs it's important to be able
to avoid name collisions; by defining values, types, exceptions and functions in a
module, you force the user to qualify those names with the module's name (in the
familiar dot notation: Module.component).
- Information hiding. By giving a module a signature,
you can prevent certain names from being referenced from outside the module.
Thus, you can define helper functions that exploit knowledge about the
implementation types but prevent users from calling these functions. If you see
this as "fascist", then your signature can expose everything (this is the default
with no signature), but if you see this as a way to allow yourself to change the
underlying implementation without breaking any of the code that uses your module,
you can use a more restricted signature. OCaml gives you the choice.
- Abstract types. Most modern languages give you namespaces
and some way of doing information hiding. OCaml's module system goes a step further
by allowing you to abstract over the types in your modules. When you abstract over
the values or expressions in a piece of code, you are defining a parameterized
function. OCaml gives you parameterized modules. Suppose you define a module
implementing sets of integers. You can generalize this module by making the type a
parameter, so that you now have sets of anything. But you may have restrictions on
the type -- perhaps it needs to have an equality relation to be suitable (it's hard to
imagine how you could implement sets of things that you can't compare). OCaml lets
you express this restriction and use functors to build new modules of the right
type (functors are functions from modules to modules).
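The three features above can be sketched in a few lines. This is only an illustration, not code from any library: the module names Geometry, Fraction, EQ and MakeSet, and everything inside them, are hypothetical names invented for this example. The set implementation uses a plain list for brevity; a real one would want something faster.

```ocaml
(* Namespaces: names defined inside a module are reached with dot notation. *)
module Geometry = struct
  let pi = 3.14159
  let area r = pi *. r *. r
end

let unit_area = Geometry.area 1.0

(* Information hiding: the signature leaves out [gcd] and [normalize], so
   outside code cannot call them, and it keeps the representation of [t]
   abstract, so outside code cannot depend on it being a pair. *)
module Fraction : sig
  type t
  val make : int -> int -> t
  val to_string : t -> string
end = struct
  type t = int * int
  let rec gcd a b = if b = 0 then a else gcd b (a mod b)
  let normalize (n, d) = let g = gcd n d in (n / g, d / g)
  let make n d = normalize (n, d)
  let to_string (n, d) = string_of_int n ^ "/" ^ string_of_int d
end

(* Parameterized modules: the restriction "the element type must have an
   equality relation" is written as a signature... *)
module type EQ = sig
  type t
  val equal : t -> t -> bool
end

(* ...and the functor builds a set module for any module satisfying it. *)
module MakeSet (E : EQ) : sig
  type t
  val empty : t
  val add : E.t -> t -> t
  val mem : E.t -> t -> bool
  val size : t -> int
end = struct
  type t = E.t list
  let empty = []
  let mem x s = List.exists (E.equal x) s
  let add x s = if mem x s then s else x :: s
  let size = List.length
end

(* Two instantiations: one from an inline module, one from the standard
   String module, which already provides [t] and [equal]. *)
module IntSet = MakeSet (struct
  type t = int
  let equal (a : int) b = a = b
end)

module StringSet = MakeSet (String)
```

With these definitions, Fraction.normalize is rejected by the compiler as unbound, while IntSet.add and StringSet.add each accept only elements of their own type.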
I will be the first to admit that OCaml's powerful module system (which is very similar
to that of Standard ML) is perhaps the most complex part of the language, and that I
really don't understand it yet. Fortunately, OCaml makes it very easy to get started.
Simply by defining values and functions in a separate file you define a module whose
name is the base part of the filename, with its first letter capitalized. You also get a default signature for free, which
exposes everything in the module. Just compile this module together with your main
program and it works. You can worry about information hiding and type abstraction
later.
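As a rough sketch of what this looks like, suppose the definitions below lived in a hypothetical file named counter.ml (the file name and the functions in it are made up for this example). To keep the listing self-contained, an explicit module Counter stands in for that file; the effect on the code that uses it is the same.

```ocaml
(* Stand-in for the contents of a hypothetical file counter.ml. Putting the
   three definitions in that file, with no "module ... = struct ... end"
   wrapper, would give a module named Counter with the same contents. *)
module Counter = struct
  let start = 0
  let step n = n + 1
  let twice n = step (step n)
end

(* The main program refers to everything through the module name. With no
   signature given, every definition in Counter is visible. *)
let () =
  Printf.printf "%d\n" (Counter.twice Counter.start)
```

With the two-file layout, and assuming the main program is in a file called main.ml, a command such as `ocamlc -o main counter.ml main.ml` compiles both together; the file defining the module has to come before the file that uses it.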
Parameterized modules may sound like the classes of an object-oriented language, and
they are in fact another way OCaml lets you avoid doing OO. There are
advantages to this: calling the functions from a parameterized module is just as
efficient as an ordinary function call (which is very efficient in OCaml),
without any of the overhead associated with instantiating objects and calling
methods. And while the module system has complexities (from the point of view of
strong typing), in many ways it's much simpler than an object system -- especially
an object system (like OCaml's) that also gives you the advantages of strong typing.
Finally, the OCaml module system is unique (even compared to Standard ML) in that it
gives you strong static typing with the efficiency of separate compilation. OCaml
can even perform most optimizations, including function inlining, across
separately-compiled module boundaries.
What You Need to Know to Use Somebody Else's Modules
Module Signatures for Free