Bridging the Gap: OCaml and C Interoperability
One of the quiet superpowers of OCaml is its ability to reach outside its own runtime and talk to C. This matters because while OCaml provides strong typing, pattern matching, and a high-level functional programming model, sometimes you need to access low-level facilities, squeeze out performance-critical paths, or leverage existing C libraries that have stood the test of decades. Interoperability between OCaml and C is therefore not an afterthought — it’s a core part of how OCaml is actually used in practice in systems programming, formal verification tools, and industrial-scale compilers.
In this post, we’ll explore how OCaml and C can interoperate, what patterns are common, and where the pitfalls lie.
Why Interop Matters
There are three main reasons OCaml developers turn to C:
Performance: OCaml's native-code compiler is fast enough for most workloads, but C can still win in tight loops, cryptography, or numerics, where boxed values and allocation add measurable overhead.
Access to System APIs: Many operating system interfaces and libraries are exposed primarily through C headers. To talk to the OS or native libraries, you need to bridge through C.
Reusing Existing Libraries: There’s a huge ecosystem of existing C code (e.g., SQLite, zlib, OpenSSL). Rewriting them in OCaml would be wasteful — interop allows you to reuse them directly.
The OCaml Foreign Function Interface (FFI)
At the heart of interop lies the OCaml FFI. The basic model is:
On the OCaml side, you declare a function as external.
On the C side, you implement the function using the OCaml runtime API.
Here’s the simplest possible example:
(* hello.ml *)
external hello_world : unit -> unit = "caml_hello_world"
let () = hello_world ()
And the corresponding C code, in hello_stubs.c:
#include <caml/mlvalues.h>
#include <stdio.h>
CAMLprim value caml_hello_world(value unit) {
  printf("Hello from C!\n");
  return Val_unit;
}
The C function must follow a few rules:
It receives and returns OCaml values, not raw C ints/pointers.
You use macros like Val_unit, Val_int, Int_val, etc. to convert between the OCaml runtime representation and C values.
You then compile and link:
ocamlopt -c hello.ml
ocamlopt -c hello_stubs.c
ocamlopt -o hello hello.cmx hello_stubs.o
Run it, and OCaml calls into C seamlessly.
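To make the conversion macros less mysterious, here is a small pure-OCaml sketch of the tagged-integer encoding that Val_int and Int_val perform on the C side. It is a simplified model rather than the runtime's actual code: the names val_int, int_val, and is_immediate are made up for illustration, and the Obj module is used only to peek at the representation.

```ocaml
(* Simplified model of the runtime's tagged integers.
   val_int / int_val are illustrative names that mirror the C macros
   Val_int / Int_val; they are not part of any library. *)
let val_int (n : int) : int = (n lsl 1) lor 1  (* shift left, set the tag bit *)
let int_val (v : int) : int = v asr 1          (* arithmetic shift drops the tag *)

(* Obj exposes the same immediate-versus-block distinction from OCaml. *)
let is_immediate (x : 'a) : bool = Obj.is_int (Obj.repr x)

let () =
  Printf.printf "round trip of -7: %d\n" (int_val (val_int (-7)));
  Printf.printf "unit is immediate: %b\n" (is_immediate ());
  Printf.printf "string is immediate: %b\n" (is_immediate "hello")
```

Unit and integers are immediate values, while strings are heap blocks. That is why returning Val_unit needs no allocation, whereas returning a string from C does.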
Passing Data Across the Boundary
Simple values like integers and unit are straightforward. But what about more complex structures?
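Integers and floats: Use Val_int, Int_val, caml_copy_double, Double_val.
Strings: OCaml strings carry an explicit length and may contain embedded NUL bytes. String_val gives you a const char * pointer to the bytes (the runtime always stores a terminating NUL after them), and caml_string_length returns the true length.
Arrays: Use Field(array, i) to access elements, keeping in mind that the elements are OCaml values. Float arrays are the exception: they are stored unboxed by default, so read them with Double_flat_field(array, i).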
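The string caveat is easy to see from the OCaml side alone. The sketch below compares the length OCaml reports with the prefix a C function that stops at the first NUL would see, and adds a guard one might run before passing a string to such a function. The helpers c_visible_prefix and check_c_string are hypothetical names introduced for this example.

```ocaml
(* The part of an OCaml string that a NUL-terminated C API would see.
   c_visible_prefix is an illustrative helper, not a library function. *)
let c_visible_prefix (s : string) : string =
  match String.index_opt s '\000' with
  | Some i -> String.sub s 0 i
  | None -> s

(* A guard to apply before handing a string to C code that expects a
   conventional C string. *)
let check_c_string (s : string) : (string, string) result =
  if String.contains s '\000' then Error "embedded NUL byte"
  else Ok s

let () =
  let s = "abc\000def" in
  Printf.printf "OCaml length: %d, C-visible length: %d\n"
    (String.length s)
    (String.length (c_visible_prefix s))
```

Whether to reject such strings or to pass an explicit length to the C side depends on the API being wrapped; binary-safe functions should use caml_string_length rather than rely on the terminator.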
Example with strings:
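(* OCaml side *)
external shout : string -> unit = "caml_shout"
/* C side */
#include <stdio.h>
#include <caml/mlvalues.h>
CAMLprim value caml_shout(value v_str) {
  /* printf("%s") stops at the first NUL byte, so this assumes no embedded NULs. */
  const char *str = String_val(v_str);
  printf("%s!!!\n", str);
  return Val_unit;
}
Memory Management and the Garbage Collector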
One of the trickiest parts of interop is OCaml’s garbage collector (GC). The GC can move objects around in memory whenever an allocation happens, so naïvely holding a value, or a pointer into the OCaml heap, across any call that may allocate or call back into OCaml is unsafe.
That’s where GC roots come in. If you want to keep an OCaml value around in C code across allocations, you must register it as a root:
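CAMLparam1(v_str);
CAMLlocal1(copy);
mlsize_t len = caml_string_length(v_str);
copy = caml_alloc_string(len);                    /* may run the GC and move v_str */
memcpy(Bytes_val(copy), String_val(v_str), len);  /* so take the pointer only now */
CAMLreturn(copy);
The CAMLparam and CAMLlocal macros (from <caml/memory.h>) tell the GC about your local variables. Forgetting them can lead to mysterious crashes. Note that the copy reads String_val(v_str) only after the allocation: a raw char * taken earlier could be left dangling if the GC moved the string. The memcpy call needs <string.h>.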
Callbacks: From C into OCaml
Interop isn’t just one-way. Sometimes your C code needs to call back into OCaml — for example, when registering event handlers or passing function pointers.
OCaml exposes this via caml_callback and its variants:
const value *callback_closure = caml_named_value("my_ocaml_callback");
if (callback_closure != NULL) {
  caml_callback(*callback_closure, Val_int(42));
}
On the OCaml side:
let () = Callback.register "my_ocaml_callback"
  (fun x -> Printf.printf "OCaml got %d\n%!" x)
This mechanism makes it possible to embed OCaml inside a larger C application, not just the other way around.
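One design question with callbacks is what happens when the OCaml function raises. A possible approach, sketched below under the assumption that the C caller wants a C-style status code, is to wrap the handler on the OCaml side so that no exception escapes into C frames. The names on_event and guarded are hypothetical; only Callback.register comes from the standard library.

```ocaml
(* Hypothetical handler that C would look up with caml_named_value. *)
let on_event (x : int) : unit =
  if x < 0 then invalid_arg "negative event code";
  Printf.printf "OCaml got %d\n%!" x

(* Run the handler and turn any exception into a status code:
   0 on success, -1 on failure. Catching every exception is a
   simplification chosen for this sketch. *)
let guarded (f : int -> unit) (x : int) : int =
  match f x with
  | () -> 0
  | exception _ -> -1

(* Register the wrapped handler under the name the C code expects. *)
let () = Callback.register "my_ocaml_callback" (guarded on_event)
```

With this wrapper the C side would read the result of caml_callback with Int_val. The alternative is to keep the handler as is and call it from C with caml_callback_exn, which hands back the exception as a result instead of raising through the C code.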
Higher-Level Interop: ctypes
The low-level FFI is powerful, but boilerplate-heavy. You have to write C stub code, manage roots, and compile extra .c files. For many use cases, the ocaml-ctypes library is a better fit.
With ctypes, you can bind directly to C functions from pure OCaml code, without writing C stubs. Example:
open Ctypes
open Foreign
let puts = foreign "puts" (string @-> returning int)
let () = ignore (puts "Hello via ctypes!")
This looks up puts at run time among the symbols already loaded into the process (which normally includes libc) and calls the real function. ctypes handles marshalling, struct layouts, and calling conventions. It’s not quite as fast as handwritten stubs, but far more productive.
Common Pitfalls
Forgetting GC macros: Leads to subtle corruption bugs.
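Misaligned expectations of string null-termination: OCaml strings may contain NUL bytes, so C functions that stop at the first NUL (printf with %s, strlen) silently see a truncated string.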
Mixing OCaml allocation with raw malloc: Can lead to mismatched memory ownership.
Performance overhead: Each call across the OCaml–C boundary has a cost; batch work where possible.
Use Cases in the Real World
Coq Proof Assistant: Written in OCaml, with C code in its runtime for the bytecode virtual machine that speeds up evaluation of proof terms.
CompCert Compiler: A formally verified C compiler, developed largely in Coq and extracted to OCaml, that still relies on the system C toolchain for preprocessing, assembling, and linking.
MirageOS: Unikernel framework written in OCaml that calls out to C for cryptographic primitives and low-level platform glue.
These projects demonstrate that OCaml’s C interop is not just academic — it’s industrial-grade.
Conclusion
OCaml and C interoperability is a practical necessity, not a theoretical feature. The FFI provides raw, low-level power when you need it, while higher-level abstractions like ctypes make it accessible for day-to-day bindings. The key challenge is memory management—understanding how the GC works and respecting its invariants. Once you master that, you unlock the ability to bring the best of both worlds: the safety and expressiveness of OCaml, and the raw performance and ecosystem of C.


