GNU Make: A Build Tool Not A Task Runner

Some time ago, while traversing the deep trenches of online technical forums, I came across a link to an interesting paper titled "Recursive Make Considered Harmful" by Peter Miller.

This paper, in my opinion, is a must-read for users of the GNU Make tool.
It provides valuable insight into using Makefiles that you simply would not get from reading the GNU Make Manual.

Many times I see GNU Make treated as a task runner rather than a build tool. Perhaps that's the reason why there are so many alternatives out there.

GNU Make is not a task runner, at least not a very good one.

A task runner is best suited for executing one-off "tasks" or "jobs" upon a user's request. For example, npm scripts or the cron daemon. A build tool, on the other hand, should facilitate the compilation of a project and its dependencies in an orderly, deterministic fashion.

The two goals are very different.

I first started using GNU Make seriously after much displeasure with lengthy Gruntfiles and Gulpfiles. Being a Linux user with a strong affinity for the CLI, the idea of importing a module to copy the contents of a file did not sit well with me.

npm scripts were a much more pleasant experience; however, there is only so much one should try to fit in a single line of JSON.

I avoided GNU Make in the past due to what I perceived as confusing syntax, not to mention how common it is to find complex Makefile hierarchies in open source projects.

An attractive feature projects like Gulp and Grunt have over GNU Make is the marketing. The landing pages for their websites instantly give you the feeling that you are using a mature, well-maintained project.

Contrast that to GNU Make where a naive first impression might be "Is this page archived?".

Still, the GNU Make documentation is very informative and a must-read before using the tool. It provides valuable insight into syntax, terminology and features; however, it falls short on how to properly structure your project.

Initially I approached GNU Make like another task runner, so naturally, after some instances of the "X is up to date" message, I learned about the ".PHONY" special target and used it everywhere.

It did not take too long for my Makefile to look like another Gruntfile, Gulpfile or overloaded npm scripts section.

This is where Peter Miller's paper comes in:

This paper explores a number of problems regarding the use of recursive make, and shows that they are all symptoms of the same problem.

What the paper addresses is the issue of project authors configuring their project builds to run the make command in each sub-module of the project.

Let's say you have the following project layout:

.
├── main
│   ├── complex.c
│   └── Makefile
├── Makefile
├── shared-objects
│   ├── complex.c
│   └── Makefile
└── some-asm
    ├── complex.c
    └── Makefile

In the above project layout we have three modules to compile: main, shared-objects and some-asm. Each of these modules may have some unique, complicated build process that needs to take place before the main project can be built.

The order matters as well: some-asm must first be assembled into native assembly code, then shared-objects can be built, and finally the main project is built.

The top-level Makefile might look like this:


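# Top-level Makefile: delegates to the Makefile in each module directory.
# The prerequisites of "all" are listed in the required build order.
all: some-asm shared-objects main

main:
	$(MAKE) -C main

shared-objects:
	$(MAKE) -C shared-objects

some-asm:
	$(MAKE) -C some-asm

# The targets share their names with directories, so declare them phony.
.PHONY: some-asm shared-objects main all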

(Note that each module has its own Makefile.)

The seemingly innocent approach you may be tempted to take to keep your code organized is to place a separate Makefile in each module and have your top-level Makefile call make on each one via a "task" called "all".

The paper calls this "recursive make" and describes various problems
that it leads to, some of which I encountered myself.

In a typical JS project, one module may need to compile some CSS files, another Babel or TypeScript. You may even copy or fetch some binary files such as images or PDFs.

When it comes to running these as one-off tasks, Grunt and Gulp probably make for a better experience than Make. After all, remember: GNU Make is not a task runner.

I see task runners as being somewhat imperative, whereas the concept behind Make is a bit smarter than that.

From the paper:

Make determines how to build the target by constructing a directed
acyclic graph.

That's right, a Directed Acyclic Graph, or DAG!

Unless you have a strong math background, you may not be familiar with the term, especially as applied to software development.

I'll present a definition from Wikipedia:

In mathematics, particularly graph theory, and computer science, a directed acyclic graph is a finite directed graph with no directed cycles.

That's right, graph theory! I happen to own a copy of "Discrete Mathematics for Computing" by another Peter, Peter Grossman.

In the chapter "Introduction to graph theory" he writes:

The objects that we study in the branch of mathematics known as graph theory are not graphs drawn with x and y axes. In this chapter, the word 'graph' refers to a structure consisting of points (called 'vertices'), some of which may be joined to other vertices by lines (called 'edges') to form a network.

A brutally simplistic way of describing a "graph" in "graph theory" is a collection of points called "vertices" and lines called "edges" that connect them, usually representing some relationship between the points.

An acyclic graph means there is no way to follow any of the lines (edges)
from one point and get back to that point. That is to say, no loops.
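
To make the idea concrete, here is a minimal Python sketch. It is not taken from the paper or from Make itself. It models the three modules of the example project as a graph using the standard library's graphlib module (Python 3.9+). The edges are my reading of the build order described earlier, and the function name build_order is made up for illustration.

```python
from graphlib import CycleError, TopologicalSorter

# Each key is a vertex; the set holds the vertices it depends on.
# An edge therefore reads "the key needs these built first".
project = {
    "some-asm": set(),
    "shared-objects": {"some-asm"},
    "main": {"shared-objects"},
}


def build_order(graph):
    """Return the vertices in an order that respects every edge."""
    return list(TopologicalSorter(graph).static_order())


print(build_order(project))

# Add an edge that closes a loop: some-asm now needs main.
cyclic = {**project, "some-asm": {"main"}}
try:
    build_order(cyclic)
except CycleError:
    print("not a DAG: the graph contains a cycle")
```

The first call prints ['some-asm', 'shared-objects', 'main']. The second graph has a loop, so no valid order exists and the sorter refuses it, which is exactly why the "acyclic" part matters.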

By now the relation to Makefiles should start to take form.

Later in Peter Grossman's book he writes:

in the technique known as critical path analysis, the vertices of a
graph represent tasks to be carried out, and an edge from one vertex to another indicates that one task has to be completed before the other can begin.

That's exactly how building software works these days!

In Browser Land:
`Transpile -> Bundle -> Build Styles -> Compress Assets -> Ship!`

Make actually understands these relationships and decides which tasks need to be run, and in what order, each time you run make. Because of this, it is very important that Makefile rules and targets are written to represent the build dependency graph of your project.

Not to run tasks.

GNU Make works by looking at the modification time of each file it considers a dependency or build target. If the build target does not exist, or is older than any of its dependencies, it will run the recipe you wrote for the target.
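
A rough Python model of that check is sketched below, assuming the rule "rebuild when the target is missing or any dependency has a newer modification time". The helper name needs_rebuild and the file names are hypothetical, and real Make handles many more cases (phony targets, implicit rules and so on).

```python
import os
import tempfile


def needs_rebuild(target, prerequisites):
    """Return True if target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


# Demo in a scratch directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "complex.c")
    obj = os.path.join(tmp, "complex.o")

    open(src, "w").close()
    print(needs_rebuild(obj, [src]))  # the target does not exist yet

    open(obj, "w").close()
    os.utime(src, (1_000, 1_000))     # pretend the source is old
    os.utime(obj, (2_000, 2_000))     # and the object file is newer
    print(needs_rebuild(obj, [src]))

    os.utime(src, (3_000, 3_000))     # "edit" the source afterwards
    print(needs_rebuild(obj, [src]))
```

This prints True, False, True: the target is built when missing, left alone while it is newer than its source, and rebuilt once the source changes.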

This means that each run of the make command is not just a blind
invocation of a command. There are actual calculations taking place
before your code is built.

In small or uncomplicated projects this may seem unnecessary, but as your project grows bigger and has more parts to build, time spent waiting for npm run or gulp <task> to complete adds up.

The Make way of building projects actually scales pretty well, as a well-written Makefile can result in only the parts of the project that have changed being rebuilt.

Make calculates your DAG, looks for the points that have changed, and follows the edges to the other build points that need to be rebuilt.
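
The sketch below illustrates that propagation in Python, again using graphlib (Python 3.9+). It is a toy model rather than Make's implementation: real Make compares file timestamps, whereas here the set of changed files is simply passed in. All file names and the function name stale_targets are hypothetical.

```python
from graphlib import TopologicalSorter

# target -> prerequisites; every name here is made up for illustration.
deps = {
    "build/app.js": {"src/app.ts"},
    "build/bundle.js": {"build/app.js", "build/vendor.js"},
    "build/vendor.js": {"src/vendor.ts"},
    "build/site.css": {"src/site.scss"},
}


def stale_targets(deps, changed):
    """Return the targets affected by `changed`, in a valid build order."""
    dirty = set(changed)
    order = []
    # static_order() yields prerequisites before the targets that need them,
    # so a single pass is enough to carry "dirty" along every edge.
    for node in TopologicalSorter(deps).static_order():
        if node in deps and deps[node] & dirty:
            dirty.add(node)
            order.append(node)
    return order


print(stale_targets(deps, {"src/app.ts"}))
```

Touching src/app.ts yields ['build/app.js', 'build/bundle.js']. The vendor bundle and the stylesheet are not reachable from the changed file, so they are left alone.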

If you have ever tried to pull down and hack on something as large
as Android OS sources you should be able to appreciate the value in that.

Peter Miller's paper goes on to describe how to organize multi-directory projects around a single, non-recursive make invocation, so that none of the benefits above are lost.

Since first reading it, I have used Make for most projects. New build tools such as Broccoli have come up in the Node.js community; however, I have a hard time seeing their benefits over Makefiles, especially when it comes to running shell commands.

GNU Make is in no way perfect, but used correctly it is actually quite
elegant. Most of the issues I have encountered so far are due to the
syntax and implementation itself and none with the DAG concept.

In the future, I intend to follow up this post with more details on
actually using GNU Make and the way I use it with Node.js apps.

If you enjoyed this read, please share.