Construction, Efficiency, and Production Systems
We talk a lot around here about construction being an inefficient industry. Whereas other industries have shown increasing productivity over time, construction has remained flat or even declined. There is no shortage of depressing graphs showing this.
But these productivity measures are fairly abstract and high-level. They identify the problem, but don’t give many clues as to what the solution might be. So it’s worth drilling down into the mechanisms that underlie a production process, and understanding exactly what has gone wrong in an inefficient one.
A Simple Production Model
In the grand tradition of Adam Smith, let’s consider a simple pin factory. In our factory, coils of wire come in on one end, get processed at a series of workstations, and come out the other side as finished pins. In this simplified model, the manufacturing process has 4 discrete steps:
Cutting - a piece of wire is cut off from the coil.
Straightening - the piece of wire is straightened.
Attaching head - the head of the pin is attached to the body.
Sharpening - the end of the pin is sharpened.
For simplicity, we’ll assume each step in the process takes exactly one second. With this, we can calculate a few different measures of the production process:
The factory’s production rate, or throughput, is one pin per second.
The factory’s work-in-process, the amount of partially completed work currently in the system, is always 4 pins (one at each station).
The factory’s cycle time, the time it takes a pin to go through the factory, is 4 seconds (one second for each of the 4 stations).
The equipment’s utilization rate is 100% at every station - each station is in use at all times.
The queue length at each station is zero - pins don’t spend any time in the process waiting for a piece of equipment to become available.
These metrics tell us how efficient our production process is. More efficient factories will produce more (higher production rate), produce faster (lower cycle time), and hold as little inventory as possible (lower work-in-process).
Our example pin factory is actually a perfectly efficient production process - assuming the times and process arrangement are fixed, the production rate is as high as possible, and the cycle time and work-in-process are as low as possible. The reason it’s so efficient is that there’s no waiting anywhere in this system - as soon as a pin is done with one step of the process, it can immediately proceed to the next step.
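As a quick check on these figures, the sketch below computes the same metrics for a line with fixed step times. It assumes wire is released at the pace of the slowest station, and it uses Little's law (work-in-process = throughput x cycle time), a standard queueing result that the text above does not name. The function `line_metrics` and its field names are made up for illustration.

```python
def line_metrics(step_times):
    """Metrics for a line with fixed (no-variation) step times, in seconds.

    Assumes material is released at the pace of the slowest station,
    so no queues ever form.
    """
    bottleneck = max(step_times)
    throughput = 1.0 / bottleneck          # pins per second
    cycle_time = sum(step_times)           # seconds per pin, with no waiting
    wip = throughput * cycle_time          # Little's law
    utilization = [t / bottleneck for t in step_times]
    return {
        "throughput": throughput,
        "cycle_time": cycle_time,
        "wip": wip,
        "utilization": utilization,
    }


# Cutting, straightening, attaching head, sharpening: one second each.
metrics = line_metrics([1.0, 1.0, 1.0, 1.0])
print(metrics)
```

For four one-second stations this gives a throughput of 1 pin per second, a 4-second cycle time, 4 pins of work-in-process, and 100% utilization at every station, matching the figures listed above.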
But real-life production processes almost never move in perfect lockstep like this. In particular, they often have a significant amount of variation in them. This can be anything from natural process variation (it might take between 10 and 15 minutes for a task to complete), to random equipment breakdown, to occasional production errors that need to be reworked.
So let’s look at a slightly modified version of our pin factory, with a bit of variance injected into it. Instead of each step taking exactly 1 second, each step now takes on average one second, but is distributed normally with a standard deviation of 0.5 seconds.
We might think that this process would yield the same result as the first process - after all, the average process times are exactly the same, and the variation is relatively small. But what in fact happens is that adding variability degrades performance.
Simulating the above factory for 500 seconds yields the following results:
Our production rate has declined a bit - it’s at 92% of the theoretical maximum (still pretty good!). But the WIP and cycle time are much worse - instead of 4 pins in the system, we have almost 40, and instead of a 4-second cycle time, we have over 20. What’s going on?
The culprit is the variation. With work proceeding sequentially, sometimes the upstream process will be faster than the downstream one - when this happens, queues build up. But if the downstream process is faster than the upstream and there’s no queue, that extra speed is wasted - the downstream process has to sit and wait until the upstream one finishes. Chaining together processes means that sometimes your ‘good’ outcomes will be screened off while you wait for something farther up the line.
The result is that material tends to accumulate in the system as queues form. The longer the queues, the more time it takes for material to progress through the system. If we run the simulation for longer, we find that production actually approaches the theoretically optimal level, but WIP and cycle time continue to rise.
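A minimal sketch of this kind of simulation is below. It is not the code behind the figures quoted above, and it rests on several assumptions of mine: one piece of wire arrives every second, there is unlimited queue space between stations, pins are handled first-come first-served, and the normal distribution is clamped at 0.05 seconds so a step never takes zero or negative time (which nudges the average step time slightly above one second). The names `simulate_line` and `noisy_second` are hypothetical.

```python
import random


def simulate_line(horizon, draw_time, n_stations=4, seed=0):
    """Simulate a serial line with unlimited queue space between stations.

    Assumptions (not from the article): one piece of wire arrives every
    second, pins are processed first-come first-served, and
    `draw_time(rng)` returns the time one pin spends being worked on
    at one station.
    """
    rng = random.Random(seed)
    station_free_at = [0.0] * n_stations   # when each station next goes idle
    arrived = int(horizon)                 # arrivals at t = 0, 1, 2, ...
    finished = 0
    total_cycle_time = 0.0

    for i in range(arrived):
        arrival = float(i)
        t = arrival
        for k in range(n_stations):
            start = max(t, station_free_at[k])   # queue if the station is busy
            t = start + draw_time(rng)
            station_free_at[k] = t
        if t <= horizon:                         # pin left the factory in time
            finished += 1
            total_cycle_time += t - arrival

    return {
        "throughput": finished / horizon,        # pins per second
        "wip": arrived - finished,               # pins still inside at the end
        "avg_cycle_time": (
            total_cycle_time / finished if finished else float("nan")
        ),
    }


def noisy_second(rng):
    # Normal(mean 1, sd 0.5), clamped so a step never takes under 0.05 s.
    return max(0.05, rng.gauss(1.0, 0.5))


for horizon in (500, 5000):
    print(horizon, simulate_line(horizon, noisy_second))
```

Passing `lambda rng: 1.0` as `draw_time` recovers the lockstep factory, which makes a useful sanity check before adding noise. Exact numbers depend on the seed and on the arrival and clamping assumptions, so they should not be expected to match the figures in the text.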
What happens if we inject even more variance into the process? Consider another pin factory - in this one, each process takes just 0.4 seconds, but has a 0.1% chance of stopping for 10 minutes whenever it runs.
In this arrangement, each step still has an average time of 1 second. But all our production metrics completely collapse. Running for 5000 seconds yields the following:
Our production rate is now less than 40% of our “optimal” value, we have thousands of pins tied up in the system, and they’re taking over 400x as long to move through it!
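The arithmetic behind "same average, far more variance" can be checked directly. The sketch below treats each run as 0.4 seconds plus a yes/no stoppage, which is my reading of the setup described above; the variable names are illustrative.

```python
import math

process_time = 0.4        # seconds per step when nothing goes wrong
stop_probability = 0.001  # 0.1% chance of a stoppage on each run
stop_duration = 600.0     # 10 minutes, in seconds

# Average step time: the base time plus the expected stoppage time.
mean_time = process_time + stop_probability * stop_duration

# The stoppage is a yes/no event, so its variance is p * (1 - p) * d^2.
std_dev = math.sqrt(stop_probability * (1 - stop_probability)) * stop_duration

print(f"mean step time: {mean_time:.2f} s")
print(f"standard deviation: {std_dev:.1f} s")
```

The mean works out to one second, as stated, but the standard deviation is roughly 19 seconds, compared with 0.5 seconds in the previous factory.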
The takeaway is that the more variation and unpredictability in your production process, the worse it will perform. If you can’t reduce variability, the only option is to buffer against it, either with extra material (as happened naturally in our above simulations), extra time, or extra capacity.
The upshot is that it’s often possible to improve a system’s performance substantially simply by reducing variability. Even production systems with long process times and significant manual labor can be substantially improved if you can control and restructure them in a way that makes them more predictable. One of the major benefits of the assembly line is turning production into a series of well-defined steps that can be performed in a predictable amount of time. Returning again to Adam Smith’s pin factory, simply rearranging the workers so that each one performed just one or two steps massively increased production.
This same queueing model can be applied to any sort of production system - any system where a set of inputs is transformed, step by step, into a set of outputs. In software development, for instance, things like scrum, agile, and devops all stem from this basic production framework (particularly lean methods and the Toyota Production System).
For construction, we can think of it the same way - work consists of a series of processes (design, foundations, framing, MEP, etc.) that gradually transform raw materials into a finished building. It’s more complex than our simple pin factory, but the same rules can be applied.
Using this lens makes the deficiencies in the construction process obvious:
Throughput is extremely low - only a small fraction of a building gets completed each day. For single family construction, an hour of labor produces just 1.15 square feet of building, roughly 0.07% of a 1600 square foot home.
Cycle time is incredibly high - buildings take months or years to move from the initial design process through completing construction.
And work-in-process is enormous - while a building is under construction, millions of dollars of materials and labor are tied up in the partially completed building.
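To put the throughput figure in perspective, the short calculation below reproduces the 0.07% share and also inverts it to get total labor hours for the same home. Only the two input numbers come from the text; the variable names are mine.

```python
sqft_per_labor_hour = 1.15   # figure quoted above for single family construction
home_size_sqft = 1600

share_per_hour = sqft_per_labor_hour / home_size_sqft
labor_hours_per_home = home_size_sqft / sqft_per_labor_hour

print(f"share of home completed per labor hour: {share_per_hour:.2%}")
print(f"labor hours for the whole home: {labor_hours_per_home:.0f}")
```

At that rate, a 1600 square foot home takes on the order of 1,400 labor hours.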
The above model suggests that process variability might be a source of inefficiencies in the construction industry.
Construction is rife with variability, at nearly every level of the process. At the micro level, workers rarely make repetitive movements or go long stretches doing the same thing - they’re constantly changing tasks, moving around the jobsite, having to go find a tool or acquire material, or waiting for something to arrive. Unlike a factory (where the work can take place at a particular station, and supplies can be carefully orchestrated to prevent slowdowns), construction workers must be constantly moving themselves, their equipment, and their material around the jobsite. Not only does this increase the time it takes to complete tasks, but it makes it harder to predict how long they’ll take.
At a level up from this, lack of coordination between subcontractors, designers and contractors means that tasks are often re-done or increased in scope. The insulation installer might need to re-do much of their insulation after the electrician pulled it out to run their wiring, or the plumber might need to drill through a solid wood beam that the drawings didn’t show (that the framer will then have to go repair). Any given task has a non-negligible possibility of taking double or triple the time predicted.
One more level up, we have the variability in the environment itself. Instead of a controlled factory, construction takes place outside, on the ground, in the open air. Rain, wind, cold, poor soil conditions, even traffic can all cause unpredictable delays to a project.
And zooming out to the 10,000-foot view, the entire construction process is not executing a well-defined plan, but gradually figuring out what needs to be built. Architects produce an initial set of drawings, send it out to engineers, who come back with questions, comments, and suggestions, and the drawings get gradually refined. These drawings then get sent out to subcontractors, who repeat the process with their own questions, comments, and suggestions. And this entire set of drawings gets sent out to the site crews, who are tasked with figuring out how to turn it into a finished building. This inevitably entails more questions, comments, and suggestions, a process that doesn’t stop until the last nail is hammered in, months or years after the process began.
In a more streamlined production process, the step of figuring out what needs to be made is wisely kept separate from actually making it. But in construction, they’re deeply intertwined. All this adds up to a process that’s wildly difficult to predict with any accuracy.
Many of the slowest, most variable elements of the construction process fall into the category of “setup time”.
Setup time is the time it takes to get ready at the beginning of a production process. Whenever a worker or piece of equipment needs to change what they’re doing, there’s a setup involved (traditional factories often strive to make large numbers of identical items specifically to avoid setups).
Construction has an enormous number of setups. Every time a worker puts down a hammer and picks up a saw, every time a crew moves to a different part of the building, there’s a setup. Every time the superintendent has to look at a set of plans, every time the crane unhooks from one piece and hooks on to another, there’s a setup. The months architects and engineers spend producing the drawings for the building are one long setup for the actual construction process.
Setup time can dramatically exceed the actual process time. A nail leaves a nailgun in a fraction of a second, but it can take minutes to get the material to be nailed into position.
If you can reduce your setup time, not only do you increase your throughput and decrease inventory, you make your production process more flexible, by making it less costly to change what you’re making. This is one of the key insights behind the Toyota Production System, which let Toyota efficiently produce a smaller number of cars with greater product variety. Toyota became so good at reducing setup time that they were able to reduce certain equipment change times from 3 days to less than 10 minutes.
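The link between setup time and flexibility can be sketched with a simple per-unit calculation: one setup is spread across every unit in a batch, so a long setup forces large batches. The one-minute process time and the batch sizes below are hypothetical, and "3 days" is assumed to mean 72 hours (4,320 minutes); only the 3-day and 10-minute changeover figures come from the text.

```python
def minutes_per_unit(process_min, setup_min, batch_size):
    """Average time per unit when one setup is spread over a whole batch."""
    return process_min + setup_min / batch_size


process_min = 1.0        # hypothetical: one minute of actual work per part
old_setup_min = 4320.0   # 3 days, assumed here to mean 72 hours
new_setup_min = 10.0     # after setup reduction

for batch_size in (1, 10, 100, 1000):
    old = minutes_per_unit(process_min, old_setup_min, batch_size)
    new = minutes_per_unit(process_min, new_setup_min, batch_size)
    print(f"batch of {batch_size:>4}: {old:8.1f} min/unit vs {new:5.2f} min/unit")
```

Under these assumptions, a batch of 100 parts still averages over 44 minutes per part with the three-day changeover, but barely more than the one-minute process time with the ten-minute changeover. That is what makes small, varied batches economical.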
Process and Scale
Something interesting about this sort of production model is the fractal nature of it. As you drill down, each individual step consists of several sub-steps, each with its own setup time, variation, failure rate, etc.
So we might model nailing a piece of sheathing to a wall as a setup (getting the sheathing in place, getting the nails, finding the hammer, positioning the sheathing) followed by the nailing process. But as you drill down, each component of that is its own sub-process - each nail drive has a process time (the hammer strike) and a setup time (positioning the hammer). Setups, too, have their own structure - positioning the sheathing is a sequence of moving it (process), seeing where it moved (setup) and moving it again.
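One way to picture this nesting is as a recursive data structure, where every step carries its own setup and process time and may contain further steps. The sketch below is only an illustration: the `Step` class is hypothetical and all of the times are made-up numbers, not measurements.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """A step with its own setup and process time, plus optional sub-steps."""
    name: str
    setup: float = 0.0      # seconds spent getting ready
    process: float = 0.0    # seconds spent doing the work itself
    repeats: int = 1        # how many times the step is performed
    substeps: list = field(default_factory=list)

    def total_time(self):
        inner = sum(s.total_time() for s in self.substeps)
        return self.repeats * (self.setup + self.process + inner)

    def setup_time(self):
        inner = sum(s.setup_time() for s in self.substeps)
        return self.repeats * (self.setup + inner)


# Made-up numbers: 2 s to position the hammer, 0.5 s per strike, 20 nails,
# and 2 minutes to get the sheathing and tools into place.
nail = Step("drive nail", setup=2.0, process=0.5, repeats=20)
sheathing = Step("nail sheathing to wall", setup=120.0, substeps=[nail])

print(sheathing.total_time(), sheathing.setup_time())
```

With these invented figures, 160 of the 170 seconds are setup at one level or another, which is the pattern the nail-gun example describes.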
This means it’s often possible to create controlled micro-environments that encapsulate a particular portion of a process, and significantly improve its “production rate”. If you look inside a nail gun, you’ll see several processes strung together - a trigger engages a valve, which releases air, which propels a nail forward. We can think of a nail gun as sort of a micro factory, transforming a stationary nail into one with forward momentum.
Despite its reputation, construction does see its share of advances and innovation. But they mostly occur at this lower level, as improvements to one small facet of the process. So a manufacturer may come out with an exterior cladding product that’s faster and easier to install, or produce a new power tool that can more quickly install fasteners, or a new piece of software that makes PDF manipulation easier. But these are embedded in a larger, unchanging process, and ultimately have limited impact.
The Way Forward for Construction
The reason factory production is often synonymous with efficiency is that a factory is a good environment for solving all these problems. A factory gives you control over your environment, making it possible to screen off a lot of potential variation, and control how production is structured. And it allows you to invest in various improvements that can pay off over a long period of time or be amortized over a large number of jobs. A construction site that’s exposed to the elements, and where most of the work is done by subcontractors interacting with each other, inherently has much higher variability.
But as we’ve seen, factory-based construction brings with it its own problems. And evidence suggests the gains are limited. So it’s worth considering if we might be able to get the benefits (controlled process, reduced variability) in some other way.
 - One way to see this: assume a two-step process, where each step has two speeds, fast (1 second) and normal (2 seconds), that occur at random. A naive calculation will get an average process time of 1.5 + 1.5 = 3 seconds. But enumerating the possible outcomes gives a different answer.
The process time will be higher until the queues are long enough to provide a buffer against waiting.
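The enumeration in the first footnote can be sketched as follows, under one possible reading of it. The assumptions here are mine: the two speeds are equally likely, there is no queue between the steps, and the two stations move in lockstep, so neither can start its next pin until both have finished. Under that reading each stage lasts as long as the slower of the two stations.

```python
from itertools import product

speeds = [1, 2]  # fast and normal step times in seconds, assumed equally likely

# Every combination of (step 1 time, step 2 time).
outcomes = list(product(speeds, repeat=2))

# Naive view: each step averages 1.5 s, and a pin passes through two steps.
naive = 2 * sum(speeds) / len(speeds)

# Lockstep assumption: neither station can start its next pin until both
# are done, so each stage effectively lasts as long as the slower station.
stage_times = [max(a, b) for a, b in outcomes]
avg_stage = sum(stage_times) / len(stage_times)
lockstep = 2 * avg_stage

for (a, b), stage in zip(outcomes, stage_times):
    print(f"step 1: {a} s, step 2: {b} s -> stage lasts {stage} s")
print(f"naive: {naive} s, lockstep: {lockstep} s")
```

Three of the four combinations take the full 2 seconds, so the average stage lasts 1.75 seconds rather than 1.5, and a pin takes 3.5 seconds on average instead of 3. The fast outcomes only help when both stations happen to be fast at once.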
 - It works in the other direction as well - a factory can be thought of as a single step in a firm’s production process. Firms too generally strive to minimize setup cost by targeting specific markets and business models. A whole firm could retool to change what it produced if it needed to, as Intel did in the 70s, or as every manufacturer did temporarily during WWII.