We’ve talked several times on this substack (as well as in my book) about the learning curve: the observation that the cost of a produced good tends to fall by some constant proportion for every cumulative doubling of production volume. Go from 100 to 200 units and costs might fall by 15%; go from 200 to 400, another 15%; and so on.
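To pin that down in symbols (standard learning-curve notation; the 15% figure is just illustrative):

```latex
% Unit cost after cumulative production Q, with C_1 the cost of the first unit:
C(Q) = C_1 \, Q^{-b}, \qquad \frac{C(2Q)}{C(Q)} = 2^{-b}
% A 15% drop per doubling means 2^{-b} = 0.85, i.e. b = -\log_2 0.85 \approx 0.23.
```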
Thank you for the close reading of this paper — it is much-talked-about and not obvious on a skim, making it a perfect choice. Learning curves are such a tantalizing phenomenon: of central importance, seemingly the product of fundamental regularities (I’m partial to the interconnectedness angle), but with N so small and the interacting dynamics numerous enough that we find it vexingly difficult to isolate clear evidence of causality.
Thank you also for The Origins of Efficiency, which I enjoyed very much.
We see some items whose unit costs increase as production volume increases. For items other than nuclear energy, it seems likely that those increases are driven by rising input costs, perhaps due to constricted supply, which are later alleviated.
In the case of nuclear power, it seems plausible that the safety costs and fuel life-cycle costs increased as we learned more about the technology.
On second thought - safety/pollution costs might apply for other items as well. Also, there is an implied time variable which could be interesting.
Thoughts?
Some of these things are wrong to call a single technology. Perhaps most of them. Transistors and ICs today have essentially zero commonality with what Moore wrote about; indeed, pretty much everything before 1980 was alien - the transistors, the circuits, the lithography, the interconnects, the wiring, the scaling laws - nothing is actually the same.
It really is important to bring economics and the market into it. Moore wrote his original article about the economics of integration. It is that economic engine that has driven the past 60 years, and as it grew, the incentives grew. Exponential growth is built on exponential marginal value, a nice mathematical reason for it to keep happening. The electronics industry is like an enormous suction engine incentivizing any qualifying invention to step up and share the wealth. The inventions get harder but the payoff gets larger. Physical limits be damned, there is always some new idea that steps in. Planarization. Damascene copper. Fins. HiK. EUV. Chiplets. DSPs on interconnects. CPO. Just look at that list of almost unrelated tech - Wright's Law is regular because the economic benefits scale with prior success.
So if you want to understand how long a Wright's Law run lasts and at what slope, look at the economics. How fast does the market expand? Does it expand at all, or is it just chasing fixed or shrinking revenue that gets ever harder to satisfy? How elastic is it when it comes to absorbing innovations? Did changes in the rate reflect changes in market growth?
The paper describes them as technologies, so I've maintained that terminology here.
My thesis is that maintaining that perspective is not harmless; it misdirects insight.
Power laws in general are often difficult to fit. My favorite reference on the pitfalls: https://www.cs.cornell.edu/courses/cs6241/2019sp/readings/Newman-2005-distributions.pdf
The most obvious explanation is that as the main cost contributor is reduced via learning, at some point it is no longer the main bottleneck, and something else with fundamentally different characteristics takes its place.
As an example, for a long time the largest cost in a solar system was the cells/modules themselves. Now those are cheap enough that the labour of installing them, and the field wiring, is actually the biggest cost. The former behaves like a manufactured good. The latter looks a lot like roofing / electrical manual labour. Hence, it should not be a surprise that solar system costs look mature and no longer follow the old trend, even if graphs of module prices do for a while.
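A toy version of that story, with made-up numbers (modules on a 20% learning curve, install labour and field wiring roughly flat per watt):

```python
import numpy as np

# Made-up numbers: module cost on a 20% learning curve, install labour and
# field wiring roughly constant per watt.
learning_rate = 0.20
b = -np.log2(1 - learning_rate)            # Wright's-law exponent (~0.32)

cumulative_gw = np.array([1.0, 10.0, 100.0, 1000.0])
module = 1.00 * cumulative_gw ** -b        # $/W, normalized to $1/W at 1 GW
labor = 0.60                               # $/W, behaves like roofing/electrical work

for q, m in zip(cumulative_gw, module):
    total = m + labor
    # Module prices keep falling on trend, but total system cost flattens
    # as the flat labour component becomes the dominant share.
    print(f"{q:6.0f} GW  module ${m:.2f}/W  total ${total:.2f}/W  labor share {labor/total:.0%}")
```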
What is the difference between Wright’s Law and Moore’s Law?
Wright's Law is falling costs in proportion to cumulative production volume; Moore's Law (the costs version) is falling costs over time.
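In symbols (my notation, not the paper's): Wright's Law ties cost to cumulative volume Q, while the generalized "costs" version of Moore's Law ties it to calendar time t:

```latex
\text{Wright:}\quad C = C_1 \, Q^{-b}
\qquad\qquad
\text{Moore (generalized):}\quad C = C_0 \, e^{-\mu t}
```

The two coincide whenever cumulative production itself grows roughly exponentially in time, which is part of why they're so hard to tell apart empirically.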
I feel the need to be pedantic because I just got done with an essay detailing exactly what Moore predicted and how it changed over time. Moore's Law doesn't actually say anything about cost—it's purely about doubling the number of transistors that can be integrated in an amount of time. I also argue that Wright's Law doesn't really apply to the semiconductor industry because the nodes really don't have that much in common, the way an airplane or utility-scale energy might. There is production-volume-based learning happening, but abstracting Moore's Law to just a special case of Wright's Law is a mistake.
Yeah, the traditional formulation of Moore's Law isn't about costs at all, just about the number of components on an integrated circuit. But folks studying learning curves will sometimes use a "generalized" Moore's Law, which is just constant rate of improvement (cost or something else) per unit time. Nagy et al "Statistical Basis for Predicting Technological Progress" is an example of this sort of paper.
There are a number of related laws. It sounds like Wright's Law is the most generic. For solar energy, there's Swanson's Law which says that the price of solar photovoltaic modules tends to drop 20 percent for every doubling of cumulative shipped volume so that at present rates, costs go down 75% about every 10 years. I'm sure there are a number of other related laws, but Moore's and Swanson's both involve processing silicon to manipulate electrons.
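Quick sanity check on how those two numbers relate (a 75% drop is a factor of 4, at 20% per doubling):

```latex
0.8^{\,n} = 0.25 \;\Longrightarrow\; n = \frac{\ln 0.25}{\ln 0.8} \approx 6.2
```

So "at present rates" implies roughly six cumulative doublings of shipped volume per decade.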
The weak correlation between early and later learning rates is most interesting here for me. It doesn’t negate learning curves, but it does undercut straight-line extrapolation as a forecasting crutch.
Framed that way, Wright’s Law feels more like a directional intuition than a timing or magnitude guarantee...especially once technologies pass through regime changes or constraints. The outside-view framing is particularly useful in that respect.
Thanks for the walk-through! This was a great read.
FYI, you can even apply learning curves to agriculture. My 1984 Ag Econ M.S. thesis was, "The Cost of Learning by Doing Effect on Technology Adoption."
Do you know if learning curves apply to software development? Over many years I have seen improvements in tools, but no real improvement in software construction itself. Is the fact that every piece of software is effectively a one-off something that cancels the learning curve effect?
Awesome read, thanks for sharing, will check out the database
I love the graphs presented. Thanks for sharing, very informative...
How do you disentangle learning curves from economies of scale?
Exactly. Matt Clancy put out an essay on what learning curves measure 3 years ago (https://www.newthingsunderthesun.com/pub/4xnyepnn) arguing that although learning-by-doing probably is a real thing, and learning curves are empirically visible, the two aren't necessarily causally linked, precisely because economies of scale also contribute to learning curves and the effects are impossible to disentangle.
I was wondering this myself. I think there are two distinctions, and I'm wondering if I have them right:
1. Where a production function is based on variables of technology, capital, and labor, the concept of economies of scale holds technology constant and only refers to capital and labor; and
2. Learning curves are based on cumulative production over time, while economies of scale relate to production in a given time period.
Is this right? I.e., if after ten years of constant production per year you see costs dropping, that's the learning curve; if production per year doubles from one year to the next and costs drop, that's mostly(?) economies of scale. I do wonder how you separate out those effects in data analysis.
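For what it's worth, the standard attempt at that separation is a regression with both terms in it. A minimal sketch on synthetic data (all numbers and elasticities invented), which also shows why real data is less cooperative:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Synthetic industry: output grows 15%/year; true cost has both a learning
# elasticity (on cumulative output) and a scale elasticity (on annual output).
years = np.arange(2000, 2025)
annual = 100 * 1.15 ** np.arange(years.size)   # output per year (scale)
cumulative = annual.cumsum()                   # experience to date
true_learning, true_scale = -0.30, -0.10       # assumed, for illustration
log_cost = (5.0 + true_learning * np.log(cumulative)
                + true_scale * np.log(annual)
                + rng.normal(0, 0.02, years.size))

X = sm.add_constant(pd.DataFrame({
    "log_cumulative": np.log(cumulative),
    "log_annual": np.log(annual),
}))
fit = sm.OLS(log_cost, X).fit()
print(fit.params)  # roughly recovers the two assumed elasticities here
# Caveat: in a smoothly growing industry the two regressors are nearly
# collinear, so on real data the split is far less well identified --
# which is exactly the difficulty being discussed here.
```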
A potential synthesis of your final paragraph:
An outside-view approach is prone to picking up on transient, non-fundamental noise (cost of energy, input shortages), so heavy regularization is warranted but isn't always enough. An inside-view approach tends to assign too much weight to explanations that "feel right" but lack evidence. A hybrid is to use inside-view knowledge to constrain the outside-view regularization: only allow new cut points to be inserted where there's a candidate explanation, then use regularization to trim out the cut points that don't provide a significant boost in explanatory power.
For instance, people predicted that RLVR would create an inflection on the METR Long Tasks evaluation. This makes sense to me, and eyeballing the charts it looks right too. But it would be more reassuring to put a list of candidate innovations on a timeline, run regularized piecewise linear regression constrained to specific innovations as cut points, and check to make sure that a) a cut point was actually warranted and b) the one we were thinking of was the one that got selected.
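A rough sketch of that procedure, on made-up data with hypothetical candidate dates, using Lasso over hinge features as the regularized piecewise fit:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)

# Hypothetical quarterly series (e.g. a log-scale capability metric), with a
# genuine slope change at one of the candidate dates.
t = np.linspace(2019.0, 2025.0, 25)
y = 0.5 * (t - 2019.0) + 0.8 * np.maximum(0.0, t - 2023.5)
y += rng.normal(0, 0.05, t.size)

# Cut points are only allowed at dates with a named candidate innovation
# attached (names and dates here are made up).
candidates = {"candidate_A": 2021.0, "candidate_B": 2022.5, "candidate_C": 2023.5}

# Design matrix: one global trend plus one hinge feature per candidate.
X = np.column_stack([t - t.min()] +
                    [np.maximum(0.0, t - c) for c in candidates.values()])

fit = LassoCV(cv=5).fit(X, y)
for name, coef in zip(["base_trend", *candidates], fit.coef_):
    # Nonzero hinge coefficients are the cut points that survived the
    # regularization; zeroed ones were trimmed as not worth their keep.
    print(f"{name:12s} slope change ≈ {coef:+.2f}")
```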