21 Comments
Michael Magoon

I agree with you wholeheartedly that “the literature on technology is somewhat uneven….if you’re trying to understand the nature of technological progress more generally, the range of good options narrows significantly.”

I think that this is exactly the gap (or one of them, actually) that Progress Studies needs to fill. Unfortunately, other than you, I have not seen many people working on that problem.

Writing yet another case study about a specific technology, organization, or nation is not going to add much to our knowledge. When the sample size is one, it is hard to identify causality.

What we need is to integrate the hundreds of case studies that already exist into a parsimonious, historically accurate, and useful theory that can be applied to the real world today. I have been frustrated with how few Progress Studies writers have even attempted to do so (you excepted).

I also agree that Brian Arthur's The Nature of Technology is by far the best theory of technology developed so far. I have a brief summary of that book here:

https://techratchet.com/2020/01/10/book-review-the-nature-of-technology-by-w-brian-arthur/

Those who are interested in a broader view of technological innovation might check out my series on the topic:

https://frompovertytoprogress.substack.com/p/technological-innovation-the-series

I also have an article on competing theories of Technological Innovation and Diffusion in my larger series on the Pre-History of Progress Studies:

https://frompovertytoprogress.substack.com/p/theories-of-technological-innovation

Brian Potter

I'm trying to chip away at a better understanding of technological progress. This essay was my attempt at working through some of these ideas.

Michael Magoon

Yes, you are, and I appreciate the effort.

Michael Frank Martin

One thing I like less about Arthur's approach to the problem is how he focuses on the technology itself rather than on the humans who invent the technology. I understand the convenience of that, but I feel the technology will always be downstream of the process of invention — an artifact of progress.

That being said, I believe that finding ledges in the landscape that allow for new abstraction layers in a search through its combinatorial complexity is part of the process of invention and innovation, and the human evidence for that is plentiful.

https://www.symmetrybroken.com/invention-as-exploration/

The biggest open question in technology right now is whether machines will be able to replicate the process that humans have followed for doing this kind of work. There's money at stake, which means motivated reasoning and magical thinking are working hand in hand to obfuscate the problems that remain.

One of the core problems that remains is that we don't yet have an architecture that allows for reliable causal inference. Transformers are great at recognizing causal connections that have already been established, but they're not currently capable of independent causal reasoning. Reflecting on how we humans choose which parts of the sparse, high-dimensional landscape to explore, I believe that causal reasoning has been necessary.

The machines cannot tell us where our attention *should* be.

Martin Sustrik

The article argues that technology evolves by being modular. Simple pieces must evolve first, then they can be combined into more complex assemblies. This emerges, in some way, from the experimenter setting ever harder goals. But is the emergence of a modular system inevitable? Or is it just a byproduct of the experimenter setting the goals in a particular way?

This seems to be highly relevant: https://www.lesswrong.com/posts/JBFHzfPkXHB2XfDGj/evolution-of-modularity

It's an experiment with setting the goals in different ways:

"Kashtan & Alon tackle the problem by evolving logic circuits under various conditions. They confirm that simply optimizing the circuit to compute a particular function, with random inputs used for selection, results in highly non-modular circuits. However, they are able to obtain modular circuits using “modularly varying goals” (MVG). The idea is to change the reward function every so often (the authors switch it out every 20 generations). Of course, if we just use completely random reward functions, then evolution doesn’t learn anything. Instead, we use “modularly varying” goal functions: we only swap one or two little pieces in the (modular) objective function."

To sum it up: modularity in evolved systems (such as having distinct brains, kidneys, etc.) matches the modularity of the requirements posed by the environment.
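
(For anyone who wants to poke at the idea without reading the paper, here is a minimal Python sketch of the goal-swapping schedule. It is not Kashtan & Alon's circuit-evolution setup; the 16-bit genomes, the two 8-bit "modules", and all the parameter values are illustrative assumptions, and it only demonstrates how a modularly varying goal is scheduled, not the emergence of modular structure.)

```python
import random

# Toy sketch of "modularly varying goals" (MVG). Assumptions (not from the
# paper): genomes are 16-bit strings, the goal is a 16-bit target split into
# two 8-bit "modules", and every 20 generations the target for one module is
# re-randomized while the other is left alone.
GENOME_LEN, MODULE = 16, 8
POP, GENS, SWAP_EVERY = 100, 200, 20

def fitness(genome, goal):
    # Number of positions where the genome matches the current goal.
    return sum(g == t for g, t in zip(genome, goal))

def mutate(genome, rate=0.05):
    # Flip each bit independently with a small probability.
    return [b ^ (random.random() < rate) for b in genome]

random.seed(0)
goal = [random.randint(0, 1) for _ in range(GENOME_LEN)]
pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP)]

for gen in range(GENS):
    if gen > 0 and gen % SWAP_EVERY == 0:
        # Modularly varying goal: swap out only one module's sub-goal.
        start = random.choice([0, MODULE])
        goal[start:start + MODULE] = [random.randint(0, 1) for _ in range(MODULE)]
    pop.sort(key=lambda g: fitness(g, goal), reverse=True)
    survivors = pop[:POP // 2]
    pop = survivors + [mutate(random.choice(survivors)) for _ in range(POP - len(survivors))]

print("best match to current goal:", fitness(pop[0], goal), "of", GENOME_LEN)
```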

Brian Potter

This is really interesting, appreciate you sharing this.

Richard

Perhaps I am misreading something, but what I see presented is not how circuits are designed. Karnaugh maps are used on truth tables to get the optimum mix of gates for given inputs and desired outputs. It has been done this way for decades.

Brian Potter

This isn't meant to show how circuits are designed in practice - obviously you would never try to design a circuit via random combination, just like you wouldn't try to build a car by randomly putting pieces of metal together. It just uses them as a nice, easily simulable example of how simple elements can combine to make more complex technologies, which in turn make even more complex technologies.

Richard

Okay, I understand what you were trying to explain; however, I would have picked a better example, because circuits are not built up that way at all. Perhaps even a car analogy, going from a 1915 car to a modern one with all the advances along the way.

Sam Penrose

Wonderful, thank you. Do you have specific favorite references for the scholars in your first footnote?

Brian Potter

For Mokyr I most often return to "The Gifts of Athena," though he has a lot of books that discuss the same constellation of ideas. Bernard Carlson has a good biography of Tesla, as well as a few interesting papers I've read about Edison's invention process. Hugh Aitken has two good books on the development of radio, "Syntony and Spark" and "The Continuous Wave." Lillian Hoddeson is the co-author of "Crystal Fire," and she's also the co-author of a great book about Los Alamos called "Critical Assembly." Joan Bromberg is the author of a good book about the laser, "The Laser in America," and a good book about fusion. For Clay Christensen, I mostly reference his ideas about disruption, which show up in a few places, such as "The Innovator's Dilemma." Edward Constant has basically one good book, "The Origins of the Turbojet Revolution."

Also, I forgot to include him, but Donald Mackenzie is another good scholar in this area. "Inventing Accuracy," about ICBM control systems, is especially good, but he has a few good books and papers. I also forgot to include Walter Vincenti, author of the very good "What Engineers Know and How They Know It."

Had Seddiqi

The development of neural networks seems to follow the same principle. Before LLMs, there were separate models which were very good at translation, sentiment analysis, and topic modeling. It's a bit too hard to yolo a giant model without an existence proof of smaller wins.

It's an interesting way to articulate whether humanoid robotics will work and be useful. For one, a vision foundation model equivalent to LLMs does not truly exist yet, and that's to say nothing of grasping, planning, etc.

Ken

A subtle thing, but there should be a minus sign on the left-hand side of the equation. "The entropy of a guess is (1/8) * log2(1/8) + (7/8) * log2(7/8) = 0.54 bits." It should be: -[(1/8) * log2(1/8) + (7/8) * log2(7/8)] = 0.54 bits.
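
(For what it's worth, a two-line Python check of the corrected expression:)

```python
import math

# Entropy of a 1-in-8 guess, with the minus sign out front as noted above.
p = 1 / 8
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
print(round(H, 2))  # -> 0.54 (bits)
```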

Otherwise, a cool treatise on the solution spaces for complex problems.

Also, random searches can be improved by weighting the probability of the guesses in a search space. For example, if you need an ordered bit-stream with a special phase (timing) property, and you know that on average the probability of a bit in a given range of the solution sequence is, say, 0.6, then it makes sense to use a random number generator weighted to give 0.6 as a typical guess for that section.
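
(A minimal sketch of that weighting idea; the 0.6 bias and 32-bit length are just the illustrative numbers from the comment, and reading "probability of a bit" as the probability of a 1 is my assumption.)

```python
import random

def weighted_guess(length=32, p_one=0.6):
    # Draw each bit as a 1 with probability p_one instead of 0.5, biasing the
    # random search toward the region where solutions are believed to cluster.
    return [1 if random.random() < p_one else 0 for _ in range(length)]

print(weighted_guess())
```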

Tanj
Apr 4 (edited)

Arthur's approach seems far from reality. It ignores tools like theory, induction, and rule-based combinations.

In the case of an XOR gate, you need to go down to the transistor and wiring level to understand the best arrangements, and you throw in things like whether the result needs to be glitch-free (equal delay for all inputs). A technology like FinFET may use a different arrangement than planar, although planar designs would be used as starting points. Often the basic cells are designed in great detail to meet performance and lithography constraints, then a combinatorial library with as few as 30 basic cells or as many as 200 is tested and eventually put to use.

We get wider adders mostly by induction. Induction does not need to be count-by-1; it can be doubling, for example. This brings in rule-based combination, where the elementary cells have rules for fitting together. It is a bit like a game: bring the things you need into proximity and then snap the wires in place. We can then evaluate the results against theory (which is what Shannon was brilliant at, though he also designed electronics and used theory more applicable than information theory). For adders we know the optimal scaling laws for latency and power and can assess a design, for example a particular carry acceleration, against theory to know where to stop. The particular way in which we fall short of theory can be a big clue about what needs to change.
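
(A toy Python sketch of that base-cell-plus-induction pattern, assuming nothing about real cell libraries, FinFET vs. planar, or timing: a 1-bit full adder as the base cell, chained carries as the count-by-1 induction step.)

```python
def full_adder(a, b, cin):
    # Base cell: 1-bit full adder built from XOR/AND/OR "gates".
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_adder(a_bits, b_bits):
    # Induction step: an n-bit adder is an (n-1)-bit adder plus one more cell,
    # with the carry wire "snapped" between them. (A doubling construction
    # like carry-select follows the same compose-by-rule pattern.)
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        out.append(s)
    return out + [carry]

# 7 + 9 = 16, bits listed least-significant first.
print(ripple_adder([1, 1, 1, 0], [1, 0, 0, 1]))  # -> [0, 0, 0, 0, 1]
```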

Imagining we advance technology with information theory and trials is a bit like imagining we build furniture out of Lego blocks with no hints.

An interesting bit of math, perhaps, but it does not feel like insight into what we actually do.

A Nascente

I find it almost depressing how few people talk about progress. Thanks for the recommendation of "The Nature of Technology," I'll take a look.

Do you know of anyone researching progress in agriculture and/or biomedical engineering?

rahul razdan

Interesting article... one thing to keep in mind is that the world consists of components and the rules for combining them (a model of computation). The model of computation can be quite complex: Maxwell's equations, KVL charge, synchronous Finite State Machine models, and the simplest... your interconnect/discrete logic example. These two are duals of each other... fun stuff.

WindUponWaves

Interesting. From my perspective, you're essentially describing why "multi-step Save Scumming" in games works, at least when it comes to things that require multiple different rolls in a row. If I need, say, a thing that has only a 10% chance of happening to happen 10 times in a row, one way to do that would be to run it 10^10 times, reloading my save back to the start every time I fail.

But that's ridiculously inefficient; most of my time is spent on runs where, say, the first roll fails and yet I'm continuing anyway. If I instead cut off each run the moment it hits a failure, I can avoid wasting time on doomed runs. And if I save after every success, so I can reload back to there instead of all the way back to the start, I can save even more time by not throwing away perfectly good runs (e.g. 5 successes then 1 failure. Instead of throwing that away and restarting back at 0, I can reload my save and avoid having to spend 10^5 runs to get back to 5 successes. Failure only costs me 1 failed roll rather than 10^5 of them).

So what previously took 10^10 runs now only needs 10*10 = 100 runs, a speedup factor of 10^8 (100,000,000, a hundred million). By 'cutting off the failed branches' and "checkpointing" my successes, I can make miracles happen.
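
(The arithmetic checks out as a back-of-envelope calculation; here's a tiny Python sanity check under those same assumptions of 10 stages at 10% each.)

```python
# Ten independent events, each with probability 0.1, needed in sequence.
p, stages = 0.1, 10

# Restart-from-scratch: expected number of full runs is ~(1/p)**stages.
no_checkpoints = (1 / p) ** stages

# Checkpoint after every success: each stage costs ~1/p attempts on average.
with_checkpoints = stages * (1 / p)

print(f"{no_checkpoints:.0e} runs vs {with_checkpoints:.0f} attempts, "
      f"speedup ~{no_checkpoints / with_checkpoints:.0e}")
# -> 1e+10 runs vs 100 attempts, speedup ~1e+08
```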

In that regard, then, science is just save scumming real life. Back in the very early days, when it was just alchemists doing things in an unstructured way, an experiment failing meant you had no idea whether it was the fundamental *idea* that was wrong, or just how you had run the experiment. When experiments were not replicable and experimental procedure was not *painfully* detailed... you had no idea whether your failure to run, say, Torricelli's 1643 mercury barometer experiment meant that the entire idea of "atmospheric pressure" was a failed branch that you should cut off... or whether you had just run the experiment wrong.

Likewise, without constant replication of the blindingly obvious, you couldn't build up any 'checkpoints' of rock solid stability to fall back to when things went wrong (e.g. your mercury immortality potion has poisoned and killed the Emperor. Does that mean you brewed the potion on the wrong day using the wrong astrological theory, or miscalculated the dose using the wrong theory of the 4 humors... or that the entire idea of mercury improving things is wrong? How far back do you have to go?)

So to cap off this rather tortured analogy, the Scientific Method is just a framework for save scumming. (Indeed, I've noticed a remarkable similarity between the speedrunning community and the scientific community, honestly... the collaborative nature, the "trust, but verify" ethos, the focus on replicability, the "monk mentality" required to want to spend the rest of your life staring at beetles / Mario 64 speedruns, the conferences... and most relevantly, the fact that many speedruns are based on collaborative save scumming.)

Another somewhat tortured analogy: to answer Kaleberg's objection, you can sorta cram the idea of "Theory" into this framework by saying "Theory" is just a way of saying "These two things are alike." Normally, when an experiment tries something and it either works or doesn't, the only thing we learn is that it works or doesn't work. The universe of things is divided into 2 parts, "The thing we tested" and "Everything else", and we learn much, much less than 1 bit of information from the success/failure.

However, with the observation "This thing is kinda like the thing we tested", we can instead divide the universe into "The thing we tested + the things that are kinda similar" and "Everything else remaining". You get closer to gaining one bit of information / dividing the universe in half. When you try something and it doesn't work, you learn not only that the thing doesn't work, but that the entire family of related approaches might not work either. And when you try something that works, you learn not only that it works, but that the entire family of related things might work too.

i.e. the ultimate theory is one that, for any experiment, can divide the universe into two equal halves of things that are similar and things that are different, so that, no matter what the actual result is, you gain one bit of information. A theory is too broad if it instead says everything is similar, and too narrow if it says everything is unique & different, because in either case you're not learning as much as you could have. Most theories are too narrow; a few are too broad; none actually apply to "*any* experiment". But it's still a useful thing to think about: a theory is the observation "These things are similar", which allows us to test things we haven't actually tested, which helps us divide the universe in two with each experiment (or get closer to that ideal), which is how you gain the maximum amount of information per experiment.

Arguably. In this framework, at least. If you torture the analogies a bit...
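
(One way to put numbers on the "divide the universe in half" intuition above, not from the original essay: the expected information from a yes/no experiment whose prior probability of success is p peaks at 1 bit when p = 0.5 and collapses toward zero for theories that are too broad or too narrow.)

```python
import math

def expected_bits(p):
    # Shannon entropy of a binary outcome with probability p.
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"p = {p:.2f}: {expected_bits(p):.2f} bits per experiment")
```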

Kaleberg

This was an interesting article, but it doesn't shed a lot of light on how technology advances. Humans explicitly look for higher-level abstractions and working theories. The combinatoric space is usually way too large to even consider random or genetic search. There's almost always some kind of model driving things, whether explicit or implicit. Progress usually comes with higher-level abstractions and improved models, and these are often driven by improvements in practice and application.

Steam power was for novelties until cannon and metal technology improved, and that took centuries. The early engines relied on atmospheric pressure, a recently discovered phenomenon. There was an era of experimentation, but it was far from a randomized search. Watt turned to steam pressure but resisted and delayed the adoption of high pressure steam even as the underlying technologies made it safer. Thermodynamics offered a model for evaluating the efficiency of such engines in general, and Diesel was guided by a theoretical approach in developing his engine in the early 20th century. At each stage, there was a prevailing theory and a background of improving abstraction and support technology.

For example, computer logic started to develop once electric switches and relays became practical in the 19th century. There were abstractions and manufacturing capabilities. By the mid-20th century, relay logic had advanced to the point of enabling pushbutton elevators and automatic phone exchanges. No one assembled these by randomly searching in a components box. The designers had a theory about how they should work, and while they may have had to explore a number of possible designs, their search was far from open-ended. It wasn't until Shannon's work in the late 1930s that relay system design was recognized as something that could be modeled with Boolean algebra. Still, the complexity of logic was limited by the inaccuracies and latencies of relays and vacuum tubes. The latter had been used in analog computers, but Flowers turned to digital logic.

It's no surprise that a genetic algorithm can produce useful binary circuits. There's a good model for how such circuits work. If you are searching in chemical or genetic space, modern theory gets you a whole lot less. Chemistry is still full of surprises. Even the best simulations are advisory at best. One still has to validate results in the laboratory. Genetics is even worse, since there's a multi-billion-year-old code base that we are just beginning to understand. That code base was built by variation and descent, so AlphaFold works because so many proteins are just variations of earlier proteins.

Brian Potter

I agree that search spaces are generally too large for random search to work for creating new technologies (though evolution can build biological "technology" via millions of years of random mutation). But I think understanding how large the search spaces are, and how information can help you winnow down extremely large search spaces, is a useful lens for thinking about the problem.
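
(To make the "how large" part concrete with one hedged example of my own, not from the essay: the number of distinct Boolean functions of n inputs is 2^(2^n), so blind search blows up almost immediately, and specifying one function takes 2^n bits of information.)

```python
# Search-space sizing for Boolean functions of n inputs: 2**(2**n) candidates,
# and 2**n bits of information needed to single one out.
for n in range(1, 7):
    candidates = 2 ** (2 ** n)
    print(f"{n} inputs: {candidates:.3g} functions, {2 ** n} bits to pin one down")
```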

Ken

Random search of spaces can be shockingly effective, provided that the guesses are constrained properly. See my comment elsewhere on this post. I applied a set of techniques to 32-bit words, expecting to wait days for the solutions to converge to a reasonable mean squared error, only to be shocked when my 50 MHz machine running MATLAB spit out a set of solutions before I returned with my drink from the break room. Naturally I forced the MSE lower and was rewarded with excellent results for similarly scaled times, until I found the limits where the routine couldn't do any better.