I owe blogger Dan Hirschman thanks for linking to Freeman Dyson’s “How We Know”, a review essay on James Gleick’s new book The Information: A History, a Theory, a Flood, published in the 10 March 2011 issue of the New York Review of Books.
The central dogma [of information theory] says, “Meaning is irrelevant.” Information is independent of the meaning that it expresses, and of the language used to express it. Information is an abstract concept, which can be embodied equally well in human speech or in writing or in drumbeats. All that is needed to transfer information from one language to another is a coding system. A coding system may be simple or complicated. If the code is simple, as it is for the drum language with its two tones, a given amount of information requires a longer message. If the code is complicated, as it is for spoken language, the same amount of information can be conveyed in a shorter message.
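Dyson’s point about simple and complicated codes is easy to make concrete. To distinguish among N equally likely messages, an alphabet of k symbols needs roughly log_k(N) of them, so a two-tone drum code needs many more beats than a rich spoken alphabet needs sounds to carry the same information. Here is a quick sketch of my own (nothing from the review or the book, just the arithmetic):

```python
import math

def symbols_needed(num_messages: int, alphabet_size: int) -> int:
    """Minimum number of symbols required to distinguish num_messages
    equally likely messages using an alphabet of alphabet_size symbols."""
    return math.ceil(math.log(num_messages, alphabet_size))

# The same million distinguishable messages, encoded three ways:
print(symbols_needed(1_000_000, 2))   # two drum tones      -> 20 symbols
print(symbols_needed(1_000_000, 26))  # Latin letters       -> 5 symbols
print(symbols_needed(1_000_000, 40))  # rough phoneme count -> 4 symbols
```

Same information, shorter message with the more complicated code; that is all the “central dogma” paragraph is claiming.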
In Dyson’s telling of Gleick’s book, the renowned American engineer Claude Shannon is responsible both for devising the theory of information and, by extension, for our current early-21st-century information abundance (surfeit?).
Claude Shannon was the founding father of information theory. For a hundred years after the electric telegraph, other communication systems such as the telephone, radio, and television were invented and developed by engineers without any need for higher mathematics. Then Shannon supplied the theory to understand all of these systems together, defining information as an abstract quantity inherent in a telephone message or a television picture. Shannon brought higher mathematics into the game.
When Shannon was a boy growing up on a farm in Michigan, he built a homemade telegraph system using Morse Code. Messages were transmitted to friends on neighboring farms, using the barbed wire of their fences to conduct electric signals. When World War II began, Shannon became one of the pioneers of scientific cryptography, working on the high-level cryptographic telephone system that allowed Roosevelt and Churchill to talk to each other over a secure channel. Shannon’s friend Alan Turing was also working as a cryptographer at the same time, in the famous British Enigma project that successfully deciphered German military codes. The two pioneers met frequently when Turing visited New York in 1943, but they belonged to separate secret worlds and could not exchange ideas about cryptography.
In 1945 Shannon wrote a paper, “A Mathematical Theory of Cryptography,” which was stamped SECRET and never saw the light of day. He published in 1948 an expurgated version of the 1945 paper with the title “A Mathematical Theory of Communication.” The 1948 version appeared in the Bell System Technical Journal, the house journal of the Bell Telephone Laboratories, and became an instant classic. It is the founding document for the modern science of information. After Shannon, the technology of information raced ahead, with electronic computers, digital cameras, the Internet, and the World Wide Web.
According to Gleick, the impact of information on human affairs came in three installments: first the history, the thousands of years during which people created and exchanged information without the concept of measuring it; second the theory, first formulated by Shannon; third the flood, in which we now live. The flood began quietly. The event that made the flood plainly visible occurred in 1965, when Gordon Moore stated Moore’s Law. Moore was an electrical engineer, founder of the Intel Corporation, a company that manufactured components for computers and other electronic gadgets. His law said that the price of electronic components would decrease and their numbers would increase by a factor of two every eighteen months. This implied that the price would decrease and the numbers would increase by a factor of a hundred every decade. Moore’s prediction of continued growth has turned out to be astonishingly accurate during the forty-five years since he announced it. In these four and a half decades, the price has decreased and the numbers have increased by a factor of a billion, nine powers of ten. Nine powers of ten are enough to turn a trickle into a flood.
In 1949, one year after Shannon published the rules of information theory, he drew up a table of the various stores of memory that then existed. The biggest memory in his table was the US Library of Congress, which he estimated to contain one hundred trillion bits of information. That was at the time a fair guess at the sum total of recorded human knowledge. Today a memory disc drive storing that amount of information weighs a few pounds and can be bought for about a thousand dollars. Information, otherwise known as data, pours into memories of that size or larger, in government and business offices and scientific laboratories all over the world. Gleick quotes the computer scientist Jaron Lanier describing the effect of the flood: “It’s as if you kneel to plant the seed of a tree and it grows so fast that it swallows your whole town before you can even rise to your feet.”
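The figures in that passage check out with a few lines of arithmetic. This is my own back-of-envelope verification, not anything from Dyson or Gleick; the terabyte figure is simply Shannon’s hundred trillion bits converted into bytes.

```python
# Back-of-envelope checks for the numbers quoted above.

# Moore's law: a factor of two every 18 months.
per_decade = 2 ** (120 / 18)         # 120 months per decade -> about 100x
per_45_years = 2 ** (45 * 12 / 18)   # 2**30, about 1.07e9: "nine powers of ten"

# Shannon's 1949 estimate for the Library of Congress: 1e14 bits.
terabytes = 1e14 / 8 / 1e12          # 12.5 TB

print(f"growth per decade:  {per_decade:.0f}x")
print(f"growth over 45 yrs: {per_45_years:.2e}x")
print(f"1e14 bits is about  {terabytes} TB")
```

A doubling every eighteen months really does compound to roughly a hundredfold per decade and a billionfold over four and a half decades, and a hundred trillion bits is about 12.5 terabytes, which today is indeed a modest stack of disk drives.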
What is there to be done? Not one thing. Dyson ends by quoting Gleick quoting from Jorge Luis Borges’ famous short story “The Library of Babel”. I agree with Hirschman that the two men subtly misread the story: the library really isn’t a library, that is, an organized, systematized information space, because it isn’t indexed.
The Library of Babel shows that information does not reside simply in statements and texts (facts), but in the structure that maps between them, and it is precisely that structure which is missing in the ill-fated library. The name is a misnomer: a collection of every possible book is not a library but an unordered chaos in the guise of shelves and books. It is the opposite of a library. This is a universe with too little information, not too much!
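To put my complaint in concrete terms: what turns a heap of texts into a library is the mapping built over them, something like the toy inverted index below. The titles and code are my own illustration, not anything from Borges, Dyson, or Gleick. Borges’ library contains every possible text but none of this structure.

```python
from collections import defaultdict

def build_index(books: dict[str, str]) -> dict[str, set[str]]:
    """Map each word to the set of book titles that contain it."""
    index = defaultdict(set)
    for title, text in books.items():
        for word in text.lower().split():
            index[word].add(title)
    return index

books = {
    "On Drums": "two tones carry information across the river",
    "On Wires": "barbed wire fences carry telegraph signals",
}
index = build_index(books)
print(index["carry"])  # {'On Drums', 'On Wires'}: the mapping, not the texts,
                       # is what lets a reader find anything at all
```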
The way forward, as Dyson would have it, is through the massed efforts of individuals trying to systematize all this information, through systems like Wikipedia, which combines masses of data with masses of individual critics who compete to produce some order. (I confess, here, that I do have a Wikipedia account of my own, although I’m not very active on it at all.)
Among my friends and acquaintances, everybody distrusts Wikipedia and everybody uses it. Distrust and productive use are not incompatible. Wikipedia is the ultimate open source repository of information. Everyone is free to read it and everyone is free to write it. It contains articles in 262 languages written by several million authors. The information that it contains is totally unreliable and surprisingly accurate. It is often unreliable because many of the authors are ignorant or careless. It is often accurate because the articles are edited and corrected by readers who are better informed than the authors.
Jimmy Wales hoped when he started Wikipedia that the combination of enthusiastic volunteer writers with open source information technology would cause a revolution in human access to knowledge. The rate of growth of Wikipedia exceeded his wildest dreams. Within ten years it has become the biggest storehouse of information on the planet and the noisiest battleground of conflicting opinions. It illustrates Shannon’s law of reliable communication. Shannon’s law says that accurate transmission of information is possible in a communication system with a high level of noise. Even in the noisiest system, errors can be reliably corrected and accurate information transmitted, provided that the transmission is sufficiently redundant. That is, in a nutshell, how Wikipedia works.
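Shannon’s actual noisy-channel theorem is far subtler than anything I could sketch here, but the crudest illustration of “enough redundancy defeats noise” is a repetition code with majority voting. The following is my own toy example, not anything from Dyson or Gleick:

```python
import random
from collections import Counter

def transmit(bits, flip_prob):
    """Send bits through a noisy channel that flips each bit with probability flip_prob."""
    return [b ^ (random.random() < flip_prob) for b in bits]

def encode(bits, n):
    """Repetition code: send each bit n times."""
    return [b for b in bits for _ in range(n)]

def decode(received, n):
    """Majority vote over each group of n repetitions."""
    return [Counter(received[i:i + n]).most_common(1)[0][0]
            for i in range(0, len(received), n)]

random.seed(0)
message = [random.randint(0, 1) for _ in range(1000)]

raw = transmit(message, 0.2)                          # no redundancy
decoded = decode(transmit(encode(message, 11), 0.2), 11)  # 11-fold repetition

print("errors without redundancy:", sum(a != b for a, b in zip(message, raw)))
print("errors after majority vote:", sum(a != b for a, b in zip(message, decoded)))
```

With eleven-fold repetition and a channel that flips one bit in five, roughly two hundred of the thousand raw bits arrive corrupted, but only a dozen or so errors survive the majority vote. Shannon showed that one can do vastly better than brute repetition, approaching the channel capacity with far less overhead; the Wikipedia analogy only needs the crude version.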
[. . .]
The rapid growth of the flood of information in the last ten years made Wikipedia possible, and the same flood made twenty-first-century science possible. Twenty-first-century science is dominated by huge stores of information that we call databases. The information flood has made it easy and cheap to build databases.
[. . .]
The explosive growth of information in our human society is a part of the slower growth of ordered structures in the evolution of life as a whole. Life has for billions of years been evolving with organisms and ecosystems embodying increasing amounts of information. The evolution of life is a part of the evolution of the universe, which also evolves with increasing amounts of information embodied in ordered structures, galaxies and stars and planetary systems. In the living and in the nonliving world, we see a growth of order, starting from the featureless and uniform gas of the early universe and producing the magnificent diversity of weird objects that we see in the sky and in the rain forest. Everywhere around us, wherever we look, we see evidence of increasing order and increasing information. The technology arising from Shannon’s discoveries is only a local acceleration of the natural growth of information.
“As finite creatures who think and feel, we can create islands of meaning in the sea of information”, Dyson concludes. Indeed. It may not be everything, but it is something: a list of somethings, even.