From the “From the President” column of the March 2011 issue of Perspectives on History
Loneliness and Freedom
By Anthony Grafton
One of the most fascinating events at the AHA annual meeting in Boston was a special presentation that took place late on Saturday night, after the regular sessions and the Business Meeting, even after the receptions and the departmental parties. In December, two scientists at Harvard, Erez Lieberman-Aiden and Jean-Baptiste Michel, contacted President Barbara Metcalf and asked if they could discuss “Culturomics,” a project they have been developing, using materials from Google Books in a radically new way.
“Culturomics” starts with a vast body of data: more than 5 million digitized books, stripped down into words and phrases of varying lengths, and precisely dated (as many Google Books are not). Using “very high-turbo data analysis,” the two scientists have constructed tools that enable pretty much anyone to trace the ways in which words and phrases appear and disappear, become popular and fall into disuse, over time.
Assume, for the sake of argument, that a very large corpus of material from books can roughly represent the larger culture that produced it, and you have a powerful new tool for cultural history. Culturomics can lay out in graphic form anything from the time it takes words in general usage to change their forms to the speed with which individuals become, and cease to be, famous. As Lieberman-Aiden put it in one interview, “The goal is to give an 8-year-old the ability to browse cultural trends throughout history, as recorded in books.”
The two scientists and their associates described their work in Science in December 2010. Their project provoked a firestorm of announcements and discussions in the media. A Google search for the term Culturomics, introduced less than two months ago, now yields 99,600 results. Normally, the AHA program is set many months in advance. But Lieberman-Aiden and Michel’s program had attracted a great deal of attention, and they themselves had not, as yet, presented their work to any audience outside Harvard. It seemed a great pity for historians—of all people—to pass up the chance to hear about it from the creators. The AHA staff did magic, a room was found and equipped with wifi, and in the end about 25 historians, including President Metcalf and me, turned up to watch the creators of Culturomics show off their creation.
The session was astonishing, in more than one way. Lieberman-Aiden and Michel spoke freely, offering something like a crosstalk act, quick-paced, dense with material, and as funny as it was erudite. They put their tool, which produces graphs, through some amazing and instructive paces. For example, they traced the disappearance of the names of artists labeled by the Nazis as “degenerate” from German public discourse, as represented in German books.
Lieberman-Aiden and Michel explained that they had insisted on making their database, as well as their results, available to the public, though few will have the terabytes of computer space to download everything, and they warmly encouraged others to build their own search tools and apply them (as a number of historians and others already have).
Both spoke eloquently, even ardently, of the possibilities for this sort of analysis, once more books (and more documents of other kinds, from magazines to letters) have been made accessible in digital form. Yet Lieberman-Aiden also made clear that they see their new tool not as a replacement for traditional forms of textual study, but as a complement to them.
For all their panache, and all the fun their tool permits, Lieberman-Aiden and Michel also inspired a little worry, as well as some hard thinking about the status of our discipline. The 13 individual signatories of their Science article included members of Harvard’s Program for Evolutionary Dynamics, Institute for Quantitative Social Sciences, and Department of Systems Biology, but not a single historian. The omission was striking. Harvard’s own history department employs distinguished specialists in the history of books and media, including two winners of MacArthur grants. And the lack of historical expertise occasionally showed. Recognizing that books do not exhaust the written record, the two scientists called for a 10-year project for the digitization of the complete written record of the human race. It’s a wonderful plan, but also one impossible to realize in their proposed time frame. As historians know, millions of archival documents in dozens of languages await decipherment, which would have to precede or accompany digitization before they could be analyzed.
Lieberman-Aiden and Michel immediately saw the force of this objection. Over time, they will find historians and other humanists to work with, and historians will test and use their method. More significant than this glitch are the two larger points, connected but by no means identical, that it suggests. The first is simple: apparently, historians have not established, in the eyes of many of their colleagues in the natural sciences, that they possess expert knowledge that might be valuable, or even crucial—even when a scientific project is concerned with reconstructing part of the human past.
True, historians and scientists regularly work together—especially, though not only, on archaeological projects. The medievalist Michael McCormick and the Byzantinist John Haldon, to name only two, have organized substantial research teams, linking multiple universities as well as disciplines in efforts to recreate demographic and military history. But when the impetus comes from the scientists who form an interdisciplinary team to investigate—say—the history of culture, involving a historian or two seems not to be the first item on the agenda, and the notion that historians have useful expert knowledge seems less widely diffused than it might be. In this area, as in so many others, historians apparently have some educating to do.
More striking still was the collaborative nature of the project. When it comes to organizing research on a large scale, historians have a great deal to learn from their colleagues in the natural and social sciences. As new forms of scientific research offer historians research possibilities that complement the textual record, as digital archives and exhibitions expand and digital research methods become more accessible, historians will have to learn how to form and work in teams. And they will have to learn to create these not only in the traditional way, in which a principal investigator devises a project, raises funds, and hires staff to carry it out, but also in the way embodied by Culturomics—in which early career scholars as well as senior ones form teams, and in which multiple forms of organization—partnerships as well as hierarchies—can take shape.
To make this possible, historians will need to find ways—as our colleagues in many other disciplines have done—to award credit to multiple creators for a single project. We will also have to create physical and social spaces—as the natural scientists at every major university already have—where interdisciplinary collaborative research can take place. And, of course, we’ll have to find financial support for these expensive enterprises, at a time when research support for the humanities is not easy to scare up. Individual universities from Stanford to George Mason have already done some of this, with impressive results. But the rest of us have a long way to go.
Despite the harsh climate and the difficulties, the effort is eminently worth making. Collaborative research could transform many fields, and not only those that involve scientific collaborators. Fifteen years ago, Caroline Walker Bynum warmly encouraged the rise of new forms of global history in a memorable essay in Perspectives. At the same time, though, she worried about how she and other master practitioners could maintain the traditional skills of their specialized fields.
Dedication to traditional, local craft need not exclude a certain cosmopolitanism in outlook—and even in all or some of one’s scholarly enterprises. Collaboration offers one way—potentially a very powerful one—for scholars of traditional bent to create global histories of economic, cultural, and political relations that rest on deep archival and textual foundations.
The traditional vision of historical work still reflects the ideals of Wilhelm von Humboldt, the founder of the modern university: “loneliness and freedom.” Like the central figure in Dürer’s Melencolia I, the historian is condemned to unremitting, solitary intellectual work, even as time runs out. Yet this vision contains an element of myth. Every realized work of scholarship, even one that bears the name of a single author, is the product of many individuals working together in different ways.
Watching Lieberman-Aiden and Michel at work made clear that there is much to be gained by recognizing, and promoting, collaboration, not only in writing textbooks but also at the cutting edge of scholarship, and, with it, the elements of joy and creative fantasy that can too easily be lost as we go about our traditionally lonely craft.
Anthony Grafton (Princeton Univ.) is president of the AHA.
Copyright © American Historical Association
Last Updated: February 24, 2011 4:16 PM
Comments
Erez Lieberman Aiden
It's really great to read such extraordinarily positive comments about culturomics, Anthony. We're so pleased that you enjoyed our presentation at AHA. We agree with nearly everything you wrote. We apologize in advance for a somewhat lengthy comment.
However, we did want to take a moment to respond to a few of your more critical observations, because they relate to issues - and occasionally misconceptions - that have been widely discussed in the context of our paper, the clarification of which may be relevant to some of the bigger points you make.
As you may know, the authors of our paper include PhDs in English Literature, Chinese Literature, Psychology, Media Arts and Sciences, Computer Science, Biology, and Mathematics, as well as at least one Master's degree in history. You rightly point out that this wide-ranging author list did not include an academic historian. You then suggest a diagnosis, writing that "historians have not established, in the eyes of many of their colleagues in the natural sciences, that they possess expert knowledge that might be valuable, or even crucial — even when a scientific project is concerned with reconstructing part of the human past."
We disagree with this diagnosis, and suspect that it emerges from a misunderstanding about how the project was actually carried out.
Part of the issue is a latent - and widespread - assumption that the absence of an academic historian on the author list means that we did not seek out and receive significant input from academic historians.
This assumption seems to turn on questions of how multiple authorship works in the sciences, and the difference between the role of an author vs. the role of someone who appears in the acknowledgments of a paper. Part of what makes multiply-authored papers work - without trivializing the role of individual authors - is that the bar for authorship is-and-ought-to-be set quite high. Each author needs to make specific contributions to specific parts of the data collection or analysis. I (Erez) have a close scientific mentor (Michael Brenner, the chair of my PhD thesis committee) who has given me invaluable feedback and advice time and time again for over 7 years, completely changing the course of specific projects and even of my work as a whole. And yet he has never been my co-author: when I ask him to be a co-author on the resulting papers, he always says no, and reminds me that each co-author needs to make specific contributions and that he doesn't feel that his own are specific enough to meet that bar. This is an example of great scientific integrity, but it highlights the fact that there's a difference between the guides who make great science possible, and the people who do the work. The former are usually recognized in the acknowledgments of a paper, not on its author list. This is where I have often acknowledged Michael.
Every author on our recent paper directly contributed to either the creation of the corpus, or to the design and execution of the specific analyses we performed. No academic historians met this bar.
But it does not follow that we lacked input from historians. In fact, we sought and received extensive input from historians; the value of their "expert knowledge" was apparent to us from the get-go.
For instance, your column suggests three Harvard historians who would have been natural to contact: Michael McCormick and the MacArthur winners Robert Darnton and Ann Blair. What you may not realize is that two of the three - Michael and Bob - were both involved-with and supporters-of our work from quite early on, providing us with regular guidance and feedback throughout the lifetime of the project.
Bob Darnton in particular served as an important bellwether. As an academic historian, a MacArthur winner, Director of Harvard University Library, one of your predecessors as AHA president, and - crucially - the most outspoken academic critic of Google Books, we knew we could trust him to be both wise and skeptical in his assessment of our ongoing work and the compromises (such as releasing the data in N-gram format) that we made in order to make it a reality. We therefore approached him early on, and presented our work to him repeatedly and in great detail over the years in which the project took shape. We took his comments and critiques very seriously. He, in turn, was a major supporter of our project, even going so far as to propose brown-bag lunches at Harvard Library where we got extensive input on our work. On the whole, he has been enormously positive, and has repeatedly counseled us on the urgency of making an N-gram data release happen, both for historians and for researchers of all kinds. He also partnered with us to create an extensive - and ongoing - project using Harvard Library data.
Both he and Prof. McCormick are named in the acknowledgments of the paper.
We initially sought even greater participation by professional historians, which might have led to authorship. Early on, this was one of our highest priorities. But we found that setting up day-to-day working collaborations with historians was much harder than we expected.
Some historians were just not interested. For example, on one failed "recruiting" trip, a prominent historian (whose work we very much admire) told us that there would be little interest in our project: historians had tried quantitation in the 50s, and it hadn't worked out.
Even when we found historians who shared our enthusiasm, there were still great barriers to working together. For instance, Michael McCormick helped us convene a multi-hour meeting with himself and about a dozen interested history students and faculty. The historians who came to the meeting were intelligent, kind, and encouraging. But they didn't seem to have a good sense of how to wield quantitative data to answer questions, didn't have relevant computational skills, and didn't seem to have the time to dedicate to a big multi-author collaboration.
It's not their fault: these things don't appear to be taught or encouraged in history departments right now.
Ultimately, a large team was depending on us, and we had to keep moving forward.
I think a lesson that can perhaps be derived from this is that while "expert knowledge" is important, shared paradigms, a shared language, and common intellectual values are a big part of what makes a successful team come together. This suggests that history departments have to grapple with several emerging responsibilities: to encourage familiarity with quantitative methods, with computational techniques, and - as you so eloquently wrote - with large-scale collaboration.
We also wanted to expand briefly on our proposal to digitize all the world's pre-1900 texts by 2020.
First, decipherment need not precede digitization. Minimally, digitization requires us to take a high-quality picture of a text. Even better would be a digital character stream. Creating this does not require us to decipher a text, only to figure out the set of characters that the text tends to use. For instance, the Voynich manuscript has been digitized, despite the fact that no one knows what it means and that it may not mean anything at all.
Second, we recognized that this would be extraordinarily difficult to do by 2020. But "impossible" is far too strong a word. The human genome project, with a price tag of $3B, and the Large Hadron Collider, at $9B, are examples of collaborative intellectual projects at scales unthinkable in the humanities today. We believe the case for the digital archiving of recorded history warrants comparable outlays. Our very deliberate purpose in proposing such an extraordinarily ambitious goal was to urge our listeners to think about what 'Big Humanities' could mean for us all. We don't know whether a motivated nucleus of scholars could lobby the governments and philanthropists of the world effectively enough to achieve such a dream. But if our dreams are big enough, even our failures will change the world.
Erez Lieberman Aiden
Jean-Baptiste Michel
