Data Storytelling and Historical Knowledge
The role that data plays in our society is changing. Institutions and corporations collect vast amounts of information about us. Individuals contribute to this further by creating data about themselves on social media. One of the world’s largest corporations, Google, earned its status by collecting vast amounts of data that have enormous value to advertisers. But what Google does on a grand scale, and with claimed pinpoint accuracy, is not that different from what media companies that rely on advertising sales have done for decades, if not centuries. A change in how data is utilized that is potentially more interesting and relevant to historians is the growth of the idea of data storytelling.
Rather than merely presenting information in a bar graph or pie chart, or even using more sophisticated visualization tools, data storytelling uses narrative techniques in conjunction with qualitative and quantitative information to make a point. Advocates of this approach claim that data storytelling does a better job of making that point and persuading the audience than graphs and charts alone. Persuasion is the key here; this is an idea that comes out of corporate marketing departments and business schools. But it has also become a favorite technique of fundraisers and nonprofit advocacy groups for getting their messages across to potential donors, voters, and politicians. An example from the field of history is the series of visualizations telling the famous story of John Snow’s discovery of the cause of the 1854 cholera outbreak in London.1
Narrative approaches such as these require access to good data that the storyteller can develop into an argument. This explains why, on a freezing afternoon in late February, I ventured across town to the offices of the Institute of Museum and Library Services (IMLS) to attend an “open data open house.” For those unfamiliar with IMLS, it is a small federal agency that provides funding for museums and libraries, much of it through block grants to states, which then disburse funds to local institutions. The IMLS also has discretionary grant-making capacities similar to those of the National Endowment for the Humanities (NEH) and the government bodies that fund scientific research. These grants fund education, research, outreach, and other activities that allow libraries and museums of all types to better serve their users.
[Image caption: By utilizing ships’ logs and treating these sources as data, historian Ben Schmidt has been able to reconstruct all of the voyages of American whaling ships from 1830 through 1855. Photo by Robert Harris-Stoertz, artist unknown.]
The occasion for the workshop that IMLS hosted was the launch of data.imls.gov, a website that provides access to a catalog of datasets that the IMLS collects and manages. Some are directly about the agency’s activities, such as the Administrative Discretionary Grants dataset, which lists successful research grant applications that have been funded since the IMLS was established in 1996. Data such as these are invaluable for anyone applying for grants from the IMLS. Researchers, librarians, and museum professionals all have an interest in understanding the kinds of work that the IMLS funds and the ways in which those projects are described. These data allow a view into a range of information about IMLS grants, such as the type of library or museum that received a grant in the past (public library, academic library, history museum, etc.), the location of the organization that received the award, the amounts requested and awarded, and a brief description of the project.
The agency also collects data about the institutions that it supports, financially and in other ways: museums and libraries. These data make it possible for the IMLS to assess the needs of the organizations it is tasked with supporting, and they serve individuals and organizations that have an interest in the landscape of museums and libraries across the country. The Public Library Trend Survey File, for example, provides invaluable insights into public libraries across the United States over the past two decades. It would be possible to use the data in this file to build a visualization of the history of library openings and closings over those years, in the process telling stories about the communities the libraries serve.
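As a rough illustration of the first step toward such a visualization, the Python sketch below infers openings and closings by comparing which libraries appear in consecutive survey years. The input layout (pairs of year and library identifier) and the sample rows are invented for this example; they are assumptions, not the actual structure or contents of the IMLS file.

```python
from collections import defaultdict


def openings_and_closings(records):
    """Infer openings and closings from yearly survey records.

    `records` is an iterable of (year, library_id) pairs, one pair per
    library per survey year (a hypothetical layout). A library counts as
    "opened" in a year if it is present then but absent the year before,
    and as "closed" if it was present the year before but is now absent.
    """
    ids_by_year = defaultdict(set)
    for year, library_id in records:
        ids_by_year[year].add(library_id)

    years = sorted(ids_by_year)
    changes = {}
    # Compare each survey year with the one immediately preceding it.
    for previous, current in zip(years, years[1:]):
        opened = ids_by_year[current] - ids_by_year[previous]
        closed = ids_by_year[previous] - ids_by_year[current]
        changes[current] = {"opened": sorted(opened), "closed": sorted(closed)}
    return changes


# Invented sample rows for illustration only; not real IMLS data.
sample = [
    (2010, "A"), (2010, "B"),
    (2011, "A"), (2011, "B"), (2011, "C"),
    (2012, "A"), (2012, "C"),
]
changes = openings_and_closings(sample)
for year, change in changes.items():
    print(year, change)
```

A library missing from one year's file may simply not have responded to the survey, so an apparent closing is a prompt for further research rather than a finding in itself.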
One creative use of library data is highlighted in the article “Where Gun Stores Outnumber Museums and Libraries,” which appeared in the Washington Post in June 2014.2 Another example of data storytelling that we’ve done here at the American Historical Association is the interactive data visualization on the trends in undergraduate history majors from 2004 to 2013 compiled by Allen Mikaelian in Perspectives on History (November 2014).3
Data of this sort is very valuable to people who work at associations, government agencies, foundations, and universities, especially those who spend their professional lives thinking about how to better serve the users of local public libraries and museums. Having it openly available invites new ideas about how to use it. One of the most important aspects of data.imls.gov is its approachability. Rather than presenting the information in lists, the catalog gives the user access to spreadsheets in which the data can be manipulated in a variety of ways. Researchers, library advocates, and data wonks can access and use the human- and machine-readable data, and a developer could build an application that provides access to it.
By now you may be thinking that data storytelling is exactly what historians have been doing since long before the Internet and the World Wide Web. And that’s true. To a great extent our identity as researchers is built around a focus on sources (the data) and the interpretation of that data in narrative form to tell a story about change over time. Nonetheless, this approach offers one idea that is potentially useful to historians: the notion that our sources are data. Examples include treating text as data to enable text mining, digging into sources for information about quantifiable phenomena (population expansion, transportation infrastructure, criminal justice patterns, etc.), or transcribing household budgets from family papers into Excel files. Treating our sources as data allows us to interpret and analyze patterns that might otherwise remain invisible.
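To make the idea of treating text as data concrete, here is a minimal Python sketch of one of the simplest text mining operations: counting how often each word occurs in a passage. The function name, the stopword list, and the sample passage are all invented for illustration; real projects typically rely on more careful tokenization and much larger bodies of text.

```python
import re
from collections import Counter


def word_frequencies(text, stopwords=frozenset()):
    """Count word occurrences in `text`, ignoring case and any stopwords."""
    # Lowercase the text and pull out runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stopwords)


# Invented passage standing in for a transcribed source.
passage = "The ship left port. The ship returned to port in spring."
counts = word_frequencies(passage, stopwords={"the", "to", "in"})
print(counts.most_common(2))
```

Even a count this simple involves interpretive choices, such as which words to ignore and how to handle variant spellings, that shape what patterns become visible.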
Thinking about our sources as data opens up possibilities for how we can understand and present our arguments about the past. One excellent example is Ben Schmidt’s work on the 19th-century American whaling industry. He has written a series of blog posts about the process of turning ships’ logs into the data that drive his visualizations of whaling voyages around the globe. In one of these posts, Schmidt argues that humanists working with data cannot rely on scientific approaches and assumptions about data that attempt to remove the biases in a sample: “the humanistic approach is to understand a source through its biases without expecting it to yield definitive results.”4 Thinking of our sources as data is not about giving up on traditional humanistic and historical approaches. Instead, a move toward a data-driven methodology opens up new possibilities, but only if the historian remains committed to humanistic modes of inquiry.
Seth Denbo is the AHA’s director of scholarly communication and digital initiatives. He tweets @Seth_Denbo.
1. “London’s 1854 Cholera Outbreak: Data Mapping Halts an Epidemic,” Tableau Public beta.
2. Christopher Ingraham, “Where Gun Stores Outnumber Museums and Libraries,” Washington Post, June 17, 2014.
3. Allen Mikaelian, “Drilling Down into the Latest Undergraduate Data,” Perspectives on History, November 2014.
4. Ben Schmidt, “Sapping Attention: Digital Humanities: Using Tools from the 1990s to Answer Questions from the 1960s about 19th Century America,” blog post, November 15, 2012.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Attribution must provide author name, article title, Perspectives on History, date of publication, and a link to this page. This license applies only to the article, not to text or images used here by permission.