Call for Applications | Archives as Data: An Institute for Advanced Topics in the Digital Humanities for Archivists and Historians

Event Details

End: June 21, 2024

Archives as Data: An Institute for Advanced Topics in the Digital Humanities for Archivists and Historians

Workshop at Columbia University, hosted by History Lab, June 10 - June 21


Digital history and archiving are thriving, but the increasing volume of digitized and “born digital” materials for historical research also presents new challenges for archivists and historians. Typically, the only way to explore these resources has been through keyword searching. More direct access to the data creates tremendous new research opportunities, but the barriers to entry can seem daunting.


This NEH-funded program will offer practical training for historians and archivists in processing and analyzing textual data. Participants in the Archiving Digital Records workshop, designed for archivists, will learn how to use new technology to improve the description and arrangement of digital or digitized records, especially PDFs, and provide users with new ways to access them. Participants in the Text-as-Data workshop, designed for historians, will learn how to organize and analyze large document collections and use new methods to formulate original arguments. All participants will come together in seminar-style discussions on the novel challenges posed by doing archival research in the age of “big data,” including issues related to community representation, protecting private information in online archives, and the professional and scholarly pitfalls in navigating this new terrain.

The Institute will be led by Matthew Connelly and Courtney Chartier, with co-teachers Ray Hicks and Ben Lis, who have extensive experience processing and analyzing textual data. It will also feature presentations from archivists, historians, and data scientists (see list below). The Text-as-Data workshop will run for two weeks, while the Archiving Digital Records workshop will hold in-person classes only during the first week. In the second week, participants in the Archiving Digital Records workshop will join the lunchtime talks and discussions remotely. Attendance is free, and funding is available for those who need to travel to participate.


The Institute is a joint project of Columbia's History Lab and Columbia Libraries, and is funded by the NEH Institutes for Advanced Topics in the Digital Humanities. Hands-on training will use textual data from the Freedom of Information Archive, a project that has aggregated the largest database of declassified government documents in the world. Here are the draft syllabi for the workshops.


When: June 10-21, 2024. Sessions will run from 9 am to 3 pm each weekday. Workshop participants will be invited to submit proposals to a conference that will take place at Columbia in January 2025, at the same time the American Historical Association holds its annual meeting in NYC.


Where: Columbia University Campus in New York City.


Eligibility: This workshop is open rank. Master's students through established scholars are encouraged to apply. Priority in the Text-as-Data workshop will be given to historians, while priority in the Archiving Digital Records workshop will be given to archivists. Others will be eligible to participate on a space-available basis.


Financial Support: We are happy to offer financial support for those workshop participants who need it for travel and accommodations. In your application, we will ask you to describe your budget and prospects for obtaining other funding. We will use the limited funds we have to ensure broad participation, including from under-resourced institutions.


How to Apply: Please use this form to apply. In addition to requesting a CV, we will ask you to describe any previous experience or training in either processing digital collections (for archivists) or analyzing textual data (for historians). We will also ask what motivates you to apply to the workshop and what you hope to gain from attending it. Feel free to contact us with questions.


Confirmed Participants:

Courtney Chartier is the director of Columbia's Rare Books and Manuscripts Library. She has a long-standing interest in, and experience with, the archiving of electronic records. She was previously the Head of Research Services at the Stuart A. Rose Manuscript, Archives, & Rare Book Library at Emory University and taught at Georgia State University. Chartier is also the immediate past President of the Society of American Archivists.

Matthew Connelly is a professor of history at Columbia University. He received his B.A. from Columbia in 1990 and his Ph.D. from Yale in 1997. His publications include A Diplomatic Revolution: Algeria’s Fight for Independence and the Origins of the Post-Cold War Era, which won five prizes; Fatal Misconception: The Struggle to Control World Population, an Economist and Financial Times book of the year; and The Declassification Engine: What History Reveals About America’s Top Secrets, which was published in February 2023 by Random House. In 2011 he also co-directed (with Stephen Morse) a summer research program on “The History of the Next Pandemic.”


Raymond Hicks (Data Scientist) has been working with History Lab since 2017. Before starting at Columbia, he worked as the Statistical Programmer for the Niehaus Center for Globalization and Governance at Princeton University. His research has appeared in the Journal of Politics, International Organization, and the British Journal of Political Science, among other journals. He received his B.A. from The College of William and Mary and his Ph.D. in political science from Emory University. He has taught a two-week introductory workshop on text analysis for several different audiences, including economists, political scientists, and historians.

Benjamin Lis (Instructor - Archives Workshop) has been History Lab's data engineer since 2019 and has developed some of the tools used in the workshop. He has also taught as an adjunct in the Applied Analytics department of Columbia University's School of Professional Studies and co-taught “Hacking the Archive” with Connelly in spring 2020. He has a B.S. from Montclair State University and an M.S. from Stevens Institute of Technology.


Invited Speakers at the 2023 workshop:

Cameron Blevins, University of Colorado Denver

Merlin Chowkwanyun, Columbia University

Greg Eow, Center for Research Libraries

Jo Guldi, Southern Methodist University

Tim Hitchcock, University of Sussex

Lara Putnam, University of Pittsburgh

Barbara Rockenbach, Yale University

Heidi Tworek, University of British Columbia