Event Type

Workshop

Contact Info

Website

Location

  • Columbia University
  • New York, NY

Event Description

We are pleased to announce that applications are now open for the third edition of the NEH-funded Archives as Data Summer Institute. The Institute will run from June 2 to June 13, 2025, and will offer practical training for historians and archivists in processing and analyzing textual data.

Participants in the Archiving Digital Records workshop, designed for archivists, will learn how to use new technology to improve the description and arrangement of digital or digitized records, especially PDFs, and provide users with new ways to access them. Participants will receive training in using metadata tools for PDF processing, OCR processing, and named entity recognition (NER) analysis.

Participants in the Text-as-Data workshop, designed for historians, will learn how to organize and analyze large document collections and use new methods to formulate original arguments. Participants will receive training in using data science technologies like R and SQL and will be expected to attend afternoon lab sessions where they will put these tools into practice.

The Institute will be led by Matthew Connelly and Courtney Chartier, with co-teacher Ray Hicks, who has extensive experience processing and analyzing textual data. The Text-as-Data workshop will run for two weeks, while classes for the Archiving Digital Records workshop will run for only one week. In the second week, participants in the Archiving Digital Records workshop will be expected to join the lunchtime talks and discussions remotely. Attendance is free, and limited funding is available for those who need to travel to participate. Note that we expect all participants to attend daily, and group activities will require everyone to be present and actively contributing.

Eligibility: This workshop is open rank. Master's students through established scholars are encouraged to apply. Priority in the Text-as-Data workshop will be given to historians, while priority in the Archiving Digital Records workshop will be given to archivists. Others will be eligible to participate on a space-available basis.

Financial Support: We are happy to offer financial support for workshop participants who need it for travel and accommodations. The application will ask you to describe your budget and your prospects for obtaining other funding. We will use the limited funds we have to ensure broad participation, including from under-resourced institutions.

The draft syllabi for the workshops, as well as the slides from a typical Text-as-Data course, are available at https://lab.history.columbia.edu/content/archives-data-sample-material.