Data Management Plans for Historians: How to Document and Protect Your Research
Several national grant-funding agencies currently require applicants to submit data management plans (DMPs) in their applications. These plans ensure that researchers are prepared to preserve and share federally funded data and research. So to help researchers with this portion of their grant applications, many academic librarians are being trained in data management planning. As I go through the training process and learn more about DMPs, I see elements of these plans that can be of help to anyone in the research process. While DMPs for historians may not strictly include numerical data, the principles are still of use for anyone managing large amounts of information. Proper information management ensures research integrity and reproducibility; it also increases research efficiency, saving time and effort in the long run. I have, therefore, started talking to incoming history graduate students about data management planning.
First, document your work. In the sciences, this might be addressed by the lab notebook. For historians, much of this translates into the familiar research log. Research logs keep track of what you plan to do and what you have done, but they can do even more. Define what the project is about and the resources (databases, archives, books, etc.) you plan to consult and when you plan to consult them. Map out your research strategies. Are you using sensitive records or human subjects in your research? You may need to go before an institutional review board and/or undertake special training. Be prepared to cite your material correctly, completely, and accurately from the beginning, perhaps using a citation management tool like Mendeley or Zotero. Decide what citation method you will use before you start your project so there is no need to spend time shifting from one method to another midstream.
After using an archive, note the name and contact information of the archivist, what record group you consulted, when, what folders you used, and any other information you might need to cite your work. Indicate whether you took notes on paper or on your computer, and where those notes are stored. If you have made reproductions, indicate whether they are images, PDFs, photocopies, or another format, and where you have stored them. The archive itself may limit the way in which you take notes and make copies.
When using databases, note the date you searched the database and, most importantly, the search strategy used. An easy way to do this is to take screenshots of the search screen with your search term typed in. Simply copy and paste them into a text document. Free software such as Jing (Jing.com) makes this easy. If you are using data, use metadata to describe the dataset, based on standards for your particular subject area. Speaking of software, it is best if all this information is stored using nonproprietary, free software. That keeps the information accessible to anyone in the future.
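For those who keep their research log on a computer, the database-search details described above could be recorded in a plain-text file that any program can open. The following is only a minimal sketch in Python: the class name SearchLogEntry, its field names, the file name research_log.jsonl, and the one-JSON-record-per-line layout are all assumptions made for illustration, not an established standard.

```python
import json
from dataclasses import dataclass, asdict
from pathlib import Path


@dataclass
class SearchLogEntry:
    """One database search, as it might appear in a research log (hypothetical fields)."""
    searched_on: str        # date of the search, e.g. "2024-03-15"
    database: str           # name of the database consulted
    search_strategy: str    # the exact terms and limits used
    notes: str = ""         # anything else worth remembering


def append_entry(log_path: Path, entry: SearchLogEntry) -> None:
    """Add one entry to the end of the log as a single line of JSON text."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(entry), ensure_ascii=False) + "\n")


def read_entries(log_path: Path) -> list[SearchLogEntry]:
    """Read every entry back from the log, skipping blank lines."""
    with log_path.open(encoding="utf-8") as fh:
        return [SearchLogEntry(**json.loads(line)) for line in fh if line.strip()]
```

Calling append_entry(Path("research_log.jsonl"), SearchLogEntry("2024-03-15", "Example Newspaper Archive", "steel strike AND Pittsburgh")) would add one line to the log. The database name and search terms here are invented. Because the file is ordinary text, it stays readable even without the script that wrote it.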
File Naming Conventions
Next, consider file naming conventions. How will you name your files so that they make sense to you (or anyone else) in 3, 5, 10 years? Even if you are now writing your dissertation, some of the materials you collect might be useful for another project in a few years. Perhaps you will be challenged on some of your source material. It might be best, therefore, that files are saved not as Chapter2.txt or version3.txt but using a more meaningful and descriptive file name. Best practices for file naming include using underscores instead of spaces, avoiding special characters, using at most 25 characters, and including a date. It would be helpful to include your file naming conventions in your research log. If you are using data, use the file naming conventions within your discipline. For example, the Data Documentation Initiative (DDI) works to create standards for describing social science data.
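The naming practices listed above (underscores instead of spaces, no special characters, at most 25 characters, and a date) can be applied automatically. The sketch below is one possible approach, not a prescribed method. The function name make_filename is hypothetical. It also assumes that the date is written as YYYYMMDD at the end of the name, that names are lowercased, and that the 25-character limit applies to the name without its extension.

```python
import re
from datetime import date

MAX_LEN = 25  # character limit suggested in the text (assumed to exclude the extension)


def make_filename(description: str, when: date, ext: str = "txt") -> str:
    """Turn a free-text description and a date into a tidy file name."""
    stamp = when.strftime("%Y%m%d")  # assumed date format, e.g. 20240315
    # Replace every run of spaces or special characters with one underscore.
    slug = re.sub(r"[^A-Za-z0-9]+", "_", description).strip("_").lower()
    if not slug:
        raise ValueError("description must contain at least one letter or digit")
    # Leave room for the underscore and the date stamp within the limit.
    room = MAX_LEN - len(stamp) - 1
    slug = slug[:room].rstrip("_")
    return f"{slug}_{stamp}.{ext}"
```

For example, make_filename("NARA RG 59: notes", date(2024, 3, 15)) returns nara_rg_59_notes_20240315.txt. Longer descriptions are cut off to fit, so it is worth putting the most meaningful words first.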
Security and Backup
Security and backup are essential and basic, but when I mention these to students, I often get doleful looks. You want your files (paper or electronic) to be physically secure. Limit access to the room or computer they are on. If you are working with confidential files, keep confidential information off the Internet by using computers not connected to it. Let only trusted individuals troubleshoot computer problems. Keep virus protection up to date, and don’t send confidential information via e-mail. If you must send confidential information, use encryption.
As for backups, use the rule of three: three copies of your work, be it data, text, images, or another format. Do you really want to be on the last draft of your dissertation and have your computer or briefcase stolen, or your flash drive go through the wash or be lost, and have no backup? Have at least two physical backups—external hard drives, a flash drive, a personal or work hard drive, a university or departmental server. Then store a version in a cloud. Many universities offer some form of cloud storage. If you work solely with paper notes or note cards, consider photocopying or scanning them and keeping the copies in a different physical place from where you usually work. Paper backups can be stored in the freezer. Heat resistant to some degree (though the door can open if it falls through a floor), freezers are, however, not burglar proof.
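For electronic files, making the extra copies can be scripted. The sketch below copies one file into several backup folders and checks each copy against the original by comparing SHA-256 checksums. It is only an illustration: the function names sha256_of and back_up are hypothetical, and the destination folders stand in for whatever external drive, server, or cloud-synced folder you actually use.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into each destination folder and verify every copy."""
    expected = sha256_of(source)
    copies = []
    for folder in destinations:
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / source.name
        shutil.copy2(source, target)  # copies contents and timestamps
        if sha256_of(target) != expected:
            raise IOError(f"Backup at {target} does not match the original")
        copies.append(target)
    return copies
```

A call such as back_up(Path("dissertation_20240315.docx"), [Path("/Volumes/External/backup"), Path.home() / "CloudSync" / "backup"]) would leave two verified copies alongside the working file. Both paths are made up for the example. The script does not replace the advice to keep copies in physically separate places.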
Preservation and Publication
Finally, there is preservation and publication. After the project is completed, where and how will you keep your notes and files, and for how long? Data can be stored in one of the many discipline-oriented data repositories; perhaps your university has a data archive. Otherwise, cloud storage may be a viable solution. Decisions should be noted in your research log. Many universities offer institutional repositories for storing completed and/or published work. These repositories are usually indexed by Google and other search engines so your work is discoverable. Negotiating with a publisher for the rights to post pre-publication or post-publication copies of your work is important for any scholar, but especially the independent one. Consider adding your work to other repositories, such as ResearchGate or ORCID, for increased discoverability. An ORCID record will also provide you with a unique identifier, important if you have not been consistent with your name on works or if you plan to apply for grants.
None of these ideas is unique, but together they present a cohesive, documented plan for research of any type. Being methodical in your work habits is as important as the work itself. No longer asking, “Did I search that database with these terms?” or worrying about losing a flash drive with the sole copy of your work can save time and relieve stress. Documenting your work well can help you answer any charges of research impropriety. Data management planning should be an essential part of any research process, and one that I encourage historians at all levels to adopt.
Susan L. Collins is a senior librarian with responsibilities to the Department of History at Carnegie Mellon University.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Attribution must provide author name, article title, Perspectives on History, date of publication, and a link to this page. This license applies only to the article, not to text or images used here by permission.