Overview

This seminar will introduce graduate students to computational approaches to data analysis in the humanities and interpretive social sciences. Using the R programming language, we will analyze and visualize numeric, spatial, textual, and network data. We will consider how these modes of analysis can help us pose and explore valuable research questions, and we will interrogate these approaches from a critical perspective grounded in science and technology studies. No prior experience is necessary.

Meetings

We will meet Thursdays from 12:10 to 3pm in the STS conference room (SSH 1246). Attendance is mandatory. In general, the first two hours will be devoted to discussion of readings and websites. The remaining hour will be used for hands-on practice in programming and data analysis and visualization. Please bring a laptop to each class session. It is also fine to bring your lunch to class and eat it during discussion.

Class discussions will be largely student-led. You will sign up to lead two discussions during the quarter, which means that 2-3 students will lead each discussion. All readings should be completed prior to class on the day listed in the schedule. Discussion leaders should prepare questions that encourage students to make connections across readings and to other texts we have discussed in this class. During the final hour of each class meeting, when we are not working together as a group, students who have more experience with R and with the topic of the day’s activity should help those who have less experience.

Notebook Assignments

Our work in R will take the form of R Markdown Notebooks. Notebooks support the use of literate programming, which combines natural language (in this case, English) with executable code (in this case, R) and its output. You will complete four assignments in this format over the quarter: Numbers, due 1/28; Maps, due 2/11; Texts, due 2/25; and Networks, due 3/11 (all by 5pm). These assignments will ask you to do a short exercise in data analysis and visualization, and to describe your methods and interpret your results. They will therefore allow you to practice communicating about data analysis as well as performing and presenting the analysis. You may work on these assignments together, but each person must turn in their own notebook, and the text components must be written in your own words. If you have your own data, you are welcome to work with it as a substitute for any (or all) of the four notebook assignments, so long as your work with that data fulfils the pedagogical goals of the assignment. Please consult with me about how to make that substitution.
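To give a sense of what the code portion of a notebook might look like, here is a minimal sketch of the kind of R you could place in a single notebook chunk, with your own prose describing the method above it and your interpretation below it. The data frame `letters_sent` and its values are hypothetical, invented only for illustration; they are not drawn from any assignment.

```r
# Hypothetical data for illustration only: letters sent per year
letters_sent <- data.frame(
  year  = c(1850, 1851, 1852, 1853),
  count = c(12, 18, 9, 21)
)

# Compute a simple summary; in a notebook, the printed value
# appears directly beneath the chunk that produced it
mean_count <- mean(letters_sent$count)
print(mean_count)

# Draw a basic line-and-point plot with base R graphics;
# in a notebook, the figure is also shown beneath the chunk
plot(letters_sent$year, letters_sent$count,
     type = "b", xlab = "Year", ylab = "Letters sent")
```

In a notebook, the paragraph that follows a chunk like this is where you would interpret the result in your own words, for example by commenting on what the year-to-year variation might mean.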

Final Project

The purpose of this class is to give you the background and skills you will need to advance your career as a scholar. For that reason, you will design your own final project in the way that will best suit your needs at this moment. It can take the form of a conference paper, a grant proposal, a proposal for a project to complete independently in another quarter (I will be happy to supervise this), a teaching portfolio, a background paper for your qualifying exams, or any other project that addresses the themes of the class and is 3,500-5,000 words in length. You will submit a one-page (maximum) proposal by 5pm on 2/8. We will workshop projects in class on 3/14; final projects will be due by 5pm on 3/21.

Grading

Your grade will be calculated as follows:

  • Leading class discussion (5% each) - 10%
  • Notebook assignments (10% each) - 40%
  • Final project (including proposal) - 50%
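If you would like to see how these weights combine, the following is a small sketch in R. The scores are hypothetical and assume each component is marked on a 0-100 scale, which is an assumption made here for illustration rather than a statement of how individual components will be scored.

```r
# Hypothetical component scores on an assumed 0-100 scale
scores <- c(discussion = 90, notebooks = 85, final_project = 88)

# Weights taken from the list above: 10%, 40%, 50%
weights <- c(discussion = 0.10, notebooks = 0.40, final_project = 0.50)

# Sanity check: the weights should sum to 1 (i.e., 100%)
stopifnot(isTRUE(all.equal(sum(weights), 1)))

# Weighted sum of the component scores
final_grade <- sum(scores * weights)
print(final_grade)
```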

Resources

All class readings are available online and linked from the syllabus. You may also find the following resources helpful:

Acknowledgments

This syllabus was inspired by and adapted from numerous syllabi in the digital humanities and digital history. The most direct and obvious influences are the courses HIST 7370: Texts, Maps, and Networks and HIST 7219: Humanities Data Analysis, taught by Ben Schmidt, Ryan Cordell, and Cameron Blevins at Northeastern University. I am also grateful to Scott Weingart for his list of coding resources.