Using Digital Trace Data in the Social Sciences, University of Konstanz (Summer 2018)

Instructor: Andreas Jungherr

Course Description

In this course, students will learn fundamental techniques of data collection, preparation, and analysis with digital trace data in the social sciences. We will focus on working with the microblogging service Twitter. Over the course, students are expected to become proficient in the use of two programming languages, Python and R.

The course is designed for students without prior training in programming or exploratory data analysis. Still, by the end of the course students are expected to independently perform theory-driven data collections on Twitter and to use these data in a series of specified prototypical analyses. So make sure to take the time over the course of the semester to get acquainted with Python, R, and potentially SQL.

We will start the course by focusing on conceptual issues that arise when working with digital trace data. You will then learn the fundamentals of the programming language Python. Following this, we will collect data from Twitter’s APIs through a set of example scripts written in Python. After downloading the data, we will load them into a SQLite database for ease of access and flexibility in data processing tasks. Finally, we will discuss a series of typical analytical procedures with Twitter data. Here, we will focus on counting entities and establishing their relative prominence, time series analysis, and basic approaches to network analysis. For these analyses, we will predominantly rely on R.
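To give a rough sense of what the middle part of this workflow can look like, here is a minimal sketch in Python. It is not taken from the tutorial scripts: the sample records, the table layout, and the function names (load_tweets, hashtag_counts, tweets_per_day) are all assumptions made for illustration. It stores a few hand-written tweet-like records in a SQLite database, counts hashtags, and tallies messages per day.

```python
import sqlite3
from collections import Counter

# Hypothetical sample records; tweets returned by the API carry many more fields.
SAMPLE_TWEETS = [
    {"id": 1, "user": "alice", "created_at": "2018-04-19T10:00:00",
     "text": "Kickoff #dataviz #python"},
    {"id": 2, "user": "bob", "created_at": "2018-04-19T12:30:00",
     "text": "Learning #Python today"},
    {"id": 3, "user": "alice", "created_at": "2018-04-20T09:15:00",
     "text": "No tags here"},
]


def load_tweets(conn, tweets):
    """Create an (assumed) tweets table and insert the records into it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets ("
        "id INTEGER PRIMARY KEY, user TEXT, created_at TEXT, text TEXT)"
    )
    # INSERT OR IGNORE skips records whose id is already stored.
    conn.executemany(
        "INSERT OR IGNORE INTO tweets (id, user, created_at, text) "
        "VALUES (:id, :user, :created_at, :text)",
        tweets,
    )
    conn.commit()


def hashtag_counts(conn):
    """Count hashtags with a crude whitespace split, ignoring case."""
    counts = Counter()
    for (text,) in conn.execute("SELECT text FROM tweets"):
        counts.update(
            word.lower()
            for word in text.split()
            if word.startswith("#") and len(word) > 1
        )
    return counts


def tweets_per_day(conn):
    """Return a mapping from day (YYYY-MM-DD) to number of tweets."""
    rows = conn.execute(
        "SELECT substr(created_at, 1, 10) AS day, COUNT(*) "
        "FROM tweets GROUP BY day ORDER BY day"
    )
    return dict(rows.fetchall())


conn = sqlite3.connect(":memory:")  # in-memory database, nothing written to disk
load_tweets(conn, SAMPLE_TWEETS)
top_hashtags = hashtag_counts(conn).most_common(2)
daily = tweets_per_day(conn)
print(top_hashtags)
print(daily)
```

Splitting on whitespace is a deliberate simplification. With real data you would more likely rely on the entity information that Twitter delivers with each message, and the tutorial's scripts may organize the database quite differently.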

The course closely follows a tutorial written by Pascal Jürgens and me, A Tutorial for Using Twitter Data in the Social Sciences: Data Collection, Preparation, and Analysis. The tutorial is freely available on the Social Science Research Network (SSRN). I recommend that all participants download the tutorial and the accompanying set of scripts available on GitHub. You will very likely profit from working through the respective sections of the tutorial before and after the corresponding session.

Here is the course syllabus.

The articles listed under readings are linked from the syllabus and should be available to you through your Uni-Konstanz VPN access. The books listed are not linked from the document but should be available to you through the dedicated online services of the respective publishers. Books by O’Reilly, PACKT, and Apress are available through ProQuest's Safari Books Online. Access to Safari Books Online is free through your Uni-Konstanz VPN access.

You can find detailed information on the content of the sessions, background readings, slides, and example code at the dedicated posts per session:

Course Schedule

Week 1: Introduction and Conceptual Issues in the Use of Digital Trace Data in Social Science (April 19)

Week 2: Set Up and Introduction to Collecting Data on Twitter (April 26)

Week 3: Introduction to Python (May 3)

Week 4: Christi Himmelfahrt (May 10)—no meeting

Week 5: Collecting Data Through Twitter’s API (May 17)

Week 6: How to Find A Research Question? & Data Lab (May 24)

Week 7: Fronleichnam (May 31)—no meeting

Week 8: Loading Twitter Data Into a Database (May 31)

Week 9: Sample Analyses: Counts & Time Series (June 7)

Week 10: Sample Analyses: Networks (June 14)

Week 11: Independent Study and Preparation of Presentations (June 21)—no meeting

Week 12: Student Presentations I. (June 28)

Week 13: Student Presentations II. (July 5)

Week 14: Student Presentations III. / Where to take it from here? Discussion of Open Questions and Paper (July 19)

Instructor Bio

Andreas Jungherr is Juniorprofessor (Assistant Professor) for Social Science Data Collection and Analysis at the University of Konstanz, Germany. His research addresses political communication with a focus on the use of digital technology by organizations. His work realizes the potential of computational social science and addresses methodological challenges in its integration into the social sciences.

He is the author of the books Analyzing Political Communication with Digital Trace Data: The Role of Twitter Messages in Social Science Research (Springer: 2015) and Das Internet in Wahlkämpfen: Konzepte, Wirkungen und Kampagnenfunktionen (with Harald Schoen, Springer VS: 2013). His articles have appeared in, among other places, Review of International Political Economy, Journal of Communication, Journal of Computer-Mediated Communication, Party Politics, The International Journal of Press/Politics, and Social Science Computer Review.

